Tags

  • AWS (7)
  • Apigee (3)
  • ArchLinux (5)
  • Array (6)
  • Backtracking (6)
  • BinarySearch (6)
  • C++ (19)
  • CI&CD (3)
  • Calculus (2)
  • DesignPattern (43)
  • DisasterRecovery (1)
  • Docker (8)
  • DynamicProgramming (20)
  • FileSystem (11)
  • Frontend (2)
  • FunctionalProgramming (1)
  • GCP (1)
  • Gentoo (6)
  • Git (15)
  • Golang (1)
  • Graph (10)
  • GraphQL (1)
  • Hardware (1)
  • Hash (1)
  • Kafka (1)
  • LinkedList (13)
  • Linux (27)
  • Lodash (2)
  • MacOS (3)
  • Makefile (1)
  • Map (5)
  • MathHistory (1)
  • MySQL (21)
  • Neovim (10)
  • Network (66)
  • Nginx (6)
  • Node.js (33)
  • OpenGL (6)
  • PriorityQueue (1)
  • ProgrammingLanguage (9)
  • Python (10)
  • RealAnalysis (20)
  • Recursion (3)
  • Redis (1)
  • RegularExpression (1)
  • Ruby (19)
  • SQLite (1)
  • Sentry (3)
  • Set (4)
  • Shell (3)
  • SoftwareEngineering (12)
  • Sorting (2)
  • Stack (4)
  • String (2)
  • SystemDesign (13)
  • Terraform (2)
  • Tree (24)
  • Trie (2)
  • TwoPointers (16)
  • TypeScript (3)
  • Ubuntu (4)
  • Home

    [LeetCode 10] Regular Expression Matching

    Published Oct 27, 2021 [  DynamicProgramming  ]

    Problem

    Given an input string s and a pattern p, implement regular expression matching with support for . and * where:

    • . Matches any single character.​​​​
    • * Matches zero or more of the preceding element.

    The matching should cover the entire input string (not partial).

    Example 1:

    Input: s = "aa", p = "a"
    Output: false
    Explanation: "a" does not match the entire string "aa".
    

    Example 2:

    Input: s = "aa", p = "a*"
    Output: true
    Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".
    

    Example 3:

    Input: s = "ab", p = ".*"
    Output: true
    Explanation: ".*" means "zero or more (*) of any character (.)".
    

    Example 4:

    Input: s = "aab", p = "c*a*b"
    Output: true
    Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".
    

    Example 5:

    Input: s = "mississippi", p = "mis*is*p*."
    Output: false
    

    Constraints:

    • 1 <= s.length <= 20
    • 1 <= p.length <= 30
    • s contains only lowercase English letters.
    • p contains only lowercase English letters, ., and *.
    • It is guaranteed for each appearance of the character *, there will be a previous valid character to match.

    Thoughts

    Recursive

    If there were no *, we can simply check from left to right if each character of the text matches the pattern.

    def match(text, pattern):
        if not pattern: return not text
        first_match = bool(text) and pattern[0] in {text[0], '.'}
        return first_match and match(text[1:], pattern[1:])
    

    If a * is present in the pattern, it will be in the second position pattern[1]. Then, we may ignore this part of the pattern, or delete a matching character. If we have a match on the remaining strings after any of these operations, then the initial inputs matched.

    class Solution(object):
        def isMatch(self, text, pattern):
            if not pattern:
                return not text
    
            first_match = bool(text) and pattern[0] in {text[0], '.'}
    
            if len(pattern) >= 2 and pattern[1] == '*':
                return (self.isMatch(text, pattern[2:]) or
                        first_match and self.isMatch(text[1:], pattern))
            else:
                return first_match and self.isMatch(text[1:], pattern[1:])
    

    Dynamic Programming

    As the problem has optimal substructure, it is natural to cache intermediate results. Because calls will only ever be made to match(test[i:], pattern[j:]), we use dp(i, j) to handle those calls instead, saving us expensive string-building operations and allowing us to cache the intermediate results

    Top-Down

    class Solution(object):
        def isMatch(self, text, pattern):
            memo = {}
            def dp(i, j):
                if (i, j) not in memo:
                    if j == len(pattern):
                        ans = i == len(text)
                    else:
                        first_match = i < len(text) and pattern[j] in {text[i], '.'}
                        if j+1 < len(pattern) and pattern[j+1] == '*':
                            ans = dp(i, j+2) or first_match and dp(i+1, j)
                        else:
                            ans = first_match and dp(i+1, j+1)
    
                    memo[i, j] = ans
                return memo[i, j]
    
            return dp(0, 0)
    
    function isMatch(s: string, p: string): boolean {
        // the memo object should be a global one
        const memo = {};
    
        // i, j are indexes for s & p
        const dp = (i, j) => {
            // make unique string key here
            const key = i + ',' + j
    
            // if key is not in memo, update it
            if(!(key in memo)) {
                let res;
    
                // final result
                // if we are at the pattern end, i should also be at string end for it to match
                if (j === p.length) {
                    res = i === s.length
                } else {
                    // check valid i & string match on pattern
                    const firstMatch = i < s.length && [s[i], '.'].includes(p[j])
    
                    // if we have '*', either ignore previous, or count once
                    if(j + 1 < p.length && p[j + 1] === '*') {
                        res = dp(i, j + 2) || (firstMatch && dp(i + 1, j))
                    } else {
                        // if we dont have the '*' check firstMatch & remaining
                        res = firstMatch && dp(i + 1, j + 1);
                    }
                }
                memo[key] = res;
            }
    
            // return the value with key
            return memo[key];
        };
    
        // let's start at the beginning
        return dp(0, 0);
    }
    

    Bottom-Up

    class Solution(object):
        def isMatch(self, text, pattern):
            dp = [[False] * (len(pattern) + 1) for _ in range(len(text) + 1)]
    
            dp[-1][-1] = True
            for i in range(len(text), -1, -1):
                for j in range(len(pattern) - 1, -1, -1):
                    first_match = i < len(text) and pattern[j] in {text[i], '.'}
                    if j+1 < len(pattern) and pattern[j+1] == '*':
                        dp[i][j] = dp[i][j+2] or first_match and dp[i+1][j]
                    else:
                        dp[i][j] = first_match and dp[i+1][j+1]
    
            return dp[0][0]
    

    Reference