3. Longest Substring Without Repeating Characters

View question in leetcode

Medium

30.7%

hash-table

two-pointers

string

sliding-window

Problem

Given a string s, find the length of the longest substring without repeating characters.

Example 1:

Input: s = "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: s = "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3:

Input: s = "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Notice that the answer must be a substring, "pwke" is a subsequence and not a substring.

Example 4:

Input: s = ""
Output: 0

Constraints:

0 <= s.length <= 5 * 10⁴
s consists of English letters, digits, symbols and spaces.

Discussion

Trial 1

Trial 1 gives a rather naive approach. We loop through characters in input string to search for the longest substring, and we use a hash map to save the characters for duplicate character detection.

When duplicate character is found we have to remove the character in hashmap until the duplicated character is removed. We have to first locate all the keys in the hash map and scan for the first one for each removal. Which is performing badly.

Thus this algorithm gives time complexity O(n^2) and space complexity O(n).

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        result = 0
        cache = {}

        for i in s:
            if i in cache:
                result = max(result, len(cache))
                while i in cache:
                    del cache[list(cache.keys())[0]]

            cache[i] = 0

        return max(result, len(cache))

Trial 2

Trial 2 improves Trial 1 with a starting pointer. Again we loop through characters in input string to search fir the longest substring, with a hash map saving the characters. Besides we have a starting pointer to track the start of the substring, (we can also interpret this approach as Sliding Window).

Ths benefit is that we no longer have to track the length of the substring by the dict itself.

This algorithm improves time complexity to O(2n) = O(n) and space complexity remains O(n).

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        start_ptr = 0
        cache = {}
        i = 0
        result = 0

        while i < len(s):
            if s[i] in cache:
                del cache[s[start_ptr]]
                start_ptr += 1

            else:
                cache[s[i]] = 0
                i += 1
                result = max(result, i - start_ptr)

        return result

Solution

The final solution further improves Trial 2 by saving the position of the character in the hash map. So when duplicated character is found, no removal in the hash map takes place. Instead we move the starting pointer to the position right after the duplicated character's latest position.

if char in map:
    if map[char] >= start_ptr:
        start_ptr = map[char] + 1

Such operation is O(1) compared to O(n) in the above two approaches.

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        start_ptr = 0
        cache = {}
        i = 0
        result = 0

        while i < len(s):
            char = s[i]

            if char in cache:
                if cache[char] >= start_ptr:
                    start_ptr = cache[char] + 1

            result = max(result, i - start_ptr + 1)
            cache[char] = i
            i += 1

        return result

Complexity Analysis

Time Complexity: O(n), as through the looping in the input string, only atomic pointer update takes place when duplicated character is found.
Space Complexity: O(n), as we only maintain the hash map with each characters' value and position