003 Longest Substring Without Repeating Characters

Question

Given a string, find the length of the longest substring without repeating characters.

Example 1:

Input: "abcabcbb"

Output: 3

Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: "bbbbb"

Output: 1

Explanation: The answer is "b", with the length of 1.

Example 3:

Input: "pwwkew"

Output: 3

Explanation: The answer is "wke", with the length of 3.

Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Idea

The most intuitive way is to save each character and its index into a map. Every time it encounters a repeating character, let’s say c, remove all characters before c and their indices from the map. Besides, the max length should also be stored and updated accordingly. Well, it does work, but not perfectly. The only reason is we operate each character twice. In the worst case, the time complexity is O(n^2).

So do we really need to check whether a character is in the map so that we could know that whether the character is repeating? Let’s change our mind a little bit. Our previous thought is to check the map to see whether the character is in the substring. That’s why we need to remove all characters that are not in the substring from the map. What if we set a variable s representating the start of substring? In this way, as long as the index is less than s, the corresponding character is not in the substring. We don’t need to perform deletion any more. We only check each character once!

Solution

class Solution {
private:
  int quick_max(int x, int y) { return x ^ ((x ^ y) & -(x < y)); }

public:
  int lengthOfLongestSubstring(string s) {
    vector<int> map(256, -1);
    auto max_len = 0, start = -1;

    for (int i = 0; i < s.size(); ++i) {
      if (map[s[i]] > start) {
        start = map[s[i]];
      }

      map[s[i]] = i;
      max_len = quick_max(max_len, i - start);
    }

    return max_len;
  }
};

There’s a tricky function quick_max to compare two integers and return the larger one. It is bitwise operation. std::max uses conditional statement which impacts pipeline very much. I generally avoid using conditional statement as much as possible to gain better performance. In this case, the comparison happens in every iteration, we do need to use a much more efficient way to make it. This small change greatly improves performance. The running time decreases from 16ms to 8ms. 1X speedup!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s