003 Longest Substring Without Repeating Characters

Question

Given a string, find the length of the longest substring without repeating characters.

Example 1:

Input: "abcabcbb"

Output: 3

Explanation: The answer is "abc", with the length of 3.

Example 2:

Input: "bbbbb"

Output: 1

Explanation: The answer is "b", with the length of 1.

Example 3:

Input: "pwwkew"

Output: 3

Explanation: The answer is "wke", with the length of 3.

Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Idea

The most intuitive way is to save each character and its index into a map. Every time it encounters a repeating character, let’s say c, remove all characters before c and their indices from the map. Besides, the max length should also be stored and updated accordingly. Well, it does work, but not perfectly. The only reason is we operate each character twice. In the worst case, the time complexity is O(n^2).

So do we really need to check whether a character is in the map so that we could know that whether the character is repeating? Let’s change our mind a little bit. Our previous thought is to check the map to see whether the character is in the substring. That’s why we need to remove all characters that are not in the substring from the map. What if we set a variable s representating the start of substring? In this way, as long as the index is less than s, the corresponding character is not in the substring. We don’t need to perform deletion any more. We only check each character once!

Solution

class Solution {
private:
  int quick_max(int x, int y) { return x ^ ((x ^ y) & -(x < y)); }

public:
  int lengthOfLongestSubstring(string s) {
    vector<int> map(256, -1);
    auto max_len = 0, start = -1;

    for (int i = 0; i < s.size(); ++i) {
      if (map[s[i]] > start) {
        start = map[s[i]];
      }

      map[s[i]] = i;
      max_len = quick_max(max_len, i - start);
    }

    return max_len;
  }
};

There’s a tricky function quick_max to compare two integers and return the larger one. It is bitwise operation. std::max uses conditional statement which impacts pipeline very much. I generally avoid using conditional statement as much as possible to gain better performance. In this case, the comparison happens in every iteration, we do need to use a much more efficient way to make it. This small change greatly improves performance. The running time decreases from 16ms to 8ms. 1X speedup!

001 Two Sum

From now on, I’d like to solve all algorithm questions on Leetcode one by one to help me pass the job hunting interview in the future, no matter for internship or regular job. 🙂 I also pushed my solution on GitHub.

Let’s get started from the first question: Two Sum.

Question

Given an array of integers, return indices of the two numbers such that they add up to a specific target.

You may assume that each input would have exactly one solution, and you may not use the same element twice.

Example:

Given nums = [2, 7, 11, 15], target = 9,
Because nums[0] + nums[1] = 2 + 7 = 9,
return [0, 1].

Idea

The main thought is to traverse the whole vector. For each element, check whether the subtrahend exists in the vector. If yes, then return indices of current element and subtrahend. As a result, we need to maintain a structure that could provide us both the value and its index with a good performance. unordered_map is a best choice here.

We’ve not considered another case where the subtrahend doesn’t exist for the moment. What should we do? Think about it, for example, a + b = c where c is our target. Now we get a and check whether b exists. It doesn’t. In order to let us find a when we reach b, we should store a and its index. That’s it. In this way, we can outline the whole traverse.

Solution

class Solution {
public:
  vector<int> twoSum(vector<int> &nums, int target) {
    unordered_map<int, int> map;

    for (auto i = 0; i < nums.size(); ++i) {
      const auto key = target - nums[i];

      if (map.find(key) != map.end()) {
        return vector<int>({i, map[key]});
      }

      map[nums[i]] = i;
    }

    return vector<int>();
  }
};

Runtime: 4 ms, faster than 99.70% of C++ online submissions for Two Sum. Memory Usage: 10.1 MB, less than 55.86% of C++ online submissions for Two Sum.

Digression

Three years and four months ago, I submitted a solution with same idea but different code. Its running time is 16ms. Wow. Totally identical idea! This could demonstrate how important writing code efficiently. 🙂 Here is the old solution:

class Solution {
public:
    vector<int> twoSum(vector<int>& nums, int target) {
        vector<int> ret;
        if (nums.size() == 0) {
            return ret;
        }
        unordered_map<int, size_t> hash;
        for (int i = 0; i < nums.size(); ++i) {
            int left = target - nums[i];
            if (hash.find(left) != hash.end()) {
                ret.push_back(i);
                ret.push_back(hash[left]);
                return ret;
            } else {
                hash[nums[i]] = i;
            }
        }
        return ret;
    }
};