Words Concatenation

Felix Lee
4 min readJul 16, 2022

--

Problem Statement

Given a string s and a list of words words of the same length. Return all starting indices of substring(s) that is a concatenation of each word exactly once, in any order, and without any intervening characters.

Example 1
Input: String="catdogcat", Words=["cat", "dog"]
Output: [0, 3]
Explanation: The two substring containing both the words are "catdog" starting at index 0 & "dogcat" starting at index 3.
Example 2
Input: String="acatcatdogcat", Words=["cat", "dog"]
Output: [4, 7]
Explanation: The two substring containing both the words are "catdog" starting at index 1 & "dogcat" starting at index 4.
Example 3
Input: String="catadogcatdogdog", Words=["cat", "dog"]
Output: [4, 7]
Explanation: The substring containing both the words are "dogcat" starting at index 4 and "catdog" at index 7.

Key Rules

  1. The list of words are of the same length. We can use this information(word length) to get each word from string s.
  2. The concatenation of each word should occurs exactly once. So if a word occurs more time in string s than it is in words, we can conclude the current substring is not what we are looking for. (i.e. In Example 2, “cat” at index 1 is not a valid case because “cat” shows up again at index 4)
  3. There should not be any intervening characters. (i.e. In Example 3, there is an “a” at index 3, so “catadog” is not a valid case)

The basic idea here is to:

  1. Loop string s from index 0 using a pointer i
  2. Extract word(substring) from s
  3. Check whether the current word can be found in the list of words (Remember not to violate the problem’s key rules).
  4. If all the words in the list are found (word count is met), we can add the current starting index (pointer i) to our result.

Example 1 Walkthrough

Let’s walk through Example 1 and see how we can solve this problem. Look at the figure below.

  1. The input string s is “catdogcat”.
  2. The word length is 3. This will be helpful when we loop through s to extract word from it.
  3. The word count is 2 . Remember that we are looking for the concatenation of words from the list exactly once.
  4. We are given a list of words. We can store the words in a hash map for easy reference. Let’s call it wordMap .

The code will look something like this.

Remember that there is nothing stopping the list of words having duplicate words like so ["cat","dog","cat"] , so we also store the count in wordMap

Here is a simple flow how the first loop looks like:

Let’s go through step by step.

Loop through string s as we need to get each word from s , starting from index 0, but when do we stop? Do we need to loop until the last index?

Imagine we are at index 6 and we get the word “cat”. Is there any way we can get the concatenation of entire list of words ([“cat","dog"])? Not really. At index 6, we simply do not have enough word to get a valid case. We need 6 characters to make a valid case. We can get the length of concatenation of all the words with wordCount * wordLength .

  1. Inner loop is responsible for getting the word from s , starting at index i
  2. We use foundMap to store the occurrence of each word later.

Let’s complete the rest of the code.

Time and Space Complexity

Time Complexity

The time complexity is O(N * M * L) where N is the length of the input string (the outer loop), M is the number of words in the list, and L is the length of each word.

Space Complexity

We are storing the words in 2 HashMaps. Also the result list may be as long as the input string at the worst case. So the space complexity is O(M + N) where M is words in the HashMap and N is the words in the result list.

--

--

Felix Lee

Software engineer in Singapore. Write random tech stuff. Still figuring out this “life” thing in life.