LeetCode解题报告(30)--双指针找拼接子字符串

原题如下: >You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters. >For example, given: s: "barfoothefoobarman" words: ["foo", "bar"]

You should return the indices: [0,9]. (order does not matter).

题目不难理解,实现也不难,关键是时间复杂度的控制。下面会讲述两种方法,第一种方法时间复杂度\(O((n-k)*k)\),其中n为s的长度,k为所有的words拼接起来后的长度,这种方法提交时在TLE和AC间徘徊(即偶尔AC,偶尔TLE)。第二种方法时间复杂度为O(n),提交的时候能够AC。

方法一

思路

设wordLen为给出的words列表中每个词的长度,wordNum为给出的words列表的长度,n为给出的s的长度。

那么可以从头开始遍历s直到n-wordLen*wordNum+1,每次遍历用一个字典存储遍历中遇到的属于words的词语,并检查两者对应的词语的数量关系。遇到不属于words的词语或者字典中的某一词语的数量大于words中该词语的数量则跳出执行下一次的遍历

这种方法的时间复杂度\(O((n-k)*k)\),n为s的长度。提交时偶尔TLE,偶尔AC,AC时显示的时间是876ms。 ### 实现代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
class Solution(object):
def findSubstring(self, s, words):
"""
:type s: str
:type words: List[str]
:rtype: List[int]
"""
# 注意words中的词可能会重复
wordDict = {}
for word in words:
if word in wordDict:
wordDict[word] += 1
else:
wordDict[word] = 1

wordLen = len(words[0])
result = []

for i in range(len(s)-wordLen*wordNum+1):
tmp = s[i:i+wordLen]
tmpDict = {} # 存储words的临时list
if tmp in words:
head = i
while tmp in words and i < len(s)-wordLen+1:
if tmp in tmpDict:
tmpDict[tmp] += 1
else:
tmpDict[tmp] = 1

if tmpDict[tmp] > wordDict[tmp]:
tmpDict[tmp] -= 1
break

i += wordLen
tmp = s[i:i+wordLen]
if tmpDict == wordDict:
result.append(head)
else:
continue
return result

方法二

思路

这种方法参考了这里,主要思想是利用了双指针围着当前符合words(包括数量和种类)中的连续子字符串,同时计算围着的子字符串中word的数量。

巧妙的地方在于双指针当前包围的子字符串中如果含有某个word的数量大于words中这个word的数量时,移动左指针直到把这个word的数量减少。

这种方法的时间复杂度为O(n),提交时AC的时间约100ms

实现代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
class Solution(object):
def findSubstring(self, s, words):
"""
:type s: str
:type words: List[str]
:rtype: List[int]
"""
# 注意words中的词可能会重复
wordDict = {}
for word in words:
if word in wordDict:
wordDict[word] += 1
else:
wordDict[word] = 1

wordLen = len(words[0])
wordNum = len(words)
wordSet = set(words)
sLen = len(s)
result = []

for i in range(wordLen):
left = i
tmpDict = {}
count = 0
j=i
while j < sLen-wordLen+1:
tmp = s[j:j+wordLen]
j += wordLen
if tmp in wordSet:
if tmp in tmpDict:
tmpDict[tmp]+=1
else:
tmpDict[tmp]=1

if tmpDict[tmp]<=wordDict[tmp]:
count+=1
else:
# 某个词的数量比wordDict中规定的要多了,往右移动左指针
while tmpDict[tmp] > wordDict[tmp]:
t = s[left:left+wordLen]
left += wordLen
tmpDict[t] -= 1
if tmpDict[t] < wordDict[t]:
count -= 1
# 仅去掉最左边的一个word,再往右找
if count == wordNum:
result.append(left)
t = s[left:left+wordLen]
tmpDict[t] -= 1
left += wordLen
count -= 1
# 当前的词不在wordDict
else:
tmpDict = {}
count = 0
left = j
return result