LeetCode解题报告(30)--双指针找拼接子字符串

发表于 2016-04-04 标签 python ，双指针

原题如下： >You are given a string, s, and a list of words, words, that are all of the same length. Find all starting indices of substring(s) in s that is a concatenation of each word in words exactly once and without any intervening characters. >For example, given: s: "barfoothefoobarman" words: ["foo", "bar"]

You should return the indices: [0,9]. (order does not matter).

题目不难理解，实现也不难，关键是时间复杂度的控制。下面会讲述两种方法，第一种方法时间复杂度\(O((n-k)*k)\),其中n为s的长度，k为所有的words拼接起来后的长度，这种方法提交时在TLE和AC间徘徊（即偶尔AC，偶尔TLE）。第二种方法时间复杂度为O(n)，提交的时候能够AC。

方法一

思路

设wordLen为给出的words列表中每个词的长度，wordNum为给出的words列表的长度，n为给出的s的长度。

那么可以从头开始遍历s直到n-wordLen*wordNum+1，每次遍历用一个字典存储遍历中遇到的属于words的词语，并检查两者对应的词语的数量关系。遇到不属于words的词语或者字典中的某一词语的数量大于words中该词语的数量则跳出执行下一次的遍历

这种方法的时间复杂度\(O((n-k)*k)\)，n为s的长度。提交时偶尔TLE，偶尔AC，AC时显示的时间是876ms。 ### 实现代码：

class Solution(object):
    def findSubstring(self, s, words):
        """
        :type s: str
        :type words: List[str]
        :rtype: List[int]
        """
        # 注意words中的词可能会重复
        wordDict = {}
        for word in words:
            if word in wordDict:
                wordDict[word] += 1
            else:
                wordDict[word] = 1
        
        wordLen = len(words[0])
        result = []
        
        for i in range(len(s)-wordLen*wordNum+1):
            tmp = s[i:i+wordLen]
            tmpDict = {} # 存储words的临时list
            if tmp in words:                
                head = i
                while tmp in words and i < len(s)-wordLen+1:            
                    if tmp in tmpDict:
                        tmpDict[tmp] += 1
                    else:
                        tmpDict[tmp] = 1
                    
                    if tmpDict[tmp] > wordDict[tmp]:
                        tmpDict[tmp] -= 1
                        break

                    i += wordLen
                    tmp = s[i:i+wordLen]
                if tmpDict == wordDict:
                    result.append(head)
            else:
                continue         
        return result

方法二

思路

这种方法参考了这里,主要思想是利用了双指针围着当前符合words（包括数量和种类）中的连续子字符串，同时计算围着的子字符串中word的数量。

巧妙的地方在于双指针当前包围的子字符串中如果含有某个word的数量大于words中这个word的数量时，移动左指针直到把这个word的数量减少。

这种方法的时间复杂度为O(n),提交时AC的时间约100ms

实现代码

class Solution(object):
    def findSubstring(self, s, words):
        """
        :type s: str
        :type words: List[str]
        :rtype: List[int]
        """
        # 注意words中的词可能会重复
        wordDict = {}
        for word in words:
            if word in wordDict:
                wordDict[word] += 1
            else:
                wordDict[word] = 1
        
        wordLen = len(words[0])
        wordNum = len(words)
        wordSet = set(words)
        sLen = len(s)
        result = []
        
        for i in range(wordLen): 
            left = i
            tmpDict = {}
            count = 0
            j=i
            while j < sLen-wordLen+1: 
                tmp = s[j:j+wordLen]
                j += wordLen
                if tmp in wordSet:
                    if tmp in tmpDict:
                        tmpDict[tmp]+=1
                    else:
                        tmpDict[tmp]=1
                    
                    if tmpDict[tmp]<=wordDict[tmp]:
                        count+=1
                    else:
                        # 某个词的数量比wordDict中规定的要多了，往右移动左指针
                        while tmpDict[tmp] > wordDict[tmp]:
                            t = s[left:left+wordLen]
                            left += wordLen
                            tmpDict[t] -= 1
                            if tmpDict[t] < wordDict[t]:
                                count -= 1
                    # 仅去掉最左边的一个word，再往右找
                    if count == wordNum:
                        result.append(left)
                        t = s[left:left+wordLen]
                        tmpDict[t] -= 1
                        left += wordLen
                        count -= 1  
                # 当前的词不在wordDict                        
                else:
                    tmpDict = {}
                    count = 0 
                    left = j
        return result