Repeated DNA Sequences Problem


Description

LeetCode Problem 187.

The DNA sequence is composed of a series of nucleotides abbreviated as ‘A’, ‘C’, ‘G’, and ‘T’.

  • For example, “ACGAATTCCG” is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]

Example 2:

Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]

Constraints:

  • 1 <= s.length <= 10^5
  • s[i] is either ‘A’, ‘C’, ‘G’, or ‘T’.


Sample C++ Code

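#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        const int len = 10;  // length of each sequence to check
        int n = s.size();
        vector<string> ans;
        if (n < len)
            return ans;  // too short to contain any 10-letter sequence

        // Count occurrences of every 10-letter window.
        unordered_map<string, int> count;
        for (int i = 0; i + len <= n; i++) {
            string subs = s.substr(i, len);
            // operator[] value-initializes a missing count to 0.
            // Report a sequence once, when it is seen for the second time.
            if (++count[subs] == 2)
                ans.push_back(subs);
        }
        return ans;
    }
};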
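
The sample code above hashes a fresh 10-character string for every window. Because the alphabet has only four letters, each window also fits in 20 bits, so one possible variation is to slide an integer code along the string and use that as the map key. The following is a minimal sketch of that idea, not part of the original solution; the names `encodeBase` and `findRepeatedDnaSequencesBits` are hypothetical, and it assumes the input contains only 'A', 'C', 'G', and 'T' as the constraints state.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: maps a nucleotide to a 2-bit code.
// Assumes the input only contains 'A', 'C', 'G', 'T'.
static unsigned encodeBase(char c) {
    switch (c) {
        case 'A': return 0u;
        case 'C': return 1u;
        case 'G': return 2u;
        default:  return 3u;  // 'T'
    }
}

std::vector<std::string> findRepeatedDnaSequencesBits(const std::string& s) {
    constexpr std::size_t kLen = 10;                     // window length
    constexpr unsigned kMask = (1u << (2 * kLen)) - 1u;  // low 20 bits
    std::vector<std::string> ans;
    if (s.size() < kLen)
        return ans;

    std::unordered_map<unsigned, int> seen;  // window code -> occurrence count
    unsigned window = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Shift in the new base and drop the base that left the window.
        window = ((window << 2) | encodeBase(s[i])) & kMask;
        if (i + 1 < kLen)
            continue;  // first full window not reached yet
        // Report each sequence once, on its second occurrence.
        if (++seen[window] == 2)
            ans.push_back(s.substr(i + 1 - kLen, kLen));
    }
    return ans;
}
```

With this sketch, a substring is only built when a repeat is actually found; every other step works on a single integer. The results come back in the order in which each sequence is seen for the second time, which is acceptable since the problem allows any order.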



