Repeated DNA Sequences Problem


Description

LeetCode Problem 187.

The DNA sequence is composed of a series of nucleotides abbreviated as ‘A’, ‘C’, ‘G’, and ‘T’.

  • For example, “ACGAATTCCG” is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in s. You may return the answer in any order.

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]

Example 2:

Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]

Constraints:

  • 1 <= s.length <= 10^5
  • s[i] is either ‘A’, ‘C’, ‘G’, or ‘T’.


Sample C++ Code

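#include <string>
#include <unordered_map>
#include <vector>

using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        const int kLen = 10;
        int n = s.size();
        vector<string> ans;
        // A string shorter than 10 letters has no 10-letter substring.
        if (n < kLen)
            return ans;

        // Number of times each 10-letter window has been seen so far.
        unordered_map<string, int> counts;
        for (int i = 0; i + kLen <= n; i++) {
            string window = s.substr(i, kLen);
            // operator[] value-initializes a missing count to 0.
            // Report a window exactly once, on its second occurrence.
            if (++counts[window] == 2)
                ans.push_back(window);
        }
        return ans;
    }
};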
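
Alternative Sketch: 2-Bit Encoding

The sample above hashes a fresh 10-character string for every window. Because the alphabet has only four letters, each nucleotide fits in 2 bits and a whole 10-letter window fits in 20 bits, so the window can be kept as a single integer and updated in constant time as it slides. The following is a minimal sketch of that idea, not part of the original solution; the names encodeBase and findRepeatedDnaSequencesBits are hypothetical, and it assumes the input contains only 'A', 'C', 'G', and 'T', as the constraints state.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: maps a nucleotide to a 2-bit code.
// Assumes the character is one of 'A', 'C', 'G', 'T'.
static std::uint32_t encodeBase(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        default:  return 3; // 'T'
    }
}

std::vector<std::string> findRepeatedDnaSequencesBits(const std::string& s) {
    const std::size_t kLen = 10;
    // Keeps only the low 20 bits, i.e. the last 10 nucleotides.
    const std::uint32_t kMask = (1u << (2 * kLen)) - 1;

    std::vector<std::string> ans;
    if (s.size() < kLen) return ans;

    std::unordered_set<std::uint32_t> seen;     // windows seen at least once
    std::unordered_set<std::uint32_t> reported; // windows already in ans
    std::uint32_t window = 0;

    for (std::size_t i = 0; i < s.size(); ++i) {
        // Shift in the new nucleotide and drop the oldest one.
        window = ((window << 2) | encodeBase(s[i])) & kMask;
        if (i + 1 < kLen) continue; // window not full yet

        // insert().second is false when the value was already present.
        if (!seen.insert(window).second && reported.insert(window).second) {
            ans.push_back(s.substr(i + 1 - kLen, kLen));
        }
    }
    return ans;
}
```

Both versions visit each window once. The difference is that this one hashes a 32-bit integer instead of a string, and calls substr only for windows that end up in the answer.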



