# Repeated DNA Sequences Problem

## Description

LeetCode Problem 187.

The DNA sequence is composed of a series of nucleotides abbreviated as ‘A’, ‘C’, ‘G’, and ‘T’.

• For example, “ACGAATTCCG” is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Example 1:

```
Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]
```

Example 2:

```
Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]
```

Constraints:

• 1 <= s.length <= 10^5
• s[i] is either ‘A’, ‘C’, ‘G’, or ‘T’.

## Sample C++ Code

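```cpp
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        int l = s.size();
        vector<string> ans;
        // A string shorter than 10 characters has no 10-letter window.
        if (l < 10)
            return ans;

        // Maps each 10-letter substring to the number of times it has been seen.
        unordered_map<string, int> ht;
        string subs;
        for (int i = 0; i < l - 10 + 1; i++) {
            subs = s.substr(i, 10);
            // operator[] value-initializes a missing count to 0 before the increment.
            ht[subs]++;
            // Record the substring only on its second occurrence to avoid duplicates.
            if (ht[subs] == 2)
                ans.push_back(subs);
        }
        return ans;
    }
};
```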
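
The solution above hashes a fresh 10-character string for every window. As a possible variation (not part of the original solution), each nucleotide can be packed into 2 bits so that a whole window fits in a 20-bit integer that is updated in constant time as the window slides. The sketch below assumes the input contains only 'A', 'C', 'G', and 'T', as the constraints state; the function name `findRepeatedDnaSequencesEncoded` and the constants `kLen` and `kMask` are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical variant: slides a 2-bit-per-nucleotide window over s.
// Assumes s contains only 'A', 'C', 'G', 'T' (per the problem constraints).
std::vector<std::string> findRepeatedDnaSequencesEncoded(const std::string& s) {
    constexpr std::size_t kLen = 10;                         // window length
    constexpr std::uint32_t kMask = (1u << (2 * kLen)) - 1;  // keeps the low 20 bits

    std::vector<std::string> ans;
    if (s.size() < kLen) return ans;

    // Map a nucleotide to its 2-bit code; anything that is not A/C/G is treated as 'T'.
    auto code = [](char c) -> std::uint32_t {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            default:  return 3;
        }
    };

    std::unordered_map<std::uint32_t, int> seen;  // encoded window -> occurrence count
    std::uint32_t window = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Shift in the new nucleotide and drop the one that left the window.
        window = ((window << 2) | code(s[i])) & kMask;
        if (i + 1 < kLen) continue;  // window not yet full
        // Report a window exactly once, on its second occurrence.
        if (++seen[window] == 2) ans.push_back(s.substr(i + 1 - kLen, kLen));
    }
    return ans;
}
```

Calling `findRepeatedDnaSequencesEncoded` on the two example inputs should yield the same sequences as the map-of-strings version. The trade-off is that the hash table stores small integers instead of strings, at the cost of a slightly less direct mapping back to the substring.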