
LeetCode: Repeated DNA Sequences

2016-03-13 17:32
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
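
Before looking at the bit-packed solution, a straightforward baseline may help to check the expected output. The following is only a sketch: the function name findRepeatedNaive is hypothetical and not part of the original post. It counts every 10-letter window in a hash map and reports a window the second time it is seen.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical baseline, not from the original post: count every
// 10-letter window and report each one the second time it appears.
std::vector<std::string> findRepeatedNaive(const std::string& s) {
    std::vector<std::string> repeated;
    std::unordered_map<std::string, int> count;
    for (std::size_t i = 0; i + 10 <= s.size(); ++i) {
        std::string window = s.substr(i, 10);
        // Push only on the second occurrence so each sequence is listed once.
        if (++count[window] == 2) {
            repeated.push_back(window);
        }
    }
    return repeated;
}
```

This version stores whole substrings as keys, so it uses noticeably more memory than a solution that packs each window into an integer, but it is easy to verify by hand.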



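#include <string>
#include <vector>
using namespace std;

class Solution {
private:
    // Map each nucleotide to a 2-bit code.
    int char2val(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
        }
        return 0; // not reached for valid input; avoids falling off the end
    }

public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> ans;
        if (s.size() <= 10)
            return ans;

        // One counter per possible 10-letter window: 4^10 = 2^20 = 1048576 entries.
        // Allocated on the heap, since a 1 MB local array can overflow the stack.
        vector<unsigned char> hashTable(1 << 20, 0);
        int len = s.size();
        int hashValue = 0;
        int mask = (1 << 20) - 1; // keep only the low 20 bits (10 letters)

        // Preload the first 9 letters.
        for (int i = 0; i < 9; i++)
            hashValue = (hashValue << 2) | char2val(s[i]);

        // Slide the window one letter at a time.
        for (int i = 9; i < len; i++) {
            hashValue = (hashValue << 2 | char2val(s[i])) & mask;
            // Stop counting at 2 so the counter cannot wrap around and
            // report the same sequence a second time.
            if (hashTable[hashValue] < 2 && ++hashTable[hashValue] == 2)
                ans.push_back(s.substr(i - 9, 10));
        }

        return ans;

        /*
        Alternative version: (s[i] - 'A' + 1) % 5 maps A, C, G, T to 1, 3, 2, 0.

        char hashMap[1048576] = {0};
        vector<string> ans;
        int len = s.size(), hashNum = 0;
        if (len < 11) return ans;
        for (int i = 0; i < 9; ++i)
            hashNum = hashNum << 2 | (s[i] - 'A' + 1) % 5;
        for (int i = 9; i < len; ++i)
            if (hashMap[hashNum = (hashNum << 2 | (s[i] - 'A' + 1) % 5) & 0xfffff]++ == 1)
                ans.push_back(s.substr(i - 9, 10));
        return ans;
        */
    }
};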
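
The solution above works because each nucleotide fits in 2 bits, so a 10-letter window fits in 20 bits and can index a table of 1048576 entries directly. As a minimal sketch of that packing (encodeWindow and decodeWindow are hypothetical helper names, not part of the original post), assuming the same A=0, C=1, G=2, T=3 mapping:

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: pack a 10-letter window of A/C/G/T into the low
// 20 bits of an integer (A=0, C=1, G=2, T=3).
// Returns -1 if the input is not exactly 10 valid letters.
std::int32_t encodeWindow(const std::string& w) {
    if (w.size() != 10) return -1;
    std::int32_t code = 0;
    for (char c : w) {
        int v;
        switch (c) {
            case 'A': v = 0; break;
            case 'C': v = 1; break;
            case 'G': v = 2; break;
            case 'T': v = 3; break;
            default: return -1;
        }
        code = (code << 2) | v; // append 2 bits for this letter
    }
    return code;
}

// Hypothetical helper: inverse of encodeWindow for codes in [0, 2^20).
std::string decodeWindow(std::int32_t code) {
    static const char letters[] = {'A', 'C', 'G', 'T'};
    std::string w(10, 'A');
    for (int i = 9; i >= 0; --i) {
        w[i] = letters[code & 3]; // the last letter sits in the lowest 2 bits
        code >>= 2;
    }
    return w;
}
```

Because the encoding is one-to-one, two windows share a table slot only when they are the same sequence, so no collision handling is needed.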