Leetcode NO.187 Repeated DNA Sequences
2015-05-20 03:10
The problem statement is as follows:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].
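The idea behind this problem is actually quite simple.
Use a hashmap: for every window of ten letters, check whether the hashmap already contains the same entry; if it does, that window is one of the sequences to return. Up to this point the problem would rank as below average even among the easy ones. In practice it is harder than that: written this way, the running time is fine, but the solution exceeds the memory limit. Almost every solution on the forum optimizes the hashmap key. My method is probably mediocre, and it seems to run much slower than other people's, but it is still one workable approach.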
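For reference, the straightforward version described above might look like the following minimal sketch. The function name `findRepeatedNaive` is hypothetical and not from the original post; the point is only to show that every 10-letter window is stored as a full string key, which is the part that costs memory.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical name; a sketch of the string-keyed approach, not the author's code.
std::vector<std::string> findRepeatedNaive(const std::string& s) {
    std::vector<std::string> res;
    if (s.size() < 10) return res;
    std::unordered_map<std::string, int> seen;  // window -> times seen so far
    for (std::size_t i = 0; i + 10 <= s.size(); ++i) {
        std::string window = s.substr(i, 10);
        // Report each window once, at the moment its count reaches 2.
        if (++seen[window] == 2) res.push_back(window);
    }
    return res;
}
```

Called on the example string from the problem statement, this sketch returns "AAAAACCCCC" followed by "CCCCCAAAAA".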
A string key such as "AAAAACCCCC" takes a lot of space, so I convert AAAAACCCCC to 0000011111, which is a base-4 representation (A->0, C->1, G->2, T->3). This greatly reduces the space the hashmap occupies.
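To make the encoding concrete, here is a small sketch of the conversion on its own. The helper name `encodeWindow` is hypothetical. Since there are only 4^10 = 1,048,576 possible windows, every code fits in 20 bits and therefore in an ordinary `int`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: pack a 10-letter window into an int, one base-4 digit per letter.
int encodeWindow(const std::string& window) {
    assert(window.size() == 10);  // this sketch assumes exactly ten letters
    int code = 0;
    for (char c : window) {
        int digit = 0;            // 'A' -> 0
        if (c == 'C') digit = 1;
        else if (c == 'G') digit = 2;
        else if (c == 'T') digit = 3;
        code = code * 4 + digit;  // shift left by one base-4 digit, then append
    }
    return code;
}
```

For example, "AAAAACCCCC" is 0000011111 in base 4, which is 256 + 64 + 16 + 4 + 1 = 341 in decimal.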
The complete code is as follows:
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> findRepeatedDnaSequences(string s) {
        vector<string> res;
        // With fewer than 11 letters there is at most one 10-letter window.
        if (s.length() < 11)
            return res;
        unordered_set<int> hashset;  // base-4 codes of the windows seen so far
        for (int i = 0; i <= s.length() - 10; ++i) {
            string target = s.substr(i, 10);
            auto found = hashset.find(hash(target));
            if (found != hashset.end()) {
                // Seen before: add it to the result unless it is already there.
                auto it = find(res.begin(), res.end(), target);
                if (it == res.end()) {
                    res.push_back(target);
                }
            }
            else {
                hashset.insert(hash(target));
            }
        }
        return res;
    }
private:
    // Encode a 10-letter window as a base-4 number (A->0, C->1, G->2, T->3).
    int hash(string str) {
        int ret = 0;
        for (int i = 0; i < 10; ++i) {
            switch (str[i]) {
                case 'A': ret = 4 * ret + 0; break;
                case 'C': ret = 4 * ret + 1; break;
                case 'G': ret = 4 * ret + 2; break;
                case 'T': ret = 4 * ret + 3; break;
            }
        }
        return ret;
    }
};
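The code above re-encodes every window from scratch and searches `res` linearly, which may be part of why it feels slow. One possible variation, offered only as a sketch and not as the author's method, keeps the same base-4 idea but updates the code incrementally: shift in two bits for the new letter and mask off everything beyond the last ten letters. The names `nucleotideDigit` and `findRepeatedRolling` are hypothetical, and the input is assumed to contain only A, C, G, and T.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Same mapping as the post: A, C, G, T -> 0, 1, 2, 3.
int nucleotideDigit(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
    }
    assert(false && "sketch assumes only A, C, G, T");
    return 0;
}

// Hypothetical name; a rolling variant of the base-4 encoding.
std::vector<std::string> findRepeatedRolling(const std::string& s) {
    std::vector<std::string> res;
    const std::size_t k = 10;
    if (s.size() <= k) return res;       // at most one window, so no repeats
    const int mask = (1 << 20) - 1;      // low 20 bits = the last ten letters
    std::unordered_map<int, int> count;  // window code -> times seen so far
    int code = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Append the new letter and drop the letter that left the window.
        code = ((code << 2) | nucleotideDigit(s[i])) & mask;
        if (i + 1 < k) continue;         // first full window ends at index k - 1
        // Report each repeated window once, when its count reaches 2.
        if (++count[code] == 2) res.push_back(s.substr(i + 1 - k, k));
    }
    return res;
}
```

Counting occurrences instead of searching the result vector means each window is handled in expected constant time, and a substring is only built when a repeat is actually reported.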