[Algorithms] Longest Common Substring
2015-06-13 22:36
The Longest Common Substring (LCS) problem is as follows:
Given two strings s and t, find the length of the longest string r, which is a substring of both s and t.
This problem is a classic application of dynamic programming. Define the sub-problem (state) P[i][j] as the length of the longest common substring that ends exactly at s[i] and t[j]. The state equations are then as follows, where P[i - 1][j - 1] is taken to be 0 when i = 0 or j = 0:
P[i][j] = 0 if s[i] != t[j];
P[i][j] = P[i - 1][j - 1] + 1 if s[i] == t[j].
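To make the recurrence concrete, here is a minimal sketch that fills the whole table and returns only the maximum length. The name lcsLength and the one-cell padding are choices made for this illustration and are not part of the original code: row 0 and column 0 stay 0, so P[i][j] in the sketch describes s[i - 1] and t[j - 1], which removes the special case at the borders.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: length of the longest common substring of s and t.
// P is padded with one extra row and column of zeros, so P[i][j] holds the
// length of the longest common substring ending at s[i - 1] and t[j - 1].
int lcsLength(const std::string& s, const std::string& t) {
    const std::size_t m = s.size(), n = t.size();
    std::vector<std::vector<int>> P(m + 1, std::vector<int>(n + 1, 0));
    int best = 0;
    for (std::size_t i = 1; i <= m; ++i) {
        for (std::size_t j = 1; j <= n; ++j) {
            if (s[i - 1] == t[j - 1]) {
                // Characters match: extend the run that ended one step back
                // on the diagonal.
                P[i][j] = P[i - 1][j - 1] + 1;
                best = std::max(best, P[i][j]);
            }
            // Otherwise P[i][j] keeps its initial value 0.
        }
    }
    return best;
}
```

For example, lcsLength("abcde", "xbcdy") evaluates to 3, corresponding to the common substring "bcd".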
The largest P[i][j] over all i and j is the length of the longest common substring. If we want the substring itself, we record the indices (i, j) at which P[i][j] is largest and return s.substr(i - P[i][j] + 1, P[i][j]) or, equivalently, t.substr(j - P[i][j] + 1, P[i][j]).
Then we have the following code.
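#include <string>
#include <vector>
using namespace std;

string longestCommonSubstring(string s, string t) {
    int m = s.length(), n = t.length();
    // dp[i][j]: length of the longest common substring ending at s[i] and t[j]
    vector<vector<int> > dp(m, vector<int>(n, 0));
    int start = 0, len = 0;  // best answer so far is s.substr(start, len)
    for (int i = 0; i < m; i++) {
        for (int j = 0; j < n; j++) {
            if (i == 0 || j == 0)
                dp[i][j] = (s[i] == t[j]);  // no diagonal neighbour on the border
            else
                dp[i][j] = (s[i] == t[j] ? dp[i - 1][j - 1] + 1 : 0);
            if (dp[i][j] > len) {
                len = dp[i][j];
                start = i - len + 1;
            }
        }
    }
    return s.substr(start, len);
}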
The above code runs in O(m*n) time and uses O(m*n) space. The space can be reduced to O(min(m, n)). The observation is that each time we update dp[i][j], we only need dp[i - 1][j - 1]. If we keep a single one-dimensional array and sweep it once per column, dp[i - 1][j - 1] is simply the value that the previous cell of the array held before it was overwritten in the current sweep.
Now we will have the following code.
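string longestCommonSubstringSpaceEfficient(string s, string t) {
    int m = s.length(), n = t.length();
    vector<int> cur(m, 0);  // cur[i] holds dp[i][j] for the column j being processed
    int start = 0, len = 0;
    for (int j = 0; j < n; j++) {
        // pre holds dp[i - 1][j - 1]. It must be reset for every column:
        // otherwise the value left over from the end of the previous sweep
        // leaks into cur[0] and can produce a wrong length and a negative start.
        int pre = 0;
        for (int i = 0; i < m; i++) {
            int temp = cur[i];  // dp[i][j - 1], needed by the next row
            cur[i] = (s[i] == t[j] ? pre + 1 : 0);
            if (cur[i] > len) {
                len = cur[i];
                start = i - len + 1;
            }
            pre = temp;
        }
    }
    return s.substr(start, len);
}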
Strictly speaking, the code above uses O(m) space, where m is the length of s. To reach O(min(m, n)), size cur by the shorter of the two strings, for example by swapping the roles of s and t when m > n :)
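The suggestion above about sizing cur by the shorter string can be sketched as follows. This is one possible way to do it, not the original author's code: the function name longestCommonSubstringMinSpace and the use of references a and b to pick the shorter and longer string (instead of duplicating the loop with if..else) are assumptions made for this illustration.

```cpp
#include <string>
#include <vector>

// Sketch: rolling-array DP whose array is sized by the shorter input,
// giving O(min(m, n)) extra space.
std::string longestCommonSubstringMinSpace(const std::string& s,
                                           const std::string& t) {
    // a is the shorter string (indexes the array), b is the longer one.
    const std::string& a = (s.size() <= t.size()) ? s : t;
    const std::string& b = (s.size() <= t.size()) ? t : s;
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());

    std::vector<int> cur(m, 0);  // cur[i]: run length ending at a[i] and b[j]
    int start = 0, len = 0;      // best answer so far is a.substr(start, len)
    for (int j = 0; j < n; ++j) {
        int pre = 0;  // diagonal value for i = 0 does not exist, so use 0
        for (int i = 0; i < m; ++i) {
            const int temp = cur[i];  // value from the previous column
            cur[i] = (a[i] == b[j]) ? pre + 1 : 0;
            if (cur[i] > len) {
                len = cur[i];
                start = i - len + 1;
            }
            pre = temp;
        }
    }
    return a.substr(start, len);
}
```

Because the roles of the two strings may be swapped, the function can return a different (but equally long) substring than the unswapped version when several longest common substrings exist.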