【Uva 1368】 DNA Consensus String
2015-02-20 16:30
Description
Figure 1.
DNA (Deoxyribonucleic Acid) is the molecule which contains the genetic instructions. It consists of four different nucleotides, namely Adenine, Thymine, Guanine, and Cytosine as shown in Figure 1. If we represent a nucleotide by its initial character, a DNA strand can be regarded as a long string (sequence of characters) consisting of the four characters A, T, G, and C. For example, assume we are given some part of a DNA strand which is composed of the following sequence of nucleotides:
“Thymine-Adenine-Adenine-Cytosine-Thymine-Guanine-Cytosine-Cytosine-Guanine-Adenine-Thymine”
Then we can represent the above DNA strand with the string “TAACTGCCGAT.” The biologist Prof. Ahn found that a gene X commonly exists in the DNA strands of five different kinds of animals, namely dogs, cats, horses, cows, and monkeys. He also discovered that the DNA sequences of the gene X from each animal were very alike. See Figure 2.
Figure 2 | DNA sequence of gene X |
---|---|
Cat: | GCATATGGCTGTGCA |
Dog: | GCAAATGGCTGTGCA |
Horse: | GCTAATGGGTGTCCA |
Cow: | GCAAATGGCTGTGCA |
Monkey: | GCAAATCGGTGAGCA |
Input
Your program is to read from standard input. The input consists of T test cases. The number of test cases T is given in the first line of the input. Each test case starts with a line containing two integers m and n which are separated by a single space. The integer m (4 ≤ m ≤ 50) represents the number of DNA sequences and n (4 ≤ n ≤ 1000) represents the length of the DNA sequences, respectively. In each of the next m lines, each DNA sequence is given.
Output
Your program is to write to standard output. Print the consensus string in the first line of each case and the consensus error in the second line of each case. If there exists more than one consensus string, print the lexicographically smallest consensus string. The following shows sample input and output for three test cases.
Sample Input
3
5 8
TATGATAC
TAAGCTAC
AAAGATCC
TGAGATAC
TAAGATGT
4 10
ACGTACGTAC
CCGTACGTAG
GCGTACGTAT
TCGTACGTAA
6 10
ATGTTACCAT
AAGTTACGAT
AACAAAGCAA
AAGTTACCTT
AAGTTACCAA
TACTTACCAA
Sample Output
TAAGATAC
7
ACGTACGTAA
6
AAGTTACCAA
12
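Approach
Vocabulary: genetic (遗传的), nucleotides (核苷酸), DNA strand (DNA链)
Find a DNA string X whose total error against all the given strings (the sum of the Hamming distances) is as small as possible. If several strings reach the same error, output the lexicographically smallest one.
Each position contributes to the error independently, so to minimize the total, every position of X should hold the character that occurs most often at that position across all the strings. If several characters occur equally often, pick the lexicographically smallest.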
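For example:
A C T C
A G T G
C G T A
Position 1: A (2 times), C (1 time) -> A
Position 2: G (2 times), C (1 time) -> G
Position 3: T (3 times) -> T
Position 4: every character occurs once -> pick the lexicographically smallest -> A
Therefore X = AGTA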
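The column-by-column rule above can be sketched without any sorting. This is a minimal illustration, not the post's reference solution: the function name `consensus` and its parameters `seqs`, `m`, `n`, and `out` are hypothetical names chosen for this sketch. It assumes every string has length `n`, contains only A, C, G, T, and that `out` has room for `n + 1` characters. It also relies on the observation that a column whose most frequent character occurs `c` times contributes `m - c` to the consensus error.

```c
/* Hypothetical helper (not from the original post): writes the consensus
   string of m DNA strings of length n into out and returns the error. */
int consensus(const char *const seqs[], int m, int n, char *out)
{
    /* Alphabet in lexicographic order, so the first maximum wins ties. */
    static const char alphabet[] = "ACGT";
    int error = 0;

    for (int col = 0; col < n; col++) {
        int count[4] = {0, 0, 0, 0};

        /* Count how often each nucleotide occurs in this column. */
        for (int row = 0; row < m; row++) {
            for (int k = 0; k < 4; k++) {
                if (seqs[row][col] == alphabet[k]) {
                    count[k]++;
                }
            }
        }

        /* Strict '>' keeps the lexicographically smallest letter on a tie. */
        int best = 0;
        for (int k = 1; k < 4; k++) {
            if (count[k] > count[best]) {
                best = k;
            }
        }

        out[col] = alphabet[best];
        /* Each row that differs from the chosen letter adds 1 to the error. */
        error += m - count[best];
    }
    out[n] = '\0';
    return error;
}
```

Called on the three strings of the example (ACTC, AGTG, CGTA), this sketch fills `out` with AGTA and returns 4, since the columns contribute 1, 1, 0, and 2 mismatches.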
Reference code
#include <stdio.h>
#include <string.h>

struct {
    char c[4];   /* the characters seen in this column */
    int num[4];  /* how many times each one occurs */
} DNA[1010];

char s[55][1010], v[128];

int main()
{
    /* map each nucleotide to a slot index */
    v['A'] = 0, v['G'] = 1;
    v['C'] = 2, v['T'] = 3;
    int T, m, n, i, j, k, x, y;
    char str[1010];
    scanf("%d", &T);
    while (T--) {
        scanf("%d%d", &m, &n);
        for (i = 0; i < m; i++)
            scanf("%s", s[i]);
        memset(DNA, 0, sizeof(DNA));
        /* count the occurrences of A/G/C/T in every column */
        for (i = 0; i < n; i++)
            for (j = 0; j < m; j++) {
                int idx = v[(unsigned char)s[j][i]];
                DNA[i].c[idx] = s[j][i];
                DNA[i].num[idx]++;
            }
        /* bubble sort each column: higher count first, ties by smaller character */
        for (i = 0; i < n; i++) {
            for (j = 0; j < 3; j++)
                for (k = 0; k < 3 - j; k++) {
                    if ((DNA[i].num[k] < DNA[i].num[k+1]) ||
                        ((DNA[i].num[k] == DNA[i].num[k+1]) && (DNA[i].c[k] > DNA[i].c[k+1]))) {
                        x = DNA[i].num[k];
                        DNA[i].num[k] = DNA[i].num[k+1];
                        DNA[i].num[k+1] = x;
                        y = DNA[i].c[k];
                        DNA[i].c[k] = DNA[i].c[k+1];
                        DNA[i].c[k+1] = y;
                    }
                }
        }
        /* after sorting, the most frequent character of column i is DNA[i].c[0] */
        for (i = 0; i < n; i++)
            str[i] = DNA[i].c[0];
        str[i] = 0;
        printf("%s\n", str);
        /* consensus error: sum of the Hamming distances to all input strings */
        int cnt = 0;
        for (i = 0; i < m; i++)
            for (j = 0; j < n; j++)
                if (str[j] != s[i][j])
                    cnt++;
        printf("%d\n", cnt);
    }
    return 0;
}