Genomic Data Processing 11: The SAM File Format

2016-03-13 16:44

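SAM stands for Sequence Alignment/Map format. BAM is the binary version of SAM (the B stands for binary). Each alignment line in a SAM file is made up of the following tab-separated fields: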

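1. QNAME: the read name

2. FLAG: the SAM flag, a bitwise integer (for example, 0x4 means the read is unmapped and 0x10 means it aligned to the reverse strand)

3. RNAME: the reference sequence the read aligned to (the chromosome)

4. POS: the 1-based leftmost position of the alignment on the reference. This is the 5′ end of the read only for forward-strand alignments

5. MAPQ: the mapping quality, a Phred-scaled value equal to -10·log10 of the probability that the mapping position is wrong. The larger the number, the more unique the placement; 255 means the value is unavailable

6. CIGAR: a string describing the alignment, recording matches or mismatches (M), insertions (I), deletions (D), soft clipping (S), and skipped reference regions (N, used for splice junctions in spliced alignments)

7. RNEXT: the reference sequence name of the mate in a mate pair ("=" if it is the same as RNAME, "*" if unavailable)

8. PNEXT: the position of the mate

9. TLEN: the template length

10. SEQ: the read sequence

11. QUAL: the read base qualities

12. Optional fields in TAG:TYPE:VALUE form, written by the aligner or other programs (for example NM, MD, AS, XS)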
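The field layout above can be sketched as a small parser. This is only a minimal illustration, assuming one tab-separated alignment line per record; the names `SamRecord`, `parse_sam_line`, and `decode_flag` are hypothetical helpers introduced here, and the example record is made up rather than taken from the data below. For real work, a library such as pysam or htsjdk is the usual choice.

```python
from typing import NamedTuple

class SamRecord(NamedTuple):
    """The 11 mandatory SAM fields plus a dict of optional tags."""
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: dict

# A subset of the FLAG bits defined by the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x400: "duplicate",
    0x800: "supplementary",
}

def parse_sam_line(line: str) -> SamRecord:
    """Split one alignment line (not a '@' header line) into its fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    tags = {}
    for field in cols[11:]:
        # Optional fields look like TAG:TYPE:VALUE; VALUE may contain ':'.
        tag, typ, value = field.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    return SamRecord(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], int(cols[7]), int(cols[8]), cols[9], cols[10], tags,
    )

def decode_flag(flag: int) -> list:
    """Return the names of the FLAG bits that are set."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]

# Made-up record, for illustration only.
example = "read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tMD:Z:4"
rec = parse_sam_line(example)
print(rec.rname, rec.pos, rec.cigar, rec.tags, decode_flag(rec.flag))
```

A FLAG of 0, as in both records of the example output, sets none of these bits: an unpaired read mapped to the forward strand.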

Example:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ samtools view SRR003161h20.sam
SRR003161.1 0 chr1 143217889 0 4S35M85S * 0 0 TCAGATGCAATCATCGAATGGTCTCGAATGGAATCNTCTANAGAGATGGAATGTATCNCTCGCCANACGACACNCGAACAGGGNAAGGCAAGCAGNAGGNAGNNNANNNNNNNNNNNNNNNNNN AAAAAAAAAAAAAAAA:::BAAFAABAAB?>>=44!39=<!:866699888220862!08:8002!0200000!022200800!20660000600!000!06!!!6!!!!!!!!!!!!!!!!!! NM:i:1 MD:Z:31A3 AS:i:3XS:i:33 XA:Z:chr10,+42092546,4S35M85S,1;chr1,+143217421,4S35M85S,1;chr1,+143239587,4S35M85S,1;chr1,-143252938,85S35M4S,1;chr1,+143220601,4S35M85S,1;chr1,+143219665,4S35M85S,1;chr1,-143210830,85S35M4S,1;chr10,+42075371,4S35M85S,1;chr10,+42101425,4S35M85S,1;chr1,+143272381,4S35M85S,1;chr1,-143204112,85S35M4S,1;chr1,+143189975,4S35M85S,1;chr10,+42080829,4S35M85S,1;chr10,+42067652,4S35M85S,1;chr1,+143236600,4S35M85S,1;chr10,+42071261,4S35M85S,1;chr1,+143202568,4S35M85S,1;chr1,+143262016,4S35M85S,1;chr10,+42094445,4S35M85S,1;chr1,+143229991,4S35M85S,1;chr1,+143194906,4S35M85S,1;chr10,+42098197,4S35M85S,1;chr1,+143229325,4S35M85S,1;chr1,+143273144,4S35M85S,1;chr1,+143236132,4S35M85S,1;chr3,-196898795,85S35M4S,1;chr1,-125173710,85S35M4S,1;chr10,+42074903,4S35M85S,1;chr1,+143193143,4S35M85S,1;chr1,+143190443,4S35M85S,1;chr10,+42085796,4S35M85S,1;chr1,+143224622,4S35M85S,1;chr1,+143267943,4S35M85S,1;chr10,+42103854,4S35M85S,1;chr1,+143225093,4S35M85S,1;chr1,-143249828,85S35M4S,1;chr1,+143231300,4S35M85S,1;chr1,-143256486,85S35M4S,1;chr1,-143209440,85S35M4S,1;chr1,+143228021,4S35M85S,1;chr1,+143185063,4S35M85S,1;chr10,-41852367,85S35M4S,1;chr1,-143251629,85S35M4S,1;chr1,+143233540,4S35M85S,1;chr10,+42093977,4S35M85S,1;chr1,+143200517,4S35M85S,1;chr1,+143194441,4S35M85S,1;chr10,+42070793,4S35M85S,1;chr1,+143206914,4S35M85S,1;chr1,+143237811,4S35M85S,1;chr1,+143227553,4S35M85S,1;chr1,-143255189,85S35M4S,1;chr1,+143231768,4S35M85S,1;chr1,+143271341,4S35M85S,1;chr10,+42080361,4S35M85S,1;chr1,+143213870,4S35M85S,1;chr10,+42074435,4S35M85S,1;chr1,+143263324,4S35M85S,1;chr10,+42097745,4S35M85S,1;chr10,+42090276,4S35M85S,1;chr1,-125180284,85S35M4S,1;chr1,+143240055,4S35M
85S,1;chr1,+143265756,4S35M85S,1;chr1,+143216113,4S35M85S,1;chr1,-125169985,85S35M4S,1;chr1,+143219197,4S35M85S,1;chr1,+143192675,4S35M85S,1;chr10,+42095848,4S35M85S,1;chr1,+143195374,4S35M85S,1;chr1,+143214338,4S35M85S,1;chr1,+143270772,4S35M85S,1;chr1,-125166285,85S35M4S,1;chr1,+143275099,4S35M85S,1;chr1,+143226451,4S35M85S,1;chr10,+42104319,4S35M85S,1;chr1,+143232233,4S35M85S,1;chr1,+143211626,4S35M85S,1;chr1,+143220133,4S35M85S,1;chr1,+143215645,4S35M85S,1;chr10,+42100036,4S35M85S,1;chr10,-41846998,85S35M4S,1;chr1,-125168084,85S35M4S,1;chr1,-125179816,85S35M4S,1;chr1,+143240523,4S35M85S,1;chr1,+143264771,4S35M85S,1;chr1,+143212094,4S34M86S,1;chr10,-41845898,86S34M4S,1;chr1,+143191375,4S31M89S,0;chr1,-125182919,89S31M4S,0;chr1,+143221908,4S31M89S,0;chr1,+143190911,4S31M89S,0;chr10,-41843753,89S31M4S,0;chr10_KI270824v1_alt,+1080,4S35M85S,1;chr10_KI270824v1_alt,+615,4S35M85S,1;
SRR003161.2 0 chr7 41381016 60 4S153M1D132M1D5M1D28M1D73M3I12M1I40M54S * 0 0 TCAGTTTGAGATGGAGTTTCATTCTTGTTGCCCAGGCTGGAGTGCAATGGCGCAATCTCAGCTCACAGCAACCTCCGCCTCCCGGGTTCAAGCGATTCTCCTGCCTCAGCCTCTCGAGTAGCTGGGATTACAGGCATGCACCATCACGCCCAGCTAATTTGCATTTTTTATTAGAGATGGGGTTTCTCCACATTGGTCAGGCTGATCTCGAACTCCTGACCTCAGGTGATCTGCCTGCCTTGGCCTCCCAAAGTGCTGGGATTACAGGCATGAGCCTGAGCCCAACCTATTTACTTTCAATCCATCTTTTCAATAACTTAAATACAAGTGTCAATATATACAATCTTTTCCTCCCTGGTTATCAAGCTTTCTAATATATATGGATGTATCTTCCAAGGTTTTTGATCCCATTTTACTTTACAGGCTCACTGCTGTGGAACCCAGAGAGCAGTCTCTTTTCAAGGNGGGCTGAGACNCGCAACAGGGGATTAGGCCAAGGCNCAGG CCCCCCCCCCCCCCCC@@@CCCFEEEFEEG888EEEFFEEEEFGGGGGGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCA<777@@CCCBCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCAAACCCCCCCCCCCCCCCCCCCCCCC:93339@A>77//39AC666666C22CAAAA93333///7-0017>9999>>A???ACCCCCCC2239322>9977<?????CCCCCCCCC877777777111111::::5555:555:::::::::;:555:;;::::0040-----***--467::::;;;;;;:::511155555:555:::;::::::7777744-------///245::;;;::::::;;;;;;;;:55554774----------44-----064---------6---522451115247644255-----,4---24464422---------!,,,4464224!11:::7:::111111--7777---!---- NM:i:1MD:Z:153^T40T91^T5^T28^G73G23C0G26 AS:i:379 XS:i:88
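To make the CIGAR strings in the two records above easier to read, here is a minimal sketch that splits a CIGAR string into operations and sums how many read bases and reference bases it covers. The function names `parse_cigar` and `cigar_lengths` are hypothetical; which operations consume the read and which consume the reference follows the SAM specification.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
QUERY_OPS = set("MIS=X")   # operations that consume read bases
REF_OPS = set("MDN=X")     # operations that consume reference bases

def parse_cigar(cigar: str) -> list:
    """Turn '4S35M85S' into [(4, 'S'), (35, 'M'), (85, 'S')]."""
    if cigar == "*":
        return []
    ops = CIGAR_OP.findall(cigar)
    # Reject strings with characters the pattern did not account for.
    if "".join(n + op for n, op in ops) != cigar:
        raise ValueError("malformed CIGAR string: " + cigar)
    return [(int(n), op) for n, op in ops]

def cigar_lengths(cigar: str) -> tuple:
    """Return (read bases covered, reference bases covered)."""
    ops = parse_cigar(cigar)
    query_len = sum(n for n, op in ops if op in QUERY_OPS)
    ref_len = sum(n for n, op in ops if op in REF_OPS)
    return query_len, ref_len

print(cigar_lengths("4S35M85S"))
print(cigar_lengths("4S153M1D132M1D5M1D28M1D73M3I12M1I40M54S"))
```

For SRR003161.1, `4S35M85S` covers 124 read bases but only 35 reference bases, since the 4 leading and 85 trailing bases are soft-clipped. For SRR003161.2 the CIGAR covers 505 read bases and spans 447 reference bases.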

Reference:
http://www.bbioo.com/lifesciences/40-113338-1.html