InterPro Annotation
2016-06-04 22:20
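The InterPro database can be used to classify protein sequences into families and to predict their domains and important sites. InterPro integrates the signatures of several member databases into a single resource. These databases are: PROSITE, HAMAP, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, CATH-Gene3D, and PANTHER.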
Method 1: Web version
http://www.ebi.ac.uk/interpro/
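Paste the sequences into the input box to run the InterPro annotation.
Advantage: convenient.
Disadvantages: the input must be protein sequence; a single InterProScan query can process at most 25 protein sequences.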
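Because of the 25-sequence limit quoted above, a larger FASTA file has to be split before it is pasted into the form. The following is a minimal Python sketch of that preparation step, assuming plain FASTA input. The function names and the 90% threshold used to flag nucleotide-looking sequences are illustrative assumptions, not part of InterPro.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)
MAX_PER_QUERY = 25        # limit quoted in the text for one web query


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Record] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def looks_like_nucleotide(seq: str, threshold: float = 0.9) -> bool:
    """Heuristic (an assumption): mostly A/C/G/T/U/N suggests DNA or RNA."""
    if not seq:
        return False
    hits = sum(1 for c in seq.upper() if c in "ACGTUN")
    return hits / len(seq) >= threshold


def batches(records: List[Record], size: int = MAX_PER_QUERY) -> Iterator[List[Record]]:
    """Yield successive lists of at most `size` records."""
    for i in range(0, len(records), size):
        yield records[i:i + size]


def to_fasta(records: List[Record]) -> str:
    """Serialize records back to FASTA, one sequence per line."""
    return "".join(f">{h}\n{s}\n" for h, s in records)
```

Each batch returned by `batches` can be written out with `to_fasta` and submitted separately; records flagged by `looks_like_nucleotide` would need to be translated to protein first.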
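Method 2: Remote search with the client scripts provided by EBI
EBI: The European Bioinformatics Institute
The Perl program provided by EBI is recommended for InterPro annotation. It sends the sequences to the official server for InterPro annotation and returns the results to the local machine. Script download page: http://www.ebi.ac.uk/Tools/Webservices/services/pfa/iprscan5_rest
Perl, Python, and Ruby clients are available there, one of each: iprscan5_lwp.pl, iprscan5_urllib2.py, and iprscan5_net_http.rb. The help text of the Perl client lists the following options: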
[Required]
  seqFile            : file : query sequence ("-" for STDIN, @filename for
                              identifier list file)
[Optional]
  --appl             : str  : Comma separated list of signature methods to run,
                              see --paramDetail appl.
  --goterms          :      : retrieve GO terms
  --nogoterms        :      : do not retrieve GO terms
  --pathways         :      : retrieve pathway terms
  --nopathways       :      : do not retrieve pathway terms
  --multifasta       :      : treat input as a set of fasta formatted sequences
[General]
  -h, --help         :      : prints this help text
  --async            :      : forces to make an asynchronous query
  --email            : str  : e-mail address
  --title            : str  : title for job
  --status           :      : get job status
  --resultTypes      :      : get available result types for job
  --polljob          :      : poll for the status of a job
  --jobid            : str  : jobid that was returned when an asynchronous job
                              was submitted.
  --outfile          : str  : file name for results (default is jobid;
                              "-" for STDOUT)
  --useSeqId         :      : use sequence identifiers for output filenames.
                              Only available in multifasta or list file modes.
  --maxJobs          : int  : maximum number of concurrent jobs. Only
                              available in multifasta or list file modes.
  --outformat        : str  : result format to retrieve
  --params           :      : list input parameters
  --paramDetail      : str  : display details for input parameter
  --quiet            :      : decrease output
  --verbose          :      : increase output
Synchronous job:
  The results/errors are returned as soon as the job is finished.
  Usage: iprscan5_lwp.pl --email <your@email> [options...] seqFile
  Returns: results as an attachment
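When many files have to be submitted, it can help to assemble the command line programmatically. The sketch below builds the argument list for the Perl client using only flags that appear in the help text above. The helper names `build_command` and `run` are hypothetical, and the values accepted by --appl and --outformat are not listed in this post, so they should be checked with --paramDetail and --resultTypes before use.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(
    email: str,
    seq_file: str,
    script: str = "iprscan5_lwp.pl",
    multifasta: bool = False,
    appl: Optional[Sequence[str]] = None,
    goterms: bool = False,
    outformat: Optional[str] = None,
) -> List[str]:
    """Return the argv list for one run of the EBI Perl client."""
    cmd = ["perl", script, "--email", email]
    if multifasta:
        cmd.append("--multifasta")          # input holds several FASTA records
    if appl:
        cmd += ["--appl", ",".join(appl)]   # comma separated signature methods
    if goterms:
        cmd.append("--goterms")
    if outformat:
        cmd += ["--outformat", outformat]   # value is an assumption; check --resultTypes
    cmd.append(seq_file)                    # seqFile comes last, as in the usage line
    return cmd


def run(cmd: List[str]) -> None:
    """Submit the job; needs perl, the client script, and network access."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the command; call run(...) to actually submit it.
    print(" ".join(build_command("your@email", "test.fa", multifasta=True)))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.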
Advantages:
Disadvantage: nucleotide sequences cannot be annotated.
$ perl iprscan5_lwp.pl --email fsczhenjiang@foxmail.com test.fa
Output:
JobId: iprscan5-R20160605-043400-0109-32295822-es
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
FINISHED
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.out.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.log.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.tsv.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.xml.xml
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.htmltarball.html.tar.gz
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.gff.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.svg.svg
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.sequence.txt
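Of the files listed above, the .tsv.txt file is usually the easiest to process further. The sketch below reads it into dictionaries and collects the InterPro accessions per protein. It assumes the standard InterProScan 5 TSV column order (protein accession, MD5, length, analysis, signature accession, signature description, start, stop, score, status, date, InterPro accession, InterPro description, GO terms, pathways), which this post does not state. The column names and helper names are illustrative.

```python
from collections import defaultdict
from typing import Dict, List

# Assumed InterProScan 5 TSV column order; the last four columns are optional.
COLUMNS = [
    "protein", "md5", "length", "analysis", "signature_acc",
    "signature_desc", "start", "stop", "score", "status", "date",
    "interpro_acc", "interpro_desc", "go_terms", "pathways",
]


def parse_tsv(text: str) -> List[Dict[str, str]]:
    """Turn TSV text into one dict per match, padding missing columns."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        fields += [""] * (len(COLUMNS) - len(fields))
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def interpro_by_protein(rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each protein to the sorted InterPro accessions it matched."""
    found = defaultdict(set)
    for row in rows:
        acc = row["interpro_acc"]
        if acc and acc != "-":  # rows without an integrated entry are skipped
            found[row["protein"]].add(acc)
    return {protein: sorted(accs) for protein, accs in found.items()}


if __name__ == "__main__":
    # File name taken from the run shown above.
    name = "iprscan5-R20160605-043400-0109-32295822-es.tsv.txt"
    with open(name, encoding="utf-8") as handle:
        print(interpro_by_protein(parse_tsv(handle.read())))
```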