
InterPro Annotation

2016-06-04 22:20
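The InterPro database can be used to classify protein sequences into families and to predict their domains and important sites. InterPro integrates signatures from several member databases into a single resource. These databases are: PROSITE, HAMAP, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, CATH-Gene3D and PANTHER.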

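Method 1: Web interface
http://www.ebi.ac.uk/interpro/
Paste the sequences into the input box to run the InterPro annotation.

Advantages: convenient.

Disadvantages: the input must be protein sequences; InterProScan can search at most 25 protein sequences per query.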
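
Because of the 25-sequence limit mentioned above, a larger FASTA file has to be split before it can be pasted into the web form. The following is a minimal sketch of one way to do that in Python; the function names (read_fasta, batches, to_fasta) are made up for this illustration and are not part of any InterPro tool.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(text: str) -> List[Record]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records: List[Record] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def batches(records: List[Record], size: int = 25) -> Iterator[List[Record]]:
    """Yield consecutive groups of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def to_fasta(records: List[Record]) -> str:
    """Format records back into FASTA text, one sequence per line."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

Each batch returned by batches() can be turned back into text with to_fasta() and submitted as one query.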

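Method 2: Remote search with the client scripts provided by EBI

EBI: The European Bioinformatics Institute

The Perl client provided by EBI is the recommended way to run the InterPro annotation. The program sends the sequences to the official server for InterPro annotation and then downloads the results to the local machine.

Download page for the scripts: http://www.ebi.ac.uk/Tools/Webservices/services/pfa/iprscan5_rest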

The page offers one client each in Perl, Python and Ruby: iprscan5_lwp.pl, iprscan5_urllib2.py and iprscan5_net_http.rb. The help text of the Perl client lists the following options:

[Required]

  seqFile            : file : query sequence ("-" for STDIN, @filename for
                              identifier list file)

[Optional]

      --appl         : str  : Comma separated list of signature methods to run,
                              see --paramDetail appl.
      --goterms      :      : retrieve GO terms
      --nogoterms    :      : do not retrieve GO terms
      --pathways     :      : retrieve pathway terms
      --nopathways   :      : do not retrieve pathway terms
      --multifasta   :      : treat input as a set of fasta formatted sequences

[General]

  -h, --help         :      : prints this help text
      --async        :      : forces to make an asynchronous query
      --email        : str  : e-mail address
      --title        : str  : title for job
      --status       :      : get job status
      --resultTypes  :      : get available result types for job
      --polljob      :      : poll for the status of a job
      --jobid        : str  : jobid that was returned when an asynchronous job
                              was submitted.
      --outfile      : str  : file name for results (default is jobid;
                              "-" for STDOUT)
      --useSeqId     :      : use sequence identifiers for output filenames.
                              Only available in multifasta or list file modes.
      --maxJobs      : int  : maximum number of concurrent jobs. Only
                              available in multifasta or list file modes.
      --outformat    : str  : result format to retrieve
      --params       :      : list input parameters
      --paramDetail  : str  : display details for input parameter
      --quiet        :      : decrease output
      --verbose      :      : increase output

Synchronous job:

  The results/errors are returned as soon as the job is finished.
  Usage: iprscan5_lwp.pl --email <your@email> [options...] seqFile
  Returns: results as an attachment
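
To show how the options in the help text fit together, here is a small Python sketch that only assembles the command line; it does not contact the server. The helper names (build_iprscan_command, build_polljob_command) and the e-mail address are placeholders invented for this example, and combining --polljob with --jobid to fetch an asynchronous job is an assumption based on the option descriptions above. Valid values for --appl and --outformat should be checked with --paramDetail and --resultTypes.

```python
import shlex
from typing import List, Optional, Sequence


def build_iprscan_command(
    email: str,
    seq_file: str,
    script: str = "iprscan5_lwp.pl",
    appl: Optional[Sequence[str]] = None,
    goterms: bool = False,
    pathways: bool = False,
    multifasta: bool = False,
    outformat: Optional[str] = None,
    use_async: bool = False,
) -> List[str]:
    """Assemble an argument list from flags shown in the help text."""
    cmd = ["perl", script, "--email", email]
    if appl:
        # --appl takes a comma separated list of signature methods.
        cmd += ["--appl", ",".join(appl)]
    if goterms:
        cmd.append("--goterms")
    if pathways:
        cmd.append("--pathways")
    if multifasta:
        cmd.append("--multifasta")
    if outformat:
        cmd += ["--outformat", outformat]
    if use_async:
        cmd.append("--async")
    cmd.append(seq_file)  # the sequence file is the positional argument
    return cmd


def build_polljob_command(jobid: str, script: str = "iprscan5_lwp.pl") -> List[str]:
    """Assemble a command that polls a previously submitted job (assumed usage)."""
    return ["perl", script, "--polljob", "--jobid", jobid]


if __name__ == "__main__":
    cmd = build_iprscan_command(
        "you@example.org", "test.fa", goterms=True, multifasta=True
    )
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to subprocess.run() once the script has been downloaded into the working directory.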

Advantages: no local InterProScan installation is needed, because the search runs on the EBI server.

Disadvantages: nucleotide sequences cannot be annotated.

$ perl iprscan5_lwp.pl --email fsczhenjiang@foxmail.com test.fa

Output:

JobId: iprscan5-R20160605-043400-0109-32295822-es
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
RUNNING
FINISHED
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.out.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.log.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.tsv.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.xml.xml
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.htmltarball.html.tar.gz
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.gff.txt
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.svg.svg
Creating result file: iprscan5-R20160605-043400-0109-32295822-es.sequence.txt
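
Of the result files, the .tsv.txt file is the easiest to process further. The sketch below reads it into Python dictionaries grouped by protein. The column names are an assumption based on the tab-separated layout documented for InterProScan 5 (the last four columns are optional); they do not come from this post, so they should be checked against the actual file. The function name parse_iprscan_tsv is made up for this example.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List

# Assumed column order of the InterProScan 5 TSV output.
TSV_COLUMNS = [
    "protein_id", "md5", "length", "analysis",
    "signature_acc", "signature_desc", "start", "stop",
    "score", "status", "date",
    "interpro_acc", "interpro_desc", "go_terms", "pathways",
]


def parse_iprscan_tsv(lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group TSV rows by protein identifier.

    Rows with fewer than eight columns are skipped, because the
    match coordinates would be missing.
    """
    hits: Dict[str, List[dict]] = defaultdict(list)
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < 8:
            continue
        # zip() stops at the shorter list, so optional columns are
        # simply absent from rows that do not have them.
        row = dict(zip(TSV_COLUMNS, fields))
        row["start"] = int(row["start"])
        row["stop"] = int(row["stop"])
        hits[row["protein_id"]].append(row)
    return dict(hits)
```

For the run shown above, the file could be opened with open("iprscan5-R20160605-043400-0109-32295822-es.tsv.txt") and the file handle passed directly to parse_iprscan_tsv().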