Title:
COGNIZER: A Framework for Functional Annotation of Metagenomic Datasets
Abstract:
Recent advances in sequencing technologies have resulted in an unprecedented increase in the number of metagenomes that are being sequenced world-wide. Given their volume, functional annotation of metagenomic sequence datasets requires specialized computational tools/techniques. In spite of having high accuracy, existing stand-alone functional annotation tools necessitate end-users to perform compute-intensive homology searches of metagenomic datasets against “multiple” databases prior to functional analysis. Although, web-based functional annotation servers address to some extent the problem of availability of compute resources, uploading and analyzing huge volumes of sequence data on a shared public web-service has its own set of limitations. In this study, we present COGNIZER, a comprehensive stand-alone annotation framework which enables end-users to functionally annotate sequences constituting metagenomic datasets. The COGNIZER framework provides multiple workflow options. A subset of these options employs a novel directed-search strategy which helps in reducing the overall compute requirements for end-users. The COGNIZER framework includes a cross-mapping database that enables end-users to simultaneously derive/infer KEGG, Pfam, GO, and SEED subsystem information from the COG annotations.
Link:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0142102
Source code:
http://metagenomics.atc.tcs.com/function/cognizer/
Installation:
wget http://metagenomics.atc.tcs.com/cognizer/application/COGNIZER_source_code.zip
unzip COGNIZER_source_code.zip
mv source_code cognizer-0.9b
cd cognizer-0.9b
gcc -O2 -g cognizer.c -o cognizer
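Modification: to make the cognizer program usable from any directory, change how the source code locates blastall and RAPSearch. Remove the relative paths so that either tool is used as long as it can be found through the PATH environment variable.
Modification: change the relative path of the database (db) to an absolute path so that it can be reached from any directory.
Modification: switch the RAPSearch mode to the RAPSearch2 command-line interface, use -z for multithreading, and add a bit score cutoff with a minimum bit score of 60.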
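The PATH lookup and the bit score cutoff can also be approximated outside the C source. The following is a minimal sketch, assuming the search output is a BLAST-style tab-separated m8 table with the bit score in column 12. The function names require_tool and filter_m8_by_bitscore are hypothetical helpers and are not part of COGNIZER.

```shell
#!/usr/bin/env bash

# Succeed only if the named executable can be found through PATH.
require_tool() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "error: $1 not found on PATH" >&2
        return 1
    }
}

# Print the hits whose bit score (assumed to be column 12) is at least
# the given minimum. Comment lines starting with '#' are dropped.
filter_m8_by_bitscore() {
    local min_bitscore="$1"
    local m8_file="$2"
    awk -F'\t' -v min="$min_bitscore" \
        '!/^#/ && NF >= 12 && ($12 + 0) >= (min + 0)' "$m8_file"
}
```

A possible use would be `require_tool rapsearch && filter_m8_by_bitscore 60 hits.m8 > hits.filtered.m8`, where rapsearch is assumed to be the name of the RAPSearch2 executable on your system and hits.m8 is a placeholder file name.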
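Commentary:
COGNIZER's fast annotation mode uses the NCBI COG database (ftp://ftp.ncbi.nih.gov/pub/COG/COG/myva) as the RAPSearch index for sequence similarity searches and then links the results to other databases such as GO, KEGG, and Fig. Its main weakness is probably that the database is relatively small. The paper "MOCAT2: a metagenomic assembly, annotation and profiling framework" also mentions that its COG profiles are somewhat better than COGNIZER's, and the database is the likely reason. Another database for COG annotation is eggNOG, which is considerably larger. Searched with DIAMOND, its speed should be comparable to myva + RAPSearch, and both are certainly faster than using blastall as the alignment engine. If searching against the NCBI COG sequence set is acceptable to you, COGNIZER is still a good choice.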
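The step of linking COG assignments to other databases can be pictured as a table lookup. The sketch below is illustrative only and does not reproduce COGNIZER's actual cross-mapping database. It assumes a hypothetical five-column map file (COG id, then KEGG, Pfam, GO, and SEED annotations) and a hypothetical two-column file of read-to-COG assignments, both tab-separated. The function name annotate_with_cog_map is also made up.

```shell
#!/usr/bin/env bash

# Append the KEGG, Pfam, GO and SEED columns of a COG map to each
# read-to-COG assignment. COG ids missing from the map get NA values.
# Assumes the map file is not empty.
annotate_with_cog_map() {
    local map_file="$1"
    local assignments_file="$2"
    awk -F'\t' -v OFS='\t' '
        BEGIN { na = "NA" OFS "NA" OFS "NA" OFS "NA" }
        # First file: remember the annotation columns for each COG id.
        FNR == NR { ann[$1] = $2 OFS $3 OFS $4 OFS $5; next }
        # Second file: look up the COG id stored in column 2.
        {
            a = ($2 in ann) ? ann[$2] : na
            print $1, $2, a
        }
    ' "$map_file" "$assignments_file"
}
```

Calling `annotate_with_cog_map cog_map.tsv read_to_cog.tsv` (both file names are placeholders) would print one line per read with the four extra annotation columns.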
Version:
2016-12-01.v1