
Title:

KneadData: a tool designed to perform quality control on metagenomic and metatranscriptomic sequencing data

Abstract:

KneadData is a tool designed to perform quality control on metagenomic sequencing data, especially data from microbiome experiments. In these experiments, samples are typically taken from a host in hopes of learning something about the microbial community on the host. However, metagenomic sequencing data from such experiments will often contain a high ratio of host to bacterial reads. This tool aims to perform principled in silico separation of bacterial reads from these “contaminant” reads, be they from the host, from bacterial 16S sequences, or other user-defined sources.

Website:

http://huttenhower.sph.harvard.edu/kneaddata

Source code:

https://bitbucket.org/biobakery/kneaddata

Installation:

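wget https://bitbucket.org/biobakery/kneaddata/downloads/kneaddata_v0.5.1.tar.gz
tar xzvf kneaddata_v0.5.1.tar.gz
# enter the extracted source directory (name assumed to match the archive; adjust if it differs)
cd kneaddata_v0.5.1
sudo python setup.py install --bypass-dependencies-install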

Commentary:

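KneadData comes from the Huttenhower lab and is designed for quality control of microbiome data, especially metagenome or metatranscriptome data. It can also serve as a general-purpose tool for removing contaminant sequences. Either bowtie2 or bmtagger can be used as the alignment engine. In my own tests, bmtagger removed contaminants more effectively than bowtie2; bmtagger is also the SOP tool of the Human Microbiome Project. KneadData's multithreading does not perform well, so it may make more sense to build a small custom tool directly on top of bmtagger along these lines: (1) split the fastq file; (2) submit the pieces with parallel/xargs/gargs; (3) merge the result files; (4) clean up the intermediate data.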
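
The four steps above could be sketched roughly as follows. This is only a minimal illustration, not a tested pipeline: the function name `split_filter_merge` and the variable `FILTER_CMD` are made up for this example, the input is assumed to be single-end FASTQ with exactly four lines per record, and the filter is a pass-through placeholder (`cp`) that you would replace with your actual bmtagger command line. It also assumes an `xargs` that supports `-P` and `-0` (GNU and BSD versions do).

```shell
set -eu

# Hypothetical per-chunk command: receives the chunk path as $1 and must
# write its filtered reads to "$1.clean". The default is a pass-through
# placeholder; swap in the real bmtagger call for your own databases.
FILTER_CMD='cp "$1" "$1.clean"'

split_filter_merge() {
    input="$1"                          # input FASTQ file
    output="$2"                         # merged, filtered FASTQ file
    records_per_chunk="${3:-1000000}"   # reads per chunk
    jobs="${4:-4}"                      # number of parallel processes

    workdir="$(mktemp -d)"

    # (1) Split: one FASTQ record is assumed to be 4 lines, so the line
    #     count per chunk is a multiple of 4 and no record is cut in half.
    split -l "$((records_per_chunk * 4))" "$input" "$workdir/chunk_"

    # (2) Submit: run the filter on each chunk, at most $jobs at a time.
    find "$workdir" -name 'chunk_*' ! -name '*.clean' -print0 |
        xargs -0 -P "$jobs" -n 1 sh -c "$FILTER_CMD" _

    # (3) Merge: the glob expands in sorted order, which matches the
    #     order in which split named the chunks.
    cat "$workdir"/chunk_*.clean > "$output"

    # (4) Clean up the intermediate chunks.
    rm -rf "$workdir"
}
```

A call might look like `split_filter_merge sample.fastq sample.clean.fastq 1000000 8`. Paired-end data would need both mate files split with the same chunk size so that the pairs stay in sync, which this sketch does not handle.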
