Archives

Stripping primer sequences to obtain the amplified target sequence: seqtk_pcr_strip

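The previous article, "Matching primers against amplicon sequences: usearch and primersearch", left one question open: once the primers have been identified, what should be done with them? Should the primer sequences be kept or stripped?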

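Later articles will cover building an OTU (or feature) table at 100% identity. At that threshold the noise introduced by the primer regions has to be taken into account, so this article introduces a tool, seqtk_pcr_strip, that strips the primer sequences and keeps only the amplified target sequence.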
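The effect of primer-region noise at 100% identity can be illustrated with a small Python sketch. The reads, the primer lengths, and the helper `count_unique` below are all made up for illustration and are not part of seqtk_pcr_strip; the sketch only assumes that clustering at 100% identity is equivalent to counting distinct sequences.

```python
# Hypothetical reads: identical target region, but one read carries a
# mismatch inside the forward-primer region (e.g. a degenerate base).
FWD_LEN = 5  # assumed forward-primer length
REV_LEN = 5  # assumed reverse-primer length

reads = [
    "ACGTA" + "TTGGCCAAGG" + "GGTCA",
    "ACGTT" + "TTGGCCAAGG" + "GGTCA",  # one mismatch in the primer region
    "ACGTA" + "TTGGCCAAGG" + "GGTCA",
]


def count_unique(seqs):
    """Number of distinct sequences, i.e. clusters at 100% identity."""
    return len(set(seqs))


# Remove the assumed primer regions from both ends of every read.
stripped = [r[FWD_LEN:len(r) - REV_LEN] for r in reads]

with_primers = count_unique(reads)
without_primers = count_unique(stripped)
print(with_primers, without_primers)  # prints "2 1"
```

With the primers left in, a single primer mismatch splits one biological sequence into two features; after stripping, the three reads collapse into one.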

The seqtk_pcr_strip command-line interface:

$ seqtk_pcr_strip
Usage: seqtk_pcr_strip <amplicon> <fasta/q>
Options:
  -l INT  max overhung length, default [1]
  -v      print version number

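There is one option to note: -l, which limits where the primers may sit within the amplicon sequence. Setting it to 0 means the primers must be at the very left and very right ends of the sequence.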
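The idea behind the -l limit can be sketched in Python as follows. This is not the tool's actual implementation: the function `strip_primers`, its 0-based end-exclusive coordinates, and the choice to return `None` for a read that exceeds the limit are all assumptions made for illustration.

```python
def strip_primers(seq, fwd_start, fwd_end, rev_start, rev_end, max_overhang=1):
    """Return the region between the two primer hits (hypothetical helper).

    Coordinates are assumed to be 0-based and end-exclusive. Returns None
    when more than max_overhang bases lie outside either primer.
    """
    left_overhang = fwd_start            # bases before the forward primer
    right_overhang = len(seq) - rev_end  # bases after the reverse primer
    if left_overhang > max_overhang or right_overhang > max_overhang:
        return None
    return seq[fwd_end:rev_start]


# One extra base ("N") precedes the forward primer "ACGT".
read = "N" + "ACGT" + "TTGGCC" + "GGTA"

loose = strip_primers(read, 1, 5, 11, 15, max_overhang=1)
strict = strip_primers(read, 1, 5, 11, 15, max_overhang=0)
print(loose)   # prints "TTGGCC"
print(strict)  # prints "None"
```

With a limit of 1 the single overhanging base is tolerated and the target region is returned; with a limit of 0 the primer is no longer at the very start of the read, so this sketch rejects it.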

Example

seqtk_pcr_strip amplicon-pcrout.txt amplicon-pcrout.fasta >amplicon-pcr_strip.fasta
seqtk_pcr_strip amplicon-pcrout.txt amplicon-pcrout.fastq >amplicon-pcr_strip.fastq

The material in this article comes from the BASE (Biostack Applied bioinformatic SEries) course Linux Command Line […]

The impact of Docker container technology on the performance of genomic data analysis

Article title:

The impact of Docker containers on the performance of genomic pipelines

Abstract:

Genomic pipelines consist of several pieces of third party software and, because of their experimental nature, frequent changes and updates are commonly necessary thus raising serious distribution and reproducibility issues. Docker containers technology offers an ideal solution, as it allows the packaging of […]