
wgsim: a fairly general-purpose read simulator

Title:

wgsim: Reads simulator

Abstract:

Wgsim is a small tool for simulating sequence reads from a reference genome. It is able to simulate diploid genomes with SNPs and insertion/deletion (INDEL) polymorphisms, and simulate reads with uniform substitution sequencing errors. It does not generate INDEL sequencing errors, but this can be partly compensated by simulating INDEL polymorphisms. Wgsim outputs the simulated polymorphisms, and writes the true read coordinates as well as the number of polymorphisms and sequencing errors in read names. One can evaluate the accuracy of a mapper or a SNP caller with wgsim_eval.pl that comes with the package.

URL:

https://github.com/lh3/wgsim

Source code:

https://github.com/lh3/wgsim

Installation:

git clone https://github.com/lh3/wgsim
cd wgsim
gcc -g -O2 -Wall -o wgsim wgsim.c -lz -lm

Overview:

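wgsim is a fairly general-purpose read simulator. It requires a reference genome, and its command-line options let you control:

1. the outer distance between the two ends of a read pair (equivalent to the insert fragment length);
2. the read length;
3. the error rate;
4. the number of reads (read pairs);
5. the mutation rate;
6. the fraction of indels,

among others. It is well suited to simulating metagenome datasets, although that requires supplying many reference genomes.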
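The options listed above map onto wgsim's command-line flags roughly as follows. This is a minimal sketch: the file names (ref.fa, sim_R1.fq, sim_R2.fq, sim_variants.txt) and all numeric values are placeholders chosen for illustration, not settings taken from this post. The command is printed first and is only executed if wgsim and the reference file are actually present.

```shell
#!/bin/sh
# Placeholder file names (assumptions, not from the original post).
ref="ref.fa"
out1="sim_R1.fq"
out2="sim_R2.fq"

# -d  outer distance between the two ends, -s its standard deviation
# -1 / -2  length of the first / second read
# -e  base error rate
# -N  number of read pairs
# -r  mutation rate, -R fraction of indels
# -S  seed for the random generator (fixed here so a run can be repeated)
cmd="wgsim -d 500 -s 50 -1 100 -2 100 -e 0.02 -N 100000 -r 0.001 -R 0.15 -S 11 $ref $out1 $out2"
echo "$cmd"

# Run only when wgsim is on PATH and the reference exists;
# wgsim writes the simulated polymorphisms to standard output.
if command -v wgsim >/dev/null 2>&1 && [ -f "$ref" ]; then
    $cmd > sim_variants.txt
fi
```

Running `wgsim` with no arguments prints the full option list with defaults, which is worth checking against the installed version before relying on the values above.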

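To simulate metagenome data, you can download the microbial genomes you want directly from RefSeq, typically bacterial, archaeal and viral sequences in chosen proportions, merge them into one file, and simulate from that. For reference, see the paper "A comparative study of metagenomics analysis pipelines at the species level", which used hg19 plus 315 microbial genomes (292 species: 74 bacteriophages, 69 viruses and 149 bacteria). That dataset is organized by species, so different strains of the same species were presumably removed. Real metagenome data, however, will certainly contain many strains of the same species, and this is one of the hard problems in metagenome analysis: chimeric assemblies.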
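The merge step described above can be sketched as follows. The three tiny FASTA files are made-up stand-ins for real RefSeq downloads, and every file name here is hypothetical; the point is only to show that concatenating per-genome FASTA files yields one multi-FASTA reference that can be handed to wgsim.

```shell
#!/bin/sh
# Toy stand-ins for genomes downloaded from RefSeq
# (file names and sequences are made up for illustration).
workdir=$(mktemp -d)
mkdir "$workdir/genomes"
printf '>bacterium_1\nACGTACGTACGT\n' > "$workdir/genomes/bacterium_1.fa"
printf '>virus_1\nGGCCTTAAGGCC\n'     > "$workdir/genomes/virus_1.fa"
printf '>phage_1\nTTGACCATGACA\n'     > "$workdir/genomes/phage_1.fa"

# Merge every genome into a single multi-FASTA reference.
combined="$workdir/combined.fa"
cat "$workdir"/genomes/*.fa > "$combined"

# Sanity check: one header line per merged sequence.
n_seqs=$(grep -c '^>' "$combined")
echo "merged $n_seqs sequences into $combined"

# With real genomes in place, the merged file becomes the wgsim reference, e.g.:
#   wgsim -N 1000000 -1 100 -2 100 combined.fa meta_R1.fq meta_R2.fq > meta_variants.txt
```

Keeping the per-genome files alongside the merged reference makes it easy to change the mix later, for example by adding or dropping strains, and then rebuilding combined.fa.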

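Because NCBI's new assembly database no longer offers a bundled download, getting RefSeq is not as simple as it used to be. A convenient program for downloading RefSeq is Kraken_db_install_scripts.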

Version:

 2016-11-21.v1
