Coelacanth genomes reveal signatures for evolutionary transition from water to land

Assembling the coelacanth genome

First, we constructed the reference coelacanth draft genome from one of the Tanzanian specimens (TCC041-004, gender unknown; Nikaido et al. 2011), which was recovered from the body cavity of its mother (coelacanths give birth to fully formed offspring; Fig. 2). A micro-computed tomography (micro-CT) scanning image was taken before sampling (Fig. 2 and Supplementary Fig. 3). In total, we generated 884.8 Gbp of raw sequence data, of which ~780 Gbp (~300× coverage) was used for assembly with the newly developed assembler PLATANUS (Supplementary Information 2.3, 2.4, and 2.5). The genome size was estimated to be 2.74 Gbp from the k-mer analysis (Supplementary Fig. 5 and Supplementary Table 3).

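Summary: 884.8 Gbp of raw data were generated; after cleaning, about 780 Gbp (roughly 300×) were assembled with the PLATANUS assembler. The genome size was estimated at about 2.74 Gbp, and the genome is repeat-rich (~60% repetitive elements).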
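The coverage figure and the k-mer-based size estimate can be checked with a little arithmetic. The sketch below is illustrative only: it uses the common rule of thumb that genome size is approximately the total number of k-mers divided by the depth at the main peak of the k-mer histogram, which may differ from the procedure described in the Supplementary Information. The function names, the `min_depth` error cutoff, and the toy histogram are assumptions, not values from the paper.

```python
def coverage(total_bases_gbp: float, genome_size_gbp: float) -> float:
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases_gbp / genome_size_gbp


def genome_size_from_kmers(histogram: dict[int, int], min_depth: int = 3) -> float:
    """Estimate genome size from a k-mer depth histogram.

    histogram maps depth -> number of distinct k-mers seen at that depth.
    Depths below min_depth (hypothetical cutoff) are treated as sequencing
    errors and ignored.
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Depth with the most distinct k-mers is taken as the main peak.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences among the usable depths.
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth


# Figures quoted in the text: ~780 Gbp used, 2.74 Gbp estimated genome size.
depth = coverage(780.0, 2.74)
print(f"approximate coverage: {depth:.0f}x")

# Toy histogram (made-up numbers) to show the size estimate.
toy_histogram = {1: 1000, 2: 200, 9: 50, 10: 100, 11: 50}
toy_size = genome_size_from_kmers(toy_histogram)
print(f"toy genome size estimate: {toy_size:.0f} bases")
```

With the quoted figures the depth comes out slightly under 300×, consistent with the "~300×" stated in the text.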

Genome sequencing and assembly

Sequencing libraries were prepared using the Illumina TruSeq DNA Sample Prep kit (300 bp, 500 bp and 1.0 kb) and the SOLiD Mate-Paired Library Construction kit (2.5 kb and 5.0 kb) according to the manufacturers’ instructions. All libraries were sequenced on Illumina HiSeq2000 sequencers. The raw sequence reads were filtered by trimming adapter sequences from reads and by removing read pairs with low quality or extremely short insert sizes. Whole-genome assembly was performed with the newly developed assembler PLATANUS, which is optimised for short-read data from high-throughput sequencers. See Supplementary Information 2.3 for details.
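The read-filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the adapter string, the length and quality thresholds, and all function names are assumptions. Adapters are found by exact match only, and a very short read after trimming is used as a crude proxy for an extremely short insert.

```python
from statistics import mean

# Assumed adapter prefix for illustration; the paper does not state the
# sequences or thresholds that were actually used.
ADAPTER = "AGATCGGAAGAGC"


def trim_adapter(seq: str, qual: str, adapter: str = ADAPTER) -> tuple[str, str]:
    """Cut the read at the first exact adapter match, if there is one."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return mean(ord(c) - offset for c in qual) if qual else 0.0


def keep_pair(read1: tuple[str, str], read2: tuple[str, str],
              min_len: int = 30, min_q: float = 20.0) -> bool:
    """Keep a pair only if both trimmed mates are long enough and of
    sufficient mean quality (hypothetical thresholds)."""
    for seq, qual in (read1, read2):
        if len(seq) < min_len or mean_quality(qual) < min_q:
            return False
    return True


# Usage with made-up reads: the first mate runs into the adapter after 4 bases.
r1 = trim_adapter("ACGT" + ADAPTER + "TT", "I" * 19)
r2 = trim_adapter("C" * 40, "I" * 40)
print(r1, keep_pair(r1, r2))
```

Production tools handle partial and mismatched adapter hits, which this exact-match sketch deliberately leaves out.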


