The Strategies of NGS Data Analysis to Genome Biology

  • Language: English
  • URL: https://db.koreascholar.com/Article/Detail/288521
Korean Society of Applied Entomology
Abstract

Genome research has advanced markedly with the introduction of next-generation DNA sequencers (NGS) such as the Roche/454 and Illumina/Solexa systems, together with bioinformatic analysis technologies for whole-genome de novo assembly, expression profiling, DNA variation discovery, and genotyping. One advantage of NGS systems is the cost-effectiveness of high-throughput DNA sequencing for genome, RNAnome, and miRNAnome studies. Massive whole-genome shotgun paired-end and mate paired-end sequencing data are both important for constructing de novo assemblies of novel genomes and for resequencing samples against a reference genome sequence. Read length matters when constructing high-quality contig consensus sequences, so the long reads of the Roche/454 system are valuable for de novo assembly. Effective de novo assembly requires sequence data from multiple NGS platforms, with at least 2× and 30× genome coverage depth from Roche/454 and Illumina/Solexa, respectively; such a hybrid assembly is a cost-effective approach to novel genome sequencing. In some cases, Illumina/Solexa data with high coverage depth and mate paired-end information from a wide range of fragment sizes are used to construct scaffolds, even though the same data have already contributed to the assembly and produced many contigs. Massive short-read data from the Illumina/Solexa system are sufficient for discovering DNA variation, which reduces the cost of DNA sequencing. MAQ and CLC software are useful for both single nucleotide polymorphism discovery and genotyping by comparing resequencing data with a reference genome. Whole-genome expression profile data are useful for genome-scale systems biology because they quantify the expressed RNAs of a whole transcriptome across tissue samples, such as control and exposed tissue.
Hybrid mRNA sequence data from Roche/454 and Illumina/Solexa are more powerful for finding novel genes through de novo assembly in any whole-genome sequenced species. Coverage of 20× and 50× of the estimated transcriptome using Roche/454 and Illumina/Solexa, respectively, is effective for creating novel expressed reference sequences. However, an average of only 30× transcriptome coverage with short Illumina/Solexa reads is enough for expression quantification against reference expressed sequence tag sequences. With in silico methods, both conserved and novel miRNAs can be discovered from massive miRNAnome data in any species. In particular, the target genes of the discovered miRNAs could offer a robust basis for genome biology studies.
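
The coverage targets quoted in the abstract (2× and 30× for a genome, 20×, 50×, and 30× for a transcriptome) can be translated into approximate read counts with the standard relation depth = reads × read length / target size. The Python sketch below is only an illustration of that arithmetic. The genome size and the read lengths are hypothetical placeholders, not values from the paper, and the function names are made up for this example.

```python
import math


def coverage_depth(n_reads: int, read_length_bp: int, target_size_bp: int) -> float:
    """Average depth of coverage: total sequenced bases divided by target size."""
    if target_size_bp <= 0 or read_length_bp <= 0 or n_reads < 0:
        raise ValueError("sizes must be positive and read count non-negative")
    return n_reads * read_length_bp / target_size_bp


def reads_needed(target_size_bp: int, depth: float, read_length_bp: int) -> int:
    """Smallest number of reads that reaches the requested average depth."""
    if target_size_bp <= 0 or read_length_bp <= 0 or depth < 0:
        raise ValueError("sizes must be positive and depth non-negative")
    return math.ceil(depth * target_size_bp / read_length_bp)


# Hypothetical inputs chosen only to make the arithmetic concrete.
GENOME_SIZE_BP = 500_000_000   # assumed genome size (not from the paper)
READ_LEN_LONG_BP = 400         # assumed long-read length (not from the paper)
READ_LEN_SHORT_BP = 100        # assumed short-read length (not from the paper)

# Depth targets taken from the abstract: 2x long reads, 30x short reads.
long_reads = reads_needed(GENOME_SIZE_BP, 2, READ_LEN_LONG_BP)
short_reads = reads_needed(GENOME_SIZE_BP, 30, READ_LEN_SHORT_BP)

if __name__ == "__main__":
    print(f"long reads needed for 2x:   {long_reads:,}")
    print(f"short reads needed for 30x: {short_reads:,}")
```

The same two functions apply to the transcriptome targets by passing an estimated transcriptome size instead of a genome size. Real projects usually sequence above the computed minimum, because coverage is uneven and some reads fail quality filtering.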

Authors
  • Ik-Young Choi (National Instrumentation Center for Environmental Management, College of Agriculture and Life Sciences)
  • Hyung-Wook Kwon (WCU Biomodulation, Seoul National University)