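Next-generation sequencing (NGS) analyzes genome sequences at high speed through massively parallel data production, and it is widely used for viral genome analysis. However, when the full-length viral genome exceeds 100 kb, the size and structure of the resulting genome can differ depending on the analysis method, software, and parameters, even for the same raw data. It is therefore important to choose an optimized NGS analysis method when analyzing viruses with large genomes. In this study, we devised an optimized NGS analysis method based on the genome of Oryctes rhinoceros nudivirus (120 kb), using several assembly programs (metaviralSPAdes, metaSPAdes, velvet, shovill, Geneious, megahit). We identified differences in viral genome size and features (single nucleotide polymorphisms, insertions and deletions, repetitive genomic variants) depending on the assembly software. Sequences that differed among the assemblers were re-verified by Sanger sequencing to construct a reference sequence. Using this reference sequence, we evaluated which assembly software and parameters were the most accurate. This study is expected to improve understanding of how genome characteristics vary with the analysis method and to provide an analysis pipeline for accurately constructing viral genomes.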
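As an illustration of how assembler output might be scored against a verified reference, the sketch below tallies SNPs and indel events from a pair of sequences that have already been aligned. It is a minimal sketch, not the pipeline used in the study: the function name `tally_differences`, the use of `-` as the gap character, and the toy sequences are all assumptions made for this example, and the alignment itself is assumed to come from an external tool.

```python
def tally_differences(ref_aln: str, asm_aln: str) -> dict:
    """Count SNPs and indel events between two pre-aligned sequences.

    Both strings must have the same length; '-' marks a gap (assumed convention).
    A run of consecutive gap columns of the same type counts as one event.
    """
    if len(ref_aln) != len(asm_aln):
        raise ValueError("aligned sequences must have equal length")

    counts = {"snp": 0, "insertion": 0, "deletion": 0}
    prev = None  # type of the previous column: None, "insertion", or "deletion"

    for r, a in zip(ref_aln.upper(), asm_aln.upper()):
        if r == "-" and a == "-":
            continue  # gap in both rows carries no information
        if r == "-":
            state = "insertion"  # base present in the assembly only
        elif a == "-":
            state = "deletion"   # base present in the reference only
        else:
            state = None
            if r != a:
                counts["snp"] += 1
        # Count an indel only when a new gap run starts
        if state is not None and state != prev:
            counts[state] += 1
        prev = state
    return counts


# Toy alignment (made-up sequences): one SNP, one 1-bp insertion, one 3-bp deletion
example = tally_differences("ACGT-ACGTTTGA", "ACCTGACG---GA")
print(example)
```

Running the same tally for each assembler against the Sanger-verified reference would give one comparable row of counts per program.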
Urbanization is a driving force of global biodiversity change, and species that successfully adapt to city environments can become pests with the assistance of human factors. Here we present the first genomic data of Plecia longiforceps, an invasive pest exhibiting intensive outbreaks in the Seoul Metropolitan Area of Korea. HiFi and Pore-C sequencing data were used to construct a highly contiguous genome assembly with a total size of 707 Mb and 8 major pseudochromosomes. Gene annotation using transcriptome data and ab initio predictions revealed significant numbers of genes related to detoxification and heat tolerance. Comparison to the Bibio marci genome showed high levels of synteny with some regions of chromosomal rearrangement. Our data will serve as an essential resource for population and functional genomic studies on dispersal and outbreaks of P. longiforceps, and facilitate research on eco-evolutionary processes of dipterans in urbanizing habitats.
A chromosome-level genome assembly of Korean Diadegma fenestrale (Jeju strain, JK-2023a) was achieved through a combined approach utilizing Nanopore long-read sequencing and Illumina NovaSeq short-read sequencing (approximately 217.2× coverage). The assembled genome spans 221.1 Mb and comprises 68 scaffolds, with most of the genome contained within 11 chromosome-level scaffolds. The completeness of the assembly is reflected in the BUSCO assessment, with values reaching 99.6%. The scaffold N50 was 17.4 Mb, and the GC content was 40%. RNA-Seq was performed using RNA extracted from larvae, pupae, and adults at various developmental stages (trimmed RNA-Seq data, 11.3 Gb), and a total of 13,544 genes were predicted by combining the transcriptome information with the annotation information of five closely related species, including Campoletis sonorensis (GCA_013761285.1), Venturia canescens (GCF_019457755.1), and Nasonia vitripennis (GCF_000002325.3 and GCF_009193385.2). Of these, 13,498 genes were identified by BLAST and are being further analyzed. Although the frequency of DfIV genome integration into the host's 11 chromosomes varies from 0 to 32%, it was confirmed that all 62 DfIV genome fragments were inserted into the hymenopteran host genome.
Helicoverpa assulta (Lepidoptera: Noctuidae) is a specialist herbivore that feeds primarily on a few Solanaceae plants. Despite its significant economic impact as a pest, causing substantial harm to crops such as hot pepper and tobacco, it has received limited research attention compared to its generalist counterparts, H. armigera and H. zea. We introduce a chromosome-level genome assembly of a Korean H. assulta (Pyeongchang strain, K18). This assembly was achieved through a combined approach utilizing Nanopore long-read sequencing (approximately 78X coverage) and Illumina NovaSeq short-read sequencing (approximately 54X coverage). The assembled genome, designated ASM2961881v1, spans 424.36 Mb and comprises 62 scaffolds, with 98.7% of the genome contained within 31 scaffolds, consistent with the insect's chromosome count (n = 31). The completeness of the assembly is reflected in the BUSCO assessment, with values reaching 99.0%; repeat content accounts for 33.01% of the genome, and 18,593 CDS were annotated. Additionally, 137 genes were identified within 15 orthogroups that have rapidly expanded in H. assulta, while 149 genes in 95 orthogroups have rapidly contracted. This draft genome serves as a valuable resource for exploring various aspects of the specialist's biology, enabling research into host-range evolution, chemical communication, insecticide resistance, and comparative investigations with other heliothine species.
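A statistic such as "98.7% of the genome contained within 31 scaffolds" can be recomputed from a list of scaffold lengths. The following is a minimal sketch under that assumption; the helper name `fraction_in_largest` and the toy lengths are hypothetical and are not taken from the H. assulta assembly.

```python
def fraction_in_largest(lengths, k):
    """Return the fraction of total assembly length held by the k longest scaffolds."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("assembly has no sequence")
    return sum(ordered[:k]) / total


# Toy scaffold lengths (made-up values, arbitrary units)
toy_lengths = [50, 30, 15, 3, 2]
top3 = fraction_in_largest(toy_lengths, 3)
print(f"{top3:.1%} of the toy assembly is in the 3 longest scaffolds")
```

With real data, the lengths would be read from the assembly FASTA or its index file, and `k` would be set to the expected chromosome count.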
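The blood cockle is one of the most important marine fishery resources across Asia. However, marine fishery production has declined sharply owing to industrialization, marine pollution, and global warming. To characterize the genetics of the blood cockle, a major fishery resource of the southern coast of Korea, we sequenced its whole genome and resolved its chromosome sequences. We assembled a 915.4 Mb genome and identified sequences for 19 chromosomes. We identified 25,134 genes in the blood cockle genome, assigned functions to 22,745 of them, and analyzed KEGG pathways for 4,014 genes. A gene gain and loss analysis, performed through comparative genomics of the blood cockle and eight other shellfish species, revealed the expansion of 725 gene families and the contraction of 479 gene families. The homeobox gene cluster of the blood cockle showed a gene organization that is well conserved within Lophotrochozoa. In addition, three hemoglobin genes of the blood cockle showed high similarity to the hemoglobins of the ark shell. The whole-genome information is expected to shed light on the genetic and physiological characteristics underlying environmental adaptation and evolution in the blood cockle, and to provide genetic information that can be widely used in the aquaculture industry to improve the efficiency of cockle farming.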
Isaria farinosa (Hypocreales, Ascomycota) is a cosmopolitan entomopathogenic fungus affecting a wide range of arthropod hosts. It has mainly been studied as an insecticidal agent to control agricultural pests. To investigate useful secondary metabolite (SM) genes in the Isaria farinosa C1012 strain, de novo assembly and genome mining were carried out. Whole-genome sequencing with the PacBio RSII system generated more than 4 Gb of reads, which were assembled into 16 contigs using the FALCON program. The total genome size was 33.36 Mb. The N50 and N90 were 6,686,213 bp and 1,912,865 bp, respectively. The assembled genome was analyzed with the antiSMASH3 program under default settings to localize the gene regions responsible for synthesizing SMs, such as non-ribosomal peptide synthetases (NRPS) and polyketide synthases (PKS). In this study, we predicted 16 NRPS, 13 PKS, and 9 PKS-NRPS hybrid gene clusters in the I. farinosa genome.
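For readers unfamiliar with the N50 and N90 statistics, the sketch below shows one common way to compute them: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches 50% (or 90%) of the assembly. The function name `nx` and the toy contig lengths are assumptions for illustration only; they are not the I. farinosa contigs.

```python
def nx(lengths, x):
    """Return the Nx statistic: the contig length at which the cumulative
    length of contigs, taken longest first, reaches x percent of the total."""
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("no contigs supplied")


# Toy contig lengths (made-up values); the total is 20
toy_contigs = [8, 5, 4, 2, 1]
n50 = nx(toy_contigs, 50)  # cumulative 8, 13: threshold 10 is reached at the 5 contig
n90 = nx(toy_contigs, 90)  # cumulative 8, 13, 17, 19: threshold 18 is reached at the 2 contig
print(n50, n90)
```

A larger N50 indicates that more of the assembly sits in long contigs, which is why it is usually reported alongside the contig count and total size.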
Chloroplasts are plant-specific organelles that have their own genome. Most plant chloroplast (CP) genomes are highly conserved in gene content and genome structure, and they are present in cells at high copy number. Taking advantage of this high copy number, we established in our laboratory a pipeline that assembles complete chloroplast sequences from a small amount of whole-genome resequencing data produced by NGS. CP genomes were assembled from resequencing data of 14 cabbage (Brassica oleracea L.) accessions produced on the Illumina HiSeq 2000 and compared to each other. Eighteen sequence-variant regions were detected, and 6 high-resolution melting (HRM) markers were developed. Approximately 1 Gb of whole-genome sequencing data for 10 Brassica rapa and 2 Brassica napus accessions was also obtained from the Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences. With these resequencing data, the CP genomes of all of these accessions were assembled. In total, 27 complete CP genomes of B. oleracea, B. rapa, B. napus, and brassico-raphanus, a novel allotetraploid species derived from B. rapa and Raphanus sativus, were compared at the sequence level. Phylogenetic analysis based on this comparison suggested that B. rapa could be the maternal species in the formation of the allotetraploids rapeseed and brassico-raphanus. Additionally, the CP genome of B. napus cv. M083 is closer to the B. rapa accessions than to the other B. napus accessions, suggesting that B. napus could have several different origins.