Naematelia sinensis (Jin Er mushroom) is a gelatinous fungus of significant economic and medicinal value indigenous to China. As an obligate parasite that requires Stereum hirsutum for basidiomata formation, it serves as a distinctive model for investigating fungal-fungal interactions. Despite its potential as a source of bioactive polysaccharides, the scarcity of genomic data has impeded research into its secondary metabolism, cultivation, and underlying molecular mechanisms. We employed PacBio HiFi long-read sequencing and Hi-C chromatin conformation capture technology to generate the first chromosome-scale genome assemblies for four distinct mating types of N. sinensis. Genome assembly, annotation, and subsequent comparative genomic, phylogenomic, and gene family evolution analyses were conducted using standard bioinformatics pipelines. The assembled genomes span 20.73–20.82 Mb, resolved into 12 chromosomes with a contig N50 of 1.75–1.76 Mb, a GC content of 51.83–54.58%, and a BUSCO completeness of 96.8–97.1%. Annotation identified 19,390–19,612 repetitive sequences and 6,558–6,578 protein-coding genes. Comparative analysis with the related species Tremella fuciformis revealed extensive chromosomal rearrangements and inversions. The tetrapolar mating system was characterized, with mating-type loci A and B located on separate chromosomes, and elevated polymorphism was observed at the B locus. Phylogenomic analysis estimated the divergence of the genus Naematelia at approximately 154.88 million years ago and indicated a closer relationship between N. sinensis and N. aurantialba than between N. sinensis and N. encephala. Gene family analysis highlighted frequent expansions and contractions throughout the evolutionary history of Naematelia, with an overall trend toward contraction. This study provides foundational genomic resources for N. sinensis. The high-quality genomes elucidate its genomic features, mating-system diversity, and evolutionary history. These resources will facilitate future research aimed at discovering bioactive compounds, developing cultivation strategies, and conducting broader studies on evolution in parasitic fungi.
Introduction
Fungi play pivotal roles in ecosystems, biotechnology, and medicine, encompassing diverse research areas from evolution to biochemistry (Runnel et al., 2025; Case et al., 2025). Edible and medicinal mushrooms have garnered considerable attention due to their nutritional value and pharmacological properties. Many basidiomycetes produce complex fruiting bodies (termed basidiomata in technical contexts) and bioactive compounds that are not only significant to mycologists but also relevant to pharmacology, food science, and materials science (Garg, 2025). Naematelia sinensis (Tang and Yang, 2024), a distinctive golden gelatinous mushroom, exemplifies this significance through its extensive use in traditional Chinese medicine and cuisine. Investigating N. sinensis can elucidate fungal parasitism, medicinal compound biosynthesis, and mating genetics, offering interdisciplinary value for drug development, cultivation optimization, and fungal evolutionary research (Gandía and Garrigues, 2024).
Within the phylum Basidiomycota, the genus Naematelia Fr. (family Naemateliaceae) encompasses a group of jelly fungi, many of which are known for their intricate parasitic lifestyles. A prime example is N. sinensis (Tang and Yang, 2024), a rare edible and medicinal fungus endemic to China, colloquially known as ‘golden ear’ or ‘Jin Er’ (Chang and Miles, 2004). This species is distinguished by its vibrant golden, gelatinous basidiomes, which develop only through a specialized parasitic interaction with its host, the wood-decaying fungus Stereum hirsutum (Willd.) Pers. (Tang and Yang, 2024). The taxonomic placement of N. sinensis has been a subject of historical debate; it was once misidentified as Tremella mesenterica Retz ex Fr. (Luo et al., 1987) and later described as Tremella aurantialba (Bandoni and Zang, 1990). Subsequent multi-locus phylogenomic analyses placed it within the Naemateliaceae, renaming it Naematelia aurantialba (Bandoni and Zang, 1990) Millanes and Wedin (Liu et al., 2015). Most recently, a comprehensive study integrating morphology, phylogeny, and geography established the cultivated ‘Jin Er’ as a distinct species, N. sinensis (Tang and Yang, 2024).
N. sinensis is highly valued for its dual role as a gourmet food and a source of traditional medicine. Its purported health benefits, including antioxidant, anti-inflammatory, anti-tumor, and immunomodulatory activities, are largely attributed to a rich array of bioactive compounds such as polysaccharides, active proteins, terpenoids, phenolic acids, and flavonoids (Bandoni and Zang, 1990; Islam et al., 2016; Fan et al., 2017; Du et al., 2018; Liu et al., 2019). Polysaccharides, in particular, are a major component, constituting up to 74% of the dry weight of its basidiomes (Zhou et al., 2015). These properties have spurred its use in pharmaceuticals, functional foods, and high-end cosmetics (Yan et al., 2022). Despite significant progress in understanding its cultivation and medicinal properties (Wang et al., 2025), research has been critically hampered by the lack of a high-quality genomic reference. Previously available genomes for closely related taxa, such as N. aurantialba, are highly fragmented, consisting of hundreds of contigs and scaffolds (Sun et al., 2021; Shen et al., 2024). Such fragmented assemblies lack the completeness and contiguity required for advanced genetic studies, including comparative genomics, gene family evolution, and the identification of genes governing important traits (Wang et al., 2021).
The rapid evolution of sequencing technologies, particularly the combination of Pacific Biosciences (PacBio) single-molecule high-fidelity (HiFi) long reads and high-throughput chromosome conformation capture (Hi-C), now makes it possible to construct complete, chromosome-level genomes. This strategy has been successfully applied to assemble the genomes of numerous other economically important mushrooms, including Agaricus bisporus (J.E. Lange) Imbach (Sonnenberg et al., 2020), Lentinula edodes (Berk.) Pegler (Gao et al., 2022; Yu et al., 2022), Sparassis latifolia Y.C. Dai and F. Wu (Yang et al., 2021), Wolfiporia hoelen (Rumph.) Ryvarden and Gilb. (Li S. J. et al., 2022), Stropharia rugosoannulata Farl. ex Murrill (Li S. W. et al., 2022), Dictyophora rubrovolvata (M. Zang, D.G. Ji and S. Xu) D.G. Ji, S. Xu and R.H. Petersen (Ma et al., 2023), and Tremella fuciformis Berk. (Deng et al., 2023; Li et al., 2023). For N. sinensis, which exhibits a tetrapolar mating system common in Basidiomycota, this approach is particularly powerful. This system is governed by two unlinked mating-type (MAT) loci—one encoding homeodomain (HD) transcription factors and the other encoding pheromones and their receptors (P/R)—which control sexual compatibility between monokaryotic individuals (Sun et al., 2019; Cao et al., 2024; Shen et al., 2024). Assembling the genomes of multiple monokaryons representing different mating types provides a unique opportunity to investigate the genetic basis of this fundamental biological process.
The absence of a high-quality reference genome has been a significant barrier to understanding the fundamental biology and advancing the biotechnological application of N. sinensis. Therefore, the primary objective of this study was to generate the first chromosome-level genome assemblies for this important fungus. To achieve this, we addressed two central research questions: (1) What is the complete genomic architecture of N. sinensis at the chromosome level, and what structural and gene-level variations exist among its different mating-type monokaryons? (2) What is the evolutionary history of the Naematelia genus, and what genomic signatures underlie the unique parasitic lifestyle and metabolic capabilities of N. sinensis? In line with these questions, we formulated two key hypotheses. First, we hypothesized that the integration of PacBio HiFi long-read sequencing and Hi-C scaffolding would yield highly contiguous, chromosome-level assemblies for four distinct monokaryotic strains, enabling the detailed characterization of genomic structure, including the complex mating-type loci and variations in gene families such as Carbohydrate-Active enZymes (CAZymes) and Cytochrome P450s. Second, we hypothesized that comparative and evolutionary genomic analyses would robustly resolve the phylogenomic position of N. sinensis and reveal specific gene family expansions or contractions associated with its mycoparasitic nature and the biosynthesis of valuable secondary metabolites. This study provides a foundational genomic resource that will accelerate research into the genetics, evolution, and breeding of N. sinensis and its relatives.
Materials and methods
Fungal strains and culture conditions
The mature basidiomes of the Jin Er commercial cultivar were supplied by Yunnan Junshijie Biotechnology Co., Ltd., based in Yunnan, China. The dikaryotic strains of Naematelia sinensis (JSJ-J2F1001C) and Stereum hirsutum (JSJ-JEGJ-X18) were isolated from the basidiomes of the Jin Er cultivar and deposited in the China General Microbiological Culture Collection Center (CGMCC) under accession numbers CGMCC 41096 and CGMCC 19659, respectively. Monokaryotic basidiospores for genome sequencing were isolated from mature basidiomes using the spore-ejection method (Choi et al., 1999). Specifically, the basidiomes were surface-sterilized with 75% ethanol and suspended over a conical flask for 2 days to allow spore ejection. The ejected basidiospores were subsequently plated onto Potato Dextrose Agar (PDA) medium after gradient dilution. Single colonies were selected and preserved on agar slants at 4 °C. The number of nuclei and the microscopic morphology of yeast-like cells were examined using fluorescence microscopy (ECLIPSE Ts2R-FL, Nikon, Japan) and an optical microscope (Eclipse 80i, Nikon, Tokyo, Japan), respectively. Four monokaryotic strains (NS-27, NS-29, NS-45, and NS-58), characterized by distinct mating factors, were identified and screened as genomic sequencing materials following the methodology described by Shen et al. (2023). To obtain sufficient cell biomass for genomic DNA extraction, these strains were cultured on solid PDA medium (200 g potato, 20 g dextrose, 18 g agar, 1,000 mL water) at 22 °C for 15 days. The induction medium (IDM) was used primarily to promote mycelial formation after two compatible monokaryotic basidiospore strains were combined and to observe clamp connection formation. The preparation protocol for the induction medium is detailed in Cao et al. (2022). All reagents and chemicals were procured from Sigma-Aldrich (St. Louis, MO, USA).
Genomic DNA extraction
After collection from the PDA solid medium, four monokaryotic basidiospores with distinct mating types were frozen and ground in liquid nitrogen. High-quality genomic DNA was extracted using a modified cetyltrimethylammonium bromide (CTAB) method optimized for fungal tissue (Doyle and Doyle, 1987). Approximately 0.2 g of frozen yeast-like cells was ground to a fine powder in liquid nitrogen using a mortar and pestle. The powdered tissue was transferred to a 50 mL Falcon tube and mixed with 20 mL of CTAB extraction buffer (2% CTAB, 100 mM Tris–HCl pH 8.0, 20 mM EDTA, 1.4 M NaCl, 1% PVP-40, and 0.2% β-mercaptoethanol). The mixture was incubated at 65 °C for 60 min with occasional gentle inversion. After incubation, the sample was extracted with an equal volume of chloroform:isoamyl alcohol (24:1) by inverting the tube gently for 10 min. The mixture was centrifuged at 12,000 rpm for 15 min at 4 °C to separate the phases. The upper aqueous phase was transferred to a new 50 mL tube, and an equal volume of isopropanol was added to precipitate the DNA. The DNA was spooled out with a glass rod and washed with 70% ethanol. The DNA pellet was air-dried and resuspended in 500 μL of TE buffer (10 mM Tris–HCl, 1 mM EDTA, pH 8.0). RNA was removed from the samples using RNase A (Leagene, Beijing, China) at a concentration of 10 μg/mL. The quality and quantity of the extracted DNA were assessed using a NanoDrop 2000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA), the Qubit dsDNA HS Assay Kit on a Qubit 3.0 Fluorometer (Life Technologies, Carlsbad, CA, USA), and electrophoresis on a 0.8% agarose gel.
Genome sequencing and assembly
The extracted genomic DNA was used to construct both PacBio long-read and Illumina short-read sequencing libraries. Short-read libraries with an insert size of 300–400 bp were prepared by the Beijing Genomics Institute (BGI) using the Optimal DNA Library Prep Kit (Shenzhen, China). The quality and insert size of the libraries were assessed using the Agilent Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). Subsequently, the libraries were sequenced on the DNBSEQ-G400 platform to generate paired-end sequence data. For each strain, a high-fidelity (HiFi) long-read library was prepared using the Pacific Biosciences (PacBio) SMRTbell Express Template Prep Kit 2.0. Genomic DNA was sheared to an average size of ~30 kb using a g-TUBE device (Covaris). Size-selected DNA fragments were end-repaired, A-tailed, and ligated with PacBio SMRTbell adapters. The ligated products were size-selected again to enrich for fragments of the desired size (approximately 20–30 kb). Each library was quantified using qPCR with the SMRTbell Template Quantification Kit 2.0 and diluted to a concentration of 10 nM. The libraries were then loaded onto a PacBio Sequel IIe instrument and sequenced with P6-C4 chemistry to generate long reads. Approximately 100× coverage of high-quality sequencing data was obtained for each strain, with an average read length of ~20 kb.
The raw BGI short reads were preprocessed using SOAPnuke (version 2.1.0) (Chen et al., 2018), with the following parameters: “-n 0.001 -l 20 -q 0.4 --adaMis 3 --rmdup --minReadLen 150”. Subsequently, the genome size and heterozygosity were estimated using GenomeScope 2 with a k-mer size of 21 (Ranallo-Benavidez et al., 2020). The raw sequencing data generated by the PacBio Revio platform underwent quality control via SMRT Link (version 11.1.0) (Chin et al., 2013), during which low-quality reads (shorter than 500 bp) and adapter sequences were removed to produce high-quality subreads. After obtaining HiFi reads, two independent de novo assemblies were generated for comparative analysis: one using Hifiasm v. 0.14-r312 (Cheng et al., 2021) with default parameters, and the other using NextDenovo v2.5.0 with default parameters. Finally, Pilon (version 1.18) (Walker et al., 2014) was employed to polish the assembly using the short reads.
Based on the evolutionary information of single-copy orthologous genes across all fungal species, the Benchmarking Universal Single-Copy Orthologs (BUSCOs, http://busco.ezlab.org, version 5.2.2, basidiomycota_odb10) (Simão et al., 2015) was employed to evaluate the integrity of the genome assembly. Additionally, Merqury (version 1.3) (Rhie et al., 2020) was utilized to assess the k-mer completeness and the consensus quality value (QV) of the genome assembly.
Hi-C sequencing and analysis
Hi-C libraries were constructed following a standard in situ protocol. Intact nuclei were cross-linked with 1% formaldehyde to preserve three-dimensional chromatin contacts. Following DpnII digestion, fragment ends were repaired, biotinylated, and proximity-ligated using T4 DNA ligase to generate chimeric circles. Reverse cross-linking, shearing to 300–500 bp, and streptavidin-mediated enrichment of biotin-tagged junctions were performed prior to adapter ligation and limited-cycle PCR. The resulting libraries were quantified, size-verified, and subjected to paired-end 150-bp sequencing on the DNBSEQ-G400 platform (MGI, Wuhan, China) to achieve approximately 100× physical coverage. Raw reads were filtered using SOAPnuke (version 2.1.0). Hi-C clean reads were aligned to the reference genome using Juicer (version 1.6) (Durand et al., 2016a), and preliminary clustering and orientation of the data were performed using 3D-DNA (version 180922) (Dudchenko et al., 2017). Valid data obtained after alignment by Juicer were further processed using 3D-DNA and Juicebox (version 1.11.08) (Durand et al., 2016b) for automated clustering, sorting, and orientation, with visual error correction.
Genomic component prediction and functional annotation
RepeatMasker (version 4.1.1) (Flynn et al., 2020) was employed for the prediction of repetitive sequences, while tRNAscan-SE (version 1.3.1) (Chan et al., 2021) and Rfam (Kalvari et al., 2021) were utilized for the identification of tRNA and rRNA sequences. To predict coding genes, a combination of Augustus (version 3.4.0) (Stanke and Morgenstern, 2005), glimmerHMM (version 3.0.1) (Majoros et al., 2004), GeneMark-ES (version 4.3.5) (Majoros et al., 2004), exonerate v2.2.0, PASA, and EvidenceModeler (Haas et al., 2008) was applied using three distinct strategies: (a) de novo prediction, (b) homology-based search, and (c) transcriptome-assisted annotation. Functional annotations of the predicted genes were performed against multiple databases, including Gene Ontology (GO) (Consortium, 2019), Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa and Goto, 2000), NCBI Non-redundant protein database (NCBI nr), SwissProt, and Carbohydrate-Active enzymes (CAZymes) (Cantarel et al., 2009), utilizing BLAST+ (Camacho et al., 2009) and DIAMOND (Buchfink et al., 2015). The E-value threshold was set to < 1 × 10⁻⁵, and the minimal alignment length percentage was 40%. Additionally, DIAMOND 2.9.0 (E-value < 1 × 10⁻⁵) in conjunction with the HMMER package was used for the prediction of P450s and the annotation of target protein sequences. Reference P450 sequences for cluster analysis were retrieved from the Fungal Cytochrome P450 Database.
Identification of MAT genes
Based on gene function annotation information retrieved from databases such as Gene Ontology (GO), KOG, and Swiss-Prot, the HD1 and HD2 gene sequences were identified, and the position of the A mating-type locus was determined. According to the gene function annotation results, the pheromone receptor gene (STE3) and the pheromone precursor gene (phB) associated with the B (P/R) mating-type locus were further investigated. Given that pheromone precursors are typically located within approximately 10 kb of the pheromone receptor genes, the location of the pheromone receptor gene was established. Subsequently, to identify pheromone precursors, the 10-kb flanking regions upstream and downstream of each STE3 receptor gene were extracted and analyzed for open reading frames (ORFs) using the NCBI Open Reading Frame Finder. The predicted ORFs were then translated in silico and filtered for peptides that (i) terminate with a CaaX prenylation motif, where C represents cysteine, the two central residues (a) are aliphatic (A, V, L, I, G), and the C-terminal residue (X) is A, S, M, Q, or C, and (ii) contain the conserved N-terminal signatures AF and ER (Clarke, 1992; Bao, 2019; Shen et al., 2024). Collinearity analysis of the upstream and downstream genes at the mating-type loci A and B in N. sinensis was conducted using ChromoMapper software.
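The peptide filter described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function name, the example peptides, and the choice to look for AF and ER anywhere upstream of the CaaX motif (rather than at fixed N-terminal positions) are assumptions made for the example.

```python
import re

# CaaX as defined in the text: C, two aliphatic residues (A, V, L, I, G),
# then a C-terminal residue drawn from A, S, M, Q, or C.
CAAX_RE = re.compile(r"C[AVLIG]{2}[ASMQC]$")


def is_candidate_precursor(peptide: str) -> bool:
    """Return True if a translated ORF passes both filters (hypothetical helper)."""
    seq = peptide.upper().rstrip("*")  # drop a trailing stop-codon symbol, if any
    if not CAAX_RE.search(seq):
        return False
    upstream = seq[:-4]  # everything before the four-residue CaaX motif
    # Assumption: the AF and ER signatures may occur anywhere upstream of CaaX.
    return "AF" in upstream and "ER" in upstream


# Hypothetical peptides, invented for illustration only.
examples = ["MDAFTSLERTPVNGGCVIA", "MDAFTSLERTPVNGGCKIA", "MDTSLERTPVCVIA"]
candidates = [p for p in examples if is_candidate_precursor(p)]
```

Anchoring the regular expression with `$` restricts the match to the C-terminus, which is where a prenylation motif has to sit.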
Comparative genomics analysis
The all-versus-all BLASTP method (E-value < 1 × 10⁻⁵) was employed to identify orthologous genes across four N. sinensis genomes. Single nucleotide polymorphisms (SNPs) and insertions/deletions (InDels) were detected based on genomic alignment results of shared genes between the following strain pairs: NS-27 and NS-45, as well as NS-29 and NS-58. These analyses were conducted using the MUMmer (Kurtz et al., 2004) and LASTZ (Altschul et al., 1990; Chiaromonte et al., 2002) tools. To detect genomic structural variation, we conducted a genome-wide collinearity analysis. Syntenic paralogous blocks were identified using MCSCAN (Wang et al., 2012) by comparing our genomes with the publicly available genomes of N. sinensis strain NX-20 and T. fuciformis strains TWW01-AX and Tr01. All these reference genomes correspond to monokaryotic isolates with complete gene annotations deposited in NCBI. To identify strain-specific genes, we first extracted coding sequences that lacked aligned counterparts between pairs of strains. Each putative unique gene set was then validated at the protein level to exclude assembly or annotation artefacts. Briefly, strain-specific peptides were retrieved with seqkit v2.3.0, reciprocal BLASTP (E-value ≤ 1 × 10⁻⁵) was performed against the proteome of every partner strain, and proteins with one or more hits were discarded. Only proteins with zero reciprocal matches were retained as bona fide strain-specific genes. KEGG and GO enrichment analyses were performed using the OmicShare tools. Enrichment of pathways and GO terms among rearranged genes, relative to the syntenic genomes, was assessed with a hypergeometric test. The resulting p-values were adjusted using false discovery rate (FDR) correction, and pathways and GO terms with an FDR ≤ 0.05 were designated as significantly enriched.
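The enrichment test at the end of this section rests on two standard calculations: an upper-tail hypergeometric p-value for each term and a multiple-testing adjustment. The sketch below uses only the Python standard library and assumes the Benjamini-Hochberg procedure for the FDR step, which the text does not specify. The gene counts in the example are hypothetical, and the function names are illustrative rather than part of the OmicShare tools.

```python
from math import comb


def hypergeom_sf(k: int, N: int, K: int, n: int) -> float:
    """P(X >= k): chance of seeing at least k annotated genes when n genes
    are drawn from a background of N genes, K of which carry the term."""
    total = comb(N, n)
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Return BH-adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 6,000 background genes, 100 annotated with a term,
# 200 rearranged genes of which 10 carry the term.
p_term = hypergeom_sf(k=10, N=6000, K=100, n=200)

# Hypothetical raw p-values for four terms; keep those with FDR <= 0.05.
raw = [0.01, 0.04, 0.03, 0.2]
fdr = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(fdr) if q <= 0.05]
```

Exact integer binomials keep the tail sum free of floating-point underflow, which is affordable at fungal-genome gene counts.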
Phylogenomic analysis
To explore the evolutionary dynamics of N. sinensis, the genome sequences of 19 additional fungal species were downloaded from NCBI for phylogenomic analysis (Supplementary Table S13). The single-copy orthologous genes of these 23 species were identified using OrthoFinder (Steve Kelly Lab, Oxford, UK) (Emms and Kelly, 2015). Subsequently, the amino acid sequences of these single-copy genes were aligned with MAFFT (version 7.505) using default parameters (Madeira et al., 2022), followed by the screening of conserved regions with Gblocks (Castresana, 2000) and concatenation into supergenes via Phylosuite (Zhang et al., 2020). Maximum likelihood phylogenomic trees were constructed using RAxML (version 8.2.12) with the LG + I + G4 + F model, and statistical support values were obtained using nonparametric bootstrapping with 1,000 replicates (Stamatakis, 2014). Divergence times were estimated using the MCMCTree module in the PAML software package with the following parameters: clock = 3 (correlated rates); ndata = 1; seqtype = 2; model = 0 (JC69); aaRatefile = wag.dat; burnin = 1,000,000; sampfreq = 10; nsample = 500,000 (Sanderson, 2003). For molecular clock calibration, the divergence time between the two outgroup fungi, Ustilago hordei (Ustilaginomycotina) and Wallemia mellicola (Wallemiomycotina), was used as a calibration constraint, with an estimated range of 406–501 million years ago (Mya) obtained from the TimeTree database. The expansion and contraction of gene families across the 23 fungal species were predicted using the Computational Analysis of gene Family Evolution (CAFE version 4.2.1) software (De Bie et al., 2006) with the following parameters: a cut-off p-value of 0.05, 1,000 random samples, and the lambda value used to calculate birth and death rates.
Count of chromosome number in N. sinensis
Monokaryotic basidiospores were harvested during the logarithmic growth phase and subsequently treated with 5 g/L colchicine for 2.5 h. The cells were then incubated at 4 °C for 3 h, followed by fixation with formalin at room temperature overnight. After fixation, the cells were stained with 20 mg/L DAPI and analyzed using confocal microscopy (Carl Zeiss AG, Oberkochen, Germany).
Results
Screening and analysis of N. sinensis sequencing strains
Mature fruit bodies of N. sinensis were successfully obtained from the N. sinensis strain JSJ-J2F1001C, kindly provided by Yunnan Junshijie Biotechnology Co. Ltd. (Figure 1A). Basidiospores were harvested using a standardized basidiospore collection method, and yeast-like conidia were subsequently cultured on PDA solid medium (Figure 1B). These conidia underwent asexual reproduction via budding (as indicated by the red arrows in Figure 1D). Fluorescence microscopy confirmed that the basidiospores of N. sinensis were monokaryotic (Figure 1C). Based on preliminary compatibility analyses among different monokaryotic strains of N. sinensis (Cao et al., 2024), four strains with distinct mating factors (NS-27, NS-29, NS-45, and NS-58) were selected for subsequent genome sequencing. To validate the compatibility and mating types of these strains, pairwise mating experiments were performed, demonstrating that the combinations of NS-27 with NS-45 and NS-29 with NS-58 resulted in sexually compatible hybrids (Supplementary Table S1). After cultivating these two compatible hybrid strains in an induction medium for 15 days, faint mycelial growth was observed (Figure 1E, indicated by blue arrows), whereas incompatible strains exhibited no mycelial growth (Figure 1E, indicated by green arrows). Optical microscopy revealed clamp connections within the mycelium (Figure 1F, indicated by red arrows). Mating type primers (Shen et al., 2023) were employed to determine the mating types of the four strains, confirming that the mating types of NS-27, NS-29, NS-45, and NS-58 were A1B1, A1B2, A2B2, and A2B1, respectively (Figure 1G). Furthermore, DNBSEQ-G400 sequencing was utilized for genomic analysis. As shown in Supplementary Figures S1A–D, the genome sizes of the four N. sinensis strains ranged from approximately 21 to 22 Mb, with a heterozygosity ratio of 0.23 to 0.27% at a k-mer value of 21. Collectively, these findings confirmed the successful isolation of four monokaryotic strains with distinct mating factors, which were suitable for further monokaryotic genome sequencing and assembly.
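The observed compatible pairs follow directly from the standard rule for tetrapolar systems: two monokaryons are compatible only if they carry different alleles at both the A and B loci. The short Python sketch below applies that rule to the mating types reported above; the dictionary and function names are illustrative.

```python
from itertools import combinations

# Mating types of the four sequenced strains, as reported in the text.
MATING_TYPES = {
    "NS-27": ("A1", "B1"),
    "NS-29": ("A1", "B2"),
    "NS-45": ("A2", "B2"),
    "NS-58": ("A2", "B1"),
}


def compatible(strain_x: str, strain_y: str) -> bool:
    """Tetrapolar rule: alleles must differ at both the A and the B locus."""
    (a_x, b_x), (a_y, b_y) = MATING_TYPES[strain_x], MATING_TYPES[strain_y]
    return a_x != a_y and b_x != b_y


compatible_pairs = [
    pair for pair in combinations(MATING_TYPES, 2) if compatible(*pair)
]
# compatible_pairs == [("NS-27", "NS-45"), ("NS-29", "NS-58")]
```

Of the six possible pairings, only the two that differ at both loci pass, which matches the pairwise mating experiments.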

Figure 1. Phenotype, microscopic morphology, and mating types of the four N. sinensis strains. (A–F) The fruiting body of N. sinensis (A), cultivation of basidiospores on PDA (B), nuclei observed in basidiospores (C), microscopic morphology of yeast-like cell reproduction in basidiospores (D), morphology of hybrid strains on induction medium (E), and clamp connections in the mycelium of hybrid strains (F). Scale bars: 2 cm (A,B,E), 20 μm (C,D,F). (G) Mating type identification of the four strains.
Genome sequencing and assembly
In this study, we assembled the genomes of four monokaryotic strains of N. sinensis using Hifiasm and NextDenovo. Hifiasm generated assemblies with a higher number of contigs (ranging from 55 to 103) and variable genome sizes (from 18.76 to 20.40 Mb), with an N50 ranging from 1,762,575 to 2,341,609 bp. In contrast, NextDenovo produced fewer contigs (13 per strain), with more consistent genome sizes (20.73 to 20.82 Mb) and an N50 ranging from 1,749,571 to 1,765,968 bp (Supplementary Table S2). Overall, NextDenovo exhibited superior performance compared to Hifiasm in assembling the N. sinensis genome.
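For readers less familiar with the contiguity metric used in this comparison, N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly size. A minimal Python sketch follows; the function name and the toy contig lengths are illustrative and are not taken from the assemblies.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # The first contig that brings the running sum to half the total is N50.
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths supplied")


# Toy example: total length is 30; the running sum reaches 15 at the second
# contig (10 + 8 = 18), so the N50 is 8.
toy_n50 = n50([10, 8, 6, 4, 2])
```

Because N50 depends on the length distribution as well as the contig count, an assembly with more contigs can still report a higher N50, as the Hifiasm results show.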
Each of the four N. sinensis assemblies comprised 13 contigs, two fewer than the 15 reported for the reference strain NX-20. The genome sizes and lengths of the largest contigs were comparable among all five strains, averaging approximately 21 Mb and 2.54 Mb, respectively. The average N50 of the new assemblies was 1,758,927 bp, which is approximately 55 kb shorter than that of NX-20. The GC content averaged 53.65%. The annotation of NX-20 combined de novo (Augustus) and homology-based evidence from N. encephala but did not incorporate RNA-seq data. In contrast, the four genomes reported here were annotated using an integrative pipeline that incorporated comprehensive transcriptome evidence (see Materials and methods). Consequently, the new assemblies contain an average of 6,564 predicted protein-coding genes, which is 704 more than the 5,860 genes reported for NX-20, and the total gene length accounts for 48.24% of the genome, approximately 5.4 percentage points higher than in NX-20 (42.81%). These increases likely reflect the enhanced sensitivity provided by transcriptome support rather than true biological expansion (Table 1).
| Feature | NS-27 | NS-29 | NS-45 | NS-58 | Average value | NX-20 |
|---|---|---|---|---|---|---|
| Genome size (Mb) | 20.73 | 20.80 | 20.77 | 20.82 | 20.78 | 20.99 |
| Number of contigs | 13 | 13 | 13 | 13 | 13 | 15 |
| Max length (bp) | 2,543,173 | 2,539,498 | 2,537,851 | 2,554,102 | 2,543,656 | 2,546,384 |
| Contig N50 (bp) | 1,757,723 | 1,765,968 | 1,762,447 | 1,749,571 | 1,758,927 | 1,814,705 |
| GC content (%) | 54.58 | 54.49 | 51.83 | 53.69 | 53.65 | 56.42 |
| Number of genes | 6,578 | 6,558 | 6,560 | 6,562 | 6,564 | 5,860 |
| Average gene length (bp) | 2467.37 | 2459.77 | 2451.45 | 2463.23 | 2460.45 | - |
| Average CDS length (bp) | 1602.73 | 1602.44 | 1600.82 | 1599.81 | 1601.45 | 1,534 |
| Average number of exons per gene | 6.83 | 6.82 | 6.82 | 6.63 | 6.775 | - |
| Average exon length (bp) | 234.71 | 234.91 | 234.64 | 234.31 | 234.64 | - |
| Average intron length (bp) | 98.52 | 98.37 | 96.41 | 99.02 | 98.08 | - |
| Gene total length (bp) | 10,542,758 | 10,505,916 | 10,502,560 | 10,499,200 | 10,512,609 | 8,989,977 |
| Gene length/Genome (%) | 48.50 | 48.17 | 48.22 | 48.09 | 48.24 | 42.81 |
| Reference | This study | This study | This study | This study | | Sun et al. (2021) |

Table 1. Statistics of N. sinensis genome assembly and gene prediction.
Subsequently, BUSCO analysis and Merqury evaluation were performed to assess the quality of the genome assemblies of the four N. sinensis strains. A total of 1,764 BUSCOs v5.2.2 (basidiomycota_odb10) were identified across the four genome assemblies, with the complete BUSCO rates being 97.10, 97.00, 96.90, and 96.80% for NS-27, NS-29, NS-45, and NS-58, respectively (Supplementary Table S3). When the same assemblies were re-evaluated with BUSCOs v6.0.0 against the updated basidiomycota_odb12 lineage set (2,409 BUSCO groups), completeness remained consistently high (96.1–96.3%), while the proportion of duplicated BUSCOs was negligible (< 0.1%) and missing genes accounted for only 2.9–3.1%, confirming the robustness and completeness of the four genome assemblies across different BUSCO versions and lineage databases (Supplementary Table S3). The Merqury evaluation revealed k-mer completeness ranging from 99.62 to 99.75%, with consensus quality values (QV) ranging from 54.72 to 57.51, confirming a substantial improvement in both the accuracy and completeness of the four genome assemblies (Supplementary Table S4). Comparison with the genome assemblies of Tremellaceae species available in the NCBI database demonstrated that our assembly results surpass those of these species (Table 2).
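To put the consensus quality values in context, a Phred-scaled QV corresponds to a per-base error probability of 10^(-QV/10). The short Python sketch below converts the reported QV range into expected errors per megabase; the function names are illustrative, and the conversion is the general Phred definition rather than anything specific to this study.

```python
def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def errors_per_mb(qv: float) -> float:
    """Expected number of consensus errors per one million bases."""
    return qv_to_error_rate(qv) * 1_000_000


# QV range reported for the four assemblies (Supplementary Table S4).
low_qv_errors = errors_per_mb(54.72)   # lowest reported QV
high_qv_errors = errors_per_mb(57.51)  # highest reported QV
```

Under this definition, the reported QV values correspond to only a few expected consensus errors per megabase.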
| Species | NCBI BioProject | Total length (Mb) | GC% | Contigs | N50 length (bp) | BUSCOs | Fragmented | Missing |
|---|---|---|---|---|---|---|---|---|
| N. sinensis NS-27 | PRJNA1264744 | 20.73 | 54.58 | 13 | 1,757,723 | 97.10% | 1.0% | 1.9% |
| N. sinensis NS-29 | PRJNA1264744 | 20.80 | 54.49 | 13 | 1,765,968 | 97.00% | 1.0% | 2.0% |
| N. sinensis NS-45 | PRJNA1264744 | 20.77 | 51.83 | 13 | 1,762,447 | 96.90% | 1.1% | 2.0% |
| N. sinensis NS-58 | PRJNA1264744 | 20.99 | 53.69 | 13 | 1,814,705 | 96.80% | 1.1% | 2.1% |
| N. aurantialba NX-20 | PRJNA772294 | 20.99 | 46.8 | 15 | 1,825,336 | 92.0% | 1.4% | 6.6% |
| N. encephala 68–887.2 | PRJNA330699 | 19.79 | 49.3 | 151 | 209,500 | 85.5% | 3.4% | 11.1% |
| T. mesenterica DSM 1558 | PRJNA225529 | 28.64 | 46.8 | 484 | 123,767 | 92.0% | 1.4% | 6.6% |
| T. fuciformis Tr26 | PRJNA281519 | 23.64 | 57.0 | 3,502 | 18,448 | 92.4% | 1.4% | 6.2% |
| T. fuciformis TWW01-AX | PRJNA924222 | 28.4 | 56.5 | 19 | 228,235 | 93.1% | 0.6% | 6.3% |

Table 2. Assembly summary statistics compared to other mushrooms of Tremellales.
High-throughput chromosome conformation capture (Hi-C), a massively parallel DNA sequencing technique, facilitates the generation of chromosome-length scaffolds and highly contiguous genome assemblies. In this study, we employed Hi-C technology to enhance genome assembly by leveraging paired-end sequencing data obtained from the DNBSEQ platform. Following quality filtering and removal of duplicate reads, approximately 2.44–2.