Genome sequence of the white-rot fungus Irpex lacteus F17, a type strain of lignin degrader fungus
Yao et al. Standards in Genomic Sciences
Mengwei Yao 0 1
Wenman Li 0 1
Zihong Duan 0 1
Yinliang Zhang 0 1
Rong Jia 0 1
0 Anhui Key Laboratory of Modern Biomanufacturing, Anhui University, Hefei 230601, People's Republic of China
1 School of Life Sciences, Economic and Technology Development Zone, Anhui University, 111 Jiulong Road, Hefei, Anhui 230601, People's Republic of China
Irpex lacteus, a cosmopolitan white-rot fungus, degrades lignin and lignin-derived aromatic compounds. In this study, we report the high-quality draft genome sequence of I. lacteus F17, isolated from a decaying hardwood tree in the vicinity of Hefei, China. The genome is 44,362,654 bp, with a GC content of 49.64% and a total of 10,391 predicted protein-coding genes. In addition, a total of 18 snRNA, 842 tRNA, 15 rRNA operons and 11,710 repetitive sequences were also identified. The genomic data provides insights into the mechanisms of the efficient lignin decomposition of this strain.
Short genome report; Genome sequence; Irpex lacteus F17; White-rot fungus; Hardwood tree; Lignin decomposition
Irpex lacteus, a white-rot fungus with biotechnological
potential, is currently considered the most important
lignocellulose-degrading organism because of its
potential to degrade lignin and bioremediate other
lignin-related pollutants (such as industrial dyes and aromatic
compounds) [
]. Lignocellulose, which is the most
abundant renewable biomass in terrestrial environments,
is composed of three major components: cellulose,
hemicellulose, and lignin [
]. Among them, lignin is a
highly irregular and heterogeneous biopolymer, which
makes it recalcitrant to degradation. Compared with
other wood-decay fungi, I. lacteus plays an important
role in the efficient enzymatic conversion of renewable
biomass, and it shows remarkable resistance to pollutant
toxicity in water and soil environments [
]. I. lacteus is
known to remove various aromatic compounds,
including endocrine disruptors, synthetic dyes, and polycyclic
aromatic hydrocarbons [
], and it can also be used to
obtain ethanol via the biological pre-treatment of
I. lacteus is a cosmopolitan species that is widespread
in Europe, North America, and Asia [
]. The fungus
produces hydrolases, such as exo- and endo-cellulases,
and extracellular oxidative enzymes, such as LiP and MnP, as
well as Lac [
], thereby showing a pattern of
ligninolytic enzymes that is typical of white-rot fungi.
Starting in the 1960s, several studies by Japanese researchers
mainly focused on the activities of the exo- and
endo-cellulases, as well as an exo-cellulase gene, from I.
lacteus. Subsequently, the LiP and MnP of I. lacteus
were isolated and characterized, and the
biotechnological applicability of this fungus has drawn
considerable interest in recent years [
]. Recently, we
degraded and detoxified synthetic dyes using
manganese peroxidase isolated from I. lacteus F17 [
]. However, the genome sequence of I. lacteus has not
been reported. Thus, genomic data for I. lacteus are
needed to elucidate the ligninolytic potential
of this type strain of white-rot fungi. Here, the genome
sequence of I. lacteus F17 is presented. To the best of
our knowledge, this is the first high-quality draft genome
sequence of I. lacteus.
Classification and features
The sequenced strain of I. lacteus F17 was isolated from
a decaying hardwood tree in May 2009 in the vicinity of
Hefei, China (Table 1). Figure 1a shows the growth
of I. lacteus F17 cultured on PDA
medium (200 g/L of potato extract, 20 g/L of glucose,
and 20 g/L of agar) for 5 days at 28 °C. The strain grew
rapidly and formed a white colony with a diameter of
6.8 cm. The micrograph of I. lacteus F17 mycelia grown
on PDA for 3 days was obtained with an Olympus BX51 microscope
(Fig. 1b). The mycelia were picked from an agar plate
with fine tweezers, mounted on glass slides, and then
stained with a fungal staining
solution (lactic acid 10 mL, carbolic acid 10 g, glycerol
20 mL, cotton blue 0.02 g, distilled water 10 mL) for
light microscopic examination (400×).
I. lacteus F17 is a eukaryote in the kingdom
Fungi and belongs to the family Polyporaceae, order
Polyporales, class Basidiomycetes, phylum Basidiomycota.
Several other white-rot fungi with important biological
functions are members of the Polyporales, including
Phanerochaete chrysosporium, Dichomitus squalens, Trametes
versicolor, Polyporus brumalis, and Ceriporiopsis
subvermispora. I. lacteus F17 was identified and classified
based on its internal transcribed spacer (ITS) region in our
previous study [
]. The 18S rRNA gene data of I. lacteus
F17 and several other Polyporales species were aligned
using ClustalW [
]. Phylogenetic analysis based on the
neighbor-joining method was performed using the
MEGA6 package [
]. The confidence levels for the
individual branches were determined by bootstrap analysis
with 1000 replicates. The final phylogenetic tree was
visualized with TreeView [
]. I. lacteus F17 is phylogenetically
closely related to C. subvermispora (Fig. 2).
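The bootstrap step described above can be illustrated with a minimal Python sketch. It resamples alignment columns with replacement and recomputes pairwise p-distances, the kind of distance matrix a neighbor-joining routine takes as input. The toy alignment, the function names, and the choice of p-distance are assumptions made for illustration; the study itself used ClustalW and MEGA6, whose internals are not reproduced here.

```python
import random


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences, skipping gap columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Pairwise p-distances keyed by (name_i, name_j) with name_i < name_j."""
    names = sorted(alignment)
    return {
        (i, j): p_distance(alignment[i], alignment[j])
        for i in names
        for j in names
        if i < j
    }


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def closest_pair_support(alignment, replicates=1000, seed=0):
    """Share of replicates in which the originally closest pair is still closest.

    Ties count as support. This is a simplified stand-in for branch support
    on a full neighbor-joining tree.
    """
    rng = random.Random(seed)
    reference = min(distance_matrix(alignment).items(), key=lambda kv: kv[1])[0]
    hits = 0
    for _ in range(replicates):
        dm = distance_matrix(bootstrap_alignment(alignment, rng))
        if dm[reference] == min(dm.values()):
            hits += 1
    return reference, hits / replicates


# Hypothetical toy alignment, not real 18S rRNA data.
toy = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACGAACCTTC",
}
pair, support = closest_pair_support(toy, replicates=1000, seed=0)
```

With real data, the support values would be read from the bootstrap consensus tree rather than from a single closest pair.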
a Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [
Genome sequencing information
Genome project history
I. lacteus F17 was selected for sequencing because of its
capacity to bioremediate organic pollutants and its applications in
enzyme biotechnology. The genome of this strain was
sequenced using SMRT technology, and genome assembly
and annotation were performed at Beijing Novogene
Bioinformatics Technology Co., Ltd. (Beijing, China).
The whole-genome shotgun project was started in May
2016 and finished in August 2016, and the sequence has been submitted to
NCBI under the accession number MQVO00000000.
Table 2 summarizes the project data. The project
information complies with MIGS version 2.0 [
Growth conditions and genomic DNA preparation
I. lacteus F17 was deposited at the CCTCC under the
accession number CCTCC AF 2014020. The strain
was grown on PDA slants for 5 days at 28 °C, after which
the mycelia were scraped from the medium and
lysed by grinding in liquid nitrogen. The genomic DNA
was extracted using the sodium dodecyl sulfate method.
The harvested DNA was analyzed by agarose gel
electrophoresis, purified using AMPure PB magnetic
beads, and quantified with a Qubit® 2.0 fluorometer
(Thermo Scientific, USA). Finally, a total
of 28 μg of DNA, with a concentration higher than
50 ng/μL and an A260/A280 ratio of 1.9, was packed on
dry ice and sent for sequencing.
Genome sequencing and assembly
An Illumina short-read genome survey
was first performed to evaluate the genome ahead of
assembly and to guide assembly
optimization. The genome of I. lacteus F17
was then sequenced using PacBio’s SMRT technology. For
the Illumina sequencing, a single 350 bp insert genomic DNA library was
sequenced on a HiSeq 4000 system in PE150 mode (Illumina, San
Diego, CA, USA). For the PacBio sequencing, the genomic
DNA was sheared into 20 kb fragments using a g-TUBE
(Covaris, Woburn, MA, USA), and it was sequenced on
an RSII system (PacBio, Menlo Park, CA, USA) after
construction of the SMRTbell library. The average sequencing
depth of the 350 bp library was 20×, whereas the depth of
the PacBio library was 70×.
Two assembly strategies were used after
filtering low-quality reads. The Illumina survey produced
1564 Mb of clean data from 1700 Mb of raw data using
SOAPdenovo [
]. The PacBio subreads
were assembled into a primary assembly
with the Hierarchical Genome Assembly
Process (Pacific Biosciences). A total of 3494 Mb of
clean data were obtained from the genome of I. lacteus
F17, and samtools was used to correct errors in the PacBio data.
The low-quality reads were filtered by SMRT 2.3.0 [
], and the filtered reads were assembled
into gap-free contigs. A total of 317
contigs with an N50 of 1.15 Mb were generated from the I.
lacteus F17 genome. Finally, a 44.36 Mb draft genome of I.
lacteus F17 was obtained. In addition, we used BUSCO [
] to assess the completeness of the I. lacteus F17 genome.
The genome has an estimated completeness of
86.9%, which indicates that we obtained a high-quality
genome assembly in this study.
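The two summary statistics used above, N50 and sequencing depth, can be computed as in the following Python sketch. The function names and the toy contig lengths are illustrative assumptions, not part of the original pipeline. The depth shown is the nominal ratio of sequenced bases to genome size, which may differ from the average depth an assembler reports after read mapping and filtering.

```python
def n50(contig_lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def nominal_depth(total_bases, genome_size):
    """Nominal sequencing depth: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size


# Hypothetical toy assembly: five contigs, 300 bp in total.
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)                    # 100 + 80 reaches half of 300
toy_depth = nominal_depth(3_000_000, 100_000)
```

Sorting the contigs from longest to shortest and accumulating lengths until half of the assembly is covered is the usual definition of N50; a higher value means the assembly is less fragmented.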
Genome annotation
By combining three gene prediction approaches, namely
PASA prediction with TransDecoder/Glimmer/
SNAP based on transcriptome data, Cufflinks prediction
based on transcriptome data, and de novo prediction with Augustus
(version 2.7) [
], a total of 10,391 protein-coding
genes were predicted. The interspersed repetitive
sequences were predicted using RepeatMasker [
]. The tandem repeats were analyzed by Tandem
Repeats Finder [
], and the tRNA genes were predicted by
tRNAscan-SE [
]. The rRNA genes were analyzed
by RNAmmer [
], and the snRNAs were predicted
by BLAST against the Rfam [
] database. In
total, 18 snRNA, 842 tRNA, 15 rRNA operons and
11,710 repetitive sequences were identified in the
genome. Seven databases (Gene Ontology, Kyoto
Encyclopedia of Genes and Genomes, COG,
Non-Redundant Protein Database, Transporter Classification
Database, Swiss-Prot, and Pfam) were used
to predict gene functions. A whole-genome BLAST
search (E-value less than 1e-5, minimal alignment
length percentage larger than 40%) was performed
against these seven databases. All putative proteins were
compared to the entries in the CAZy database using a
BLAST search. Secreted proteases were predicted with
SignalP 4.1 and TMHMM 2.0 [
]. Other proteins that are important in wood decay
(oxidoreductases) and connected to fungal secondary
metabolism were also predicted according to a previously
published method [
].
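The BLAST thresholds quoted above (E-value below 1e-5, alignment length percentage above 40%) can be applied to tabular BLAST output as in the Python sketch below. It assumes the standard 12-column tabular format (BLAST+ `-outfmt 6`) and assumes that the alignment length percentage is measured relative to query length; the text does not state the reference length, so that choice, the function name, and the example records are hypothetical.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
OUTFMT6 = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text, query_lengths, max_evalue=1e-5, min_align_pct=40.0):
    """Keep hits with evalue < max_evalue and alignment length covering
    more than min_align_pct percent of the query (an assumed definition)."""
    kept = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(OUTFMT6, row))
        evalue = float(hit["evalue"])
        align_pct = 100.0 * int(hit["length"]) / query_lengths[hit["qseqid"]]
        if evalue < max_evalue and align_pct > min_align_pct:
            kept.append((hit["qseqid"], hit["sseqid"], evalue, round(align_pct, 1)))
    return kept


# Hypothetical records: only the first passes both thresholds.
example = (
    "gene1\tP1\t85.0\t120\t18\t0\t1\t120\t5\t124\t1e-30\t200\n"
    "gene1\tP2\t40.0\t30\t18\t0\t1\t30\t5\t34\t1e-8\t50\n"
    "gene2\tP3\t60.0\t90\t36\t0\t1\t90\t1\t90\t0.001\t40\n"
)
lengths = {"gene1": 200, "gene2": 100}
passing = filter_hits(example, lengths)
```

Filtering on both criteria removes short, locally significant matches (second record) as well as long but statistically weak ones (third record).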
Genome properties
The draft genome sequence was based on an assembly
of 317 contigs amounting to 44,362,654 bp, with a GC
content of 49.64% (Table 3). From the genome, 875
RNAs (including 18 snRNA, 842 tRNA, and 15 rRNA
operons), as well as 11,710 repetitive sequences, were
detected. In addition, a total of 10,661 genes were
predicted, of which 10,391 are protein-coding genes. Table 4
presents the distribution of genes into COG functional
categories. Of these genes, 2065 (19.37%) were
assigned to COG functional categories. The most
abundant category was
“Posttranslational modification, protein turnover, chaperones”
(245 proteins), followed by “Translation, ribosomal
structure and biogenesis” (215 proteins), “General function
prediction only” (211 proteins), “Energy production and
conversion” (168 proteins), “Nucleotide transport and
metabolism” (144 proteins), “RNA processing and
modification” (121 proteins), and “Intracellular trafficking and
secretion” (116 proteins).
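For reference, the GC content of an assembly is the percentage of G and C among the unambiguous bases across all contigs. The following Python sketch shows the calculation; excluding ambiguous bases such as N from the denominator is an assumption, and the example sequences are hypothetical.

```python
def gc_content(sequences):
    """Percentage of G+C among unambiguous bases (A, C, G, T) across all sequences."""
    gc = at = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / (gc + at)


# Hypothetical contigs: 6 G/C bases out of 8 unambiguous bases.
toy_gc = gc_content(["ACGT", "GGCCNN"])
```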
Table 4 (excerpt): 8374 genes (80.59%) were not in COGs. The total is based on the total number of protein-coding genes in the genome.
A total of 320 CAZyme-encoding genes were identified,
including 53 CBMs, 161 GHs, 30 glycosyl transferases, four
polysaccharide lyases, 64 AAs, and eight carbohydrate
esterases (Additional file 1: Table S1). Overall, I.
lacteus F17 has more genes in CAZy families than the other
fungi compared (Additional file 2: Table S2), especially in the
families AA3 (17 copies), AA9 (21 copies), CBM1 (34
copies), and GH5 (24 copies), which are all involved in
plant cell wall degradation.
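The reported totals can be cross-checked with a few lines of arithmetic. The Python sketch below sums the CAZyme classes and RNA counts given in the text and recomputes the share of protein-coding genes not assigned to COGs. The two-letter class abbreviations GT, PL and CE are shorthand introduced here for glycosyl transferases, polysaccharide lyases and carbohydrate esterases.

```python
# Counts as reported in the text.
cazyme_classes = {"CBM": 53, "GH": 161, "GT": 30, "PL": 4, "AA": 64, "CE": 8}
rna_counts = {"snRNA": 18, "tRNA": 842, "rRNA operons": 15}
protein_coding_genes = 10_391
not_in_cogs = 8_374


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


cazyme_total = sum(cazyme_classes.values())   # reported as 320
rna_total = sum(rna_counts.values())          # reported as 875
not_in_cogs_pct = percent(not_in_cogs, protein_coding_genes)  # reported as 80.59
```

Each recomputed value matches the corresponding figure in the text.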
Insights from the genome sequence
To date, this is the first draft genome sequence from the
genus Irpex. The phylogenetic analysis based on the 18S
rRNA gene data confirms the close relationship of I.
lacteus F17 to C. subvermispora. Annotation of the I.
lacteus F17 genome indicates that this strain possesses
320 carbohydrate-active enzymes, 191 lignin-related
oxidoreductases, 568 secreted proteases, and six secondary
metabolism gene clusters (Additional file 3: Table S3), all
of which support its high lignin decomposition ability.
Fifteen enzymes were classified as probable ligninolytic
enzymes, including a Lac, an LiP, and 13 MnPs, one of
which was identified previously [
]. Interestingly,
I. lacteus F17 and C. subvermispora both have the largest
number of MnPs, greater even than that of P.
chrysosporium (five MnPs), as determined by comparing 34
basidiomycetes, including 26 fungal species belonging to
the Polyporales and eight species in the Agaricales,
Russulales, Hymenochaetales, and Corticiales
(Additional file 4: Table S4). The high number of
MnP isozymes suggests that I. lacteus F17 has a good
ability to degrade lignin and other organic pollutants.
Conclusions
In this study, we characterized the genome of I. lacteus
F17, which was isolated from a decaying hardwood tree in
the vicinity of Hefei, China. Notably, this is the first
sequenced strain of the species, and it carries many genes
related to lignocellulose decomposition. The genome
sequencing information not only revealed its ligninolytic
enzyme diversity, but also contributed to a better
understanding of the efficient lignin decomposition of this
strain. In summary, I. lacteus F17 has become one of the
model ligninolytic basidiomycetes, and the availability of its
genome sequence will facilitate future genetic
engineering for the degradation of lignin and other organic pollutants.
Additional file 1: Table S1. Total CAZy families in I. lacteus F17.
(XLSX 17 kb)
Additional file 2: Table S2. Selection of the CAZy families involved in
plant cell wall degradation. (XLS 40 kb)
Additional file 3: Table S3. Gene contents in oxidoreductases,
secreted proteases and secondary metabolism in the genome of I.
lacteus F17. (DOCX 15 kb)
Additional file 4: Table S4. Comparison of the number of MnPs
from 34 fungal species belonging to the Polyporales and eight other
fungi. (XLS 38 kb)
AA: Auxiliary activities; BLAST: Basic local alignment search tool;
CAZy: Carbohydrate-active enzymes; CBM: Carbohydrate-binding modules;
CCTCC: China Center for Type Culture Collection; COG: Clusters of orthologous
groups; GH: Glycoside hydrolases; Lac: Laccase; LiP: Lignin peroxidase;
MnP: Manganese peroxidase; PacBio: Pacific Biosciences; PDA: Potato
dextrose agar; SMRT: Single Molecule Real-Time
This research was supported by the National Natural Science Foundation of
China (31570102, 31070109).
MWY participated in the sequence alignment and drafted the manuscript.
WML carried out the laboratory experiments. ZHD participated in the
sequence alignment. YLZ participated in the design of the study and
performed the statistical analysis. RJ conceived of the study, and
participated in its design and coordination and helped to draft the
manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
1. Kasinath A , Novotný C , Svobodová K , Patel K , Šašek V . Decolorization of synthetic dyes by Irpex lacteus in liquid cultures and packed-bed bioreactor . Enzyme Microb Tech . 2003 ; 32 : 167 - 73 .
2. Šašek V , Novotný Č , Vampola P . Screening for efficient organopollutant fungal degraders by decolorization . Czech Mycol . 1998 ; 50 : 303 - 11 .
3. Song HG . Biodegradation of aromatic hydrocarbons by several white-rot fungi . J Microbiol . 1997 ; 35 : 66 - 71 .
4. Morin E , Kohler A , Baker A , Foulongne M , Lomard V , Nagy L , Ohm R , Patyshakuliyeva A , Burn A , Aerts A , Bailey A , Billette L , Coutinho P , Deakin G , Doddapaneni H , Floudas D , Grimwood J , Hilden K , Kues U , Labutti K , Lapidus A , Lindquist E , Lucas S , Murat C , Riley R , Salamov A , Schmutz J , Subramanian V , Wosten H , Xu J , Eastwood D , Foster G , Sonnenberg D , Cullient D , Vries R , Lundell T , Hibbert D , Henrissat B , Burton K , Kerrigan R , Challen M , Grigoriev L , Martin F . Genome sequence of the button mushroom Agaricus bisporus reveals mechanisms governing adaptation to a humic-rich ecological niche . Proc Natl Acad Sci U S A . 2012 ; 109 : 17501 - 6 .
5. Novotny C , Cajthaml T , Svobodova K , Susla M , Šašek V . Irpex lacteus, a white-rot fungus with biotechnological potential-review . Folia Microbiol . 2009 ; 54 ( 5 ): 375 - 90 .
6. Baborová P , Möder M , Baldrian P , Cajthamlová K , Cajthaml T. Purification of a new manganese peroxidase of the white-rot fungus Irpex lacteus, and degradation of polycyclic aromatic hydrocarbons by the enzyme . Res Microbiol . 2006 ; 157 ( 3 ): 248 - 53 .
7. Garcia M , Lopez-Abelairas M , Lu-Chau TA , Lema J . Fungal pretreatment of agricultural residues for bioethanol production . Ind Crop Prod . 2016 ; 89 : 486 - 92 .
8. Kellner H , Luis P , Pecyna M , Barbi F , Kapturska D , Kruger D , Rzak D , Marmeisse R , Vandenbol M , Hofrichter M. Widespread occurrence of expressed fungal secretory peroxidases in forest soils . PLoS One . 2014 ; 9 ( 4 ): e95557 .
9. Novotný Č , Erbanová P , Cajthaml T , Dosoretz RC , Sasek V . Irpex lacteus, a white rot fungus applicable to water and soil bioremediation . Appl Microbiol Biot . 2000 ; 54 ( 6 ): 850 - 3 .
10. Qin X , Zhang J , Zhang X , Yang Y . Induction, purification and characterization of a novel manganese peroxidase from Irpex lacteus CD2 and its application in the decolorization of different types of dye . PLoS One . 2014 ; 9 ( 11 ): e113282 .
11. Kanda T , Wakabayashik N. Purification and properties of an endocellulase of avicelase type from Irpex lacteus (Polyporus tulipiferae) . J Biochem . 1976 ; 79 ( 5 ): 977 - 88 .
12. Cajthaml T , Erbanova P , Kollmann A , Novotny Č , Šasek V , Mougin C . Degradation of PAHs by ligninolytic enzymes of Irpex lacteus . Folia Microbiol . 2008 ; 3 ( 53 ): 289 - 94 .
13. Nisizawa K , Hashimoto Y . Cellulose-splitting enzymes . VI. Difference in the specificities of cellulase and β-glucosidase from Irpex lacteus . Arch Biochem Biophys . 1959 ; 81 ( 1 ): 211 - 22 .
14. Chen WT , Zheng LL , Jia R , Wang N. Cloning and expression of a new manganese peroxidase from Irpex lacteus F17 and its application in decolorization of reactive black 5 . Process Biochem . 2015 ; 50 ( 11 ): 1748 - 59 .
15. Yang XT , Zheng JZ , Lu YM , Jia R . Degradation and detoxification of the triphenylmethane dye malachite green catalyzed by crude manganese peroxidase from Irpex lacteus F17 . Environ Sci Pollut Res . 2016 ; 23 ( 10 ): 9585 - 97 .
16. Larkin MA , Blackshields G , Brown NP , Chenna R , McGettigan PA , Mcwilliam H , Valentin F , Wallace IM , Wilm A , Lopez R , Thompson JD , Gibson TJ , Higgins DG. ClustalW and Clustal X version 2.0 . Bioinformatics . 2007 ; 23 ( 21 ): 2947 - 8 .
17. Tamura K , Stecher G , Peterson D , Filipski A , Kumar S. MEGA6: Molecular evolutionary genetics analysis version 6.0 . Mol Biol Evol . 2013 ; 30 : 2725 .
18. Page RDM . TreeView: an application to display phylogenetic trees on personal computers . Computer Applic Biosci . 1996 ; 12 ( 4 ): 357 - 8 .
19. Field D , Garrity G , Gray T , Morrison N , Selengut J , Sterk P , Tatusova T , Thomson N , Allen MJ , Angiuoli SV , et al. The minimum information about a genome sequence (MIGS) specification . Nat Biotechnol . 2008 ; 26 ( 5 ): 541 - 7 .
20. Li R , Zhu H , Ruan J , Qian W , Fang X , Shi Z , Li Y , Li S , Shan G , Kristiansen K , Li S , Yang H , Wang J , Wang J . De novo assembly of human genomes with massively parallel short read sequencing . Genome Res . 2010 ; 20 ( 2 ): 265 - 72 .
21. Berlin K , Koren S , Chin CS , Drake JP , Landolin JM , Phillippy AM . Assembling large genomes with single-molecule sequencing and locality-sensitive hashing . Nat Biotechnol . 2015 ; 33 ( 6 ): 623 - 30 .
22. Simão FA , Waterhouse RM , Ioannidis P , Kriventseva EV , Zdobnov EM . BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs . Bioinformatics . 2015 ; 31 ( 19 ): 3210 - 2 .
23. Stanke M , Diekhans M , Baertsch R , Haussler D. Using native and syntenically mapped cDNA alignments to improve de novo gene finding . Bioinformatics . 2008 ; 24 ( 5 ): 637 - 44 .
24. Saha S , Bridges S , Magbanua ZV , Peterson DG . Empirical comparison of ab initio repeat finding programs . Nucleic Acids Res . 2008 ; 36 ( 7 ): 2284 - 94 .
25. Benson G . Tandem repeats finder: a program to analyze DNA sequences . Nucleic Acids Res . 1999 ; 27 ( 2 ): 573 .
26. Lowe TM , Eddy SR . tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence . Nucleic Acids Res . 1997 ; 25 ( 5 ): 955 - 64 .
27. Lagesen K , Hallin P , Rødland EA , Staerfeldt HH , Rognes T , Ussery DW . RNAmmer: consistent and rapid annotation of ribosomal RNA genes . Nucleic Acids Res . 2007 ; 35 ( 9 ): 3100 - 8 .
28. Gardner PP , Daub J , Tate JG , Nawrocki EP , Kolbe DL , Lindgreen S , Wilkinson AC , Finn RD , Griffiths-Jones S , Eddy SR , Bateman A . Rfam: updates to the RNA families database . Nucleic Acids Res . 2009 ; 37 (Database issue): 136 - 40 .
29. Nawrocki EP , Kolbe DL , Eddy SR . Infernal 1.0: Inference of RNA alignments . Bioinformatics . 2009 ; 25 ( 10 ): 1335 - 7 .
30. Petersen TN , Brunak S , von Heijne G , Nielsen H. SignalP 4.0: Discriminating signal peptides from transmembrane regions . Nat Meth . 2011 ; 8 : 785 - 6 .
31. Krogh A , Larsson B , Von Heijne G , Sonnhammer EL . Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes . J Mol Biol . 2001 ; 305 : 567 - 80 .
32. Floudas D , Binder M , Riley R , Barry K , Blanchette RA , Henrissat B , Martinez AT , Otillar R , Spatafora JW , Yadav JS , Aerts A , Benoit I , Boyd A , Carlson A , Copeland A , Coutinho PM , Vries RP , Ferreira P , Findley K , Foster B , Gaskell J , Glotzer D , Gorecki P , Heitman J , Hesse C , Hori C , Igarashi K , Jurgens JA , Kallen N , Kersten P , Kohler A , Kuees U , TKA K , Kuo A , LaButti K , Larrondo LF , Lindquist E , Ling A , Lombard V , Lucas S , Lundell T , Martin R , DJ ML , Morgenstern I , Morin E , Murat C , Nagy LG , Nolan M , Ohm RA , Patyshakuliyeva A , Rokas A , Ruiz-Duenas FJ , Sabat G , Salamov A , Samejima M , Schmutz J , Slot JC , John FS , Stenlid J , Sun H , Sun S , Syed K , Tsang A , Wiebenga A , Young D , Pisabarro A , Eastwood DC , Martin F , Cullen D , Grigoriev IV , Hibbett DS . The Paleozoic origin of enzymatic lignin decomposition reconstructed from 31 fungal genomes . Science . 2012 ; 336 : 1715 - 9 .
33. Ashburner M , Ball CA , Blake JA , Botstein D , Butler H , Cherry JM , et al. Gene ontology: tool for the unification of biology . Nat Genet . 2000 ; 25 ( 1 ): 25 - 9 .