The complete swine olfactory subgenome: expansion of the olfactory gene repertoire in the pig genome

BMC Genomics, Nov 2012

Background: Insects and other animals can recognize their surrounding environments by detecting thousands of chemical odorants. Olfaction is a complicated process that begins in the olfactory epithelium with the specific binding of volatile odorant molecules to dedicated olfactory receptors (ORs). OR proteins are encoded by the largest gene superfamily in the mammalian genome.

Results: We report here the whole-genome analysis of the olfactory receptor genes of S. scrofa using conserved OR gene-specific motifs and known OR protein sequences from diverse species. We identified 1,301 OR-related sequences from the S. scrofa genome assembly, Sscrofa10.2, including 1,113 functional OR genes and 188 pseudogenes. OR genes were located in 46 different regions on 16 pig chromosomes. We classified the ORs into 17 families, three Class I and 14 Class II families, and further grouped them into 349 subfamilies. We also identified inter- and intra-chromosomal duplications of OR genes residing on 11 chromosomes. A significant number of pig OR genes (n = 212) showed less than 60% amino acid sequence similarity to known OR genes of other species.

Conclusion: As the genome assembly Sscrofa10.2 covers 99.9% of the pig genome, our analysis represents an almost complete OR gene repertoire from an individual pig genome. We show that S. scrofa has one of the largest OR repertoires, suggesting an expansion of OR genes in the swine genome. A significant number of unique OR genes in the pig genome may suggest the presence of swine-specific olfactory stimulation.


Nguyen et al. BMC Genomics 2012, 13:584
http://www.biomedcentral.com/1471-2164/13/584

RESEARCH ARTICLE Open Access

Dinh Truong Nguyen1†, Kyooyeol Lee1†, Hojun Choi1, Min-kyeung Choi1, Minh Thong Le1, Ning Song1, Jin-Hoi Kim1, Han Geuk Seo1, Jae-Wook Oh2, Kyungtae Lee3, Tae-Hun Kim3 and Chankyu Park1*
Keywords: Olfactory receptor, Pigs, Olfaction, OR genes

* Correspondence:
† Equal contributors
1 Department of Animal Biotechnology, Konkuk University, 263 Achasan-ro, Gwangjin-gu, Seoul 143-701, South Korea
Full list of author information is available at the end of the article

© 2012 Nguyen et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

Insects and other animals can recognize the world around them by detecting thousands of chemical odorants. In mammals, odorant molecules are detected by olfactory receptors (ORs), which are part of the G-protein-coupled receptor superfamily of proteins having seven transmembrane domains. This superfamily was first discovered in rodents about two decades ago [1]. Olfaction is a complicated process; it begins in the olfactory epithelium with the specific binding of volatile odorant molecules to dedicated ORs expressed by olfactory sensory neurons (OSNs) [2-5].

OR proteins are encoded by the largest gene superfamily in the mammalian genome. Using the available genome sequences, several studies have been conducted to elucidate OR subgenomes in species such as mice [6-9], humans [10-13], dogs and rats [14-16], and other vertebrates [14,17-19]. OR gene families can be grouped into the following two classes: the fish-like Class I ORs consisting of 17 families and the tetrapod-specific Class II ORs consisting of 14 families [18]. The number of functional OR genes ranges from fewer than 100 in some fishes, including fugu (n = 44) and tetraodon (n = 42) [20], to ~1,200 in rats. A substantial share of OR genes are pseudogenes, and the fraction of OR pseudogenes ranges from less than 20% in the opossum to more than 50% in humans or platypus [14,17]. Interestingly, in spite of the large number of genes that make up the OR subgenome, most OR neurons express a single gene and, in fact, just a single allele [1,21].

Pigs are an attractive animal model to study olfaction and its influence on animal behavior because of their agricultural importance and their strong reliance on their sense of smell in various behavioral contexts. The characterization of the swine OR gene repertoire is necessary to better understand the underlying biology of olfaction in pigs. In addition, comparing OR gene repertoires and olfactory abilities among evolutionarily important animals is of interest in its own right. In this study, we analyzed the pig genome assembly Sscrofa10.2, constructed by the Swine Genome Sequencing Consortium (SGSC), to characterize OR genes in pigs. We report here the nearly complete porcine olfactory subgenome. In addition, we classified the pig OR genes into families and compared OR gene repertoires of humans, dogs, mice, and pigs.

Methods

Detection of OR genes from the pig genome

The swine draft genome sequences (Sscrofa10.2) were retrieved from the National Center for Biotechnology Information (NCBI). A translated basic local alignment search tool (TBLASTN) search was performed to identify regions containing OR-related sequences that had at least two of the following conserved motifs: MAYDRYVAIC (TMIII), KAFSTCASH (TMVI), and PMLNPFIY (TMVII), or their variants with less than 50% sequence difference from the conserved motifs. From the identified regions, we selected the sequences spanning one kilobase (kb) upstream and downstream of the BLAST matches.
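The motif screen and flanking-window selection described above can be sketched roughly as follows. The study itself used TBLASTN against the genome; the ungapped sliding-window comparison below is a simplified stand-in, not the authors' procedure. The function names, the reading of "less than 50% sequence difference" as a per-motif mismatch fraction, and the 0-based half-open coordinates are all assumptions made for illustration.

```python
# Conserved OR motifs quoted in the Methods (transmembrane domains III, VI, VII).
MOTIFS = {
    "TMIII": "MAYDRYVAIC",
    "TMVI": "KAFSTCASH",
    "TMVII": "PMLNPFIY",
}


def best_difference(motif: str, protein: str) -> float:
    """Smallest fraction of mismatching residues over all ungapped windows.

    Assumption: a simple stand-in for an alignment score; TBLASTN also
    allows gaps and uses a substitution matrix, which this does not.
    """
    width = len(motif)
    if len(protein) < width:
        return 1.0
    best = width
    for start in range(len(protein) - width + 1):
        window = protein[start:start + width]
        mismatches = sum(1 for a, b in zip(motif, window) if a != b)
        best = min(best, mismatches)
    return best / width


def motifs_found(protein: str, max_difference: float = 0.5) -> list[str]:
    """Names of motifs present, accepting variants strictly below max_difference."""
    return [
        name
        for name, motif in MOTIFS.items()
        if best_difference(motif, protein) < max_difference
    ]


def is_or_candidate(protein: str, min_motifs: int = 2) -> bool:
    """True when at least min_motifs of the three motifs (or variants) occur."""
    return len(motifs_found(protein)) >= min_motifs


def flanking_window(hit_start: int, hit_end: int, chrom_length: int,
                    flank: int = 1000) -> tuple[int, int]:
    """Extend a hit by `flank` bases on each side, clipped to the chromosome.

    Coordinates are assumed to be 0-based and half-open.
    """
    return max(0, hit_start - flank), min(chrom_length, hit_end + flank)
```

For example, `is_or_candidate("AAAAMAYDRYVAICGGGGPMLNPFIY")` accepts a peptide carrying exact TMIII and TMVII motifs, and `flanking_window(500, 800, 10000)` clips the upstream side at the start of the chromosome.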
From the analysis, we identified 1,644 OR candidate sequences, each 2 kb in length, which we translated into amino acid sequences in all six reading frames. Then, we retrieved 24,809 OR protein sequences from 222 species from NCBI and performed a protein BLAST (BLASTP) analysis against the translated OR candidate sequences to determine the positions of the start and stop codons of the open reading frames (ORFs) on the basis of structural similarity to known OR proteins. For sequences that deviated from the sequences of reported OR proteins, the methionine and stop codon most similar in sequence context to those of the coding sequences of known OR proteins were selected as the start and end of the coding regions.

We again performed TBLASTN analysis against the 1,644 sequences to evaluate the presence of all four conserved motifs [GN, MAYDRYVAIC (TMIII), KAFSTCASH (TMVI), and PMLNPFIY (TMVII)]. The candidate sequences were considered “functional ORs” if they were at least 300 amino acids long without any interrupting stop codons or frameshifts within the ORFs, “OR pseudogenes” if they were at least 300 amino acids long but contained stop codons or frameshifts within the ORFs, and “partial ORs” if they were shorter than 300 amino acids in length but matched the sequences of the known OR genes. Sequen (...truncated)
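The six-frame translation and the three-way classification rule described in this part of the Methods can be illustrated with a minimal sketch. It assumes the standard genetic code, uses hypothetical function names, and takes the stop-codon and frameshift findings as ready-made flags (in the study these come from comparison with known OR proteins). The "unclassified" label for short sequences with no OR match is not a category from the paper.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> dict[str, str]:
    """Translations for frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def classify_candidate(orf_length_aa: int, has_internal_stop: bool,
                       has_frameshift: bool,
                       matches_known_or: bool = True) -> str:
    """Apply the 300-amino-acid rule quoted in the Methods.

    "unclassified" is a hypothetical fallback, not a category in the paper.
    """
    if orf_length_aa >= 300:
        if has_internal_stop or has_frameshift:
            return "OR pseudogene"
        return "functional OR"
    return "partial OR" if matches_known_or else "unclassified"
```

A call such as `classify_candidate(312, False, False)` falls in the functional class, while the same length with either flag set falls in the pseudogene class.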


This is a preview of a remote PDF: http://www.biomedcentral.com/content/pdf/1471-2164-13-584.pdf
Article home page: http://www.biomedcentral.com/1471-2164/13/584

Dinh Nguyen, Kyooyeol Lee, Hojun Choi, Min-kyeung Choi, Minh Le, Ning Song, Jin-Hoi Kim, Han Seo, Jae-Wook Oh, Kyungtae Lee, Tae-Hun Kim, Chankyu Park. The complete swine olfactory subgenome: expansion of the olfactory gene repertoire in the pig genome. BMC Genomics 2012, 13:584. DOI: 10.1186/1471-2164-13-584