The TRIPLES database: a community resource for yeast molecular biology
TRIPLES is a web-accessible database of TRansposon-Insertion Phenotypes, Localization and Expression in Saccharomyces cerevisiae, a relational database housing nearly half a million data points generated from an ongoing study using large-scale transposon mutagenesis to characterize gene function in yeast. At present, TRIPLES contains three principal data sets (i.e. phenotypic data, protein localization data and expression data) for over 3500 annotated yeast genes as well as several hundred non-annotated open reading frames. In addition, the TRIPLES web site provides online order forms linked to each data set so that users may request any strain or reagent generated from this project free of charge. In response to user requests, the TRIPLES web site has undergone several recent modifications. Our localization data have been supplemented with approximately 500 fluorescent micrographs depicting actual staining patterns observed upon indirect immunofluorescence analysis of indicated epitope-tagged proteins. These localization data, as well as all other data sets within TRIPLES, are now available in full as tab-delimited text. To accommodate increased reagent requests, all orders are now cataloged in a separate database, and users are notified immediately of order receipt and shipment. Also, TRIPLES is one of five sites incorporated into the new functional analysis tool Function Junction provided by the Saccharomyces Genome Database. TRIPLES may be accessed from the Yale Genome Analysis Center (YGAC) homepage at http://ygac.med.yale.edu.
Since its inception, the TRIPLES web site has provided
convenient access to data from our transposon-based study of
gene function in yeast. Described in detail elsewhere (1,2), this
study utilizes a multifunctional transposon to generate random
insertions throughout the yeast genome. These insertions may
be used to derive a variety of informative alleles including
reporter gene fusions, gene disruptions and epitope-tagged
alleles (1). Gene fusions to transposon-encoded lacZ provide a
means of generating expression profiles identifying sequences
translated under given growth conditions. As this lacZ reporter
is terminated by a series of stop codons, transposon insertion
also results in truncation of its host gene, thereby potentially
generating disruption alleles for subsequent phenotypic analysis.
Finally, by means of Cre-lox recombination, an integrated
transposon insertion may be modified such that the bulk of the
transposon is excised, leaving behind a short stretch of
epitope-coding sequence. These epitope-tagged alleles can be used to
generate corresponding tagged proteins for immunolocalization.
By this approach, a single insertion is sufficient to yield
expression, phenotypic and localization data, a cumulatively
unique data set maintained and updated in TRIPLES.
In addition to the value in this collected data, individual
insertion alleles and transposon-tagged strains are useful
laboratory reagents. As such, we make all strains from this
project available free of charge to any interested researcher
through the TRIPLES web site. Order forms are available both
from the YGAC homepage and from links accompanying
each data set within TRIPLES. Requests are typically processed
and shipped within a week of receipt.
DESIGN AND IMPLEMENTATION
TRIPLES was implemented using the ORACLE database system,
version 8i. Our web front-end was mainly implemented using
Active Server Pages (ASP), an integral part of the Microsoft IIS
web server running on Windows NT. The ASP mechanism has
enabled us to embed server-side code written in VBScript and
some Perl/CGI programs. To ensure code compatibility with
different database platforms, we have used ODBC (Open Database
Connectivity) to implement database access.
DATA SEARCHING AND RETRIEVAL
Users may access data within TRIPLES through either
composite or category-specific searches. Composite searches can
be used to retrieve records from multiple data sets (phenotypic,
expression and localization data) for any given gene/insertion.
[Table 1: Sequenced/defined site of insertion, 22 587; Induced during vegetative growth; Strains with mutant phenotypes; Localizations (not background)]
As the name suggests, category-specific searches are helpful in
querying a single type of data. In either case, searches may be
initiated by supplying a gene name in either systematic
(e.g. YIL046W) or standard form (e.g. MET30). Alternatively,
data regarding a given insertion may be accessed through its
clone ID, a unique designation (e.g. V66A9) assigned to each
transposon-mutagenized strain in our collection based upon its
position in a 96-well storage plate. Category-specific searches
may also be initiated by selecting from a list of controlled
vocabulary terms descriptive of that particular data set. For
example, TRIPLES may be queried for all tagged proteins
localizing to the nucleus by initiating a category-specific search of
localization data with "nucleus" chosen as the localization. To
facilitate multi-level searching, the results of a given search
may be used to initiate further category-specific searches.
These and additional search options are demonstrated on the
TRIPLES homepage at http://ygac.med.yale.edu/triples/triples.htm.
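
The three kinds of search term described above can be told apart by their form. The following is a minimal sketch of such a classifier, not the code behind the TRIPLES search form. The systematic-name pattern follows the standard yeast ORF naming convention; the clone ID pattern is an assumption inferred from the single example V66A9 (a plate designation followed by a 96-well position), and the function name `classify_query` is hypothetical.

```python
import re

# Systematic ORF names follow the standard yeast convention:
# Y + chromosome letter (A-P) + arm (L/R) + three digits + strand (W/C),
# optionally followed by a suffix such as "-A".
SYSTEMATIC = re.compile(r"^Y[A-P][LR]\d{3}[WC](-[A-Z])?$")

# Standard gene names: three letters followed by a number (e.g. MET30).
STANDARD = re.compile(r"^[A-Z]{3}\d+$")

# ASSUMPTION: clone IDs look like V66A9, i.e. a plate designation followed
# by a 96-well position (row A-H, column 1-12). This pattern is inferred
# from one example and may not cover every clone ID in the collection.
CLONE_ID = re.compile(r"^[A-Z]+\d+[A-H](?:[1-9]|1[0-2])$")


def classify_query(term: str) -> str:
    """Return 'systematic', 'clone_id', 'standard' or 'unknown' for a search term."""
    t = term.strip().upper()
    if SYSTEMATIC.match(t):
        return "systematic"
    if CLONE_ID.match(t):
        return "clone_id"
    if STANDARD.match(t):
        return "standard"
    return "unknown"
```

A term that matches none of the patterns (for instance a controlled vocabulary term such as "nucleus") is reported as unknown, so a caller could route it to a vocabulary lookup instead.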
To ease data retrieval, all category-specific output reports
are presented in tabular format and may be conveniently
downloaded as tab-delimited text. If desired, these reports may be
custom-formatted to display only those fields of greatest
interest. In addition, category-specific reports may be sorted by
data fields in order to group results in a logical manner. Each
complete data set is available for downloading as a flat file,
which we periodically update. To further enhance the utility of
TRIPLES as an information resource, all composite reports
also provide access to supplemental background literature
through direct external links to corresponding entries within
SGD (3), YPD (4) and GenBank (5).
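
Because reports are distributed as tab-delimited text, the custom formatting and sorting described above can also be reproduced offline. The sketch below assumes a header row and uses hypothetical column names (`clone_id`, `orf`, `localization`); the actual field names in TRIPLES downloads may differ.

```python
import csv
import io


def load_report(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def format_report(rows, fields, sort_by):
    """Sort rows by the `sort_by` column, then keep only the columns in `fields`.

    Sorting happens before column selection, so `sort_by` does not need
    to be one of the displayed fields.
    """
    ordered = sorted(rows, key=lambda row: row[sort_by])
    return [{f: row[f] for f in fields} for row in ordered]


# Made-up rows for illustration only; these are NOT TRIPLES records and
# the column names are assumptions.
SAMPLE = (
    "clone_id\torf\tlocalization\n"
    "C3\tORF-C\tnucleus\n"
    "C1\tORF-A\tcytoplasm\n"
    "C2\tORF-B\tnucleus\n"
)

if __name__ == "__main__":
    report = format_report(load_report(SAMPLE), ["orf", "localization"], "localization")
    for row in report:
        print("\t".join(row.values()))
```

Python's sort is stable, so rows that share a localization keep their original relative order.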
TRIPLES has grown significantly as a resource to the yeast
scientific community over the last 2 years. Approximately
doubling in total data content, TRIPLES now encompasses
functional data for nearly 60% of the yeast genome (Table 1).
At present, the TRIPLES database catalogs a collection of over
28 000 transposon insertion alleles, with each allele serving as
a potentially useful laboratory reagent. More than 27 000
transposon-mutagenized yeast strains have been used in our
studies of gene expression, disruption phenotypes and protein
localization. These data are of value to both molecular biologists
and computational biologists; to facilitate their dissemination,
we now offer a data download page accessible from our basic
search form. Users may download complete data sets describing
transposon insertion sites, gene expression, disruption
phenotypes and protein localization. Each data set is available
as tab-delimited text. As our phenotypic data set is very large,
we provide users the opportunity to download results from
each growth assay individually, thereby generating files more
amenable to standard spreadsheet analysis.
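
A user who already holds a combined phenotypic table can obtain the same per-assay layout locally. The following sketch groups rows of a tab-delimited table by growth assay. It assumes a header row and a column named `assay`, which is a hypothetical name; the real download format may be organized differently.

```python
import csv
import io
from collections import defaultdict


def split_by_assay(text, assay_column="assay"):
    """Split a tab-delimited table into one table per value of `assay_column`.

    Returns a dict mapping each assay name to tab-delimited text that
    repeats the original header followed by that assay's rows.
    The default column name "assay" is an assumption for illustration.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    header = reader.fieldnames
    groups = defaultdict(list)
    for row in reader:
        groups[row[assay_column]].append(row)

    tables = {}
    for assay, rows in groups.items():
        buf = io.StringIO()
        writer = csv.DictWriter(
            buf, fieldnames=header, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        tables[assay] = buf.getvalue()
    return tables
```

Each value in the returned dict can be written to its own file and opened directly in a spreadsheet program.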
In response to user requests, protein localization data within
TRIPLES are now supplemented with fluorescent micrographs
illustrating actual staining patterns obtained upon indirect
immunofluorescence analysis of given epitope-tagged
proteins (Fig. 1). Available images (JPEG files) may be viewed by
clicking on any underlined localization within a composite or
category-specific report. At present, approximately 500 such
images are accessible through TRIPLES, establishing it as one
of the largest web-accessible visual libraries of yeast protein
localization. Additional text describing these images is available
as online help within TRIPLES.
The TRIPLES web site has continued to serve as a popular
source of yeast strains and reagents: since its last release (6),
over 400 requests for reagents have been placed through
TRIPLES. In order to better service these requests, all orders
placed online are now stored in an ACCESS database.
Requests are assigned an order number; automatic email
confirmation of order receipt is sent to each user, and users
are also notified by email of actual sample shipment. The
efficiency of this system has allowed us to accommodate larger
requests from researchers both here and abroad, without
increasing turn-around time.
With an expanding repertoire of tools and technology
facilitating large-scale research, fundamental advances in our
understanding of biology lie within reach, provided that a
spirit of cooperation continues to prevail in science. Free
exchange of data and resources is central to our immediate and
future progress; in that light, the TRIPLES database represents
an important medium by which information and reagents can
be shared among the scientific community. TRIPLES is also
featured in the new SGD tool Function Junction (3), another
helpful resource providing convenient access to data from a
variety of functional genomic projects. Collectively, these
types of resources exemplify the collaborative effort necessary
to foster rapid advancement in molecular biology and, as such,
represent a promising blueprint for scientific endeavor.
A summary of TRIPLES data sets (i.e. transposon insertion
point data, gene expression data, phenotypic data, protein
localization data) is available as Supplementary Material at
This work is supported by NIH grant R01-CA77808 (to M.S.).
A.K. is supported by a post-doctoral fellowship from the
American Cancer Society.