LIPID MAPS online tools for lipid research
Eoin Fahy 1, Manish Sud 1, Dawn Cotter 1, Shankar Subramaniam 0,1
0 Departments of Bioengineering, Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093, USA
1 LIPID MAPS Bioinformatics Core, San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92037, USA
The LIPID MAPS consortium has developed a number of online tools for performing tasks such as drawing lipid structures and predicting possible structures from mass spectrometry (MS) data. A simple online interface has been developed to enable an end-user to rapidly generate a variety of lipid chemical structures, along with corresponding systematic names and ontological information. The structure-drawing tools are available for six categories of lipids: (i) fatty acyls, (ii) glycerolipids, (iii) glycerophospholipids, (iv) cardiolipins, (v) sphingolipids and (vi) sterols. Within each category, the structure-drawing tools support the specification of various parameters such as chain lengths at a specific sn position, head groups, double bond positions and stereochemistry to generate a specific lipid structure. The structure-drawing tools have also been integrated with a second set of online tools which predict possible lipid structures from precursor-ion and product-ion MS experimental data. The MS prediction tools are available for three categories of lipids: (i) mono/di/triacylglycerols, (ii) glycerophospholipids and (iii) cardiolipins. The LIPID MAPS online tools are publicly available at www.lipidmaps.org/tools/.
The structures of large and complex lipids are difficult to
represent in drawings, which leads to the use of many
custom formats that often generate more confusion than
clarity among members of the lipid research community.
For example, usage of the Simplified Molecular Line
Entry Specification (SMILES) (1) (www.daylight.com/
smiles/index.html) format to represent lipid structures,
while being very compact and accurate in terms of bond
connectivity, valence and chirality, causes problems when
the structure is rendered. This is due to the fact that the
SMILES format does not include 2D coordinates and
hence the orientation of the structure as drawn is quite
arbitrary, making visual recognition and comparison of
related structures difficult. Members of the lipid
community currently draw structures based on their own
individual preferences. A given lipid structure may
appear quite different in different lipid databases (2, 3).
In summary, consistent structure-drawing tools for lipids
are currently not available.
The structure-drawing step is typically the most
time-consuming part of creating molecular databases of
lipids. However, many classes of lipids lend themselves to
automated structure-drawing paradigms, due to their
consistent 2D layout. The LIPID MAPS consortium has
developed and deployed a suite of structure-drawing tools
that greatly increase the efficiency of data entry into lipid
structure databases and permit on-demand structure
generation in conjunction with a variety of MS prediction
tools. We have chosen a consistent format for representing
lipid structures (4) where, in the simplest case of the fatty
acid derivatives, the acid group (or equivalent) is drawn
on the right and the hydrophobic hydrocarbon chain is on
the left. Similarly for glycerolipids, glycerophospholipids
and sphingolipids, the radyl hydrocarbon chains are
drawn to the left and the headgroups are depicted on the
right. This convention makes lipid structure drawing more
consistent and less error-prone, and has been used
extensively in populating the LIPID MAPS structure
database (LMSD), which currently contains over 10 000
molecules (5).
We have adopted an approach where core structures
such as diacetyl glycerol (glycerolipids) and formic acid
(fatty acyls) are represented as text-based MDL molfiles
(described under section MDL CTfile Formats at
www.mdli.com), and these molfiles are then manipulated
to generate a variety of structures in MDL molfile and
Structure Data Format (SDF) files containing that core
(Figure 1). This manipulation is carried out by
command-line or online programs written in the Perl programming
language.
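As a minimal sketch of the template idea, the following Python code (not the LIPID MAPS Perl programs) writes a V2000 molfile for a saturated straight-chain fatty acid, with the acid group on the right and the hydrocarbon chain extending to the left. The function name `fatty_acid_molfile`, the unit bond length and the 30-degree zigzag are illustrative assumptions; with one carbon the output corresponds to the formic acid core mentioned above.

```python
import math

def fatty_acid_molfile(n_carbons, name="fatty acid"):
    """Return V2000 molfile text for a saturated straight-chain fatty acid.

    Hypothetical helper: C1 (carboxyl carbon) sits at the origin and the
    chain C2..Cn zigzags to the left, following the layout convention
    described in the text.
    """
    if n_carbons < 1:
        raise ValueError("need at least one carbon")
    dx = math.cos(math.radians(30))  # horizontal step for unit bond length
    dy = math.sin(math.radians(30))  # vertical step for unit bond length

    atoms = []  # (x, y, element symbol)
    bonds = []  # (first atom, second atom, bond order), 1-based indices
    for i in range(n_carbons):
        atoms.append((-i * dx, dy if i % 2 else 0.0, "C"))
        if i > 0:
            bonds.append((i, i + 1, 1))

    # Acid group on the right of C1: hydroxyl O up-right, carbonyl O below.
    atoms.append((dx, dy, "O"))
    bonds.append((1, n_carbons + 1, 1))
    atoms.append((0.0, -1.0, "O"))
    bonds.append((1, n_carbons + 2, 2))

    lines = [name, "  sketch", ""]  # three header lines
    lines.append(f"{len(atoms):3d}{len(bonds):3d}  0  0  0  0  0  0  0  0999 V2000")
    for x, y, symbol in atoms:
        lines.append(f"{x:10.4f}{y:10.4f}{0.0:10.4f} {symbol:<3}{0:2d}" + "  0" * 11)
    for a, b, order in bonds:
        lines.append(f"{a:3d}{b:3d}{order:3d}  0")
    lines.append("M  END")
    return "\n".join(lines)

if __name__ == "__main__":
    print(fatty_acid_molfile(16, "palmitic acid"))
```

Hydrogens are left implicit, as is common in 2D molfiles. Double bonds, substituents and stereochemistry would require further edits to the atom and bond blocks and are not covered by this sketch.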
The structural similarities of many lipid categories also
make it feasible to predict structures from MS precursor
ion and/or product ion data by creating a database
composed of masses of all possible likely combinations of
acyl side chains for a given lipid core. One can then use
matching algorithms to display possible candidates for
given precursor ion/product ion m/z values and then
generate corresponding structures.
[Figure 1: an MDL molfile template is processed by the structure-drawing tools to give an MDL molfile containing the structure, or an SDF file containing structures along with name and ontology data.]
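The mass-lookup idea can be illustrated with a short Python sketch. It is not the LIPID MAPS implementation: the chain list `CHAINS`, the function names and the 0.01 Da tolerance are assumptions chosen for illustration, and the sketch matches neutral monoisotopic masses of diacylglycerols, so an adduct correction would be needed before comparing with measured m/z values.

```python
from itertools import combinations_with_replacement

# Monoisotopic masses of the elements used here.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

GLYCEROL = {"C": 3, "H": 8, "O": 3}
WATER = {"H": 2, "O": 1}

# Illustrative acyl chains as (carbons, double bonds); an assumed list.
CHAINS = [(14, 0), (16, 0), (16, 1), (18, 0), (18, 1), (18, 2), (20, 4)]

def mass(formula):
    """Monoisotopic mass of an elemental composition given as a dict."""
    return sum(MONO[element] * count for element, count in formula.items())

def fatty_acid_formula(carbons, double_bonds):
    """Composition of a straight-chain fatty acid, e.g. 16:0 is C16H32O2."""
    return {"C": carbons, "H": 2 * carbons - 2 * double_bonds, "O": 2}

def diacylglycerol_mass(chain_a, chain_b):
    """Neutral mass of glycerol esterified with two fatty acids."""
    total = mass(GLYCEROL)
    for carbons, double_bonds in (chain_a, chain_b):
        # Each ester bond releases one water molecule.
        total += mass(fatty_acid_formula(carbons, double_bonds)) - mass(WATER)
    return total

def build_table(chains):
    """Precompute masses for every unordered pair of acyl chains."""
    pairs = combinations_with_replacement(chains, 2)
    return sorted((diacylglycerol_mass(a, b), a, b) for a, b in pairs)

def match(table, target_mass, tolerance=0.01):
    """Return all table rows whose mass lies within the tolerance."""
    return [row for row in table if abs(row[0] - target_mass) <= tolerance]

if __name__ == "__main__":
    table = build_table(CHAINS)
    for found_mass, a, b in match(table, 594.5223):
        print(f"DG {a[0]}:{a[1]}/{b[0]}:{b[1]}  {found_mass:.4f}")
```

With this chain list, a query at 594.5223 returns both 16:0/18:1 and 16:1/18:0, which share the same elemental composition. This is why product-ion data are needed to narrow the candidates.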
DESCRIPTION AND IMPLEMENTATION
The LIPID MAPS website (www.lipidmaps.org/tools/
index.html) currently contains a suite of six
structure-drawing tools for the following lipid categories: fatty
acyls, glycerolipids, glycerophospholipids, cardiolipins,
sphingolipids and sterols. The online layout (Figure 2)
consists of a core structure and pull-down menus
arranged in locations appropriate for that structure. For
example, in the case of the glycerophospholipid-drawing
tool, a central glycerol core is surrounded by pull-down
menus allowing the end-user to choose from a list of
headgroups and sn1 and sn2 acyl side chains. The list of
acyl chains represents the more common species found in
mammalian cells, and could easily be modified to include
additional chains. The selected lipid structure is then
generated via a server-side Perl script. The structure is
rendered in the web browser as a Java-based MarvinView
applet (www.chemaxon.com/marvin/). Additionally, the
structure may be viewed online with the ChemDraw
ActiveX/Plugin (www.cambridgesoft.com/software/Chem
Draw/) by users who have this component installed on
their system. Current versions of the fatty-acyl-drawing
tools are now capable of drawing chiral centers and ring
structures. Molecules with correct stereochemistry are
drawn by implementing the following method: (1) usage of
the PerlMol (www.perlmol.org/) module to define atoms,
bonds and neighbors; (2) a recursive algorithm which
applies Cahn-Ingold-Prelog (CIP) (6, 7) rules to a chiral
center and (3) a scoring system to estimate substituent
priority to assign chirality.
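The priority-ranking step can be sketched in Python as follows. This is a simplified approximation of the CIP rules and not the PerlMol-based code described above: it recursively collects atomic numbers sphere by sphere along each substituent, counts each extra bond of a multiple bond as a duplicate atom, and compares the resulting tuples. All function names and the `depth` limit are assumptions, hydrogens must be explicit in the input, and the sketch stops at ranking substituents without assigning R or S, which would also require the geometry.

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8}

def neighbours(bonds, atom):
    """Yield (neighbour, bond order) pairs for one atom."""
    for a, b, order in bonds:
        if a == atom:
            yield b, order
        elif b == atom:
            yield a, order

def branch_spheres(elements, bonds, atom, parent, depth):
    """Atomic numbers grouped by distance from `atom`, walking away from `parent`."""
    levels = [[ATOMIC_NUMBER[elements[atom]]]] + [[] for _ in range(depth)]
    if depth == 0:
        return levels
    for nbr, order in neighbours(bonds, atom):
        if nbr != parent:
            sub = branch_spheres(elements, bonds, nbr, atom, depth - 1)
            for k, numbers in enumerate(sub):
                levels[k + 1].extend(numbers)
        # Each extra bond of a multiple bond adds a duplicate atom
        # that carries no substituents of its own.
        levels[1].extend([ATOMIC_NUMBER[elements[nbr]]] * (order - 1))
    return levels

def priority_key(elements, bonds, center, start, depth=4):
    """Sortable key: one descending tuple of atomic numbers per sphere."""
    levels = branch_spheres(elements, bonds, start, center, depth)
    return tuple(tuple(sorted(level, reverse=True)) for level in levels)

def rank_substituents(elements, bonds, center, depth=4):
    """Substituents of `center`, highest estimated priority first."""
    subs = [nbr for nbr, _ in neighbours(bonds, center)]
    return sorted(
        subs,
        key=lambda s: priority_key(elements, bonds, center, s, depth),
        reverse=True,
    )

# Lactic acid, CH3-CH(OH)-COOH, with explicit hydrogens.
# Atom 0 is the chiral carbon; 1 hydroxyl O; 2 carboxyl C; 3 methyl C; 4 H.
ELEMENTS = ["C", "O", "C", "C", "H", "O", "O", "H", "H", "H", "H", "H"]
BONDS = [
    (0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1),
    (2, 5, 2), (2, 6, 1), (1, 7, 1), (6, 8, 1),
    (3, 9, 1), (3, 10, 1), (3, 11, 1),
]

if __name__ == "__main__":
    print(rank_substituents(ELEMENTS, BONDS, 0))
```

For the lactic acid example the ranking is hydroxyl, carboxyl, methyl, hydrogen. The full CIP rules compare branches in a stricter hierarchical order and include further tie-breaking rules, so this key should be treated only as an estimate of priority.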
LIPID MAPS abbreviation
Concurrently, a generalized lipid abbreviation format has
been developed which enables structures, systematic
names and ontologies to be generated automatically
from a single source format (Figure 3). The LIPID MAPS
abbreviation format for lipids may consist of up to four
different parts: (i) carbon chain length along with any
degree of unsaturation; (ii) position and geometry of
double and triple bonds; (iii) position, type and
stereochemistry of substituents and (iv) position of carbocyclic
ring junction and stereochemistry. The first part of th (...truncated)