Computer-assisted methods for molecular structure elucidation: realizing a spectroscopist's dream

Journal of Cheminformatics, Mar 2009

Background: This article coincides with the 40-year anniversary of the first published works devoted to the creation of algorithms for computer-aided structure elucidation (CASE). The general principles on which CASE methods are based are reviewed, and the present state of the art in this field is described using the expert system Structure Elucidator (StrucEluc) as an example.

Results: The developers of CASE systems have had to overcome many obstacles to create software capable of drastically reducing the time and effort required to determine the structures of newly isolated organic compounds. Large, complex molecules with as many as 100 or more skeletal atoms and topological peculiarities can be quickly identified from spectral data using Structure Elucidator. Logical analysis of 2D NMR data frequently detects the presence of COSY and HMBC correlations of "nonstandard" length. Fuzzy structure generation makes it possible to obtain the correct solution even when an unknown number of nonstandard correlations of unknown length are present in the spectra. The relative stereochemistry of large rigid molecules containing many stereocenters can be determined using the StrucEluc system together with NOESY/ROESY 2D NMR data.

Conclusion: The StrucEluc system continues to be developed to expand its general applicability, improve its workflows and usability, and increase the reliability of its results. It is expected that expert systems similar to the one described in this paper will gain increasing acceptance in the next decade and will ultimately be integrated directly into analytical instruments for organic analysis. Work in this direction is in progress. Although many difficulties have already been overcome in pursuit of the spectroscopist's dream of "fully automated structure elucidation", there is still work to do. Nevertheless, as the efficiency of expert systems is enhanced, the solution of increasingly complex structural problems will become achievable.


Mikhail Elyashberg (2), Kirill Blinov (2), Sergey Molodtsov (1), Yegor Smurnyy (2), Antony J Williams (0), Tatiana Churanova (2)

(0) ChemZoo Inc., 904 Tamaras Circle, Wake Forest, NC, 27587, USA
(1) Novosibirsk Institute of Organic Chemistry, Siberian Division, Russian Academy of Sciences, 9 Akademik Lavrent'ev Av., Novosibirsk, 630090, Russian Federation
(2) Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow 117513, Russian Federation

Background

The potential of creating computer-assisted methods for the structure elucidation of new organic compounds was first discussed in the second half of the past century. Structure elucidation commonly combines information extracted from several forms of spectra. The molecular formula of the substance is generally derived from a mass spectrum, and structural hypotheses are deduced from spectral data that usually include NMR, IR, UV and other spectra. The distinctive feature of this approach is the inference of the structure of an unknown compound that is absent from spectral libraries, i.e. without employing reference structures and their associated spectra. Later, qualitative spectral analysis without reference data was extended to quantitative spectral analysis in optical spectroscopy [1]. The solution of such problems can be facilitated by the retrieval of reference data in combination with logical-combinatorial processing of the data. A new area of investigation was developed that is now referred to as Computer-Aided Structure Elucidation (CASE). CASE was applied initially to "small molecules" as distinct from biological macromolecules and biopolymers.
The first reports devoted to CASE systems were published by four independent groups of researchers exactly forty years ago [2-5]. Since the publication of these seminal reports, an extensive literature on computer methods of structure elucidation has been produced. From the inception of CASE methods, attention has been directed to the creation of artificial intelligence or "expert" systems (ES) based on the analysis of 1D 1H and 13C NMR data in combination with MS and IR spectra. The first studies of CASE development were described in a series of reviews [6,7] and monographs [8-10]. In spite of the efforts of many scientific groups, no system capable of elucidating large complex molecules was delivered during the first 20 years of intensive work. The primary reason for this failure was the lack of structural information that could be retrieved from 1D NMR spectra to use as input to the structure generator, the kernel of any expert system. The first two decades of CASE development should, nevertheless, be considered as very fruitful since a general strategy was establis (...truncated)


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1186%2F1758-2946-1-3.pdf

Mikhail Elyashberg, Kirill Blinov, Sergey Molodtsov. Computer-assisted methods for molecular structure elucidation: realizing a spectroscopist's dream, Journal of Cheminformatics, 2009, pp. 3, Volume 1, Issue 1, DOI: 10.1186/1758-2946-1-3