Showing posts with label cheminformatics. Show all posts

Monday, September 21, 2009

Walkingstick Molecules

Johnny over at Ecographica has a post on defense molecules secreted by Walkingstick insects.  Head on over to Johnny's post for pictures of the Walkingsticks.

There are several species of Walkingstick, and the one described in this paper produces three different defense molecules that are stereoisomers of one another: anisomorphal, dolichodial, and peruphasmal.  Being an organic chemist, I wanted to know what these molecules look like, so I did a little digging around.

The first place I tried was PubChem - only dolichodial was listed, although PubMed does have papers listed for all three compounds. Next I tried ChemSpider, and again only dolichodial was listed.  Time for a more specialized database.  Rich Apodaca is compiling a list of 64 free chemistry databases on his Zusammen Blog, and one of the databases he has profiled is a collection of pheromones and similar molecules called Pherobase.

Pherobase has a search box at the top of the page, and typing the name of each compound produced a list of Google search results.  The titles and descriptions of the search results were not completely obvious, but the pages with a file name containing "compounds-detail-" seemed a good place to look.  This gave me both a 2D drawing and a 3D structure for each of the compounds.  Go ahead and check out the Pherobase pages for anisomorphal, dolichodial, and peruphasmal.  The Pherobase page for each compound includes a 3D structure with the Jmol applet - if you want to turn off the auto-rotation, right-click in the molecule window and set Spin to "off."

Here are all three molecules side by side:

They are stereoisomers - the difference between them is the 3D orientation of the three side groups attached to the ring: dark wedges are "up" and dashed wedges are "down."  At a glance, I can't say why they might specifically be "defense" molecules.  However, much of molecular signalling just boils down to shape rather than chemical reactions - these compounds may just smell or taste bad to predators.

According to the article, the specific compounds produced by the bugs depend on their developmental stage, and even their location.  The researchers raised 14 Walkingsticks and observed the types of defense molecules they produced as they grew.  The bugs they used produced a mixture of anisomorphal and dolichodial as hatchlings, and the amount of dolichodial increased after 2 months.  However, when they reached maturity they stopped producing either anisomorphal or dolichodial, and produced peruphasmal exclusively.  As adults, other populations of walkingsticks produce anisomorphal, or a mixture of anisomorphal and peruphasmal.  None of the adults produce dolichodial.

I'm a chemist, so I don't know what all this means.  Something seems to happen at 2 months that changes the amount of dolichodial they produce.  And something else happens when they reach maturity that makes them stop producing dolichodial.  It's interesting that adults produce one or both molecules that have the aldehydes on opposite sides of the ring - and the compound produced only by the immature insects is the compound with both aldehydes on the same side of the ring.

Dossey, A., Walse, S., & Edison, A. (2008). Developmental and Geographical Variation in the Chemical Defense of the Walkingstick Insect Anisomorpha buprestoides Journal of Chemical Ecology, 34 (5), 584-590 DOI: 10.1007/s10886-008-9457-8

Saturday, September 12, 2009

Avogadro 0.9.8 Is Available

The latest update to Avogadro (0.9.8) is now ready.  The Avogadro Home Page doesn't mention it yet, but if you click on the "Get Avogadro" button, you will get version 0.9.8. Alternatively, you can go to Sourceforge and get the packages directly.

According to Tim Vandermeersch's blog there are no new features, but some bugs have been fixed.

via OB, Avogadro and Molecular Modelling: Avogadro 0.9.8

Wednesday, September 9, 2009

CAS Hits 50 Million Compounds

Over the weekend, the Chemical Abstracts Service (CAS) registered its 50 millionth compound! Only 9 months ago they hit 40 million compounds. This stuff is so new that it isn't listed in PubChem or ChemSpider yet. For comparison, it took 33 years for CAS to record the first 10 million compounds.
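To put those two figures side by side, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above (10 million compounds in 33 years versus 10 million in about 9 months); the variable names are just for illustration.

```python
# Registration rates implied by the figures quoted in the post.
first_10m_years = 33         # years CAS took to record its first 10 million compounds
latest_10m_years = 9 / 12    # going from 40 million to 50 million took about 9 months

early_rate = 10 / first_10m_years     # million compounds per year
recent_rate = 10 / latest_10m_years   # million compounds per year
speedup = recent_rate / early_rate

print(f"early: {early_rate:.2f} M/yr, recent: {recent_rate:.1f} M/yr, "
      f"roughly {speedup:.0f} times faster")
```

In other words, the most recent 10 million compounds were registered roughly 44 times faster than the first 10 million.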

According to the press release at CAS:

"The 50 millionth substance (CAS Registry Number 1181081-51-5) was uncovered by CAS scientists from the Examples section of a nearly 200-page patent issued by the World Intellectual Property Organization on August 13, 2009. According to the patent, "Few therapeutics are approved by the US Food and Drug Administration and other regulatory agencies for the treatment of neuropathic pain." To address this concern, a series of novel arylmethylidene heterocycles were synthesized, which included the most recent substance registered by CAS."



Check out the full press release at: 50 Millionth Unique Chemical Substance Recorded in CAS REGISTRY

Monday, August 10, 2009

Drawing Molecules with SMILES

Lately I have discovered that when drawing 2D or 3D chemical structures, it is often a lot easier to input a SMILES string than to draw it with the GUI. It reminds me of when I first started using Windows (v. 3) - at times I thought it was easier to type a command to tell the computer what I wanted to do instead of hunting through menus and dialogs to find the thing I needed to click.

SMILES is a way of describing a chemical structure on a single line - a LOT like a conventional condensed formula except without the Hydrogens. It is commonly used in database and chemical informatics applications. In an earlier post on Surfactants, I wanted to include the structures of a couple of perfluorinated surfactants. Drawing all 15 fluorines individually was just too tedious; it was much simpler - believe it or not - to type: FC(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(=O)O
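Part of what makes a string like that easy to type is that it is just the same few characters repeated. As a minimal sketch (the function name is my own invention, and it assumes a simple straight-chain perfluorinated carboxylic acid), the same kind of SMILES can be assembled in a few lines of Python:

```python
def perfluoro_acid_smiles(n_carbons: int) -> str:
    """Build a SMILES string for a straight-chain perfluorinated carboxylic acid.

    Hypothetical helper for illustration: a CF3 group, a run of CF2 groups,
    and a terminal COOH group.
    """
    if n_carbons < 2:
        raise ValueError("need at least a CF3 carbon and a COOH carbon")
    cf3 = "FC(F)(F)"   # terminal trifluoromethyl group
    cf2 = "C(F)(F)"    # one difluoromethylene unit
    acid = "C(=O)O"    # carboxylic acid group
    return cf3 + cf2 * (n_carbons - 2) + acid


# An 8-carbon chain (perfluorooctanoic acid) carries 15 fluorines.
pfoa = perfluoro_acid_smiles(8)
print(pfoa)
print(pfoa.count("F"), "fluorines")
```

Each extra carbon in the chain adds one more "C(F)(F)" and therefore two more fluorines, which is exactly the part that is tedious to draw by hand.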

Drawing large and complex compounds "by hand" is also a hassle - and in some cases it is very difficult to get a good looking structure. Fortunately, many online resources (e.g. Wikipedia, PubChem, ChemSpider) list the SMILES string for compounds like morphine:

Earlier in the summer I decided to install Ubuntu on my laptop in addition to Windows so I could run either operating system. One difficulty I have had in moving to Ubuntu is finding chemistry software to replace the programs I usually use on Windows. It turns out that several of the structure programs that I have been using lately will allow me to input a SMILES string.

One program I have been experimenting with lately is Avogadro - an open source project for building and studying 3D chemical structures. Avogadro is available on both Windows and Linux (as well as Mac), which is pretty convenient for me right now. It is a very nice program; they are at version 0.9.7 and plan to release 1.0 in September.

For drawing 2D structures on Windows I use ChemSketch, available as freeware from ACDlabs. On Ubuntu I have tried several 2D structure programs, none of which match up to ChemSketch. The one I have used most is BKChem, which I used for the structure of Morphine above.

Monday, February 16, 2009

Viewing 3D Structures at PubChem

PubChem is now providing 3D structures which you can view or download.  PubChem3D generates a single conformer for molecules which are not too large or too flexible.  You can view the results of this on the compound summary pages.  The Compound Summary page shows the structure of the compound with two tabs at the top: you can choose either the customary 2D view, or a 3D image by clicking the appropriate tab.

You also have the option of an interactive view by clicking either the tetrahedral molecule icon or the image of the 3D molecule.  There are two options for viewing the molecule.

The first is a web-based viewer that opens a new window and generates an animated gif on the fly.  This web-based viewer takes a little getting used to, and strikes me as rather clumsy.  I don't understand why they did not use a Java applet such as Jmol instead.  In fact, Rajarshi Guha's Pub3D site does just this.  Enter a PubChem cid and you can see the 3D structure using Jmol.

The second option for viewing the structure in 3D is to download and install the PubChem 3D Viewer. Windows, Linux and Mac versions are available.  The graphics are nice, but it is limited to the file formats used by PubChem: pc3d, asn,  and sdf for multiple molecule files.  You can load more than one molecule at a time by either opening a multi-molecule sdf file or using the Import option.  With more than one molecule loaded you can toggle between a panel-view which displays all the molecules in a table format, or an overlay mode.  Select which molecules to overlay in the Molecules tab in the right-hand panel.
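If you want to see what is inside a multi-molecule sdf file before loading it, the format is easy to pick apart: each record ends with a line containing only "$$$$", and the first line of a record is the molecule's title. Here is a minimal Python sketch; the function name is my own, and the sample text is a made-up placeholder rather than real PubChem output.

```python
def split_sdf(text: str) -> list[str]:
    """Split the text of a multi-molecule SDF file into one string per record.

    Records in an SDF file are terminated by a line containing only "$$$$".
    """
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(part.strip() for part in current):  # tolerate a missing final "$$$$"
        records.append("\n".join(current))
    return records


# Placeholder content: the "..." lines stand in for the real atom and bond blocks.
sample = "molecule one\n...\nM  END\n$$$$\nmolecule two\n...\nM  END\n$$$$\n"
for record in split_sdf(sample):
    print(record.splitlines()[0])   # first line of each record is its title
```

This only separates the records; it does not parse the atoms or bonds.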

In addition, the right-hand panel has controls for changing the way the molecule(s) are displayed. Oddly there is no Save function.  There might be no particular need to save the molecular data files from the viewer, but they seem to have gone to some trouble to give a lot of graphics display options.  It's too bad that you cannot save images from the PubChem 3D Viewer.

Saturday, November 8, 2008

How Do You Know It's a Mutagen?


How do you know if a compound will be a mutagen before you test it? Before you even make it?

Drug companies make lots of new molecules that they hope will be useful as drugs. But there are a lot of other things that can happen when a biologically active molecule gets inside you. A lot of potential drugs just don't work, or work poorly. Many work well enough for the task at hand, but have side effects that are unpleasant. Some side effects are inconvenient but tolerable, others are deal breakers. 

Making a new molecule and testing it is a time consuming process. Just making the molecule will involve several reactions run sequentially, and each step can require careful purification before you can go on to the next step in the process. Once the molecule has been made, there is a battery of tests to run both to see how well it works on the drug target, and to find out if it is likely to make the patient sicker through side effects.

It would be helpful if you could predict the toxicity of a compound before you go to the trouble of making it in the lab, or at least rule out compounds that are likely to be highly toxic. In Accurate and Interpretable Computational Modeling of Chemical Mutagenicity, Langham and Jain describe their work on predicting whether a compound is a mutagen just based on the types of atoms in the molecule. And they get pretty good results.

To do this properly you will have to look at lots of molecules, so you need a simple way to describe your molecules quickly without running lots of complex calculations. Langham and Jain had a computer program list all possible pairs of atoms in each molecule, and made a list of these atom pairs. The example they give in their paper is an atom pair found in aspirin described as O3_1_D5_C2_Ar2. This describes an sp3 hybridized oxygen attached to one heavy atom (not hydrogen) that is 5 bonds away from an aromatic carbon with two heavy neighbors.
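To make the idea concrete, here is a simplified sketch of atom-pair enumeration in Python. It is not the authors' program or their exact encoding: it labels each atom only by element and number of heavy-atom neighbors (no hybridization or aromaticity), and the label format and function names are my own assumptions.

```python
from collections import Counter, deque


def bond_distances(n_atoms, bonds, start):
    """Shortest path, counted in bonds, from `start` to each reachable atom (BFS)."""
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for nxt in neighbors[atom]:
            if nxt not in dist:
                dist[nxt] = dist[atom] + 1
                queue.append(nxt)
    return dist


def atom_pairs(elements, bonds):
    """Count simplified atom-pair descriptors for a heavy-atom graph."""
    degree = Counter()
    for a, b in bonds:
        degree[a] += 1
        degree[b] += 1
    # Simplified label: element symbol plus number of heavy-atom neighbors.
    labels = [f"{el}{degree[i]}" for i, el in enumerate(elements)]
    pairs = Counter()
    for i in range(len(elements)):
        dist = bond_distances(len(elements), bonds, i)
        for j in range(i + 1, len(elements)):
            if j not in dist:      # atoms in separate fragments have no bond path
                continue
            first, second = sorted((labels[i], labels[j]))
            pairs[f"{first}_D{dist[j]}_{second}"] += 1
    return pairs


# Ethanol without hydrogens: C-C-O
ethanol_pairs = atom_pairs(["C", "C", "O"], [(0, 1), (1, 2)])
for name, count in sorted(ethanol_pairs.items()):
    print(name, count)
```

Even for three heavy atoms you get three descriptors, so a drug-sized molecule produces a long list, which is why a program does the bookkeeping.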

Next you have to look for a pattern of atom pairs in a molecule that seems to be related to whether or not it is a mutagen. Just looking at the data would probably not be very effective, so the authors used three different Machine Learning techniques to look for a pattern: support vector machines (SVM), RuleFit, and K-nearest neighbors (KNN). They analyzed a training set of 4337 diverse compounds, 2401 of which were mutagens and 1936 of which were not, and found that the SVM method gave an accuracy of 0.77, and RuleFit was a little better with an accuracy of 0.79.
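For a feel of how one of these methods works, here is a toy K-nearest neighbors sketch in Python. It is not the authors' implementation: I am assuming each molecule is represented as a set of atom-pair labels, that similarity is measured by the fraction of shared labels (Tanimoto similarity), and that accuracy means the fraction of correct predictions. The "molecules" below are invented placeholders, not real data.

```python
def tanimoto(a: set, b: set) -> float:
    """Shared features divided by total distinct features."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(query: set, training: list, k: int = 3) -> bool:
    """Majority vote among the k training molecules most similar to the query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = [is_mutagen for _, is_mutagen in ranked[:k]]
    return sum(votes) * 2 > len(votes)


def accuracy(predictions: list, truths: list) -> float:
    """Fraction of predictions that match the known answers."""
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)


# Invented placeholder feature sets: (features, is_mutagen)
training = [
    ({"a", "b", "c"}, True),
    ({"a", "b", "d"}, True),
    ({"a", "c", "e"}, True),
    ({"x", "y", "z"}, False),
    ({"x", "y", "w"}, False),
    ({"x", "z", "v"}, False),
]
held_out = [
    ({"a", "b", "e"}, True),
    ({"x", "y", "v"}, False),
    ({"a", "b", "x"}, False),   # resembles the mutagens, so the vote gets it wrong
]

predictions = [knn_predict(features, training) for features, _ in held_out]
truths = [label for _, label in held_out]
print(predictions, f"accuracy = {accuracy(predictions, truths):.2f}")
```

Two of the three held-out examples are predicted correctly here, so the toy accuracy is about 0.67; read the same way, an accuracy of 0.77 would mean about three out of four compounds were classified correctly.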

The real test is how well the model works in predicting the activity of completely new molecules. So next they used their SVM and RuleFit results to try to predict the mutagenicity of a completely different set of compounds taken from the Carcinogenic Potency Database (CPDB). With this new set of compounds, SVM (accuracy 0.770) worked a little better than RuleFit (accuracy 0.718).  This is far from ideal, but it's a pretty good start.  And it is interesting to see that such a simple criterion as pairs of atoms can be predictive of a complex behavior like causing mutations.


James J. Langham, Ajay N. Jain (2008). Accurate and Interpretable Computational Modeling of Chemical Mutagenicity Journal of Chemical Information and Modeling, 48 (9), 1833-1839 DOI: 10.1021/ci800094a