Computational chemist doing drug design and discovery. Specific interests include virtual screening, docking, free energy calculations and molecular dynamics. General interests include biochemistry, pharmacology, structural biology, organic chemistry and physics.
The Curious Wavefunction
by The Curious Wavefunction in The Curious Wavefunction
The Journal of Computer-Aided Molecular Design is having a smorgasbord of accomplished modelers reflecting upon the state and future of modeling in drug discovery research and I would definitely recommend anyone - and especially experimentalists - interested in the role of modeling to take a look at the articles. Many of the articles are extremely thoughtful and balanced and take a hard look at the lack of rigorous studies and results in the field; if there was ever a need to make journal articles freely available it was for these kinds, and it's a pity they aren't. But here's one that is open access, and it's by some researchers from Simulations Plus Inc. who talk about three beasts (or in the authors' words, "Lions and tigers and bears, oh my!") in the field that are either unsolved or ignored or both.

1. Entropy: As they say, entropy, taxes and death (entropy) are the three constant things in life. In modeling both small molecules and proteins, entropy has always been the elephant in the room, blithely ignored in most simulations. At the beginning there was no entropy. Early modeling programs then started exacting a rough entropic penalty for freezing certain bonds in the molecule. While this approximated the loss of ligand entropy in binding, it did nothing to take care of the conformational entropy loss that resulted from the compression of a panoply of diverse conformations in solution to a single bound conformation. But we were just getting started. A very large part of the entropy of binding a ligand by a protein comes from the displacement of water molecules in the active site, essentially their liberation from being constrained prisoners of the protein to free-floating entities in the bulk. A significant advance in trying to take this factor into account was an approach that explicitly and dynamically calculated the enthalpy, entropy and therefore the free energy of bound waters in proteins.
We have now reached the point where we can at least think of doing a reasonable calculation on such water molecules. But water molecules are often ill-localized in protein crystal structures because of low resolution, inadequate refinement and other reasons. It's not easy to perform such calculations for arbitrary proteins without crystal structures.

However, a large piece of the puzzle that's still missing is the entropy of the protein, which is extremely difficult to calculate on many fronts. Firstly, the dynamics of the protein is often not captured by a static x-ray structure so any attempts to calculate protein entropy in the presence and absence of ligands would have to shake the protein around. Currently the favored process for doing this is molecular dynamics (MD) which suffers from its own problems, most notably the accuracy of what's under the hood, namely force fields. Secondly, even if we can calculate the total entropy changes, what we really need to know is how the entropy is distributed between various modes since only some of these modes are affected upon ligand binding. An example of the kind of situation in which such details would be important is the case of slow, tight-binding inhibitors illustrated in the paper. The example is of two different prostaglandin synthase inhibitors which demonstrate almost identical binding orientations in the crystal structure. Yet one is a weak binding inhibitor which dissociates rapidly and the other is slow, tight-binding. Only a dynamic treatment of entropy can explain such differences, and we are still quite far from being able to do this in the general case.

2. Uncertainty: Out of all the hurdles facing the successful application and development of modeling in any field, this might be the most fundamental.
To reiterate, almost every kind of modeling starts by using a training set of molecules for which the data is known and then proceeds to apply the results from this training set to a test set for which the results are unknown. Successful modeling hinges on the expectation that the data in the test set is sufficiently similar to that in the training set. But problems abound. For one thing, similarity is in the eye of the beholder and what seems to be a reasonable criterion for assuming similarity may turn out to be irrelevant in the real world. Secondly, overfitting is a constant issue and results that look perfect for the training set can fail abysmally on the test set.

But as the article notes, the problems go further and the devil's in the details. Modeling studies very rarely try to quantify the exact differences between the two sets and the error resulting from that difference. What's needed is an estimate of predictive uncertainty for single data points, something which is virtually non-existent. The article notes the seemingly obvious but often ignored fact when it says that "there must be something that distinguishes a new candidate compound from the molecules in the training set". This 'something' will often be a function of the data that was ignored when fitting the model to the training set. Outliers which were thrown out because they were...outliers might return with a vengeance in the form of a new set of compounds that are enriched in their particular properties which were ignored.

But more fundamentally, the very nature of the model used to fit the training set may be severely compromised. In its simplest incarnation for instance, linear regression may be used to fit data points to a set of relationships that are inherently non-linear. In addition, descriptors (such as molecular properties supposedly related to biological activity) may not be independent.
As the paper notes, "The tools are inadequate when the model is non-linear or the descriptors are correlated, and one of these conditions always holds when drug responses and biological activity are involved". This problem penetrates into every level of drug discovery modeling, from basic molecular level QSAR to higher-level clinical or toxicological modeling. Only a judicious and high-quality application of statistics, constant validation, and a willingness to wait (for publication, press releases etc.) before the entire analysis is available will preclude erroneous results from seeing the light of day.

3. Data Curation: This is an issue that should be of enormous interest to not just modelers but to all kinds of chemical and biological scientists concerned about information accuracy. The well-known principle of Garbage In, Garbage Out (GIGO) is at work here. The bottom line is that there is an enormous amount of chemical data on the internet that is flawed. For instance there are cases where incorrect structures were inferred from correct names of compounds:

"The structure of gallamine triethiodide is a good illustrative example where many major databases ended up containing the same mistaken datum. Until mid-2011, anyone relying on an internet search would have erroneously concluded that gallamine triethiodide is a tribasic amine. The error resulted from mis-parsing the common name at some point as meaning that the co...
Clark, R., & Waldman, M. (2011) Lions and tigers and bears, oh my! Three barriers to progress in computer-aided molecular design. Journal of Computer-Aided Molecular Design. DOI: 10.1007/s10822-011-9504-3
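The "Uncertainty" point above, a linear model fitted to an inherently non-linear relationship and then applied to compounds unlike the training set, can be sketched in a few lines. This is only an illustration under stated assumptions: the quadratic "activity" function, the descriptor ranges and the noise level are all made up, and the range check at the end is a deliberately crude stand-in for a real applicability-domain estimate.

```python
import numpy as np

rng = np.random.default_rng(0)


def true_activity(x):
    """Hypothetical non-linear descriptor-activity relationship (an assumption)."""
    return x ** 2


# Training set: descriptor values between 0 and 1, with a little noise.
x_train = np.linspace(0.0, 1.0, 20)
y_train = true_activity(x_train) + rng.normal(0.0, 0.02, x_train.size)

# Test set: new "compounds" whose descriptor lies outside the training range.
x_test = np.linspace(2.0, 3.0, 20)
y_test = true_activity(x_test)

# Fit a straight line even though the underlying relationship is quadratic.
slope, intercept = np.polyfit(x_train, y_train, deg=1)


def predict(x):
    return slope * x + intercept


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


train_rmse = rmse(y_train, predict(x_train))
test_rmse = rmse(y_test, predict(x_test))

# Crude applicability check: flag test points outside the training range.
outside_domain = (x_test < x_train.min()) | (x_test > x_train.max())

print(f"training RMSE: {train_rmse:.3f}")
print(f"test RMSE:     {test_rmse:.3f}")
print(f"test points outside training range: {int(outside_domain.sum())} of {x_test.size}")
```

The fit looks respectable on the training set and falls apart on the test set, and the only warning available beforehand is the observation that every test point lies outside the range the model was trained on.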
by The Curious Wavefunction in The Curious Wavefunction
A recent issue of Science has an article discussing an issue that has been a constant headache for anyone involved with any kind of modeling in drug discovery - the lack of reproducibility in computational science. The author Roger Peng, who is a biostatistician at Johns Hopkins, talks about modeling standards in general but I think many of his caveats could apply to drug discovery modeling. The problem has been recognized for a few years now but there have been very few concerted efforts to address it. An old anecdote from my graduate advisor's research drives the point home. He wanted to replicate a protein-ligand docking study done with a compound so he contacted the scientist who had performed the study and processed the protein and ligand according to that scientist's protocol. He appropriately adjusted the parameters and ran the experiment. To his surprise he got a very different result. He repeated the protocol several times but consistently saw the wrong result. Finally he called up the original researcher. The two went over the protocol a few times and finally realized that the problem lay in a minor but overlooked detail - the two scientists were using slightly different versions of the modeling software. This wasn't even a new version, just an update, but for some reason it was enough to significantly change the results.

These and other problems dot the landscape of modeling in drug discovery. The biggest problem to begin with is of course the sheer lack of reporting of details in modeling studies. I have seen more than my share of papers where the authors find it enough to simply state the name of the software used for modeling. No mention of parameters, versions, inputs, "pre-processing" steps, hardware, operating system, computer time or "expert" tweaking. The last factor is crucial and I will come back to it. In any case, it's quite obvious that no modeling study can be reproducible without these details.
Ironically, the same process that made modeling more accessible to the experimental masses has also encouraged the reporting of incomplete results; the incarnation of simulation as black-box technology has inspired experimentalists to widely use it, but on the flip side it has also discouraged many from being concerned about communicating under-the-hood details.

A related problem is the lack of objective statistical validation in reporting modeling results, a very important topic that has been highlighted recently. Even when protocols are supposedly accurately described, the absence of error bars or statistical variation means that one can get a different result even if the original recipe is meticulously followed. Even simple things like docking runs can give slightly different numbers on the same system, so it's important to be mindful of variation in the results along with their probable causes. Feynman talked about the irreproducibility of individual experiments in quantum mechanics, and while it's not quite that bad in modeling, it's still not irrelevant.

This brings us to one of those important but often unquantifiable factors in successful modeling campaigns - the role of expert knowledge and intuition. Since modeling is still an inexact science (and will probably remain so for the foreseeable future), intuition, gut feelings and a "feel" for the particular system under consideration based on experience can often be an important part of massaging the protocol to deliver the desired results. At least in some cases these intangibles are captured in any number of little tweaks, from constraining the geometry of certain parts of a molecule based on past knowledge to suddenly using a previously unexpected technique to improve the clarity of the data. A lot of this is never reported in papers and some of it probably can't be. But is there a way to capture and communicate at least the tangible part of this kind of thinking?
The paper alludes to a possible simple solution and this solution will have to be implemented by journals. Any modeling protocol generates a log file which can be easily interpreted by the relevant program. In the case of some modeling software like Schrödinger, there's also a script that records every step in a format comprehensible to the program. Almost any little tweak that you make is usually recorded in these files or scripts. A log file is more accurate than an English language description at documenting concrete steps. One can imagine a generic log-file-generating program which can record the steps across different modeling programs. This kind of venture will need collaboration between different software companies but it could be very useful in providing a single log file that captures as much of both the tangible and intangible thought processes of the modeler as possible. Journals could insist that authors upload these log files and make them available to the community.

Ultimately it's journals which can play the biggest role in the implementation of rigorous and useful modeling standards. In the Science article the author describes a very useful system of communicating modeling results used by the journal Biostatistics. Under this system authors doing simulation can request a "reproducibility review" in which one of the associate editors runs the protocols using the code supplied by the authors. Papers which pass this test are clearly flagged as "R" - reviewed for reproducibility. At the very least, this system gives readers a way to distinguish rigorously validated papers from others so that they know which ones to trust more. You would think that there would be backlash against the system from those who don't want to explicitly display the lack of verification of their protocols, but the fact that it's working seems to indicate its value to the community at large.
Unfortunately in the case of drug discovery, any such system will have to deal with the problem of proprietary data. There are several papers without such data which could also benefit from this system, but there can be ways to handle proprietary data. Even proprietary data can be amenable to partial reproducibility. In a typical example for instance, molecular structures which are proprietary could be encoded into special organization-specific formats that are hard to decode (an example would be formats used by OpenEye or Rosetta). One could still run a set of modeling protocols on this cryptic data set and generate statistics without revealing the identity of the structures. Naturally there will have to be safeguards against the misuse of any such evaluation but it's hard to see why they would be difficult to institute.

Finally, it's only a community-wide effort drawing equally on industry and academia which can lead to the validation and use of successful modeling protocols. The article suggests creating a kind of "CodeMed Central" repository akin to PubMed Central, and I think modeling could greatly benefit from such a central data source. Code for successful protocols in virtual screening or homology modeling or molecular dynamics or what have you can be uploaded to a site (along with the log files of course). Not only would these protocols be used to verify their reproducibility, but they could also be used to practically aid data extraction from similar systems. The community as a whole would benefit. Before there's any data generation or sharing, before there's any drawing of conclusions, before there's any advancement of scientific knowledge, there's reproducibility, a scientific virtue that has guided every field of science since its modern origin. Sadly this virtue has been neglected in modeling, so it's about time that we pay more attention to it.
Peng, R. (2011) Reproducible Research in Computational Science. Science, 334(6060), 1226-1227. DOI: 10.1126/science.1213847
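The details the post lists as routinely missing (software version, parameters, inputs, operating system) are cheap to capture automatically. Below is a minimal sketch of such a machine-readable run record, using only the Python standard library. The function name `build_run_record`, the tool name, the version string and the parameter names are all hypothetical placeholders, not the logging format of any real modeling package.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone


def build_run_record(software, version, parameters, input_text):
    """Collect the details needed to rerun a calculation (hypothetical schema)."""
    return {
        "software": software,
        "software_version": version,
        "parameters": parameters,
        # A hash pins down the exact input without publishing the input itself.
        "input_sha256": hashlib.sha256(input_text.encode("utf-8")).hexdigest(),
        "python_version": sys.version.split()[0],
        "operating_system": platform.platform(),
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
    }


record = build_run_record(
    software="example-docking-tool",       # hypothetical program name
    version="1.2.3-update4",               # updates matter, not just major versions
    parameters={"exhaustiveness": 8, "random_seed": 42},  # hypothetical settings
    input_text="placeholder ligand file contents",
)

print(json.dumps(record, indent=2, sort_keys=True))
```

Storing a hash of the input rather than the input itself also fits the proprietary-data concern raised above: two groups can confirm that they ran the same structure file without either of them disclosing it.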
by The Curious Wavefunction in The Curious Wavefunction
Air travel constitutes the safest mode of travel in the world today. What is even more impressive is the way airplanes are designed by modeling and simulation, sometimes before the actual prototype is built. In fact simulation has been a mainstay in the aeronautical industry for a long time and what seems like a tremendously complex interaction of metal, plastic and the unpredictable movements of air flow can now be reasonably captured in a computer model.

In a recent paper, Walter Woltosz of Simulations Plus Inc. asks an interesting question: compared to the aeronautical industry where modeling has been applied to airplane design for decades, why has it taken so long for modeling to catch on in the pharmaceutical industry? In contrast to simulation in airplane design, which is now a well-accepted and widely used tool, why is simulation of drugs and proteins still (relatively) in the doldrums? Much progress has surely been made in the field during the last thirty years or so, but modeling is nowhere near as integrated in the drug discovery process as computational fluid dynamics is in the airplane design process.

Woltosz has an interesting perspective on the topic since he himself was involved in modeling the early Space Shuttles. As he recounts, what's interesting about modeling in the aeronautical field is that NASA was extensively using primitive 70s computers to do it even before they built the real thing. A lot of modeling in aeronautics involves figuring out the right sequence of movements an aircraft should take in order to keep itself from breaking apart. Some of it involves solving the Navier-Stokes equations that dictate the complicated air flow around the plane, some of it involves studying the structural and directional effects of different kinds of loads on materials used for construction.
The system may seem complicated but as Woltosz tells it, simulation is now used ubiquitously in the industry to discard bad models and tweak good ones.

Compare that to the drug discovery field. The first simulations of pharmaceutically relevant systems started in the early 80s. Since then the field has progressed in fits and starts and while many advances have come in the last two decades, modeling approaches are not a seamless part of the process. Why the difference? Woltosz comes up with some intriguing reasons, some obvious and others more thought-provoking.

1. First and foremost of course, biological systems are vastly more complicated than aeronautical systems. Derek has already written at length about the fallacy of applying engineering analogies to drug discovery and I would definitely recommend his thoughts on the topic. In the case of modeling, I have already mentioned that the modeling community is getting ahead of itself by trying to bite off more complexity than it can chew. Firstly you need to have a list of parts to simulate and we are still very much in the process of putting together this list. Secondly, having the list will tell us little about how the parts interact. Biological systems display complex feedback loops, non-linear signal-response features and functional "cliffs" where a small change in the input can lead to a big change in the output. As Woltosz notes, while aeronautical systems can also be complex, their inputs are much better defined.

But the real difference is that we can actually build an airplane to test our theories and simulations. The chemical analogy would be the synthesis of a complex molecule like a natural product to test the principles that went into planning its construction. In the golden age of organic synthesis, synthetic feats were undertaken for structure confirmation but also to validate our understanding of the principles of physical organic chemistry, conformational analysis and molecular reactivity.
Even if we get to a point where we think we have a sound grounding of the principles governing the construction and workings of a cell, it's going to be a while before we can truly confirm those principles by building a working cell from scratch.

2. Another interesting point concerns the training of drug discovery researchers. Woltosz is probably right that engineers are much more of generalists than pharmaceutical scientists who are usually rigidly divided into synthetic chemists, biologists, pharmacologists, modelers, process engineers etc. The drawback of this compartmentalization is something I have experienced myself as a modeler; scientists from different disciplines can mistrust each other and downplay the value of other disciplines in the discovery of a new drug. This is in spite of the fact that drug discovery is an inherently complex and multidisciplinary process which can only benefit from an eclectic mix of backgrounds and approaches. A related problem is that some bench chemists, even those who respect modeling, want modeling to provide answers, but they don't want to run experiments (such as negative controls) which can advance the state of the field. They are reluctant to carry out the kind of basic measurements (such as measuring solvation energies of simple organic molecules) which would be enormously valuable in benchmarking modeling techniques. A lot of this is unfortunate since it's experimentalists themselves who are going to ultimately benefit from highly validated computational approaches.

There's another point which Woltosz does not mention but which I think is quite important. Unlike chemists, engineers are usually more naturally inclined to learn programming and mathematical modeling. Most engineers I know know at least some programming. Even if they don't extensively write code they can still use Matlab or Mathematica, and this is independent of their specialty (mechanical, civil, electrical etc.).
But you would be hard-pressed to find a synthetic organic chemist with programming skills. Also, since engineering is inherently a more mathematically oriented discipline, you would expect an engineer to be more open to exploring simulation even if he doesn't do it himself. It's more about the culture than anything else. That might explain the enthusiasm of early NASA engineers to plunge readily into simulation. The closest chemical analog to a NASA engineer would be a physical chemist, especially a mathematically inclined quantum chemist who may have used computational techniques even in the 70s, but how many quantum chemists (as compared to synthetic chemists for instance) work in the pharmaceutical industry? The lesson to be drawn here is that programming, simulation and better mathematical grounding need to be more widely integrated in the traditional education of chemists of all stripes, especially those inclined toward the life sciences.

3. The third point that Woltosz makes concerns the existence of a comprehensive knowledge base for validating modeling techniques and he thinks that a pretty good knowledge base exists today upon which we can build useful modeling tools. I am not so sure. Woltosz is mainly talking about physiological data and while that's certainly valuable, the problem exists even at much simpler levels. I would like to stress again that even simple physicochemical measurements of parameters such as solvation energies which can contribute to benchmarking modeling algorithms are largely missing, mainly because they are unglamorous and underfunded. On the bright side, there have been at least some areas like virtual screening where researchers have judiciously put together robust datasets for testing their methods. But there's a long way to go and much robust basic scientific experimental data needs to be gathered.
Again, this can come about only if scientists from other fields recognize the potential long-term value that modeling can bring to drug discovery and contribute to its advancement.

Woltosz's analogy of drug design and airplane design also reminds me of something that Freeman Dyson once wrote about the history of flight. In "Imagined Worlds", Dyson described the whole history of flight as a process of Darwinian evolution in which many designs (and lives) were destroyed in the service of better ones. Perhaps we also need a merciless process of Darwinian evaluation in modeling. Some of this is already taking place in the field of protein modeling with CASP and in protein-ligand modeling with SAMPL, but the fact remains that the drug discovery community as a whole (and not just modelers) will have to descend o...
Woltosz, W. (2011) If we designed airplanes like we design drugs…. Journal of Computer-Aided Molecular Design. DOI: 10.1007/s10822-011-9490-5
by The Curious Wavefunction in The Curious Wavefunction
Computational chemistry as an independent discipline has its roots in theoretical chemistry, itself an outgrowth of the revolutions in quantum mechanics in the 1920s and 30s. Theoretical and quantum chemistry advanced rapidly in the postwar era and led to many protocols for calculating molecular and electronic properties which became amenable to algorithmic implementation once computers came on the scene. Rapid growth in software and hardware in the 80s and 90s led to the transformation of theoretical chemistry into computational chemistry and to the availability of standardized, relatively easy to use computer programs like GAUSSIAN. By the end of the first decade of the new century, the field had advanced to a stage where key properties of simple molecular systems such as energies, dipole moments and stable geometries could be calculated in many cases from first principles with an accuracy matching experiment. Developments in computational chemistry were recognized by the Nobel Prize for chemistry awarded in 1998 to John Pople and Walter Kohn.

In parallel with these theoretical advances, another thread started developing in the 80s which attempted something much more ambitious: to apply the principles of theoretical and computational chemistry to complex systems like proteins and other biological macromolecules and to study their interactions with drugs. The practitioners of this paradigm wisely realized that it would be futile to calculate properties of such complex systems from first principles, thus leading to the initiation of parametrized approaches in which properties would be "pre-fit" to experiment rather than calculated ab initio. Typically there would be an extensive set of experimental data (the training set) which would be used to parametrize algorithms which would then be applied to unknown systems (the test set).
The adoption of this approach led to molecular mechanics and molecular dynamics - both grounded in classical physics - and to quantitative structure activity relationships (QSAR) which sought to correlate molecular descriptors of various kinds to biological activity. The first productive approach to docking a small molecule in a protein active site in its lowest energy configuration was refined by Irwin "Tack" Kuntz at UCSF. And beginning in the 60s, Corwin Hansch at Pomona College had already made remarkable forays into QSAR.

These methods gradually started to be applied to actual drug discovery in the pharmaceutical industry. Yet it was easy to see that the field was getting far ahead of itself and in fact even today it suffers from the same challenges that plagued it thirty years back. Firstly, nobody had solved the twin cardinal problems of modeling protein-ligand interactions. The first one was conformational sampling wherein you had to exhaustively search the conformation space of a ligand or protein. The second one was energetic ranking wherein you had to rank these structures, either in their isolated form or in the context of their interactions with a protein. Both of these problems remain the central problems of computation as applied to drug discovery. In the context of QSAR, spurious correlations based on complex combinations of descriptors can easily befuddle its practitioners and create an illusion of causation. Furthermore, there have been various long-standing problems such as the transferability of parameters from a known training set to an unknown test set, the calculation of solvation energies for even the simplest molecules and the estimation of entropies. And finally, it's all too easy to forget the sheer complexity of the protein systems we are trying to address which display a stunning variety of behaviors, from large conformational changes to allosteric binding to complicated changes in ionization states and interactions with water.
The bottom line is that in many cases we just don't understand the system which we are trying to model well enough.

Not surprisingly, a young field still plagued with multiple problems could be relied upon as no more than a guide when it came to solving practical problems in drug design. Yet the discipline saw unfortunate failures in PR as it was periodically hyped. Even in the 80s there were murmurs about designing drugs using computers alone. Part of the hype unfortunately came from the practitioners themselves who were less than cautious about announcing the strengths and limitations of their approaches. The consequence was that although there continued to be significant advances in both computing power and algorithms, many in the drug discovery community looked at the discipline with a jaundiced eye.

Yet the significance of the problems that the field is trying to address means that it will continue to be promising. What's its future and what would be the most productive direction in which it could be steered? An interesting set of thoughts is offered in a set of articles published in the Journal of Computer-Aided Molecular Design. The articles are written by experienced practitioners in the field and offer a variety of opinions, critiques and analyses which should be read by all those interested in the future of modeling in the life sciences.

Jurgen Bajorath from the University of Bonn along with his fellow modelers from Novartis laments the fact that studies in the field have not aspired to a high standard of validation, presentation and reproducibility. This is an important point. No scientific field can advance if there is wide variation in the presentation of the quality of its results. When it comes to modeling in drug discovery, the proper use of statistics and well-defined metrics has been highly subjective, leading to great difficulty in separating the wheat from the chaff and honestly assessing the impact of specific techniques.
Rigorous statistical validation in particular has been virtually non-existent, with the highly suspect correlation coefficients being the most refined weapon of choice for many scientists in the field. An important step in emphasizing the virtue of objective statistical methods in modeling was taken by Anthony Nicholls of OpenEye Software who in a series of important articles laid out the statistical standards and sensible metrics that any well-validated molecular modeling study should aspire to. I suspect that these articles will go down in the annals of the field as key documents.

In addition, as MIT physics professor Walter Lewin is fond of constantly emphasizing in his popular lectures, any measurement you make without knowledge of its uncertainty is meaningless. It is remarkable that in a field as fraught with complexity as modeling, there has been a rather insouciant indifference to the estimation of error and uncertainty. Modelers egregiously quote numbers involving protein-ligand energies, dipole moments and other properties to four or six significant figures when ideally those numbers are suspect even to one decimal point. Part of the problem has simply been an insufficient grounding in statistics. Tying every number to its estimated error margin (if it can be estimated at all) will not only give experimentalists and other modelers an accurate feel for the validity of the analysis and the ensuing improvement of methods but will also keep semi-naive interpreters from being overly impressed by the numbers. Whether it's finance or pharmaceutical modeling, it's always a bad idea to get swayed by figures.

Then there's the whole issue, as the modelers from Novartis emphasize, of spreading the love. The past few years have seen the emergence of several rigorously constructed datasets carefully designed to test and benchmark different modeling algorithms.
The problem is that these datasets have been most often validated in an industry that's famous for its secrecy. Until the pharmaceutical industry makes at least some efforts to divulge the results of its studies, a true assessment of the value of modeling methods will always come in fits and starts. I have been recently reading Michael Nielsen's eye-opening book on open science, and it's startling to realize the gains in advancement of knowledge that can result from sharing of problems, solutions and ideas. If modeling is to advance and practically contribute to drug discovery, it's imperative for industry - historically the most valuable generator of any kind of data in drug discovery - to open its vaults and allow scientists to use its wisdom t... Read more »
Bajorath, J. (2011) Computational chemistry in pharmaceutical research: at the crossroads. Journal of Computer-Aided Molecular Design. DOI: 10.1007/s10822-011-9488-z
Martin, E., Ertl, P., Hunt, P., Duca, J., & Lewis, R. (2011) Gazing into the crystal ball; the future of computer-aided drug design. Journal of Computer-Aided Molecular Design. DOI: 10.1007/s10822-011-9487-0
Maggiora, G. (2011) Is there a future for computational chemistry in drug research? Journal of Computer-Aided Molecular Design. DOI: 10.1007/s10822-011-9493-2
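The post's point about quoting a correlation coefficient without its uncertainty can be made concrete with a short sketch. The following is only an illustration, not a method taken from any of the cited articles: it computes a Pearson r for a small set of predicted and measured binding energies and attaches a percentile bootstrap interval to it. The data values, the function names and the choice of 2000 resamples are all assumptions made up for this example.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample (all points identical)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(xs, ys, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for r, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical predicted vs. measured binding energies in kcal/mol (made up).
predicted = [-9.1, -8.4, -7.9, -7.7, -7.2, -6.8, -6.5, -6.1]
measured = [-8.7, -8.9, -7.1, -8.0, -6.6, -7.4, -6.0, -6.9]

r = pearson_r(predicted, measured)
low, high = bootstrap_ci(predicted, measured)
print(f"r = {r:.2f}, 95% bootstrap interval = [{low:.2f}, {high:.2f}]")
```

With only eight compounds the interval is wide, which is exactly the kind of information a bare r value hides.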
by The Curious Wavefunction in The Curious Wavefunction
One of the most fun things about chemistry is that for every laundry list of examples, there is always a counterexample. The counterexample does not really violate any general principles, but it enriches our understanding of the principle by demonstrating its richness and complexity. And it keeps chemists busy.

One such key principle is the hydrophobic effect, an effect with an astounding range of applicability, from the origin of life to cake baking to drug design. Textbook definitions will tell you that the signature of the "classical" hydrophobic effect is a negative heat capacity change resulting from the union of two unfavorably solvated molecular entities. The nonpolar surface area of the solute is usually proportional to the change in heat capacity. The textbooks will also tell you that the hydrophobic effect is favorable principally because of entropy; the displacement of "unhappy" water molecules that are otherwise uncomfortably bound up in solvating a solute contributes to a net favorable change in free energy. Remember, free energy is composed of both enthalpy and entropy (∆G = ∆H - T∆S) and it's the latter term that's thought to lead to hydrophobic heaven.

But not always. Here's a nice example of a protein-ligand interaction where the improvements in free energy across a series of similar molecules come not from entropy but from improved enthalpy, with the entropy actually being unfavorable. A group from the University of Texas tested the binding of a series of tripeptides against the Grb2 protein SH2 domain. The exact details of the protein are not important; what's important is that the molecules only differed in the size of the cycloalkane ring in the central residue of the peptide, going from a cyclopropane to a cyclohexane. They found that the free energy of binding improves as you go from a 3-membered to a 5-membered ring, but not for the reason you would expect, namely a greater hydrophobic effect and entropic gain from the larger and more lipophilic rings.

Instead, when they experimentally break down the free energy into enthalpy and entropy using isothermal titration calorimetry (ITC), they find that all the gain in free energy is from enthalpy. They find that every extra methylene group contributes about 0.7 kcal/mol to the interaction. In fact the entropy becomes unfavorable, not favorable, as you move up the series. There's another surprise waiting in the crystal structures of the complexes. There are a couple of ordered water molecules stuck in some of the complexes. Ordered water molecules are fixed in one place and are "unhappy", so you would expect these complexes to display unfavorable free energy. Again, you would be surprised. It's the ones without ordered water molecules that have worse free energy. The final nail in the coffin of conventional hydrophobic thinking is the observation that the free energy does not even correlate with decreased heat capacity, something that's supposed to be a hallmark of the "classical" hydrophobic effect.

Now it's probably not too surprising to find the enthalpy being favorable; after all, as they note, you are making more van der Waals contacts with the protein with larger rings and greater nonpolar surface area. But in most general cases this value is small, and the dominant contribution to the free energy is supposed to come from the "classical" hydrophobic effect with attendant displacement of waters. Not in this case, where enthalpy dominates and entropy worsens. They don't really speculate much on why this may be happening. One factor that comes to my mind is the flexibility of the protein. The improved contacts between the larger rings and the protein may well be enforcing rigidity in the protein, leading to a sort of "ligand enthalpy - protein entropy" compensation. Unfortunately a comparison between bound and unbound protein is precluded by the fact that the free protein forms not a monomer but a domain-swapped dimer. In this case I think that molecular dynamics simulations might be able to shed some light on the flexibility of the free protein compared to the bound structures; it might especially be worthwhile to do this exercise in the absence of the apo structure.

Nonetheless, this study provides a nice counterexample to the conventional thermodynamic signature of the hydrophobic effect. The textbooks probably don't need to be rewritten anytime soon, but chemists will continue to be frustrated, busy and amused as they keep trying to tame these unruly creatures, the annoying wrinkles in the data, into an organized whole.
Myslinski, J., DeLorbe, J., Clements, J., & Martin, S. (2011) Protein–Ligand Interactions: Thermodynamic Effects Associated with Increasing Nonpolar Surface Area. Journal of the American Chemical Society. DOI: 10.1021/ja2068752
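To get a feel for what 0.7 kcal/mol per methylene means in practice, the relation ∆G = ∆H - T∆S quoted in the post can be combined with the standard link between free energy and binding constant, ∆G = -RT ln K. The sketch below is a back-of-the-envelope illustration only; the temperature of 298.15 K, the function names and the example ∆G and ∆H values are assumptions, not numbers from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def affinity_fold_change(ddg_kcal, temperature_k=298.15):
    """Fold improvement in binding constant for a free-energy gain of ddg_kcal.

    Follows from dG = -RT ln K, so K2/K1 = exp(ddG / RT) for a gain ddG > 0.
    """
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


def entropic_term(dg_kcal, dh_kcal):
    """Return -T*dS (kcal/mol) from dG = dH - T*dS, as done with ITC data."""
    return dg_kcal - dh_kcal


# About 0.7 kcal/mol per extra methylene group, as stated in the post.
per_methylene = affinity_fold_change(0.7)
print(f"one methylene: ~{per_methylene:.1f}-fold tighter binding")

# Hypothetical ITC numbers (made up): dG = -8.0, dH = -10.0 kcal/mol.
# A positive -T*dS means the entropy opposes binding, as in the study.
print(f"-T*dS = {entropic_term(-8.0, -10.0):+.1f} kcal/mol")
```

Each added methylene is therefore worth roughly a threefold gain in affinity at room temperature, under the assumptions above.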
by The Curious Wavefunction in The Curious Wavefunction
The last decade has been a bonanza decade for the elucidation of structures of G Protein-Coupled Receptors (GPCRs), culminating with the landmark structure of the first GPCR-G protein complex published a few weeks ago. With 30% of all drugs targeting these proteins and their involvement in virtually every key aspect of health and disease, GPCRs remain glowingly important targets for pure and applied science.

Yet there are miles to go before we sleep. Although we now have more than a dozen structures of half a dozen GPCRs in various states (inactive, active, G-protein coupled), there are still hundreds of GPCRs whose structures are not known. The existing structures all belong to 'Class A' GPCRs. We still have to mine the vast body of Class B and C GPCRs, which comprise a huge number of functionally relevant proteins. The crystal structures which we do have are an invaluable resource, but from the point of view of drug discovery we still don't have enough.

In the absence of crystal structures, homology modeling, wherein a protein of high sequence homology is used to build a computational model for an unknown structure, has been the favorite tool of modelers and structural biologists. Homology modelers were recently provided an opportunity to pit their skills against nature when a contest asked them to predict the structures of the D3 and CXCR4 receptors just before the real x-ray structures came out. Both proteins are important targets involved in multiple processes like neurotransmission, depression, psychoses, cancer and HIV infection. The D3 structure prediction involved predicting the ligand-bound structure of the protein complexed with eticlopride, a D3 antagonist.

The results of the contest have been published before, but in a recent Nature Chemical Biology paper, a team led by Brian Shoichet (UCSF) and Bryan Roth (UNC-Chapel Hill) performs another test of homology modeling, this time connected to the ability to virtually screen potential D3 receptor ligands and discover novel active molecules with interesting chemotypes.

Two experiments provided the comparison. One protocol used the D3 homology model to screen about 3 million compounds by docking, out of which about 20 were picked, based on docking scores and inspection, and tested in assays. The homology model was built on the basis of the published structure of the β2 adrenergic receptor, which has been heavily studied structurally. Then, after the x-ray structure of the D3 was released, they repeated the virtual screening protocol with the crystal structure; again, 3 million compounds out of which roughly 20 were picked and tested.

First the somewhat surprising and heartening result: both homology model and crystal structure demonstrated similar hit rates, about 20%. In both cases the actual affinity of the ligands ranged from about 200 nM to 3 µM. In addition, the screen revealed some novel chemotypes that did not resemble known D3 antagonists (although not surprisingly, some hits were similar to eticlopride). As an added bonus, the top ranked ligands using the homology model did not measurably inhibit the template β2 adrenergic receptor, which means that the homology model probably did not retain the "memory" of the original template.

Now for the bee in the bonnet. The very fact that the homology model and the crystal structure produced different hits means that the two models were not identical (only one hit overlapped between the two). Of course, it's too much to expect a model of a protein with thousands of moving parts to be identical to the experimental structure, but it goes to show how carefully homology modeling has to be performed and how it can still be imperfect. What is more disturbing is that the differences between the model and the crystal structure responsible for the different hits were small; in one case the difference between two carbons was only 1 Å between the two models. Other amino acids differed by less than that.

And all this even after generating a stupendous number of models of unbound and ligand-bound protein. As the paper says, the team generated about 98 million initial ligand-bound homology models. Screening the top models among these involved generating multiple conformations and binding modes of the 3 million compounds; the total number of discrete protein-ligand complexes resulting from this exercise numbered about 2 trillion. That this kind of evaluation is possible is a tribute to the enormous computing power we have at our fingertips. But it's also a commentary on how relatively primitive our models are, so that we are still at a loss to predict minute structural differences with significant consequences in finding new active molecules.

So where does this lead us? I think it's really useful to be able to perform such comparisons between homology models and crystal structures and we can only hope more such comparisons will be possible by virtue of an increasing pipeline of GPCR structures. Yet these exercises demonstrate how challenging it is to generate a truly accurate homology model. A few years ago a similar study demonstrated that a difference in a single torsional angle of a phenylalanine residue (and that too resulting in a counter-intuitive gauche conformation) affected the binding of a ligand to a homology model of the β2 adrenergic receptor. Our ability to pinpoint such tiny differences in homology models is still in its infancy. And this is just for Class A GPCRs for which relatively accurate templates are available. Get into Class B and Class C territory and you start looking for the proverbial black cat in the dark.

Now throw in the fascinating phenomenon of functional selectivity and you have a real wrench in the works. Functional selectivity, whereby different conformations of a GPCR binding to the same ligand modulate different signal transduction pathways and cause the ligand to change its mode of action (agonist, inverse agonist etc.), takes modeling of GPCRs to unknown levels of difficulty. Most modeling currently being done does not even attempt to consider protein flexibility, which is at the heart of functional selectivity. Routinely including protein flexibility in GPCR modeling has some way to go.

That is why I think that, as much as we will continue to learn from GPCR homology modeling, it's not going to contribute massively to GPCR drug discovery anytime soon. Constructing accurate homology models of even a fraction of the GPCR universe will take a long time. Using such models would be like throwing darts at a board for which the center is unknown. Until we can locate the center, and as long as we are plagued with the complexities of functional selectivity, we may be better off pursuing experimental approaches that can map the effect of ligands on a particular GPCR using multifunctional assays. Fortunately, such approaches are definitely seeing the light of day.
Carlsson, J., Coleman, R., Setola, V., Irwin, J., Fan, H., Schlessinger, A., Sali, A., Roth, B., & Shoichet, B. (2011) Ligand discovery from a dopamine D3 receptor homology model and crystal structure. Nature Chemical Biology. DOI: 10.1038/nchembio.662
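How firmly can one say that two screens of roughly 20 compounds each had "similar" hit rates of about 20%? A quick way to check is to put a confidence interval on a proportion. The sketch below uses the standard Wilson score interval; the count of 4 hits out of 20 tested is an assumed round number chosen to match "about 20" and "about 20%" in the post, not the exact figures from the paper.

```python
import math


def wilson_interval(hits, tested, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = hits / tested
    denom = 1 + z ** 2 / tested
    centre = (p + z ** 2 / (2 * tested)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / tested + z ** 2 / (4 * tested ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Assumed counts for illustration only: 4 actives among 20 tested compounds.
lower, upper = wilson_interval(4, 20)
print(f"hit rate 20%, 95% interval ~ {lower:.0%} to {upper:.0%}")
```

With samples this small the interval spans several-fold, so "similar hit rates" should be read as "not distinguishable at this sample size" rather than as evidence that the model and the crystal structure perform identically.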
by The Curious Wavefunction in The Curious Wavefunction
One of the questions I have pondered in the past is why the functional form of a protein should correspond to its most thermodynamically stable structure. Although this assumption is built into almost all experimental and theoretical studies of protein folding, it is not at all obvious since one may imagine other forms which could have improved stability. For instance, two protein forms may differ in the presence of a hydrogen bond or two. Based on the location and connectivity of these bonds, sometimes this slight rearrangement can cause a radical change in function, but there's no good reason why it should in the general case.
The answer however is most obvious in case of amyloid, that endlessly intriguing protein form that is implicated in so many devastating neurological disorders. Amyloid is a very stable state that is often highly resistant to temperature, pH and high salt conditions. It's fair to ask how stable or unstable it is with respect to functional, soluble forms of the same protein.
To answer this question, a team led by Christopher Dobson, who is a world expert on amyloid, performed a series of thermodynamic measurements on a diverse group of proteins in which they measured the free energy differences between the soluble and the amyloid state. The proteins included everything from the Aβ protein found in Alzheimer's disease to human lysozyme and insulin. The finding was that the free energy differences (ranging from about 3 kcal/mol to 6 kcal/mol) are not terribly dependent on the exact sequence, an observation which would be consistent with the striking recently uncovered fact that amyloid formation can be induced in almost any protein independent of its sequence. In fact the free energy difference seemed to depend more on the length and seemed to be optimal for a length of 100 residues, for which the amyloid form was most stable. The difference also sharply tipped away from amyloid for increasing lengths.
This observation seems to suggest that one consequence of evolving larger proteins might be to steer them away from the amyloid state, and it is consistent with the fact that almost all amyloid proteins have relatively short lengths (for instance, the Alzheimer's disease amyloid protein Aβ has a length of roughly 40 residues). The propensity toward amyloid formation also depended on the concentration, and the authors derived a limiting concentration beyond which amyloid formation would be rapid. This is again not surprising since the concentration-dependence of the process has also been demonstrated.
The real surprise came when they compared these limiting concentrations of the protein to the corresponding physiological concentrations of the same proteins in plasma. Remarkably, they found that in almost every case the physiological concentration was higher than that required to achieve amyloid formation. Thus the observations clearly indicate that for many key proteins, the amyloid state is thermodynamically more stable than the native, functional state. To put it bluntly, many nicely folded and soluble proteins are actually metastable. Now, since native proteins don't constantly form amyloid and kill us all, it's clear that the barrier to amyloid formation must be kinetic. Intriguingly, the authors speculate that these barriers can be overcome when organisms are exposed to stress, mutations or aging.
This is a pretty intriguing study and seems to underscore the belief that at least for some proteins, the folded functional state is not the most stable. However in light of what we know about evolution, this should not be too surprising. Stability is just one of many factors to be optimized during natural selection and there is no reason to assume that evolution would always act to maximize this parameter at the cost of all others. It's worth always keeping in mind that evolution cannot afford to aim for the ideal but instead has to make do with what it has.
The other question in my mind is why, in spite of these barriers existing in the case of so many proteins like lysozyme, insulin etc., they are regularly overcome only in the case of Aβ (1-42) and a select few others. Based on the speculation in the paper, this could be because these proteins are exposed to particularly harsh conditions that force them to climb past the kinetic barrier and settle into the amyloid valley of thermodynamic comfort and physiological woe.
Among many such conditions could very well be bacterial infections. A few years back I advanced a hypothesis about amyloid formation being a defense against viral and bacterial infection mediated through the production of free radicals. A kinetic barrier-surpassing mechanism of the kind speculated here might well be what allows these proteins to achieve the transition, killing the bacteria but ironically harming their owner in the process. In the context of the present study, I think there continue to be a lot of opportunities to investigate the possible infection-induced conversion of normal proteins to their amyloid form.
Hopefully someone will do the experiment.
Baldwin, A., Knowles, T., Tartaglia, G., Fitzpatrick, A., Devlin, G., Shammas, S., Waudby, C., Mossuto, M., Meehan, S., Gras, S.... (2011) Metastability of Native Proteins and the Phenomenon of Amyloid Formation. Journal of the American Chemical Society. DOI: 10.1021/ja2017703
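The link between a free energy difference of a few kcal/mol and a "limiting concentration" can be sketched with two textbook relations. This is a rough illustration under stated assumptions, not the analysis used in the paper: it treats the quoted 3 to 6 kcal/mol as the stabilization of the amyloid state relative to the soluble state, uses K = exp(∆G/RT) for the equilibrium ratio, and uses the simple linear-polymerization estimate that the critical concentration is c° / K with a 1 M standard state. The temperature and function names are likewise assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def equilibrium_ratio(stabilization_kcal, temperature_k=298.15):
    """Boltzmann ratio favouring the more stable state by stabilization_kcal."""
    return math.exp(stabilization_kcal / (R_KCAL * temperature_k))


def critical_concentration(stabilization_kcal, temperature_k=298.15, c0_molar=1.0):
    """Soluble concentration (M) above which aggregation is favoured.

    Assumes the simple relation c_crit = c0 / K for adding one monomer
    to a growing fibril; c0 is the (assumed) 1 M standard state.
    """
    return c0_molar / equilibrium_ratio(stabilization_kcal, temperature_k)


for dg in (3.0, 6.0):
    k = equilibrium_ratio(dg)
    c_crit = critical_concentration(dg)
    print(f"{dg:.0f} kcal/mol: K ~ {k:.0f}, critical conc. ~ {c_crit * 1e6:.0f} uM")
```

Under these assumptions, a protein present above its critical concentration is metastable in the sense described in the post: the amyloid state is thermodynamically preferred and only a kinetic barrier keeps the soluble form around.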
by The Curious Wavefunction in The Curious Wavefunction
Speaking of protein folding, here's something interesting. One of the most enduring views of protein folding from the last decade is that of an "energy funnel". The funnel was invented by the UCSD chemist Peter Wolynes in the 90s (the original paper is highly readable) and essentially depicts a plot of the configurational enthalpy (or effective energy) of the protein on the Y axis vs the configurational entropy on the X axis. In real situations this plot is multidimensional.

The funnel suggests a way out of Levinthal's paradox, which contrasts the fast folding times for virtually all proteins with the vast amount of conformational space to be searched. According to the funnel viewpoint, the energy of the protein on the Y axis decreases and becomes more favorable even as the entropy on the X axis decreases, leading to fewer conformations to be searched and allowing the protein to rapidly find the native structure. The funnel has become a mainstay of descriptions of protein folding and has made its way into textbooks.

The funnel view of protein folding had always puzzled me a little for the simple reason that we usually think of the enthalpy and entropy of the protein (and in fact of any chemical system) as opposing factors. Entropy would hinder the protein even as it formed more "native" contacts and led to a favorable enthalpy. Yet the funnel seems to suggest a synergy between these two factors. Many papers have said that the funnel "guides" the protein to its correct conformational state. In this week's Nature Chemical Biology, one of the founding fathers of the field, Martin Karplus, sheds some light on this confusion and informs us that the traditional view of the funnel is indeed a little misleading.

To support his argument, Karplus illustrates two examples of protein folding studies using two kinds of systems. One is a lattice model system in which the protein is approximated by beads on a lattice. Native contacts in the protein are indicated by adjacent beads on the lattice. The other folding simulation is a standard molecular dynamics simulation of an alpha helix. In both cases the proteins are small (about 30 residues) but their behavior at low and high temperatures is intriguing.

At low temperatures, the folding landscape is more "rugged" and folding is slower. This is a well-established concept and it simply means that there is less energy for the protein to explore all the available local minima. At high temperature the landscape is "smooth" and the protein has enough energy to explore many conformational states. What is striking is that while the effective energy (enthalpy) at high temperature decreases smoothly all the way to the native state, the free energy (which is what we should really be worrying about) has a significant barrier. Thus this barrier has to come from entropy. The crucial thing to note is that at high temperatures, the free energy is dominated by the increasingly unfavorable entropy engendered by the greater number of conformations that the protein has to search.

Ultimately it's easy to forget that the protein folding "funnel" is only a theoretical construct, an intuitive model. Has anyone actually observed a funnel for a real protein? As the article notes, for now the answer is a decided "No". Unfortunately it may be impossible to ever do so since to construct a real funnel one would need knowledge of every single conformational state that a protein visits on its way to folding. In addition since folding is a statistical phenomenon, one would also need knowledge of every starting trajectory. Needless to say, for now this is at best a pipe dream. However the funnel remains a useful construct provided we remember the subtleties and caveats that Karplus has described. Ultimately it's a model, and like other models it need not be real, but it should at least be useful.
Karplus, M. (2011) Behind the folding funnel diagram. Nature Chemical Biology, 7(7), 401-404. DOI: 10.1038/nchembio.565
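The argument that a smoothly decreasing energy can still hide a free energy barrier is easy to see in a toy calculation. The sketch below is not the lattice model or the MD simulation from the article; it is a deliberately crude, made-up one-dimensional model in which Q is the fraction of native contacts, the effective energy falls linearly as E(Q) = -εQ, and the configurational entropy collapses as S(Q) = S0(1 - Q)², so that F(Q) = E(Q) - T·S(Q). All functional forms, parameter values and names are assumptions chosen only to illustrate the point.

```python
def free_energy_profile(temperature, epsilon=1.0, s0=1.0, n_points=101):
    """Return (Q, E, F) tuples for the toy model, Q from 0 (unfolded) to 1 (native)."""
    profile = []
    for i in range(n_points):
        q = i / (n_points - 1)
        energy = -epsilon * q            # falls smoothly toward the native state
        entropy = s0 * (1.0 - q) ** 2    # conformational entropy lost on folding
        profile.append((q, energy, energy - temperature * entropy))
    return profile


def barrier_from_unfolded(profile):
    """Location and height of the highest free-energy point relative to Q = 0."""
    f_unfolded = profile[0][2]
    q_peak, _, f_peak = max(profile, key=lambda row: row[2])
    return q_peak, f_peak - f_unfolded


for temperature in (0.4, 1.0):
    q_peak, height = barrier_from_unfolded(free_energy_profile(temperature))
    print(f"T = {temperature}: barrier {height:.2f} at Q = {q_peak:.2f}")
```

In this toy model the energy decreases monotonically at every temperature, yet at the higher temperature the -T·S term creates a free energy maximum partway along Q, while at the lower temperature the profile is downhill throughout. The ruggedness that slows folding at low temperature is not represented here at all.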
by The Curious Wavefunction in The Curious Wavefunction
Blogging has been swamped lately by that miracle called life but I could not help but be drawn to a paper in this week's Science which describes a most unholy and unexpected stabilizing alliance in a protein's innards.

Proteins are known to form cross-links such as disulfide bonds to stabilize interactions with ligands and substrates. Any reasonable chemist would expect these kinds of interactions to be mediated between polar residues. But nature upstages us low-lifes once again. In this week's Science, a group led by Andrew Karplus reveals a stabilizing covalent cross-link between, hold your breath, a valine and a phenylalanine. Who could have imagined these two otherwise blissfully aloof and stable partners suddenly deciding to...bond?

As chemists know however, there is only one kind of chemical entity that can create such havoc with stable functional groups: a metal. It turns out that the protein is a four-helix bundle diiron protein with two Fe atoms bound in proximity to the Val and Phe. The two irons apparently create their own cofactor by neatly supplying electrons to bond the Val and Phe to each other and molding a cosy bed for themselves. The resolution is 1.2 Å so the electron density is unambiguous. The function of the unusual cross-link seems to be to provide a barrier that protects the iron from potential iron chelators; experiments indicate that the iron is rapidly mopped up by chelators in mutants lacking the cross-link. Intriguingly, the real function of the protein itself remains unknown.

Organometallic chemists who are burning the midnight oil trying to use metals to functionalize unreactive C-H bonds would not be too surprised that a metal is mediating such strange interactions. But the observation demonstrates something that chemists are all too familiar with by now: Nature has been there, and it's done that.
Cooley, R., Rhoads, T., Arp, D., & Karplus, P. (2011) A Diiron Protein Autogenerates a Valine-Phenylalanine Cross-Link. Science, 332(6032), 929-929. DOI: 10.1126/science.1205687
by The Curious Wavefunction in The Curious Wavefunction
I have been unable to blog for the past few days because I was busy moving to Chapel Hill for a postdoc at UNC Chapel Hill. I am very excited about this move and my upcoming research which is going to involve protein design and folding. Regular blogging will resume soon. Until then, happy holidays, and I will leave you with the following interesting paper published by a group from my new institution.

One of the abiding puzzles in the origin of life is to explain how life arose in the relatively small amount of time it had to evolve on the planet. From a chemical perspective, this entails explaining how especially slow chemical reactions could have contributed to the complexity of life. In a new paper in PNAS, a group from UNC suggests part of a possible solution to the puzzle by demonstrating that slow reactions in particular are accelerated by temperature much more than fast ones. Recall from college physical chemistry that the rate of a typical reaction roughly doubles with every ten-degree rise in temperature. As the authors note, this bit of textbook wisdom is off the mark when it comes to many important reactions and needs to be amended.

They look at certain important reactions like the hydrolysis of phosphate monoesters and find that these reactions are accelerated not two or a few fold but many million fold with a rise in temperature. The increase in rate would have been significant especially under the hot, primordial conditions present on earth during its early days. Now this acceleration is free-energetic and basically corresponds to a favorable change in either the entropy or the enthalpy of activation. The authors measure both these variables and find that the crucial change is in the enthalpy. It's interesting to note that a favorable change in the enthalpy would entail forming stronger interactions including hydrogen bonds between substrate and enzyme, and this is exactly the kind of process you would imagine happening during the optimization of biomolecular interactions during evolution. In fact, recent research suggests that this process of optimizing enthalpy is also mirrored during drug discovery. The authors end by explaining why a catalyst that impacted enthalpy rather than entropy favorably would have had a selective advantage in rate acceleration as the environment later cooled (and entropy became unfavorable).

Amusingly, the paper has come under criticism from some unexpected quarters, from none other than folks from the infamous 'Discovery' Institute which is funded and run by creationists. In the view of these esteemed 'scientists', the paper provides no evidence that the slow reactions which were accelerated were in fact ones which were important during the origin of life. The DI crowd seems to have fundamentally misjudged the nature of origins of life research; it's more speculative than many other fields but still remains scientific. More importantly, the criticism seems to have completely missed the fact that the general hypothesis proposed by the authors, that all slow reactions could have been vastly accelerated by temperature on a hot primordial planet, is independent of the exact nature of these reactions, which may or may not have contributed to life's origins. As usual, they miss the forest for the trees.
Stockbridge, R., Lewis, C., Yuan, Y., & Wolfenden, R. (2010) Impact of temperature on the time required for the establishment of primordial biochemistry, and for the evolution of enzymes. Proceedings of the National Academy of Sciences, 107(51), 22102-22105. DOI: 10.1073/pnas.1013647107
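The "doubles every ten degrees" rule and its failure for slow reactions both follow from the Arrhenius equation, k = A·exp(-Ea/RT). The sketch below is an illustration under simplifying assumptions: it uses the Arrhenius activation energy as a stand-in for the enthalpy of activation discussed in the post (the two differ by about RT), assumes Ea is constant over the temperature range, and uses made-up Ea values of 15, 30 and 45 kcal/mol rather than numbers from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def rate_ratio(ea_kcal, t1_celsius, t2_celsius):
    """k(T2)/k(T1) from the Arrhenius equation with constant Ea and prefactor."""
    t1 = t1_celsius + 273.15
    t2 = t2_celsius + 273.15
    return math.exp(ea_kcal / R_KCAL * (1.0 / t1 - 1.0 / t2))


def ea_for_q10(q10, t_celsius=25.0):
    """Activation energy (kcal/mol) giving a q10-fold speed-up per 10 degrees."""
    t1 = t_celsius + 273.15
    t2 = t1 + 10.0
    return R_KCAL * math.log(q10) / (1.0 / t1 - 1.0 / t2)


# The textbook doubling rule corresponds to a fairly modest barrier.
print(f"Q10 = 2 near 25 C implies Ea ~ {ea_for_q10(2.0):.1f} kcal/mol")

# Hypothetical barriers (made up): higher Ea means a slower reaction at 25 C
# and a far larger speed-up on heating from 25 C to 100 C.
for ea in (15.0, 30.0, 45.0):
    print(f"Ea = {ea:.0f} kcal/mol: {rate_ratio(ea, 25.0, 100.0):.3g}-fold faster at 100 C")
```

Because the speed-up depends exponentially on Ea, reactions with large barriers, which are the slow ones, gain orders of magnitude more from heating than reactions that obey the doubling rule.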
by The Curious Wavefunction in The Curious Wavefunction
One of the more important paradigm shifts in our understanding of the Alzheimer’s disease-causing amyloid protein in the last few years has been the recognition of differences between the well known polymer aggregates of amyloid and their smaller, soluble oligomer counterparts. For a long time it was believed that the fully formed 40-42 amino acid protein aggregate found in autopsies was the causative agent in AD, or at least the most toxic one. This understanding has radically changed in the last few years, partly through elegant work done in identifying oligomers and partly through the unfortunate results of clinical trials targeting amyloid. The new understanding is that it’s not the fully formed aggregates but the smaller oligomers that are the real toxic species. Identifying these different monomers, dimers, trimers and tetramers is a valuable goal. But until now their recognition has mainly depended on raising specific antibodies against them, a tedious and expensive process. Small molecule probes that specifically identify each oligomer have been missing. In a recent JACS communication, a team from the University of Michigan uses a simple but clever technique to develop such probes and makes a promising step in this direction. The probes are based on the idea that the best antidote against a poison is another poison. In this case the poison is the specific sequence of amino acids that makes up amyloid. In particular, a sequence of five amino acids- KLVFF- has been found to be sufficient for aggregation and toxicity. The aggregates form by the stacking of beta sheets principally driven by hydrophobic interaction between the FF residues; each pair thus serves as a growth site for addition of further such residues. The insight then is that if one could construct a mimic of the sequence, this mimic would basically act as a competitive inhibitor and bind to the normal sequence, inhibiting further growth. 
In this case the strategy was to use KLVFF segments themselves which would sort of wrap around newly formed oligomers of different constitution and sequester them from further self-assembly. So the team essentially constructed two KLVFF segments joined by a linker. The linker would also serve the purpose of providing an entropic advantage to the two segments so that they would not be at an energetic disadvantage during binding. The important question was how long the linker should be. To decide on the length of the linker the team made some clever use of molecular dynamics simulations. Since you can estimate the approximate thickness of every oligomer, you can estimate the linker length that would be required to keep two KLVFF segments at the same distance as the thickness of the oligomer. For instance, the distances between the segments needed to wrap around the oligomers were 14-15 Å for the dimer, 19-20 Å for the trimer and 24-25 Å for the tetramer. But the linker should also keep the segments stable at that distance. To probe this the team used MD simulations. The MD simulations revealed the length of the linker required to keep the two segments separated at the specific distances by indicating how much time the assembly spent at those distances.
To test these results, the team then generated mixtures of different kinds of KLVFF oligomers and then added each probe to the solution. A streptavidin moiety was attached to every probe. Silver staining revealed that each probe was specifically binding to an oligomer of a certain type dictated by the compatibility of the intraprobe distance and oligomer thickness. Trimers and tetramers could be clearly identified but there was more ambiguity in the case of dimers, presumably because of their less ordered structure. Most interestingly, the team then added the probes to cerebrospinal fluid (CSF). Since amyloid is part of normal physiology, it is present in CSF. 
Gratifyingly they found that the probes could very clearly label trimers and tetramers against a background of several other proteins and intermediates in CSF. This experiment notably demonstrates that the method can selectively detect amyloid oligomers in complex mixtures. I think that this work is valuable and paves the way toward the development of similar small-molecule based probes for identifying the key intermediates in amyloid formation. It could also be very useful in exploring amyloid formation in normal physiology and in exploring the stages of protein self-assembly in diverse amyloid-based diseases.
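The idea of judging a linker by "how much time the assembly spent" at a target distance can be sketched in a few lines. This is only an illustration of the bookkeeping, not the authors' analysis: the function names are hypothetical and the distance series is made up, standing in for per-frame distances between the two KLVFF segments extracted from an MD trajectory. The target windows are the ones quoted in the post.

```python
# Target inter-segment distances (in angstroms) quoted in the post.
TARGET_WINDOWS = {
    "dimer": (14.0, 15.0),
    "trimer": (19.0, 20.0),
    "tetramer": (24.0, 25.0),
}


def fraction_in_window(distances, low, high):
    """Fraction of trajectory frames whose distance lies within [low, high]."""
    if not distances:
        raise ValueError("need at least one frame")
    inside = sum(1 for d in distances if low <= d <= high)
    return inside / len(distances)


def best_match(distances):
    """Return the oligomer whose window the probe occupies most often, plus all scores."""
    scores = {
        name: fraction_in_window(distances, low, high)
        for name, (low, high) in TARGET_WINDOWS.items()
    }
    return max(scores, key=scores.get), scores


# Made-up per-frame distances for one hypothetical linker length.
example_distances = [18.7, 19.2, 19.5, 19.9, 20.4, 19.1, 14.8, 19.6]

winner, occupancy = best_match(example_distances)
print(winner, occupancy)
```

A linker would be a reasonable candidate for a given oligomer when the probe spends a large fraction of the simulation inside that oligomer's window and little time in the others.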
Reinke, A., Ung, P., Quintero, J., Carlson, H., & Gestwicki, J. (2010) Chemical Probes That Selectively Recognize the Earliest Aβ Oligomers in Complex Mixtures. Journal of the American Chemical Society. DOI: 10.1021/ja106291e
by The Curious Wavefunction in The Curious Wavefunction
Vijay Pande's group at Stanford has become well-known for using the collective force of millions of CPUs around the world for simulating protein folding in the project known as Folding@home. One of the enduring challenges in simulating folding has been to sample the long timescales that are common in real-life folding events, and recent breakthroughs have made accessing such time domains realistic. We should expect long protein folding simulations to be within the reach of many non-specialists in the next few years. In the latest issue of JACS, Pande's group provides an example of such advances by simulating the folding of a 39 residue protein called NTL9. The actual folding time is 1.5 ms so this is a substantially long MD simulation. To achieve this, Pande's group uses Graphics Processing Units (GPUs) of the kind that are found in video game modules. Over the last few years these units have made interesting biological phenomena accessible to chemists. C & EN has a nice article on the increasing use of GPUs for biomolecular simulation.
Pande's group also uses a set of statistical tools called Markov State Models (MSMs) to identify metastable folding states and the transition trajectories between them. MSMs provide a nifty strategy to bridge the results from several short trajectories (rather than running one long one).
What is endearing about the simulation is that the correct structure doesn't form until much later and then quickly falls in place, like a lost kid suddenly remembering his place in the marching band. As can be seen in the video below, the missing piece of the puzzle is a short C-terminal part of a beta-sheet which seems to linger as part of an alpha helix while the rest of the sheet structure forms. After comfortably waltzing around as a little helical piece for a long time, it seems to suddenly remember its correct identity and snaps and collapses into place as part of the beta sheet. 
Very nice!
Admittedly, a 39 residue protein is minuscule compared to most typical proteins. But the results provide a neat proof of concept. Importantly, they also show that current force fields with implicit solvent models can be accurate enough for this kind of simulation. Further validation will test these force fields more stringently.
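For readers wondering how a Markov State Model stitches many short trajectories together, here is a toy sketch. It assumes the hard part is already done, namely that every frame has been assigned to one of a few discrete states; real MSM construction also involves clustering, choosing a lag time and using more careful estimators, none of which is shown. The state labels, trajectories and function names are invented for illustration and are not from the paper.

```python
def count_transitions(trajectories, n_states, lag=1):
    """Count i -> j transitions at the given lag, pooled over all short trajectories."""
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        for t in range(len(traj) - lag):
            counts[traj[t]][traj[t + lag]] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize the counts into transition probabilities."""
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no outgoing transitions")
        matrix.append([c / total for c in row])
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Estimate long-time state populations by repeatedly applying the matrix."""
    n = len(matrix)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        p = [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return p


# Hypothetical discretized trajectories: 0 = unfolded, 1 = intermediate, 2 = folded.
short_trajectories = [
    [0, 0, 1, 0, 1, 1, 2],
    [1, 2, 2, 2, 1, 2],
    [0, 1, 1, 2, 2, 2],
]

counts = count_transitions(short_trajectories, n_states=3)
T = transition_matrix(counts)
populations = stationary_distribution(T)
print(T)
print(populations)
```

No single trajectory here needs to run from the unfolded state all the way to equilibrium; the pooled transition counts are what carry the long-timescale information.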
Voelz VA, Bowman GR, Beauchamp K, & Pande VS. (2010) Molecular simulation of ab initio protein folding for a millisecond folder NTL9(1-39). Journal of the American Chemical Society, 132(5), 1526-8. PMID: 20070076
by The Curious Wavefunction in The Curious Wavefunction
One of the great things about Bach’s organ music is how changes of a single note in a whole pattern can have rather dramatic effects on the sound. A unique and potentially very important similar phenomenon has been discovered recently in the area of GPCR research.
The understanding of the basic process by which GPCRs transmit signals from the cell exterior to the interior has seen remarkable advances in the last three decades, but much still remains to be deciphered. Our knowledge of signaling responses until now hinged on the action of agonists and antagonists. Central to this knowledge was the concept of ‘intrinsic efficacy’; according to this concept, there was no difference between two full agonists for instance, and both of them would produce the same response irrespective of the situation.
But this understanding failed to explain some observations. For instance, a full agonist would function as a partial agonist and even as an inverse agonist under different circumstances. Several such observations, especially in the context of GPCRs involved in neurotransmission, have forced a re-evaluation of the concept of intrinsic efficacy and led to an integrated formulation of a fascinating concept called ‘functional selectivity’.
So what is functional selectivity? It is the phenomenon by which the same kind of ligand (agonist, antagonist etc.) can modulate different signaling pathways activated by a single GPCR, leading to different physiological responses. Functional selectivity thus opens up a whole new method of modifying GPCR signaling in complex ways. It comprises a new layer of complexity and control that biological systems enforce at the molecular level to engage in complex signaling and homeostasis. Functional selectivity can allow the ‘tuning’ of ligands on a continuum scale of properties, from agonism to inverse agonism. In addition it can tightly regulate the strength of the particular property. 
It is what allows GPCRs to function as rheostats rather than as binary switches and allows them to exercise a fine layer of biological control and discrimination.
Functional selectivity is not just of academic interest. It can have clinical significance. Probably most tantalizingly, it may be one of the holy grails of pharmacology that allows us to separate the beneficial and harmful effects of a drug, leading to Paul Ehrlich’s ‘magic bullet’. Until now, side-effects have been predominantly thought to result from the lack of subtype-specificity of drugs. For instance, morphine’s side effects are thought to result from its activation of the μ-opioid receptor. But functional selectivity could provide a totally new avenue for explaining and possibly mitigating side-effects of drugs. For instance, consider the dopamine receptor agonist ropinirole, used in the treatment of Parkinson’s disease. There are several D-receptor agonists and just like them ropinirole interacts with several receptor subtypes. But unlike many of these, ropinirole does not demonstrate the dangerous side-effect named valvulopathy, a weakening of the heart valves that makes them stiff and inflamed. This can be a potentially life-threatening condition that seems to be facilitated by several dopamine agonists, but not ropinirole. The cause seems to be becoming clear only now; ropinirole is a functionally selective ligand that activates a certain pattern of second messenger pathways that is different from those activated by other agonists. Somehow this pattern of pathways is responsible for reduced valvulopathy.
Let’s go back to the organ/piano analogy to gauge the significance of such control. The sound produced by a piano depends on two variables- the exact identities of the keys pressed, and the intensity (how hard or softly you press them). 
The second variable can be as important as the first since pressing a key particularly hard can drown out other notes and influence the very nature of the sound. The analogy to functional selectivity would be in looking at the keys themselves as different signaling pathways and the intensity of the notes as the strength of the pathways. Now, if one ligand binding to a single GPCR is able to activate a specific combination of these pathways, each with its own strengths, think of the permutations and combinations you could get from a set of even a dozen pathways- an astonishing number. Thus, functional selectivity could be the key that unlocks the puzzle of how one ligand can put into motion such a complex set of signaling events and physiological responses. One ligand- one receptor- several pathways with differing strengths. An added variable is the concentration of certain second messengers in a particular environment or cell type, which could add even more combinations. This picture could go a long way toward explaining how we can get such complex signaling in the brain from just a few ligands like dopamine, serotonin and histamine. And as described above, it also provides a fascinating direction - along with control of subtype selectivity (a much more well known and accepted cause) - for developing therapies that demonstrate all the good stuff without the bad stuff.
The basic foundation of functional selectivity is as tantalizing. Whatever the reasons for the phenomenon, the proximal cause for it has to concern the stabilization of different protein conformations by the same kind of ligands. Unravel these protein conformations and you would make significant inroads into unraveling functional selectivity. 
If you come to think of it, this principle is not too different from the current model of conformational selection used in explaining the action of agonists and antagonists in general, which involves the stabilization of certain conformations by specific molecules.
Nature never ceases to amaze. As we plumb its mysteries further, it reveals deeper, more subtle and finer layers of control and discrimination that allow it to generate profound complexity starting from some relatively simple events like the binding of a disarmingly simple molecule like adrenaline to a protein. And combined with the action of several proteins, the concerto turns into a symphony. We have been privileged to be in the audience.
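To put a rough number on the "astonishing number" of patterns available from a dozen pathways, here is a back-of-the-envelope count. It assumes, purely for illustration, that each pathway's strength can be binned into a small number of discrete levels (with "off" as one level); real signaling strengths are continuous, so this is a lower-resolution picture rather than a model of any receptor.

```python
def signaling_patterns(n_pathways, n_levels):
    """Number of distinct activation patterns when each pathway independently
    takes one of n_levels discrete strengths (one of which is 'off')."""
    return n_levels ** n_pathways


# A dozen pathways, each simply on or off.
print(signaling_patterns(12, 2))  # 4096

# The same dozen pathways with four strength levels each (off, weak, medium, strong).
print(signaling_patterns(12, 4))  # 16777216
```

Even the crude on/off picture gives thousands of patterns from one ligand and one receptor; adding a few levels of intensity, the piano player's second variable, takes it into the millions.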
Mailman, R., & Murthy, V. (2010) Ligand functional selectivity advances our understanding of drug mechanisms and drug discovery. Neuropsychopharmacology, 35(1), 345-346. DOI: 10.1038/npp.2009.117
Kelly, E., Bailey, C., & Henderson, G. (2009) Agonist-selective mechanisms of GPCR desensitization. British Journal of Pharmacology, 153(S1). DOI: 10.1038/sj.bjp.0707604
by The Curious Wavefunction in The Curious Wavefunction
If successful, virtual screening (VS) promises to become an efficient way to find new pharmaceutical hits, competitive with high-throughput screening (HTS). Briefly, virtual screening screens libraries of millions of compounds to find new and diverse hits, either based on similarity to a known active or by complementarity to a protein binding site. The former protocol is called ligand-based VS (LBVS) and the latter is called structure-based VS (SBVS). In a typical VS campaign, either LBVS or SBVS is used to screen compounds which are then ranked by how well they are likely to be active. The top few percent compounds are then actually tested in assays, thus validating the success or failure of the VS procedure. VS has the potential to cut down on time and expenses inherent in the HTS process.
Unfortunately the success rate of VS has been relatively poor, ranging from a few tenths of a percent to no more than a few percent. If VS is to become a standard part of drug discovery, the factors that influence its failures and successes warrant a thorough review. A recent review in JMC addresses some of these factors and raises some intriguing questions. From a single publication with the phrase ‘virtual screening’ in 1997, there are now about a hundred such papers every year. The authors pick about 400 successful VS results from three dominant journals in the field- J. Med. Chem., Bioorg. Med. Chem. Lett. and ChemMedChem, along with some from J. Chem. Inf. Mod. They then classify these into ligand-based and structure-based techniques. As mentioned before, VS comes in two flavors. LBVS starts from a known potent compound and then looks for “similar” compounds (with dozens of ways of defining ‘similarity’), with the hope that chemical similarity will translate into biological similarity. Structure-based techniques start with a protein structure, either an x-ray crystal structure, NMR structure or a structure built from homology modeling. 
The authors look for correlations between the types of methods and various parameters of success and make some interesting observations, some of which are rather counterintuitive. Here are a couple that are especially interesting.
While SBVS methods dominate, LBVS methods seem to find more potent hits, usually defined as less than 1 μM in activity. Why does this happen? Actually the authors don’t seem to dwell on this point but I have some thoughts on this. Typically when you start a ligand-based campaign, your starting point is a bona fide highly potent ligand. If you have a choice of ligands with a range of activity, you will naturally pick the most potent among them as your query. Now, if your method works, is it surprising that the hits you find based on this query will also be potent? You get what you put in. Contrast this to a structure-based approach. You usually start with a crystal structure having a co-crystallized ligand in it. Co-crystallized ligands are usually but not always highly potent. The next step would be to use a method like docking to find hits that are complementary to your protein binding-site. But the binding site is conformationally pre-organized and optimized to bind its co-crystallized ligand. Thus, the ligands you screen will not be ranked highly by your docking protocol if they are ill-optimized for the binding site. For instance there could be significant induced fit changes during their binding. Even in the absence of explicit induced fit, fine parameters like precise hydrogen bonding geometries will greatly affect your score; after all, the protein binding site has hydrogen bonding geometries tailored for optimally binding its cognate ligand. If the hydrogen bonding geometries for your ligands are off even by a bit, the score will suffer. No wonder that the hits you find span a range of activities; you are using a binding site template that is not optimized to bind most of your ligands. 
The other reason which could thwart SBVS campaigns is simply that there is more work necessary in ‘preparing’ a crystal structure for docking. You have to add hydrogens to a structure, make sure all the ionization states are right and optimize the hydrogen bonding network in the protein. If any one of these steps goes wrong you will start with a fundamentally crappy protein structure for screening. Thus this protocol usually requires expert inspection, unlike LBVS where you just have to ‘prepare’ a single template ligand by making sure that the ionization state and bond orders are ok. These differences mean that your starting point for SBVS is more tortuous and much more likely to be messy than it is for LBVS. Again, you get out what you put in.
The second observation that the authors make is also interesting, and it bears on the protein preparation step we just mentioned. They find that VS campaigns where putative hits are docked into homology models seem to find more potent hits compared to those using an x-ray structure. This is surprising since x-ray structures are supposed to be the most rock-solid structures for docking. The authors speculate that this difference could be due to the fact that building a good homology model requires a fair level of expertise; thus, successful VS campaigns using homology models are likely to be carried out by experts who know what they are doing, whereas x-ray structures are more likely to be used by novices who simply use the default parameters for docking.
Thirdly, the authors note an interesting correlation between the potency and frequency of hits found and the families of proteins targeted. GPCRs seem to be the most successful targeted family, followed by enzymes and then kinases. This is a pretty interesting observation and to me it points to a crucial factor which the authors don’t seem to really discuss- the nature of the libraries used for screening. 
These libraries are usually biased by the preferences and efforts of medicinal chemists in making certain kinds of compounds. I already blogged about a paper that looked at the surprising success of VS in finding GPCR ligands, and that paper ascribed this success to ‘library bias’ which was the fact that libraries are sometimes ‘enriched’ for GPCR-active ligands, such as aminergic compounds. Ditto for kinases; kinase inhibitor-like molecules now abound in many libraries. This is partly due to the importance of these targets and partly because of the prevalence of synthetic reactions (like cross-coupling reactions) that make it easy for medicinal chemists to synthesize such ligands and populate libraries with them. I think it would have been very interesting for the authors to analyze the nature of the screened libraries; unfortunately such information is proprietary in industrial publications. But in the absence of such data, one would have to assume that we are dealing with a fundamentally biased set of libraries, which would explain selective target enrichment.
Finally, the authors find that most successful VS efforts have come from academia, while most of the potent hits have come from industry. This seems to be consistent with the role of the former in validating methodologies and that of the latter in discovering new drugs.
There are some caveats as usual. Most of the studies don’t include a detailed analysis of false positives and negatives since such analysis is time consuming. But this analysis can be extremely valuable in truly validating a method. Standards for assessing the success of VS are also not consistent and universal and these will have to be decided for true comparisons. But overall, virtual screening seems to hold promise. At the very least there are holes and gaps to fill. And researchers are always fond of these.
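For those who have not run a ligand-based screen, the basic workflow described in this post (score by similarity to a known active, rank, test the top of the list, report a hit rate) can be sketched as follows. The Tanimoto coefficient used here is just one of the "dozens of ways" of defining similarity; the fingerprints are toy sets of on-bit indices, and the compound names and the set of actives are invented for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_by_similarity(query_fp, library):
    """Return compound names sorted from most to least similar to the query."""
    return sorted(
        library,
        key=lambda name: tanimoto(query_fp, library[name]),
        reverse=True,
    )


def hit_rate(tested, actives):
    """Fraction of experimentally tested compounds that turned out to be active."""
    if not tested:
        raise ValueError("no compounds were tested")
    return sum(1 for name in tested if name in actives) / len(tested)


# Toy data: a potent query ligand and a four-compound 'library'.
query = {1, 2, 3, 4, 5}
library = {
    "cpd_A": {1, 2, 3, 4},
    "cpd_B": {1, 2, 9},
    "cpd_C": {7, 8},
    "cpd_D": {1, 2, 3, 5},
}

ranking = rank_by_similarity(query, library)
top_picks = ranking[:2]            # 'test' only the top of the ranked list
assumed_actives = {"cpd_A"}        # hypothetical assay outcome
print(ranking, hit_rate(top_picks, assumed_actives))
```

The sketch also shows why the choice of query matters so much in LBVS: everything downstream is measured relative to that one starting compound.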
Ripphausen, P., Nisius, B., Peltason, L., & Bajorath, J. (2010) Quo Vadis, Virtual Screening? A Comprehensive Survey of Prospective Applications. Journal of Medicinal Chemistry. DOI: 10.1021/jm101020z
by The Curious Wavefunction in The Curious Wavefunction
The biggest utility of NMR spectroscopy in drug discovery is in assessing three things: whether a particular ligand binds to a protein, what site on the protein it binds, and what parts of the ligand interact with the protein. Over the last few years a powerful technique named ‘SAR by NMR’ has emerged which is now widely used in ligand screening. In this technique, changes in the resonances of ligand and protein protons are observed to pinpoint the ligand binding site and corresponding residues. Generally when a ligand binds to a protein, its rotational correlation time increases because it now tumbles slowly along with the protein; the result is a broadening of signals in the spectrum which can be used to detect ligand binding. One of the most effective methods in this general area is Saturation Transfer Difference (STD) spectroscopy. As the name indicates, it hinges on the transfer of magnetization between protein and ligand; the resulting decrease in intensity of ligand signals can provide valuable information about proximity of ligand protons with specific protein residues.
But these kinds of techniques suffer from some drawbacks. One straightforward drawback is that signals from protein and ligand may simply overlap. Secondly, the broadening may be so much as to virtually make the signals disappear. Thirdly from a practical perspective, it is hard to get sufficient amounts of N15-labeled protein (usually obtained by growing bacteria on a N15-rich source and then purifying the proteins of interest).
To circumvent some of these problems, a team at Abbott Laboratories has come up with a neat and relatively simple method which they call ‘labeled ligand displacement’. The method involves synthesizing a protein-binding probe that has been selectively labeled with C13. Protein binding broadens and diminishes the signals of this probe. However, when a high-affinity ligand is then added, it displaces the probe and we get recovery of the C13 signals. 
The authors illustrate this paradigm with several proteins of pharmaceutical interest, including heat-shock protein and carbonic anhydrase.
The method is relatively simple. For one thing, using a commercially available C13-labeled building block for synthesizing a ligand is easier than obtaining a N15-labeled protein. The biggest merit of the method though is the fact that it hinges on C13 signals very specific to the probe; thus there is no complicating overlap of signals. And finally, the method seems to be general enough to be applied to any protein. Only time will tell how much it is utilized, but for now it seems like a neat addition to the arsenal of NMR methods for studying protein-ligand interactions.
Swann, S., Song, D., Sun, C., Hajduk, P., & Petros, A. (2010) Labeled Ligand Displacement: Extending NMR-Based Screening of Protein Targets. ACS Medicinal Chemistry Letters, 1(6), 295-299. DOI: 10.1021/ml1000849
by The Curious Wavefunction in The Curious Wavefunction
The tumor suppressor p53 is one of the cell’s very best friends. Just how good a friend it is becomes apparent when, just like in other relationships, this particular relationship turns sour. p53 is the “master guardian angel” of the genome and its mutation constitutes the most frequent genetic alteration in cancer. More than 50% of human tumors contain a mutation in the p53 gene. With this kind of glowing track record, p53 would be a prime target for drugs.
It turns out that discovering drugs for p53 is trickier than you think. The protein displays complex structural biology, and the mechanism of inhibitor action is not clear. But p53 malfunction is also characterized by one of the most distinctive physical mechanisms to ever emerge in an oncoprotein- about 30% of mutations in p53 simply lower the melting temperature (Tm) so that the protein becomes unstable and disordered. Thus, potential inhibitors of mutated p53 have been often termed ‘rescuers’ since ideally they would ‘rescue’ the protein from its unstable state.
In a recent paper, a team led by Alan Fersht of Cambridge- one of the world’s foremost protein chemists and p53 experts- explores one frequent mutation in p53 and how its consequences could be suitably exploited for rational drug discovery. The study is a nice example of the value of interdisciplinary research in tackling a complex problem. The rogue mutation is quite simple; it turns a tyrosine on the protein surface to a cysteine. The change to a smaller amino acid opens up a cavity on the protein surface. Is this cavity ‘druggable’, that is, can a small molecule be found that selectively and potently binds to this cavity? This is what the researchers seek to do in this study (The presence of the cysteine makes me wonder if someone has tried a covalent tethering strategy for targeting this site).
The targeted site is an interesting one. 
It’s not exactly an allosteric site, since it is far away from the functional site but does not seem to affect the functional site. But Fersht and his colleagues have previously found a small molecule that binds to this site and raises the melting temperature. In this report, the authors extend p53 inhibitor discovery for the Y220C mutation further by using a combination of experimental and computational techniques.
They start by screening a fragment library; fragment-based drug discovery is now a staple of rational drug design. To minimize false positives and negatives, they use two complementary techniques: a 1D NMR method and a technique called thermal denaturation scanning fluorimetry, which detects the effect of a ligand using an exogenous dye. The two methods interestingly gave quite diverse hits, indicating the wisdom of combining them. To further confirm the binding of the fragments to the protein, they then use N15/H1 HSQC, an NMR method that detects changes in the proton and N15 chemical shifts when a ligand binds to the protein. By comparing unbound protein shifts to bound ones, one can locate only the amino acids that interact with the ligand. This is a really nice method since one can really make out the concerned amino acids just by inspection; in the figure below, the multi-colored areas indicate significantly perturbed amino acids, and it’s very useful to locate the exact binding sites since amino acids outside the site don’t seem to be affected. In this particular case the key residues turned out to be a valine, an aspartate and a few others.
The study proceeded by employing the one method which can confirm ligand binding better than any other- x-ray crystallography. Crystal structures revealed binding modes for a few of the hits. One molecule turned out to bind to the binding cavity in two copies.
What exactly are the hits doing to the protein? To investigate this, the authors used molecular dynamics (MD) simulations in isopropanol. 
In this case isopropanol is not a solvent mimic but it’s actually a drug mimic. It has a polar and non-polar part and it can approximate a typical protein-binding molecule well. Binding cavities can be detected by looking at the density of isopropanol in the pockets. In this particular example, the highest density is actually found in other sites, but those sites are not relevant for this study. The most intriguing observation from the MD simulations was that the size of the cavity fluctuates wildly when the ligand is not present. This dynamic flexibility of residues can be characteristic of an unstable region (this was also observed in some computational enzyme designs that I described earlier). To validate this flexibility, the simulations were run with the ligand present; not surprisingly, the size fluctuation in the cavity reduces. Intriguingly, the ligand seems to play the role of the previous tyrosine in packing against the other residues and keeping the site stable.
There is some way to go before we have a bona fide drug that ‘rescues’ p53 from its ignominious fate. But this study is a nice illustration of how only interdisciplinary computational and experimental work can really help us to unravel the mysteries of this ubiquitous and enigmatic Jekyll and Hyde.
Basse, N., Kaar, J., Settanni, G., Joerger, A., Rutherford, T., & Fersht, A. (2010) Toward the Rational Design of p53-Stabilizing Drugs: Probing the Surface of the Oncogenic Y220C Mutant. Chemistry & Biology, 17(1), 46-56. DOI: 10.1016/j.chembiol.2009.12.011
by The Curious Wavefunction in The Curious Wavefunction
Since we were talking about GPCRs the other day, here's a nice overview of some of the experimental challenges associated with membrane proteins and how researchers are trying to overcome them. These challenges are associated not just with the crystallization, but with the whole shebang. Although many clever tricks have emerged, we have a long way to go, and at least a few of the tricks sound like brute trial and error.
To begin with, it's not that easy to get your expression system to produce ample amounts of protein. As indicated, you often need liters of cell culture to get a few milligrams of protein. The workhorse for production is still good old E. coli. E. coli does not always fold membrane proteins well, but it still beats other expression systems because of its cost and efficiency. Researchers have discovered several tricks to coax E. coli to make better protein. For instance it turns out that cold, nutrient poor conditions and slower-growing bacteria produce better folded and functional protein (although the exact reasons are probably not known, I suspect it has to do with thermodynamics and the binding of chaperones). Adding lipids from higher organisms to the medium also seems to sometimes help.
What’s more interesting are efforts to do away with cellular production altogether and just add reagents to cell lysates to jiggle the protein-production machinery. For some reason, wheat-germ lysates seem to work particularly well. There are companies willing to use these lysates to produce hundreds of milligrams of protein. One of the advantages of such cell-free systems is that you can add solubilizing agents and detergents to stabilize the proteins. A striking fact emerging from the article is how many private companies are engaged in developing such technology for membrane proteins; the end "credits" list at least a dozen corporate entities. 
The list should be encouraging to visionaries who see more fruitful academic-industrial collaborations in the future.Then of course, there’s the all-important problem of crystallization. Of the 50,000 or so structures in the PDB, hardly a dozen are of membrane proteins. Membrane proteins present the classic paradox; keep them stable in the membrane and methods like crystallography and NMR cannot study them, but take them out of the membrane and, divorced from the protective effects of the lipid bilayer, they fall apart. Scientists have worked for years and come up with dozens of tricks to circumvent this catch-22. Adding the right kind of detergents can help. In the landmark structure of the beta-2 adrenergic receptor that was solved in 2007, the researchers used two tricks: attaching a stabilizing antibody to essentially clamp two transmembrane helices together, and replacing a disordered section of the protein with a T4 lysozyme, both strategies geared toward stabilizing the protein.In the end though, there is really no general strategy and that’s still the cardinal bottleneck; as the article's title says, a "trillion tiny tweaks" are necessary to make your system work. What works for one specific membrane protein fails for another. As one of the pioneers in the field, Raymond Stevens from Scripps says, “People are always asking what the one strategy that worked is. The answer is there wasn’t one strategy, there were about fifteen”. This is why chemistry (or economics) is not like physics. Although there are general rules, every specific case still invokes its own principles. In fields like membrane protein chemistry, it is unlikely that a single holy-grail strategy could be discovered that could work for all of them. The medley of techniques applied to membrane proteins makes the science seem sometimes like black magic and trial-and-error. 
All this makes chemistry hard, but also very interesting; if only a dozen membrane proteins have their structures solved, think of how many more are waiting in the shadows, awaiting the fruits of our sweat and toil.Baker, M. (2010). Making membrane proteins for structures: a trillion tiny tweaks Nature Methods, 7 (6), 429-434 DOI: 10.1038/nmeth0610-429... Read more »
Baker, M. (2010) Making membrane proteins for structures: a trillion tiny tweaks. Nature Methods, 7(6), 429-434. DOI: 10.1038/nmeth0610-429
by The Curious Wavefunction in The Curious Wavefunction
Well, it's hard for several reasons which I have discussed in previous posts, but here's one reason demonstrated by a recent paper. In this paper they crystallized the β2 adrenergic receptor with an antagonist. Previously, in the landmark publication of the β2 structure in 2007, the protein had been crystallized with an inverse agonist. Recall that an inverse agonist inhibits the basal activity of the GPCR, whereas an antagonist stabilizes both active and inactive states but does not affect the basal activity. In this case they crystallized the β2 with an antagonist and compared the resulting structure to that of the inverse agonist-GPCR complex. And they saw...nothing in particular. The protein backbone and side-chain locations are very similar for the antagonist (compound 3) and inverse agonist (compound 2) shown in the figure below. As we can see in the figure, about the only side-chain that shows any movement is the tyrosine on the left (Y316). No wonder that cross-docking the two ligands (that is, docking one ligand into the other ligand's protein conformation) gave very accurate ligand orientations; this was essentially a softball problem for a docking program since the antagonist was being docked into a protein conformation that was virtually identical to its own. But of course, we know that antagonists and inverse agonists affect GPCR function quite differently. As this study shows, clearly the action is not taking place in the ligand-binding pocket where things aren't really moving. So where is the real action? It's naturally taking place on the intracellular side, where the GPCR interacts with a medley of other proteins. And as the paper accurately notes, the difference between antagonist and inverse agonist binding is probably also reflected in the protein dynamics corresponding to the two ligands. Good luck modeling that. That's the whole deal with modeling GPCRs. Simply modeling the ligand-binding pocket is not going to help us understand the differences between the binding of various ligands; one has to model multiprotein interactions and subtle effects on dynamics that are relayed through the helices. The program Desmond, which I described in an earlier post, is a powerful MD program, but MD is going to really turn heads only when it can take account of multiprotein interactions, and such interactions happen on a time-scale much longer than what even Desmond can access. We have a long way to go before we can do all this. But please, don't stop.
Wacker, D., Fenalti, G., Brown, M., Katritch, V., Abagyan, R., Cherezov, V., & Stevens, R. (2010) Conserved Binding Mode of Human β2 Adrenergic Receptor Inverse Agonists and Antagonist Revealed by X-ray Crystallography. Journal of the American Chemical Society, 132(33), 11443-11445. DOI: 10.1021/ja105108q
by The Curious Wavefunction in The Curious Wavefunction
Would an anti-indole work?
Antibiotic resistance is one of the best examples of evolution in real time, and it’s also one of the most serious medical problems of our time. Emerging resistance in bacteria like MRSA threatens to bring on a wave of epidemics that may remind us of past, more unseemly times.
Given the threat that antibiotic resistance poses, it is paramount to understand the mechanisms behind this process. While considerable progress has been made in understanding the genetic basis of mutations that confer antibiotic resistance, much less is known about the population dynamics of bacteria that evolve this kind of resistance. Now, in the cover story of the latest issue of Nature, researchers from Boston University describe a novel and remarkable mechanism by which bacteria acquire resistance. The mechanism is effectively a form of bacterial altruism.
The researchers start by challenging successive generations of E. coli in a bioreactor with increasing concentrations of the antibiotic norfloxacin, which inhibits DNA synthesis by binding to DNA gyrase. Around the tenth generation or so, they notice something interesting. Not all bacteria have evolved resistance to the antibiotic, but there’s a very small population of bacteria with high resistance. However, in the next few generations, the other bacteria also seem to acquire this resistance. What’s going on?
It turns out that the small population of highly resistant bacteria is actually ‘teaching’ its fellow bacteria to become resistant. It does this by a remarkably simple mechanism: secreting the molecule indole into the environment. This indole acts as a signaling molecule that is mopped up by the other bacteria. The result is the activation of a variety of resistance mechanisms, including increased production of drug transporter proteins, which are well known to confer resistance by extruding drug molecules.
Now indole is well known as a component of signaling molecules. For instance, indole-3-acetic acid (IAA) plays many important signaling roles in plants and encourages cell growth and division. The detection of indole by itself was not surprising in this case, since all the bacteria secreted indole as part of their regular metabolism in the beginning. But what was surprising was the mechanism; as the antibiotic stressed out the bacteria, most of them essentially weakened and stopped secreting indole, with the exception of this small cadre of selfless individuals who kept on generating the molecular signal. Since production of indole in times of stress clearly requires an investment of energy, this was a bona fide case of bacterial altruism: sacrificing one’s own fitness to increase that of the group.
Ultimately though, we don’t want to just understand such novel mechanisms of antibiotic resistance but want to thwart them. Based on this mechanism I had an idea. If indole is so important for bacteria to acquire resistance, then one logical way to counter resistance would be to introduce an ‘anti-indole’ in their environment and mix it up with the natural molecule to cause confusion. This anti-indole would be a molecule resembling indole (an indole mimic and antagonist) that would effectively compete with indole for uptake without causing any of the resulting effects. Most likely this molecule would be a very close analog of indole, perhaps indole with a hydroxyl or fluoro group on it. Any small modification of indole would do, as long as it’s enough to confuse the bacteria. Of course we would also need to worry about bioavailability and toxicity, but I don’t see why the basic strategy would be completely unfeasible and why a proof-of-principle experiment could not be done in a petri dish.
Lee HH, Molla MN, Cantor CR, & Collins JJ. (2010) Bacterial charity work leads to population-wide resistance. Nature, 467(7311), 82-5. PMID: 20811456