MIT researchers discover new machine learning technique for assessing drug molecules

A novel machine learning method called DeepBAR could accelerate drug discovery and protein engineering

16 Mar 2021
Diane Li
Assistant Editor

Drugs can only work if they stick to their target proteins in the body. Assessing that stickiness is a key hurdle in the drug discovery and screening process. New research combining chemistry and machine learning could lower that hurdle.

The new technique, dubbed DeepBAR, quickly calculates the binding affinities between drug candidates and their targets. The approach yields precise calculations in a fraction of the time required by previous state-of-the-art methods. The researchers say DeepBAR could one day quicken the pace of drug discovery and protein engineering.

“Our method is orders of magnitude faster than before, meaning we can have drug discovery that is both efficient and reliable,” says Bin Zhang, the Pfizer-Laubach Career Development Professor in Chemistry at MIT, an associate member of the Broad Institute of MIT and Harvard, and a co-author of a new paper describing the technique.

The research appears in the Journal of Physical Chemistry Letters. The study’s lead author is Xinqiang Ding, a postdoc in MIT’s Department of Chemistry.

The affinity between a drug molecule and a target protein is measured by a quantity called the binding free energy — the smaller the number, the stickier the bind. “A lower binding free energy means the drug can better compete against other molecules,” says Zhang, “meaning it can more effectively disrupt the protein’s normal function.” Calculating the binding free energy of a drug candidate provides an indicator of a drug’s potential effectiveness. But it’s a difficult quantity to nail down.
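
To make the link between binding free energy and "stickiness" concrete, here is a minimal sketch using the standard thermodynamic relation ΔG = RT ln(Kd / c0), where Kd is the dissociation constant and c0 = 1 mol/L is the standard concentration. The function name, the temperature, and the example energies are illustrative assumptions and do not come from the study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal, temperature_k=298.15):
    """Return Kd in mol/L for a standard binding free energy in kcal/mol.

    Uses delta_G = R*T*ln(Kd / c0) with the standard concentration c0 = 1 mol/L.
    The function name and default temperature are illustrative assumptions.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Hypothetical binding free energies: a lower value gives a smaller Kd,
# which means tighter binding.
for dg in (-6.0, -9.0, -12.0):
    print(f"delta_G = {dg:6.1f} kcal/mol  ->  Kd = {dissociation_constant(dg):.2e} M")
```

Near room temperature, each drop of roughly 1.4 kcal/mol in binding free energy corresponds to about a tenfold smaller dissociation constant, which is why small errors in the computed free energy matter for ranking drug candidates.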

Methods for computing binding free energy fall into two broad categories, each with its own drawbacks. One category calculates the quantity exactly, eating up significant time and computer resources. The second category is less computationally expensive, but it yields only an approximation of the binding free energy. Zhang and Ding devised an approach to get the best of both worlds.

Exact and efficient

DeepBAR computes binding free energy exactly, but it requires just a fraction of the calculations demanded by previous methods. The new technique combines traditional chemistry calculations with recent advances in machine learning.

The “BAR” in DeepBAR stands for “Bennett acceptance ratio,” a decades-old algorithm used in exact calculations of binding free energy. Using the Bennett acceptance ratio typically requires knowledge of two “endpoint” states (e.g., a drug molecule bound to a protein and a drug molecule completely dissociated from a protein), plus knowledge of many intermediate states (e.g., varying levels of partial binding), all of which bog down calculation speed.
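
For readers who want to see what the Bennett acceptance ratio computes, the following is a minimal sketch of the textbook estimator for two states whose samples overlap well. It is not the authors' code: the function names, the bisection bracket, and the one-dimensional harmonic toy system in the demo are assumptions chosen for illustration. Energies are in units of kT.

```python
import numpy as np


def fermi(x):
    """Fermi function 1 / (1 + exp(x)), evaluated without overflow."""
    return np.exp(-np.logaddexp(0.0, x))


def bar_free_energy(w_forward, w_reverse, lo=-50.0, hi=50.0, tol=1e-8):
    """Estimate F1 - F0 (in kT) with the Bennett acceptance ratio.

    w_forward: U1(x) - U0(x) for samples x drawn from state 0
    w_reverse: U0(x) - U1(x) for samples x drawn from state 1
    The bracket [lo, hi] is an assumption that suits small toy problems.
    """
    w_forward = np.asarray(w_forward, dtype=float)
    w_reverse = np.asarray(w_reverse, dtype=float)
    m = np.log(len(w_forward) / len(w_reverse))

    def imbalance(delta_f):
        # Increases monotonically with delta_f; its root is the BAR estimate.
        return (fermi(m + w_forward - delta_f).sum()
                - fermi(-m + w_reverse + delta_f).sum())

    # Plain bisection on the bracket.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Toy demo: two 1D harmonic wells, for which F1 - F0 = 0.5 * ln(k1 / k0).
    rng = np.random.default_rng(0)
    k0, k1, shift, n = 1.0, 4.0, 0.5, 20000
    x0 = rng.normal(0.0, 1.0 / np.sqrt(k0), n)    # samples from state 0
    x1 = rng.normal(shift, 1.0 / np.sqrt(k1), n)  # samples from state 1
    u0 = lambda x: 0.5 * k0 * x**2
    u1 = lambda x: 0.5 * k1 * (x - shift) ** 2
    estimate = bar_free_energy(u1(x0) - u0(x0), u0(x1) - u1(x1))
    print("BAR estimate:", estimate, "exact:", 0.5 * np.log(k1 / k0))
```

The estimator is only reliable when samples from the two states overlap. When the endpoints are very different, as with a bound and a fully dissociated drug molecule, that overlap is usually restored by chaining many intermediate states, which is the cost the article describes.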

DeepBAR slashes those in-between states by deploying the Bennett acceptance ratio in machine-learning frameworks called deep generative models. “These models create a reference state for each endpoint, the bound state and the unbound state,” says Zhang. These two reference states are similar enough that the Bennett acceptance ratio can be used directly, without all the costly intermediate steps.
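
One way to read the idea of a learned reference state is sketched below on a one-dimensional toy problem. A normalized probability model has a known free energy (its partition function is 1), so a single Bennett acceptance ratio calculation between the model and an endpoint gives that endpoint's absolute free energy, and the difference between two such endpoint values would then give a binding free energy. This sketch substitutes a fitted Gaussian for the deep generative model, and the toy energy function, sampler, and all names are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def fermi(x):
    """Fermi function 1 / (1 + exp(x)), evaluated without overflow."""
    return np.exp(-np.logaddexp(0.0, x))


def bar_free_energy(w_forward, w_reverse, lo=-50.0, hi=50.0, tol=1e-8):
    """BAR estimate of F1 - F0 in kT, solved by bisection (toy bracket)."""
    w_forward = np.asarray(w_forward, dtype=float)
    w_reverse = np.asarray(w_reverse, dtype=float)
    m = np.log(len(w_forward) / len(w_reverse))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g = fermi(m + w_forward - mid).sum() - fermi(-m + w_reverse + mid).sum()
        if g > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def target_energy(x):
    """Hypothetical endpoint state: unnormalized energy in kT units."""
    return 0.5 * x**2 + 0.1 * x**4


def sample_target(n, rng):
    """Exact rejection sampling: propose from N(0, 1), accept with exp(-0.1 x^4)."""
    samples = np.empty(0)
    while samples.size < n:
        x = rng.normal(size=2 * n)
        keep = rng.random(x.size) < np.exp(-0.1 * x**4)
        samples = np.concatenate([samples, x[keep]])
    return samples[:n]


def absolute_free_energy(x_target, energy_fn, rng):
    """Return F = -ln Z of the target via BAR against a fitted Gaussian reference."""
    # "Train" the stand-in generative model: fit a Gaussian to the endpoint samples.
    mu, sigma = x_target.mean(), x_target.std()

    def ref_energy(x):
        # Negative log-density of the normalized Gaussian, so F_ref = 0.
        return 0.5 * ((x - mu) / sigma) ** 2 + np.log(sigma * np.sqrt(2.0 * np.pi))

    x_ref = rng.normal(mu, sigma, size=x_target.size)
    w_forward = energy_fn(x_ref) - ref_energy(x_ref)        # reference samples
    w_reverse = ref_energy(x_target) - energy_fn(x_target)  # endpoint samples
    return bar_free_energy(w_forward, w_reverse)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = sample_target(20000, rng)
    print("absolute free energy (kT):", absolute_free_energy(x, target_energy, rng))
```

Because the reference is fitted to samples from the endpoint itself, the two distributions overlap strongly and no intermediate states are needed. A more expressive model would play the same role for molecular structures, where a single Gaussian would be far too crude.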

In using deep generative models, the researchers were borrowing from the field of computer vision. “It’s basically the same model that people use to do computer image synthesis,” says Zhang. “We’re sort of treating each molecular structure as an image, which the model can learn. So, this project is building on the effort of the machine learning community.”

While adapting a computer vision approach to chemistry was DeepBAR’s key innovation, the crossover also raised some challenges. “These models were originally developed for 2D images,” says Ding. “But here we have proteins and molecules — it’s really a 3D structure. So, adapting those methods in our case was the biggest technical challenge we had to overcome.”

A faster future for drug screening

In tests using small protein-like molecules, DeepBAR calculated binding free energy nearly 50 times faster than previous methods. Zhang says that efficiency means “we can really start to think about using this to do drug screening, in particular in the context of Covid. DeepBAR has the exact same accuracy as the gold standard, but it’s much faster.” The researchers add that, in addition to drug screening, DeepBAR could aid protein design and engineering, since the method could be used to model interactions between multiple proteins.

DeepBAR is “a really nice computational work” with a few hurdles to clear before it can be used in real-world drug discovery, says Michael Gilson, a professor of pharmaceutical sciences at the University of California at San Diego, who was not involved in the research. He says DeepBAR would need to be validated against complex experimental data. “That will certainly pose added challenges, and it may require adding in further approximations.”

In the future, the researchers plan to improve DeepBAR’s ability to run calculations for large proteins, a task made feasible by recent advances in computer science. “This research is an example of combining traditional computational chemistry methods, developed over decades, with the latest developments in machine learning,” says Ding. “So, we achieved something that would have been impossible before now.”

This research was funded, in part, by the National Institutes of Health.
