Applied Math Seminar

Date:
-
Location:
POT 745
Speaker(s) / Presenter(s):
Sally Ellingson, UK Division of Biomedical Informatics

Abstract: Drug discovery is a lengthy, expensive, and sometimes fatal process. It is also extremely difficult to carry out with a full understanding of the experimental results. Drugs are studied in test tubes, which lack a realistic in vivo environment, and in animal models, which have limited validity for human conditions. Even when new drugs pass screening experiments with no red flags, they can still fail during human clinical trials, after a great amount of time and money has been invested. This creates an economic burden that must eventually be recouped from the few drugs that do receive FDA approval. Computational methods that consistently improve predictive accuracy over laboratory and animal testing, across the entire human proteome and the huge chemical space of potential drugs, could revolutionize pharmaceutical research and development. Using such computational tools would increase the return on future investments in health-related research and provide access to new, better-understood therapies.

The state of the art in many computational methodologies includes machine learning approaches. In our digitized, data-driven world, there is a wealth of knowledge available that is beyond the processing power of an individual researcher or even a team of researchers. The goal of my work is to improve the prediction of novel drug safety and efficacy by increasing the accuracy of predicting polypharmacological networks, that is, by investigating how drugs interact with the entire proteome. We integrate traditional computational simulations of protein and drug interactions (such as the efficient molecular docking calculation), cheminformatics features of drug-like molecules, and features describing individual proteins to improve the prediction of drug and protein binding. Each component investigated provides some level of predictive utility in isolation. For example, I have seen in my own work that a small number of drug features calculated with current cheminformatics programs can identify active compounds for a given protein with greater than 99% accuracy. These same drug features have been used in machine learning models, in combination with docking scores, to rescore the interactions of one candidate drug with multiple proteins. The individual components of a molecular docking scoring function can be used as features in a machine learning model to greatly improve the accuracy of identifying active compounds in models specific to one protein. From a different perspective, protein features have been used in machine learning models to predict the druggability of a protein. The hypothesis of this work is that all of these components can be combined in one model that would vastly improve the accuracy of predicting the effects of new proteins and classes of drugs.

Presented here is a first step toward showing that this can be done for a class of functionally related proteins (kinases). Kinases were chosen for study because kinase inhibitors are the largest class of new cancer therapies, and because selectively inhibiting a kinase is difficult due to the high sequence similarity among kinases, which makes off-target interactions with kinases a common cause of adverse drug reactions.
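
The feature integration described in the abstract can be illustrated with a minimal sketch. This is not the speaker's implementation: the class name `DrugProteinPair`, the function names, and all numeric values below are hypothetical, chosen only to show how drug features, protein features, and the individual terms of a docking score might be concatenated into one input vector for a single machine learning model.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class DrugProteinPair:
    """One candidate drug paired with one protein (hypothetical container)."""
    drug_descriptors: Sequence[float]     # cheminformatics features of the molecule
    protein_descriptors: Sequence[float]  # features describing the protein
    docking_terms: Sequence[float]        # individual components of a docking score


def combined_features(pair: DrugProteinPair) -> List[float]:
    """Concatenate the three feature groups into one fixed-order vector."""
    return [*pair.drug_descriptors, *pair.protein_descriptors, *pair.docking_terms]


def feature_matrix(pairs: Sequence[DrugProteinPair]) -> List[List[float]]:
    """Build one row per drug-protein pair, checking that all rows align."""
    rows = [combined_features(p) for p in pairs]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all pairs must yield feature vectors of the same length")
    return rows


# Made-up placeholder numbers, not real measurements.
pairs = [
    DrugProteinPair([310.4, 2.1], [0.42, 0.18, 0.07], [-7.9, -1.2]),
    DrugProteinPair([455.9, 4.3], [0.39, 0.21, 0.05], [-6.1, -0.4]),
]
X = feature_matrix(pairs)
```

Each row of `X` could then be passed, together with a binding label, to any standard classifier. Keeping the three groups in a fixed order means one model can be trained across many proteins rather than one model per protein.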