# Applied Math Seminar

## Applied Math Seminar

**Title:** Correct Model Selection in Big Data Analysis

**Abstract:** Although recent attention has focused on improving predictive models, less consideration has been given to variability introduced into models through incorrect variable selection. Here, the difficulty in choosing a scientifically correct model is explored both theoretically and practically, and the performance of traditional model selection techniques is compared with that of more recent methods. The results in this talk show that often the model with the highest R-squared (or adjusted R-squared) or lowest Akaike Information Criterion (AIC) is not the scientifically correct model, suggesting that traditional model selection techniques may not be appropriate when data sets contain a large number of covariates. This work starts with the derivation of the probability of choosing the scientifically correct model in data sets as a function of regression model parameters, and shows that traditional model selection criteria are outperformed by methods that produce multiple candidate models for researchers' consideration. These results are demonstrated both in simulation studies and through an analysis of a National Health and Nutrition Examination Survey (NHANES) data set.
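
For readers unfamiliar with the two criteria named in the abstract, the following is a minimal sketch of how adjusted R-squared and AIC are commonly computed for a least-squares fit. It is not taken from the talk: the function names and the numbers are hypothetical, the AIC form assumes Gaussian errors and drops an additive constant, and conventions for counting parameters vary.

```python
import math

def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared for a linear model with an intercept."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

def gaussian_aic(rss, n_obs, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to an additive constant."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

# Hypothetical summary statistics for two candidate models (illustration only).
n_obs = 50
candidates = {
    "small": {"r2": 0.80, "rss": 100.0, "predictors": 3},
    "large": {"r2": 0.82, "rss": 90.0, "predictors": 10},
}

scores = {}
for name, model in candidates.items():
    # Assumption: parameter count = predictors + intercept.
    n_params = model["predictors"] + 1
    scores[name] = {
        "adjusted_r2": adjusted_r_squared(model["r2"], n_obs, model["predictors"]),
        "aic": gaussian_aic(model["rss"], n_obs, n_params),
    }

for name, score in scores.items():
    print(name, round(score["adjusted_r2"], 4), round(score["aic"], 2))
```

Both criteria trade goodness of fit against model size; the abstract's point is that the model they rank best need not be the scientifically correct one.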

## Applied Math Seminar

**Title:** Modeling the emergence of division of labor in social systems

**Abstract:** Division of labor (DOL) is a key pattern of social organization that has evolved in a diverse array of systems, from microbes and insects to, of course, humans. Theoretical models predict that division of labor is optimal (and that evolutionary selection can favor it) if there are increasing efficiency (or fitness) benefits arising from individual specialization. One main open question about DOL is ‘What proximate (behavioral) mechanisms are responsible for its initial emergence?’ In this talk, I will propose a novel theory using a framework of individual energetics and optimization in social dynamics. The key assumption is that individuals are myopic optimizers of a utility function that reflects the tradeoffs of energy/time needed to perform, and become proficient in, a set of alternative (fitness-bearing) tasks. This hypothesis serves as a counterpoint to the existing theory of inter-individual variation in “response thresholds” popularized by studies of task allocation in social insect colonies. Simulation findings show that DOL can emerge from individual optimization and can be enhanced by varying parameters of fatigue and group size. This result has broader implications for understanding the evolutionary transition to sociality (the period in which previously solitary animals began living together in groups).

## Applied Math Seminar

**Title:** Intermittent Preventive Treatment and the Spread of Drug Resistant Malaria

**Abstract:** Over the last decade, control measures have significantly reduced malaria morbidity and mortality. However, the burden of malaria remains high, with more than 70% of malaria deaths occurring in children under the age of five. The spread of antimalarial-resistant parasites challenges the efficacy of current interventions, such as Intermittent Preventive Treatment (IPT), whose aim is to protect this vulnerable population. Under IPT, a curative dose of antimalarial drugs is administered along with a child’s routine vaccinations, regardless of their infection status, as both a protective measure and to treat subclinical infections. We have developed mathematical models to study the relative impact of IPT in promoting the spread of drug-resistant malaria (compared with treatment of clinically ill individuals), and the combined effect of different drug half-lives, age structure and local transmission intensity on the number of childhood deaths averted by using IPT in both the short and long term in malaria-endemic settings. I will also discuss some potential consequences of unstable and seasonal transmission of malaria on the efficacy of IPT.

## Applied Math Seminar

**Title:** Exponential convergence rates for Batch Normalization

**Abstract:** Batch Normalization is a normalization technique that has been used in training deep Neural Networks since 2015. In spite of its empirical benefits, there exists little theoretical understanding as to why this normalization technique speeds up learning. From a classical optimization perspective, we will discuss specific problem instances in which we can prove that Batch Normalization can accelerate learning, and how this acceleration is due to the fact that Batch Normalization splits the optimization task into optimizing length and direction of parameters separately.
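
One way to see why a length/direction split is natural is that the normalized output of a linear unit does not change when its weight vector is rescaled by a positive factor. The sketch below checks this numerically. It is an illustration under simplifying assumptions (a single linear unit, no epsilon term, no learned gain or shift, hypothetical random data), not the problem instances or proofs discussed in the talk.

```python
import math
import random
import statistics

def batch_norm(z):
    """Normalize a batch of pre-activations to zero mean and unit variance."""
    mean = statistics.fmean(z)
    std = statistics.pstdev(z)
    return [(value - mean) / std for value in z]

def linear(X, w):
    """Pre-activations x . w for every row x of the batch X."""
    return [sum(x_i * w_i for x_i, w_i in zip(row, w)) for row in X]

random.seed(0)
# Hypothetical batch: 64 samples with 5 features each.
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(64)]
w = [random.gauss(0, 1) for _ in range(5)]
w_scaled = [3.0 * w_i for w_i in w]  # same direction, three times the length

out = batch_norm(linear(X, w))
out_scaled = batch_norm(linear(X, w_scaled))

# The normalized outputs agree up to floating-point rounding.
scale_invariant = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(out, out_scaled)
)
print(scale_invariant)
```

Because the normalized output depends only on the direction of `w`, any length information has to be carried by a separate scalar parameter, which is the sense in which the two can be optimized separately.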

## Applied Math Seminar

**Title:** On Toric Ideals of some Statistical Models.

**Abstract:** We introduce hierarchical models from statistics and their associated Markov bases. These bases are often large and difficult to compute. We introduce certain toric ideals and their algebraic properties as an alternative way of thinking about these objects. One challenge is to describe hierarchical models with infinitely many generators in a finite way. Using a symmetric group action, we describe certain classes of models, including progress made for the non-reducible models. This is joint work with Uwe Nagel.

## SIAM Guest Speaker

**Title:** Efficient Methods for Enforcing Contiguity in Geographic Districting Problems

**Abstract:** Every ten years, United States Congressional Districts must be redesigned in response to a national census. While the size of practical political districting problems is typically too large for exact optimization approaches, heuristics such as local search can help stakeholders quickly identify good (but suboptimal) plans that suit their objectives. However, enforcing a district contiguity constraint during local search can require significant computation; tools that can reduce contiguity-based computations in large practical districting problems are needed. This talk introduces the geo-graph framework for modeling geographic districting as a graph partitioning problem, discusses two geo-graph contiguity algorithms, and applies these algorithms to the creation of United States Congressional Districts from census blocks in several states. The experimental results demonstrate that the geo-graph contiguity assessment algorithms reduce the average number of edges visited during contiguity assessments by at least three orders of magnitude in every problem instance when compared with simple graph search, suggesting that the geo-graph model and its associated contiguity algorithms provide a powerful constraint assessment tool to political districting stakeholders. Joint work with Douglas M. King and Edward C. Sewell.
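
The "simple graph search" baseline mentioned in the abstract can be sketched as a breadth-first search over the units of one district. The code below is a generic illustration, not the geo-graph algorithms from the talk; the function name, the adjacency dictionary, and the six-block grid are all hypothetical.

```python
from collections import deque

def is_contiguous(district, adjacency):
    """Return True if the units in `district` form one connected piece.

    `adjacency` maps each unit to the units it shares a border with.
    An empty district is treated as contiguous by convention.
    """
    district = set(district)
    if not district:
        return True
    start = next(iter(district))
    seen = {start}
    queue = deque([start])
    while queue:
        unit = queue.popleft()
        for neighbor in adjacency.get(unit, ()):
            # Only walk along borders that stay inside the district.
            if neighbor in district and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == district

# Hypothetical 2 x 3 grid of census blocks:
#   a b c
#   d e f
adjacency = {
    "a": ["b", "d"], "b": ["a", "c", "e"], "c": ["b", "f"],
    "d": ["a", "e"], "e": ["b", "d", "f"], "f": ["c", "e"],
}

print(is_contiguous({"a", "b", "c"}, adjacency))  # top row: connected
print(is_contiguous({"a", "c"}, adjacency))       # two corners: not connected
```

A local search that moves one block between districts would need a check like this after every candidate move, which is why reducing the number of edges visited per check matters at the scale of census blocks.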

## Applied Math Seminar

**Title:** Using mathematics to fight cancer.

**Abstract:** What can mathematics tell us about the treatment of cancer? In this talk I will present some of the work that I have done in the modeling of tumor growth and treatment over the last fifteen years. Cancer is a myriad of individual diseases, with the common feature that an individual's own cells have become malignant. Thus, the treatment of cancer poses great challenges, since an attack must be mounted against cells that are nearly identical to normal cells. Mathematical models that describe tumor growth in tissue, the immune response, and the administration of different therapies can suggest treatment strategies that optimize treatment efficacy and minimize negative side-effects. However, the inherent complexity of the immune system and the spatial heterogeneity of human tissue give rise to mathematical models that pose unique challenges for the mathematician. In this talk I will give a few examples of how mathematicians can work with clinicians and immunologists to understand the development of the disease and to design effective treatments. I will use mathematical tools from dynamical systems, optimal control and network analysis. This talk is intended for a general math audience: no knowledge of biology will be assumed.

## Applied Math Seminar

**Title:** A Mathematical Model for the Force and Energetics in Competitive Running

**Abstract:** Competitive running has been around for thousands of years, and many people have wondered what the optimal form and strategy are for running a race. In his paper, Behncke develops a simple mathematical model that focuses on the relationships and dynamics between the forces and energetics at play in order to find an optimal strategy for racing various distances. In this talk, I will describe the biomechanics, energetics, and optimization of running in Behncke's model and present his findings. Note: you do not have to like running to come to this talk :)

## Applied Math Seminar

**Title:** Mathematical deep learning for drug discovery

**Abstract:** Designing efficient drugs for curing diseases is of essential importance for the 21st century's life science. Computer-aided drug design and discovery has gained significant recognition recently. However, the geometric complexity of protein-drug complexes remains a grand challenge to conventional computational methods, including machine learning algorithms. We assume that the physics of interest of protein-drug complexes lies on low-dimensional manifolds or subspaces embedded in a high-dimensional data space. We devise topological abstraction, differential geometry reduction, graph simplification, and multiscale modeling to construct low-dimensional representations of biomolecules in massive and diverse datasets. These representations are integrated with various deep learning algorithms for the predictions of protein-ligand binding affinity, drug toxicity, drug solubility, drug partition coefficient and mutation-induced protein stability change, and for the discrimination of active ligands from decoys. I will briefly discuss the working principle of various techniques and their performance in D3R Grand Challenges, a worldwide competition series in computer-aided drug design and discovery (http://users.math.msu.edu/users/wei/D3R_GC3.pdf).

## Applied Math Seminar

**Title:** Mathematics for Breast Cancer Research: investigating the role of iron.

**Abstract:** Breast cancer cells are addicted to iron. The mechanisms by which malignant cells acquire and contain high levels of iron are not completely understood. Furthermore, other cell types in a tumor, such as immune cells, can either aid or inhibit cancer cells in acquiring high levels of iron. In order to shed light on the question of how iron affects breast cancer growth, we are applying mathematical tools including polynomial dynamical systems over finite fields and 3D multiscale mathematical modeling. In this talk we will survey how mathematics is aiding in understanding the mechanisms of this addictive iron behavior of malignant cells, and present some preliminary work.