4-VA

Applying AI in Complex Macromolecular Modeling: A Difficult Challenge Realizing Beneficial Gains

Artificial intelligence (AI) is a hot topic these days, with engineers and scientists looking to adapt the technology to a variety of chemical, physical, and materials applications.  However, its use in predicting kinetics and dynamics has not been studied as closely.  This subject came to the fore at Mason’s Center for Simulation and Modeling in the form of a question: “Is AI capable of identifying meaningful patterns in the temporal behavior of solvated macromolecules?”  The question matters because the chemical sciences, combined with engineering of the associated data, are expected to be critical for finding solutions to environmental pollution, healthcare, sustainable energy resources, and global warming.  Learning how these processes occur at the molecular, nanometer, and mesoscopic scales, as inspected through computational simulation, and analyzing how the associated big datasets can play a fundamental role in tackling complex systems could prove valuable.  The question prompted Professors Olga Gkountouna (then in the Department of Computational and Data Sciences at Mason) and Estela Blaisten-Barojas, Director of the Center for Simulation and Modeling, to seek a 4-VA@Mason grant to look more closely into the possibility.

With the grant in hand, but with the pandemic in full sway, Gkountouna and Blaisten-Barojas had to work out how this research could be conducted within the restrictions of the shutdown.  They needed a bright, independent thinker who could be taught to take up this big question.  The solution came when they tapped then-doctoral student James Andrews (pictured above with Blaisten-Barojas on an earlier assignment) to do the difficult research.  Andrews had previously worked with Blaisten-Barojas on several projects leading to his doctoral dissertation, and both professors felt he would be up to the complex task.

Andrews dove into the project, exploring whether three well-established recurrent neural networks (ERNN, LSTM, and GRU) could provide viable data models.  “Basically, James worked on forecasting how and if a group of macromolecules in a solution are going to keep together as a cluster or not,” explains Blaisten-Barojas.  “If we can analyze how the macromolecules are behaving, we can estimate a prediction of what will come in the future. It is an estimate of the future, similar to what is done with the weather.”
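For readers curious about what forecasting with a recurrent network looks like in practice, here is a minimal sketch of the general idea, not the team's actual models. It assumes a toy Elman-style network with a single hidden value and hand-picked, untrained weights (`w_in`, `w_rec`, `b`, and `w_out` are hypothetical names and numbers chosen for illustration). The network first reads an observed series, then feeds each of its own predictions back in as the next input to look further ahead.

```python
import math

def elman_step(x, h, w_in, w_rec, b):
    """One Elman-style recurrence: new hidden state from input x and old state h."""
    return math.tanh(w_in * x + w_rec * h + b)

def forecast(history, steps, w_in=0.5, w_rec=0.4, b=0.0, w_out=1.0):
    """Read `history`, then predict `steps` future values one at a time.

    The weights are illustrative placeholders; a real model would learn them
    from simulation data and would use a vector-valued hidden state.
    """
    h = 0.0
    # Warm up the hidden state on the observed part of the series.
    for x in history:
        h = elman_step(x, h, w_in, w_rec, b)

    preds = []
    x = w_out * h  # first forecast, read out from the hidden state
    for _ in range(steps):
        preds.append(x)
        # Feed the prediction back in as the next input.
        h = elman_step(x, h, w_in, w_rec, b)
        x = w_out * h
    return preds

if __name__ == "__main__":
    # Hypothetical observed values standing in for a simulated quantity.
    print(forecast([0.1, 0.2, 0.3], steps=5))
```

LSTM and GRU networks follow the same read-then-forecast pattern but replace the single `tanh` update with gated updates that help the network retain information over longer stretches of the series.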

After much analysis, Andrews and the two PIs concluded that the recurrent neural network architectures they investigated generate data models that reproduce the macromolecules’ fate in solution very well in the short term.  In the long term, the forecasts’ statistical distributions yielded time events with limited variability.  Even so, the team was able to identify the conditions under which supervised machine learning serves as a viable alternative to long all-atom computer simulations.

Blaisten-Barojas adds that another important outcome of the research was the energy savings – both human and computational.  “Predicting modeling saves hundreds of hours of computing time, which require a lot of energy. Indeed, the Office of Research Computing big computers would be crunching numbers and storing the many terabytes of space, for output that could be avoided. Having a reliable forecasting model predicting if it is worth continuing a simulation or if it is going to give results that are not expected is a highly desirable tool.  With some information on the simulation future, one can plan ahead, stop, make changes, go in a different direction, or eventually continue the simulation. In a nutshell, our new decision-making tool aids the simulation practitioner to assess when long simulations are worth continuing.”

While the analysis was tedious and difficult, Blaisten-Barojas reports that Andrews found a way to keep up with the hard work: leaning on peers in his research group.  Andrews and three other doctoral students in the Computational Sciences and Informatics PhD program met virtually on Fridays during the pandemic to exchange their graduate research results, share comments and suggestions, and encourage one another.  “These meetings maintained a supporting and cheerful platform during the uncertain pandemic times,” notes Blaisten-Barojas.

The PhD study group: (From top) Scott Hopkins, Greg Helmick, Yoseph Abere.

Andrews’ hard work paid off with a paper published in Chemical Science, the prestigious journal of the Royal Society of Chemistry: J. Andrews, O. Gkountouna and E. Blaisten-Barojas, “Forecasting Molecular Dynamics Energetics of Polymers in Solution from Supervised Machine Learning.”  The work has also been disseminated on arXiv, a preprint repository maintained by Cornell University, and on Zenodo, a repository of code and data maintained by CERN.

Another jewel in the crown for Mason’s Center for Simulation and Modeling, with some help from 4-VA@Mason.