INVITED: MPDD: Material-Properties-Descriptors Database

Monday, September 13, 2021: 11:40 AM
227 (America's Center)
Mr. Adam M. Krajewski, The Pennsylvania State University, University Park, PA
Dr. Shun-Li Shang, The Pennsylvania State University, University Park, PA
Dr. Yi Wang, The Pennsylvania State University, University Park, PA
Prof. Zi-Kui Liu, The Pennsylvania State University, University Park, PA
Machine learning (ML) is becoming an increasingly critical tool for materials discovery, thanks to its ability to quickly predict the results of time- and power-intensive calculations. Fundamentally, each ML study predicts some property and is composed of three elements: a database, a descriptor, and an ML algorithm. These are combined in two steps. First, the data representation is calculated using the descriptor. Then, the model is iteratively evaluated on this representation and adjusted to improve its predictions. In total, this usually takes appreciably less than a second per entry.
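
The two steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the workflow used in the study: the descriptor (a scaled mean atomic number), the linear model, the scaling constants, and the synthetic targets are all assumptions chosen so that the example runs on its own.

```python
# Toy two-step workflow: (1) descriptor calculation, (2) iterative model fitting.
# All names and scaling constants are illustrative; the targets are synthetic.

ATOMIC_NUMBER = {"Al": 13, "Cr": 24, "Fe": 26, "Ni": 28}


def descriptor(composition):
    """Step 1: map a composition (element -> atomic fraction) to a feature vector."""
    mean_z = sum(ATOMIC_NUMBER[el] * x for el, x in composition.items())
    # Bias term plus a centered, scaled mean atomic number (arbitrary scaling).
    return [1.0, (mean_z - 25.0) / 5.0]


def predict(weights, x):
    """Linear model: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))


def fit(features, targets, lr=0.1, epochs=2000):
    """Step 2: evaluate the model and adjust its weights by gradient descent."""
    weights = [0.0] * len(features[0])
    n = len(features)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(features, targets):
            error = predict(weights, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * error * xj / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


compositions = [
    {"Fe": 1.0},
    {"Fe": 0.5, "Ni": 0.5},
    {"Fe": 0.7, "Cr": 0.3},
    {"Al": 0.5, "Ni": 0.5},
]
features = [descriptor(c) for c in compositions]
# Synthetic targets from a known linear rule, so the fit can be checked.
targets = [2.0 * x[1] - 1.0 for x in features]
weights = fit(features, targets)
```

In this sketch `descriptor` is cheap, but it plays the role of the step that dominates the cost when descriptors are structure-informed.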

For instance, in our recent study, calculating structure-informed descriptors of materials took us milliseconds, and predicting their formation energy using a large neural network took microseconds. These times may seem instantaneous compared to ab initio methods; however, with extensive databases or complex data such as 3D microstructures, they can grow into days or years. We present a tool that can speed up the total process by orders of magnitude by removing the most time-intensive step, i.e., the descriptor calculation.
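
A back-of-the-envelope calculation shows how such per-entry times add up. The figures below are assumptions picked only to match the orders of magnitude mentioned above (milliseconds and microseconds); they are not measurements from the study, and the entry count is hypothetical.

```python
# Rough cost of descriptor calculation versus prediction.
# The per-entry times and the entry count are assumed, not measured.
descriptor_s = 10e-3      # assumed 10 ms per descriptor calculation
prediction_s = 10e-6      # assumed 10 microseconds per model prediction
n_entries = 100_000_000   # assumed size of a large screening campaign

SECONDS_PER_DAY = 86_400
descriptor_days = descriptor_s * n_entries / SECONDS_PER_DAY
prediction_days = prediction_s * n_entries / SECONDS_PER_DAY
# Ratio of the full per-entry cost to the cost with descriptors precomputed.
speedup = (descriptor_s + prediction_s) / prediction_s

print(f"descriptors: {descriptor_days:.1f} days")
print(f"predictions: {prediction_days:.3f} days")
print(f"speed-up with precomputed descriptors: about {speedup:.0f}x")
```

Under these assumed numbers, descriptor calculation takes about 11.6 days of compute, while the predictions themselves take under 20 minutes.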

To accomplish that, we employ a NoSQL MongoDB database and move from sharing material-properties data to sharing the descriptors-properties data corresponding to each material. This change not only enables accelerated and effortless machine learning of materials but also serves as a tool for the automated and robust embodiment of prior knowledge about them in a graph-like fashion.
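
One way to picture a descriptors-properties record is as a single document per material that holds its structure, any descriptors already computed for it, and any known properties. The sketch below uses a plain Python dictionary as a stand-in for a MongoDB collection so that it runs without a server. The database's actual schema is not described here, so every field name, the key scheme, and the placeholder descriptor are hypothetical.

```python
import hashlib
import json

# In-memory stand-in for a document collection; a real deployment would
# use a MongoDB collection instead. All field names are hypothetical.
collection = {}
calls = {"descriptor": 0}  # counts how often the costly step actually runs


def structure_id(structure):
    """Stable key derived from a canonical JSON form of the structure."""
    canonical = json.dumps(structure, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def get_document(structure):
    """Fetch the document for a structure, creating an empty one if needed."""
    return collection.setdefault(
        structure_id(structure),
        {"structure": structure, "descriptors": {}, "properties": {}},
    )


def expensive_descriptor(structure):
    """Placeholder for a costly structure-informed descriptor."""
    calls["descriptor"] += 1
    return [len(structure["sites"]), sum(structure["lattice"])]


def get_descriptor(structure, name="toy_v1"):
    """Return a stored descriptor, computing and storing it only on a miss."""
    doc = get_document(structure)
    if name not in doc["descriptors"]:
        doc["descriptors"][name] = expensive_descriptor(structure)
    return doc["descriptors"][name]


def add_property(structure, prop, value, source):
    """Attach a property value and its provenance to the same document."""
    get_document(structure)["properties"][prop] = {"value": value, "source": source}


example = {"sites": ["Fe", "Fe"], "lattice": [2.87, 2.87, 2.87]}
first = get_descriptor(example)
second = get_descriptor(example)  # served from the store, not recomputed
add_property(example, "example_property", 1.0, "placeholder source")
```

Because descriptors and properties hang off the same document, several studies that touch the same material end up linked through it, which is one possible reading of the graph-like organization mentioned above.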

Furthermore, since descriptors are often reused for related properties, our database provides a tremendous speed-up in design space exploration. For example, a new model predicting steel toughness could immediately be applied to the many experimental and hypothetical steels investigated in studies predicting properties such as Poisson's ratio, elastic limit, or Young's modulus, rapidly screening possibly millions of materials that were of interest to others at a small fraction of the cost and complexity.
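
The reuse scenario can be sketched as follows: a new model is applied directly to descriptors that earlier studies already stored, and entries lacking the required descriptor are simply skipped. The identifiers, descriptor values, model weights, and the `screen` helper are all invented for illustration and do not come from any real dataset or from the database's actual interface.

```python
# Documents as they might look after earlier studies stored descriptors.
# Identifiers, descriptor values, and weights are invented for illustration.
stored = [
    {"_id": "candidate-1", "descriptors": {"toy_v1": [0.2, 1.0]}},
    {"_id": "candidate-2", "descriptors": {"toy_v1": [0.9, 0.1]}},
    {"_id": "candidate-3", "descriptors": {"toy_v1": [0.5, 0.5]}},
    {"_id": "candidate-4", "descriptors": {}},  # descriptor not yet stored
]


def new_model(x):
    """Hypothetical model for a new property, e.g. a toughness score."""
    weights = [1.0, 2.0]
    return sum(w * xi for w, xi in zip(weights, x))


def screen(documents, model, descriptor_name, top_k):
    """Rank stored entries by model output without recomputing descriptors."""
    scored = [
        (model(doc["descriptors"][descriptor_name]), doc["_id"])
        for doc in documents
        if descriptor_name in doc["descriptors"]
    ]
    scored.sort(reverse=True)  # highest score first
    return [entry_id for _, entry_id in scored[:top_k]]


best = screen(stored, new_model, "toy_v1", top_k=2)
```

The only per-entry work left in `screen` is the model evaluation itself, which is the cheap step in the timing comparison given earlier.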