(V) Materials Data and Machine Learning

Monday, September 13, 2021: 2:20 PM
225 (America's Center)
Dr. Tiberiu Stan, Northwestern University, Evanston, IL
Mr. Jiwon Yeom, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Prof. Peter Voorhees, Northwestern University, Evanston, IL
There have been many publicized successes in which Artificial Intelligence (AI) techniques enabled computers to outperform humans. A necessary condition for the success of these approaches is the existence of large databases on which to train machine learning algorithms. In materials science, by contrast, collecting the data needed for machine learning can be expensive, time-consuming, and in some cases practically impossible. We will discuss this small-database challenge, along with the creation of a national materials data infrastructure that can be used to power AI applications. As an illustration of how these challenges can be overcome, we discuss the creation of synthetic training data for convolutional neural networks (CNNs). Microstructures generated using phase field calculations are used to train CNNs for image segmentation of X-ray tomography datasets. We find that CNNs trained on carefully designed synthetic microstructures can yield segmentations that approach the accuracy achieved by humans, greatly reducing the time and resources needed to analyze large materials datasets.
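
The idea of synthetic training data can be illustrated with a minimal sketch. It is not the authors' workflow: it assumes a simple 2D Cahn-Hilliard model as a stand-in for the phase field calculations, and the function name make_synthetic_pair, the grid size, the time step, and the blur and noise levels are all hypothetical choices made for illustration. The point is only that a simulated microstructure comes with an exact label mask for free, so an (image, mask) pair for segmentation training requires no manual annotation.

```python
import numpy as np


def laplacian(f):
    """Five-point Laplacian with periodic boundaries (grid spacing = 1)."""
    return (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
            + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1) - 4.0 * f)


def make_synthetic_pair(size=64, steps=3000, dt=0.01, kappa=1.0,
                        noise=0.2, seed=0):
    """Return (image, mask): a noisy grayscale image and its exact labels.

    Assumption: a Cahn-Hilliard model stands in for the phase field
    calculation; all parameter values are illustrative only.
    """
    rng = np.random.default_rng(seed)
    # Order parameter: small random fluctuations around a 50/50 mixture.
    phi = 0.05 * rng.standard_normal((size, size))

    # Explicit Euler integration of d(phi)/dt = lap(phi^3 - phi - kappa*lap(phi)).
    for _ in range(steps):
        mu = phi**3 - phi - kappa * laplacian(phi)
        phi = phi + dt * laplacian(mu)

    # Ground-truth segmentation comes directly from the simulation.
    mask = (phi > 0).astype(np.uint8)

    # Crude imitation of an imaging step: slight blur plus Gaussian noise.
    clean = mask.astype(float)
    blurred = clean + 0.125 * laplacian(clean)
    image = blurred + noise * rng.standard_normal((size, size))
    return image, mask


image, mask = make_synthetic_pair()
```

Many such pairs, generated with different seeds and parameters, could then serve as input and target for a segmentation CNN. How closely the synthetic images need to resemble real tomography data is exactly the design question the abstract refers to as "carefully designed microstructures".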