IRNet

Northwestern University | 2019

Logan Ward; Dipendra Jha; Wei-keng Liao; Alok Choudhary; and Ankit Agrawal; Zijiang Yang; Chris Wovertong; Ian Foster;

Description:

IRNet is a general purpose deep residual regression framework that contains model architectures composed of fully connected layers of different depths (17-layers, 24-layers and 48-layers) for data mining problems with numerical vectors as inputs. Materials discovery is crucial for making scientific advances in many domains. Collections of data from experiments and first-principle computations have spurred interest in applying machine learning methods to create predictive models capable of mapping from composition and crystal structures to materials properties. Generally, these are regression problems with the input being a 1D vector composed of numerical attributes representing the material composition and/or crystal structure. While neural networks consisting of fully connected layers have been applied to such problems, their performance often suffers from the vanishing gradient problem when network depth is increased. Hence, predictive modeling for such tasks has been mainly limited to traditional machine learning techniques such as Random Forest. In this paper, we study and propose design principles for building deep regression networks composed of fully connected layers with numerical vectors as input. We introduce a novel deep regression network with individual residual learning, IRNet, that places shortcut connections after each layer so that each layer learns the residual mapping between its output and input. We use the problem of learning properties of inorganic materials from numerical attributes derived from material composition and/or crystal structure to compare IRNet's performance against that of other machine learning techniques. Using multiple datasets from the Open Quantum Materials Database (OQMD) and Materials Project for training and evaluation, we show that IRNet provides significantly better prediction performance than the state-of-the-art machine learning approaches currently used by domain scientists. We also show that IRNet's use of individual residual learning leads to better convergence during the training phase than when shortcut connections are between multi-layer stacks while maintaining the same number of parameters.

Citation:

Dipendra Jha, Logan Ward, Zijiang Yang, Chris Wovertong, Ian Foster, Wei-keng Liao, Alok Choudhary, and Ankit Agrawal, "IRNet: A general purpose deep residual regression framework for materials discovery." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2019

Keywords:

crystal structures, data mining, numerical vectors,

Curator: laiwei.cool@gmail.com