An international research team has developed a new framework that integrates data from materials databases with cross-disciplinary expert knowledge extracted from the scientific literature using AI, and has used it to accelerate the discovery of high-entropy alloys (HEAs). The team was led by Professor Hieu-Chi Dam, JSPS Research Fellow Minh-Quyet Ha, and Doctoral Student Dinh-Khiet Le from the Co-creative Intelligence Research Area at Japan Advanced Institute of Science and Technology (JAIST), working with Viet-Cuong Nguyen from HPC SYSTEMS Inc., Professor Hiori Kino from the Institute of Statistical Mathematics, and Professor Stefano Curtarolo from Duke University (USA). Their findings were published in the journal Digital Discovery.
Provided by Professor Hieu-Chi Dam from JAIST
HEAs are a new type of alloy containing five or more elements in near-equiatomic ratios and have attracted attention as next-generation materials for their outstanding mechanical properties, thermal stability, and corrosion resistance. However, identifying stable compositions with desirable properties is extremely difficult due to the vast compositional space and the complexity of phase formation mechanisms.
Conventional data-driven methods perform well when predicting compositions similar to the training data (interpolation) but struggle to predict properties for novel alloy systems containing elements absent from the training data (extrapolation). In other words, exploring unknown compositional regions has remained a significant challenge.
Meanwhile, materials scientists have accumulated a wealth of knowledge on elemental substitutability through years of research. While this expertise is scattered across the scientific literature, no established method existed to systematically extract it and integrate it with data for practical use.
The research team developed a new framework centered on the elemental substitution principle to integrate data with expert knowledge.
The new method uses four large language models (LLMs), namely GPT-4o, GPT-4.5, Claude Opus 4, and Grok3, as knowledge extraction tools, querying the possibility of elemental substitution from the perspectives of five specialized fields: corrosion science, materials mechanics, metallurgy, solid-state physics, and materials science. This allows cross-disciplinary expert knowledge accumulated in the scientific literature to be systematically extracted. The knowledge extracted showed 86% agreement with the well-established Hume-Rothery rules, and was also found to capture substitution patterns that are not covered by empirically reported rules.
The Dempster-Shafer theory is also employed to integrate experimental and computational data from materials databases with the AI-extracted expert knowledge. This theoretical framework can evaluate the reliability of each information source, combine the evidence, and explicitly distinguish between confidence and uncertainty. This makes it possible to clearly show both the basis and the limitations of any given prediction.
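To make the idea of combining evidence concrete, the following is a minimal sketch of Dempster's rule of combination, the standard fusion step in Dempster-Shafer theory. It is not the team's implementation: the two-outcome frame ("HEA" / "not_HEA"), the mass values, and the names `combine`, `data_evidence`, and `llm_evidence` are illustrative assumptions. Mass assigned to the whole frame represents uncertainty, which is how the theory keeps "don't know" separate from "evidence against".

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of outcomes to a weight,
    with the weights summing to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Contradictory evidence is tracked as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical frame: does a candidate composition form an HEA phase?
HEA = frozenset({"HEA"})
NOT_HEA = frozenset({"not_HEA"})
UNKNOWN = HEA | NOT_HEA  # mass here means "uncertain"

# Made-up masses for one candidate alloy (assumptions, not paper data).
data_evidence = {HEA: 0.6, UNKNOWN: 0.4}
llm_evidence = {HEA: 0.5, NOT_HEA: 0.2, UNKNOWN: 0.3}

fused = combine(data_evidence, llm_evidence)
for subset, mass in fused.items():
    print(sorted(subset), round(mass, 3))
```

In this toy case the fused result keeps a nonzero mass on the whole frame, so the remaining uncertainty is reported alongside the belief in each outcome rather than being hidden in a single score.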
The reliability of each knowledge source is automatically evaluated based on its consistency with the target material properties. Sources with high relevance are given greater weight, while those with low relevance receive less, forming a mechanism that prevents the incorporation of inappropriate knowledge.
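One common way to give a less reliable source less influence in Dempster-Shafer theory is Shafer's discounting operation, sketched below. The article does not state which weighting scheme the team uses, so this is a stand-in technique shown for illustration only; the function name `discount`, the reliability value, and the mass values are all assumptions.

```python
def discount(mass, reliability, frame):
    """Shafer discounting of a mass function.

    Scales every focused belief by `reliability` (between 0 and 1)
    and moves the removed mass onto the whole frame, i.e. onto
    "uncertain". This is a generic technique, assumed here for
    illustration rather than taken from the study.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    out = {s: reliability * w for s, w in mass.items() if s != frame}
    out[frame] = 1.0 - reliability + reliability * mass.get(frame, 0.0)
    return out

# Hypothetical two-outcome frame and made-up masses.
HEA = frozenset({"HEA"})
NOT_HEA = frozenset({"not_HEA"})
FRAME = HEA | NOT_HEA

source = {HEA: 0.7, NOT_HEA: 0.1, FRAME: 0.2}

# A source judged only half reliable contributes mostly uncertainty.
weak = discount(source, 0.5, FRAME)
for subset, m in weak.items():
    print(sorted(subset), round(m, 3))
```

With a reliability of 1 the source passes through unchanged, and with a reliability of 0 all of its mass lands on the whole frame, so it no longer influences a later combination step.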
Validation experiments achieved prediction accuracy of 86-92% for unknown alloy systems containing elements absent from the training data. This enables exploration of previously unknown regions that were inaccessible with conventional methods, and opens up a path to efficiently discovering HEAs within the vast compositional space of multi-element systems. The framework also outperformed conventional empirical rules and free-energy models in predicting 55 experimentally confirmed quaternary alloys collected from the literature, achieving accuracy comparable to computationally expensive methods. Furthermore, it showed high correlation (correlation coefficient: 0.81) with state-of-the-art computational methods when applied to high-entropy borides (HEBs), a distinct material system, demonstrating the framework's broad applicability.
The framework developed in this study is expected to be applicable not only to high-entropy alloy exploration, but also to a wide range of material systems that share the challenges of vast compositional spaces and data scarcity. Extension to the design of complex multicomponent materials such as functional ceramics and catalytic materials is anticipated.
The study also presents a model case for AI for Science. By using AI as a knowledge extraction tool, integrating human-accumulated expertise with data, and explicitly handling uncertainty, this initiative is expected to contribute to establishing a new research paradigm that accelerates scientific discovery.
Going forward, the team aims to combine the framework with active learning and reinforcement learning, and to develop it into a decision-support system that selects optimal candidate materials when experimental resources are limited.
Journal Information
Publication: Digital Discovery
Title: Beyond interpolation: integration of data and AI-extracted knowledge for high-entropy alloy discovery
DOI: 10.1039/D5DD00400D
This article has been translated by JST with permission from The Science News Ltd. (https://sci-news.co.jp/). Unauthorized reproduction of the article and photographs is prohibited.

