A research group led by Assistant Professor (Special Appointment) Yuichiro Yada and Professor Naoki Honda of the Laboratory of Data-Driven Biology at the Graduate School of Integrated Science for Life at Hiroshima University has developed a machine learning model that can quantitatively predict amyloid-β accumulation levels from only a limited amount of paired data. Paired data, meaning biomarker values and amyloid-β accumulation levels measured from the same sample, are normally required for machine learning prediction. The technique is expected to lead to the development of new Alzheimer's disease biomarkers selected on the basis of how well they predict amyloid-β accumulation levels. The results were published in the academic journal npj Systems Biology and Applications.
It is known that the protein amyloid-β accumulates in the brains of patients with Alzheimer's disease before neuronal degeneration occurs. However, current methods for estimating its accumulation in the brain have problems such as high cost and invasiveness.
The machine learning model developed by the group assumes a situation where several biomarker candidates are measured in model animals such as mice. In the proposed model, amyloid-β is assumed to accumulate over time according to a sigmoid (S-shaped) curve, and the parameters that determine the maximum value, temporal position, and steepness of the curve differ for each individual. The model further assumes that amyloid-β accumulation is an indicator of the progression of Alzheimer's disease and that the observed values of the biomarker candidates depend on that indicator. Training the proposed model estimates two things from the data: "the parameters of the sigmoid curve" and "the extent to which each biomarker candidate is affected by the amyloid-β status."
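The structure described above can be illustrated with a minimal sketch. It is not the group's implementation: the logistic function is only one common S-shaped form, the linear link from amyloid-β level to biomarker values is an assumption made for illustration, and every name and number below (amyloid_level, expected_biomarkers, mouse_a, mouse_b, the parameter values) is hypothetical.

```python
import math

def amyloid_level(age, max_level, midpoint, steepness):
    """Logistic curve, assumed here as the S-shaped form.

    max_level -> maximum value of the curve
    midpoint  -> temporal position (age at which half the maximum is reached)
    steepness -> how sharply the level rises around the midpoint
    """
    return max_level / (1.0 + math.exp(-steepness * (age - midpoint)))

def expected_biomarkers(level, weights, offsets):
    """Hypothetical linear link: each biomarker candidate shifts with the
    amyloid level by its own weight (the extent to which it is affected)."""
    return [w * level + b for w, b in zip(weights, offsets)]

# Two hypothetical individuals with different curve parameters.
mouse_a = dict(max_level=1.0, midpoint=6.0, steepness=1.0)
mouse_b = dict(max_level=0.8, midpoint=9.0, steepness=0.5)

for age in (3, 6, 9, 12):
    a = amyloid_level(age, **mouse_a)
    b = amyloid_level(age, **mouse_b)
    print(f"age={age:2d}  mouse_a={a:.3f}  mouse_b={b:.3f}")
```

In this sketch the three curve parameters play the roles named in the text, and the weights stand in for how strongly each biomarker candidate responds to the amyloid-β status.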
Learning from limited paired data was made possible by two techniques: Bayesian learning (a machine learning method that updates knowledge by learning new data based on knowledge obtained from previously available data) and semi-supervised learning (a machine learning method that uses both data with a correct answer and data without it for learning). With the trained model, the level of amyloid-β can be predicted from the observed biomarker candidates.
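The prediction step can be sketched as a small Bayesian calculation. This is an illustration under stated assumptions, not the published algorithm: it covers only prediction, with the parameters taken as already learned, and it assumes amyloid-β levels scaled to the range 0 to 1, a uniform prior over that range, a linear link to each biomarker candidate, and Gaussian noise. The function name posterior_mean_level and all values are hypothetical.

```python
import math

def posterior_mean_level(observed, weights, offsets, noise_sd, grid_size=201):
    """Posterior mean of the amyloid level given biomarker observations.

    Assumptions (illustrative only): uniform prior on [0, 1], and each
    biomarker y_j ~ Normal(weights[j] * level + offsets[j], noise_sd).
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]

    # Log-likelihood of the observations at each candidate level.
    log_post = []
    for x in grid:
        ll = 0.0
        for y, w, b in zip(observed, weights, offsets):
            resid = y - (w * x + b)
            ll += -0.5 * (resid / noise_sd) ** 2
        log_post.append(ll)

    # Normalise in a numerically stable way, then average over the grid.
    peak = max(log_post)
    unnorm = [math.exp(lp - peak) for lp in log_post]
    total = sum(unnorm)
    return sum(x * p for x, p in zip(grid, unnorm)) / total

# Hypothetical learned parameters for two biomarker candidates.
weights, offsets = [2.0, -1.0], [0.0, 1.0]
estimate = posterior_mean_level([0.6, 0.7], weights, offsets, noise_sd=0.05)
print(f"estimated amyloid level: {estimate:.3f}")
```

Because the prior is combined with the likelihood of every observed biomarker candidate, adding more informative candidates narrows the posterior around the underlying level.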
Furthermore, when the proposed model and learning algorithm were applied to published mouse data and their predictive performance was evaluated, the model was found to work adequately even with a small amount of paired data.
The researchers additionally evaluated which behavioral characteristics were important in predicting amyloid-β levels. Using 10 of the 11 characteristics gave almost the same level of prediction as using all of them, and a relatively high level of prediction was still obtained with five characteristics. These five included characteristics drawn from three different experiments, showing that multiple characteristics obtained by various methods may be useful as predictive biomarkers. The technique may also be applicable to other neurodegenerative diseases such as Parkinson's disease, where the accumulation of abnormal protein is followed by neuronal cell degeneration.
The method is expected to be developed in the future into a machine learning model that can handle human data.
Yada said, "We have developed a machine learning model to discover Alzheimer's disease biomarkers based on a new standard, which is the predictability of amyloid-β levels in the brain. I hope to collaborate with experimental researchers to discover new biomarkers that predict the onset of Alzheimer's disease."
Journal Information
Publication: npj Systems Biology and Applications
Title: Few-shot prediction of amyloid β accumulation from mainly unpaired data on biomarker candidates
DOI: 10.1038/s41540-023-00321-5
This article has been translated by JST with permission from The Science News Ltd. (https://sci-news.co.jp/). Unauthorized reproduction of the article and photographs is prohibited.