A research group including Kei Matsushita, Senior Researcher at the National Agriculture and Food Research Organization (NARO), has created a database of trait data and genomic data for 668 breeds and lines cultivated over the past 22 years at six research centers engaged in rice breeding programs. The database is held at NARO. The group has announced that it used this database to build a “Genome Selection AI” that predicts rice traits from genomic data. The group confirmed that the AI can predict, with a high degree of accuracy and from genomic data, important traits such as harvest volume and grain quality, which involve multiple genes and for which it has been difficult to select DNA markers. This makes it possible to select lineages with superior characteristics at the seedling stage, which is expected to accelerate rice breeding and make it more efficient.
A database has been built from trait data on 668 breeds and lines cultivated over the past 22 years.
NARO runs a research program that aims to develop a smart breeding system and breed cultivation using the latest technology, including AI, with the objective of ushering in Society 5.0 in Agriculture & Food Industry. Under the leadership of NARO President Kazuo Kyuma, the Research Center for Agricultural Information Technology was established in 2018, and in June 2020 the Shiho supercomputer went online to pursue AI research.
The breeding of rice and other cereals requires the repeated selection and crossing of parent cultivars. Many aspects need to be investigated for the purposes of selection, so the work requires trained personnel as well as large rice paddies. The breeding of a new cultivar takes more than 10 years in many cases. In recent years, rice cultivar improvement through selection has advanced using DNA marker technology based on genomic data, but this method only allows selection for a limited set of traits, such as disease resistance, that involve a small number of genes.
Therefore, the research group built a prediction model that statistically learns the relationship between the traits of each cultivar and lineage and the genome as a whole, and developed a “Genome Selection AI” able to predict the value and extent of a wide range of traits using genomic data alone. The group confirmed that it works.
For its data, the group drew on records accumulated by NARO over the years. They created a digital database of the traits of 668 breeds and lines cultivated over the past 22 years at six research centers engaged in rice breeding programs. Data from an average of nine experiments for each breed and line studied since 1991 were gathered and collated. The Shiho supercomputer was used to analyze genomic data from over 900 breeds and lines.
From the data gathered, the group extracted trait data on 129 breeds and lines collected in experiments by the National Institute of Agrobiological Sciences (Tsukuba City, Ibaraki Prefecture), together with data on around 40,000 single nucleotide polymorphism (SNP) loci associated with lineage and quality acquired from the genomic data, and analyzed the relationship between them. The Genome Selection AI was then built to predict traits from the genome. The traits covered were ear emergence period, maturity period, culm (stalk) length, ear length, ear number, overall weight, harvest volume (weight of unpolished and polished rice), thousand unpolished kernel weight, unpolished rice quality and eating quality.
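One common way to learn a statistical relationship between genome-wide SNP data and a trait is ridge regression, which estimates a small effect for every marker at once rather than picking out a few. The article does not say which model the group used, so the Python sketch below is an illustration only: the genotypes and trait values are simulated, and the names fit_ridge, predict and simulate, as well as the penalty lam, are hypothetical.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Fit ridge regression of a trait on SNP genotypes.

    X: (n_lines, n_snps) genotype matrix coded 0/1/2.
    y: (n_lines,) trait values.
    lam: ridge penalty (an assumed tuning parameter).
    Returns (marker_means, trait_mean, marker_effects).
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Dual form: solve an n-by-n system, which is cheap when
    # there are far more markers than lines.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y - y_mean)
    beta = Xc.T @ alpha
    return x_mean, y_mean, beta

def predict(model, X_new):
    """Predict trait values from genotypes alone."""
    x_mean, y_mean, beta = model
    return y_mean + (X_new - x_mean) @ beta

def simulate(n_lines=129, n_snps=2000, n_causal=200, seed=0):
    """Simulated data only; not the NARO database."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n_lines, n_snps)).astype(float)
    effects = np.zeros(n_snps)
    causal = rng.choice(n_snps, size=n_causal, replace=False)
    effects[causal] = rng.normal(0.0, 1.0, n_causal)
    g = X @ effects
    y = g + rng.normal(0.0, 0.5 * g.std(), n_lines)
    return X, y

if __name__ == "__main__":
    X, y = simulate()
    # Train on the first 100 simulated lines, predict the rest.
    model = fit_ridge(X[:100], y[:100], lam=100.0)
    y_hat = predict(model, X[100:])
    print(y_hat[:5])
```

In a sketch like this, a new seedling only needs to be genotyped; its predicted trait value comes from the fitted marker effects, without growing it to maturity.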
When the values predicted by the Genome Selection AI were compared with actual measurements for the 129 lines and breeds, the AI achieved high predictive accuracy for traits that were previously difficult to select using DNA marker technology. For harvest volume, unpolished rice quality and ear length, the correlation coefficient between predicted and measured values was over 0.7.
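The accuracy figure quoted here is a correlation between predicted and measured trait values. As a minimal sketch of how such a figure can be computed, the Python snippet below implements the Pearson correlation coefficient; the function name pearson_r and the numbers are hypothetical and are not data from the study.

```python
import numpy as np

def pearson_r(predicted, observed):
    """Pearson correlation between predicted and measured values."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    dp = p - p.mean()  # deviations from the mean prediction
    do = o - o.mean()  # deviations from the mean measurement
    return float((dp @ do) / np.sqrt((dp @ dp) * (do @ do)))

# Hypothetical values for five lines, for illustration only.
observed = [5.1, 5.8, 6.4, 4.9, 6.0]
predicted = [5.3, 5.6, 6.1, 5.2, 5.7]
r = pearson_r(predicted, observed)
print(f"r = {r:.2f}")
```

A value of 1 means the predictions rank and scale the lines exactly as the measurements do, while a value near 0 means the predictions carry no linear information about the measurements.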
Because predicted values can be obtained at an early stage, selection is no longer constrained by the single harvest season per year of rice paddy trials. Moreover, by using climate-controlled spaces and glasshouses, selection can be done at any time. It is also hoped that the technique can be applied to crops such as fruit, which take a long time to breed; as long as the data exist, it should be applicable.
Says Matsushita, “By swapping selection in the rice paddy with predictive selection, as well as accelerated generation and climate-controlled growing spaces, we plan to undertake research aiming to establish a new breeding technique that will shorten the period required from crossing to selection, from five years to around three years.” He intends to put the technology to use in collaborative research with other research institutions in future.