Latest News


NTT develops world's first method for regression analysis of disparate data: Deep learning expands the areas where data analysis can be applied


On April 26, NTT announced that it has developed a new data analysis method that uses deep learning to estimate functions representing input-output relations from disparate data, that is, data in which the correspondence between input and output variables has been lost or never existed. The method expands the scope of application of data analysis and is the first in the world to realize regression analysis of disparate data. With it, functions representing input-output relations can be estimated and the data analyzed even when the inputs and outputs are collected by different departments or organizations, or when the data are collected in group units in which individuals cannot be identified because of privacy protection.

Masahiro Kojima, Senior Researcher of the Symbiotic Intelligence Research Project at NTT Human Informatics Laboratories, who was involved in the development, said, "The proposed method is a technology that can be universally used to analyze disparate data. We are currently in the process of establishing the technology and are first looking for business partners to research together in the areas of digital marketing and education." The results were accepted for publication at the 38th Annual AAAI Conference on Artificial Intelligence, the premier international conference in the field of AI, held on February 20-27 in Vancouver, Canada.

Regression analysis is a method of understanding the relation between inputs and outputs by estimating a function that quantitatively represents this relation. In ordinary regression analysis, data with input-output correspondences are collected in advance and then used to estimate the function. For example, if each record pairs the time a customer spends in an online store (input) with that customer's purchase amount (output), the estimated function can be used to analyze how much a given customer is likely to buy based on the time spent in the store.
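As a rough illustration of the ordinary case, the sketch below fits a straight line to paired records by least squares. The data are synthetic and invented for this example (the store times, amounts, and the "true" relation 5 + 2 × minutes are all assumptions, not figures from NTT), and `fit_line` is a hypothetical helper name.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

rng = random.Random(0)

# Synthetic paired data: record i holds one customer's time in the store
# (minutes) and that same customer's purchase amount.
minutes = [rng.uniform(1, 60) for _ in range(200)]
amount = [5.0 + 2.0 * m + rng.gauss(0, 3.0) for m in minutes]

slope, intercept = fit_line(minutes, amount)
print(f"estimated amount = {intercept:.2f} + {slope:.2f} * minutes")
```

Because every input is stored next to its own output, the fitted slope and intercept should land close to the values used to generate the data.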

However, data collected with privacy considerations in mind, or data collected separately by different institutions or organizations (e.g., online stores and physical stores), may end up as disparate data in which the correspondence between inputs and outputs is lost. Such data cannot be analyzed with ordinary regression analysis, because without the correspondence the best-fitting function cannot be estimated.
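The effect of losing the correspondence can be seen in a small experiment. In the sketch below, again on invented synthetic data, the list of purchase amounts is shuffled to imitate a second organization reporting the same values in an unrelated order. The helper name `pearson` and all numbers are assumptions made for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)

# Paired data: minutes[i] and amount[i] belong to the same customer.
minutes = [rng.uniform(1, 60) for _ in range(1000)]
amount = [5.0 + 2.0 * m + rng.gauss(0, 3.0) for m in minutes]

# Disparate data: the same amounts, but the link to each customer is gone.
amount_unlinked = amount[:]
rng.shuffle(amount_unlinked)

paired_r = pearson(minutes, amount)
unlinked_r = pearson(minutes, amount_unlinked)
same_values = sorted(amount) == sorted(amount_unlinked)

print(f"correlation with correspondence:    {paired_r:.3f}")
print(f"correlation without correspondence: {unlinked_r:.3f}")
print(f"same set of output values:          {same_values}")
```

The set of output values is unchanged, yet lining the two lists up row by row no longer reveals any relation, which is why a regression fitted to such rows is uninformative.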

In recent years, therefore, research has been conducted on methods for estimating functions from such disparate data. However, most previous studies have used linear models, which lack the expressive power to represent nonlinear functions. Real-world data often have nonlinear relations, which limits the applicability of linear models. The new method uses deep learning to estimate a function from disparate data without placing any restrictions on the form of the function, making it possible to capture nonlinear relations between inputs and outputs. Technically, the key to the new method was deriving and constructing a deep learning-based estimation algorithm through mathematical analysis.

Kojima says, "This method efficiently generates promising correspondences between inputs and outputs and approximately estimates the objective function. Furthermore, by minimizing the objective function using the stochastic gradient method, an excellent solution can be obtained." He added that although the new method can be universally used for analyzing disparate data, it cannot yet be used to analyze and predict data that change over time, such as in weather forecasting, and that this remains an issue for the future.
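The article gives no implementation details, so the following is not NTT's algorithm. It is only a toy sketch of the general idea Kojima describes: alternately proposing a correspondence and taking stochastic gradient steps. Everything in it is an assumption made for illustration: the data are synthetic, a three-parameter quadratic stands in for a neural network, the correspondence is proposed by simple rank matching (which only works for a single output variable), and the model starts out increasing so that the matching has a consistent direction.

```python
import random

def predict(params, x):
    """Tiny nonlinear model a*x^2 + b*x + c (a stand-in for a neural network)."""
    a, b, c = params
    return a * x * x + b * x + c

def true_fn(x):
    """Hypothetical ground-truth relation used to generate the toy data."""
    return x * x + x + 1.0

rng = random.Random(2)
xs = [rng.random() for _ in range(200)]
ys = [true_fn(x) + rng.gauss(0, 0.05) for x in xs]
rng.shuffle(ys)          # the input-output correspondence is now lost
ys_sorted = sorted(ys)

params = [0.0, 0.5, 0.0]  # assumed starting point: an increasing function
lr = 0.1

for epoch in range(100):
    # Step 1: propose a correspondence. Pair the k-th smallest prediction
    # with the k-th smallest observed output (rank matching).
    order = sorted(range(len(xs)), key=lambda i: predict(params, xs[i]))
    pairs = [(xs[i], ys_sorted[rank]) for rank, i in enumerate(order)]

    # Step 2: stochastic gradient descent on the squared error of those pairs.
    rng.shuffle(pairs)
    for x, y in pairs:
        err = predict(params, x) - y
        params[0] -= lr * err * x * x
        params[1] -= lr * err * x
        params[2] -= lr * err

# Compare the fitted curve with the generating function on a grid.
grid = [i / 100 for i in range(101)]
rmse = (sum((predict(params, g) - true_fn(g)) ** 2 for g in grid) / len(grid)) ** 0.5
print(f"RMSE against the generating function: {rmse:.3f}")
```

Rank matching depends on the relation being monotonic, so this toy would fail on many realistic problems; it is meant only to make the two alternating steps concrete.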

This article has been translated by JST with permission from The Science News Ltd. Unauthorized reproduction of the article and photographs is prohibited.
