In recent years, artificial intelligence has been applied to a growing range of uses in medicine. In particular, GPT, a type of large language model built on the attention mechanism, has attracted considerable interest for its ability to interact with humans in natural language. GPT has been reported to score above the passing threshold on Japan's National Medical Licensing Examination when the questions are translated into English before being fed into the model. In contrast, its performance in the domestic veterinary field, which relies on Japanese-language materials and requires specialized knowledge, had not been established.
A research group led by Project Lecturer Daiki Kato and Professor Takayuki Nakagawa of the Graduate School of Agricultural and Life Sciences at the University of Tokyo used questions from the past three years of Japan's National Veterinary Licensing Examination to compare the performance of different GPT models and to assess how input prompt design and language translation affect answer accuracy.
The results showed that the latest o3 model achieved higher accuracy than the GPT-4o model, demonstrating that increases in pre-training volume and the number of parameters lead to improved performance even in the veterinary field.
Validation tests further revealed that, when the original Japanese questions were used without any special prompt optimization, the model substantially exceeded the passing score (60-70%) in every section, achieving an overall score of 92.9%. These results suggest that GPT possesses knowledge at or above the level expected of a Japanese veterinary school graduate.
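As a rough illustration of how per-section accuracy can be checked against a passing threshold, the following minimal sketch may help. The section names, graded answers, and per-section thresholds are all hypothetical placeholders chosen for the example; they are not data from the study, and this is not the researchers' scoring code.

```python
from collections import defaultdict

# Hypothetical graded answers as (section, is_correct) pairs.
# These are illustrative values, not results from the study.
graded = [
    ("section_a", True), ("section_a", True), ("section_a", False),
    ("section_b", True), ("section_b", True),
]

# Hypothetical passing thresholds per section; the article only
# states that passing scores fall in the 60-70% range.
thresholds = {"section_a": 0.60, "section_b": 0.70}


def section_accuracy(records):
    """Return the fraction of correct answers for each section."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for section, ok in records:
        totals[section] += 1
        correct[section] += int(ok)
    return {s: correct[s] / totals[s] for s in totals}


def passes_every_section(accuracy, limits):
    """True only if each section meets or exceeds its threshold."""
    return all(accuracy[s] >= limits[s] for s in limits)


accuracy = section_accuracy(graded)
overall = sum(ok for _, ok in graded) / len(graded)

for name, value in sorted(accuracy.items()):
    print(f"{name}: {value:.1%} (threshold {thresholds[name]:.0%})")
print(f"overall: {overall:.1%}")
print("passes every section:", passes_every_section(accuracy, thresholds))
```

Keeping the per-section check separate from the overall score mirrors the distinction the article draws between exceeding the passing score in every section and the overall score.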
Analysis of the incorrectly answered questions showed that accuracy declined for legal questions based on domestic regulations, image-based questions, and clinical questions requiring the integration of information and logical reasoning. GPT therefore has both stronger and weaker areas within the National Veterinary Licensing Examination, and further validation studies on its use, informed by these findings, will be needed.
This foundational research demonstrates that GPT has the potential to play a supporting role in veterinary education and on-site practice in Japan, for example as a learning aid or a work-assistance tool. The findings represent an important starting point for the safe and effective use of GPT in the veterinary field. The results were published in Scientific Reports.
Journal Information
Publication: Scientific Reports
Title: Performance evaluation of generative pre-trained transformer on the National Veterinary Licensing Examination in Japan
DOI: 10.1038/s41598-026-37300-9
This article has been translated by JST with permission from The Science News Ltd. (https://sci-news.co.jp/). Unauthorized reproduction of the article and photographs is prohibited.

