Abstract

Large language models (LLMs) show promise for clinical decision support but often struggle with case-specific reasoning. We present Ophtimus-V2-Tx, an 8-billion-parameter ophthalmology-specialized LLM fine-tuned on more than 10,000 case reports. Evaluation is conducted on a pre-collected dataset. Alongside text metrics (ROUGE-L, BLEU, METEOR) and a semantic similarity score, we use CliBench to map outputs to standardized codes (ICD-10-CM, ATC, ICD-10-PCS) and compute hierarchical F1 (L1-L4 and Full), with code mapping used strictly as an evaluation tool. Ophtimus-V2-Tx is competitive with a state-of-the-art general model and stronger in several settings. It improves text metrics (ROUGE-L 0.40 vs. 0.18; BLEU 0.26 vs. 0.05; METEOR 0.45 vs. 0.29) with comparable semantic similarity. On CliBench, it attains a higher full-code score for secondary diagnosis and ties or leads at selected granular levels for primary diagnosis, while medication and procedure results are close with overlapping confidence intervals. Relative to other ophthalmology-tuned baselines, it shows consistently higher text-generation scores. These findings indicate that a compact, domain-adapted model can approach, or in targeted settings exceed, large general LLMs on clinically grounded outputs while remaining feasible for on-premise use. We also describe an auditable evaluation pipeline (frozen coding agent, identical prompts, hierarchical metrics) to support reproducibility and future benchmarking.
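The hierarchical F1 mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that level Lk compares the first k characters of each code after removing the dot and that "Full" compares complete codes; CliBench's actual level definitions may differ. The function names `truncate` and `hierarchical_f1` and the example codes are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Optional


def truncate(code: str, level: Optional[int]) -> str:
    """Normalize a code and keep its first `level` characters (None = full code)."""
    normalized = code.replace(".", "").upper()
    return normalized if level is None else normalized[:level]


def hierarchical_f1(pred: Iterable[str], gold: Iterable[str], level: Optional[int]) -> float:
    """Set-based F1 between predicted and reference codes at one granularity level."""
    pred_set = {truncate(c, level) for c in pred}
    gold_set = {truncate(c, level) for c in gold}
    true_positives = len(pred_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example codes, not taken from the paper's dataset.
gold = ["H40.1131", "H25.11"]
pred = ["H40.1190", "E11.9"]

# Assumption: L1-L4 are prefix lengths 1-4; "Full" is the complete code.
scores = {f"L{k}": hierarchical_f1(pred, gold, k) for k in (1, 2, 3, 4)}
scores["Full"] = hierarchical_f1(pred, gold, None)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, a prediction that lands in the right code family but misses the final characters still earns credit at the coarser levels and none at the full-code level, which is the kind of partial match a hierarchical metric is meant to expose.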

MeSH Terms

Humans; Eye Diseases; Ophthalmology; Clinical Decision-Making

Publication Info

Year: 2025
Type: article
Volume: 15
Issue: 1
Pages: 43532-43532
Citations: 0
Access: Closed

Cite This

Minwook Kwon, Kuk Jin Jang, Seung Ju Baek et al. (2025). Ophtimus-V2-Tx: a compact domain-specific LLM for ophthalmic diagnosis and treatment planning. Scientific Reports, 15(1), 43532-43532. https://doi.org/10.1038/s41598-025-27410-1

Identifiers

DOI: 10.1038/s41598-025-27410-1
PMID: 41372257
PMCID: PMC12695966