Innovative Insights in Digital Health

Abstract

Automated Natural Language Processing for Tumor Response Classification in Oncology Radiology Reports
Som Biswas

Natural language processing (NLP) has evolved rapidly in recent years, enabling the extraction of clinically relevant information from unstructured electronic medical records. Radiology reports, particularly in oncology, contain detailed longitudinal information on patient disease status that can inform therapeutic decisions and outcomes. Although structured reporting has been advocated to streamline data extraction, most radiology reports are still written as free text [1]. This study aimed to use structured oncology reports (SOR) to train a deep NLP model for tumor response category (TRC) classification in free-text oncology reports (FTOR) and to compare its performance with that of conventional NLP algorithms and human readers [2].

In this retrospective study, 9,653 SOR and 802 FTOR from multiple radiology centers were analyzed. A BERT-based NLP model was trained on SOR and applied to FTOR, and its performance was compared with that of radiologists, medical students, and radiology technologist students. The BERT model achieved an F1 score of 0.70: it outperformed traditional NLP approaches and technologist students and approximated medical student performance, but remained inferior to radiologists (F1, 0.79). Lexical complexity and semantic ambiguity reduced performance for both humans and machines [3].
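For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for a multi-class task such as TRC classification. It is illustrative only: the abstract does not state which averaging scheme was used or list the TRC labels, so the macro-averaging, the function names, and the three label names below are assumptions, and the toy data are not study data.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical label names and toy data, for illustration only.
reference = ["progression", "stable", "response", "stable", "progression", "response"]
predicted = ["progression", "stable", "stable", "stable", "response", "response"]

print(per_class_f1(reference, predicted))
print(round(macro_f1(reference, predicted), 3))
```

Under macro-averaging, every category contributes equally regardless of how often it occurs, so rare response categories weigh as much as common ones; a micro-averaged or support-weighted F1 would behave differently on imbalanced report collections.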

Conclusion: Deep NLP models trained on structured oncology data can achieve near-human performance in extracting oncologic outcomes from free-text reports, offering a scalable approach for large-scale oncology data curation [4].
