Proceedings Article | 2 April 2024
KEYWORDS: Lung cancer, Clinical trials, Education and training, Design, Tumors, Oncology, Pulmonary disorders, Medicine, Brain diseases, Radiology
Blinded Independent Central Review (BICR) is pivotal to maintaining unbiased assessment in oncology clinical trials, which employ assessment criteria such as Response Evaluation Criteria In Solid Tumors (RECIST). This paper highlights the potential of Large Language Models (LLMs), such as OpenAI's GPT-4 and ChatGPT, trained on clinical trial documents and other reader training materials, to significantly improve interpretation assistance and real-time query resolution. These documents are readily accessible to readers during the central review process, but because most readers work on multiple ongoing clinical trials at once, searching them for details such as trial design, endpoints, and study-specific reader rules can be a daunting task. Using frameworks built on the ChatGPT Application Programming Interface (API), an AI-based chatbot can save readers time by providing accurate answers to study-design questions based on uploaded training documents. If successful, this approach opens another novel application for LLMs (or ChatGPT) in clinical research and medical imaging.

This prospective study reviewed study designs and protocols available from the ClinicalTrials.gov database, maintained by the National Library of Medicine (NLM) at the National Institutes of Health (NIH). ClinicalTrials.gov is a registry of clinical trials that contains information on studies funded by the NIH, other federal agencies, and private industry; the database includes over 444,000 trials from 221 countries, and the NLM works with the US Food and Drug Administration (FDA) to develop and maintain it. The ChatGPT-based chatbot was trained on the trial design data of the respective studies so that it could grasp the intricacies of the assessment criteria, patient population, inclusion/exclusion criteria, and other assessment nuances, as applicable.
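The paper does not publish its implementation, but the document-grounded chatbot it describes could be sketched as follows. The chunking size, the keyword-overlap retrieval, and the prompt wording are all illustrative assumptions; a production system might instead use embedding-based retrieval before calling the ChatGPT API.

```python
# Illustrative sketch only; chunk size, retrieval method, and prompt
# wording are assumptions, not the authors' actual implementation.

def chunk_document(text, size=1500):
    """Split a protocol or reader-training document into fixed-size chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def retrieve(chunks, question, top_k=3):
    """Naive keyword-overlap retrieval; a real system might use embeddings."""
    q_words = set(question.lower().split())
    scored = sorted(chunks,
                    key=lambda c: -len(q_words & set(c.lower().split())))
    return scored[:top_k]

def build_messages(context_chunks, question):
    """Assemble a chat prompt that grounds answers in the uploaded documents
    and tells the model to defer to the study team when the answer is absent."""
    system = (
        "You are a BICR reader assistant. Answer ONLY from the trial "
        "documents below. If the answer is not present, say so and refer "
        "the reader to the study documents or study team.\n\n"
        + "\n---\n".join(context_chunks)
    )
    return [{"role": "system", "content": system},
            {"role": "user", "content": question}]

# The assembled messages would then be sent to the ChatGPT API, e.g.:
#   from openai import OpenAI
#   reply = OpenAI().chat.completions.create(
#       model="gpt-4", messages=build_messages(ctx, question))
```

The system prompt's instruction to defer unanswerable questions to the study team mirrors the behavior the study later observed on trick questions.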
Fine-tuning with prompt engineering ensured that the chatbot understood the language and context specific to BICR. The resulting models serve as intelligent assistants with a user-friendly interface for reviewers, who can engage with the chatbot in natural language to obtain clarifications on assessment guidelines, terminology, and complex cases. To train the chatbot, we searched ClinicalTrials.gov for lung cancer studies with the following filters: "Completed Studies | Studies with Results | Interventional Studies | Lung Cancer | Phase 3 | Study Protocols | Statistical Analysis Plans (SAPs)". To avoid selection bias, we took the first three studies in the search results for our prospective study and drafted seven questions representative of queries readers commonly encounter. The ChatGPT-supported system was evaluated against a gold-standard medical opinion provided by a board-certified radiologist with over 20 years of experience in the BICR process.

The chatbot provided immediate, contextually accurate insights for all questions across the three studies. Trick questions whose answers, data, or references were absent from the provided text were rightly called out by the chatbot, which advised the user to check the documents or consult the study team; occasionally, in addition to that referral, it offered general feedback drawn from clinical practice or general criteria. Real-time query resolution shortens response times, preventing delays in assessment and decision-making, and LLMs can streamline training by offering on-the-spot explanations and references, enhancing reviewer proficiency and efficiency. Acting as interactive chatbots, LLMs have immense potential to improve quality and efficiency by offering contextual guidance and expedited responses, ultimately enhancing decision-making and study efficiency.
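The evaluation workflow described above (seven questions per study, compared against a gold-standard opinion, with trick questions expected to trigger a referral) could be organized as a small scoring harness. The referral phrases and match logic below are hypothetical, not the authors' actual scoring protocol.

```python
# Hypothetical evaluation harness; referral phrases and scoring rules
# are assumptions for illustration, not the study's actual protocol.

REFERRAL_PHRASES = (
    "check with the study team",
    "refer to the document",
    "not specified in the provided documents",
)

def is_referral(answer):
    """Detect when the chatbot correctly declines to answer a trick question
    and points the reader back to the documents or study team."""
    a = answer.lower()
    return any(phrase in a for phrase in REFERRAL_PHRASES)

def score(chatbot_answer, gold_answer, is_trick):
    """An answer is correct if it contains the gold-standard answer, or,
    for a trick question, if it refers the reader to the documents/team."""
    if is_trick:
        return is_referral(chatbot_answer)
    return gold_answer.lower() in chatbot_answer.lower()

def accuracy(results):
    """results: list of (chatbot_answer, gold_answer, is_trick) tuples."""
    marks = [score(*r) for r in results]
    return sum(marks) / len(marks)
```

Substring matching against a gold answer is a deliberately crude stand-in; in practice the comparison was a qualitative judgment by the reviewing radiologist.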