Use of artificial intelligence-generated multiple-choice questions for the examination of surgical subspecialty residents

Report of feasibility and psychometric analysis

Authors

  • Jin Kyu Kim University of Toronto
  • Michael E. Chua The Hospital for Sick Children, Toronto, ON
  • Armando J. Lorenzo The Hospital for Sick Children, Toronto, ON
  • Mandy Rickard The Hospital for Sick Children, Toronto, ON
  • Laura Andreacchi University of Toronto
  • Michael Kim University of Toronto
  • Douglas Cheung University of Toronto
  • Yonah Krakowsky University of Toronto
  • Jason Y. Lee University of Toronto

DOI:

https://doi.org/10.5489/cuaj.9020

Keywords:

ChatGPT, Large Language Model, LLM, MCQ, Examination

Abstract

INTRODUCTION: Multiple-choice questions (MCQs) are essential in medical education and widely used by licensing bodies. They are traditionally created with intensive human effort to ensure validity. Recent advances in artificial intelligence (AI), particularly large language models (LLMs), offer the potential to streamline this process. This study aimed to develop and test a GPT-4 model with customized instructions for generating MCQs to assess urology residents.

METHODS: A GPT-4 model was customized with guidelines from medical licensing bodies and urology-specific reference materials. This model was tasked with generating MCQs designed to mimic the format and content of the 2023 urology examination outlined by the Royal College of Physicians and Surgeons of Canada (RCPSC). Following generation, a selection of MCQs underwent expert review for validity and suitability.
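The abstract does not publish the authors' actual custom instructions, but a minimal sketch of how such a setup is typically assembled with the OpenAI Chat Completions API might look as follows. The instruction text, topic, and question count here are illustrative assumptions, not the study's prompts:

```python
# Hypothetical sketch of a custom-instruction MCQ generator; the system
# instructions below are illustrative, not the authors' actual prompts.
SYSTEM_INSTRUCTIONS = (
    "You are an examiner writing multiple-choice questions for urology "
    "residents. Follow RCPSC examination formatting: one clinical stem, "
    "five options (A-E), a single best answer, and a brief rationale."
)

def build_mcq_request(topic: str, n_questions: int = 5) -> list[dict]:
    """Assemble a Chat Completions message list requesting MCQs on a topic."""
    return [
        {"role": "system", "content": SYSTEM_INSTRUCTIONS},
        {"role": "user", "content": f"Generate {n_questions} MCQs on: {topic}"},
    ]

# A call would then look like (requires an API key and the openai package):
# client.chat.completions.create(model="gpt-4",
#                                messages=build_mcq_request("urolithiasis"))
```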

RESULTS: From an initial set of 123 generated MCQs, 60 were chosen for inclusion in an exam administered to 15 urology residents at the University of Toronto. Exam performance generally increased with level of training, suggesting the MCQs effectively discriminated knowledge levels among residents. The majority (33/60) of the questions showed acceptable (discrimination index 0.2–0.4) or excellent (discrimination index >0.4) discriminatory value.
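The discrimination index reported here is the standard classical-test-theory item statistic: the proportion of high scorers answering an item correctly minus the proportion of low scorers doing so. A minimal sketch (the 27% grouping fraction is a common convention; the data are illustrative, not from the study):

```python
def discrimination_index(item_correct, total_scores, frac=0.27):
    """Classical item discrimination: proportion correct in the top-scoring
    group minus proportion correct in the bottom-scoring group.

    item_correct: 1/0 per examinee for one question.
    total_scores: each examinee's total exam score.
    frac:         fraction of examinees in each extreme group.
    """
    n = len(total_scores)
    k = max(1, round(n * frac))
    # Rank examinees by total score, ascending.
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:k], order[-k:]
    p_high = sum(item_correct[i] for i in high) / k
    p_low = sum(item_correct[i] for i in low) / k
    return p_high - p_low
```

By the thresholds in the abstract, an index of 0.2–0.4 would be acceptable and above 0.4 excellent; an item only the top scorers answer correctly approaches 1.0.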

CONCLUSIONS: This study highlights AI-driven models like GPT-4 as efficient tools to aid MCQ generation in medical education assessments. By automating MCQ creation while maintaining quality standards, AI can expedite examination development. Future research should focus on refining AI applications in education to optimize assessments and enhance medical training and certification outcomes.

Published

2025-02-24

How to Cite

Kim, J. K., Chua, M. E., Lorenzo, A. J., Rickard, M., Andreacchi, L., Kim, M., … Lee, J. Y. (2025). Use of artificial intelligence-generated multiple-choice questions for the examination of surgical subspecialty residents: Report of feasibility and psychometric analysis. Canadian Urological Association Journal, 19(6), 182–7. https://doi.org/10.5489/cuaj.9020

Section

Original Research