Scores of responses by doctors and ChatGPT on the Swedish family medicine specialist exam

Rasmus Arvidsson; Ronny Gunnarsson; Artin Entezarjou; David Sundemo; Carl Wikberg

doi:10.5878/j8jh-5128

These scores were compiled as part of a study that compared ChatGPT’s performance with that of real doctors on the Swedish family medicine licensing exam. Scores range from zero to ten and cover the cases from exam years 2017-2022. For more details, see README.txt.

Download data and documentation (3 files / 10.06 KiB)

Data files

Scores.csv
4.47 KiB

Documentation files

LICENSE.txt
460 Bytes
README.txt
5.14 KiB

Citation and access

Data contains personal data:

No

Language:

English

Copyright:

Copyright is retained for the example case in the README file. See LICENSE.txt.

Method and outcome

Population:

Anonymous responses from SFAM's specialist exam in general medicine 2017-2022 and responses from ChatGPT to the same cases.

Time method:

Cross-section

Study design:

Observational study

Description of study design:

ChatGPT’s scores were compared with those of real doctors using cases from the Swedish family medicine specialist exam.
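As a rough illustration of a case-by-case comparison, the sketch below computes per-case score differences and their mean. It is a minimal sketch, not the study's analysis: the function name `paired_differences` and the numbers are made-up placeholders, not values from Scores.csv, and the study's actual statistical method is not described here.

```python
from statistics import mean


def paired_differences(scores_a, scores_b):
    """Return per-case differences (a - b) for two equally long score lists."""
    if len(scores_a) != len(scores_b):
        raise ValueError("Both score lists must cover the same cases")
    return [a - b for a, b in zip(scores_a, scores_b)]


# Made-up placeholder scores on the 0-10 scale, one entry per case.
# These are NOT values from the dataset.
chatgpt_scores = [4.0, 6.5, 5.0]
doctor_scores = [5.0, 6.0, 7.5]

diffs = paired_differences(chatgpt_scores, doctor_scores)
mean_diff = mean(diffs)
print(diffs, mean_diff)
```

A negative mean difference in this sketch would mean the first group scored lower on average across the paired cases.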

Sampling procedure:

Mixed probability and non-probability

Description of sampling:

1. Randomly selected doctor responses - a single response was selected at random for each case.
2. Top-tier doctor responses - for each case, a response chosen by the exam reviewers as an example of a very good response.
3. ChatGPT responses - responses provided by ChatGPT-4, August 3 Version (2023).
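The random selection in the first group can be sketched as follows. This is only an illustration under assumptions: the function name `pick_one_per_case`, the case labels, and the response identifiers are hypothetical, and the actual selection procedure used in the study may have differed.

```python
import random


def pick_one_per_case(responses_by_case, seed=None):
    """Pick one response uniformly at random for each case."""
    rng = random.Random(seed)  # a seeded generator makes the draw reproducible
    return {
        case: rng.choice(responses)
        for case, responses in responses_by_case.items()
    }


# Hypothetical placeholder identifiers, not taken from the dataset.
responses_by_case = {
    "case_A": ["resp_1", "resp_2", "resp_3"],
    "case_B": ["resp_4", "resp_5"],
}

selected = pick_one_per_case(responses_by_case, seed=0)
print(selected)
```

Fixing the seed is a design choice for reproducibility; whether the original selection was seeded is not stated.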

Time period(s) investigated:

2017 - 2022

Data format/data structure:

Numeric

Data collection - Simulation

Mode of collection:

Simulation

Description of the mode of collection:

Questions prompted to ChatGPT-4

Time period(s) for data collection:

2023-08-23 - 2023-08-23

Data collector:

University of Gothenburg

Instrument

Name:

ChatGPT-4

Type:

Other

Data collection - Educational measurements and tests

Mode of collection:

Educational measurements and tests

Description of the mode of collection:

SFAM's specialist exam in general medicine

Data collector:

The Swedish Association of General Practice (SFAM)

Geographic coverage

Geographic location:

Sweden

Administrative information

Responsible department/unit:

Institute of Medicine

Metadata

Version 1
