Abstract
Objectives
As medical schools worldwide condense the preclinical phase of medical education, it is increasingly important to identify resources that help medical students retain and employ the medical information. One popular tool among medical students is an application called Anki, a free and open-source flashcard program utilizing spaced repetition for quick and durable memorization. The purpose of this study is to determine how variable Anki usage among first-year medical students throughout a standardized anatomy and physiology course correlates with performance.
Methods
We designed a novel Anki add-on called “Anki Stat Scraper” to collect data on first-year medical students at Kirk Kerkorian School of Medicine during their 8-week anatomy and physiology course. Anki users (N = 45) were separated into four groups: Heavy (N = 5), intermediate (N = 5), light (N = 16), and limited-Anki (N = 19) users, based on the time each student spent on the flashcard app, how many flashcards they studied per day, and how many days they used the app prior to their anatomy and physiology exam. A 14-question Likert scale questionnaire was administered to each participant to gauge their understanding of Anki and how they used the app to study.
Results
Heavy and intermediate Anki users had higher average exam scores than their counterparts who did not use Anki as a study method. Average exam scores were 90.34%, 91.74%, 85.86%, and 87.75% for heavy, intermediate, light, and limited-Anki users respectively (p > 0.05). Our survey demonstrated that Anki users spent an average of 73.86% of their study time using Anki, compared to an average of 36.53% for limited-Anki users (p < 0.001).
Conclusion
Anki users did not score significantly higher than limited-Anki users. However, survey responses suggest that students believe Anki may still be a useful educational tool for future medical students.
Introduction
In the rapidly growing realm of medicine, medical knowledge has been expanding exponentially. The doubling time of medical information was projected to be 50 years in 1950, but it is now estimated to be only 73 days. 1 Accordingly, medical schools demand that their students learn, retain, and master increasingly immense amounts of material throughout their education. Traditionally, this journey starts with 2 years of didactic material in which students learn organ-system or discipline-based information, focusing on normal versus abnormal patient findings and symptomatology. 2 One possible way of assessing student comprehension is through scheduled block exams, which assess the breadth of students’ preclinical knowledge. The concurrent expansion of technology and the internet allows many medical students to supplement their formal curricula using third-party online learning resources including pre-recorded video lectures, visual sketches, and flashcard applications.3,4 With the gradual increase in available e-resources over the past few decades, medical students must be intentional with their time-management and organizational skills to decide what the most effective methods of knowledge acquisition are. 5 It is well known that a majority of medical students use these additional resources to study for the United States Medical Licensing Exam (USMLE) Step 1. However, how often each resource is used and how important it is during the preclerkship years is less studied. 6
One popular resource is a program called Anki that allows users to make flashcards consisting of words, pictures, audio files, and videos. 7 Anki tests specific bite-size items of knowledge that are self-rated for difficulty and uses spaced repetition and active recall to reinforce information. Based on each student's responses, the program sets personalized intervals for reassessment and revision. 8 Under this model, newer and more difficult flashcards appear more frequently than older, easier, or “learned” flashcards, effectively identifying each student's knowledge gaps and working to strengthen them. This is known as the “spacing effect”. 9 The spacing effect, also known as distributed practice or spaced repetition, capitalizes on the ideas behind the “forgetting curve”, which demonstrates that a person will forget a certain amount of learned information each time it is presented to them. 10 On subsequent presentations of the information, less is forgotten because the facts build on each other and transition from new material to prior knowledge. 10 This is in contrast to “mass learning”, in which a person attempts to learn all of the information in one pass. It has been well documented that although mass learning might bear more positive results in the short term, spaced repetition produces more durable long-term retention. 11 The spaced repetition model is at the heart of Anki, and prior research studies have demonstrated Anki's effectiveness in improving medical student performance and reducing test anxiety.12–14
Nonetheless, there is a current gap in the literature regarding specifics on how each medical student utilizes Anki. Understanding the value each medical student places on Anki is crucial for medical faculty and future medical student cohorts to enhance knowledge attainment and facilitate learning. The purpose of our study is to evaluate Anki usage for first-year medical students throughout an 8-week anatomy and physiology course.
Methods
This study took place during the first eight weeks of medical school for the Kirk Kerkorian School of Medicine's class of 2026, from July 18th to September 7th, 2022. This was a single-institution, observational, prospective cohort study. Forty-five of the sixty medical students (75%) in their first year of medical school were recruited to participate in this study, and written informed consent was obtained prior to study initiation. All data were de-identified and handled in compliance with existing IRB protocols. Students were recruited based on their intention to primarily utilize the premade Anki decks “Anking” and “Lightyear.” Specific learning objectives for the exam were given to students, and the Anki cards in each premade deck were intended to cover these topics. However, no measures were taken to confirm that these were the only decks utilized, or that students were not supplementing the premade decks with their own decks and/or using decks premade by senior students at the medical school. We designed a novel Anki add-on called “Anki Stat Scraper” to collect usage data on first-year medical students at Kirk Kerkorian School of Medicine during their 8-week anatomy and physiology course. The add-on was created using software called MongoDB, a cross-platform document-oriented database program, and data were stored on an online cloud server requiring monthly payments. 15
It is important to note that Anki stores each individual's data locally, and that information is tied to each medical student's account. The Anki Stat Scraper add-on was designed to harvest and supplement the data that Anki already has for each person. Traditionally, to obtain the data, each student would have to export it as a PDF for us to use, but with Anki Stat Scraper, students could click a single button to export all of their data to the MongoDB server, and it is downloadable for anyone to use. The add-on queries the built-in stats page through the Anki session's internal database. These data are packaged and sent to our database, where we reorganize and process them and infer new data from the built-in stats. For example, the data generated by the add-on include a list of days on which a student used the app. Each day's record contains the number of cards studied and how much time was spent on each card that day. We can then regroup students based on usage statistics calculated from this basic data. We additionally included a qualitative survey in the add-on to assess students’ willingness to use Anki as a study tool and the overall value they placed on the application and its effectiveness. Information and reporting bias were controlled for by utilizing the add-on as a standardized data collection tool. Once student exam scores were paired with their quantitative and qualitative Anki data, we had an organized dataset on Anki's usefulness as a study tool for anatomy and physiology.
Two separate groupings were used in our study. First, medical students were divided into Anki users (N = 26) and limited-Anki users (N = 19). Second, all participants (N = 45) were separated into four groups: heavy (N = 5), intermediate (N = 5), light (N = 16), and limited-Anki (N = 19) users, based on the time each student spent on the flashcard app, how many flashcards they studied per day, and how many days they used the app prior to their anatomy and physiology exam. We defined an Anki user (heavy, intermediate, or light) as someone who studied a minimum of 75 flashcards per day and spent at least 5 s on average per individual flashcard. The entire study period was a total of 51 days. Heavy users spent 46 or more days (90%) studying with Anki, intermediate users spent 39 or more days (75%) studying with Anki, and light users spent 26 or more days (50%) studying with Anki. Individuals who did not utilize Anki at all or fell under the 50% cutoff were placed into the limited-Anki group. These definitions are shown in Table 1.
Table 1. Definitions of a heavy, intermediate, light, and limited-Anki user.
Total exam days for the Anatomy and Physiology block = 51 days. If a student did not study at least 75 flashcards with an average of at least 5 s for any given day, that day would not be counted as a day spent using Anki to study.
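The grouping rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Anki Stat Scraper code: the `DayRecord` type, its field names, and the function names are hypothetical, and the day thresholds are taken as the stated percentages of the 51-day block rounded up, which reproduces the 46, 39, and 26 day cutoffs given in the text.

```python
import math
from dataclasses import dataclass

STUDY_DAYS = 51            # length of the block, from the text
MIN_CARDS_PER_DAY = 75     # a day counts only with at least this many cards
MIN_SECONDS_PER_CARD = 5   # and at least this average time per card


@dataclass
class DayRecord:
    """Hypothetical per-day usage record (names are assumptions)."""
    cards: int       # cards studied that day
    seconds: float   # total seconds spent studying that day


def qualifying_days(days):
    """Count days that meet both the card-count and time-per-card cutoffs."""
    count = 0
    for d in days:
        # The card-count check comes first, so the division is never by zero.
        if d.cards >= MIN_CARDS_PER_DAY and d.seconds / d.cards >= MIN_SECONDS_PER_CARD:
            count += 1
    return count


def classify(days):
    """Assign a usage group from the number of qualifying days."""
    n = qualifying_days(days)
    if n >= math.ceil(0.90 * STUDY_DAYS):   # 46 days
        return "heavy"
    if n >= math.ceil(0.75 * STUDY_DAYS):   # 39 days
        return "intermediate"
    if n >= math.ceil(0.50 * STUDY_DAYS):   # 26 days
        return "light"
    return "limited"


if __name__ == "__main__":
    # Illustrative data only: 40 days of 100 cards at 6 s per card.
    example = [DayRecord(cards=100, seconds=600)] * 40
    print(classify(example))
```

Under these assumptions, a student with many short sessions (for example, 100 cards at 3 s per card) would accumulate no qualifying days and fall into the limited-Anki group, which is the behavior the 5 s cutoff is meant to produce.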
We chose to define Anki users based on three factors. First, we defined Anki users based on the percentage of days within the block on which they utilized Anki. The idea behind spaced repetition is to be exposed to the same information over multiple days for long-term retention. 16 Therefore, students using Anki on a greater number of total days were placed into heavy (90% of days studied), intermediate (75% of days studied), and light (50% of days studied) groups, respectively. Second, we defined Anki users based on the amount of time they spent per card. Students are capable of skipping through cards without even reading the information by pressing the spacebar. Therefore, we set an average cutoff value of 5 s per card to exclude skipping. Each flashcard contains a small, digestible amount of information that could be retained very quickly, but it does take time to read each card and develop an answer before moving forward. Theoretically, if a medical student studied for one hour using Anki and spent 5 s on average per flashcard, they would be able to complete 720 flashcards. If students completed flashcards in less than 5 s on average, we believe they may have been going too quickly to process the information presented to them. Even though writing styles for flashcards may differ between students, the 5 s cutoff was designed to exclude students who rapidly click through their cards without any intention of studying them, solely to prevent a buildup of cards for the following day. This cutoff did not affect our groupings when distinguishing between each type of Anki user. Finally, Anki users were categorized based on the number of cards they reviewed each day.
The average anatomy and physiology deck utilized by our students contains approximately 2500 cards, so we believe that 75 cards per day would need to be reviewed in order to complete the total deck with enough time to review the cards adequately before the block exam.
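The arithmetic behind the 5 s and 75-card cutoffs can be checked with a short sketch. The function names are hypothetical, and the second calculation assumes a simple first pass through the deck at exactly 75 new cards per day, ignoring repeat reviews, which is a simplification of how Anki schedules cards.

```python
import math

SECONDS_PER_HOUR = 3600


def max_cards_per_hour(seconds_per_card):
    """Cards that fit in one hour at a fixed average time per card."""
    return SECONDS_PER_HOUR // seconds_per_card


def days_to_first_pass(deck_size, cards_per_day):
    """Days needed to see every card once at a fixed daily rate."""
    return math.ceil(deck_size / cards_per_day)


if __name__ == "__main__":
    print(max_cards_per_hour(5))            # 720 cards in one hour at 5 s each
    days = days_to_first_pass(2500, 75)     # roughly 2500-card deck, 75 cards per day
    print(days)                             # 34 days for a first pass
    print(51 - days)                        # 17 days of the 51-day block remain
```

At 75 cards per day, a first pass through an approximately 2500-card deck takes about 34 days, which leaves roughly 17 days of the 51-day block for further review before the exam.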
A 14-question Likert scale questionnaire was administered to each participant within the add-on to gauge their understanding of Anki and how they used the app in conjunction with other resources to study (Appendix A). Questions 1–12 of the survey used a five-point scale with 1 = strongly disagree, 3 = neutral, and 5 = strongly agree. Survey questions assessed how medical students perceived Anki's user-friendliness, whether the app was too time-consuming, and if they believed Anki made a difference towards their academic success. Lastly, demographic data was collected and included for the class of 2026 to help control and assess for confounding variables. These findings are shown in Table 2.
Table 2. Demographic data for students in the class of 2026.
62 total students were measured according to their gender, age, race, socioeconomic status (SES), first generation status, and veteran status. Frequencies and percentages are displayed above.
Statistical analysis
At the end of the course, students were given a 60-question faculty-written exam that consisted of 30 multiple-choice questions and 30 lab practical-style questions. Mean exam scores were calculated for Anki versus limited-Anki users, as well as for the heavy, intermediate, light, and limited-Anki groups. An analysis of variance (ANOVA) test was applied to assess group differences.
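For readers unfamiliar with ANOVA, the following sketch computes the one-way F statistic from raw group scores using only the Python standard library. It is not the authors' analysis code, and the scores in the example are invented for illustration. Converting F to a p-value requires the F distribution, which a statistics package (for example SciPy's `scipy.stats.f_oneway`) would supply.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Invented scores for three small groups; not study data.
    demo = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova_f(demo))
```

A large F indicates that the spread between group means is large relative to the spread within groups. With four groups and 45 students, as in this study, the degrees of freedom would be 3 and 41.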
For the qualitative survey data, responses were collected and converted into numerical values, then a 95% confidence interval based on the t-distribution was calculated for the sample mean of each question to determine significance. A significant response was one in which the confidence interval did not include a theoretical mean of 3, the value indicating that students neither agree nor disagree with the question posed. Question 13 responses were recorded as “Yes” or “No,” and a chi-square test was used to determine significance between Anki and limited-Anki users. Question 14 responses were recorded on a scale of 0%–100%, and an independent samples t-test was used to determine significance between Anki and limited-Anki users.
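The confidence-interval rule for questions 1–12 can be illustrated with a minimal sketch. The function names and the sample responses are hypothetical, and the critical t value is passed in by the caller (here 2.262, the two-sided 95% value for 9 degrees of freedom from a standard t-table) because the Python standard library does not provide the t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

NEUTRAL = 3  # midpoint of the five-point scale


def likert_ci(responses, t_crit):
    """Confidence interval for the mean response, given a critical t value."""
    n = len(responses)
    centre = mean(responses)
    half_width = t_crit * stdev(responses) / sqrt(n)
    return centre - half_width, centre + half_width


def is_significant(responses, t_crit):
    """True when the interval excludes the neutral value of 3."""
    low, high = likert_ci(responses, t_crit)
    return not (low <= NEUTRAL <= high)


if __name__ == "__main__":
    # Invented responses from 10 students; not study data.
    agree = [4, 5, 4, 4, 5, 3, 4, 5, 4, 4]
    mixed = [3, 3, 4, 2, 3, 3, 2, 4, 3, 3]
    t_crit_df9 = 2.262  # two-sided 95% critical value for 9 degrees of freedom
    print(likert_ci(agree, t_crit_df9), is_significant(agree, t_crit_df9))
    print(likert_ci(mixed, t_crit_df9), is_significant(mixed, t_crit_df9))
```

In this sketch, the first set of responses has an interval that lies entirely above 3 and would be counted as significant agreement, while the second set straddles 3 and would not.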
Results
Our definitions of heavy, intermediate, light, and limited-Anki users are displayed in Table 1 for clarity. Next, we aimed to characterize each of our four groups with descriptive statistics. Mean exam score and standard deviation were calculated for each group. Additionally, an unpaired Pearson t-test was performed between each group pair, with an ANOVA performed between all groups to assess differences. The results of this exploration are shown below in Table 3. Finally, a 14-question survey was given to all participants to gauge their usage and interest regarding Anki. Those results are displayed in Table 4.
Table 3. Descriptive Statistics for Each Type of Anki User.
Table 4. Survey Questionnaire.
Our findings suggest that there are no statistically significant differences among the four groups (p = 0.185). However, the comparison of light versus intermediate Anki users showed the largest difference among the groups (Mean = 85.86 and 91.74, respectively). To visually compare the groups, bar graphs of means and standard deviations are included in Figures 1–3.

Figure 1. Exam performance based on Anki usage by group.

Figure 2. Anki users versus limited Anki users on exam performance.

Figure 3. Exam performance of heavy and intermediate Anki users versus light and limited Anki users.
Our results indicate that there were no statistically significant differences among Anki users, regardless of the number of days each individual spent using Anki to study for their Anatomy and Physiology examination. Importantly, the highest average score was obtained by the intermediate Anki users (Mean = 91.74%), and the lowest average score was obtained by the light-Anki users (Mean = 85.86%). The average hours spent studying and average number of Anki cards studied per day were highest among the intermediate Anki users (102.17 h and 663 flashcards).
When comparing all heavy, intermediate, and light Anki users (N = 26) to limited Anki users (N = 19), there was almost no difference in exam score averages (87.85% vs 87.75%, p = 0.561).
When comparing heavy and intermediate Anki users (N = 10) to light and limited Anki users (N = 35), exam scores were 91.04% and 86.88% on average (p = 0.747).
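The pooled averages in these comparisons are size-weighted means of the four group means reported in the abstract, and can be recomputed with a short sketch. The dictionary and function names are hypothetical; the group sizes and means are those stated in the text.

```python
# (group size, mean exam score in %) as reported in the text
GROUPS = {
    "heavy": (5, 90.34),
    "intermediate": (5, 91.74),
    "light": (16, 85.86),
    "limited": (19, 87.75),
}


def pooled_mean(names):
    """Size-weighted mean exam score across the named groups."""
    total_n = sum(GROUPS[name][0] for name in names)
    weighted_sum = sum(GROUPS[name][0] * GROUPS[name][1] for name in names)
    return weighted_sum / total_n


if __name__ == "__main__":
    anki_users = pooled_mean(["heavy", "intermediate", "light"])
    heavy_intermediate = pooled_mean(["heavy", "intermediate"])
    light_limited = pooled_mean(["light", "limited"])
    print(round(anki_users, 2), round(heavy_intermediate, 2), round(light_limited, 2))
```

Recomputed this way, the pooled mean for all Anki users (N = 26) is about 87.85%, for heavy and intermediate users (N = 10) about 91.04%, and for light and limited users (N = 35) about 86.89%. The last figure differs from the reported 86.88% only in the final digit, presumably because the group means used here are already rounded.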
All 45 participating students answered 14 total survey questions and recorded the first 12 answers from a range of 1 to 5, where 1 = strongly disagree, 3 = neutral, and 5 = strongly agree. Question 13 was assessed with responses as either “Yes” or “No.” Question 14 was assessed using a sliding scale from 0% to 100%. Responses to questions 1–12 were determined as significant if the 95% confidence intervals did not include the theoretical mean of 3, signifying that students neither agree nor disagree with the question posed. Question 13 significance was measured using a chi-squared test (p = 0.139). Question 14 significance was measured using an independent samples t-test (p < 0.001).
From Table 4, Anki users demonstrated significance when answering questions 1–7, 10–12, and 14. Limited Anki users demonstrated significance when answering questions 1, 6, 9, 11, 12, and 14. Specifically, both groups agreed that Anki was useful for their anatomy organ system examination and that Anki's bite-size information was easily understandable despite using premade decks. Additionally, both groups intended to use Anki to study for USMLE Step 1 and 2, even those who did not utilize Anki at all for this specific exam. This suggests that first-year medical students viewed Anki positively, but there may have been difficulties with using it and understanding the techniques behind spaced repetition. Future studies should evaluate whether students who do not use Anki heavily when starting medical school gradually begin to supplement their studying with Anki. Furthermore, both groups agreed on suspending flashcards that they missed multiple times. This may indicate that there are opportunities to refine information in these cards to make them more understandable. Lastly, Anki and limited Anki users differed sharply in the amount of time they reported spending on Anki when studying for this exam. Those we defined as Anki users reported spending approximately three-quarters (73.86%) of their total study time over the 8-week span using Anki, while limited Anki users reported spending just over one-third (36.53%) of their total study time using Anki. This suggests that even limited Anki users utilized flashcards to some extent when studying, and it would be valuable to analyze how medical students study over the course of multiple organ-blocks and national medical licensing exams. For additional clarity, Table 2 includes demographic data from the class of 2026.
Discussion
Anki in medical school
This study presents promising information regarding the effect of Anki on student performance; however, as our results were not statistically significant (p = 0.185), it is unclear whether utilizing Anki ultimately leads to higher test scores among students. Encouragingly, this study's results showed a positive trend in which students using Anki at a heavy or intermediate level tended to score higher on average than their counterparts who used Anki lightly or did not use Anki at all (limited-Anki users) in their studies. The exam utilized in this study included 30 multiple-choice questions and 30 lab practical, fill-in-the-blank questions. This preliminarily suggests that Anki's use of spaced repetition and fact recall may be advantageous in helping students to recognize information presented to them, at least under the aforementioned testing conditions. Ultimately, we believe that more research is needed regarding Anki use among medical students, as there have only been six studies published on PubMed in the past 15 years regarding Anki use in medical school. This study is intended to be a pilot study introducing the Anki Stat Scraper add-on to additional medical students and schools to analyze Anki usage at other institutions. We hypothesize that because Anki requires students to digest the information throughout the duration of their course to be effective, medical students become comfortable with the terms and can therefore recognize them with greater ease on their exam. It is unclear from this study how Anki helps students to reason through complex multi-order critical thinking questions; however, it is likely the combination of video-based, medical school lecture, and text resources supplemented with Anki's use of the spaced repetition model that allows students to find success on their exams.
Study strategy
Importantly, the strategy with which a student utilizes Anki is a key factor to consider when deciding whether to implement it as a study tool in medical school. Our results showed that students who used Anki lightly, meaning they studied flashcards on at least 50% but fewer than 75% of the days allotted for the anatomy and physiology course, scored below their limited-Anki user colleagues. We believe this may be attributable to Anki's complexity and the inherent nature of spaced repetition. Students who switch between multiple study resources without an organized plan of how to balance them may end up with inefficient habits that could lead to lower exam performance.3,17 In contrast, heavy and intermediate Anki users utilized Anki for at least 90% and 75% of the study days allotted, respectively. We believe that this amount of time may be necessary to maximize a medical student's ability to retain information through spaced repetition and active recall. 8 Although we cannot speak to the complete spread of the resources used by heavy and intermediate Anki users, we can be sure that their utilization of at least one resource, namely Anki, was consistent during their time within the anatomy and physiology block. It can also be inferred that students who elected not to use Anki at all had at least one fewer resource with which they studied. They may therefore have been able to make more concrete connections to the material than students who had difficulties deciding which resources to pursue and only used Anki partially. It should be noted that based on our definitions of intermediate and heavy Anki users, intermediate Anki users actually studied more flashcards than the heavy Anki users on average. This is because we defined Anki usage levels based on how many days students studied at least 75 cards and not on how many cards they reviewed on the days that they studied.
To elaborate, intermediate Anki users may have only utilized Anki on 75%–89% of the total days available to them, but on those days, their card load was higher than that of heavy Anki users, who utilized Anki on a greater number of total days. The reason for this may be that students who did not use Anki every single day (or almost every single day) felt they could unlock more cards without feeling overwhelmed by the number of cards due. In contrast, students who were diligent about maximizing their utilization of Anki over a longer time span may have been more conservative about how many cards they unlocked because they anticipated having to review those cards on subsequent days. It is possible that these students were more strategic about which cards they unlocked for this reason.
Although our data show a positive association between exam scores and progressively higher levels of Anki use, no statistically significant difference was found between the groups. Our interpretation is that although Anki is a powerful study tool, there are other resources and study methods that students can utilize in order to succeed. Colloquially, it is understood that Anki should be utilized in tandem with other resources in order to solidify an appropriate understanding of the medical material. However, the optimal distribution of use across the various resources still needs to be ascertained. Future studies should explore different Anki user definitions to determine the extent to which Anki should be incorporated into a student's study routine. The addition of more exam scores and student Anki usage data would allow us to better understand the impact of Anki on student exam performance.
Student perceptions of Anki
We also set out to understand how students felt about Anki as a program and its capabilities. Based on student survey responses, students feel that Anki is a unique tool that will help them to find success not only on their block exams, but also on their board exams, USMLE Step 1 and Step 2. This was highlighted in questions 1–6, 9–12, and 14, which were all significant. We believe that Anki is able to distill information in a concise and comprehensible manner, an increasingly important factor to consider when learning new concepts. Additionally, question 14 supported the reliability of our definitions of an Anki user. The average proportion of study time that Anki users spent on the app was 73.86%, compared to 36.53% for limited-Anki users (p < 0.001). Interestingly, although 19 of the 45 students were limited-Anki users whose self-reported use of Anki was 36.53% of their study time, the results of this survey remained skewed favorably toward the use of Anki. This suggests that even those who used Anki in a limited capacity felt that the time they did put towards using Anki was valuable. Additionally, some students believe that Anki's interface is not easily digestible, which could ultimately limit how often, if at all, students utilized the program. This finding may justify the need for a more in-depth tutorial on how Anki works and what tips are available to maximize its ability to help students. Although not significant, students interestingly felt neutral about the need to use Anki every day in order for it to produce favorable results. This suggests that Anki could be valuable for medical students, but may require individualized instruction and flexible usage patterns to be most effective.
Limitations
One limitation of this study is that it followed only a single, faculty-written exam. Additional data from future organ-system block exams tested using standardized NBME-style questions will provide further insight into Anki's ability to aid students during their study time. Additionally, because the exam was faculty-authored, it is possible that despite our efforts to recruit students who use Anking or Lightyear decks, students may have instead utilized Anki decks that were premade by senior medical students. This limits the generalizability of our results, as each university may experience different results depending on the quality of its self-made decks and how well they correspond to faculty-authored questions. Moreover, we did not quantify the percentage of students who used Anki prior to medical school, which could confound our results because there may be a select group of students who already knew how to study using Anki more effectively.
This study was conducted at a single institution and followed only a portion of first-year medical students. We intended to survey a total of 60 students, but only 75% of students (N = 45) responded. No sample size/power analysis was performed for this study. Additionally, the qualitative questionnaire utilized was not validated, but it was pilot-tested by the authors of the study prior to its use. Expanding the sample size to include additional classes will help to understand the scope of Anki's ability to aid students on their school exams as well as on national board and clinical subject exams.
Conclusion
As medical education requirements continue to increase, medical students must streamline their preclinical learning methods, and identifying effective learning tools is paramount. The goal of our pilot study was to quantify the use of Anki, a spaced repetition flashcard program, among first-year medical students as they studied anatomy and physiology. Our survey results revealed that Anki users spent a significantly higher percentage of their study time using the app compared to limited-Anki users, and that incoming medical students viewed Anki positively, even those who made limited use of it for their first anatomy and physiology exam. Nevertheless, Anki usage was not significantly associated with exam performance among the cohort. These findings suggest that spaced repetition and active recall may be more effective if done for a significant amount of a student's study time beyond one medical school exam. Future research is needed to analyze the best strategies to optimize and incorporate Anki into medical education.
Footnotes
Acknowledgments
Not Applicable.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethics approval and consent to participate
De-identified data for this study were drawn from institutional databases in accordance with an approved IRB protocol. Written consent was obtained from each of the study subjects prior to study initiation. Our Institutional Review Board is the UNLV Biomedical IRB, protocol number 2022–302.
Availability of data and materials
All data generated and analyzed during the current study is available from the corresponding author on reasonable request.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Authors’ contributions
All authors contributed to manuscript work and design. Joshua Levy helped with IRB approval and wrote the intro, methods, and results. Kencie Ely wrote the conclusion and limitations. Gemma Lagasca and Hiba Kausar performed the literature review. Deepal Patel helped with data analysis. Shaun Andersen assisted with literature review and the intro. Carlos Georges developed the Anki Stat Scraper application in its entirety. Dr Edward Simanton supervised and guided this manuscript from conception. The final manuscript was read and approved by all authors.
