Abstract
Background:
As the population of aging physicians increases, methods of assessing physicians’ cognitive function and predicting clinically significant changes in clinical performance become increasingly important. Although several approaches have been suggested, no evaluation system is accepted or utilized widely.
Study Design:
Literature was reviewed using Medline, PubMed and other sources. Articles discussing the problems of geriatric physicians were summarized, stressing publications that proposed methods of evaluation. Selected literature on evaluating aging pilots also was reviewed, and potential applications for physician evaluation were proposed. Neuropsychological cognitive test protocols were reviewed, and a reduced evaluation protocol was proposed for interdisciplinary longitudinal research.
Results:
Although there are several articles evaluating cognitive function in aging physicians and aging pilots, and although a few institutions have instituted cognitive evaluation, there are no longitudinal data assessing cognitive function in physicians over time, and correlating them with performance.
Conclusion:
Valid, reliable testing of cognitive function of physicians is needed. In order to understand its predictive value, physicians should be tested over time starting when they are young, and results should be correlated with physician performance. Early testing is needed to determine whether cognitive deficits are age-related or longstanding. A multi-institutional study over many years is proposed. Additional assessments of other factors, such as manual dexterity (perhaps using simulators) and physician frailty are recommended, but detailed discussion of these issues is beyond the scope of this article.
Keywords
Introduction
As the percentage of elderly people in our population increases, so does the percentage of elderly physicians. Older doctors, some practicing part-time or in a restricted capacity, might be part of the solution to the expected shortage of health care providers. However, it is essential to ensure that practicing physicians remain competent. The need for assessing functional competence has been recognized and studied especially among surgeons. However, the lack of accepted methods for evaluation and criteria for determining the functional correlates of physician competence has created practical and legal difficulties. These challenges also are particularly apparent among surgeons. Hospitals are obligated to assure the competence and qualifications of their medical staffs. Most rely upon department chairs to review applications for surgical privileges (initial applications and frequent re-applications). Credentials committees generally accept the endorsement of department chairs. However, those of us who are department chairs frequently face daunting challenges in evaluating and attesting to the competence of aging surgeons. In most institutions, there are no objective measures used routinely to assess manual or cognitive functions. Hence, judgments by department chairs are based on personal observation and impressions; opinions expressed by trainees, nurses and others who are in the operating room; and in some cases by outcome data. Denying privileges on the basis of a clinical impression of decreasing competence is difficult, especially when the aging surgeon in question was a mentor to the current department chair. In addition, the aging surgeon may not recognize a change in function; and denial of privileges without objective information can lead not only to acrimony, but also to lawsuits for age and employment discrimination. Unfortunately, the documentation available to us usually is an increased prevalence of bad outcomes. 
It seems self-evident that if we wait to take action until after a previously excellent surgeon has been sued repeatedly for bad outcomes, then we have waited too long and have failed to protect the patients, the hospital, and the aging surgeon.
This article was started because of the senior author’s (RTS) belief that it should be possible to promulgate an efficient, valid, reliable assessment protocol that could be used widely, would provide objective data on relevant human performance, and that would generate data that correlated with patient outcomes and delivery of safe medical care. It also seemed curious to the author that this problem had not been addressed more fully previously. As the following literature review demonstrates, it has, at least to a degree. However, none of the proposals or methodologies has been adopted widely, and most physicians (surgeons and others) are not familiar with the literature. While this article deals primarily with surgeons because they have been studied most widely, and with pilots for whom assessment using cognitive and manual function is more advanced than it is for surgeons, the principles apply to physicians in all specialties. This article provides a fairly comprehensive review of existing literature, as well as a proposal for development of a routine assessment protocol for physicians as we age.
Methods
Medline and PubMed databases were searched for articles from 1960 to the present. Search terms included aging, aging physician, geriatric physician, and aging surgeon. A total of 14,562 articles were identified, and abstracts of potentially relevant articles were reviewed. Many anecdotal opinion articles were excluded, although a few particularly insightful such commentaries were referenced. Selected additional important references cited in the selected articles also were reviewed, as were articles on aging pilots that appear to be relevant to the problems of aging physicians.
In addition, common neuropsychological tests were reviewed, and a protocol was suggested for longitudinal screening of cognitive function in physicians. Additional research and correlation of cognitive data with physician performance are recommended to determine whether and how cognitive test results can be used in evaluating aging physicians and predicting clinically relevant changes in function.
Review of the Literature
In a 1983 commentary in JAMA, Bunkin offered a personal account of his feelings of unpreparedness when he had reached age 60 and had not given any previous consideration to retirement. 1 He discussed the need for surgeons to be alert for subtle and insidious signs of degradation in their surgical performance. At approximately age 67, Bunkin reviewed his operative times and noted that each operation had taken approximately 20 minutes longer than similar procedures had required during the previous year. He stated that “technical know-how was intact, but the movements in sequence were each several seconds longer”. He described omitting a step in an operative procedure that he had performed more than 3000 times over a span of 36 years. Dr. Bunkin stated that he found these changes in his performance as a surgeon “unexpectedly scary,” and they initiated his consideration of retirement. He added “if you are in good health, the prognosis for a fulfilling retirement is very good”.
Greenfield and Proctor conducted a survey to examine the attitudes and practices surrounding retirement in a group of senior surgeons in 1993. 2 Their survey was mailed to 882 members of the American Surgical Association, and 659 (75%) returned their responses. The survey contained questions related to age which was broken down into >40, >50, >60, and >70 years; plans for retirement; work load or operative activity; and withdrawal from privileges (age, disability and peer review). Forty percent of all responders reported no retirement plans. In the age group 40-50 there were 46 responders and only 3 reported having a plan for retirement. In the 60-70 age group there were 211 responders; 49 reported having a plan for retirement, and 75 reported no plan for retirement. The limitations of this study are age and gender. Most of the respondents were of senior age and male. The survey results showed some evidence of retirement planning with advancing age, but not among all surgeons. The surgeon responders also were asked to describe their level of surgical activity, and the majority reported a decline in their workload between ages 60 and 70. However, 40% reported a customary workload past 60 years of age. Seventy-two percent of the responders thought that decisions regarding when privileges to operate are withdrawn should be based on peer review or the onset of a disability. The authors suggested that more positive attitudes towards retirement were needed. They advocated the development of methods for performance evaluation that would reflect the surgeon’s response to his or her personal aging. They also pointed out that “both personal and institutional problems can arise when surgeons continue to practice despite limitation of aging”.
In 1993, as president of the International Society for Cardiovascular Surgery, Lazar Greenfield delivered his presidential address titled “Farewell to Surgery”. 3 The focus of his discussion was the performance of aging surgeons and the impact of the end of mandatory retirement. He noted that there was little known about behaviors of aging surgeons but that there had been a great deal of interest in older workers in general with regard to productivity and job safety. He pointed out that the science of applied ergonomics defines an “older worker” as anyone over the age of 40 and stated that studies have shown that age 40 is the time of onset of performance slowing, decrease in ability to learn new skills, decline in motivation and creativity, and one’s ability to cope with stress and change. 4,5 Greenfield provided an excellent review of the physiology of aging highlighting its complexity and its effect on multiple organs and systems in both males and females. He discussed the decline in vision, hearing, mobility and dexterity that occurs in everyone over time; and he pointed out that physical strength and joint flexibility diminish, as well as overall elasticity of leg and arm movements. He commented that studies have shown reduced motion of the lumbar spine and slowing of nerve conduction in the elderly. He cited one report that showed an age-related reduction in the ability to diffuse lactate during strenuous exercise beginning as early as age 30, thus contributing to overall decreased endurance. 6
Dr. Greenfield discussed the Age Discrimination in Employment Act of 1967, 7 which prohibits mandatory retirement based on age. He pointed out that there were nevertheless a few professional organizations that still mandated retirement at a specific age, including the Federal Aviation Administration (FAA) (age 65) and the Federal Bureau of Investigation (FBI) (age 57). The US Congress has approved mandatory retirement ages for several other professions that involve public safety, including air traffic controllers (age 56), lighthouse operators (age 55) and national park rangers (age 57). 8 He discussed the FAA’s previous mandatory retirement at age 60 for commercial pilots and the concept of calculated risk. 9 The Fair Treatment for Experienced Pilots Act was signed into law in 2007, raising the mandatory retirement age for pilots to 65. 10,11
A pilot’s performance and outcome can affect a larger number of people, whereas a surgeon’s ability to perform poses potential risk to one person (at a time). Moreover, Greenfield recognized that historically surgeons and all physicians have focused on the knowledge and competency that are measured in order to begin the practice of medicine, highlighting that little attention has been paid to when it is time to stop practicing.
Greenfield discussed the federal highway safety standard that requires licensed drivers to undergo reexamination at least every 4 years. This examination includes tests of visual acuity and knowledge. He commented on the sensitive issue of age discrimination in the United States and the need to avoid bias against the elderly even when it comes to operating a motor vehicle, pointing out that “the public can legitimately ask why is it that the older individual must take examinations to be able to drive but not to be an operating surgeon”. He pointed out that many public policies use age-based criteria such as eligibility for Medicare and Social Security, tax benefits, and the federal laws that do not allow an individual to get a driver’s license before age 16 or purchase alcoholic beverages before age 21.
Greenfield discussed cognitive function in older physicians and a computerized neuropsychological screening battery, Assessment of Cognitive Function, developed by Powell. 12 Greenfield suggested that cognitive and functional testing should be performed under controlled situations so that objective data can be used in establishing performance criteria. He stated that more research was needed and that longitudinal rather than cross-sectional studies should be used to gain better data regarding aging surgeons and performance.
Trunkey and Botney’s review article published in 2001 compared how surgeons assess age-related competence and how commercial aviation assesses pilots’ competency. 13 The authors cited a publication by James B. Stewart in which he stated “5 out of every 100 doctors are so incompetent, drunk, senile or otherwise impaired that they should not be practicing medicine without some form of restriction”, and that the “quality-improvement process is seriously flawed and has not been effective in weeding out incompetent physicians”. 14 The authors pointed out that Stewart’s opinions were expressed in a book, and were not the conclusions of a peer-reviewed study.
The American Heritage Dictionary defines competency as “the quality or condition of being legally qualified, eligible or admissible.” 15 The authors pointed out that skills, ability and performance are not the same as competence and lack legal implication. They stated that historically the medical profession has used volume performance and outcomes as a measure of competency; however, “neither have the same meaning nor have they withstood scientific scrutiny”.
Trunkey and Botney pointed out that determining a surgeon’s competence as he or she ages is complex and multifactorial. They discussed factors including reaction time, decision-making time, cognitive decline, decline in psychomotor skills, dysfunctional or anti-social behavior, substance abuse and the failure to keep learning. The authors reviewed a book by Douglas H. Powell titled “Profiles in Cognitive Aging” that outlined multiple changes in cognitive function, physiology and psychometrics that occur with aging. 12 Powell developed a series of assessments called the MicroCog test to measure and document these changes. The tests were not designed to evaluate expert knowledge but rather to examine attention, reaction time, numerical recall, verbal memory, visuospatial facility, reasoning and mental calculation. Powell reported MicroCog test results of 1002 physicians and 581 normal subjects which showed progressive, age-related decline in both groups. Trunkey and Botney reported that the tests of reactivity, simple reaction time and choice reaction time were useful in measuring decline in surgeon performance. They suggested that “in general, a simple reaction time does not consistently correlate with performance” and that “choice reaction time may discriminate experts from novices but does not do so consistently”.
Trunkey and Botney also discussed the Federal Bureau of Investigation’s (FBI) test for assessing reaction time and performance, which utilizes a simulator that presents threat situations to evaluate an agent’s accuracy and reaction time. 16 The authors pointed out that rapid decisions and action are required of trauma surgeons, much like FBI agents in the field, but in medicine, there was no simulator to test decision making ability of surgeons. Today, simulators with such capabilities exist; but they have not been studied sufficiently to determine their ability to validly detect age related decrement in surgical ability. Hence, they are not in wide use or relied upon for assessment of surgeons.
The authors also compared surgery and aviation. Newly trained surgeons have their competency assessed through board certification in their specialty following successful completion of a residency program. Board certification is regarded as a fairly good assessment of a surgeon’s competence. The authors pointed out that many surgeons are required to re-certify every 10 years. Part of recertification includes an affidavit from the chief surgeon at his/her institution stating that the applicant for recertification has not been treated for drug/substance abuse; has not had his or her license revoked or restricted; has not been censured by the American College of Surgeons, their hospital or state society; and has not been convicted of a felony. The statement does not include an endorsement of surgical competency; and if it did, making a negative judgment would be challenging medically and legally for the chief surgeon and hospital because of the absence of standard objective instruments to evaluate an aging surgeon’s performance and of consensus on how the assessment results should be used.
Pilots undergo medical certification every six months and are subject to random blood and urinalysis screening for substance abuse. The pilots’ medical evaluation is extensive, and there are several conditions that can disqualify a pilot from flying including but not limited to: insulin dependent diabetes, angina, coronary artery disease, myocardial infarction, heart valve replacement, heart transplant and cardiac pacemaker placement. Pilots undergo simulator testing at least annually, and commercial airline pilots are subjected to “unannounced checkouts” by the FAA. Retirement at age 60 was mandated by the FAA 9 until 2007 when the age was raised to 65. 11
The authors discussed the need to address impaired competence late in a surgeon’s career. They proposed medical evaluation and certification beginning at age 50, then every two years to age 60, then annually after age 60. They suggested employing MicroCog testing, random breathalyzer and urine screening, and increasing the frequency of recertification.
In 2006, Waljee et al examined surgeon age and operative mortality in the United States. 17 Using the national Medicare files, they reviewed the operative mortality in more than 400,000 patients who had undergone one of 8 surgical procedures between 1998 and 1999. Operative mortality was defined as death in the hospital or within 30 days of surgery. They categorized surgeon age into 4 groups: ≤ 40 years, 41-50 years, 51-60 years and > 60 years of age. Mortality rates of 8 different surgical procedures were examined including: pancreatectomy; coronary artery bypass grafting; carotid endarterectomy; esophagectomy; cystectomy; lung resection; aortic valve replacement; and aortic aneurysm repair. In the analysis of surgeon age and operative mortality, the authors reported that Odds Ratio (OR) and Confidence Interval (CI) adjustments were made for each surgery performed. They found that surgeons over 60 years of age who had low procedure volumes had higher mortality rates compared with younger surgeons. Additionally, they found that the age of a surgeon was not an indicator of operative risk.
Hummer pointed out that there is no age-based definition of physical competence in the United States. 18 He stated that US case law exists primarily due to the Age Discrimination in Employment Act (ADEA) of 1967, which protects individuals over 40 years of age from employment discrimination based on age. 7 He stated that according to the Federation of State Medical Boards (FSMB) in the US, advanced age is not considered a disqualification for an unrestricted medical license in any state. However, he pointed out that some states do have a minimum age requirement for obtaining licensure. Dr. Hummer discussed the difficulty in the objective analysis of a surgeon’s skill. He pointed out that objective tools such as computer generated “virtual reality” and live animal surgery that examine manual dexterity, surgical decision making and procedure-specific technical skills were in early stages of development in 2007. In assessing surgical competence and when it ends, Hummer suggested that physicians should “police our own” and determine the criteria rather than being subject to a regulatory entity outside of medicine making such determinations. Dr. Hummer added that from a surgeon’s perspective, retiring from the practice of surgery can be considered a negative life transition due to the loss of the role of being a surgeon.
Blasier reviewed the effects of aging on the physical and cognitive performance of surgeons. 19 He stated “for awhile, advance in age is accompanied by increasing wisdom: however, eventually even mentation declines”. He considered the question of whether or not older surgeons make “difficult surgery more difficult or risky surgery riskier?” Dr. Blasier discussed the number of years between a surgeon’s completion of medical training and residency and the chronological age of the practicing surgeon. Considering that most surgeons complete residency and enter practice in their early 30’s, Blasier suggested that “remoteness of any surgeon’s initial education can be estimated as his/her age minus 31 years”. Moreover, he pointed out that the content of medical education and training has changed over time. As an example, he stated that in the practice of orthopedic surgery nearly “every treatment technique taught 25 years ago has been abandoned and replaced”. Knee and hip replacement procedures have changed dramatically in recent years. Blasier stated that any surgeon 66 years or older would not have received any training in these procedures during his/her residency and that surgeons 40 years of age and older would not have been trained in residency in the current techniques of joint replacement. He pointed out further that all surgical subspecialties undergo changes in treatment and techniques which require changes in the surgeon’s skill set. Blasier noted that a few studies have examined the relationship between the surgeon’s age and the time when medical training and residency was completed. 20,21,22,23 Research over the past few decades has documented that aging causes deterioration in physical and cognitive skill, 12,24,25,26,27,28,29 and surgeons need to be aware of signs of declining performance.
He stressed that, in general, surgeons have been reluctant to plan for retirement; and he cited Rovit’s reasons that surgeons forestall consideration of retirement, including fear of death, lack of self-esteem and resistance to change. 30
Blasier also discussed that in the United States there is no federally mandated retirement age for surgeons. In the United Kingdom, surgeons in the public health service were mandated to stop performing surgery at age 65 and retire from practice at age 70, until 2011 when the United Kingdom phased out mandatory retirement at age 65 for all professions. 31,32,33
In 2008, researchers from the University of Michigan reported results of their study on neuropsychological test performance in surgeons. 34 They compared the performance of 4th year medical students at the University of Michigan who had matched into their surgical residency program (n = 21) with a group of practicing surgeons (n = 308) between the ages of 45-75 years who attended the American College of Surgeons Clinical Congress between the years 2001 and 2004. The surgeon group was divided further into 2 groups: ages 45-60 (n = 139) and ages 61-75 (n = 169). They used the Cambridge Neuropsychological Test Automated Battery (CANTAB) to examine psychomotor and visual-spatial abilities in the participants. 35, 36 They found a decline in performance with aging. The medical student group performed better than the surgeon group aged 45-60, and the surgeon group aged 45-60 performed better than the surgeon group aged 61-75. They reported that all three groups performed better than their normative control groups. They concluded that psychomotor and visual-spatial abilities do not improve over time with surgical training and that they are present before one enters medical practice. Limitations of the study included the inability to assess the impact of surgical experience on procedural learning ability and the limited size of the medical student group. The authors pointed out that there has been increased interest in the impact of aging on procedural learning abilities, and they recommended further longitudinal studies. We also note another shortcoming: There is no evidence in this study to determine the relationship between the psychomotor and visual-spatial skills studied and surgical outcomes.
Bieliauskas et al examined age, cognition and retirement in a group of senior surgeons. 37 They titled their study using the acronym CCRASS (cognitive changes and retirement among senior surgeons). Their study was designed to identify parameters of cognitive aging in senior surgeons and the relationship to a surgeon’s decision to retire. The CCRASS study included a computerized battery of cognitive tasks that focused on visual learning, sustained attention and reaction time. Participants also were asked to complete a self-report survey defining their surgical practice and plans for retirement. The study participants were surgeons 45 years of age and older who attended the annual meeting of the American College of Surgeons from 2001-2006. 359 surgeon volunteers (330 males, 29 females) were tested between 2001 and 2005 and 94 participants were retested in 2005 and 2006. 294 participants completed the survey. The mean age of participants was 61.4 years of age, and 62 surgeons reported that they had already retired.
The authors’ computerized cognitive testing of the participants utilized 3 subtests from the CANTAB battery of tests, including PAL (paired associates learning), RTI (reaction time), and RVIP (rapid visual information processing). The CCRASS study showed an age-related decline in all cognitive tasks measured. The authors reported a significant relationship between a surgeon’s subjective awareness of cognitive changes and retirement status, but no significant relationship between subjective and objective measurements of cognition. The authors concluded that their study “supports the need for development of measures of functional aging that can be used over time to assist with decisions about retirement that are in the best interest of surgeons and the patients that they serve”.
In a report published in 2010, Drag et al discussed the results from the CCRASS study and their examination of objective cognitive function in senior surgeons in relation to age and retirement status. 38 The authors administered a computerized cognitive testing battery to a group of surgeons (n = 294), both practicing and retired. Based on age, the participants were divided into two groups: age <60 (n = 126) and age 60 and above (n = 168). Cognitive testing measured 3 tasks including attention, reaction time, and visual learning and memory. The results were compared between the two groups. The authors found that the majority (61%) of practicing surgeons age 60 and older performed within the same range as the surgeons less than 60 years of age on all three tasks measured. Seven of the 108 surgeons over age 60 performed significantly below the younger group of surgeons on more than one task. The authors also found that “age was negatively correlated with cognitive performance” and that the results showed variability in cognitive performance in the group of older surgeons. The authors concluded that “age alone can not predict cognitive competence”.
An article by pediatric neurosurgeon and past president of the AMA Peter Carmel, M.D., discussed aging physicians and retirement. 39 Carmel pointed out that traditionally, physicians “keep on working as long as we can” for multiple reasons including personal satisfaction, emotional stimulation and economic or financial need. He stated that retirement for physicians has generally meant reducing office hours and/or reducing the number or types of surgeries performed. At age 75, Dr. Carmel stated that he would find “slowing down” difficult in his surgical practice. He pointed out that the high costs of medical malpractice premiums and the cost of licensing are factors influencing surgeons’ need to continue working full time. Internists and general practitioners often continue working full time since “there is simply no one to take over their patient load”. Carmel pointed out that Americans are living longer than previous generations, and this has resulted in a larger population of senior citizens. He stated “this country is facing a shortage of doctors to meet the needs of our growing and aging population”. He stated further that senior physicians will “play a crucial role in filling this gap” and that the AMA will have an active role in “accessing, relicensing and credentialing senior physicians in the future”.
An article in the New York Times by Laurie Tarkan addressed concerns regarding physician performance with advancing age. 40 She stated that approximately 20% of physicians were over 65 years of age in 2008 according to AMA data, and that “baby boomers” are more likely to delay retirement at age 65 due to financial pressures and the current state of our nation’s economy, the fact that social security benefits may be in jeopardy and the need for seniors to purchase secondary personal health insurance in addition to Medicare. She pointed out that there is no standard measurement of competency and neuro-cognitive abilities in surgeons and that physicians and colleagues often do not recognize problems or changes in the aging physician’s performance until something bad happens, such as the death of a patient. She advocated medical and cognitive evaluation on a regular basis similar to the FAA’s requirements for pilots, pointing out that renewing medical licensure to practice requires completion and documentation of CME’s only. She advocated the need for more research to define an assessment process.
The American College of Surgeons Board of Governors Committee on physician competency and health published a set of guidelines “Being Well and Staying Competent: Challenges for the Surgeon” in 2013 that is available to members of the American College of Surgeons at www.efacs.org. The aging surgeon and determining when it’s time to retire from practice was addressed by Garrett and Kaups in a 2014 Bulletin of the American College of Surgeons. 41 Their article is an excerpt from the ACS guidelines published in 2013 as stated above. The authors discuss the impact of aging on cognition and performance of surgeons and remind us that there is no mandated retirement age for surgeons, unlike pilots and FBI agents. They discussed the role of objective assessment of surgical skills and judgment; the reliability of self-evaluation of skills and judgment; a surgeon’s opinion on when to retire; and the small number of surgeons who plan for retirement. Garrett and Kaups point out that surgeons, as a profession, need to assure the public of the delivery of safe care. The authors recommend psychomotor assessment and periodic medical evaluation of practicing surgeons between the ages of 62 and 75 years. They reported that Stanford University Medical Center recently enacted a policy that requires physicians on staff age 75 and older to “have a physical examination, cognitive screening and peer assessment of…clinical performance every two years”; and that “if the findings point to potential concern for patient safety, the service chief and credentials committee will, on a confidential basis, consider the results and recommend further evaluation as necessary”. 42
Additionally, Garrett and Kaups recommended that surgeons plan for the transition from active surgical practice both professionally and financially.
Sun reported in 2014 that, in addition to Stanford’s enactment of age-related (≥75 years) screening of physical health and cognition, a few other institutions across the United States have implemented similar programs, including Driscoll Children’s Hospital in Corpus Christi, Texas and University of Virginia Health System. 43 Sun commented that according to national health care consultant Jonathan Burroughs, “only 5% of more than 900 US hospitals with which he has consulted have any active policy for screening older providers”. Sun pointed out that aging physicians and retirement age have been addressed outside the United States, as well. He stated that the mandatory retirement age for physicians in Pakistan is 70 years, and in India retirement is mandated at age 65. In Italy, surgeons at age 65 are obligated to continue working in the health care sector rather than retire.
In 2014, Katlic and Coleman discussed the program developed and implemented at Sinai Hospital in Baltimore, Maryland, designed to evaluate objectively a surgeon’s physical and cognitive function. 8 The aging surgeon program is a two-day comprehensive, multidisciplinary and objective evaluation of a surgeon’s cognitive and physical function. The stated goals of the program include protecting surgeons from unreliable assessment of competency or cognitive ability; identifying reversible disorders whose treatment might improve function; aiding surgeons in the decision to retire; protecting patients from unsafe surgeons; and protecting surgeons and hospitals from liability risks. Their multidisciplinary team includes experts in neurology, neuropsychology, surgery, physical medicine and rehabilitation, geriatric surgery, internal medicine, ophthalmology, law and ethics. The program requires a pre-screening medical history. On day one of the program, the surgeon undergoes physical and neurological examination, a PT/OT (physical therapy and occupational therapy) evaluation of reaction time, fine motor function, coordination and other parameters, and neuropsychological testing. On day two, additional neuropsychological and PT/OT evaluations are performed, as well as an ophthalmologic evaluation and exit interview. The authors stated that the confidential final report includes only objective findings and offers no recommendations regarding the surgeon’s operating privileges or retirement. They added that the reports are sent in an encrypted, locked electronic file to whoever paid for the program (i.e., the surgeon, hospital or chief of surgery). The authors point out that their program balances patient safety and liability risks while maintaining the dignity of the surgeon and his or her value to society. The time for aging physicians to stop practicing due to competency issues remains a current topic in the literature. 44,45
Reviewing how performance issues have been addressed for pilots provides evidence and suggests strategies that might be applied to physicians. An article by Carroll published in 1992 and sponsored by the Air Force Research Laboratory/RHA Warfighter Readiness Research Division (Mesa, AZ) discussed situational awareness (SA), its definition and how to measure it objectively. 46 At the time of that publication, Carroll reported that an Air Staff process action team had been created to address these questions. He stated that defining SA had been approached from many directions and resulted in different viewpoints from operators and technicians. He stated that the Major Command (MAJCOM) training programs did not have a focus on SA. The programs that were in place included: (1) Cockpit Attention Task Management (CATM); (2) Aircrew Attention Awareness Management Program (AAAMP); (3) Cockpit Resource Management (CRM); (4) Mission Oriented Simulator Training (MOST); and (5) Aircrew Coordination Training Program (ACT). He stated that emphasis had been placed on task management, spatial orientation, gravity-induced loss of consciousness (GLOC) and attention. He described SA as “appearing more as a collateral issue than a goal” of these programs, and Carroll proposed that “SA” should be “the umbrella under which applicable human factors research and training are pursued”. He suggested further that “human factor should be ordered into a supporting role under SA to support mission accomplishment”. He opined that an individual pilot’s experience and capabilities combine to support and maintain a given level of SA, and that SA is affected by the complexity of the mission, intensity of the threat, and the amount of distraction in and out of the cockpit.
He provided the definition of SA that was developed by the Air Staff group as “a pilot’s, or aircrew’s continuous perception of self and aircraft in relation to the dynamic environment of flight, threats, and mission, and the ability to forecast, then execute tasks based on that perception”. Moreover, “it is problem solving in a 3 dimensional spatial relationship complicated by the fourth dimension of time compression where there are inherently too few givens and too many variables”. He stated that SA represents cumulative effects of everything that an individual is and does as related to flight performance. Carroll stated “since the mission of the Air Force is not to maintain spatial orientation, but to fly, to fight and win”, focusing on SA should provide a direction to that end point. Carroll suggested further that SA should be the focus of further research with an emphasis on how to “define, measure, select for and train human factors and SA”. These concepts and concerns seem directly applicable to surgeons and their activities and mission.
In 1997, researchers Bell and Waag at the Armstrong Laboratory Aircrew Training Research Division in Mesa, Arizona used observer ratings to assess situational awareness (SA) in tactical air environments. 47 The authors measured SA in operational fighter squadrons and in multiship air combat simulations. The research focused on 3 issues: pilots’ definition of SA; the degree to which pilots can judge their fellow pilots’ SA reliably; and whether a relationship exists between their judgments and performance. For the purpose of their study, the authors used the Air Staff definition of SA as reported by Carroll. 46 They pointed out, however, that other definitions of SA were found in the literature. 48,49,50 The authors recognized that SA is complex and combines processes, tasks and linkages between them; and they stressed that it is difficult to separate SA from the aspects of skilled performance that have been used to determine proficiency in combat. Bell and Waag’s research focused on whether pilots could use SA to classify fellow pilots validly. The authors, with the assistance of subject-matter experts (SMEs) and instructor pilots, developed a list of 31 behavioral elements of SA. They divided these 31 behaviors into 8 categories, including mission; tactical game plan; tactical employment-general; tactical employment-BVR; tactical employment-WVR; information interpretation; and system operation. They limited their study to mission-ready F-15C pilots. The authors developed 4 separate instruments to measure SA. The first instrument asked the pilots to define SA and, based on their personal definition of SA, to rate the importance of each of the 31 elements of SA using a 6-point Likert scale. The authors designed 3 Situational Awareness Rating Scales (SARS) with 3 different perspectives: self, peer and supervisory. Data were collected from 238 pilots from 11 squadrons at 4 different Air Force bases. Of these, 206 pilots provided their personal definition of SA.
The results showed significant agreement on the definition of SA and the ranking of the 31 elements in terms of importance. The analysis of the peer and supervisory SARS showed that pilots could classify fellow pilots reliably and that squadron ratings of SA correlated with mission success in simulated air combat missions. The authors proposed that their measures of SA could be used in a squadron’s operational training environment by both peers and supervisors to assist in classifying mission-readiness. Similar strategies seem appropriate for surgeons but have not been explored in depth so far.
Hardy and Parasuraman at the Catholic University of America reported their comprehensive study of cognition and flight performance in older pilots. 51 The authors reviewed the literature from 1939 to 1996 on cognitive aging studies in pilots, which found age-related differences in pilots’ perception, motor skills, memory, attention and problem solving/decision making skills. They reported that the measures being used, flight simulation and accident rates, were minimally predictive of flight performance, and they pointed out that the majority of studies related to pilot cognition and/or flying performance used a cross-sectional design rather than a longitudinal one. The authors suggested that the modernization of highly automated aircraft has resulted in qualitative and quantitative changes in cognitive demand. Moreover, these changes in technology have created different types of “human error”, noting that 50-75% of fatal aircraft accidents are due to human error. 52 Many surgeons also are faced with technical challenges such as image guidance, robotic surgery, limited-access surgery, stereotactic surgery and other approaches; but these have not been incorporated into a formal evaluation assessment for aging physicians.
Hardy and Parasuraman provided a comprehensive review of the theories of cognitive aging, and the impact of cognitive aging on pilots’ flight performance. They noted that these theories were based on cognitive performance of non-pilots, both young and old. Theories they discussed included fluid versus crystallized intelligence; decline in inhibition; reduction in processing resources; cognitive slowing; and disuse or under-utilized cognitive abilities. They also questioned whether laboratory tests (static) of cognitive function are representative of real-world (dynamic) cognitive processes. Hardy and Parasuraman cautioned that simulated (SIM) studies are good in the assessment of flight proficiency, but flight performance is measured better by the number of incidences, violations and accident rates. The authors suggested that more research is needed and should expand upon the typical measurements of flight performance and cognitive skill. In other words, “examine higher order age related differences in domain-dependant knowledge and to link these with pilot flight performance”. Such considerations make sense for doctors, too. However, waiting for “incidences, violations, and accident rates” is not ideal. It would be better to find valid predictors before a plane crashes or a patient dies.
Causse et al discussed cognitive aging and flight performance in general aviation pilots. 53 They stated that general aviation pilots are not considered professional pilots and thus are not regulated by the FAA’s age 60 (now 65) rule; 9 no age limit is specified. The authors examined how age-related cognition impacts pilot performance and ability to make weather-related decisions. Their study focused on three components: cognitive assessment (executive functioning); age of pilot and years of experience; and flight performance. They reported that cognitive assessment was superior to chronological age in predicting pilots’ flight performance due to flight experience and the variability of individual aging effects on cognition. It would not be surprising to find similar results among surgeons, but more research is needed.
An article titled
Assessment of Cognitive Functions
Neuropsychologists specialize in the science and clinical practice of cognitive evaluation and identifying brain-behavior relationships. Neuropsychological assessment involves the testing of multiple areas of cognitive functioning via a battery of tests. This often requires periods of 4-6 hours or longer. The purpose of this testing is twofold: to assess for current functional impairments as compared to an appropriate normative sample generally, and to collect idiographic cognitive functioning data against which an individual’s future test results might be compared.
While we believe that cognitive assessment of physicians is important, such lengthy testing may not be necessary for every evaluation. Nevertheless, if physicians are to undergo cognitive testing routinely, the evaluation protocol must be not only efficient but also standardized, valid and reliable. With these considerations in mind, we have attempted to construct a protocol that assesses, as efficiently as possible, those cognitive functions that we believe to be most important to the practice of medicine, utilizing tests and techniques that are well understood and well validated.
A starting point in determining the most germane cognitive domains was to review the existing neuropsychological literature on physician competency, fitness for duty and impairment. Domains identified in the research as important include various forms of attention, auditory and visual memory, fluid reasoning, quantitative reasoning, verbal reasoning, verbal fluency, visual-spatial reasoning, and processing speed. 53-57 Within the auditory and visual memory domains in particular, we have selected tests that assess working memory, short-term memory and long-term memory. Tests assessing each of these functional cognitive domains are identified below. Our suggestions on how they should be used and studied follow the descriptions of the tests.
Auditory Attention and Concentration
Wechsler Adult Intelligence Scale 58 – Digit Span subtest (<5 minutes) – The digit span subtest asks the examinee to recall and repeat accurately increasingly long strings of digits from memory. It includes three tasks designed to test auditory working memory and attention: 1) a digits forward task; 2) a digits backward task; and 3) a digits sequencing task. The first is a test of simple auditory attention, while the latter two are tests of more complex attention and working memory. Taken as a whole, the subtest has high internal validity, high test-retest reliability, and good sensitivity to attentional and auditory working memory impairments. 59
Wechsler Adult Intelligence Scale IV Letter-Number Sequencing (<5 minutes) – Similar to the Digit Span subtest, the Letter-Number Sequencing subtest asks examinees to accurately recall, sequence and repeat a string of mixed letters and numbers from memory. It is a test of complex attention, working memory and mental control. It has high internal validity, high test-retest reliability, and is sensitive to attentional impairments.
Paced Auditory Serial Addition Task (Gronwall version) 60 (<10 minutes) –The PASAT is a challenging task that asks examinees to add mentally the last two digits that they heard spoken and say the sum, while simultaneously listening for and remembering the next digit in the sequence. The digit presentation increases in rapidity over time. It is a test of divided auditory attention, sustained attention, mental arithmetic, working memory and information processing speed. 61 It is a well-validated test with very high internal validity, adequate to high test-retest reliability and is sensitive to attentional impairments.
Visual Attention and Concentration
Ruff 2 and 7 (5 minutes) 62-63 – The Ruff 2 & 7 asks examinees to scan quickly and accurately strings of numbers or letters sequentially while marking only the 2’s and 7’s. It is a test of sustained visual attention, selective attention and visual discrepancy analysis with high internal validity, adequate to high test-retest reliability, and good sensitivity to attentional impairments.
Symbol Digit Modalities Test 64-65 (Oral and Written Modalities) (3 minutes) – The Symbol Digit Modalities Test asks examinees to quickly and accurately record a series of digits associated with a set of symbols listed in a key at the top of the test form. It includes separate written and verbal recording components. It is a test of divided attention, simple visual attention and processing speed. It is sensitive to attentional impairments and includes a comparison of processing speed abilities between the oral and written presentations. Test-retest reliability has been found to be somewhat variable, however, and it is known to be sensitive to examinee fatigue.
Auditory/Verbal Memory (immediate and delayed)
Wechsler Memory Scale 66-67 – Logical Memory I and Logical Memory II (<15 minutes) – These tests ask examinees to listen to several stories and then repeat them back verbatim. They measure immediate and delayed recall of complex auditory information. Logical Memory I (immediate recall) has high internal consistency, while Logical Memory II (delayed recall) shows adequate internal consistency. Test-retest reliability for both is somewhat variable. Both show good sensitivity to memory disturbances in a variety of patient populations.
California Verbal Learning Test 68,69 (immediate and delayed recall) (<20 minutes) – This test asks examinees to listen to a series of words and repeat them back over multiple trials. It measures encoding capacity for auditory information, immediate and delayed recognition of words, retrieval efficacy, learning strategy, acquisition rate, and proactive and retroactive interference. Test-retest reliability is high for overall scores. It is sensitive to memory disturbances in a variety of patient populations.
Visual Memory (working, immediate and delayed)
Wechsler Memory Scale IV 66 – Symbol Span Test (<10 minutes) - This test presents examinees with a series of increasingly long sequences of nonsense designs which they are required to recognize and select from a lineup of similar symbols. It tests visual memory span and visual working memory. Performance is scored for correct selection of symbols in the correct order after a brief presentation. The test has shown good discriminative sensitivity to dementia and brain injury versus normal patients. 70
Rey-Osterrieth Complex Figures Test 71-72 (8-15 minutes) – This test briefly presents examinees with a complex 2-dimensional design which they must first copy accurately and then later draw from memory. It assesses visual-spatial constructional ability, as well as immediate and delayed visual memory. Although there are different forms with different scoring systems, it has adequate to high inter-rater and intra-rater reliability for total scores. The cognitive operations required for adequate performance include visual perception, visual-spatial organization, motor functioning, and, on the recall conditions, immediate and delayed visual memory. It is a reliable measure of visual-constructional ability (copy phase) and memory (recall phases).
Visual Processing Speed
Ruff 2 and 7 – see above
Symbol Digit Modalities Test – see above.
Trail Making Test A and B 73 (<5 minutes) – This is a visual scanning, sequencing and processing speed test consisting of two parts: Part A requires simple scanning and accurate sequencing of numbers, while Part B adds an alternating number-letter sequencing component. It is a test of mental flexibility, impulse inhibition and executive control. Test-retest reliability is adequate, with good inter-rater reliability and good sensitivity to attentional impairments. 70
Auditory and Verbal Processing Speed
Paced Auditory Serial Addition Task (see above)
Controlled Oral Word Association Test (<5 minutes) – This test asks examinees to produce orally, and spontaneously, as many words as they can that begin with a certain letter. It has high internal validity, high test-retest reliability, good inter-rater reliability, and known sensitivity to traumatic brain injuries, dementing illness and frontal lobe impairments. 74-76
Symbol Digit Modalities Test (see above)
Visual Constructional Skills
Rey Complex Figure Test (see above)
Clock Drawing Test 77-78 (<2 minutes) – This test asks examinees to draw from memory an analogue clock displaying a specific time. It is a screening measure for visuoperceptual and visuospatial impairments, working memory, executive and motor functions. Although there are numerous scoring methods in use, the various protocols tend to be highly correlated with one another. In the dementia and brain injury contexts, this measure shows moderate correlations with measures of temporal orientation, visual-spatial/visual-constructional skill, and with measures of executive functioning. Given its sensitivity and specificity in dementia assessments it is widely used as a quick “scan” for cognitive impairments in adults. 70
Quantitative Reasoning
Wechsler Adult Intelligence Scale IV 58-59,70 – Arithmetic subtest (<10 minutes) – This test consists of arithmetic problems presented in an auditory story format in order of increasing difficulty, requiring the examinee to discern relevant quantitative information and apply the correct mental calculation to solve a story problem. The task taps mental manipulation, concentration, attention, working and short-term memory and numerical reasoning. It has been shown to be sensitive to an array of cognitive problems, but is not considered a reliably specific indicator of any one set of cognitive impairments. 79
Higher Level Executive Functions (multi-tasking, cognitive flexibility, planning and problem-solving)
Paced Auditory Serial Addition Task (see above)
Rey Complex Figure Test (see above)
Trail Making Test A and B (see above)
Controlled Oral Word Association Test (see above)
Abbreviated Category Test 80 (20 minutes) – This test presents examinees with a visual puzzle and requires them to deduce a numerical solution based on very limited feedback. It measures abstraction or concept formation ability, mental flexibility in the face of complex and novel problem solving, and capacity to learn from experience. This abbreviated version functions in a similar fashion to the full length version used in the Halstead-Reitan in terms of its psychometric properties and discriminative ability. It has high reliability for samples of normal and brain-damaged adults with variable test-retest reliability. It is sensitive to a variety of brain disturbances and is almost as sensitive as the full Halstead-Reitan Neuropsychological Test Battery 65 in identifying the presence or absence of neurological damage. 70,79,81
Stroop Test 82-83 (5 minutes) – This test presents mixed written and colored stimuli to examinees and requires them to rapidly read and articulate a controlled, often counter-intuitive response. It measures cognitive control, mental flexibility and suppression of habitual responses. It has moderate to high internal validity and high test-retest reliability. Its specificity with respect to certain impairments, especially frontal lobe injuries, is high, although its sensitivity is variable.
Conclusion
Assessing performance in aging physicians is important for patient safety. Disciplinary actions occur much more frequently against doctors who have been out of medical school for 40 years than against those who have been out for 10 years. 84 Moreover, performance on various outcomes declines as physicians age. 20 Unfortunately, there appears to be no relationship between an older physician’s assessment of his/her cognitive skills and objective cognitive measures; 37 and older surgeons perform poorly compared with younger surgeons when performing complex operations. 17 Nevertheless, even the most recent efforts by the AMA have failed to result in specific recommendations. 85 Having valid, reliable objective predictors of performance decrement would be invaluable to those of us responsible for credentialing, to patients, and to aging physicians themselves. While several approaches have been proposed for cognitive assessment of aging physicians, none has been adopted widely. Perhaps more importantly, none has been applied over time to determine whether cognitive abnormalities detected are well-compensated issues that have been present for a lifetime or new deficiencies related to aging. It is also possible that youthful physician performance may be at the upper end of the normal range and that “normal” results at the lower end of normal parameters actually may represent cognitive decline in an aging physician. Longitudinal testing will be needed to establish each physician’s baseline function and cognitive stability/change over time, so that more frequent testing during older years can be interpreted meaningfully. We concur with the suggestion of Richard Homan, MD (personal communication, November 15, 2015) that performance on competency-based board recertification examinations over time (7-10 year intervals) might provide early warning of decline; and we encourage correlation of the measures that we suggest here with examination performance over a career.
In addition, although there are some data from research on pilots suggesting that cognitive decline correlates with decreased pilot performance (tested typically with simulators), there are few data correlating cognitive changes in physicians directly with deterioration in patient care in the operating room or the office. At present, although cognitive decline is of concern, care should be exercised not to assume that it correlates with clinically significant impairment in physician function (in the operating room, for example) in the absence of other evidence. Whether that evidence is obtained by correlating cognitive test results with patient outcomes data, or with physician performance in simulators, we believe that it is essential that the correlations (or lack thereof) be established before cognitive or other test results are used to affect physician privileges.
We believe that the lack of information available in the literature is due to the complexity involved in obtaining valid, reliable and practical data; but we believe that the question is important enough to justify embarking on a multi-year, multi-institutional study of cognitive function (and associated changes), over time, correlated with physician performance. We have proposed a brief neuropsychological test battery that is still longer than we would like (close to 2 hours, as compared with the 4-6 hour routine comprehensive neuropsychological test battery); but we have selected the tests in this battery because we believe that it will provide sufficient sensitivity and validity to detect many age-mediated cognitive changes that are potentially relevant across a range of cognitive domains. We offer this proposed protocol for comment and criticism. We hope that the medical and neuropsychological communities can reach a consensus on a cognitive test battery that will be applied widely by multiple institutions so that a large volume of longitudinal data can be acquired during medical student years and continuing throughout the latest decades of life, along with physician performance data collected over time. Motor function data have not been addressed in this paper; but motor assessment is important especially for surgeons and other physicians who perform procedures. However, motor assessment and its integration with cognitive data into overall physician performance assessment are beyond the scope of this article.
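As a rough arithmetic check on the length of the proposed battery, the administration times listed with each test above can be summed, counting each test once even where it is cited under several domains. The following is only an illustrative sketch: the names `BATTERY_MINUTES` and `total_minutes` are hypothetical labels introduced here, and each value is the upper bound stated in the test descriptions (for example, "<5 minutes" is entered as 5, and "8-15 minutes" as 15), so the result is a ceiling rather than an expected duration.

```python
# Upper-bound administration times in minutes, taken from the test
# descriptions above. Each test is counted once, even when it is listed
# under more than one cognitive domain. Names here are illustrative only.
BATTERY_MINUTES = {
    "WAIS Digit Span": 5,
    "WAIS-IV Letter-Number Sequencing": 5,
    "Paced Auditory Serial Addition Task": 10,
    "Ruff 2 and 7": 5,
    "Symbol Digit Modalities Test": 3,
    "WMS Logical Memory I and II": 15,
    "California Verbal Learning Test": 20,
    "WMS-IV Symbol Span": 10,
    "Rey-Osterrieth Complex Figures Test": 15,  # listed as 8-15 minutes
    "Trail Making Test A and B": 5,
    "Controlled Oral Word Association Test": 5,
    "Clock Drawing Test": 2,
    "WAIS-IV Arithmetic": 10,
    "Abbreviated Category Test": 20,
    "Stroop Test": 5,
}


def total_minutes(battery):
    """Return the summed upper-bound administration time in minutes."""
    return sum(battery.values())


if __name__ == "__main__":
    total = total_minutes(BATTERY_MINUTES)
    print(f"{len(BATTERY_MINUTES)} tests, at most {total} minutes "
          f"({total / 60:.2f} hours)")
```

Under these assumptions the ceiling is 135 minutes for 15 tests. Because most entries are stated as "less than" a given time, this is consistent with a battery of roughly 2 hours, well below the 4-6 hours of a routine comprehensive evaluation.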
This article has also not discussed assessment of frailty. Frailty has been used to predict outcomes in our patients 86,87,88 and requires a relatively simple, quick assessment. To the best of our knowledge, no one has performed frailty assessments in physicians to determine whether increasing frailty correlates with physician performance. Although this topic is also outside the scope of the article, we believe that acquiring motor and frailty data simultaneously with cognitive data might be of great value. For example, at present, we have no idea whether a simple frailty assessment might provide all the information that we need in assessing aging physicians. While this seems somewhat unlikely, there is no evidence either way. We hope that this article’s call for collaborative research into cognitive changes over time and their functional implications will lead to movement toward consensus on a cognitive test battery, and a national or international collaborative research venture possibly combined with motor and frailty assessments, that eventually will provide the kind of valid, reliable information that is needed to help understand and guide age-related changes in physicians, to optimize physician productivity in the later years of life and to protect our patients.
