Abstract
In this paper I compare the methodology of two of the most famous epidemiological studies: the Midtown Manhattan Study (1952–60) and the Epidemiologic Catchment Area Study (1980–5). At first sight, few features distinguish them: both were studies of large samples of the general population; both used highly sophisticated methods of data analysis and standardized instruments; and both involved interviewers who were not professional clinicians. However, if we carefully compare the protocols that define how ‘clinical’ information is collected, we realize that some important changes in methodology were not only responses to practical necessities, but also involved an important transformation in the role of the interviewer and the skills traditionally associated with the clinician.
Introduction
In Objectivity, Lorraine Daston and Peter Galison (2007) characterize different regimes of objectivity throughout the history of science: truth-to-nature, mechanical objectivity, structural objectivity and ‘trained judgment’. According to the authors, these various regimes of objectivity, with all the practices, techniques, epistemic values and representations of the scientific self that they entail, do not necessarily succeed one another in the same order and at the same pace in all the sciences. One of the important aims of their book was to highlight the rise of mechanical objectivity at the end of the nineteenth century. Arthur Worthington’s work on the shape of splashes, first drawn with the naked eye in the late 1870s and then captured with a photographic device around the 1890s, is exemplary of the birth of this new ideal of objectivity, which promotes a ‘blind vision’ wary of all the distortions (now commonly called ‘biases’) created by the scientific subject. As the authors note, ‘Worthington’s conversion to the “objective view” is emblematic of a sea change in the observational sciences’ (Daston and Galison, 2007: 16).
In their book, the authors develop their analysis across a variety of disciplines, including botany, physics, chemistry, biology and astronomy. In a previous work, published in French (Demazeux, 2019), I used the concept of mechanical objectivity proposed by Daston and Galison to trace the history of clinical observation in psychiatry from 1800 to 1950. As the historian George Weisz has pointed out, ‘observation was widely perceived as the essence of clinical science’ (Weisz, 1995: x) in general medicine, and psychiatry was no exception: it has always modelled its clinical knowledge on the observational sciences. However, the confidence in the art of painting unbiased clinical pictures, even by the best clinician masters (Magnan, Charcot, de Clérambault, Bleuler, Kraepelin, etc.), gradually eroded at the end of the nineteenth century. This became a major methodological concern for psychiatry from the 1920s onwards (Demazeux, 2019: 216–42).
The history of psychiatric epidemiology may appear as a magnifying mirror of this peculiar malaise that affected psychiatric clinical observation throughout the twentieth century. In the first generations of epidemiological studies, a certain degree of uncertainty in psychiatric diagnosis was considered unavoidable, or even indicative of some ideological bias to be taken into account (e.g. the effect of social class on the diagnosis of psychosis; see Hollingshead and Redlich, 1958). After World War II, however, there was growing concern within the psychiatric establishment about the unreliability of psychiatric diagnoses. This lack of reliability was empirically documented, and psychometricians searched hard for technical measures to solve the problem and improve the objectivity of diagnostic assessment.
In this paper, I want to highlight a form of collective conversion to mechanical objectivity at the turn of the 1950s in the USA, very similar to the one Worthington experienced on a personal level at the end of the nineteenth century. By focusing on two of the most famous epidemiological studies, the Midtown Manhattan Study on the one hand and the Epidemiologic Catchment Area (ECA) Study on the other, I aim to show that the change in home-interview study methodology was not only a matter of practical necessities (limits of cost, feasibility, efficiency, speed, etc.), but also reflected a dramatic change in epistemic norms, related to the growing distrust of the clinician’s observational abilities and to the emergence of a new conception of psychiatric symptoms as purely objective and unambiguous facts. This dramatic change led to the advent of standardized interviews and to the revolution in the USA in the early 1980s with the third edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM) (American Psychiatric Association [APA], 1980). Robert Spitzer (1932–2015), one of the principal architects of the DSM-III, wrote a now-famous article provocatively asking: ‘Are clinicians still necessary?’ (Spitzer, 1983). A couple of years earlier, as we shall see, a group of epidemiologists were already asking themselves this very same question.
The Midtown Manhattan Study (1952–60)
One of the most striking features of this landmark epidemiological study was its interdisciplinary character. Matthew Smith, who wrote a recent book on the history of social psychiatry in the USA, insists on its innovative character but also on its intrinsic fragilities: ‘Midtown was conceived of as merely the first of a long succession of studies that would gradually provide a firm foundation for both psychiatric epidemiology and mental health policy. This, however, was not to be the case’ (Smith, 2023: 167). Smith describes in detail the origins of the project, its main goals and all the obstacles it encountered. He attributes the relative failure of the project not only to the unexpected death of its original designer, Thomas Rennie (1904–56), but also to the academic rivalry between psychiatrists and social scientists in a context soured by McCarthyism. 1 More fundamentally, Smith considers that this distinctive epidemiological study reveals the strengths and limitations of all post-war social psychiatry research: ‘Midtown shows how social psychiatry was never a unified, cohesive movement that spoke with one voice but rather a messy conglomeration of ideas that was derived through a complex and often tortuous process of negotiation and interdisciplinary translation’ (Smith, 2023: 194).
We are interested here in the methodological aspects of the Manhattan Study. Fortunately, they are well documented in the volume published by Srole, Langner, Michael, Opler and Rennie (1962), with the title Mental Health in the Metropolis: The Midtown Manhattan Study. The study was designed as a large-scale epidemiological survey to be carried out in the general population. The main goal of the project was ambitious: it consisted of conducting a ‘socio-epidemiological’ study through direct observation of a representative sample of Manhattan residents.
Thomas Rennie, who initiated the project, had been a student of Adolf Meyer. Trained at Pittsburgh and Harvard, he worked for 10 years at the Henry Phipps Psychiatric Clinic in Baltimore, Maryland. He was known to be a good clinician. In 1950, he became Professor of Social Psychiatry at Cornell University in New York. The following year, he submitted this project to the National Institute of Mental Health (NIMH), with the original goal of looking ‘beyond the individual’ to the forces at work in the social environment.
The project received substantial funding from both public and private institutions: the NIMH, the Milbank Memorial Fund, the Rockefeller Foundation, etc. To implement it, Rennie hired a senior sociologist from the University of Chicago, Leo Srole (1908–93). The two men were working on the methodology for the study when Rennie became seriously ill. He died in 1956, before the study could deliver its first results.
Alexander Leighton (1908–2007), a colleague of Rennie at Cornell University (who had also worked in Meyer’s department in Baltimore and had a background in anthropology), agreed to take over the orphaned project. Since 1948, Leighton had been involved in another epidemiological project that was to become famous: the Stirling County Study, which sought to study mental health in a small, defined area of Canada. 2
From the start, the Midtown Study was designed as the combination of four intertwined research programmes: a Home Interview Survey, a Treatment Census, a Community Sociography and an Ethnic Family Operation. The most important and promising part of the whole project was the Home Interview Survey. One of the challenges in setting up this survey was to achieve a representative sample of the Manhattan Island population (age, gender, ethnicity, etc.), since ‘the goal of the Home Interview Survey to study the mental health of Midtown residents through direct observation could of course be accomplished only with a sample of its population’ (Srole et al., 1962: 32). This part was the expertise and responsibility of the social scientists. But it soon became apparent that the clinical part of the survey would prove the most difficult. How would it be possible to obtain reliable and relevant clinical information about the mental health of all the individuals in such a large sample of the population?
Rennie and his colleagues were thinking hard about this issue. The main difficulty was conceived of as a kind of ‘dilemma’ between two priorities:
The dilemma in effect turned on which question had claim to higher priority in terms of advancing the Study’s main goal. On the one hand, if it were demanded that the methods of study should above all else follow the model of intensive clinical examination and diagnosis, then realities would operate to keep the sample small. On the other hand, if the Study’s objectives demanded that the sample be large, then realities would compel the adoption of research methods which departed considerably from the intensity of the clinical model. (Srole et al., 1962: 32)
For the sample to be representative of Manhattan’s population and for the survey to be statistically significant, social scientists calculated that the study should aim for a sample size of at least one or two thousand people. But to investigate such a large sample would mean that certain ‘inexorable consequences’ would have to be taken into account in the study design. Certainly, it would be impossible to send hundreds of experienced clinicians to conduct a two-hour interview at the homes of so many residents. Thus, the designers of the study had to resolve some ‘conflicts between the ideal should and the practical must’. First, it was clear that ‘ideally’, ‘several sessions with each respondent would have been desirable’. But in practice, the study design would have to confine the interview to a single session. Second, it appeared that ‘ideally, of course, interviewing should have been done by psychiatrists’ (my emphasis). But here again, for obvious budgetary reasons, one would have to rely on non-clinician interviewers. Third, the authors acknowledged that:
ideally, the interview should have been conducted along the lines of what we might call the open style of interrogation, which allows the interviewer complete freedom to decide the substance, sequence, and wording of questions to be asked in each case. (p. 37)
In practice, though, interviewers would have to follow an interview instrument specifically designed to collect all the relevant clinical information.
In any case, there was the firm conviction from the origins of the project that the direct clinical observation by one or several trained psychiatrists would constitute the best method, and that the questionnaire interviewing strategy was to be seen as a second-best choice. The idea of replacing the free questioning of patients, as it was practised in ordinary clinical settings, by a structured questionnaire was far from new. It was already being discussed at the beginning of the twentieth century. For example, the French psychiatrist Gilbert Ballet (1903: 67) in his Traité de pathologie mentale mentioned the attempts of various European psychiatrists to produce ‘systematic questionnaires’ of a few dozen questions to interview patients. What was new in the middle of the twentieth century was the conviction that such questionnaires could now be used as a kind of objective ‘thermometer’ to routinely collect relevant clinical data. The use of standardized tools, at the time when the Midtown survey began, was about to become widespread in psychiatric research. In 1956, a biometrics unit was created within the New York State Psychiatric Institute; this is where Robert Spitzer, the future ‘father’ of the DSM-III, made his debut by developing standardized questionnaires (Decker, 2013: section II, 4; Demazeux, 2013: 63). Overall, the two world wars catalysed the development of military psychiatry, and with it the development of tools for detecting mental disease. 3 Furthermore, the proliferation of psychological tests designed by psychologists in industry and in educational settings imposed new scientific standards and provided a general impetus which psychiatrists were inclined to follow.
In the case of the Midtown Manhattan project, Rennie and his colleagues were concerned about the fact that ‘psychiatrists [tend] to show the greatest distrust of data obtained by questionnaires or structured interviews’. In a way, the investigators involved in the project did agree on this point, but ‘practical constraints’ required that the clinical data be obtained using a standardized tool. A specific instrument was constructed: the Global Judgment of Mental Health (March and Oppenheimer, 2014). It consisted of a 120-item questionnaire developed from two different sources (the Army’s Neuropsychiatric Screening Adjunct and the Minnesota Multiphasic Personality Inventory). One of the original features of the tool was that it aimed not only at recording the mere presence or absence of a symptom, but also at assessing, when a symptom was present, its greater or lesser severity. It is through the lens of this instrument that the investigators hoped to collect a detailed clinical picture (from the normal to the pathological) of 1660 residents of Manhattan.
If one delves into the details of the home survey protocol, it is striking to see how much the designers of the study were concerned with closing, as much as possible, the gap that they deplored between the ‘ideal should’ and the ‘practical must’ (Srole et al., 1962: 37). The designers took three explicit and important measures in order to limit it.
First of all, careful attention was given to the recruitment of the interviewers. They were all required to have some professional experience in conducting interviews. More decisively, the investigators wanted to recruit, whenever possible, people having some clinical experience, such as medical students, clinical psychologists, psychiatric social workers or social caseworkers. They were all recruited as ‘allied professionals who have technical experience in the methods of intimate interrogation’. Once they were recruited, the investigators also paid special attention to their intensive training before the beginning of the study. Above all, the interview was not to turn into a routine, since the interviewers were considered the decisive ‘actors’ for the richness of the collected data. 4
It is interesting to note how much their role was considered to be quite active. During the conduct of the home interviews, the interviewers were instructed to take into account as carefully as possible any form of information (e.g. indirect or non-verbal) that might be useful or relevant. The following quotation highlights the kind of methodological precautions that were provided:
. . . the obvious risk in the questionnaire-guided inquiry is its capacity to turn into a perfunctory routine. This real danger was directly faced in the construction of the interview instrument itself, in the determination of the kinds of professionals to be invited to apply for the interviewing task, in the careful selection of applicants, and in the planning of the special indoctrination program to which they were exposed. There they were placed under a specific mandate not only to record the respondent’s answer to each prepared question and his spontaneous elaborations and asides, but to report observations of his behavior and to probe replies and comments that were either ambiguous or suggestive as possible openings to matters of further significance. (Srole et al., 1962: 38)
Secondly, the trainers placed great emphasis on the fact that data collection, while it should follow the form of a standardized questionnaire, should not turn into a ‘perfunctory routine’. For example, interviewers needed to be aware of the level of distrust or aggressiveness, or, on the contrary, of the level of solicitude or even eagerness of the people they interviewed. The challenge was to obtain a sufficiently fine-grained general picture that would allow any psychiatrist to appreciate indirectly the value of the symptoms collected:
A person says that looking down from high places makes him nervous. This could be an indication of a very careful answer (most people are made at least a little uneasy under such circumstances), or it could be a clue to a phobic disorder. Decision in such a matter had to depend on the impact made by the rest of the responses on the rater. Observations by the interviewer of the respondent’s alertness, hostility, and level of interest had to be taken into account in judging the actual content of the responses to the questions. (p. 64)
In addition, the interviewers were asked to be vigilant regarding the various biases that could be introduced, particularly social class biases (Smith, 2023: 166). In any circumstance, the rigidity of the questionnaire in the Midtown Manhattan Study was seen as an obstacle, never as an asset. Even if the interviewer was not an experienced clinician, his or her common sense and observational skills were relied upon to contribute to the scientific success of the study.
Last but not least, once all the data had been collected, the protocol stipulated that a professional psychiatrist should provide a clinical judgement on each file. Before Rennie’s illness prevented him from continuing the project, he appeared to be very doubtful about the soundness of this specific method that he was developing with his colleagues. 5 In particular, he expressed the greatest doubts about the ability of a psychiatrist to formulate a clinical judgement, even a precarious one, about the mental health of a person whom he had not seen, whom he had not interviewed, and about whom he only had information gathered by a third party by means of a questionnaire. But after various tests had been carried out, and once he had been reassured by the several precautions mentioned for the training of the interviewers and the indications given to them for the conduct of the home interviews, Rennie was eventually convinced that the complete set of information available would allow a psychiatrist to formulate an acceptable clinical assessment.
It was thus planned that three experienced psychiatrists (Rennie, Price Kirkpatrick and Stanley T. Michael) would formulate a clinical judgement, independently, on each of the 1660 files that the investigation would eventually produce. But with Rennie’s death in 1956, this task fell to Kirkpatrick and Michael. The two psychiatrists, in their account of their lengthy analysis, opined that ‘what they were doing had a meaning’ and that it was possible to make a rather good clinical appreciation for each case. Nevertheless, they felt this general warning was necessary: ‘Although conducted by psychiatrists and clinically oriented sociologists, the Study was not clinical’ (Srole et al., 1962: 327, my emphasis). The very indirect nature of the evaluation prohibited its strict characterization as a ‘clinical’ study per se. The authors made their scruples quite clear in this deliberately convoluted formula: ‘Throughout the volumes of this Study, the data must be evaluated as a rating of mental health based on the rating psychiatrists’ perceptions operating through a questionnaire instrument.’ (p. 66, my emphasis). This quotation admirably summarizes the spirit of the home interview survey.
It is not the purpose of this paper to comment on the results of this study. But we should note that the Midtown data, which made the headlines of several US newspapers in 1962 (Smith, 2021), were rather alarming: of the 1660 files examined, 398 showed individuals with serious psychiatric symptoms, and in a large proportion of the other files, the authors found indications of some marked psychological suffering. This would mean that more than 20% of the inhabitants of Manhattan had to be considered as suffering from serious mental disorders, and also that a large majority of them, nearly 60%, seemed to present some psychiatric symptoms without ever having had the slightest contact with a psychiatric institution.
The scientific community’s first response to these results was incredulity. How could we be sure that the sample was representative? Was the questionnaire used to collect the symptoms efficient and reliable? And what should we conclude? That the Manhattan population was atypical? That psychiatric illnesses were much more widespread than previously thought? Srole’s team undertook a long analysis and a cross-checking of the data. While it was true that the population of Manhattan was not at all representative of the US population (it was a wealthy and educated population, 99% white; Srole and Fischer, 1980: 210), nothing indicated that the data obtained were specific to this population. On the contrary, when cross-checked with other available data (e.g. the Stirling County Study, the Baltimore Study of Chronic Disease), there was no evidence of methodological bias. Furthermore, the various tests carried out by applying the same methodology to a sample of psychiatric inpatients seemed to confirm that the severity levels were set appropriately in the general population (Srole et al., 1962: 138–9).
After the death of Rennie, the Midtown Manhattan Study did not have the direct influence on the orientation of public health policies that it first promised to have (Smith, 2023: 181–94). But the results of the Study certainly had an indirect impact on the orientation of the Community Mental Health Act signed by President Kennedy in 1963, with the replacement of the old and overflowing asylums by a new type of care structure: the community mental health centres. Curiously, this original study, which gave great importance to clinical observation (even if only indirectly), retained a high reputation in the social sciences, but was quickly forgotten by psychiatrists, since it did not provide any useful data on the specific mental categories that were the new focus of all clinical research.
The Epidemiologic Catchment Area Study (1980–5)
The ECA Study took shape some 20 years after the first results of the Midtown Manhattan Study were published. Certainly, psychiatry had changed a great deal in the intervening years. Important advances had been made in psychopharmacology and in our understanding of the nervous system. Funding for neuroscience was already on the rise, while psychoanalysis was in decline. In the laboratories, genetic analysis and brain imaging techniques were opening up new avenues of investigation. In the public arena, the anti-psychiatry movement was challenging the psychiatric establishment. When Jimmy Carter was elected president of the USA in 1977, one of his first initiatives was to set up a Presidential Commission on Mental Health. Once again, 14 years after Kennedy, mental health reform was a priority under a president from the Democratic Party.
To accomplish this, however, reliable data on the need for mental health care in the general population were still lacking. In particular, no epidemiological study had yet been able to provide a satisfactory estimate of the specific prevalence of the different categories of mental illness. Abundant data were available for an overall assessment of the mental health of the population, but there was no detailed information on the prevalence of schizophrenia, depression, anxiety disorders, etc.
It was this gap that the ECA Study, coordinated by the NIMH, sought to fill. The goal of this extraordinary project – one of the most ambitious epidemiological projects ever undertaken in psychiatry – was to conduct a fine-grained and detailed survey of psychiatric symptomatology in a sample of 20,000 residents (more than 10 times the sample size of the Midtown Manhattan Study).
In order to be representative of the diversity of the whole American population, the survey was coordinated in five major university cities, each with its own sociological characteristics: New Haven, Connecticut; Baltimore, Maryland; St. Louis, Missouri; Durham, North Carolina; and Los Angeles, California. The study began in 1977 under the direction of Darrel Regier. The ‘practical constraints’ of conducting a survey of this magnitude were far greater than in the Midtown Manhattan Study. However, a close reading of the study protocol (which was fortunately also very thorough and detailed) reveals a complete reversal of epistemological norms. What had been perceived 20 years earlier as ‘constraints’ were now set as ‘goals to be achieved’, and the epistemic goals (the ‘ideal should’) that the designers of the Midtown Manhattan Study tried to achieve were now seen only as sources of ‘biases’ to be eliminated (Eaton and Kessler, 1985: 135).
This is an essential point that may have been overlooked by most commentators about the ECA: the methodological rigidity of the ECA Study has been too often interpreted as a kind of concession by the designers to the technical necessities of the survey. It was certainly necessary to limit the cost of this very ambitious survey, and it was also necessary to ensure that the data collected would be reliable and collected with the same rigour at the various sites. But all these constraints, if they existed, were at the service of a new ideal of scientific objectivity that had emerged in the conduct of clinical research. This ideal corresponds precisely to the regime of ‘mechanical objectivity’ described by historians Lorraine Daston and Peter Galison (2007).
To understand the significance of this reversal, we need to look at how the standardized instrument that would be used to collect data for the ECA Study was designed. Lee Robins (1922–2009), a sociologist in the Department of Psychiatry at Washington University School of Medicine in St. Louis, was responsible for developing this instrument. Robins was already known for her work with young people and Vietnam veterans (Campbell, 2014), and she had been involved in several studies that required the use of structured interview questionnaires. Her book Deviant Children Grown Up: A Sociological and Psychiatric Study of Sociopathic Personality (Robins, 1966) had indicated an original way of dealing with clinical data from a sociological perspective. Lee Robins had no clinical experience in psychiatry, but she was married to an influential Harvard-trained psychiatrist, Eli Robins (1921–94), who worked with Samuel Guze (1923–2000) and George Winokur (1925–96) on the development of diagnostic research criteria. It was under the wing of this triumvirate that the famous Feighner criteria were elaborated in 1972 6 (Feighner et al., 1972). The ‘Renard School’ at Washington University would play a crucial role in the development of the DSM-III in the following years, and the ECA Study would serve as a living laboratory for testing the new operational criteria that were ready to be introduced.
For a large-scale survey such as the ECA, the choice of the right instrument was crucial. Lee Robins reviewed four existing instruments, 7 but none seemed satisfactory, either because they required experienced clinicians to administer them, or because the data they produced did not allow diagnoses to be made easily. The Renard Diagnostic Interview (RDI), however, seemed to point the way forward: based on Feighner’s criteria, it would provide data compatible with the criteria of the forthcoming DSM-III.
After a long process of development and testing of several preliminary versions, Lee Robins presented the instrument that would be used for this unprecedented epidemiological home-interview survey: the Diagnostic Interview Schedule (DIS). The peculiarity of this instrument, which was to become the ‘chief instrument in contemporary studies in psychiatric epidemiology’ (Malgady, Rogler and Tryon, 1992), was that it could be used by lay interviewers without any clinical experience. Its use made it possible, according to its designers, for anyone to assess the presence, duration and severity of individual psychiatric symptoms.
The administration of the questionnaire, with several hundred questions, had to be completed during a session not exceeding one-and-a-half hours:
To meet the needs of the ECA program, not only was an interview required that could be administered by interviewers without clinical training, but the total interview had to be short enough so that it could be administered in a single contact, together with questions about the use of health services and demographic characteristics. (Eaton and Kessler, 1985: 144)
After a series of questions aimed at collecting general demographic information (marital status, age, level of education, employment, etc.) and then concerning the use of health services (frequency of general medical consultations, psychiatric consultations, number of hospitalizations, etc.), the questionnaire unfolded an endless series of questions of the type: ‘Have you ever considered yourself a nervous person?’ or ‘Have you ever had a spell or attack when all of a sudden you felt frightened, anxious, or very uneasy in situations when most people would not be afraid?’. At no time was the interviewer expected to intervene other than to read the questions clearly and understandably and then record the answers given (yes or no; if appropriate, the respondent was asked to rate his or her answer on a scale of 1 to 5).
The difficulty, from a methodological point of view, was as follows: on the one hand, the questions read out had to be simple enough to be understood by anyone, whatever their level of education; on the other hand, they had to be sufficiently precise to allow the identification of the whole psychiatric symptomatology. It was also necessary to include examples eloquent enough to call up in the respondent the memories and experiences relevant to the questions asked. The financial objective was, of course, to avoid using experienced clinicians in a survey consisting of home visits to several thousand people. The instrument had to be easy to administer for the dozens of specially recruited interviewers at each site. But financial concerns were not the only ones. More interestingly, from a historical perspective, the development of the DIS reveals an epistemological concern for objectivity taken to its extreme rigour at a time when suspicions of bias and value judgements threatened psychiatric diagnosis. This concern can also be seen as the psychiatric establishment’s response to the growing influence of the anti-psychiatric discourse on the public scene, which originated in Europe but was also very influential in the USA during the 1970s.
The singularity of the DIS is still striking today. At first glance, it appears to be a perfect hybrid between two traditions. As Barbara and Bruce Dohrenwend (1982: 1275) noted:
It [the DIS] has drawn in part from the clinical examination tradition but does not rely on skilled clinicians and clinical ratings, and in part from the self-report tradition but does not rely on the dominant psychometric theories of measurement error.
However, if one looks closely at the construction and use of the DIS, one has to admit that the ‘clinical model’ that was central to the Midtown Manhattan Study was here explicitly and completely disregarded. The DIS was constructed in such a way as to minimize the level of clinical judgement. Clinicians played almost no role: they intervened only at the outset, to establish the validity of the DIS questionnaire, and very occasionally during the survey, to check the reliability of the data collected.
First of all, the instrument was designed to suppress any form of subjective intervention. Far from giving the interviewer the slightest confidence in the additional observations they might make (as was the case in Rennie and Srole’s study), meticulous care was taken to ensure that at no point did the interviewer’s work require the slightest judgement, memorization or initiative. This stands in striking contrast to the whole clinical tradition, which required tact, discernment and experience in the observation of mental symptomatology. One could have trusted the investigators with some form of insight, or selected them on the basis of certain qualifications or prerequisites; in previous studies, the involvement of social scientists in data collection had been seen as an asset, providing a better sense of the context surrounding mental health and illness. From now on, the interviewers were to be passive recorders of the answers given, without ever influencing them, without even trying to clarify them.
In addition, the DIS incorporated all the precautions that pollsters and sociologists of the time prescribed for the conduct of ‘interviewing methodology’: 8 one had to be wary of biases in the formulation of questions and the choice of words, as well as the implicit effects that might be caused by the age, gender, social class or ethnicity of the interviewer. The role of the interviewer was strictly defined. The DIS contained no open-ended answers; every answer had to be recorded immediately; the order of the questions was fixed and could not be changed under any circumstances; it was forbidden to skip any question, even if it was redundant with what had just been said; and the rephrasing of questions, even in the event of misunderstanding, was extremely limited: ‘One rule specified that if a respondent misunderstood a question, the interviewer should re-read it as written, emphasizing the section the respondent misunderstood. Only if that failed, should the interviewer consider rephrasing the question’ (Eaton and Kessler, 1985: 154).
Understandably, the status of the interviewer was strictly reduced to that of what we would now call a ‘pollster’. Again, the contrast with the Midtown Manhattan survey is striking: whereas Rennie and his colleagues saw it as an advantage to be able to recruit people who had some clinical experience, the ECA recruiters wanted to select interviewers who had no clinical experience. At least the designers of the study could be satisfied that such interviewers would not contaminate the data by distorting it according to some theoretical filter (Eaton, Holzer and Von Korff, 1984). The investigators agreed that ‘a little learning could prove to be a dangerous thing in amateur psychiatry’ (Eaton and Kessler, 1985: 82–3).
From then on, interviewers were recruited on the basis of criteria designed to ensure only very general skills, such as the ability to read aloud, to write legibly, to follow a set of instructions accurately, and so on. Once recruited, the lay interviewers had to undergo a short training course (about 50 hours) but without any familiarization with psychopathology. The training consisted mainly of learning how to follow the DIS instructions carefully and then simulating some interviews under the supervision of the trainers.
The volume edited by Eaton and Kessler (1985) contains rich and precise information on many aspects of the use of lay interviewers: their social status (mostly students or women in their forties looking for a part-time job), the number of interviews they were allowed to conduct per day, their pay ($4–5 per hour or $35 per completed interview), and even the measures taken to prevent general discouragement. Once again, it is important to note that the complete absence of any clinical background was seen as an asset rather than a liability for the success of the study.
The interviewers were provided with a DIS manual containing different kinds of ‘probe flow charts’ and ‘decision trees’ to help them during the interviews, especially in case they encountered an unexpected difficulty (e.g. if the respondent could not speak for himself). No initiative was tolerated; ‘If in doubt, call’ was the general instruction given to the interviewers. They were never treated as partners contributing to the richness of the survey, but only as potential sources of error. This would have an impact on the quality of the survey: the explicitly routine and repetitive work of the interviewers led to a considerable dropout rate during the course of the study (incentives were specifically put in place to combat this high rate).
Another novel feature of the DIS concerns the analysis of the data collected. As we have seen, in the protocol of the Midtown Manhattan Study, the clinician’s judgement was all-important. But with the ECA Study, the collection of symptom data was no longer even remotely linked to any form of clinical judgement. Once the data were collected, the greatest originality of the DIS – undoubtedly the most revolutionary feature! – was that diagnosis rested on an algorithm specifically designed for the study. It provided, in a mechanical way, a diagnosis according to the DSM-III inclusion and exclusion criteria; in other words, there was no clinician at the end of the chain, but instead a computer to provide an objective diagnosis.
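The logic of such mechanical diagnosis can be pictured with a toy sketch: fixed inclusion and exclusion criteria applied to the recorded yes/no answers, with no judgement anywhere in the loop. The symptom names and thresholds below are invented for illustration only and do not reproduce the actual DIS/DSM-III algorithm.

```python
# A minimal sketch of algorithmic diagnosis: counting endorsed inclusion
# symptoms against a fixed threshold, then applying exclusion criteria.
# All criteria here are hypothetical, not the real DIS/DSM-III rules.

def meets_criteria(answers, required, minimum, exclusions):
    """Return True if at least `minimum` of the `required` symptoms were
    endorsed and no exclusion criterion applies."""
    if any(answers.get(e, False) for e in exclusions):
        return False
    endorsed = sum(1 for s in required if answers.get(s, False))
    return endorsed >= minimum

# Hypothetical criteria set for a single disorder:
required = ["sad_mood", "sleep_trouble", "appetite_change", "fatigue"]
exclusions = ["organic_cause"]

# One respondent's recorded yes/no answers:
respondent = {"sad_mood": True, "sleep_trouble": True,
              "appetite_change": False, "fatigue": True,
              "organic_cause": False}
print(meets_criteria(respondent, required, minimum=3, exclusions=exclusions))
# → True: three of four inclusion symptoms endorsed, no exclusion applies.
```

The point of the sketch is historical rather than technical: every step is a deterministic tabulation, so no clinical inference intervenes between the recorded answers and the diagnostic output.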
There was much technical discussion in the literature about the strengths and limitations of the DIS and, in particular, its scientific validity. In the absence of a ‘gold standard’ in clinical psychiatry, the only way to assess its validity was to measure the reliability of data obtained by lay interviewers versus clinician interviewers. Even then, the standard represented by the knowledge of experienced clinicians was not perceived as quite satisfactory:
Ideally, to study the validity of an instrument, one would like to have an absolute standard to compare it against. In this study, we have appointed the DIS in the hands of a psychiatrist as a yardstick against which we are measuring its performance in the hands of a lay interviewer. We would have preferred to have used an independent instrument as that yardstick, but the difficulty was in finding another measure that covered the full range of DSM-III diagnoses made by the DIS. (Robins, Helzer, Croughan and Ratcliff, 1981: 389)
Many kappa studies 9 were carried out to compare the performance of different types of investigators: the DIS administered by a lay interviewer versus the DIS administered by a psychiatrist (Eaton and Kessler, 1985: 294–5; Helzer et al., 1985); the DIS diagnoses versus the psychiatrists’ diagnoses made after a traditional clinical interview (Anthony et al., 1985); and, even later, the DIS administered by a human versus self-administered by a computer (Blouin, Perez and Blouin, 1988; Erdman et al., 1992; Greist et al., 1987; Wells, Burnam, Leake and Robins, 1988). To infer validity from reliability seemed to many commentators to raise a specific theoretical difficulty (Malgady et al., 1992). The problem was raised by Jean Endicott (1981) and Robert Spitzer (1983), two of the architects of the DSM-III, who feared long before the first results of the ECA Study were published that they would be clinically insignificant, with the risk of overestimating the true prevalence of many mental disorders in the general population.
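The kappa statistic invoked in these comparisons is Cohen's chance-corrected measure of agreement between two raters: observed agreement minus the agreement expected from the raters' marginal frequencies alone. A minimal illustration follows; the ratings are invented for the example, not ECA data.

```python
# Cohen's kappa: chance-corrected agreement between two raters.
# kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and
# p_e is the agreement expected by chance from each rater's marginals.

def cohens_kappa(rater_a, rater_b):
    """Compute Cohen's kappa for two equal-length lists of categorical ratings."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Observed agreement: proportion of cases where the raters match.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of marginal proportions, summed over categories.
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical case/non-case judgements by a lay interviewer and a psychiatrist:
lay = ["case", "case", "non-case", "non-case", "case", "non-case"]
psy = ["case", "non-case", "non-case", "non-case", "case", "non-case"]
print(round(cohens_kappa(lay, psy), 3))  # → 0.667
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why the statistic suited studies asking whether lay interviewers could match clinicians.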
The designers of the DIS were well aware of the limitations of their instrument, but they were confident of its ability to detect the presence of most mental symptomatology: ‘The lay interviewer giving a DIS does not necessarily get more forgetting or denial than does a psychiatrist’ (Eaton and Kessler, 1985: 166–7). Robins and her colleagues offered a curious explanation for this possibility: ‘Perhaps, the psychiatrist’s greater skill is balanced by the fact that the respondent worries less that the lay interviewer will see more in his remarks than he intends, and so is less guarded’ (pp. 166–7). In other words, the quality of clinical information did not depend, according to the authors, on the accuracy of any particular observation, but merely on the respondent’s willingness to talk freely. In a discussion about the limitations of the DIS, Robert Spitzer illustrated with a touch of irony this complete powerlessness of the clinician’s skills: ‘The DIS interviewer obtains all the clinical information about the subject from the subject’ (Spitzer, 1983: 402). This was tantamount to asking the question: ‘Are clinicians still necessary?’
Conclusion
The promotion of standardized clinical interviews in the late 1970s was anything but anecdotal, especially in the USA. In just a few years, the status of these instruments went from ‘second best choice’ to ‘gold standard’. While on the world stage different kinds of psychiatric epidemiologies were developed within distinct political-cultural regions (Lovell and Oppenheimer, 2022), the contrast in style between the Midtown Manhattan Study and the ECA Study reveals a profound shift in the epistemological norms of clinical investigation. In the space of 20 years, a revolution had taken place in the USA comparable to the one effected by the English scholar Arthur Worthington at the end of the nineteenth century. The context was different, but the revolution was the same: a transition took place from a form of investigation in which human judgement was considered indispensable for the correct assessment of the clinical value of symptoms, to a form of ‘blind vision’ that placed full confidence in the instrument for the quality of the information collected. From a theoretical point of view, we can say that the very ideal of the DIS methodology was to rely on this kind of ‘blind vision’, a form of ‘mechanical objectivity’ modelled on the new technologies introduced in the natural sciences at the end of the nineteenth century. Instead of the traditional emphasis on the clinician’s own judgement, the priority was the mechanization of the process of interviewing. In other words, mechanical objectivity was seen as a goal for obtaining an unbiased representation of clinical information. The development of computer science was at the same time already tending to reduce diagnostic activity to mere ‘problem solving’ that could be carried out methodically by an algorithm.
The brief comparison we have made between these two emblematic epidemiological studies, the Midtown Manhattan Study and the ECA Study, allows us to draw three theoretical conclusions about the recent history of psychiatry.
First of all, the results of this comparison are in line with the idea put forward by Barbara and Bruce Dohrenwend that there was a whole ‘generation’ change between the Midtown Study and the ECA Study. The latter is quite representative of what the Dohrenwends have usefully termed the ‘third generation’ of epidemiological studies. It certainly relegates the first generation of studies – all those carried out in the first half of the twentieth century and involving surveys based on asylum statistics – to the ‘prehistory’ of psychiatric epidemiology. But it also marks a break with the second generation of epidemiological studies, which includes some 60 surveys carried out between 1945 and 1975, and which were characterized by greater methodological refinement, a wider field of investigation (researchers were no longer confined to asylums but went out into the community) and the international dissemination of results. Most of the major epidemiological studies conducted during this period, such as the Stirling County Study, the Baltimore Study and the Midtown Manhattan Study, used standardized instruments to facilitate data collection and interpretation. But whenever a psychiatric diagnosis had to be made, the expertise of a team of psychiatrists was called upon. By contrast, what was quite new about the ECA Study, emblematic of the emergence of the third generation of epidemiological studies, was the reversal of the ‘ideal should’ and the ‘practical must’. Whereas the ‘ideal’ in the Midtown Study was set by the judgement of a trained psychiatrist, the DIS did not give individuals, let alone clinicians, any heuristic role.
A second conclusion is that this generational change also introduced what can be called a ‘split in the matrix of psychiatric epidemiology’ (Demazeux, 2014). There was clearly a change of spirit between the two generations of epidemiological studies: the first type of socio-epidemiological studies emphasized social variables, stress indicators, well-being and psychological distress, and the results were understood in a quantitative way, usually as a continuum from normal to pathological. These were replaced by studies which emphasized the validity and inter-rater reliability of precise categorical diagnoses; the methods used were modelled on those of general epidemiology, with particular attention being paid to individual risk factors. In other words, the socio-ecological approach (as embodied by the Midtown Manhattan Study) was replaced by a strict medical-centred approach (as embodied by the ECA Study).
But this change meant, above all, a rivalry of methods. One of the strengths of the ECA Study is that it made it possible to bridge the long-standing gap between clinical research and psychiatric epidemiology by using a novel standardized tool, the DIS. It was the first survey in psychiatry to attempt to reconcile both the public health model and the medical model. The paradox, however, is that the results of the study have led to a rapid divorce between epidemiology and nosology. Thus, the operationalized criteria of the DSM-III enabled a new generation of epidemiological studies to emerge, but these, in turn, challenged the very principles of the DSM. Between the two, the figure of the clinician was gradually fading. On the epidemiological side, there was no longer any indication that his expertise was really needed. On the clinical side, as Spitzer (1983) himself acknowledged in the conclusion of his article, the ball was in the clinician’s court to prove that he was still useful in ensuring the validity of psychiatric diagnoses.
The third conclusion is even more fundamental: it concerns the conception of the psychiatric symptom and the role of the clinician. In the ‘Debate on psychiatric epidemiology’, which can be found in several issues of the Archives of General Psychiatry in 1980, Myrna Weissman and Gerald Klerman (1980) felt that the ECA Study indicated a stronger commitment to the ontological view of old European clinical psychiatry:
Psychiatric epidemiology, along with other fields of mental health research during the immediate post-World War II period, rejected the classic 19th century view of diagnosis and classification associated with Kraepelin and Bleuler, and adopted the view that there was a continuum (or spectrum) from mental health through mental illness. (pp. 1423–4)
The authors explicitly placed the new position of the ECA Study under the patronage of Kraepelin and Bleuler, as opposed to the socio-epidemiological style of the Midtown Study. However, apart from this ontological consideration, one must admit that the Midtown interview design was much more ‘Kraepelinian’ in spirit than the ECA interview design, since Kraepelin repeatedly asserted throughout his life that: ‘One cannot depend upon the patient for accurate observations as to whether or not he is of a sad, sunny, seclusive, or irritable disposition, or given to fanaticism or morbid frivolity.’ In other words, the clinician ‘has to depend rather more upon observation than upon interrogation of the patient’ (Kraepelin, 1907: 110). Quite apart from the question of whether or not the DSM-III deserves to be considered ‘neo-Kraepelinian’ (Decker, 2007; Klerman, 1978), it can be said without paradox that the Midtown Study was quite classical-Kraepelinian in its conception of clinical observation. The designers of the Midtown Study were explicitly adhering to an old axiom of clinical teaching that dates back to the nineteenth century: ‘Do not reduce the role of the clinician to that of a patient’s secretary’ (Falret, 1864: 105).
On the side of the ECA Study, what is most striking is the new conception of the psychiatric symptom that was put forward. This study was said to be ‘symptom-centred’ in the sense that its first direct aim was to collect all mental symptoms relevant to a pathological condition. The production of psychiatric diagnoses was secondary, and only made by a computer. Yet nowhere in the rich literature on the ECA can we find a discussion of the notion of ‘symptom’ or a clear definition of what might count as a ‘psychiatric symptom’. The traditional, complex and ambivalent structure of the psychiatric symptom (its relation to the clinical context; its polyvalent function; its changeability and plasticity) was progressively replaced by a restrictive conception of the symptom as an objective fact whose reality depends exclusively on the patient. This conception strictly follows the simple definition found in most psychiatric textbooks from the 1970s onwards, for example: ‘Symptoms are what patients tell you; signs are what you see’ (Woodruff, Goodwin and Guze, 1974: x–xi). From then on, since psychiatric knowledge was sorely lacking in signs, psychiatric symptoms would derive their truth solely from a first-person perspective.
Of course, this shift in the conception of the mental symptom would not leave the figure of the clinician unaffected. In a paper entitled ‘The Midtown Manhattan Longitudinal Study vs. “The Mental Paradise Lost” doctrine’, Srole and Fischer (1980) used the data from the Midtown survey to attack a view commonly held by psychiatrists and psychoanalysts of their time – their main target being Erich Fromm (1900–80) – namely, the idea that mental health tends to deteriorate as civilization progresses. The two sociologists criticized this doctrine as not only false, but also based on a gender bias that held women to be more susceptible than men to the modern yoke of mental illness. Data from the Midtown Manhattan Study, obtained in two surveys 20 years apart (Midtown I in 1954 and Midtown II in 1974), confirmed that this was not the case, and that women’s subjective well-being even tended to improve with their progressive emancipation. The doctrine of the ‘lost mental paradise’ rested on nothing. This is one of the most important positive results of the Midtown Study. We can conclude that in the meantime, in the 20 years between the two surveys, it was also the figure of the clinician who had lost his mental paradise.
Footnotes
Acknowledgements
I would like to thank Emmanuel Delille and Matthew Smith for all their insightful comments, which helped to improve the text.
Declaration of conflicting interests
This paper is based on a chapter from my book L’éclipse du symptôme, published in 2019 in French, and the publisher Les Éditions d’Ithaque has given permission for this.
Funding
The author received no financial support for the research, authorship and/or publication of this article.
