Abstract

This article examines the rise and decline of the enthusiasm for intelligence testing in early twentieth-century China, focusing on the appeal, the challenges, and the critiques revolving around this psychological instrument. The introduction of intelligence testing reflected not only China’s urgent needs in modernizing its merit system, but also Chinese psychologists’ aspirations for pursuing exactitude and redefining the racial characteristics of their compatriots against foreign interpretations. But despite psychologists’ endeavors, the political and geographical fragmentation of Republican China troubled the epistemic imperative of uniformity demanded by Euro-American psychometrics and therefore undermined the validity of measurement. Subsequently, the legitimacy of intelligence testing began to be questioned by several influential Chinese psychologists in the late 1920s and 30s. The difficulties in standardization and the hostility within the psychology community formed a vicious cycle, impeding the progress of nationwide testing. Through this history, the article demonstrates not only the elevation of measurement to epistemic authority in modern China, but also how its promise was challenged by a diverse and rapidly changing society.

Keywords

Intelligence testing, measurement, race, merit, state, China

Introduction

The burgeoning of modern psychology in the early twentieth century exerted a tremendous influence on the formation of racial discourses and meritocratic systems in the Western world. In the East Asian context, however, the roles of psychology in constructing people’s self-understandings and differentiating populations have been underexplored. For China in particular, only a few recent studies have considered the developments of mental hygiene, industrial psychology, and behaviorism and their relations to the state and society in the Republican era.¹ Yet we still do not know much about the process through which the general population became psychological subjects in this period. This article takes intelligence testing – a psychological instrument widely used today in China, Taiwan, and many other countries around the world – as a case study, tracing its development in early twentieth-century China. It examines the reasons why intelligence testing was at first enthusiastically embraced by Chinese psychologists and the difficulties they later faced when measuring Chinese intellectual ability.

The Chinese case offers a vantage point to provincialize the experiences of the West, especially the United States, in the history of psychometrics.² Over the past decades, the development of intelligence testing and its profound influence in the United States has been detailed in the works of Stephen Jay Gould, John Carson, and many others.³ As these scholars observe, psychometrics provided a scientific means of sorting people by numbers, ultimately shaping people’s perceived worth and fostering certain racial stereotypes. However, due to their research focus, the experiences of different ethnicities with intelligence testing have often been understood within the framework of American racial politics, in which Caucasians, African Americans, and Southern and Eastern European immigrants occupy the center. Recently, scholars have begun to explore East Asians’ encounters with this Western psychological instrument. Their studies on Western measurements on Chinese subjects reveal that, with the increasing presence and competition of East Asians on the global stage, intelligence testing served as “the defence of existing hierarchies of race and ethnicity.”⁴ Nevertheless, a story from the side of the Chinese researchers has yet to be adequately told. Echoing recent academic collaboration in shifting psychometric practice from the metropoles and situating its uses in local contexts, this article is intended to uncover the voices of those Chinese psychologists.⁵ Instead of being passive receivers, they questioned their Western counterparts’ characterization of Chinese mentality. Meanwhile, they also appropriated testing to address domestic issues related to meritocracy and debated its value for Chinese society.

Through the examination of Chinese mental testers’ practices, this history also aims to revise a standard narrative of the Chinese development of measurement given by psychologists. This narrative portrays the history of intelligence testing in China as a collection of various achievements, from ancient inventions of the tangram and the imperial examination system to contemporary standardized IQ tests. The Republican era (1911–49) is usually depicted as an in-between stage occupying the least space in their accounts.⁶ This article, by contrast, argues that the Republican era was a critical turning point and deserves more attention. Its significance does not lie in how successful the achievements were, but in how Chinese people started deploying testing to separate themselves from the traditional world and started taking part in the global enterprise of psychology. While the standard narrative focuses on the outcomes, this history highlights the bumpy process in which psychologists were struggling to adapt and indigenize intelligence testing. As scholars have noted, long before the flourishing discussion about indigenous psychology in the Sinophone world since the mid-1970s, Chinese psychologists in the 1920s and 30s had already delved into the matter of indigenization, but their efforts were largely limited to the pages of psychology journals.⁷ Among these efforts, adapting intelligence testing was arguably the most conspicuous example. Almost every mental tester at the time recognized that it was impossible to directly emulate the tests from the West. They therefore called for the construction of Chinese versions of tests, a process which included translating original questions, designing new questions, and creating norms of the Chinese population. Nevertheless, these tests did not gain widespread adoption after publication, and many of them were discontinued within a few years. 
This article thus provides a close examination of this phenomenon and asks: What factors hindered the indigenization of intelligence testing and prevented its wider use?

Most of the factors identified in this article were related to psychologists’ inability to manufacture uniformity in a fragmented China. The relationship between the modern state and modern science has frequently been described as being mutually reinforcing. But what about practicing science in a fragmented state? How did the conditions of the newly founded Republic of China affect the trajectory of psychometrics? Since imperial times, China had been characterized by huge geographical diversity, with each region or province having its own customs, languages, and cultural orientations. In the early twentieth century, as China began to vigorously pursue modernization, the uneven development between urban and rural areas further widened the gap in people’s lived experiences and their understandings of the world. After the end of dynastic rule in 1912, China was divided by warlords. These regional power holders governed their domains in their own ways and constantly fought against each other. Even after the Nationalist Party ostensibly reunified China in 1928, the control of the central government remained restricted, and the Japanese invasion during the 1930s and 40s further divided the country and impeded the realization of many modernizing projects.

Scholars have recognized that state capacity was an important dimension in explaining the fate of a certain kind of science in modern China.⁸ The case of intelligence testing can illuminate this aspect in especially revealing ways, since the fragmentation was not only an external factor influencing psychologists’ activities but also a technical factor challenging the intrinsic logic of psychometrics. The prerequisite for effective psychological measurement is the maintenance of “uniformity.” As Lewis Terman, the pioneer of American mental testing, noted in his widely referenced book The Measurement of Intelligence, “much attention was given to securing uniformity of procedure” when he was constructing the Stanford–Binet Intelligence Scale. Intelligence testing is a population-based science that deals with variations, unlike natural sciences, which generate knowledge in a laboratory or a selected field, or through modeling or postulation; or some human sciences that predicate their arguments on in-depth case studies. Thus, variations in intelligence testing, ideally, only make sense when the whole practice is done on a uniform basis. The objective construction of tests requires recruiting a large number of subjects from multiple sites who follow the same procedure, and these subjects “should be as nearly as possible representative of the several ages.”⁹ When tests are put into actual use, the examiners and test-takers are also expected to perform in standardized ways to avoid inconsistent evaluation. More importantly, since the procedure is initially operated within a targeted population, the uniformity approximated in one place by no means guarantees that it can be directly transplanted to another context. Possibilities are always there that the original design is not compatible with new social and cultural circumstances. 
As such, it is not hard to imagine that a fragmented state like early-twentieth-century China would pose significant challenges to the imperative of uniformity.

This article begins by exploring the causes that motivated Chinese psychologists to enthusiastically embrace intelligence testing. It traces the intersection of psychologists’ academic interests and their personal experiences, and describes the institutional and ideological backgrounds in which intelligence testing was embedded. The next section moves on to the process in which psychologists began to adapt the tests to the Chinese context in the early 1920s and the difficulties they encountered. To achieve exactitude, testing required multilateral coordination and standardization from construction to actual use. However, the fragmented conditions of Republican China to a greater or lesser extent undermined the validity of testing. This led to the fading of optimism among the psychology community in the late 1920s and 30s. The final section therefore examines the shift of attitudes toward intelligence testing and the critiques raised by several influential Chinese psychologists against its legitimacy. The rise and decline of intelligence testing demonstrates not only the elevation of measurement to epistemic authority in modern times, but also how its promise was undermined by a diverse and rapidly changing society.

The necessity of measuring Chinese intelligence

During the first two decades after the establishment of the Republic of China, intelligence testing was introduced to China primarily from the United States. In the early 1920s, the emerging Chinese psychology community held high hopes for intelligence testing. In the inaugural issue of China’s first psychology journal, Xinli, around half of the published articles were related to standardized testing, with intelligence testing being the predominant topic. This fervor, later referred to as the “testing movement” (ceyan yundong), reflected the strong desire of Chinese psychologists at the time to measure the intelligence of the Chinese population accurately. Chen Heqin (1892–1982) and Liao Shicheng (1892–1970) were two pivotal figures in this testing movement. They were among the first generation of psychologists who returned to China after studying in the United States. Their influence on the early development of psychology in China was profound, and their coauthored book, The Methods of Intelligence Testing (Zhili ceyanfa), published in 1921, was the first comprehensive work introducing intelligence testing to Chinese readers. To a large extent, their accounts were representative of the early attitudes of Chinese mental testers. From their life experiences and academic paths, we can clearly see that two main causes motivated their enthusiasm about intelligence testing: one external, stemming from the stimuli of foreign racism; the other internal, driven by China’s institutional break from the imperial era.

Racial sensitivity was embedded in Chen Heqin’s and Liao Shicheng’s mindset from a very early stage. Chen arrived in the United States in 1914 and spent his undergraduate years studying liberal arts at Johns Hopkins University. In 1917, he entered Columbia University Teachers College to pursue a master’s degree in education. Teachers College at the time was one of the most renowned institutions in educational psychology and its faculty included a group of influential psychologists, such as John Dewey, Edward Thorndike, and Robert Woodworth. It was also the main training place for Chinese students in educational psychology, educational theories, and educational administration. It has been estimated that nearly half of the theses related to the field of education written by Chinese overseas students between 1914 and 1960 came from Teachers College.¹⁰ During those years of learning, Chen witnessed racial prejudice and hostility. He wrote in his autobiography that he had never forgotten the humiliation he suffered two days prior to his departure from China when a white foreigner punched him in the chest as he blocked his way. He at that moment thought he would face far more humiliation after setting foot in a foreign country. When his friend in Baltimore tried to find a student lodging house, he was repeatedly rejected solely because of his Chinese identity. Chen also found that not only Americans but Chinese immigrants themselves held misconceptions and stereotypes of Chinese people. Nevertheless, Chen’s academic training also transformed some of his own ideas about race. He once supposed African Americans to live with a low cultural standard, but after he joined an investigation trip to the south organized by Columbia professor Paul Monroe, he realized that the manners of the black students were no different from those of the white and East Asian races.¹¹

In 1918, Chen received a PhD offer from the department of psychology and the dissertation topic assigned to him was “The Comparison of Intelligence among Races.” Originally, the plan was to spend six months in San Francisco measuring the intelligence of seven or eight races living there, and then to spend another six months analyzing the data. However, at that point Chen was about to exceed the duration of overseas study allowed by his funder, Tsinghua College, and his application for an extension was delayed by a prolonged administrative process. Meanwhile, he received a job offer to teach at Nanjing Higher Normal School, a leading teacher training institution in Republican China. After consideration, he decided to give up this race psychology project and returned to China in 1919.¹²

Like Chen, Liao Shicheng encountered several unusual social interactions that had to do with his Chinese identity. He recounted in a memoir how he was misidentified as a “Jap” by his Western classmates and how he was tested for his language competence by a customer when he had a part-time job in a restaurant. When the Treaty of Versailles led to the transfer of the German holdings in Shandong to Japan, he felt especially furious about the indifferent response of a Westerner with whom he conversed. “I don’t know how many times my anger was provoked during the time abroad,” he said.¹³ Yet, in terms of his early academic career, Liao Shicheng was more fortunate than Chen Heqin. He went to Brown University in 1915 to study with the educational psychologist Stephen Colvin. Later, he obtained a teaching position at Nanjing Higher Normal School in the same year as Chen Heqin, but before this he had already finished his data collection in Rhode Island, allowing him to write up his PhD dissertation while working in China. He eventually graduated in 1921 and his topic was the measurement of moral senses and its correlation to intelligence. Although racial comparison was not the main theme of the research, Liao noticed that the subjects from an Italian district had the lowest score, which he was inclined to explain away by their difference in English proficiency.¹⁴

Both Chen’s and Liao’s psychometric research involved the analysis of racial factors, which reflected the prevalence of race psychology in the West, and they were well aware that the Chinese mentality could not be exempt from interracial comparisons. During that period, there was a popular view that the Chinese mind was distinct from and less rational than the Western mind. Max Weber, for example, famously claimed that “Chinese thought has remained rather stuck in the pictorial and the descriptive. The power of logos, of defining and reasoning, has not been accessible to the Chinese.”¹⁵ Psychologists in the United States and Canada further added purported numerical evidence to these contrasts. They measured Chinese students’ intelligence in the “Oriental School” and other semi-segregated public schools or conducted research in China.¹⁶ Despite the variability of results in different tests, they flagged up certain tendencies: the Chinese were deemed good at rote memory and arithmetic, whereas they were poor at abstract thinking and logical relations.

For Chen, Liao, and many other Chinese mental testers, Chinese mental characteristics should not be solely judged by foreigners. If the Chinese could successfully implement intelligence testing in China, they could generate more solid “scientific facts” to verify or refute foreign perspectives and gain a better understanding of their racial position in the world. Chen and Liao, in their coauthored textbook, thus claimed that the study of intelligence between the white and the East Asian race could predict the future development of East Asian peoples. They added that comparing intellectual differences within the East Asian race would help to explain why the Japanese were able to hold sway over Asia. In addition, tests could verify whether the Northern Chinese or the Southern Chinese – and from which province – were smarter.¹⁷ Similarly, another American-trained psychologist, Lu Zhiwei, declared:

When foreigners say that Chinese characteristics are like this or like that, we always blame them for making statements based on subjective prejudice. Taking intelligence as a case in point, according to foreigners’ statements, some consider that the IQ of the overseas Chinese is higher than that of natives, others claim that Chinese students are good at remembering but poor in comprehension. Whether their sayings are right or wrong is secondary. Can we ourselves use facts to correct their arguments?¹⁸

But apart from confronting racism, solving Chinese institutional problems was an even more pressing issue. Both Chen and Liao left the United States in a rush because of China’s urgent demand for educational expertise. China had abolished the thousand-year-old imperial examination system in 1905 in an attempt to fundamentally modernize the country. This essay-based system was blamed for placing excessive emphasis on classical learning while neglecting the practical needs of the country. It also left candidates with limited vocational skills, as their primary aspirations were centered on personal recognition and securing official positions.¹⁹ Reformers therefore called for replacing it with new schools that were expected to provide universal and useful education to Chinese people. However, the immature new education created as many problems as it wanted to solve: not only did new schools widen the gaps between rich and poor, urban and rural, but they were often criticized for their lack of standards due to frequent changes in teaching materials and the lack of flexibility to adjust for individual learning needs.²⁰ More importantly, to replace the essay assessments allegedly based on examiners’ qualitative and subjective comments, Chinese reformers had to figure out a new way to properly assess merit.²¹

Chen and Liao’s expertise in psychometrics quickly gained the attention of Chinese reformers, who saw testing as the most immediate means to assist in this tumultuous transition.²² Chen and Liao, along with other experts in intelligence testing such as Zhang Yaoxiang (master’s in psychology from Columbia in 1916) and Lu Zhiwei (PhD in psychology from Chicago in 1920), were invited to join the Chinese National Association for the Advancement of Education (Zhonghua jiaoyu gaijinshe, hereafter CNAAE), the largest unofficial educational organization consisting of reform-minded educators in the country. While the warlord politics had weakened the central authority of the Ministry of Education, the Association played a significant role in shaping Chinese education. One of the leaders of the Association was Guo Bingwen, the president of Southeastern University (formerly Nanjing Higher Normal School), who had recruited Chen, Liao, and Lu to return to China for teaching and was now keen on supporting their research on testing. In conjunction with their universities, CNAAE provided Chinese mental testers with essential funding, mainly derived from membership fees, donations, and government subsidies. After 1924, CNAAE secured further funding from the China Foundation for the Promotion of Education and Culture (Zhonghua jiaoyu wenhua jijinhui), an American-supported foundation financed through the Boxer Indemnity.²³

While the old merit system in China had focused on the selection of political and social elites through multiple levels of examinations, intelligence testing was advertised by these psychologists as being able to create a “democratic” merit system via universal testing. Their ideal was to provide “equality of opportunity” for all Chinese people, allowing them to obtain appropriate education and the occupations they deserved with the help of psychometrics. Chen and Liao claimed that intelligence testing could benefit school administration because it offered scientific standards for class grouping: instead of relying solely on age, the testing results indicated the level of ability and difficulty of learning, therefore serving as a reference for determining whether students matched their current grades and whether they deserved special education. It also offered students who had been assigned their places a chance for self-inspection. By comparing their intelligence test scores (a reflection of mental ability) and their school achievement (a reflection of the outcome of learning), students would know if they had reached expected standards and if they had made appropriate efforts. Furthermore, intelligence testing was considered to be of great use for vocational guidance. According to Chen and Liao, job selection in the past had relied heavily on personal recommendations from given social networks, often resulting in unfitness for work and therefore hindering the progress of China’s industrial and commercial sectors. By contrast, intelligence testing could enable employers to make objective decisions regarding hiring and promotion, and prevent employees from wasting their talents in unsuitable career choices.²⁴ In the time of warlordism, Chinese mental testers’ vision for democratic meritocracy to an extent strengthened the reformers’ belief that they were doing what Barry Keenan calls “producing and sustaining the conditions for a democratic polity.”²⁵

In the background of confronting racism and solving institutional problems, there existed a wider ideological context of rescuing the Chinese from imprecision and arbitrariness. Starting from the nineteenth century, China was criticized by Westerners for lacking state-wide standards for language, currency, and units of measurement, and its people were derided for their ignorance of certain basic numerical knowledge. The American missionary Arthur Smith, who spent more than half a century in China, characterized Chinese people as having a “disregard of accuracy.” He claimed that in China, “the regulation of standards is a thing which each individual understands for himself” and the Chinese “can ill comprehend the mania which seems to possess the Occidental, to ascertain everything with unerring exactness.”²⁶ This then became part of Chinese intellectuals’ self-critique of the “national character.” The eminent scholar Hu Shih, in his satire, coined the word “Mr. Close-Enough” to denote the ugly Chinese habit of being ambiguous.²⁷ For Chinese proponents of intelligence testing, it was numbers that offered Chinese people a chance to possess clear and exact knowledge about their own minds: only through numbers could one exactly describe and analyze intellectual ability. In his authoritative textbook Psychological and Educational Measurement, Wang Shulin, a professor of psychology at National Central University, drew an analogy between “the ruler for measuring physical objects” and “testing for measuring mental quality.” He praised intelligence tests for making the matter of description “clear and simple” since their advent.²⁸

Another scholar, Zhang Yaoxiang, a professor at Peking Normal College, also believed that intelligence testing symbolized an objective future in opposition to the arbitrary past. In an article on the origin of intelligence testing in the inaugural issue of Xinli, Zhang set up a stark contrast between “assertion by arbitrariness” (wuduan) and “determination by law” (fading). For him, the former was the consequence of relying on subjective descriptors. He condemned terms such as “wonder child” (shentong), “the child is cultivable” (ruzi ke jiao), and “rotten wood cannot be carved” (xiumu buke diao), which had been widely used by Chinese people to describe talent in the past thousand years, for lacking consistent connotations. The ways and occasions in which these words were spoken were so random and varied from one interlocutor to another that there was little consensus. By contrast, “determination by law” referred to an objective judgment that “everyone knows what it means and agrees with it,” best exemplified by “units of measurement” or “legal codes determined by the public.” This principle, according to Zhang, was what research on intelligence testing aimed to achieve.²⁹ In other words, what made the current psychometric method different from the past was that it was based on a law – a kind of commensurability – that was recognized by every citizen, rather than on a single person’s verdict.

Taken together, the ideological and instrumental concerns permeating the newly founded Republican state created a niche for the rise of intelligence testing in the 1920s. An exact measurement of Chinese intelligence not only promised to benefit Chinese educational reform, but also helped Chinese people understand their racial advantages and shortcomings without reliance on foreigners’ generalization. Such an attempt also proved that China was entering an era of exactitude in virtue of quantitative scientific methods. In the years after the first generation of psychologists returned to China, testing became a compulsory subject in most education and psychology departments. College students were dispatched to elementary and middle schools in Beijing, Tianjin, Nanjing, and Shanghai, among other locales, to undertake internships or assist in testing research. A few universities, such as Nanjing Higher Normal School and Yanjing University, also included intelligence tests in their entrance examinations.³⁰ Nevertheless, as they moved from aspirations to actual practices, psychologists soon discovered the discord between this new psychological instrument and the society in which it was embedded. The cultural and political conditions in early-twentieth-century China presented a multitude of challenges to the progress of intelligence testing.

The challenges of standardization

The period between 1922 and 1923 was a critical moment in the development of intelligence testing in China, and also a moment when the challenges posed by a fragmented state were systematically revealed. Since intelligence testing was introduced, Chinese psychologists had become increasingly aware that Western tests were far from suitable to be directly translated and applied to the Chinese context. To accurately measure Chinese intelligence, they needed to adapt test questions into Chinese versions or design new tests, and to establish norms specific to the Chinese population.

Chinese psychologists thus embarked on the projects of indigenizing testing. After conducting some preliminary trials, they realized that it would be immensely helpful to have a more experienced psychometrician to advise on their work. Therefore, at the first annual meeting of the CNAAE in July 1922, a group of psychologists, including Chen Heqin and Zhang Yaoxiang, proposed to invite a foreign expert to China to instruct on the construction of nationwide intelligence tests.³¹ After only two months’ contact and preparation, they had the American psychologist William McCall in China.³² McCall had taught at Teachers College since 1915 and, hence, had a close connection with Columbia-trained Chinese intellectuals.³³ Many of them believed that this transpacific collaborative effort would accelerate the advance of intelligence testing in China.³⁴

McCall’s arrival indeed provided a strong boost to Chinese mental testers’ confidence at the outset. He was enthusiastically welcomed wherever he visited and lectured.³⁵ His classroom was always full of hundreds of listeners and his teaching style was inspiring. One member of the audience recalled being impressed by “his energetic spirit, robust physical strength, extremely different from those dying Chinese professors.”³⁶ During his visit, McCall spent six months staying in Beijing and another six months in Nanjing, working with Chinese psychologists based in these two areas to construct intelligence tests and educational tests that suited the Chinese context.³⁷ The construction of tests in a new context involved two major steps to reach standardization before moving on to wider use. First, a set of questions should be tested on a select group of students to determine which questions were a good fit for a particular age group. If a question was too easy or too difficult for students of that age to answer, then it should be excluded from the test. Second, a test norm should be set up according to the distribution of the test-takers’ scores, which would signify the typical performance of that population. Once the norm was obtained, psychologists would have a means to interpret and compare degrees of intelligence.³⁸ However, from the accounts of McCall and Chinese psychologists, it becomes clear that the process of standardization in China encountered a number of serious challenges that hardly existed in the original Euro-American context.

The first challenge psychologists in China encountered was the antagonism between regions. When McCall’s first official meeting with Chinese psychologists took place in Beijing, he recorded, “the armies of North China and Central China were fighting around the walls of Peking [Beijing].”³⁹ This politically fragmented condition made nationwide scientific projects seem impractical and sometimes even dangerous. Many foreign experts around the same period had expressed their concerns and hesitancy. For example, the China Medical Board supported by the Rockefeller Foundation reported its discouragement of developing a large-scale public health work due to “frequent changes in the government.”⁴⁰ Foreign archaeologists risked being accused of spying when they traveled across different provinces.⁴¹

Fortunately for McCall and his Chinese colleagues, key regional power holders widely recognized the importance of using science to deal with educational problems. The previous tours of two of McCall’s colleagues at Columbia, John Dewey and Paul Monroe, only a few years earlier had enjoyed great popularity. Many warlords showed their willingness to converse with the educational experts from the United States.⁴² Even though sometimes they were just paying lip service, at least they showed less hostility toward this kind of region-crossing activity. Yet McCall was still keenly aware of the potential threat of this in-between state, as he wrote, “the greatest danger that the measurement program faced was that it would enlist the support of one geographical section of China and the consequent antagonism of all other sections.” He also insisted that his Chinese colleagues should make final decisions themselves, in case the project was accused of receiving “undue foreign influence.”⁴³ McCall’s worries were not ungrounded. Considering that his host, the CNAAE, was an underfunded organization, this nationwide project was undoubtedly a heavy financial burden. When they ran out of funds halfway through it, they had no choice but to accept the help of General Qi Xieyuan, a warlord of the Zhili Clique who had been assisting educational affairs during his governance of the Jiangsu area.⁴⁴ McCall’s one-year plan was accomplished thanks to General Qi’s endowment, but the close relationship between General Qi and these Chinese education experts eventually led to one of the key members, Guo Bingwen, being forced to step down from his post as President of Southeast University in 1925 owing to political manipulation by the Nationalist Party.⁴⁵

While political conflicts threatened a nationwide project institutionally and financially, the educational rivalry and cultural differences between regions further complicated the pursuit of standardization. Since the universities in Beijing and Nanjing were major sites for constructing the tests, the competition between their students and staff caused by the distinct intellectual atmospheres of the two cities was a sensitive issue. As Ming-Fui Pang suggests, the southern and northern academic networks stood for very different ideological orientations, the influences of which lasted even longer than the political divisions. While intellectuals in Beijing were much involved in the New Culture Movement, advocating popular culture, Marxism, and Dewey’s experimentalism, academics in Nanjing often criticized the agenda of the New Culture Movement, upholding elitist culture and Irving Babbitt’s New Humanism.⁴⁶ This contrast had shaped the identities of the members of both groups to a great extent, and perhaps nothing can better exemplify this than the prominent historian Guo Tingyi’s recollection of his experience of choosing which university to attend. Guo was a student at Southeast University High School in Nanjing. He obtained high scores on intelligence tests and deeply impressed the director, Liao Shicheng. One day, when he talked to Liao about his plan to study undergraduate courses at Peking University, Liao showed disapproval, saying: “You should reconsider it. The academic climate in Beijing is not good. The studies here are more solid.” His fellow townsmen also warned him that Peking University was in turmoil, was full of petitions and fights, and did not have any discipline.⁴⁷ This binary tone, which centered around the differences between north and south and favored one over the other, was common during this period.⁴⁸

This competition between China’s two great cities became a significant problem when decisions had to be made about which party would set standards. The creation of standardized terminologies was doomed to compromise. McCall’s solution was to have Nanjing review the terminology for measurement and statistics and have Beijing standardize the psychological terminology.⁴⁹ Obviously, these decisions were largely expedient. One of the participants expressed his unease and lamented the unlikeliness of gathering psychologists from northern and southern China to discuss their opinions about translation.⁵⁰ The result was that after the terminology lists were released, they were frequently challenged and had no prescriptive power. For example, when Zhu Junyi at Southeast University published his Chinese Translations of Terminology in Statistics and Measurement in 1923 at the request of McCall,⁵¹ another scholar, Jin Guobao, questioned the appropriateness of several translations of statistical terms, such as series, probability, and quartile deviation.⁵² The terminology of measurement had a similar fate. Even more than ten years after Zhu’s publication, Chinese mental testers were still drafting lists of translations of key terms.⁵³ It was not until 1937 that the first de facto nationwide standardized edition of psychological terminology was officially announced.⁵⁴

Worse still, ineffective communication happened not only among psychologists but also between examiners and test-takers. One basic assumption of intelligence testing in its early years was that intelligence was closely correlated with mental processing speed.⁵⁵ Many kinds of intelligence tests required subjects to answer each part of the questions within a very limited timeframe without delay. In this regard, following instructions accurately and instantly became a requisite for measurement, and it is clear that this requisite could not be fulfilled if examiners and test-takers did not share a common language. McCall and Chinese psychologists could not find a suitable way to deal with the problem of dialects. The dialects in different parts of China were so different that mutual intelligibility could hardly be achieved, a situation that many reformers saw as an obstacle to national integration. Despite efforts by Republican elites to promote the national language – Mandarin – it took a long time for ordinary people to familiarize themselves with this alien language.⁵⁶ Moreover, psychologists often faced a dilemma in choosing proper examiners. On the one hand, there were well-trained examiners from universities, but they were located in major cities a hundred miles away from the survey sites and their spoken language seemed foreign to local test-takers. On the other hand, local examiners spoke the dialect fluently, but few of them had received adequate training in conducting the tests.⁵⁷

In addition to phonological issues, the structure and vocabulary of dialects also hampered comprehension. Take the Chinese version of the Binet-Simon Intelligence Scale, for example. It was first revised by Lu Zhiwei in 1924 as part of the collaborative project with McCall. During the revision, Lu recruited about 1,400 students in the Jiangsu and Zhejiang regions to standardize the scale, and the outcome seemed quite satisfactory. However, when psychologists tried to apply it to other regions in subsequent years, they found that some words and sentences in the scale were too regionally specific to be understood by test-takers there. As a result, Lu Zhiwei and his colleague Wu Tianmin had to spend another five years in the 1930s making the second revision. This time, they replaced the phrases in Jiangsu and Zhejiang dialect with the language of Beijing and believed that the newly constructed items were more representative and thus more universal.⁵⁸ It was true that this revision made the scale work much better in their investigation in Beijing. But did it really yield equally satisfactory results in other regions? The question remained unanswered. As Wu Tianmin vaguely conceded, although they had initially planned to include some southern regions in their investigation, they eventually called it off due to “various difficulties.”⁵⁹

To ameliorate the influence of language, some psychologists pinned their hopes on nonverbal tests, which primarily rely on diagrammatic and pictorial questions. The Liu-Bradshaw Non-Language Intelligence Test represented this type of indigenous effort. It drew inspiration from various Western tests such as the Army Beta and Pintner’s Non-language Test, but the images were adapted in patterns familiar to Chinese students, including Chinese faces, hairstyles, and clothing, as well as depictions of common animals and activities.⁶⁰ Yet this strategy still had to face another significant and frustrating problem – age reckoning. Since intelligence was measured by comparing test-takers’ performance to the specific age group to which they belonged, inaccurate information about age risked skewing the norm for each age group and further distorting the interpretation of test results.⁶¹ People in China had long been using lunar dates to mark important life events and schedule their daily and ritual activities. Even after the Republican government officially adopted the solar calendar as the natural accompaniment to the new political system, a great many ordinary people still stuck to the lunar calendar and constructed their lives in accord with the old customs.⁶² They counted age not by full years as in the Western practice, but saw a baby at birth as being one year old and becoming two at the lunar new year.⁶³ This meant that, on the one hand, Chinese children’s age counted in a traditional way was at least a year in advance of their Western counterparts; on the other hand, two children born just a few days apart would appear to have a one-year gap if the lunar new year was between their birthdays.
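The arithmetic behind these two consequences can be worked through in a short sketch. It implements only the single rule stated above (one year old at birth, one year added at each lunar new year) and compares it with age in completed years. The function names, the two birth dates, and the testing date are hypothetical, and the lunar new year dates are supplied as assumed inputs that should be checked against a calendar before being relied upon.

```python
from datetime import date


def western_age(birth, on):
    """Age in completed years on the given day."""
    years = on.year - birth.year
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1
    return years


def nominal_age(birth, on, lunar_new_years):
    """Traditional reckoning as described in the text:
    one at birth, plus one for every lunar new year passed since birth."""
    return 1 + sum(1 for d in lunar_new_years if birth < d <= on)


# Assumed lunar new year dates for 1921-1924 (illustrative inputs).
LUNAR_NEW_YEARS = [
    date(1921, 2, 8),
    date(1922, 1, 28),
    date(1923, 2, 16),
    date(1924, 2, 5),
]

# Two hypothetical children born five days apart, on either side of a new year.
child_a = date(1921, 2, 5)
child_b = date(1921, 2, 10)
test_day = date(1924, 6, 1)

ages = {
    "a": (western_age(child_a, test_day), nominal_age(child_a, test_day, LUNAR_NEW_YEARS)),
    "b": (western_age(child_b, test_day), nominal_age(child_b, test_day, LUNAR_NEW_YEARS)),
}
```

Under these assumptions both children have the same Western age, yet their nominal ages differ by one year, and each nominal age runs ahead of the Western figure. An examiner who recorded the reported age without knowing which reckoning lay behind it would therefore place the two children in different age groups.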

Almost every textbook and instructional article on intelligence testing in the Republican period emphasized the necessity to clarify the test-taker’s age. These pieces gave detailed accounts about how traditional Chinese age was distinct from Western age and recommended the methods of converting the traditional age to Western age.⁶⁴ However, the actual situation was not as ideal as the written words conveyed. In fact, traditional age and Western age were just simplified terms. According to a Chinese author’s survey, there were at least five or six kinds of age-calculating systems prevalent in different parts of China: some adopted a parallel system that counted age both in nominal years and in full years, whereas others set various dates other than the lunar new year for adding a year to their age. In other words, using the reported age to derive the actual age was full of uncertainty. And this became even more perplexing when a group of test-takers came from multiple provinces, a phenomenon that was quite common in large cities due to internal migration for better education and opportunities.⁶⁵

Sometimes, psychologists dispatched a questionnaire to children’s parents for their birth dates and then double-checked with information from the children themselves and the school records. Nonetheless, they still despairingly admitted that:

[A]lthough age is so important, it is pretty hard to obtain a reliable age. Younger children tend to be unclear about their age, while older children are unwilling to tell their actual age. It is even harder when it comes to precise birth dates. Not only do children usually speak ambiguously, but their parents have sometimes forgotten it as well.⁶⁶

Moreover, psychologists’ judgment was from time to time confused by chaotic school starting ages. Although the Republican government implemented a series of educational reforms, many children still entered school at diverse ages, resulting in a significant age difference – sometimes up to five years – among students in the same grade, which made it difficult for researchers to identify age by grade. There was even a situation in which some school principals were so eager to “appear well” to McCall and his colleagues that they instructed the pupils to “place the standard age for the grade on their test blanks” and changed the school records accordingly.⁶⁷

That said, the collaboration between McCall and Chinese psychologists in the early 1920s was not fruitless. They recruited over 100,000 school children into the project and constructed seven kinds of intelligence test and even more educational tests. Out of these intelligence tests, five were new designs, while the other two were revised from the Binet-Simon Scale and the Terman Mechanical Intelligence Test, respectively. The locally developed tests not only assessed arithmetic and Chinese language skills but also included questions considered common knowledge for Chinese students. These questions covered topics such as the builder of the Great Wall, the origin of the Yangtze River, and the correct terms for relatives.⁶⁸ As for the revised tests, Chinese psychologists referred to various foreign revised versions and adapted the test questions and items accordingly. For example, in the case of the Binet-Simon Scale, its Chinese revision consulted the Herring and Stanford Revisions in the United States. Compared to the original version, the revised Chinese version removed the “Fable Test,” which primarily relied on Aesop’s Fables, which were unfamiliar to Chinese children, and instead added the “Mental Arithmetic Test.”⁶⁹ The most significant change in China, however, was the invention of a new scoring system called T-C-B-F units, intended for use across all tests. This system first calculated the standardized total score, represented as the T score (total ability), and then determined the B score (brilliance) based on age, the C score (classification) based on grade level, and the F score (effort) based on the comparison of educational and intelligence test results.⁷⁰ In other words, under this scoring system, even if the age-related B score was unreliable, other scores could still be somewhat useful to sort and regulate test-takers.

It would be difficult to overstate the symbolic importance of this testing movement in the Chinese history of psychology. But we should not forget that, due to limited resources and the chaotic political situation, this indigenizing work was primarily carried out in a few of China’s most affluent provinces and the populations there could hardly represent the entire country. Take the famous “Reversal of the Hands of a Clock Test.” Students in these urban areas might not have difficulty in drawing a clock, swapping the hour hand and minute hand, and then telling the time, whereas in rural areas clocks were still so rare that this test made little sense to most of the test-takers.⁷¹ Furthermore, these earlier works could not keep pace with the rapidly changing Chinese society. What counted as common knowledge for test-takers evolved so fast in this period that tests became outdated quickly. For instance, in the “Recognizing Coins Test” in the first Chinese edition of the Binet-Simon Scale, students were expected to distinguish small coins (xiaoqian), silver coins (yinyuan), and fractional silver coins (yinjiao). But after the government enacted currency reforms in the 1930s, these coins stopped circulating and were no longer familiar to younger students.⁷²

A major setback occurred only five years after McCall’s departure. Du Zuozhou, a University of Iowa-trained psychologist, applied one of the group intelligence tests to 10,000 students in Jiangxi province. He was disappointed to find that the intelligence scores were no longer a good predictor of the students’ performance in learning. The correlation between the two had become much lower compared to the results obtained in Jiangsu in 1923.⁷³ Additionally, he discovered that the test scores poorly matched teachers’ judgments about the pupils’ talent. After comparing all the data, he concluded that, although the subjective judgments might not be accurate, the validity of the intelligence test was also questionable.⁷⁴ Other studies in the early 1930s also confirmed poor correlations between various intelligence tests as well as between these tests and academic performance.⁷⁵

In addition to problems in educational applications, racial comparative research also faced significant limitations. In 1931, Lu Zhiwei published one of China’s earliest indigenous studies on racial differences in intelligence. He used the Chinese version of the Binet-Simon Scale along with the Pintner-Paterson tests based on American norms to compare the IQs of Chinese and American children. He concluded: “Roughly speaking, therefore, the Chinese children have about the same degree of performance intelligence as American children, insofar as the Pintner norms are adequate for the general American population.” Nevertheless, what Lu referred to as the “Chinese children” might not have genuinely represented the general Chinese population, as he was able to recruit his subjects only from Beijing, mostly from the elementary school affiliated with Yanjing University. Lu admitted that he was unable to perform random sampling as it was what he described as “an impossible step under China’s present educational situation,” and he explicitly stated that the Chinese version of the Binet-Simon Scale had been standardized under “very unfavorable conditions” and its “norms are inadequate.”⁷⁶ Similarly, while other Chinese mental testers tended to claim that Chinese intelligence was not inferior, they recognized that without properly addressing differences in social status and between urban and rural areas, as well as variations in age and grade, mental characteristics among races would remain indefinite and ungeneralizable.⁷⁷

Overall, there existed a substantial gap between Chinese psychologists’ expectations for intelligence testing and the outcomes they later achieved. Inspired by American psychometricians’ idea of universal testing, Chinese mental testers set as their goal early on the construction of nationwide tests. However, it was with this ambitious goal that the development of intelligence testing in early twentieth-century China became particularly vulnerable to political and geographical fragmentation. The lack of state support, the rivalries between regions, the heterogeneity of language, the various ways of age reckoning, and swift yet unequal societal changes, among other factors, collectively presented serious challenges for psychometrics. Even though some of the problems were not unique to China, they became conspicuous once all these factors were combined. As McCall noted, “to think that standardization difficulties are peculiar to China would be a mistake, but they are certainly more numerous there than in the United States.”⁷⁸ The inability to secure uniformity ultimately led to the decline of the predictive power of intelligence testing. The next section will show that as the disparity between expectations and achievements widened, intelligence testing became increasingly susceptible to criticism from other psychological schools.

Questioning the legitimacy of intelligence testing

While the enthusiasm for the nationwide use of intelligence testing rose rapidly thanks to its promise of providing immediate intervention, its waning popularity resulted from the realization that it did not prove as useful as expected. For most frontline school personnel, intelligence testing was an unfamiliar instrument that required them to spend time learning it. However, they did not perceive the predictive, differentiating, and diagnostic powers that intelligence testing claimed to possess. The T-C-B-F system that psychologists once proudly introduced drew particular complaints from schoolteachers, who found it too complicated to understand and operate. After the initial construction work was completed, most schools showed little interest in adopting it into their routine practice. Consequently, many intelligence tests published in the early 1920s went out of print and their circulation ceased within a few years. With the dissolution of the CNAAE in 1926 due to the chaotic warfare between the Nationalist Party and warlords, mental testers also lost a major supporter for further refinement.⁷⁹

Yet a bigger blow to the legitimacy of intelligence testing came from within the psychology community. In contrast to the early 1920s, when mental testing played a major role in discipline formation, from the mid-1920s psychology in China began to experience more diverse development. The return of US-trained psychologists specializing in experimentation, as well as the establishment and expansion of laboratories in universities, provided an opportunity for the advance of experimental psychology.⁸⁰ Several experimental psychologists, such as Guo Renyuan and Wang Jingxi, built their international reputation through English-language publications. The process of professionalization subsequently gave rise to disputes over disciplinary boundaries and approaches. These experiment-oriented psychologists hoped to position psychology alongside physics, chemistry, and biology, advocating the separation of psychology from the fields of philosophy and education. For them, as for many Republican scientists, experimentation was evidently a primary means of demonstrating the scientific legitimacy of a discipline.⁸¹

The dispute over the direction of psychology intensified during the 1930s. The end of the warlord era in 1928 created a relatively stable social and political environment for developing science at large. The founding of China’s first academy of sciences, the Academia Sinica, in the same year epitomizes the Nationalist government’s undertaking to build up national research capability. However, not every scientific discipline enjoyed equal fortune. Due to constant financial shortages, the Nationalist government exhibited a clear preference for the sciences that it thought could “increase productivity and strengthen defense.”⁸² International organizations such as the Rockefeller Foundation were also specifically committed to supporting medicine and agriculture, which they deemed more urgent.⁸³ Psychologists in academia increasingly noticed that psychology received significantly less funding and fewer resources than other scientific disciplines.⁸⁴ Moreover, students who initially pursued psychology as a means to contribute to social reform found no proper job opportunities to apply their expertise.⁸⁵ The dearth of prospects exacerbated the recruitment difficulties faced by psychology departments.⁸⁶ Consequently, Chinese psychologists developed a sense of relative deprivation and began to reflect on why psychology had lost the respect of and support from society. In this context of pursuing professionalization and competing for resources, several leading psychologists attributed their situation to people’s belief that psychology lacked scientific rigor and practical utility. This viewpoint further led them to attempt to redefine psychological approaches and direct their criticism toward the previous investment in intelligence testing. In the 1930s they fostered a perception that intelligence testing was not only ineffective but also unscientific, and therefore unnecessary for China’s progress.

One of the earliest psychologists who harshly criticized intelligence testing was Guo Renyuan. Guo was the founder of the psychology department at Fudan University, which was renowned for experimental psychology at the time. He also served as a professor of psychology at National Central University and later became the president of Zhejiang University. He committed himself to physiological psychology and had strong doubts about the scientificity of psychometrics. Since his PhD study at the University of California, Berkeley, he had held a radical behaviorist stance toward what counted as legitimate psychological research. He defined psychology as “the science which deals with the physiology of bodily mechanisms involved in the organismic adjustment to environment” and contended that only “behavior” – the solely physical and mechanical events which can be objectively observed and quantitatively experimented upon – should be the psychologist’s subject matter.⁸⁷ From 1921 onward, he published a series of articles in English, criticizing mainstream psychologists’ obsession with “instinct,” an idea that quite often referred to an innate or inherited tendency. His highly polemical writing style attracted considerable attention from his American peers. The main cause of his objection was that “instinct” and the related concepts were so vaguely defined that they mixed up a variety of reactions and responses. For him, sticking to this kind of all-encompassing yet obscure idea only effaced the complexity of human reaction systems rather than elucidating them. If psychologists did not abandon this tradition, their work would remain subjective and closer to a priori reasoning than to experience.⁸⁸

In 1929, Guo published a monograph in China, extending his criticism to intelligence testing. He complained that Western psychometricians lacked consensus on the nature of intelligence, and their views were mostly confusing; however, he asserted that these psychometricians all advocated that intelligence is hereditary. He then argued that it was nonsense to speak of heredity in psychology because all these assumed heritable mental traits came from statistics instead of experimentation. In his opinion, if psychologists only relied on statistical investigation but were unable to verify the causation through the laboratory process, their practice contained little scientific value and their conclusions violated the spirit of science.⁸⁹ He further castigated the key figures in the psychometric field as “charlatan psychologists” (yeji xinlixuejia), saying:

[T]hose charlatan psychologists – the so-called intelligence testers or mental testers – are spending every day doing mathematic work. They seem to believe that if they calculate these numbers more times, the principle of mental heredity can finally jump out from these numbers. Galton, Thorndike, Wood, Weeke, Goddard, Terman, and so on are the leaders of this group of charlatanic psychologists. They are skilled in manipulating their abacuses – the so-called statistics – but the more they calculate, the more muddled their brains become.⁹⁰

Although Guo did not criticize his Chinese colleagues, his bitter comment on the theoretical problems of their mentors – Thorndike and Terman in particular – undoubtedly constituted a direct challenge to Chinese mental testers.

In defense of intelligence testing, the psychometrician Chen Xuanshan, who obtained a PhD in psychology from Columbia University and taught at Daxia University, published a response titled “What Attitude Should We Take toward Intelligence Testing?” in Educational Reconstruction in 1931. In contrast to Guo’s focus on the scientific basis of psychology, Chen, in the article, employed a defense strategy of decoupling applications from theoretical disputes. He argued that even though “psychologists have no consistent claim about intelligence, the diversified definitions do not hamper the conduct of testing.”⁹¹ His reasoning was that, inasmuch as psychologists held a “relatively clear idea” about intelligence, they could just carry out their measurements accordingly and these measurements could in turn help them clarify the definition. To him, it was unnecessary for the researchers who used intelligence testing for instrumental purposes to get to the bottom of the definition or the nature of intelligence – the attitude toward it should be taken based on whether it was able to handle practical affairs. Chen’s argument indeed represented the typical stance of Chinese mental testers. Earlier in the 1920s, Chen Heqin and Liao Shicheng had already stated in their book An Outline of Testing that “the explanations [of intelligence] by various scholars are rather inconsistent. Hence it is clear that ‘what intelligence is’ is a theoretical question. Our aim in constructing intelligence tests is to achieve practical effectiveness. As for the theoretical issues, we can just leave them to the future.”⁹²

This prioritization of application over theory nevertheless gradually lost its appeal as the effectiveness of testing fell short of expectations. Chen Xuanshan even failed to persuade his students. A few months later, a high school teacher named Hu Feng submitted an article to the same journal to challenge his former professor. In opposition to Chen, Hu insisted that psychologists should prioritize the question of whether intelligence is innate and hereditary or acquired and nurtured. He claimed that if this problem were not resolved, then intelligence testing would remain suspect, and, by extension, the solutions to educational issues provided by test results would be fundamentally shaken. He based his argument on a study by the American educator Carleton Washburne, who found that the levels of students’ intelligence scores did not align well with the degrees of progress at a later stage. He thus took this failure of prediction as evidence to challenge his teacher’s instrumentalist means of legitimation. For him, an application without a proper theory only produced contingent effects; a correct theory was the root of all facts.⁹³ And, apparently, such a theoretical foundation was considered to be underdeveloped.

Other senior psychologists also joined the ranks of criticism, and among them, the most ardent opponent of testing was Wang Jingxi. Wang earned a PhD from Johns Hopkins University in 1923 and became a professor of psychology at Peking University, and in 1933 he was appointed as the director of the Institute of Psychology at Academia Sinica, the most prestigious research institution in China. Around the time he assumed this important position, Wang published an article titled “The Future of Chinese Psychology,” expressing his stance on the development of psychology in China. On the one hand, he praised Guo Renyuan’s experiments on animal behaviors and the emerging field of industrial psychology as correct models that Chinese psychologists should follow. On the other hand, he condemned Chinese educational psychologists’ testing agenda as a dead-end path that initially sparked young people’s enthusiasm but eventually caused disappointment and proved ineffective in reforming society. Blunter than Guo, Wang directly complained about his colleagues. He claimed that while American mental testers had already faced harsh scholarly critiques, their Chinese pupils were even less successful in measurement; worse still, they insisted on going down a blind alley and refusing to turn back. He was especially unhappy with Chinese testers’ work in Xinli in the 1920s, characterizing its statistical methods as “lacking of general knowledge” and “extremely hilarious,” and accusing it of being responsible for the declining reputation of psychology in China. As Wang found no positive impact of testing on education, he asserted that it was a waste of government funds to invest in this area of research.⁹⁴ Consequently, after he entered Academia Sinica, testing disappeared from the research agenda of the Institute of Psychology.⁹⁵

Even some allies of the testing camp were not willing to support intelligence testing. Wang’s criticisms soon received a response from Pan Shu, who was a professor of psychology at National Central University – a major hub for testing research.⁹⁶ Although Pan was not a specialist in psychometrics, he was a founding member of the Chinese Society of Testing (Zhongguo ceyan xuehui).⁹⁷ Established in 1931, this society aimed to revive the waning testing movement. It received subsidies from the Nationalist Party and was intended to help the government improve examination methods.⁹⁸ In his response, Pan gave completely different evaluations of intelligence testing versus educational testing. He asked Wang to clarify what he meant by the word “testing.” If Wang’s criticism only applied to intelligence testing, he would totally agree with him because he was “also personally suspicious about intelligence tests, not for other reasons but for the fact that up to the present I still cannot figure out what so-called intelligence really is.” However, Pan disagreed that educational testing was also at fault. In his view, educational testing, unlike intelligence testing, had clear and specific targets, which made it practicable and promising.⁹⁹ This was not the first time Pan had denounced intelligence testing. In a 1932 article in the official journal of the Chinese Society of Testing, Pan had already criticized the term “general intelligence” for its lack of a referent. “We say the words, but we have no idea what we are really talking about.”¹⁰⁰ For him, the prerequisite for applying testing was to delimit its targets to specialized domains, such as assessing one’s typing speed or task management abilities. The more specific the trait that the test targeted, the more reliable the result was. In the following years he repeatedly emphasized the correlation between specificity and exactitude and reminded his readers “not to trust intelligence testing so easily.”¹⁰¹ By denouncing the concept of “general intelligence,” one that had occupied the center stage of many mental testers’ work since the time of Alfred Binet, Pan delegitimized the attempts to define Chinese mental ability by intelligence testing.¹⁰²

Evidently, intelligence testing in the 1930s found itself caught in a conundrum: it was criticized as unscientific because it lacked physiological and experimental bases, and it was considered less practically valuable than industrial psychology and educational testing. This is not to suggest that intelligence testing lost all its advocates. Several psychologists, like Lu Zhiwei and Xiao Xiaorong, continued to deepen their research on intelligence testing, and some schools and organizations in urban areas still utilized it. However, mental testers at this point had largely lost their optimism toward developing nationwide intelligence tests and began to acknowledge that the promotion of testing had reached a bottleneck. They could not even guarantee that the majority of intelligence tests were of satisfactory quality or that their results could represent Chinese children in general.¹⁰³ The challenges in standardization and the hostility within the psychology community formed a vicious cycle. The neurologist Lu Yudao once commented that intelligence testing had been subjected to unfair accusations, leading many people to hesitate about its application.¹⁰⁴ Mental testers also complained that the lack of application eventually prevented them from improving the tests. Many of them developed a feeling of nostalgia for the early 1920s, when people across the country had worked together and eagerly anticipated intelligence testing’s successful performance.¹⁰⁵

Conclusion

The outbreak of the Second Sino-Japanese War created opportunities for mental testers in the 1940s to experiment with intelligence tests on small groups of military officers and to use tests for screening police candidates, but wartime situations such as mass migration, printing and publishing difficulties, and shortages of funding and trained examiners hindered the development of testing at large.¹⁰⁶ Contrary to mental testers’ initial expectations, intelligence testing played only a marginal role in China’s merit system and did not substantially change pedagogical practice during the Republican era. Nor were the racial concerns considered to have been well addressed by standardized tests. Two decades had passed, but Chinese psychologists still lamented, “it is shameful to let these [foreigners’] studies [on Chinese intelligence], whatever the results, take the place of our own.”¹⁰⁷ After the establishment of the People’s Republic of China in 1949, the communist state endorsed the path of Soviet psychology. Chinese mental testers, whether voluntarily or because they were compelled to do so, began to denounce intelligence testing as embodying bourgeois ideology because it separated intelligence from the process of physical labor.¹⁰⁸ Meanwhile, the Nationalist government retreated to Taiwan, an island that accounted for less than one two-hundredth of its former territory. There it formed an authoritarian regime, implemented policies of ethnic assimilation, and asserted more direct control over the educational system. As a Cold War ally of the United States, Taiwan continued the practice of psychometrics from the mainland period. The intensive governance of the Nationalist state eventually made it possible for intelligence testing to become the nationwide standard for class grouping in middle schools in 1970.¹⁰⁹

The development of intelligence testing in early twentieth-century China showcased the ambitious aspirations and practical constraints of Chinese psychologists. They regarded measurement science as an epistemic authority and thus promoted psychometrics to serve the functions of racial comparison and merit assessment. From a comparative perspective, the emphasis on these two functions is one of the features of the Chinese case. Although these two purposes also coexisted in Euro-American metropoles, in many other locales, such as the Soviet Union, Spain, and Italy, indigenous appropriations of testing often stressed its pedagogical value and downplayed the aspect of scientific racism. The Chinese emphasis on both functions was closely related to the recent founding of the Republic of China. The institutional break necessitated the design of a more “democratic” and “objective” merit system that could distinguish China from its imperial past. Meanwhile, the rise of nationalist sentiments in a society predominantly made up of Han Chinese also led mental testers to view racial comparative research as a means to enhance national self-awareness. In a way, Chinese mental testers shared an attitude similar to that of African American testers in the United States at the time: accepting the utility of intelligence testing as an instrument but rejecting test results generated by white testers.¹¹⁰ However, despite their efforts to indigenize testing, mental testers in this period still maintained a strong belief in universality. While they revised tests, their ideal was to create versions that were psychologically equivalent to the Western ones and thus allowed everyone to be compared on the same scale. It was not until the late 1970s that psychologists in the Sinophone world began to explore local perspectives on intelligence in different cultures.

While Chinese psychologists initially held high hopes for the universal or nationwide application of intelligence testing, their inability to tame heterogeneity ultimately led to a retreat of enthusiasm. The success or failure of a scientific instrument involves a series of technical, political, and social factors. The epistemic imperative of uniformity demanded by psychometrics was especially troubled by the reality of Republican China as a fragmented state. Limited state capacity made it difficult for Chinese psychologists to manufacture uniformity, expand the testing enterprise, and accumulate its social credentials. From cases in various countries, we can clearly see that the state has played a crucial role in determining the outlook for mental testing. The success of psychometrics in the United States was driven by nationwide military conscription during World War I, which created a relatively disciplined setting for psychologists to overcome heterogeneity and, through this large-scale application, to consolidate testing’s public reputation. The advancement of testing in the British Empire also hinged on resources from both the British government and colonial administrations. Similarly, in Germany, Austria, Brazil, and the early Soviet Union, the state was a key supporter of testing. In contrast, the formative period of intelligence testing in China during the 1920s occurred in a context without a unified central government. Without state backing, psychologists not only had to worry about funding but also got caught up in the tensions between various political and academic factions. They could hardly compel institutions and individuals to adopt testing, nor could they overlook testing biases as colonial psychologists sometimes did due to unequal power relations. Their primary approach to convincing the public and their peers was to highlight the effectiveness of the tests, with standardization being their prerequisite. However, when Chinese psychologists’ ambitious attempt at nationwide testing encountered China’s vast geographical and cultural diversity – stemming from both the persistence of the traditional social structure and the rapid process of modernization – effectiveness became an elusive goal beyond their individual efforts. In the end, intelligence testing in China during this period was destined to be confined to localized and short-lived applications.

Certainly, the use of intelligence tests has sparked debates in many parts of the world, but from the controversies detailed here we can see that Chinese mental testers appeared to have a more pronounced instrumental attitude toward testing. Unlike in Western metropoles, where the nature of intelligence was extensively debated, Chinese mental testers did not dwell much on theoretical matters. This was partly due to their urgent need for practical solutions and, later, to the fact that the use of tests remained limited and did not cause significant social consequences. However, this instrumental attitude eventually incurred criticisms centered on utility and scientificity. Opponents argued that intelligence testing did not offer substantial assistance, nor was it based on rigorous scientific causality, so they viewed it as detrimental to the future of psychology in China. Instead of fostering discussions about proper applications, these debates often framed the issue as a binary choice between advocating and abandoning it. This all-or-nothing mindset was symptomatic of the scarcity of investment and opportunities at the time. Rather than embracing diversity and awaiting refinement, Chinese psychologists tended to concentrate resources on what they deemed most scientific or most immediately effective. In this regard, the fate of intelligence testing also provides a lens through which we can observe the general tendency of developing science and technology in early twentieth-century China.

Footnotes

Acknowledgements

My heartfelt thanks to the two anonymous reviewers for their insightful suggestions and to all those who commented on my work, including Henrietta Harrison, Jennifer Altehenger, Miriam Driessen, Mary Brazelton, Shakhar Rahav, Guoqiang Dong, Ning Zhang, Yiyang Gao, Jacob Fordham, Ying Tong, Kuldip K. Singh, Yang Han, Pin-Yu Lai, Ting-Yu Cai, and Tin Hang Hung.

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The author would like to acknowledge the support of the Clarendon Fund Scholarship, the St Antony’s College Warden’s Scholarship, and the Oxford-Taiwan Graduate Scholarship for making this research possible.

ORCID iD

Pang-Yen Chang

1.

Emily Baum, “Healthy Minds, Compliant Citizens: The Politics of ‘Mental Hygiene’ in Republican China, 1928–1937,” Twentieth-Century China 42 (2017): 215–33; Wen-Ji Wang, “An International Teamwork: Mental Hygiene in Shanghai during the 1930s and 1940s,” History of Psychology 22 (2019): 289–308; Victor Seow, “Psychology as Technology: Industrial Psychology for an Industrializing China,” History and Technology 38 (2022): 257–73; Emily Baum, “Controlling Minds: Guo Renyuan, Behavioral Psychology, and Fascism in Republican China,” The Chinese Historical Review 22 (2015): 141–59.

2.

Annette Mülberger, “The Need for Contextual Approaches to the History of Mental Testing,” History of Psychology 17 (2014): 177–86; John Carson, “Mental Testing in the Early Twentieth Century: Internationalizing the Mental Testing Story,” History of Psychology 17 (2014): 249–55.

3.

Stephen Jay Gould, The Mismeasure of Man (New York, NY: W.W. Norton, 1981); John Carson, The Measure of Merit: Talents, Intelligence, and Inequality in the French and American Republics, 1750–1940 (Princeton, NJ: Princeton University Press, 2007); Leila Zenderland, Measuring Minds: Henry Herbert Goddard and the Origins of American Intelligence Testing (Cambridge: Cambridge University Press, 2001); Paul Davis Chapman, Schools as Sorters: Lewis M. Terman, Applied Psychology, and the Intelligence Testing Movement, 1890–1930 (New York, NY: New York University Press, 1990); Andrew Winston, “Scientific Racism and North American Psychology,” Oxford Research Encyclopedia of Psychology, May 29, 2020, <> (February 15, 2024).

4.

Janice Matsumura, “More or Less Intelligent: Nikkei I.Q. and Racial/Ethnic Hierarchies in British Columbia and Imperial Japan,” BC Studies 192 (2016/17): 51–69, 53. Also see David Palter, “Testing for Race: Stanford University, Asian Americans, and Psychometric Testing in California, 1920–1935” (PhD dissertation, University of California, Santa Cruz, 2014); Gerald Thomson, “So Many Clever, Industrious and Frugal Aliens: Peter Sandiford, Intelligence Testing, and Anti-Asian Sentiment in Vancouver Schools between 1920 and 1939,” BC Studies 197 (2012): 67–100. For the broader social background of Western concerns about Chinese immigrants’ characteristics, see Mae Ngai, The Chinese Question: The Gold Rushes and Global Politics (New York, NY: W. W. Norton, 2021).

5.

Martin Wieser and Gerhard Benetka, “Psychology in National Socialism: The Question of ‘Professionalization’ and the Case of the ‘Ostmark’,” History of Psychology 25 (2022): 322–41; Erik Linstrum, Ruling Minds: Psychology in the British Empire (Cambridge, MA: Harvard University Press, 2016); Irina Leopoldoff, “A Psychology for Pedagogy: Intelligence Testing in the USSR in the 1920s,” History of Psychology 17 (2014): 187–205; Annette Mülberger, Mònica Balltondre, and Andrea Graus, “Aims of Teachers’ Psychometry: Intelligence Testing in Barcelona (1920),” History of Psychology 17 (2014): 206–22; Elisabetta Cicciola, Renato Foschi, and Giovanni Pietro Lombardo, “Making Up Intelligence Scales: De Sanctis’s and Binet’s Tests, 1905 and After,” History of Psychology 17 (2014): 223–36; Ana Maria Jacó-Vilela, “Psychological Measurement in Brazil in the 1920s and 1930s,” History of Psychology 17 (2014): 237–48.

6.

Houcan Zhang, “Psychological Measurement in China,” International Journal of Psychology 23 (1988): 101–17; Louise Higgins and Gao Xiang, “The Development and Use of Intelligence Tests in China,” Psychology and Developing Societies 21 (2009): 257–75.

7.

Geoffrey Blowers, Boris Tat Cheung, and Han Ru, “Emulation Vs. Indigenization in the Reception of Western Psychology in Republican China: An Analysis of the Content of Chinese Psychology Journals (1922–1937),” Journal of the History of the Behavioral Sciences 45 (2009): 21–33. For a discussion about indigenous psychology in the Sinophone world, see Olwen Bedford and Kuang-Hui Yeh, “History of Chinese Indigenous Psychology” Oxford Research Encyclopedia of Psychology, February 28, 2020, <> (February 15, 2024).

8.

Shellen Xiao Wu and Fa-ti Fan, “China,” in Hugh Slotten, Ronald Numbers, and David Livingstone (eds.), Modern Science in National, Transnational, and Global Context (Cambridge: Cambridge University Press, 2020), pp.521–54; Jia-Chen Fu, “Practice and the History of Science in the PRC: A Historiographic Reflection,” East Asian Science, Technology and Society 13 (2019): 449–63; Zuoyue Wang, “Science and the State in Modern China,” Isis 98 (2007): 558–70.

9.

Lewis Terman, The Measurement of Intelligence: An Explanation of and a Complete Guide for the Use of the Stanford Revision and Extension of the Binet-Simon Intelligence Scale (Cambridge, MA: The Riverside Press, 1916), pp.52–3.

10.

Liou Wei-Chih, “Meiguo gelunbiya daxue shifan xueyuan Zhongguo xuesheng boshi lunwen fenxi (1914–1929)” [An Analysis of Doctoral Dissertations from Chinese Students at Teachers College, Columbia University (1914–1929)], Jiaoyu yanjiu jikan 59, no. 2 (2013): 1–48.

11.

Chen Heqin, Wo de bansheng [My Half Life] (Shanghai, China: Huahua shudian, 1946), pp.127–8, 183–4, 156–7.

12.

Ibid., pp.162–4.

13.

Liao Shicheng, “Wo de shaonian shidai” [My Youth Age], Liangyou 109 (1935): 12–13.

14.

Sze-Chen Liao, “A Quantitative Study of Non-intellectual Elements” (PhD dissertation, Brown University, Providence, RI, 1921), p.32.

15.

Max Weber, Konfuzianismus und Taoismus, trans. into English by Hans Gerth (Glencoe, IL: The Free Press, 1951), p.125.

16.

Kwok Tsuen Yeung, “The Intelligence of Chinese Children in San Francisco and Vicinity,” Journal of Applied Psychology 5 (1921): 267–74; Percival Symonds, “The Intelligence of Chinese in Hawaii,” School and Society 19 (1924): 442; Peter Sandiford and Ruby Kerr, “Intelligence of Chinese and Japanese Children,” Journal of Educational Psychology 17 (1926): 361–7; John Willis Creighton, “The Chinese Mind: A Study in Race Psychology” (PhD dissertation, University of Missouri, Columbia, MO, 1917); William Henry Pyle, “A Study of the Mental and Physical Characteristics of the Chinese,” School and Society 8 (1918): 264–9; G. D. Walcott, “The Intelligence of Chinese Students,” School and Society 11 (1920): 474–80.

17.

Chen Heqin and Liao Shicheng, Zhili ceyanfa [The Methods of Intelligence Testing] (Shanghai, China: Shangwu yinshuguan, 1921), p.13.

18.

Lu Zhiwei, “Zhongguo xinlixue zuijin de jianglai” [The Near Future of Chinese Psychology], Jianguo jiaoyu 2, no. 1 (1940): 2–3.

19.

Shiuon Chu, “The Longer Abolition of the Chinese Imperial Examination System (1900s–1910s),” International Journal of Asian Studies 20 (2023): 721–37.

20.

Luo Zhitian, “Keju zhidu feichu zai xiangcun zhong de shehui houguo” [The Social Impact of the Abolition of Civil Examinations in Rural China], Zhongguo shehui kexue 1 (2006): 191–204.

21.

Shiuon Chu, “Ambiguous Objectivity: The Standardized Test Movement (1920–1937) and the Remaking of Chinese Examination Discourse,” Twentieth-Century China 44 (2019): 345–61.

22.

William McCall, “Science of Education,” The Journal of Educational Research 7 (1923): 384–96, 384–5.

23.

Zhonghua jiaoyu gaijinshe, Zhonghua jiaoyu gaijinshe jianzhang ji baogao [General Regulations and Reports of the Chinese National Association for the Advancement of Education] (Beijing: Zhonghua jiaoyu gaijinshe, 1922), pp.4–6; Jiaoyubu nianjian bianzuan weiyuanhui, Di yi ci Zhongguo jiaoyu nianjian [The First Educational Yearbook of Republican China] (Shanghai, China: Kaiming shudian, 1935), wubian, p.94.

24.

Chen and Liao, Zhili ceyanfa, pp.8–13 (note 17); Liao Shicheng, “Ceyan yu zhongxuexiao” [Testing and the Middle School], Zhongdeng jiaoyu 2 (1922): 1–9.

25.

Barry Keenan, “Educational Reform and Politics in Early Republican China,” The Journal of Asian Studies 33 (1974): 225–37.

26.

Arthur Henderson Smith, Chinese Characteristics (Shanghai, China: North China Herald, 1890), pp.82–3.

27.

Hu Shih, “Chabuduo xiansheng zhuan” [The Life of Mr. Close-Enough], Xinghua 21, no. 26 (1924): 25–6.

28.

Wang Shulin, Xinli yu jiaoyu celiang [Psychological and Educational Measurement] (Shanghai, China: Shangwu yinshuguan, 1935), p.1.

29.

Zhang Yaoxiang, “Zhili ceyan yuanqi” [The Origin of Intelligence Testing], Xinli 1, no. 1 (1922): 87–9.

30.

Zhang Yaqun and Yu Ningning, “Yanjing daxue zizhu zhaosheng de tedian ji qi jiejian yiyi” [The Characteristics and Enlightenment of Independent Recruitment of Yenching University], Gaodeng jiaoyu yanjiu 34, no. 4 (2013): 90–8.

31.

They initially preferred to invite Edward Thorndike, but because of his unavailability, they turned to William McCall, then Associate Professor of Education, for consultation. See “Fenzu huiyi jilu: di shiqi xinli jiaoyu ceyan zu” [Panel Discussion Minutes: No. 17 Psychological and Educational Testing Group], Xin jiaoyu 5, no. 3 (1922): 553–4.

32.

“Meiguo jiaoyu ceyanjia jiang lai hua” [American Educational Psychometrician Is Coming to China], Shibao, August 29, 1922.

33.

“Zhonghua jiaoyu gaijinshe teqing xinli zhuanjia maike” [The Chinese National Association for the Advancement of Education Sincerely Invites Psychology Expert McCall], Xinli 2, no. 1 (1923): illustration page.

34.

Wang Shulin, “Zhili ceyan zhi fada shi” [The History of the Advancement of Intelligence Testing], Jiaoyu zazhi 19, no. 3 (1927): 1–14, 1.

35.

“Beijing shifan daxue pingmin jiaoyushe huanying maike boshi sheying” [The Welcome Photo of Dr. McCall at the Society of Mass Education at Beijing Normal College], Pingmin jiaoyu 63–4 (1923): photo page; “Huanying maike boshi zhisheng” [A Note on the Grand Welcome of Dr. McCall], Xinwenbao, January 5, 1923.

36.

Liu Bingli and William McCall, “Yuedu biaozhun ceyan ke” [The Standard Test Lessons in Reading], Guojia yu jiaoyu 2 (1926): 6–8, 6.

37.

“Maike boshi jiaoyu jinxing jihua” [Dr. McCall’s Instruction Plans], Minguo ribao, September 23, 1922.

38.

Zhu Guangqian, “Zhili ceyanfa de biaozhun” [The Standard of Intelligence Testing], Jiaoyu zazhi 14, no. 5 (1922): 1–6.

39.

William McCall, “Scientific Measurement and Related Studies in Chinese Education,” The Journal of Educational Research 11, no. 2 (1925): 85–94, 90.

40.

Sean Hsiang-lin Lei, Neither Donkey nor Horse: Medicine in the Struggle over China’s Modernity (Chicago, IL: University of Chicago Press, 2014), p.56.

41.

Justin Jacobs, The Compensations of Plunder: How China Lost Its Treasures (Chicago, IL: University of Chicago Press, 2020), p.173.

42.

Wang Zhuoran, Zhongguo jiaoyu yipie lu [A Glimpse of Education in China] (Shanghai, China: Shangwu yinshuguan, 1923).

43.

McCall, “Scientific Measurement,” pp.90–1 (note 39).

44.

William McCall, “Scientific Measurement and Related Studies in Chinese Education (Continued),” The Journal of Educational Research 11, no. 3 (1925): 177–89, 185.

45.

Lü Fangshang, Minguo shilun [On the History of the Republic of China] (Taipei: Shangwu yinshuguan, 2013), pp.852–3.

46.

Ming-Fui Pang, “Xiandai Zhongguo nanfang xueshu wangluo de chushi (1911–1945)” [The Southern Academic Network of Modern China’s Historiography (1911–1945)], Guoli Zhengzhi daxue lishi xuebao 29 (2008): 51–84.

47.

Zhang Pengyuan et al., Guo Tingyi xiansheng fangwen jilu [The Reminiscences of Mr. Kuo Ting-Yee] (Taipei: Zhongyang yanjiuyuan jindaishi yanjiusuo, 1987), pp.111–12.

48.

Li Zi, “Beida han dongda de bijiao” [The Comparison between Beijing University and Southeast University], Juewu, September 19, 1924.

49.

McCall, “Scientific Measurement,” p.90 (note 39).

50.

Zhuang Zexuan, “Shencha xinlixue mingci de jingguo” [The Course of Reviewing Psychological Terminology], Xinli 2, no. 4 (1923): 1–14.

51.

Zhu Junyi, Tongji yu ceyan mingci hanyi [Chinese Translations of Terminology in Statistics and Measurement] (Shanghai, China: Shangwu yinshuguan, 1923).

52.

Jin Guobao, Tongji xinlun [New Introduction of Statistics] (Shanghai, China: Zhonghua shuju, 1925), appendix, pp.1–8.

53.

“Zhongyao ceyan mingci hanyi chugao” [The First Draft of Chinese Translation of Important Terms in Measurement], Ceyan 2, no. 4 (1936): 271–303.

54.

Zhao Yan, “Guoli bianyiguan bianding putong xinlixue tongyi yiming zhi jingguo” [The Process of Editing Standardized Terminologies of General Psychology by the National Institute for Compilation and Translation], Jiaoyu zazhi 27, no. 6 (1937): 79–86.

55.

The link between mental speed and intelligence was a twentieth-century American construct. In the nineteenth century, quickness of thought was regarded by Americans as lack of self-discipline. However, inspired by stenography and telegraphy and out of administrative convenience, psychologists in the 1910s began to view speed as an essential part of intelligence. See Justin Clark, “The Secret of Quick Thinking: The Invention of Mental Speed in America, 1890–1925,” Time & Society 29 (2020): 469–93. In recent decades, psychologists have cast more doubts on the role that mental speed plays in human intelligence. See Lazar Stankov and Richard Roberts, “Mental Speed Is Not the ‘Basic’ Process of Intelligence,” Personality and Individual Differences 22 (1997): 69–84.

56.

For related discussions, see Gina Anne Tam, Dialect and Nationalism in China, 1860–1960 (Cambridge: Cambridge University Press, 2020) and Janet Chen, The Sounds of Mandarin: Learning to Speak a National Language in China and Taiwan, 1913–1960 (New York, NY: Columbia University Press, 2023). It has been estimated that, as late as the 1950s, only 11% of the population in non-Mandarin dialect areas could comprehend standard Chinese pronunciation. See Wu Runyi and Yin Binyung, “Putonghua shehui diaocha” [A Social Survey of Putonghua], Wenzi gaige 1 (1985): 37–8.

57.

McCall, “Scientific Measurement (Continued),” pp.181–2 (note 44).

58.

Wu Tianmin, Di er ci dingzheng Zhongguo bina ximeng zhili ceyan zhi jingguo [The Process of the Second Revision to the Binet-Simon Intelligence Scale] (Shanghai, China: Shangwu yinshuguan, 1936), p.1.

59.

Wu Tianmin, “Chongding bina ximeng zhili ceyan zhi jingguo” [The Process of Revising the Binet-Simon Intelligence Scale], Ceyan 3, no. 3 (1933): 171–4, 172.

60.

Liu Zhanen, “Wu wenzi zhili ceyan de yongfa” [The Instruction on the Non-language Intelligence Test], Qingnian jinbu 64 (1923): 9–16.

61.

A similar problem also affected anthropometric research in China around the same period. See Jia-Chen Fu, “Measuring Up: Anthropometrics and the Chinese Body in Republican Period China,” Bulletin of the History of Medicine 90 (2016): 643–71.

62.

Henrietta Harrison, The Making of the Republican Citizen: Political Ceremonies and Symbols in China, 1911–1929 (Oxford: Oxford University Press, 2000), p.65.

63.

Harriet Zurndorfer, China Bibliography: A Research Guide to Reference Works about China Past and Present (Leiden, Netherlands: Brill, 1995), p.301.

64.

Jiaoyu zazhishe, Maike ceyanfa [McCall’s Testing Methods] (Shanghai, China: Shangwu yinshuguan, 1925), p.43.

65.

Zhou Tiaoyang, “Jisuan xuetong nianling de yanjiu” [A Study on Reckoning School Children’s Age], Jiaoyu zazhi 16, no. 11 (1924): 1–19.

66.

Wu, “Chongding bina ximeng zhili ceyan,” p.174 (note 59).

67.

McCall, “Scientific Measurement (Continued),” p.182 (note 44).

68.

Liao Shicheng, “Tuanti zhili ceyan” [Group Intelligence Testing], Xinli 3, no. 2 (1924): 1–38.

69.

Chen Heqin and Liao Shicheng, Ceyan gaiyao [An Outline of Testing] (Shanghai, China: Shangwu yinshuguan, 1925), pp.82–4; Lewis Terman, The Stanford Revision and Extension of the Binet-Simon Scale for Measuring Intelligence (Baltimore, MD: Warwick & York, 1917).

70.

Jiaoyu zazhishe, Maike ceyanfa, pp.1–28 (note 64).

71.

Cao Richang, “Woguo ceyan yundong de huigu yu zhanwang” [The Retrospect and Prospect for the Testing Movement in China], Jiaoyu zazhi 30, no. 7 (1940): 4–8, 8.

72.

Ai Wei, Xiaoxue ertong nengli celiang [The Measurement of Elementary School Children’s Abilities] (Shanghai, China: Shangwu yinshuguan, 1948), p.259.

73.

While the correlation coefficient in the 1923 investigation was 0.669, which was close to the ideal value suggested by American psychologists at the time, the value for the Jiangxi investigation reduced significantly to only 0.13, which means the correlation was extremely weak. For the value of the 1923 investigation, see Chen and Liao, Ceyan gaiyao, p.58 (note 69).

74.

Du Zuozhou, “Genju shixing Liaoshi tuanti zhili ceyan de jieguo taolun guonei gezhong ceyan zhi ying xiuding de biyao” [A Discussion about the Need to Revise the Domestic Tests According to the Results of the Liao’s Group Intelligence Scale], Ceyan 4 (1933): 43–52.

75.

Chen Xuanshan, “Wumen congshi xinli ceyan suode yi bufen jieguo” [Some Results We Obtained from Mental Testing], Jiaoyu yu zhiye 115 (1930): 17–20; Shao Heming, “Juxing jizhong zhili ceyan yihou” [After Conducting Several Intelligence Tests], Zhonghua jiaoyujie 19, no. 2 (1931): 19–23.

76.

C. W. Luh and T. M. Wu, “A Comparative Study of the Intelligence of Chinese Children on the Pintner Performance and the Binet Tests,” The Journal of Social Psychology 2 (1931): 397–408, 403, 405.

77.

Hsiao Hong Hsiao, “The Mentality of the Chinese and Japanese,” Journal of Applied Psychology 13 (1929): 9–31; Wang, Xinli yu jiaoyu celiang, p.829 (note 28).

78.

McCall, “Scientific Measurement (Continued),” p.182 (note 44).

79.

Wang Shulin, “Banian lai Zhongguo ceyan yundong zhi jingguo” [The Eight-Year Course of the Testing Movement in China], Jiaoyu yanjiu 66 (1936): 1–4.

80.

Wang Jingxi, “Banian lai Zhongguo shiyan xinli zhi yanjiu” [The Experimental Psychology Research in China during the Past Eight Years], Jiaoyu yanjiu 62 (1936): 1–2.

81.

On the growing significance of experimentation in Republican China, see Fa-ti Fan, “The Controversy over Spontaneous Generation in Republican China: Science, Authority, and the Public,” in Jing Tsu and Benjamin Elman (eds.), Science and Technology in Modern China, 1880s–1940s (Leiden, Netherlands: Brill, 2014), pp.209–44.

82.

James Reardon-Anderson, The Study of Change: Chemistry in China, 1840–1949 (Cambridge: Cambridge University Press, 1991), p.175.

83.

Wu and Fan, “China,” p.536 (note 8).

84.

Pan Shu, “Ti xinlixue bianhu” [In Defense of Psychology], Xin minzu 2, no. 14 (1938): 7–10.

85.

Blowers et al., “Emulation Vs. Indigenization,” p.30 (note 7).

86.

Wang Jingxi, “Zhongguo xinlixue de jianglai” [The Future of Chinese Psychology], Duli pinglun 40 (1933): 13–16, 13–14.

87.

Zing-Yang Kuo, “A Psychology without Heredity,” Psychological Review 31 (1924): 427–48, 427.

88.

Zing-Yang Kuo, “Giving up Instincts in Psychology,” The Journal of Philosophy 18 (1921): 645–64.

89.

Guo Renyuan, Xinlixue yu yichuan [Psychology and Heredity] (Shanghai, China: Shangwu yinshuguan, 1929), pp.123–4, 219.

90.

Ibid., p.119.

91.

Chen Xuanshan, “Women duiyu zhili ceyan yinggai caiqu hezhong taidu?” [What Attitude Should We Take toward Intelligence Testing?], Jiaoyu jianshe 3 (1931): 1–6, 2.

92.

Chen and Liao, Ceyan gaiyao, p.33 (note 69).

93.

Hu Feng, “Zhili ceyan yu zhili” [Intelligence Testing and Intelligence], Jiaoyu jianshe 4 (1931): 102–7.

94.

Wang, “Zhongguo xinlixue,” p.14 (note 86).

95.

Guoli zhongyang yanjiuyuan (ed.), Guoli zhongyang yanjiuyuan ershisi niandu zongbaogao [Eighth Annual Report of Academia Sinica 1935–1936] (Shanghai, China: Guoli zhongyang yanjiuyuan, 1936), pp.133–9.

96.

The predecessors of the National Central University were the previously mentioned Nanjing Higher Normal School and Southeast University.

97.

“Zhongguo ceyan xuehui huiyuanlu” [The Membership Directory of the Chinese Society of Testing], Ceyan 1 (1931): 171–6.

98.

“Zhongguo ceyan xuehui jianzhang” [Articles of the Chinese Society of Testing], Ceyan 1 (1931): 167–9.

99.

Pan Shu, “Guanyu xinlixue de yuyan” [The Prophecies about Psychology], Duli pinglun 46 (1933): 10–13, 10.

100.

Pan Shu, “Shixing xin kaoshifa de xianjue tiaojian” [The Prerequisite for Applying the New Methods of Examination], Ceyan 2 (1932): 1–4, 2.

101.

Pan Shu, “Xinlixue yu jiaoyu zhi guanxi gaiguan” [An Overview of the Relationship between Psychology and Education], Kexue jiaoyu 1, no. 4 (1934): 1–9, 6.

102.

For a concise review of the role of “general intelligence” in psychology, see Committee on Character Tests and Psychological Tests, “General Intelligence and Its Measurement,” Review of Educational Research 2 (1932): 274–83.

103.

Zuo Renxia, “Zuijin Zhongguo kexue ceyan zhi fazhan ji qi qushi” [The Recent Development and Trends of Scientific Testing in China], Xuelin 1 (1940): 99–116; Cao Richang, “Shiyong baodesi mijin ceyan chubu baogao” [The Preliminary Report on the Use of the Porteus Maze Test], Zhongguo xinli xuebao 1, no. 3 (1937): 252–63, 261.

104.

Lu Yudao, “Xinlixue zhi shehui yiyi” [The Social Meaning of Psychology], Jianguo jiaoyu 2, no. 1 (1940): 20–2.

105.

Cao, “Woguo ceyan yundong,” p.6 (note 71).

106.

Ding Zuyin, “Wuguo yuanjing zhili ceyan zhi fazhan” [The Development of Intelligence Testing for Police in Our Country], Dongfang zazhi 42, no. 8 (1946): 26–33; Yan Shuchang, Chen Jing, and Zhang Hongmei, “Kangzhan shiqi Zhou Xiangeng de junshi xinlixue shijian yu sixiang” [Siegen K. Chou’s Military Psychological Practices and Thoughts during the War of Resistance against Japan], Xinli xuebao 44 (2012): 1554–62.

107.

Hu Jinan, “Chuangjian yige shiyan de minzu xinlixue” [Establish an Experimental Racial/National Psychology], Jianguo Jiaoyu 2, no. 1 (1940): 27–8.

108.

“Sulian xinlixue gei wo de qishi” [Insights from Soviet Psychology], Renmin ribao, January 20, 1952.

109.

“Xingzheng Yuanzhang Yan Jiagan shizheng baogao” [The Policy Address by Premier Yen Chia-kan], February 18, 1972, no. 006-010601-00013-002, Academia Historica, Taipei.

110.

Wayne J. Urban, “The Black Scholar and Intelligence Testing: The Case of Horace Mann Bond,” Journal of the History of the Behavioral Sciences 25 (1989): 323–34.

Author biography

Pang-Yen Chang is a DPhil candidate in the Faculty of Asian and Middle Eastern Studies at the University of Oxford and a physician by training from Taiwan. His main research interest lies in the history of human sciences, particularly the science of the mind and brain in the East Asian context. He is the author of The Polyphonic Psyche: Hypnotism and Popular Science in Modern China 精神的複調：近代中國的催眠術與大眾科學 (New Taipei: Linking Publishing, 2020).