Sage Journals: Discover world-class research

Abstract

This article introduces a test for literary text comprehension in university students of English as a second language. Poetry is especially suited for our purpose since it frequently shows features that offer challenges to comprehension in a limited space. An example is Shakespeare’s Sonnet 43, on which our test is based: it is suited for assessing not only if a text has been understood but also the ability of respondents to reflect on their own comprehension skills. We show that the test’s psychometric properties are satisfactory, and we demonstrate its validity by analysing relevant external indicators. Thus, we can show a direct link between general reading experience and text comprehension as tested: the more students read, the better they perform. The collaboration of literary studies with psychometrics moreover allows for a statistically valid identification of specific challenges to comprehension and thus advances our knowledge of what readers find difficult. This will be of interest not only in a hermeneutic and linguistic perspective but also with a view to addressing those difficulties in an educational context. For example, asking someone whether they have understood an utterance (in this case: a line of poetry) does not elicit reliable answers. Being able to say how one has established the meaning of a line seems to be a more reliable indicator of actually having understood it.

Keywords

Ambiguity, close reading, hermeneutics, literature in second language learning, psychometrics, Shakespeare’s Sonnets, stylistic devices, text comprehension, testing text comprehension

1. Introduction

In this article, we present a study which investigates how the comprehension of literary texts may be assessed.¹ It is part of an interdisciplinary project in which we aim to understand better how the teaching and learning of comprehension skills can be improved; in order to do so, we have to know more about those skills in the first place. We introduce a new approach and combine methods from psychometrics with literary close reading to assess text comprehension in students of English studying at a German university. We will thus not only learn more about how the understanding of literary texts works but also see a methodological gain for the disciplines involved: psychometrics is confronted with the complexity of the phenomena under consideration; hermeneutics and literary studies will have to establish an element of objectivity in a field that is notoriously subjective. The purpose of our investigation will be outlined in this introductory section. In section 2, we will present the construction of the test as well as the specific rationale for choosing the underlying poem, Shakespeare’s Sonnet 43, and we will describe our respondents. In section 3, we will analyse the test itself concerning its psychometric properties such as internal consistency (section 3.1), to then consider the respondents’ ability to reflect on their own comprehension (section 3.2), and we will assess the test’s validity by evaluating it relative to respondents’ characteristics which may be expected to be related to test performance such as previous academic performance (section 3.3). We will close by discussing the implications of our findings (section 4).

Our investigation is based on the assumption that engagement with a literary text involves two major aspects: (1) comprehension, that is, an interpretation of the statements that make up a text, and of the way in which they cohere; (2) the personal meaning a text has for its readers (see Bauer and Beck (2021)), and its cognitive as well as emotional effect. We believe in the importance of (2) but are convinced that it is crucially determined by (1), which is why we pay special attention to text comprehension and to the difficulties encountered by readers. Those difficulties are not just vaguely caused by the text as a whole but by the need for integrating individual text elements into their context. Difficulties may therefore concern both individual expressions and their relation to the rest of the text. Identifying those difficulties is thus a key to learning more about how text comprehension works: if we have learned what exactly obstructs it, we will have fulfilled a prerequisite for improving it.

Accordingly, the identification of difficulties is the focus of our investigation. There is a great amount of literature on text comprehension in general but we still know too little about the specific challenges to comprehension posed by literary texts (Das and Bhushan, 2005: 222; McNamara and Magliano, 2009: 359; Miall, 2006: 91). Such knowledge is desirable especially since literary texts frequently do not aim at the unambiguous information about facts or opinions but are nevertheless meant to be understood.² Their language and style are often more complex than that of other kinds of texts, but this does not mean that they are uninterpretable (cf. Bauer et al., 2020). An example is the use of uncommon metaphors, paradoxes, etc. which convey meaning rather than trigger ultimate incomprehension. Similarly, ambiguity is a ubiquitous phenomenon in any form of discourse but in literary texts it is more frequently made productive (see Bauer et al., 2010) and used strategically. We expect that texts are more difficult to understand when they are ambiguous but that our efforts at comprehension are rewarded when we become aware of the strategy pursued.³ Furthermore, when literary texts present us with unresolved ambiguities, comprehension frequently consists in the recognition of the difficulty itself and the alternative meanings conveyed. Literary texts thus encourage a degree of awareness and reflection that is a form of advanced comprehension. It therefore makes sense to include this level of comprehension in our investigation.

For university students of English, poetry, and early modern poetry in particular, offers the kinds of challenges described while it still rewards efforts at mastering them. It is particularly well suited to the task of assessing comprehension, as poems often show a high degree of complexity in a limited space. Accordingly, the test we have developed is based on an example of this genre, on Sonnet 43 by William Shakespeare (‘When most I wink, then do mine eyes best see’). We have chosen it because the kind and number of explanatory annotations by scholars⁴ make us assume that it fulfils the requirements outlined and presents identifiable challenges to comprehension. Its linguistic difficulty is not just due to the fact it is about 400 years old⁵ but is primarily related to the thoughts and issues represented. In terms of aspects (1) and (2) above, it is furthermore a poem which we expect to have a personal meaning for its readers as well, as it presents a situation to which readers, then and now, can individually relate.⁶

Testing the comprehension of a poem by advanced second-language students of English makes sense to us because the challenge is considerable: their language competence is, as a rule, not yet quite on a near-native-speaker level (C1), and they mostly lack the cultural knowledge and reading experience of a student of English literature who has grown up in an English-speaking culture.⁷ All this belongs to the potential factors impacting the level of text comprehension in the foreign language (Grabe and Stoller, 2018). We hope to see what proves difficult to such students, however, not only because we wish to improve the basis for language teaching but also because we wish to learn more about the comprehension of literary texts. Their demands on comprehension skills frequently go beyond everyday discourse without being irrelevant. This is why a focus on literature is useful for an evaluation of comprehension. ‘Does literature have a language of its own, perhaps rather unrepresentative of, or rather different from ordinary language (e.g. old-fashioned, obscure, pretentious or generally “difficult”? The simple answer to this old question is, “No, there is nothing uniquely different about the language of literature.”’ (Hall, 2015: 9; see also Kramsch, 1993). Hall goes on to note: ‘But a fuller answer will reveal why the language to be found in literary texts is often particularly interesting for language learners.’ He points out that the difference between literary and non-literary language is not a simple dichotomy but that the two might more usefully be conceived of as the two end points of a continuum, and that it is actually not easy to draw up a list of criteria defining one or the other type. 
Still, ‘the language of literature is noticeably different in that it is typically more interesting and varied and ultimately indeed more representative than the language of dreamed-up dialogues in chemists’ shops or reprinted AIDS leaflets, as found in many of the best intentioned classrooms today’ (Hall, 2015: 12).⁸

The test we introduce assesses how Shakespeare’s Sonnet 43 is understood. It does so by employing questions that build on each other; the answers to these questions allow us to see how text comprehension is based on an interplay of local and global meaning/interpretation. In a broader perspective, we are interested in identifying the points at which understanding becomes more difficult or even impossible. These findings have larger implications: we learn something about how texts are understood and may then further inquire into specific phenomena (over a wider range of tests and texts) that complicate understanding; our knowledge may also help teachers act and react differently when dealing with literary texts in the classroom, whether in school or at university.

2. Method

Given the central role of literary texts in studying English, it is essential to be able to assess students’ comprehension of literary texts (as opposed to or in addition to merely testing factual or grammatical issues); this, however, poses particular difficulties given the potentially subjective nature of comprehension, of which test authors should be mindful (Hall, 2015).

In this section, we introduce the test we developed for this purpose. It should be noted that our test was not high stakes for the students: taking part or doing well in it did not form part of the course requirements. The results were used purely as part of the research project underlying this study, and, therefore, teaching to the test did not take place since it was not used to assess the course or to evaluate the teaching either. Rather than relying on an extrinsic source of motivation, we tried to motivate our students by explaining our larger agenda, namely to find means to improve text comprehension skills, which in turn will help them improve their own teaching later.

2.1 The text: William Shakespeare’s Sonnet 43

We are interested in the properties of a text and how they affect comprehension, both in the sense of enhancing as well as obstructing it. Examples of specific phenomena which may influence the ability to understand a text include syntax, lexical features as well as figures of speech and structural components such as ambiguity, irony, paradox, and many more. These appear frequently both in literary and non-literary texts. While they can be very effective in evoking particular responses in the reader as desired by the author, they may also make comprehension more difficult, particularly for non-native speakers. At the same time, in most literary texts the difficulties do not exist by coincidence or because the authors were unable to express themselves more clearly. The difficulties are rather integrated into a context in which it makes perfect sense to use complex constructions. It is a rewarding experience to process them so that everything falls into place, especially if, as a consequence, the subject or content is felt to be represented in a thought-provoking manner. Shakespeare’s Sonnet 43 (quoted in full in the Appendix) is an excellent case in point. It is concerned with the relationship between what one actually sees and what one imagines, and with the speaker’s reflection on his own dreams and desires. Readers and listeners may find that the question of how to establish a true image of the person one loves is as relevant and as difficult today as it was more than 400 years ago. A text in which the author addresses this question by using language in a non-trivial manner and suggests a non-trivial answer may be worth some processing effort. For this very reason, it is also a text which lends itself to learning more about the difficulties of comprehension.

Apart from this thematic interest, our selection of a text for the comprehension test was based mainly on two considerations: firstly, complexity (with regard to text features), and, secondly, skills addressed that are relevant for the students. Our focus on poetry (see above) is motivated by its complexity, even though it is a feature difficult to measure (Chen and Meurers, 2016, 2019). As university students of English are assumed to have a sufficiently high command of English and some reading experience, the text must not be too easy (with regard to vocabulary, argumentative structure, form etc.; see section 2.3 on test respondents) nor too difficult. Since we regard text comprehension as a communicative skill, we do not measure the complexity of a text in strictly lexical, syntactic or semantic terms only but take into account pragmatic and rhetorical aspects as well (Vajjala and Meurers, 2014). This means that we aim to find texts which show a certain degree of complexity with regard to at least one such aspect, for example, figures of speech and stylistic means. Accordingly, the skill to read and understand a complex text is based on certain kinds of knowledge (e.g. lexical, contextual) that we presuppose and that are to be linked to the specific features of the text in question. Shakespeare’s poetry provides excellent material in order to demonstrate such higher-level skills and to develop them.

We hence regard Shakespeare’s Sonnet 43 as a particularly apt test case for mainly two reasons: the first is the line of argument pursued in the poem; the second is the use of ambiguity (which is, to some extent, related to the first). As far as the argumentative structure of this sonnet is concerned, the speaker opens with a statement that seemingly does not make sense: he sees best when he winks most. The phrase ‘When most I wink’ contains a verb that has become obsolete in some of its denotations: ‘to wink’ is now mostly used in the sense of ‘blink’, including its popular representation as an emoji, whereas, during the early modern period, it could also mean ‘to close one’s eyes’.⁹ The potential ambiguity of the word has to be noticed, in which case the (apparent) paradox becomes obvious. The argumentative structure of the poem consists in developing this paradox all the way to the final couplet and making it plausible as an expression of the speaker’s relationship with the addressee (‘All days are nights to see till I see thee, / And nights bright days when dreams do show thee me.’). Secondly, besides ‘wink’, the poem contains a number of further ambiguities or rather words that are used in a potentially ambiguous manner and in various grammatical forms, for example, lines 5–6: ‘Then thou whose shadow shadows doth make bright, / How would thy shadow’s form form happy show’. Here, knowledge of grammar helps disentangle the meaning: the phrase ‘whose shadow shadows doth make bright’ contains an inversion, and the first ‘shadow’ turns out to be the subject of the phrase, whereas ‘shadows’ is the object of the sentence; moreover, ‘shadow’ here means an ‘insubstantial object’ (OED 6.a.), that is, an image of the beloved that appears in the speaker’s imagination only¹⁰, ‘shadows’ refers to the ‘image cast by a body intercepting light’ (OED II.). 
The word ‘form’ is used as a noun and a verb in the following line (in a figura etymologica, that is, the repetition of a word’s root, involving different word categories). The sonnet hence contains several complex structures of different kinds that may obstruct the reader’s understanding but are part of a rhetorical strategy.

2.2 Test construction

One obvious way to approach the study of text comprehension would be to investigate individuals’ ways of making sense of a text in depth. Such studies have indeed been undertaken (some are described in Fox and Alexander, 2009; Kintsch, 1998; Leslie and Caldwell, 2009), but they are by their nature restricted to a fairly small number of respondents, given that methods such as thinking aloud protocols were employed by the researchers. By contrast, our aim was to develop an instrument which can be employed to assess text comprehension on a larger scale in a standardised way. Other tests with the same aim such as the OECD’s PISA studies or other large-scale international studies such as PIRLS often rely on closed (multiple choice or forced choice) items. The advantage of such items is that they are easy to score and that they tend to be high in reliability. The disadvantage is that it can be difficult to construct items that are complex enough to capture the underlying concept in sufficient depth and that it is difficult to construct suitable distractors.¹¹ Finally, existing tests of text comprehension such as those employed by the DESI study (Beck and Klieme, 2007; Klieme, 2008) are usually aimed at lower levels of English proficiency than that of our respondents. DESI also had a broader aim, focusing on all aspects of English as a second language, not just text comprehension. PISA is probably the best known set of studies concerned with literacy; but while its focus on reading literacy is of course in some respects relevant to our research interests, it only assesses literacy in the respondents’ mother tongue, whereas we are concerned with literary text comprehension in learners of a foreign language. And, as with DESI, the target population is teenagers, not university students.

Our own test has the following characteristics: Its items are standardised, that is, each respondent is presented with the same set of items, and each item has an open format. This makes the scoring more challenging than it would be for closed items, but this format contributes to the validity of the responses which allow for qualitative as well as quantitative analysis. The test items are of two kinds: the majority are intended to examine text comprehension as such, but some are aimed at exploring the respondents’ ability to reflect on their own skills. Students’ responses were scored by experts on early modern English literature, following a detailed answer scheme developed by the literary scholars among the test’s authors. Ambiguous answers were discussed with other team members until consensus was reached. The test is reproduced in full in the Appendix.

The 15 items of the test comprise three groups which refer to the first line of the poem (‘When most I wink, then do mine eyes best see’), lines 5–6 (‘Then thou whose shadow shadows doth make bright, / How would thy shadow’s form form happy show’) and lines 13–14 (‘All days are nights to see till I see thee, / And nights bright days when dreams do show thee me’). These lines pose difficulties of different kinds that are, however, related to the global meaning of the sonnet. In the first line, linguistic knowledge comes into play with the now obsolete meaning of ‘wink’. The task relating to lines 5–6 is concerned with grammatical knowledge – word classification – to make sense of the usages of ‘form form’ (noun and verb) and ‘shadow shadows’ (both nouns). At the same time, shadow is used ambiguously here as well: ‘shadow’ refers to an ‘insubstantial object’ (OED 6.a.), whereas ‘shadows’ refers to the ‘image cast by a body intercepting light’ (OED II.). This ambiguity is relevant to the meaning of the sonnet as a whole, which becomes manifest in the concluding lines that disambiguate the opening: literal day turns out to be a metaphorical night, and vice versa. The reversal explains how and why the beloved is best seen by the speaker with his/her eyes closed. The paradox, one of the prime figures that challenges understanding, is eventually resolved.

In the first group of tasks, we want to know if participants are able to comprehend the line/sentence as a whole by being able to relate its two parts to each other (item 1.1). We then want to find out if participants can tell us more about how they understand it, first by asking if it is ambiguous and asking them to give reasons for their answer (items 1.2/1.3), then by asking if the line makes sense to them and asking them to name the difficulty or say how they worked out the sense (items 1.4/1.5). The purpose of these last two items is not to check if participants share a particular interpretation but if they are able to reflect on their choice.

In the second group of items, our focus is on the relationship between understanding specific expressions and the lines as a whole. Accordingly, we ask about the word classes in line 6 (item 2.1), assuming that those who realise that the figura etymologica (‘form form’) comprises noun and verb, and that ‘show’ here is a verb, are prepared for item 2.5, in which participants are asked to translate or paraphrase the line. Item 2.2 is slightly more tricky; it asks about the word classes of ‘shadow shadows’, which are both nouns (hence, a polyptoton¹²). Similarly, item 2.4 reflects 2.2 by asking for a paraphrase or translation of line 5. Item 2.6 asks participants what they find striking about the form of lines 5 and 6; this is meant to identify those who are able to realise that the stylistic devices of polyptoton and figura etymologica (repetition of ‘shadow’ and ‘form’) and oxymoron or paradox (the shadow that makes shadows bright) contribute to the effect. We accept answers that show the participant’s awareness of language being used in a striking manner, even if the names of the rhetorical figures are not known.¹³ This item is intended to assess the test taker’s ability to evaluate the contribution of smaller units to the meaning of lines or sentences, even when they are unusual, and to evaluate the contribution of non-semantic features (such as rhetorical and poetic devices) to the overall meaning.

The last group of items aims at an understanding of the poem as a whole, that is, the move from local to global understanding of the text. Item 3.1 asks participants to paraphrase or translate the last two lines, which is easier if their relation to the preceding lines is understood. Item 3.4 explicitly asks participants to go back to line 1 and explain if and how it makes sense in the overall context of the poem. Accordingly, the skill required here is once more the ability to integrate parts into a whole, but on a larger scale than in the second group of items. The understanding of paradox is taken up again in items 3.2 and 3.3, which ask if line 13 makes sense to the participants; as in items 1.4/1.5, the point is not to give the ‘right’ answer but to name the difficulty or say how they worked out the sense.

A number of versions of the test were piloted using a small number of participants before the version presented in this paper was finalised. Using a larger sample of students, the psychometric properties of this current version were evaluated and the test was validated. We describe the respondents as well as procedure in the next section.

2.3 Respondents and procedure

430 students from three universities took part in the study. Nearly 60% of them were studying for a teaching degree; the remainder followed a different course such as a BA or MA in English literature. Most of the participants studied a second subject alongside English; these second subjects were varied and included, among others, history, biology, German, other foreign languages and sports. They had been studying for between one and 13 semesters, with the largest single group (101 respondents, 23.5%) in their second semester.

In constructing our test, we were concerned with factors which might be associated with greater text comprehension skills. These factors included prior attainment at school, whether the respondent had spent a substantial amount of time in an English-speaking country, and whether they had attended school or university there, as well as their reading habits and experiences, especially concerning Shakespeare (e.g. had they attended/participated in a class focusing on Shakespeare).

As indicators of prior attainment, we used the English grade in the last year of schooling and the overall Abitur grade¹⁴; the latter refers to the average in all subjects the student had taken, which serves as a selection tool for oversubscribed university courses. The Abitur subjects usually include German, mathematics, English or another foreign language, at least one science, as well as social science and arts subjects. The Abitur grade is on a scale from 1 to 6, with 1 as the highest grade and 4 as the lowest pass grade. On average, our respondents’ Abitur grade was 2.2, ranging from 1 to 3.6. Grades in English are on a scale from 0 to 15 with 15 being the highest grade and 5 the lowest pass grade. On average, our respondents had an English grade of 12.1, with nobody reporting a grade lower than 5. It is not surprising that the English grades were fairly high on average given that the participants had all chosen to study English at university. 123 (28.9%) respondents had spent at least 3 months in an English-speaking country, with 59 (13.7%) having attended school or university there. Table 1 and Table 2 give an overview of our respondents’ reading habits and previous experience of reading and/or studying Shakespeare’s works.

Table 1.

Respondents’ background: general reading habits.

	Newspaper articles	Short stories	Novels	Poems	Other^a
Daily	164 (39.1%)	25 (6.1%)	63 (15.1%)	9 (2.2%)	44 (47.3%)
Once a week	137 (32.7%)	80 (19.4%)	94 (22.6%)	50 (12.1%)	16 (17.2%)
Once a month	78 (18.6%)	165 (40.0%)	191 (45.9%)	132 (31.9%)	10 (10.8%)
Never	40 (9.5%)	142 (34.5%)	68 (16.3%)	223 (53.9%)	23 (24.7%)

^aReplies given here included ‘social media’, ‘blogs’, or ‘the internet’, but also, for example, ‘research articles’ and ‘biographies’.

Table 2.

Respondents’ background: reading Shakespeare.

	Yes	No
Has read Shakespeare at school^a	180 (42%)	249 (58%)
Has read Shakespeare at university^a	264 (61.5%)	165 (38.5%)
Has read Shakespeare in their spare time^a	83 (19.3%)	346 (80.7%)
Has attended a course of classes on Shakespeare and/or his time before	96 (22.4%)	332 (77.6%)

^aMultiple answers were possible here. The number of students who had never read Shakespeare in any context was 73 (17%).

The test was administered as part of a lecture or seminar the students attended. Thus, participants did not have to give up any of their own time in order to complete the test. Completion took around 30 minutes on average, but no time limit was set. It was made clear to the students that participation was voluntary, that their answers were going to be anonymised, and that non-participation would not result in any disadvantages for them. The results did not form part of the course assessment, making this a low stakes test for the students. They completed the test on paper, with copies handed out at the start of the testing session and collected at the end; participants were not able to take the test home. Dictionaries, mobile devices or similar were not permitted.

Given that the aim of our study was to gain insights into immediate reactions to the stimulus material and the comprehension processes triggered by it, we did not include a control group of authentic readers with access to such sources; that is, rather than investigating comprehension in an authentic context where respondents might draw on various sources such as the internet to aid their comprehension, we aimed at finding out about participants’ strategies for resolving comprehension difficulties on the basis of their linguistic (lexical and grammatical) knowledge.

3. Results

The results of our test are addressed in a three-fold manner: we describe the psychometric properties of our test (section 3.1), analyse the respondents’ ability to reflect on their own comprehension (section 3.2), and consider findings pertaining to the test’s validity (section 3.3).

3.1 Psychometric properties of the test

12 of the 15 items of the test entered the final score (the rationale for the other three, items 1.2, 1.4 and 3.2, is explained above, section 2.2, with an analysis based on those three in section 3.2). Answers were coded as either right or wrong, so that the range of possible scores was 0–12. Out of the 430 students who took part in the test, 25 (5.8%) did not answer any of the items correctly, while four students (0.9%) scored the full 12 points. On average, respondents solved 4.21 items correctly. The median was four points, that is, half of respondents scored no more than four points and the other half above four points. Only 21% of respondents achieved more than six points, that is, about one fifth of the sample answered more than half of the items correctly.

The most difficult item was item 1.3 (19% correct responses), the easiest was item 2.1 (62% correct responses). On average, item difficulty¹⁵ (that is, the percentage of correctly solved items) was 35.1%.

Another property of test items is the measure of item-total correlation, an indicator which refers to how closely an item resembles the overall test. Items with higher item-total correlation are better at distinguishing a person with high ability (with ability defined in the sense of the test in question) from a person with low ability, compared to items with lower item-total correlation. The indicator theoretically ranges from −1 to 1 (as any correlation). Item-total correlations in our test were between 0.201 (item 2.1) and 0.540 (item 2.4), with an average of 0.399.
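The indicator described above can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used in the study: the article does not state whether an item's own score was removed from the total before correlating (the "corrected" variant), so the `corrected` flag, the function names and the toy responses are all assumptions made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def item_total_correlations(scores, corrected=True):
    """scores: one row per respondent, one 0/1 entry per item.

    With corrected=True the item is excluded from the total it is
    correlated with (an assumption; the article does not specify this).
    """
    n_items = len(scores[0])
    result = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        total = [sum(row) - (row[j] if corrected else 0) for row in scores]
        result.append(pearson(item, total))
    return result

# Hypothetical toy data (5 respondents x 3 items), NOT the study's responses.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
r = item_total_correlations(toy)
```

Each value in `r` lies between −1 and 1; an item whose value is close to zero contributes little to separating high from low scorers.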

Items 1.5 and 3.3 both asked ‘If it [the line] doesn’t make sense to you, please say what causes the difficulty. If it does make sense, please explain how you worked it out’, referring to lines 1 and 13, respectively. A point was only given if the respondent had replied ‘yes’ to the previous item (‘Does the line make sense to you?’), and they had given a satisfactory explanation as to how they had worked out the meaning of the line. A satisfactory explanation as to what caused any difficulty in understanding the line (i.e. one given by such respondents who had replied ‘no’ to item 1.4) was not treated as evidence of text comprehension skills, receiving a score of 0. However, since this is relevant as an indicator of the ability to reflect on and identify gaps in one’s own knowledge, such respondents’ answers will be considered in some detail in section 3.2. Table 3 gives an overview of percentage of correct responses and item-total correlations for those items which did contribute to the overall score.

Table 3.

Test items.

Item no	Percentage correct responses	Item-total correlation
1.1	47.21	0.340
1.3	19.07	0.309
1.5	21.86	0.435
2.1	61.86	0.201
2.2	23.72	0.516
2.3	21.40	0.504
2.4	23.49	0.540
2.5	36.74	0.242
2.6	45.81	0.374
3.1	59.07	0.397
3.3	30.70	0.429
3.4	30.23	0.495
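The summary figures reported in this section can be rechecked from Table 3 itself. The short Python sketch below copies the table's values and recomputes the averages; the variable names are illustrative, and the step from item percentages to the mean test score assumes that every percentage is based on all respondents, with unanswered items counted as wrong.

```python
# (item, percentage correct, item-total correlation), copied from Table 3
table3 = [
    ("1.1", 47.21, 0.340), ("1.3", 19.07, 0.309), ("1.5", 21.86, 0.435),
    ("2.1", 61.86, 0.201), ("2.2", 23.72, 0.516), ("2.3", 21.40, 0.504),
    ("2.4", 23.49, 0.540), ("2.5", 36.74, 0.242), ("2.6", 45.81, 0.374),
    ("3.1", 59.07, 0.397), ("3.3", 30.70, 0.429), ("3.4", 30.23, 0.495),
]

mean_difficulty = sum(p for _, p, _ in table3) / len(table3)
mean_item_total = sum(r for _, _, r in table3) / len(table3)

# With 0/1 scoring, the mean test score is the sum of the item
# proportions correct (assumption: same respondent base for every item).
mean_score = sum(p for _, p, _ in table3) / 100

easiest = max(table3, key=lambda row: row[1])[0]
hardest = min(table3, key=lambda row: row[1])[0]

print(round(mean_difficulty, 1))  # 35.1
print(round(mean_score, 2))       # 4.21
print(easiest, hardest)           # 2.1 1.3
```

The recomputed values agree with the average item difficulty, the mean score and the easiest and most difficult items reported in the text.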

Cronbach’s alpha – an indicator of a test’s reliability which measures internal consistency on a scale from zero to one¹⁶ – was 0.76, which shows that our test works fairly well in this respect.
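For readers unfamiliar with the coefficient, the following Python sketch shows one standard way of computing Cronbach's alpha from a matrix of item scores: the number of items k, divided by k − 1, times one minus the ratio of the summed item variances to the variance of the total scores. It is offered only as an illustration; the function name and the toy responses are assumptions, and the study's raw data are not reproduced here.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, one numeric entry per item."""
    k = len(scores[0])
    # Variance of each item across respondents
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of the respondents' total scores
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical toy data (5 respondents x 3 items), NOT the study's responses.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy)
```

The coefficient rises when items vary together, so that the variance of the total score is large relative to the sum of the individual item variances.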

3.2 Ability to reflect on one’s own comprehension

As noted in section 2.2, the majority of the test items relate to text comprehension per se (and these entered the test score which indicates comprehension). However, it is a valuable skill to be able to reflect on one’s own comprehension and the possible reasons for its lack: this awareness allows readers to differentiate between challenging aspects of the text (e.g. ambiguities that allow for several concurrent interpretations) and their own ignorance of a cultural context or relevant connotations of a word. Moreover, it is a skill that allows readers to look for ways and means to remedy these shortcomings. Thus, future reading is supported, and one’s comprehension skills may be developed further. We therefore wished to assess whether and to what extent our respondents possess this skill, using some qualitative analysis of the content of students’ responses coupled with quantitative description of the pattern of responses.¹⁷

In the light of these considerations, we included two sets of items in our test to measure this reflexive skill. They are 1.4 with 1.5 and 3.2 with 3.3. Items 1.4 and 3.2 ask ‘Does the line make sense to you?’, referring to lines 1 and 13, respectively. Items 1.5 and 3.3 then ask ‘If it [the line] doesn’t make sense to you, please say what causes the difficulty. If it does make sense, please explain how you worked it out’. It takes a certain skill to be able to identify reasons for not understanding a particular line or passage of text, and we aimed at finding out what types of problems were identified by our respondents. It is hardly less demanding, however, to know why one does understand something, and we were curious to learn if there were any informative patterns in the positive answers given.

First, though, we established how many respondents claimed that the lines had made sense to them. This was the majority in each case: 83.5% said this was the case for line 1 (item 1.4) and 78.6% for line 13 (item 3.2). Not all of these then gave responses in items 1.5/3.3 that were scored as adequate; in other words, not all of them could explain how they had worked out the meaning of the lines. This is what was to be expected. Additional analyses of responses to items 1.1 and 3.1, which deal more directly with comprehension of these two lines, however, show that there was a certain discrepancy in perception: if someone had indeed understood the relevant line correctly, then they should have also given correct responses to items 1.1/3.1. Out of those who gave a satisfactory answer in 1.5/3.3 and explained how they made sense of the relevant line, 87.2% and 93.2%, respectively, had given correct responses to items 1.1 and 3.1. For those who claimed to have understood the lines but could not explain how in their responses to items 1.5 and 3.3, it was rarer (though by no means uncommon – 36.3% and 59.7%, respectively) to have given responses to items 1.1 and 3.1 which actually did show adequate comprehension. This indicates that the claim to have understood a verbal utterance is not necessarily confirmed by successfully solving a task that shows such an understanding. In other words: asking someone whether they have understood an utterance does not elicit reliable answers. Being able to say how one has established the meaning of a line seems to be a more reliable indicator of actually having understood it.
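The comparison made in this paragraph is a comparison of conditional proportions, which may be sketched as follows. The record type `Response`, the function `share_correct`, and the six records in `toy` are hypothetical and serve only to show the calculation; they do not reproduce our data.

```python
from dataclasses import dataclass


@dataclass
class Response:
    claims_sense: bool      # 'yes' to "Does the line make sense to you?" (1.4 / 3.2)
    explained: bool         # explanation scored as satisfactory (1.5 / 3.3)
    comprehension_ok: bool  # correct response to the comprehension item (1.1 / 3.1)


def share_correct(responses, explained):
    """Share of correct comprehension items among respondents who claimed the
    line made sense, restricted to those with (or without) a satisfactory
    explanation. Returns None if the group is empty."""
    group = [r for r in responses if r.claims_sense and r.explained == explained]
    if not group:
        return None
    return sum(r.comprehension_ok for r in group) / len(group)


# Hypothetical toy records, for illustration only.
toy = [
    Response(True, True, True),
    Response(True, True, True),
    Response(True, False, True),
    Response(True, False, False),
    Response(True, False, False),
    Response(False, False, False),
]

print(share_correct(toy, explained=True))
print(share_correct(toy, explained=False))
```

A large gap between the two shares is what suggests that being able to explain one's reading is the more informative indicator of the two.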

As to the responses provided by those who had given a satisfactory answer to 1.5 and 3.3, we noticed some differences between the two items in the kinds of answers given. The explanations offered in response to both items fall into more or less clearly distinguishable categories. As regards 1.5, of 102 statements of reasons (some participants mentioned more than one way of having worked out the meaning), 41 mention considering the context within the poem, 22 mention awareness of metaphors and of literary figures and conventions generally, 15 general world knowledge, 10 considering or obtaining lexical information, 8 analysing the syntax, and 7 give various other explanations. As regards 3.3, the 130 statements comprise 25 references to context, 23 to metaphors, etc., 13 to lexical information, 9 to syntax, 4 to world knowledge, and 11 give other reasons. The largest group, however, only comes up for item 3.3: 45 respondents either mention that they have paraphrased or translated the line, or actually paraphrase it. As these are categorisations of open answers, a quantitative evaluation is hardly possible. But the tendency towards paraphrase may have to do with the difference between lines 1 and 13 of the poem. Since the first line does not yet offer much context, participants have to look ahead at the context to figure out its meaning, whereas by line 13 the context can be taken for granted, and the best way of making sure one has understood the line is to paraphrase it. A few respondents (7) indicated that they imagine what the speaker and/or addressee in the poem feel or think. This confirms the impression that the building up of context (and in this case the gradual formation of characters) contributes to our understanding of utterances.

Those who said that the lines had caused them difficulties and who were then able to name these difficulties mentioned problems that mainly fell into three categories: (1) Some respondents named a lack of relevant vocabulary as the reason for their difficulties. For example, ‘I don’t understand the word “wink” in this context, but I think it’s crucial for understanding the whole line and its meanings.’ The word ‘wink’ was indeed one of the key words in the opening line of the poem. If students knew the word at all, they were more likely to be familiar with its modern meaning (i.e. opening and closing one’s eyes in rapid succession, see above) but not with the early modern usage, which is to close one’s eyes. This reading of the word is vital for being able to identify the (apparent) paradox inherent in the first line which in turn is likely to affect comprehension of the rest of the sonnet.¹⁸ (2) Others did actually note that the line contains a paradox – without using the term – and gave this as the reason for their difficulties in understanding the line. For example, one explanation for difficulties in understanding line 1 was ‘with eyes closed you cannot see’. One response on line 13 was, ‘No, because normally only days can be bright. Nights are normally dark’. (3) Word order was a further stumbling block mentioned in the responses. Sample responses were ‘It is difficult due to the word order’; and ‘The structure of the sentence confuses the reader; especially the second sentence (“and nights bright days when dreams do show thee me”) – for me it’s not clear where the term “dreams” belongs to.’ For line 13, vocabulary was not mentioned as causing problems, but the other two sources of problems, word order and paradox, were mentioned here, too.

Other responses do not so obviously fit into any of these three categories. For example, one student noted that they did not understand the opening line to begin with, but that, after reading on, the context of the poem as a whole helped them with the first line (‘Not at first as I explained above. But after reading it another time and putting it in context with the poem it started to make more sense.’). Here we see a link to the many context-related answers of respondents who claimed to have understood the first line (see above).

These preliminary analyses show that it is possible to ascertain reflexive skill by means of a standardised paper-and-pencil test. An interesting point to note from the sample responses given above is that a certain level of comprehension is necessary in order to be able to reflect on and verbalise one’s own comprehension processes. For example, it is only possible to identify paradoxes – as several respondents had done – if the different elements of the sentence or line are understood well enough for the respondent to become aware of their potentially contradictory nature.

3.3 Validation

‘Validity concerns the extent to which the test tests what it is supposed to test; it must measure what it purports to measure’ (Cohen et al., 2018: 572). One of the ways of assessing the validity of a test is to investigate factors that may be expected to be associated with better performance in the test. If a positive association between such factors and the test score is found, this is an indicator of the test’s validity. We conducted a regression analysis with the test score as the outcome and the following background factors as predictors: overall Abitur grade, English grade in the last year of schooling, having attended school or university in an English-speaking country, daily reading of English-language novels, and having read Shakespeare previously. We also included as a factor whether someone studied for a teaching degree or whether they attended a different programme of study, for example, a BA programme which included English as a minor or major (for details of these background factors and how they were distributed amongst our respondents see section 2.3). Table 4 shows the results of the regression analysis.

Table 4.

Regression analysis with background predictors for test performance.

Predictor	Unstandardised coefficients B	Standard error	Standardised coefficients beta	t	Sig.
(Constant)	2.693	1.606	—	1.677	0.095
Abitur grade	−1.390	0.284	−0.282	−4.895	0.000
English grade	0.312	0.094	0.189	3.311	0.001
School/university in English-speaking country	−0.183	0.432	−0.021	−0.424	0.672
Reads novels daily	1.102	0.404	0.136	2.728	0.007
Has read Shakespeare previously	1.013	0.390	0.128	2.595	0.010
Teacher training course	−0.164	0.309	−0.026	−0.531	0.596

R²: 0.248. The statistically significant factors are Abitur grade, English grade, Reads novels daily, and Has read Shakespeare previously.
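As a reading aid for Table 4, the following sketch recomputes each t value as the ratio of the unstandardised coefficient B to its standard error and lists the predictors whose reported significance falls below a threshold. The values are copied from the table; the threshold of 0.05 in `ALPHA` is an assumption (the conventional level), and the function names are hypothetical.

```python
# (B, standard error, reported t, reported Sig.) copied from Table 4.
TABLE_4 = {
    "(Constant)": (2.693, 1.606, 1.677, 0.095),
    "Abitur grade": (-1.390, 0.284, -4.895, 0.000),
    "English grade": (0.312, 0.094, 3.311, 0.001),
    "School/university in English-speaking country": (-0.183, 0.432, -0.424, 0.672),
    "Reads novels daily": (1.102, 0.404, 2.728, 0.007),
    "Has read Shakespeare previously": (1.013, 0.390, 2.595, 0.010),
    "Teacher training course": (-0.164, 0.309, -0.531, 0.596),
}

ALPHA = 0.05  # assumed significance level, not stated in the table


def t_statistic(b, se):
    """t value of a regression coefficient: estimate divided by its standard error."""
    return b / se


def significant_predictors(table, alpha=ALPHA):
    """Names of predictors (excluding the constant) with reported Sig. below alpha."""
    return [
        name
        for name, (_, _, _, sig) in table.items()
        if name != "(Constant)" and sig < alpha
    ]


for name, (b, se, reported_t, _) in TABLE_4.items():
    print(f"{name}: recomputed t = {t_statistic(b, se):.3f}, reported t = {reported_t}")

print(significant_predictors(TABLE_4))
```

Small deviations between recomputed and reported t values are to be expected, because B and the standard error are themselves rounded to three decimal places in the table.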

It can be seen from the table that better school performance – both overall and specifically in the subject of English – was associated with higher test scores. Reading novels in English on a daily basis¹⁹ and having previous experience of reading Shakespeare were likewise associated with higher test scores. By contrast, having spent at least 3 months at school or university in an English-speaking country was not linked to an increase in test performance. Finally, the type of degree course – teacher training or other – was not associated with a difference in test performance. It is worth noting that our R² of 0.248 suggests that, while we have included important factors in our model, there are others which also contribute to test performance but were not measured by us.

4. Discussion and conclusion

In this paper, we introduced a test of text comprehension developed specifically for university students of English as a second language. In this context, text comprehension is clearly one of the central skills to be mastered by the students, which is why it is essential to have instruments suitable for assessing it. Literary texts frequently present challenges to comprehension, which makes them rewarding not only as reading material in general but also for assessing comprehension skills. Shakespeare’s Sonnet 43 was chosen as a representative example which offers a range of demanding textual phenomena. Our aim was to be able to assess a large group of respondents while not losing sight of the complexity of the skills involved in text comprehension, which is why we chose to employ an open, but standardised, item format.

The test’s psychometric properties are satisfactory, with Cronbach’s alpha at 0.76 indicating an acceptable level of overall reliability. There was a suitably broad range of difficulty in the individual items, and the item-total correlations ranged from acceptable to good. We explored the test’s validity by establishing its association with external factors in a multiple regression analysis.

Most of the indicators were positively associated with performance on the test (and none negatively), which is evidence for the validity of our test. While we have noted the relevant factors in section 2.3, we would just like to comment on two of them in particular here: firstly, the previous experience of reading literary texts as indicated through the factors ‘Reads novels daily’ and ‘Has read Shakespeare previously’. Since the experience of reading literary texts contributes greatly to text comprehension skills, the relationship between those two factors and performance on our test confirms that the latter does indeed assess such skills. This finding stresses the importance of the students’ own reading: it enables them to read and understand a wide variety of literary texts.

Secondly, it is particularly striking that those who ‘Read novels daily’ and ‘Have read Shakespeare previously’ are also better at reflecting on their own process of comprehension (items 1.3, 1.5, 2.3, 3.3, 3.4). The reflective skills assessed by the test are important for a number of reasons. In the first place, they are a more reliable indicator of a reader’s actually having understood the verbal utterance in question (in this case, lines of a complex poem) than the claim to have understood it. This observation should be relevant to the teaching of text comprehension, and literature in general. Furthermore, students need to be aware of the limits of their own comprehension skills so that they know what to address: awareness of language change, for example, will motivate them to consult a historical dictionary like the OED so as to realise that words like ‘wink’ do not necessarily mean what we think. Awareness of apparent contradictions will encourage them to read up on the functions of rhetorical figures such as paradox. In order to improve one’s understanding, it is necessary first to be aware of the precise reasons for what hinders it.

But this is only part of the story (as every reader of this article who has ever tried to teach texts knows): very often, students/readers are not even aware of not having understood what they have read. They have the impression of having understood it, but their reaction to a particular question about the text (or to the task of paraphrasing a line or passage) shows that they have not. Our test also shows this phenomenon (see above, 3.2) and indicates where intervention (by a teacher, for example) might become necessary on a meta-level, for example, by promoting both the awareness of and the investigation into difficulties such as obsolete word meanings (‘wink’ in our example), ambiguity, figurative expressions, obstacles to integrating the context into the meaning of individual lines or sentences, etc. To be alert to such possible hindrances to understanding a text may facilitate understanding, after all. We thus also aim at identifying phenomena that may, on a more general level, obstruct text comprehension, which may help learners and teachers alike to anticipate problems and challenges. And thus, little by little, we may eventually illuminate parts of that black box of text comprehension and how it works. Shakespeare proves a masterful helper when it comes to turning the nights of blindness into bright days of understanding.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: German Federal Ministry of Education and Research (Bundesministerium für Bildung und Forschung, 01JA1911).

ORCID iDs

Matthias Bauer

Judith Glaesser

Augustin Kelava

Angelika Zirker

Appendix

Table A1.

The test

Please read the following poem:

William Shakespeare’s Sonnet 43

01 When most I wink, then do mine eyes best see;

02 For all the day they view things unrespected,

03 But when I sleep, in dreams they look on thee,

04 And darkly bright, are bright in dark directed.

05 Then thou whose shadow shadows doth make bright

06 How would thy shadow’s form form happy show

07 To the clear day with thy much clearer light

08 When to unseeing eyes thy shade shines so?

09 How would (I say) mine eyes be blessed made

10 By looking on thee in the living day

11 When in dead night thy fair imperfect shade

12 Through heavy sleep on sightless eyes doth stay?

13 All days are nights to see till I see thee

14 And nights bright days when dreams do show thee me.

Task 1: line 1 ‘When most I wink, then do mine eyes best see;’

Task 1.1: What is the relationship between part 1 of the line (‘When most I wink’) and part 2 (‘then do mine eyes best see’)?

Task 1.2: Does the line have more meanings than one? □ yes □ no

Task 1.3: Give a reason for your answer.

Task 1.4: Does the line make sense to you? □ yes □ no

Task 1.5: If it doesn’t make sense to you, please say what causes the difficulty. If it does make sense, please explain how you worked it out.

Task 2: lines 5–6 ‘Then thou whose shadow shadows doth make bright, / How would thy shadow’s form form happy show’

Task 2.1: Read line 6 and then decide on the word classes of the phrase ‘shadow’s form form happy show’ and mark them accordingly (N / V / Adj…).

Task 2.2: Read line 5 and then decide on the word classes of the phrase ‘whose shadow shadows’ and mark them accordingly (N / V / Adj…).

Task 2.3: Do ‘shadow’ and ‘shadows’ in line 5 have the same meaning? Give reasons for your answer.

Task 2.4: Paraphrase or translate line 5.

Task 2.5: Paraphrase or translate line 6.

Task 2.6: With regard to the form of lines 5–6, what do you find striking?

Task 3: lines 13–14 ‘All days are nights to see till I see thee, / And nights bright days when dreams do show thee me.’

Task 3.1: Paraphrase or translate lines 13–14.

Task 3.2: Does line 13 make sense to you? □ yes □ no

Task 3.3: If it doesn’t make sense to you, please say what causes the difficulty. If it does make sense, please explain how you worked it out.

Task 3.4: Going back to line 1: Explain if and how line 1 makes sense in the overall context of the poem.

References

Bauer and Beck (2014) On the meaning of fictional texts. In: Gutzmann, Köpping and Meier (eds) Approaches to Meaning: Composition, Values, and Interpretation. Leiden: Brill, 250–275.

Bauer and Beck (2021) Isomorphic mapping in fictional interpretation. In: Maier and Stokke (eds) The Language of Fiction. Oxford: Oxford University Press, 277–296.

Bauer, Beck, Riecker, et al. (2020) Linguistics Meets Literature: More on the Grammar of Emily Dickinson. Berlin: De Gruyter.

Bauer, Knape, Koch, et al. (2010) Dimensionen der Ambiguität. Zeitschrift für Literaturwissenschaft und Linguistik 158: 7–75.

Bauer and Zirker (2017) Shakespeare und die Bilder der Vorstellung: ‘The soul’s imaginary sight’ im 27. Sonett. In: Robert (ed) Diesseits des “Laokoon”: Funktionen literarischer Intermedialität in der Frühen Neuzeit. Berlin: De Gruyter, 39–54.

Beck and Klieme (eds) (2007) Sprachliche Kompetenzen: Konzepte und Messung. DESI-Studie. Weinheim: Beltz.

Belsey (2007) Why Shakespeare? Basingstoke: Palgrave Macmillan.

Bernhardt (2010) Second-language readers and literary text. In: Bernhardt (ed) Understanding Advanced Second-Language Reading. London: Routledge, 81–100.

Blakemore Evans (ed) (2006) William Shakespeare: The Sonnets. 2nd edition. Cambridge: Cambridge University Press.

Booth (ed) (1977) Shakespeare’s Sonnets. New Haven: Yale University Press.

Brockmann, Riecker, Bade, et al. (2017) FictionalAssert and implicatures. In: Linguistic Evidence 2016: Empirical, Theoretical, and Computational Perspectives. Tübingen: Conference University of Tübingen. DOI: 10.15496/publikation-19038.

Brogan TVF (2012) Polyptoton. In: Greene (ed) The Princeton Encyclopedia of Poetry and Poetics. Princeton: Princeton UP, 1086–1087.

Burrow (ed) (2002) William Shakespeare: The Complete Sonnets and Poems. Oxford: Oxford University Press.

Burwitz-Melzer (2007) Ein Lesekompetenzmodell für den fremdsprachlichen Literaturunterricht. In: Bredella and Hallet (eds) Literaturunterricht, Kompetenzen und Bildung. Trier: Wissenschaftlicher Verlag Trier, 127–157.

Chen and Meurers (2016) CTAP: a web-based tool supporting automatic complexity analysis. In: Coling (ed) Proceedings of the workshop on computational linguistics for linguistic complexity, Osaka, 113–119. http://aclweb.org/anthology/W116-4113.pdf

Chen and Meurers (2019) Linking text readability and learner proficiency using linguistic complexity feature vector distance. Computer Assisted Language Learning 32(4): 418–447.

Cohen, Manion and Morrison (2018) Research Methods in Education. 8th edition. London: Routledge.

Das and Bhushan (2005) From hard poetics to situated reading: A cognitive-empirical study of imagery and graded figurative language. In: Veivo, Pettersson and Polvinen (eds) Cognition and Literary Interpretation in Practice. Helsinki: Helsinki University Press, 219–236.

Dixon and Bortolussi (1996) Literary communication: effects of reader-narrator cooperation. Poetics 23: 405–430.

Duncan-Jones (ed) (2010) Shakespeare’s Sonnets. 2nd edition. London: The Arden Shakespeare.

Fox and Alexander (2009) Text comprehension: a retrospective, perspective, and prospective. In: Israel and Duffy (eds) Handbook of Research on Reading Comprehension. New York: Routledge, 227–239.

Grabe and Stoller (2018) How reading comprehension works. In: Newton (ed) Teaching English to Second Language Learners in Academic Contexts: Reading, Writing, Listening, and Speaking. New York: Routledge, 9–27.

Hall (2015) Literature in Language Education. 2nd edition. Basingstoke: Palgrave Macmillan.

Hammond (ed) (2012) Shakespeare’s Sonnets: An Original-Spelling Text. Oxford: Oxford University Press.

Kintsch (1998) Comprehension: A Paradigm for Cognition. Cambridge: Cambridge University Press.

Klieme (ed) (2008) Unterricht und Kompetenzerwerb in Deutsch und Englisch: Ergebnisse der DESI-Studie. Weinheim: Beltz.

Klieme and Hartig (2007) Kompetenzkonzepte in den Sozialwissenschaften und im erziehungswissenschaftlichen Diskurs. Kompetenzdiagnostik: Zeitschrift für Erziehungswissenschaft 8: 11–29.

Kramsch (1985) Literary texts in the classroom: a discourse. The Modern Language Journal 69(4): 356–366.

Kramsch (1993) Context and Culture in Language Teaching. Oxford: Oxford University Press.

Leslie and Caldwell (2009) Formal and informal measures of reading comprehension. In: Israel and Duffy (eds) Handbook of Research on Reading Comprehension. New York: Routledge, 402–427.

Leslie and Schudt Caldwell (2017) Assessment of reading comprehension: challenges and directions. In: Israel (ed) Handbook of Research on Reading Comprehension. 2nd edition. New York: The Guilford Press, 219–240.

Mabillard (2000) Why Study Shakespeare? Shakespeare Online. http://www.shakespeare-online.com/biography/whystudyshakespeare.html (accessed 20 August 2000).

McNamara and Magliano (2009) Toward a comprehensive model of comprehension. In: Ross (ed) The Psychology of Learning and Motivation. Amsterdam: Elsevier Academic Press, 297–384. DOI: 10.1016/S0079-7421(09)51009-2

Meireles (2005) Leseverstehen aus der Perspektive des Nicht-Muttersprachlers. In: Blühdorn, Breindl and Waßner (eds) Text – Verstehen: Grammatik und darüber hinaus. Berlin: De Gruyter, 299–314.

Miall (2006) Literary Reading: Empirical and Theoretical Studies. New York: Peter Lang.

Paran (2008) The role of literature in instructed foreign language learning and teaching: an evidence-based survey. Language Teaching 41(4): 465–496.

Rupp (2002) Empirisches Beispiel: Interpretation im Literaturunterricht. In: Groeben and Hurrelmann (eds) Lesekompetenz: Bedingungen, Dimensionen, Funktionen. Weinheim: Juventa, 106–122.

Smith (2019) This is Shakespeare. London: Random House.

Strohner (2005) Textverstehen aus psycholinguistischer Sicht. In: Blühdorn, Breindl and Waßner (eds) Text – Verstehen: Grammatik und darüber hinaus. Berlin: De Gruyter, 187–204.

Van Dijk and Kintsch (1983) Strategies of Discourse Comprehension. New York: Academic Press.

Vajjala and Meurers (2014) Assessing the relative reading level of sentence pairs for text simplification. Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL-14), Gothenburg, Sweden. http://aclweb.org/anthology/E14-1031.pdf

Vendler (1997) The Art of Shakespeare’s Sonnets. Harvard: Harvard University Press.

Zwaan (1993) Aspects of Literary Comprehension: A Cognitive Approach. Amsterdam: Benjamins.

‘When most I wink, then’ – what? Assessing the comprehension of literary texts in university students of English as a second language
