Abstract

As the field of paramedic-led research continues to advance, we must be cognizant of the challenges and threats to rigor which may create a weak evidence base. Research with limited conceptual consideration, related to both collection and reporting, creates foundational instability in our evidence base. One area of particular concern is survey research.1 In a recent editorial,2 Dr Paul Simpson highlighted some of the contemporary challenges associated with survey research for scholars in paramedicine.
In this editorial, we will seek to highlight additional considerations which are critical for a successful project, focusing on key considerations in the conceptualization, operationalization, and analytic stages of the research process. Survey research can be more nuanced than it may initially appear. When developing a survey, it is critical to ask yourself the following questions.
How am I going to capture data which aligns with my research questions?
In surveys, researchers may choose to use questionnaires, checklists, or scales. Each of these tools has different methodological considerations for appropriate use.3 For researchers who choose previously validated scales, there is strength insofar as the data can be compared in an “apples to apples” way with prior applications.
However, previously validated scales are exactly that: they have been shown to be valid and reliable in previous study populations, which may not match your own sample characteristics. As a result, the scale may not have the same relevance or utility in your sample, depending on its development and validation. When using previously validated scales, keep in mind that you will need to demonstrate the credibility of your results. This includes addressing the following key considerations.
Is this scale relevant for the population I’m studying? Certain scales may have been validated on populations with different demographics or other characteristics. Has it been used with paramedics before? If not, is there evidence from comparable populations (e.g., law enforcement or fire services) that supports this scale as an appropriate choice? Are you using the scale as designed? If you adapt a scale, use specific subsets of a scale, or otherwise alter the instrument, you can no longer claim that it has been previously validated. If you change the scale, you will need to build additional checks into the survey to support the reliability and validity of your tool. Describing the steps of scale validation is outside the scope of this editorial; however, resources are available for those who may need them.4
Are you using the instrument in a way that it has been used previously? Pay attention to previous publications and be familiar with their analytic strategies. What cutoffs (if any) were used? If a scale was validated as a continuous measure, be prepared to defend the choice to use it as a dichotomous measure in your study. If you are using a previously validated measure, always include evidence that the scale behaved reliably in your study by reporting accepted metrics (e.g., Cronbach's alpha).5
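Reporting a reliability coefficient need not be onerous. As a rough, hypothetical sketch (the responses below are invented for illustration), Cronbach's alpha can be computed directly from item variances and the variance of the total scores:

```python
# Minimal, illustrative sketch of Cronbach's alpha for a k-item scale.
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total scores))

def cronbach_alpha(items):
    """items: a list of per-item response lists, each of length n_respondents."""
    k = len(items)
    n = len(items[0])

    def variance(xs):
        # Sample variance (n - 1 denominator), used consistently throughout.
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Invented data: 4 items, 5 respondents (Likert 1-5), rows are items.
responses = [
    [4, 3, 5, 2, 4],
    [4, 4, 5, 2, 3],
    [3, 3, 4, 1, 4],
    [4, 3, 5, 2, 5],
]
print(round(cronbach_alpha(responses), 2))  # prints 0.95
```

Values of 0.7 or above are conventionally treated as acceptable internal consistency, though the threshold you apply should follow the precedent set in prior uses of the scale.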
If no specific scale exists that will help you answer your research questions, you may consider creating your own.6 In this situation, survey development should include robust consideration of face validity,7 convergent and discriminant validity, and analyses demonstrating that the scale captures the underlying construct, such as factor analysis.8–10
How am I selecting my participants—consciously and/or unconsciously?
Serious consideration should be given to how you recruit your sample and collect your data from participants. In-person recruitment may create social pressure to participate.11 Electronic recruitment may mitigate social pressure by creating distance between the researcher and the participant, but may also introduce significant concerns related to incomplete responses and nonresponse. It can also bias your sample toward those who believe your topic is important.12 Web-based recruitment may be vulnerable to participants outside the intended sampling frame (e.g., people who are not paramedics) or, increasingly, fraudulent responses from bots.13,14
Each of these methodological considerations presents a different set of choices for researchers. If you are approaching participants in person, how do you ameliorate the social pressure to participate? If you are reaching out over email, how do you address nonresponse?15 And if you are using web-based recruitment, how do you make sure you are gathering responses from your targeted participants? What strategies can you use to support the relative representativeness of the sample? Addressing these concerns depends on the approach to sampling; examples include clearly indicating the number of responses within the sampling frame (and reporting the response rate in your publication), deploying a nonresponder survey, or comparing the demography of your study to previous studies targeting the same participant profile (in this case, other studies of paramedics). No recruitment strategy is perfect; however, being clear about the threats to validity and building in strategies to address those threats are key steps in the planning process.
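As a simple illustration of the transparency described above (all counts here are invented), the basic figures a reader needs can be reported as a response rate against the known sampling frame and a completion rate among those who started:

```python
# Hypothetical reporting sketch for a survey with a known sampling frame.
frame_size = 1200   # paramedics invited (the full sampling frame)
started = 430       # opened or started the survey
completed = 361     # usable, complete responses

response_rate = completed / frame_size   # share of the whole frame
completion_rate = completed / started    # share of those who started

print(f"Response rate: {response_rate:.1%}")
print(f"Completion rate: {completion_rate:.1%}")
```

Reporting both figures separately lets readers judge nonresponse (frame to start) and attrition (start to completion) as distinct threats to validity.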
The bottom line is that sample characteristics matter for the conclusions you draw, and ultimately for how your findings are subsequently used in the public domain. For example, participants may be more likely to take part in your survey if they believe your research topic is important.12 How your recruitment wording is framed can introduce bias: participants may be more (or less) likely to respond based on the way you have portrayed the intent of your research.16 For example, differently worded online advertisements recruiting Australian men for an online mental health study resulted in different recruitment rates, with differing mental health symptom profiles.17
If your survey sample is not broadly representative of all of the paramedics you are attempting to generalize to (e.g., “Australian paramedics”), due to factors such as recruitment channels and related demographic skew, findings cannot be generalized to the entire population. For instance, high prevalence rates of particular mental health symptoms in a survey of paramedics might reflect those most affected (and thus selecting into the study) rather than all paramedics. In the example from Choi et al.,17 recruitment wording such as “strength” yielded higher recruitment rates (although these did not translate into completion of mental health measures) compared to “mental health” advertisements, and mental health symptom severity scores differed by recruitment language. Wording framed as a question (e.g., “worried about mental health?”) was associated with poorer mental health symptoms relative to wording focused on mental fitness or resilience.17 Carefully consider how recruitment framing biases which paramedics self-select into your studies, and the downstream impacts on any mental health rates you then report.
What am I actually measuring in my survey?
The importance of knowing, and appropriately reporting, your sample becomes evident when we look at a working example. For demonstration, we’ll consider measurement and understanding of mental health symptoms and disorders in paramedics. Estimating the prevalence of mental health conditions in a specific population is not straightforward. While surveys are a common tool, the method used to define and measure a mental health condition can influence the reported rates of particular symptoms or conditions. Understanding these methodological differences is crucial for interpreting findings accurately.
Clinical interviews conducted by trained mental health professionals broadly represent the recommended methodology for identifying mental health conditions. Clinical interviews allow for nuanced assessment using established diagnostic criteria, considering a variety of factors including symptom severity, duration, and presence of any functional impairment. However, this approach is rarely feasible for large-scale studies due to cost, time, and logistical constraints. Self-reported diagnoses of mental health conditions, where participants are asked to indicate whether they have been diagnosed with a mental health condition by a professional, can also be used in surveys. While this method is practical, it also introduces bias. Not everyone with mental health symptoms seeks help, and help-seeking behavior varies by factors such as stigma, access to care, occupation type, and cultural influences.18–22 Prevalence estimates based on self-reported diagnoses alone may therefore poorly reflect true rates, particularly in populations facing barriers to mental health services.
Validated symptom scales (e.g., the Patient Health Questionnaire [PHQ-9]23 for depressive symptoms, or the Generalized Anxiety Disorder scale [GAD-7]24 for anxiety symptoms) are often used as screening instruments. We use the PHQ-9 and GAD-7 here simply as brief illustrative examples, and recognize that screeners should be selected based on their validity and reliability in your specific population (and geographical location) of interest. Brief screening instruments such as the PHQ-9 and GAD-7 measure the presence and severity of symptoms associated with mental health conditions. A high score is suggestive of more severe symptoms, but cannot definitively confirm a diagnosis, as these brief scales do not account for differential diagnoses or functional impairment.
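To make the distinction concrete, the sketch below contrasts the published PHQ-9 severity bands (0–4 minimal, 5–9 mild, 10–14 moderate, 15–19 moderately severe, 20–27 severe) with the commonly used dichotomous screening threshold of 10 or more; the function names are ours, for illustration only:

```python
# Illustrative sketch: PHQ-9 totals range 0-27, with published severity bands.
def phq9_severity(total):
    """Map a PHQ-9 total score to its published severity band."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 totals range from 0 to 27")
    if total <= 4:
        return "minimal"
    if total <= 9:
        return "mild"
    if total <= 14:
        return "moderate"
    if total <= 19:
        return "moderately severe"
    return "severe"

def screens_positive(total, cutoff=10):
    """Dichotomous screen at a commonly used cutoff; this is a screen,
    not a diagnosis, and discards the severity information above."""
    return total >= cutoff

print(phq9_severity(12))      # prints "moderate"
print(screens_positive(12))   # prints True
```

Note how a score of 12 and a score of 27 both screen positive at the same cutoff; reporting only the dichotomized figure hides that difference, which is one reason to state clearly which convention (and which cutoff) your prevalence figures rest on.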
Your choice of measurement method has considerable implications for interpretation. For example, reporting that a particular percentage of paramedics experience depression based on symptom scales in a convenience sample does not mean that percentage will hold true for all paramedics having a clinically diagnosed depressive disorder; yet, it can be easy for these data to be misinterpreted this way. Without clarifying the basis of measurement, such figures risk misinforming policy and practice, or blurring our understanding of risk factors and intervention approaches which hold relevance to the broader profession.
So what is the solution? Three quick tips to get you started
Ultimately, we know that underpowered sample sizes and convenience samples (such as those recruited by snowballing, word of mouth, and sharing online) can add value to our understanding in the field. This is particularly the case if careful consideration and attention are paid to the study design. Using existing reporting frameworks to structure your publications can be very helpful for transparently representing your findings. Examples include the CROSS checklist,25 which provides guidance for survey or cross-sectional studies, and the CHERRIES checklist26 for internet-based surveys.
Consider your recruitment strategy, and the language used to communicate with potential participants.
In the design phase: Think about the language you use to recruit participants in flyers. Are you framing the research in a way that makes it more likely to appeal to particular participants and bias your sample? What could this mean for your study, and how could you work on recruitment language to support involvement of your target participant group?
When you are reporting your findings: Draw on reporting guidelines to ensure the reader has the best possible picture of your recruitment strategy. Make sure you are transparent about how you recruited your participants, and where advertisements were posted. Report the use of any financial incentives, and any specific demographics which may have been preferentially recruited (either intentionally or inadvertently).
Carefully consider your outcome measures at the beginning of your study
In the study design phase: Carefully consider what you want your outcome measure to capture in your sample, as the way you ask your questions can influence findings. For example, self-reported diagnosis of a particular health condition may offer insight into help-seeking rates, but may underestimate unmet need if you are asking about formal diagnoses where help seeking is low, or stigmatized. Symptom scales are valuable for screening and identifying risk but should not be conflated with diagnostic prevalence. Consider whether they hold relevance for your population of interest, and identify any evidence to support this decision (i.e., from existing literature).
When you are reporting your findings: Researchers must clearly state how the primary outcomes of interest have been measured, and acknowledge limitations; no study is “free” from limitations, and transparent reporting is critical. Consider language such as “possible depression,” “probable depression,” or “symptoms of depression” if you do not have a confirmed diagnosis. Ideally, compare like-for-like; identifying existing studies which report rates for symptom cut-points on existing scales and comparing your rates is more helpful than simply reporting your rates alone. Additionally, look for ways to compare (and importantly, report) the demographics of your study sample relative to what is known about paramedics more broadly in your region, or internationally if appropriate to your research question. This ensures you give your readers enough information about the sample to which your findings apply.
If you are looking to compare symptoms of a particular mental or physical health condition in your sample with the broader population, make sure that the reference values you use are meaningful and appropriate for comparison. By way of example, it would not be appropriate to compare population-level mental health diagnoses by ICD code with a low-threshold cut point for symptoms on the PHQ-9 or GAD-7, as this may inflate symptom “prevalence” rates in your sample—potentially leading to incorrect statements about “prevalence” of particular mental health conditions in your sample.
Collaborate with methodological experts—from the beginning!
Consider the skills of your research team and, where appropriate, bring experts in research methodology, and/or specific speciality areas, into your research. Just as paramedics have unique expertise in their profession, so too do epidemiologists in methodological skills, and psychologists or social workers in mental health. Paramedicine is a developing profession, and encouraging transdisciplinary research supports stronger, more robust evidence development. While working together from the beginning of a study is ideal, connecting with methodological experts at any stage will help you grow your knowledge and research skills, and ultimately support research quality development in Paramedicine over time.27
Ultimately, transparency about methodology and sample representativeness is essential in reporting your findings. Prevalence figures are not absolute truths; they are estimates shaped by your measurement choices, the language you use when recruiting participants, how you recruit your samples, and broader contextual factors. Recognizing this complexity helps avoid misleading conclusions and supports clearer insights and robust, defensible research conclusions when using survey designs.
Footnotes
Author contribution(s)
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Amy Reynolds has no conflicts of interest to declare. Elizabeth Donnelly is a Deputy Editor at Paramedicine. In line with editorial policy, this editorial was not peer-reviewed.
