Abstract
Objectives:
This article delves into the challenges of medical data collection during the COVID-19 pandemic in developing countries, using Nigeria as a case study. It emphasizes how data collection impacts research quality, reliability, and validity.
Methods:
Qualitative research utilizing purposive sampling was employed to explore experiences in designing a diagnostic tool for febrile diseases in Nigeria. A questionnaire with selectable and open-ended questions was utilized for data collection, and 23 respondents participated.
Results:
Among 74 potential participants, 23 valid responses were gathered, revealing significant themes related to experiences and challenges in medical data collection. A multidisciplinary team approach proved beneficial, fostering collaboration, enhancing knowledge, and promoting positive experiences. Despite challenges with paper questionnaires, most participants preferred them for ease of use. Connectivity issues hindered timely data uploading and disrupted virtual meetings.
Conclusion:
Innovative and flexible strategies, such as a blended data collection approach and well-coordinated teams, were vital in overcoming challenges. Electronic data collection tools, reminders, and effective communication played key roles, leading to positive outcomes. This study provides valuable insights for researchers and practitioners involved in data collection, particularly in developing countries like Nigeria.
Introduction
Data collection in research is a complex process fraught with various challenges, such as difficulties in accessing the necessary individuals, low response rates to participation invitations, and individuals’ willingness to take part. In the medical field, data collection poses even greater challenges due to the sensitive nature of the health information being collected, which is typically confidential. Holden et al. 1 identified numerous challenges in contextual data collection within the healthcare sector and emphasized the importance of researchers anticipating, addressing, and navigating these challenges effectively. On the other hand, other researchers have emphasized that the data collection process, when executed accurately, stands as the most crucial aspect of research, significantly enhancing the quality, reliability, and validity of the research outcomes. 2
Advances in technology, such as online, mobile, and other digital platforms, are greatly improving the way researchers collect data. Nevertheless, these platforms bring their own challenges, such as user access to the technologies and knowledge of how to use them, especially among older adults and users in developing countries. 2 The COVID-19 pandemic forced researchers to adopt novel methods of data collection, including remote approaches and video conferencing technologies such as Zoom and MS Teams. 3,4 The global COVID-19 pandemic had a profound impact on medical data collection worldwide, including in Nigeria. The implementation of restrictions and safety measures to contain the spread of the virus significantly disrupted traditional methods of data collection, presenting considerable challenges to the healthcare system, including the mode of medical data collection. 3 Furthermore, the strain on healthcare resources and the shift in focus toward managing COVID-19 cases diverted attention and resources away from general disease management and research. Staff and resources were prioritized toward COVID-19 activities above all else. 5 Despite Nigeria’s relatively low number of reported COVID-19 deaths, 6 the government implemented COVID-related policies and procedures aligned with international practices and requirements, which impacted access to medical staff for data collection. In Nigeria, the COVID-19 pandemic exacerbated the already poor healthcare funding situation by compelling state governments to redirect health budgets toward COVID-19 interventions, including setting up quarantine centers, COVID testing hubs, and sanitization zones, at the expense of other health-related investments. 7 These disruptions had far-reaching consequences, impacting access to required medical personnel.
Besides, collecting public health data in developing countries can add more contextual challenges that are very different from those faced by previous researchers whose studies have been largely based in developed countries. 8
The global shortage of human resources for health, especially in low-income countries, is a severe threat to achieving the goals of universal health coverage. 9 Several rural communities in low- and middle-income countries (LMICs) lack access to quality health care. The worsening migration of physicians from LMICs to high-income countries has led to a growing interest in the training and use of middle- and lower-cadre community health workers to provide needed health services, especially at the primary care level. These frontline health workers (FHWs) are nonphysicians and range from newly trained community health extension workers (CHEWs) to more experienced and long-practicing community health officers (CHOs). To guide their practice, manual algorithms have been developed to enable them to diagnose and treat common diseases seen in the community at the primary care level. 10,11 In many instances, these FHWs face resource limitations that hinder their ability to accurately and promptly diagnose illnesses, particularly in cases involving multiple symptoms or diseases. Rural communities, particularly in LMICs like Nigeria, frequently experience a high prevalence of febrile diseases such as malaria, urinary tract infections, respiratory tract infections, tuberculosis, enteric fever, dengue fever, and yellow fever. These conditions may arise due to factors like inadequate vector control, poor sanitation, drug resistance, and self-diagnosis practices. Consequently, there is a pressing need to provide FHWs with a tool that can effectively support them in managing fever and other diseases accompanied by fever and comorbidities within these communities. Research has indicated that artificial intelligence (AI) has the potential to significantly enhance the diagnostic effectiveness and efficiency of FHWs operating in rural areas. 12
To address these challenges in developing countries, an international multidisciplinary research team was assembled, comprising computational science experts and medical experts based in Nigeria, Canada, the United States of America, Uganda, and the United Kingdom, to design and carry out research on developing a tool that can be used by FHWs to diagnose febrile illnesses. Some key members of the study team had earlier worked on developing AI interventions for the diagnosis of tropical diseases. 1 The overarching goal of the research project was to increase healthcare access for people living in rural and resource-poor communities in LMICs by developing a multi-disease, multi-symptom soft-computing system for early differential diagnosis of these diseases by FHWs. The project was partitioned into five phases (systematic literature review; field data collection; application modelling; application development; and testing of the application). This article focuses on the field data collection phase of the project. It analyses the qualitative experiences of the research assistants, physicians, state coordinators, project manager, and data quality analyst who carried out the data collection that took place between May 2021 and November 2021. The article provides novel insights into the contextual challenges in medical data collection in developing countries with a case study of the Nigerian context. It highlights the methodology adopted for the data collection on febrile diseases, the challenges faced, and a discussion on how these challenges can be avoided and mitigated for future data collections in similar contexts.
Methods
For this study, a qualitative research approach was employed, utilizing a questionnaire designed for different participant groups involved in the data collection process. The questionnaire consisted of a combination of questions with selectable options and open-ended questions, allowing respondents to share their experiences related to various aspects of the data collection process. The questionnaire was organized into four sections corresponding to the participant group types. The researchers designed the data collection instrument, which underwent peer review by senior colleagues not affiliated with the research team. Feedback and comments from the reviewers were incorporated to refine the tool before administering it to the identified participant groups.
Purposive sampling was used to select participants for this study because only people who participated in the data collection exercise could take part in the study. The participants were drawn from the 60 physicians who collected data from patients through consultations, 8 research assistants (RAs) who were responsible for fieldwork in the 4 participating states, 4 state coordinators who oversaw RA activities in each state, 1 project manager handling overall data collection management, and 1 data quality analyst (DQA) who was responsible for receiving, collating, compiling, and preparing field data for analysis. Hence, we had a population of 74 eligible participants to draw our sample from based on their roles in the data collection exercise. Patients whose data were collected by the physicians were excluded from participating in this study, as the focus was on the qualitative experiences of the data collectors. Eligible participants who did not give consent, despite several reminders to respond to the questionnaire, were also excluded from this study.
An invitation was sent to all 74 eligible participants to take part in this study and share their experiences of the data collection exercise. A self-reporting approach was employed, enabling respondents to provide their answers at their convenience. Out of the 60 physicians who took part in the data collection process, 11 responded to the invitation and shared their experiences and challenges in the data collection process. Data saturation was achieved from the perspective of the physicians when the same themes were repeated in the responses received from them; hence, there was no additional benefit in requesting responses from more physicians. Out of the eight RAs who took part in the data collection, seven responded and shared their experiences of the data collection exercise. Three out of the four state coordinators also shared their experiences of the data collection process, as did the project manager and the DQA. All the respondents who agreed to participate in this study provided their written informed consent before the questionnaires were administered to them.
Data collection took place across four states in the Niger Delta region of Nigeria and lasted 2 weeks, from 1 to 14 November 2021. The RAs had the responsibility of following up with the participating doctors to ensure that they promptly responded to the questionnaire. The DQA had the overall duty of prompting all team members to submit their responses, receiving all responses, uploading them to a spreadsheet, and collating them for analysis. The collected responses were stored in a central repository for collation and analysis. The data collection process resulted in a total of 23 valid responses that were used for the analysis. The analyses of all the experiences shared by the respondents formed the basis for this study, and the results are discussed in the next section.
Results
Out of a population of 74 eligible participants, 23 valid responses were received for analysis, resulting in a response rate of 31.1%. The majority of participants (47.83%) were physicians who had previously contributed their experiential knowledge to the main study by collecting data from patients. Following closely were the RAs (30.43%). Unfortunately, one research assistant and one state coordinator did not respond to the invitation to participate. Table 1 provides an overview of the demographic characteristics of the study participants, with an almost equal distribution of 11 males and 12 females.
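The percentages above follow directly from the group sizes given in the Methods section. As a minimal sketch of that arithmetic (the variable names are illustrative and not part of the study's own analysis), the figures can be reproduced as follows:

```python
# Group sizes as reported in the Methods section of this article.
eligible = {
    "physicians": 60,
    "research assistants": 8,
    "state coordinators": 4,
    "project manager": 1,
    "data quality analyst": 1,
}
responded = {
    "physicians": 11,
    "research assistants": 7,
    "state coordinators": 3,
    "project manager": 1,
    "data quality analyst": 1,
}

total_eligible = sum(eligible.values())
total_responded = sum(responded.values())

# Overall response rate: valid responses divided by eligible participants.
overall_rate = round(100 * total_responded / total_eligible, 1)

# Share of the final sample contributed by each group.
share_of_sample = {
    group: round(100 * count / total_responded, 2)
    for group, count in responded.items()
}

print(f"Eligible: {total_eligible}, responded: {total_responded}")
print(f"Overall response rate: {overall_rate}%")
for group, share in share_of_sample.items():
    print(f"{group}: {share}% of respondents")
```

Note that the two kinds of percentage use different denominators: the response rate is taken over the 74 eligible participants, while each group's share is taken over the 23 respondents.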
Table 1. Demographics of study participants.
The data from the open-ended questions were coded for each participant group and analyzed. We identified several themes centered around participants’ experiences and challenges in medical data collection. We have presented the themes and subthemes with illustrative quotes. The participants were coded using abbreviations and numbers to protect their identities. Hence, RA1 represents research assistant 1, P1 represents physician 1, PC represents the project coordinator (the project manager), SC1 represents state coordinator 1, and DQA represents the data quality analyst.
All participants agreed that the study objectives were achieved, and all the study participants had some positive experiences working in the study.
This study was multidisciplinary research in which participants from different fields, such as computer science and medicine, worked together. Most of the participants agreed that this multidisciplinary approach was beneficial to them, as it led to collaboration among team members.
Participants also agreed that the study helped to improve both their knowledge of the subject area and their research/data collection skills.
All the participants agreed that they worked as a team to achieve the overall goal of the project.
The word cloud in Figure 1 captures the summary of the major themes captured from the data coding and analysis.

Figure 1. Word cloud of the experiential themes.
There were several challenges encountered during the data collection exercise. Paper-based questionnaires carried a risk of being misplaced and required the additional work of re-entering the data into the Open Data Kit (ODK) and uploading it to the central server. Most of the participants were already familiar with ODK, and all participants received training on it; however, more than half (57%) of the physicians still preferred the paper version of the questionnaire because it was easier for them to tick boxes on paper than to navigate through ODK during a consultation with a patient.
Also, the unstable power supply to charge the battery of the tablet was another reason the paper-based form of the questionnaire was preferred.
Timely upload of data to the central server was also challenging due to poor network service and internet connectivity. Virtual meetings were also disrupted by poor internet connectivity.
All the participants welcomed periodic reminders, with the following preferences: phone calls (38%), messages through text or WhatsApp (25%), visitation (25%), and Google Calendar (12%).
Some of the participants, especially the physicians, described the exercise as hectic when combined with their daily tasks in the hospital. The RAs worked directly with the physicians, and their major challenge was the physicians’ busy schedules.
Other challenges encountered during this study included industrial action by doctors in tertiary health facilities, so deadlines could not be met and the data collection period was extended.
Political unrest in one of the four states used for this study, fuel scarcity, and bad roads were also challenges faced during the data collection phase.
The study took place during the rainy season in the study area and the rains hampered the process by interrupting planned data collection visits and the monitoring process.
The word cloud, in Figure 2 below, highlights the themes extracted from the responses based on the challenges faced by the participants during the data collection exercise.

Figure 2. Word cloud of themes based on challenges faced by participants.
Discussion
This study, which aimed to describe the qualitative experiences of data collection in a developing country during the COVID-19 pandemic, also provides novel insights into the contextual challenges in medical data collection in developing countries, with a case study of the Nigerian context.
There was a 92% response rate among the study participants, as 8% of the study participants did not respond to the questionnaire sent; this is similar to what is reported in other studies where electronic data collection was employed. This is comparable to a study by Bokonda et al. 4 in nine developing countries, which reported the successful use of electronic data collection in health research in about 73% of total usage. Premkumar et al. 13 reported a very successful use and implementation of digital data collection in household data collection with no major technical issues, while Maleghemi et al. 14 credit the increased surveillance of Acute Flaccid Paralysis (AFP) in South Sudan to the use of an electronic data collection method. Generally, when electronic data collection is coded properly, data quality tends to be high, especially in terms of data completeness. Each dataset submitted has the details needed for the various analytics. Also, inconsistencies in the data are greatly reduced, if not eliminated, due to the real-time monitoring and tracking of the collected data.
An important result from this study is the success of teamwork through multidisciplinary team participation, which brought about knowledge gain and collaboration. This was found to be very beneficial to the study participants, although this is in contrast to a study where the clinicians and researchers were reluctant to share data. 15
The data collection team identified several challenges, as depicted in Figure 2. During the study period, resident doctors in tertiary health facilities in Nigeria embarked on a 63-day strike due to nonpayment of entitlements. According to Adeloye et al., 16 the workforce crisis in Nigeria is mostly due to months of unpaid entitlements, poor welfare, lack of appropriate health facilities, and emerging factions among health workers. Although strikes by health workers are experienced globally, their consequences can be detrimental in regions with resource challenges. The strike contributed to extending the duration of the data collection exercise because most of the doctors who took part in the data collection were absent from work. Another challenge was unrest in one of the participating states where data were collected, which also impeded the process. Noonan 17 delineates the harmful effects of sociopolitical violence and unrest on the healthcare system. Therefore, sociopolitical violence and unrest are factors that should be considered when doing research in similar contexts.
Poor network connectivity and electricity supply were also seen as challenges during this study, as they impeded the prompt transmission of data to the server. This is comparable to several studies done in Nigeria, where poor network connection and electricity supply are seen as structural barriers to the use of digital technology in research. 18,19 Some of the study participants adopted the use of paper questionnaires due to the unstable power supply for charging the device and poor or unreliable internet, which affected the timely upload of data to the ODK server. Poor mobile network service and the short battery life of the mobile devices were some of the challenges reported. Medhanyie et al. 8 and Shaffer et al. 20 also used offline data entry due to poor internet bandwidth and unreliable internet when developing a data collection and management system in West Africa. Meanwhile, Shovlin et al. 21 also presented some technological challenges facing medical data digitization and infrastructure limitations in low-resource contexts.
Some participants further complained that the data collection exercise was quite taxing due to their tight shifts and long working hours. This could be due to the persistently low numbers and uneven distribution of health workers in the Nigerian health sector. According to Abubakar et al., 22 brain drain is a major challenge in Nigeria; therefore, identifying physicians with extensive clinical experience can be challenging in less developed settings. In addition to the aforementioned constraints, although Android tablets were given to the doctors who participated in the data collection exercise as compensation for their tasks, some doctors said they would have preferred monetary compensation instead. Poor remuneration of health workers in Nigeria has always been a serious challenge and one of the reasons for the constant migration of health workers to developed countries.
Recommendations
Based on the findings of this study, we propose the following recommendations for researchers operating in similar settings:
When conducting research involving senior doctors who are actively engaged in information retrieval, it is advisable to enlist additional assistance from younger doctors or other healthcare professionals. This can help alleviate the workload and accommodate the busy schedules of senior physicians.
For research conducted in remote areas, it is important to allocate sufficient transportation resources or provide multiple transport options. Additionally, scheduling meetings well in advance with reminders can help prevent missed opportunities to engage with respondents.
To address challenges related to electricity and internet connectivity in areas with limited access, the utilization of digital technologies, such as providing power banks to RAs and utilizing data collection tools like ODK, can be beneficial. ODK allows for locally storing the collected data until better network connectivity is available, facilitating later upload to the server.
It is recommended to have additional devices readily available as backups to replace any faulty equipment during data collection. This precautionary measure helps prevent delays and interruptions in the data collection process.
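The recommendation on storing collected data locally until connectivity improves describes a store-and-forward pattern. The sketch below illustrates that pattern only; it is not ODK's implementation or API. The `OfflineQueue` class, the `make_flaky_upload` stand-in for a server, and the sample form fields are all hypothetical names introduced for illustration.

```python
import json
import tempfile
from pathlib import Path


class OfflineQueue:
    """Hypothetical store-and-forward queue (illustrative; not part of ODK)."""

    def __init__(self, folder):
        self.folder = Path(folder)
        self.folder.mkdir(parents=True, exist_ok=True)

    def save(self, record_id, record):
        # Each completed form is written to its own file on the device,
        # so it survives a loss of power or network.
        path = self.folder / f"{record_id}.json"
        path.write_text(json.dumps(record), encoding="utf-8")
        return path

    def pending(self):
        # Forms that have been saved locally but not yet uploaded.
        return sorted(self.folder.glob("*.json"))

    def flush(self, upload):
        """Upload pending forms; stop at the first network failure."""
        sent = 0
        for path in self.pending():
            record = json.loads(path.read_text(encoding="utf-8"))
            try:
                upload(record)
            except ConnectionError:
                break  # still offline: keep remaining files for a later attempt
            path.unlink()  # delete the local copy only after a successful upload
            sent += 1
        return sent


def make_flaky_upload(fail_first):
    """Stand-in for a server: fails `fail_first` times, then accepts records."""
    state = {"calls": 0, "received": []}

    def upload(record):
        state["calls"] += 1
        if state["calls"] <= fail_first:
            raise ConnectionError("no network")
        state["received"].append(record)

    return upload, state


queue = OfflineQueue(tempfile.mkdtemp())
queue.save("form-001", {"symptom": "fever", "days": 3})   # made-up example data
queue.save("form-002", {"symptom": "cough", "days": 5})   # made-up example data

upload, server = make_flaky_upload(fail_first=1)
first_attempt = queue.flush(upload)    # network down: nothing is sent
second_attempt = queue.flush(upload)   # network restored: both forms are sent
print(first_attempt, second_attempt, len(queue.pending()))
```

The key design choice is that a local copy is removed only after the upload succeeds, so a failed attempt never loses data and can simply be repeated when the network returns.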
Study limitations
We acknowledge the limitations of our study and the potential impact on the generalizability of our findings. The purposive sampling method employed may not accurately represent the entire population, and the sample size may be insufficient for generalization. Furthermore, the specific locations and limited number of respondents may restrict our ability to fully capture the complexities of data collection challenges in similar settings. The COVID-19 pandemic’s impact, which imposed restrictions on mobility and physical contact, likely exacerbated these challenges.
In terms of methodology, the questionnaire used in this qualitative study was custom designed by the researchers and was not pilot tested before its administration. However, its content validity was confirmed through validation by senior colleagues not involved in the research.
It is important to note that our report primarily focuses on our data collection process and may not provide a comprehensive analysis of the febrile disease diagnosis phenomenon. As a result, we did not address issues related to data quality, processing outcomes, or potential biases that may affect recall and other metrics.
While we categorized our experiences into themes, we acknowledge the potential influence of our dual roles as both researchers and participants, which may have shaped our analysis based on our preconceptions.
Despite these limitations, we believe that our experiences can offer valuable insights and lessons for individuals facing challenges in medical data collection in Nigeria.
Conclusion
This article presents the authors’ experiences, challenges, and insights gained during a data collection process for developing a medical app in Nigeria, aiming to bridge a literature gap in research from developing countries. Qualitative data analysis was employed, with data organized into thematic areas. Positive experiences, such as collaboration and knowledge expansion, were highlighted by the researchers. Despite encountering challenges, the team’s cooperative nature facilitated the achievement of objectives.
In addressing core challenges, innovation and flexibility played a crucial role. These challenges included time constraints, healthcare worker strikes, civil unrest, poor network connectivity, and adverse weather conditions. The utilization of a blended data collection approach, incorporating tools like ODK, setting periodic reminders, and making repeated calls and visits, contributed to the overall success of the data collection process.
To enhance similar data processing efforts, the authors recommend paying attention to team composition, setting targets beyond the actual required sample size, and carefully considering incentive packages for participants. These factors are essential for ensuring a seamless data collection process and the attainment of goals. By sharing their experiences and recommendations, this article offers valuable insights for researchers and practitioners engaged in large-scale data collection endeavors, particularly in the context of developing countries like Nigeria.
Footnotes
Acknowledgements
Nil.
Author contributions
Edidiong Umoh—Introduction, methods, article collation and submission. Chimaobi Isiguzo—Results, discussion, and recommendations. Christie Akwaowo—Conceptualization, methods, conclusion, and limitations. Kingsley Attai—Results, discussion, and references. Nnette Ekpenyong—Abstract, results, discussion, and recommendations. Humphrey Sabi—Methods, literature synopsis, results, article review. Emem Dan—Abstract, introduction, and literature synopsis. Nwokoro Obinna—Introduction, literature synopsis, and results. Faith-Michael Uzoka—Conceptualization, introduction, conclusion, and limitations.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study is a subset of preliminary articles on a grant from the New Frontiers in Research Fund of Canada. New Frontiers in Research Fund Exploration (Toronto, Ontario, CA; grant number: #102079).
Ethics statement
The studies involving human participants were reviewed and approved by the Human Research Ethics Board (HREB) of Mount Royal University, Calgary, Alberta, Canada, on 16 June 2020 (application number: 102232), and by the University of Uyo Teaching Hospital, Uyo Institutional Health Research Ethical Committee (IHREC), with reference number: UUTH/AD/S/96/VOL.XXI/564. The participants also provided their written informed consent to participate in this study.
Informed consent
The patients/participants provided their written informed consent to participate in this study.
Trial registration
Not applicable.
