Abstract
Aim:
Archived military documents contain health information that can enrich the Norwegian Armed Forces Health Registry (NAFHR). However, uncertainty exists about the preservation of the documents for digital reproduction and the accuracy of clinical measurements for research purposes. This study aims to present and assess the quality of military health data extracted from the paper-based personnel files of Norwegian men born in 1950.
Methods:
We digitized the military health information of approximately 60% (n=17,324) of the Norwegian men who were born in 1950. Health records were manually transcribed, and some of the transcribed data were checked for errors by using similar registrations in the NAFHR. Clinical measures were compared with results from national health surveys. Variations between the conscription board health examinations and the examinations on the first day of service were explored. Transcribed cardiovascular disease (CVD) risk factor data were tested with logistic regression models to assess their predictive ability.
Results:
The transcribed data showed good compliance and readability, with overall accurate and valid clinical measurements. While minor variations existed between the data recorded on conscription board examinations and medical examinations on the first service day, the measurements generally aligned with the national health survey results. Several of the CVD risk factors showed the expected associations with CVD mortality.
Conclusions:
Keywords
Background
In Norway, conscription to military service is a guiding principle in the social contract between the country’s citizens and the state. Eligible men and women must attend a military health examination to determine their fitness for duty [1]. The conscription board health examination (CBHE) prior to service consists of clinical examinations and tests of physical and mental health, according to standard guidelines [2]. These military medical screening procedures were established in the 1950s and have not changed much over time [3].
The data collected are stored in the Norwegian Armed Forces Health Registry (NAFHR). This national health registry was established in 2005 and is used for monitoring the health status of military personnel throughout their service [4]. Data from the Armed Forces Personnel Database, the Military Electronic Medical Records System, and the Norwegian Cause of Death Registry are regularly entered into the NAFHR.
The data in the NAFHR span more than five decades with nearly complete national coverage. However, information on height, weight, and general cognitive ability was not registered in the NAFHR for most men born before 1950. In 1950, 32,203 boys were born in Norway [5]. Approximately 6–7% of the birth cohorts were exempted before the CBHE, mainly because of death, emigration, disabling diseases, or crime [6,7]. However, only 25,770 (80%) of the men born in 1950 are included in the NAFHR, and much of the clinical information is missing. The data loss can, at least partly, be explained by the Armed Forces’ archive procedures. A military electronic personnel database was adopted in 1992; before that time, military records were paper-based and kept within personnel files that were handed over to the National Archives of Norway when the person turned 70 years old.
A search at the National Archives showed that paper-based military archives contained health information for at least 50–60% of Norwegian men born between 1941 and 1957 and that military service cards were tagged with an 11-digit personal identification number unique to every Norwegian individual [8]. This discovery presented an opportunity to expand and update the electronically searchable information in the NAFHR by digitizing the military archives. However, the consistency and accuracy of the data were unknown.
Historical archives hold extensive information about social conditions and political initiatives aimed at improving welfare and health. Linking this information to health outcomes across generations offers valuable insights for public health and social epidemiology [9]. There is great interest in these data for public health studies, e.g., on mortality, cardiovascular diseases, and dementia. Several Nordic studies have already utilized conscription data to explore health outcomes in later life, emphasizing the role of early-life health conditions [10–12].
Therefore, this paper aims to describe methods for digitization of military paper-based documents and to present and evaluate the quality of the data.
Methods
The digitized 1950 birth cohort
The National Archives in Oslo held paper-based military health records for 50–60% of Norwegian men born between 1941 and 1957. From these cohorts, we selected the cohort born in 1950, as this is one of the older cohorts in the NAFHR. Consequently, this review included 17,324 Norwegian men born in 1950 (Supplemental Figure 1) found in the National Archives in Oslo, hereafter referred to as “the digitized 1950 cohort.”
Health examinations conducted by the conscription board and on the first day of service
Since 2010, CBHEs have been conducted at 10 regional military facilities that are specially equipped for military medical examinations. However, in 1968–1969, CBHEs were conducted locally in municipalities across the country (Supplemental Figure 2). The local police districts or assembly halls were venues for conscription summoning, while military doctor teams traveled around the country with medical equipment for measuring. The CBHEs and health examinations conducted on the first day the conscripts met for military service (hereafter, the first service day) were performed according to instructions provided by the Armed Forces Joint Medical Services [2].
Height was measured to the nearest centimeter (cm) and weight to the nearest kilogram (kg). Systolic and diastolic blood pressure (SBP and DBP) were measured in mm Hg on the right upper arm after five minutes of seated rest using an appropriately sized cuff and manual sphygmomanometer. The resting heart rate was measured by pulse palpation as beats per minute. Furthermore, the physical health assessment consisted of a clinical interview, a review of the candidate’s personal statements of health, and any medical documentation of disease. A test for cognitive ability, introduced in 1954 and referred to as “general ability” (GA), consisted of three subtests: arithmetic, figure, and word similarity. The combined result of the three subtests is converted to a stanine score (short for “Standard Nine”), which ranges from 1 (worst) to 9 (best). A more detailed description of the tests and scores can be found elsewhere [3,13].
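The stanine scale mentioned above can be illustrated with a minimal sketch. It uses the conventional stanine definition, in which each band is half a standard deviation wide on the z-score scale. The norming actually applied to the GA test is not described in this text, so `norm_mean` and `norm_sd` are hypothetical parameters introduced only for illustration.

```python
from bisect import bisect_right

# Conventional stanine cut points on the z-score scale
# (each band is 0.5 SD wide, centered on the mean).
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]


def z_to_stanine(z: float) -> int:
    """Map a standardized score to a stanine from 1 (lowest) to 9 (highest)."""
    # The number of cut points at or below z, plus one, gives the band.
    return bisect_right(STANINE_CUTS, z) + 1


def raw_to_stanine(raw_sum: float, norm_mean: float, norm_sd: float) -> int:
    """Standardize a combined subtest score against assumed norms.

    norm_mean and norm_sd are hypothetical; the real norms are not given here.
    """
    z = (raw_sum - norm_mean) / norm_sd
    return z_to_stanine(z)
```

Under this definition, a score exactly at the norm mean falls in stanine 5, and scores more than 1.75 SD above or below the mean fall in stanines 9 and 1.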
Digitizing information
The personnel folders contained several military medical forms with a great variety of information. We reviewed and transcribed some selected information from the conscripts’ structured medical journal, which included the place and date of the CBHE military medical examination, height, weight, SBP and DBP, heart rate per minute, and GA test results (stanine score and the scores from each of the three subtests). For our review, we categorized the military conscription sites into 19 groups that corresponded with the former Norwegian county classification (1946–2018) [14] (Supplemental Table 1).
Transcription and validation process
The data were manually transcribed into an Access database according to a codebook developed in a pilot phase of the project. Seven employees participated in the transcription, and the employees were all given training by the same personnel and used the same codebook and database. In the quality control of the transcription process, the 11-digit personal identification number was linked to the Norwegian Central Population Register to identify and correct erroneous registrations. The transcribing staff regularly reviewed the transcribed data for any outlier values and corrected them.
Once the transcribing task was over, members of the project group who had not been involved in transcribing reviewed the data for unrealistic values/records and verified the accuracy of these values by comparing them with the original documents. Numerical values were checked for any extreme values, while non-numerical values, such as dates, were checked for errors. A total of 103 transcribed cases with outlying numbers and unrealistic information were identified. In 51 cases, the information from the personnel files was corrected. The remaining 52 cases were found to have been transcribed correctly, although those numbers were unrealistic. These were attributed to errors made by the doctors and illegible handwriting.
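A screening step of this kind could be sketched as a simple range check that flags values for manual comparison with the original documents. The field names and plausibility limits below are hypothetical and chosen only for illustration; the limits used in the project are not stated in this text.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical plausibility limits for illustration only.
PLAUSIBLE_RANGES: Dict[str, Tuple[float, float]] = {
    "height_cm": (140, 215),
    "weight_kg": (40, 150),
    "sbp_mmhg": (70, 250),
    "dbp_mmhg": (30, 150),
    "heart_rate_bpm": (30, 200),
    "ga_stanine": (1, 9),
}


def flag_unrealistic(record: Dict[str, Optional[float]]) -> List[str]:
    """Return the names of fields whose values fall outside the assumed range."""
    flagged = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is None:
            continue  # missing values are not flagged by this check
        if not low <= value <= high:
            flagged.append(field)
    return flagged
```

A flagged value is only a candidate for review: as the text notes, about half of the flagged cases turned out to be correct transcriptions of unrealistic source entries.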
To assess the reliability of the transcription process, we conducted an inter-investigator reproducibility test by randomly selecting 100 men from the cohort and comparing the transcriptions of two different staff members. Among the 1800 total transcribed values, only 38 (2.1%) deviated between the two staff members; among these, 22 deviating transcriptions were due to mistakes and three were due to different individual interpretations.
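The deviation rate reported above is a simple proportion of mismatching values. The sketch below shows one way to compute it when the same fields have been transcribed twice; the function name and the flat list layout are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def compare_transcriptions(first: Sequence, second: Sequence) -> Tuple[int, float]:
    """Count deviating values between two transcriptions of the same fields."""
    if len(first) != len(second):
        raise ValueError("Both transcriptions must cover the same fields")
    if not first:
        raise ValueError("Nothing to compare")
    n_deviating = sum(1 for a, b in zip(first, second) if a != b)
    return n_deviating, n_deviating / len(first)


# Figures reported in the text: 38 deviating values out of 1800.
reported_rate = 38 / 1800
print(f"{reported_rate:.1%}")  # 2.1%
```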
Last, we performed a series of statistical tests to investigate features of the digitized 1950 cohort related to cardiovascular disease (CVD) mortality. By linking data on the underlying cause of death registered in the Norwegian Cause of Death Registry, we included deaths that occurred from the year of CBHE to the end of 2021. The EU shortlist for Causes of Death, which ensures comparability over time, was used to identify the underlying cause of death. CVD deaths were defined using the International Classification of Diseases (ICD) codes I00–I99 in ICD-10 and 390–459 in ICD-9 [15,16].
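The CVD definition above can be expressed as a small classification function. This is a minimal sketch: the code ranges come from the text, whereas the string formats accepted (with or without a decimal point) are assumptions about how registry codes might be stored.

```python
import re


def is_cvd_death(code: str, icd_version: int) -> bool:
    """Return True if an underlying cause of death code counts as CVD.

    Uses the ranges stated in the text: I00-I99 (ICD-10) and 390-459 (ICD-9).
    """
    code = code.strip().upper()
    if icd_version == 10:
        # Letter I followed by two digits, optionally with further digits.
        return re.fullmatch(r"I\d{2}(\.?\d+)?", code) is not None
    if icd_version == 9:
        # Use the three-digit category; E and V codes do not start with a digit.
        match = re.match(r"\d{3}", code)
        return match is not None and 390 <= int(match.group(0)) <= 459
    raise ValueError("icd_version must be 9 or 10")
```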
Results
Quality control of the transcribed data
Of the 17,324 men who had medical documents from conscription, 85% had weight records, and 75% had records of heart rate (Table I). Most of the medical documents that lacked weight records came from some of the largest conscription sites in western Norway. In total, 15,790 men (91%) also had additional health information from the first service day (Table I), but heart rate was rarely recorded (missing n=12,076), as the doctors often left this section empty or wrote “RM” (regular).
Table I. Descriptive statistics on Norwegian men born in 1950 based on information transcribed from 17,324 military health records at the National Archives of Norway.
Indicates a p-value <.05 in the skewness and kurtosis test for normality.
When reviewing the conscript records for significant differences between the CBHE day and the first day of service, 312 discrepant cases were found, of which 123 transcribed registrations were corrected. Minor changes in height and weight were found between these two time points, which seems logical as the first day of military service usually comes one year after conscription (Supplemental Table 2). Because there was an overlap between the digitized 1950 cohort and registrations in the NAFHR, we could compare some of the transcribed information against registrations that were electronically available and quality controlled. We found that 16,551 of the height and 705 of the weight measurements from conscription that belonged to the 17,324 men in the digitized 1950 cohort were already registered in the NAFHR (Supplemental Table 3). Overall, discrepancies between the two data sources were low.
Descriptive statistics
To explore the variance of the transcribed data and assess whether the records were accurate enough for research purposes, we produced descriptive statistics on height, weight, SBP, DBP, heart rate, and GA-stanine score (Table I). The mean GA-stanine score for the digitized 1950 cohort was 5.9 points with an SD of 1.7. Overall, the mean blood pressure and heart rate were higher when measured on the first service day compared to the measurements from the CBHE. The heart rate, SBP, and DBP at the CBHE clustered around some specific values. For example, 43% of the CBHE-DBPs had a value of 80 mm Hg, and 32% of the CBHE-SBPs were 120 mm Hg (Figure 1). Similarly, 23% of the conscripts had a recorded heart rate of 72 bpm at the CBHE (Figure 2).

Figure 1. Systolic (SBP) and diastolic (DBP) blood pressure (mm Hg) at two time points: the conscription board health examination (CBHE) and the first service day. Histograms and box plots of frequencies, including normal curves of the distributions.

Figure 2. Heart rate (beats/min) at two time points: the conscription board health examination (CBHE) and the first service day. Histograms and box plots of frequencies, including normal curves of the distributions.
The means, SDs, and medians of the transcribed height, weight, and blood pressure data were comparable to similar survey data among young men in Norway [17–19] (Table II).
Table II. Overview of information transcribed from military health records at the National Archives of Norway and registrations in the Norwegian Armed Forces Health Registry (NAFHR) and population health surveys on young men in Norway.
CBHE: Conscription board health examination.
Mean represents the mean (standard deviation), and median corresponds to the median [10th–90th percentile].
Assessing transcribed data in statistical models for cardiovascular disease mortality
We assessed whether the proportion of deaths from CVD among all deaths differed between the digitized 1950 cohort (n=17,324) and the men born in 1950 who were registered in the NAFHR (n=25,770). Among deaths in the digitized 1950 cohort, 23.6% were from CVD, whereas the corresponding proportion among men born in the same year and registered in the NAFHR was 21.6%, a difference of 2 percentage points (p=0.03) (Supplemental Table 4).
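A comparison of two proportions like the one above could be carried out with a pooled two-proportion z-test. The text does not state which test was used, and the numbers of deaths are not given here, so the sketch below is only an illustration under the assumption of two independent groups; the counts in the usage lines are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float, float]:
    """Pooled two-proportion z-test.

    x1, x2 are event counts (e.g. CVD deaths) and n1, n2 are group totals
    (e.g. all deaths). Returns (difference, z statistic, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 - p2, z, p_value


# Hypothetical counts, for illustration only (not taken from the study).
diff, z, p = two_proportion_z_test(236, 1000, 216, 1000)
```

With the same proportions, the p-value depends strongly on the number of deaths in each group, which is why the hypothetical counts above should not be expected to reproduce the reported p=0.03.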
We used two multivariable logistic regression models to examine whether some of the transcribed information from CBHE (Model 1) or the first service day (Model 2) that could be considered CVD risk factors would demonstrate expected predictive relationships with CVD mortality. Based on known associations with CVD demonstrated among Swedish male conscripts born from 1949 to 1951 [10,11], we expected to find higher risk of CVD mortality with lower GA-stanine scores (continuous), higher body mass index (BMI) (continuous), and elevated SBP and DBP levels compared to the normal range (SBP and DBP level categories suggested in Supplemental Table 5 [10]). Model 1 involved 11,581 conscripts with complete information on the four selected variables. Model 2 included 7367 conscripts who were fit for military service and had the aforementioned information recorded on their first service day (GA-stanine score was taken from the CBHE).
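Preparing the analysis sample described above involves deriving BMI from height and weight and keeping only conscripts with complete information. The sketch below shows that step under stated assumptions: the field names are hypothetical, and the blood pressure categories from Supplemental Table 5 are not reproduced because they are not given in this text.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical field names; BMI needs both height and weight.
REQUIRED_FIELDS = ("ga_stanine", "height_cm", "weight_kg", "sbp_mmhg", "dbp_mmhg")


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def complete_cases(records: Iterable[Dict[str, Optional[float]]]) -> List[Dict[str, float]]:
    """Keep records with all required fields present and add a 'bmi' field."""
    complete = []
    for record in records:
        if all(record.get(field) is not None for field in REQUIRED_FIELDS):
            enriched = dict(record)  # copy so the input is left unchanged
            enriched["bmi"] = bmi(record["weight_kg"], record["height_cm"])
            complete.append(enriched)
    return complete
```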
The associations between the GA-stanine score and CVD mortality, as well as between BMI and CVD mortality, remained consistent between the two models (Table III). This indicated that those with a higher GA-stanine score or lower BMI were at lower risk of CVD mortality. Regarding the association between blood pressure and CVD mortality, only the group who had the lowest DBP on the first service day had a statistically significantly lower risk of CVD mortality (OR=0.28, 95% CI: 0.09–0.90). Most of the numeric effect estimates of the higher SBP groups in the model using CBHE data implied a lower risk of CVD mortality compared to the reference, but higher SBP measured at the first service day implied higher CVD mortality. For DBP, the numeric OR effect estimates had a similar direction in the two different models with measurements from CBHE and the first service day respectively.
Table III. Odds ratio (95% confidence intervals) for CVD mortality (from the year of CBHE until 2021) based on information transcribed from military health records stored at the National Archives of Norway.
p-value < 0.05.
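To make the odds ratios and confidence intervals in Table III easier to interpret, the sketch below computes an unadjusted odds ratio with a Wald (Woolf) 95% confidence interval from a 2x2 table. This is a simplification: the study used multivariable logistic regression, so adjusted estimates will generally differ, and the counts in the usage line are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with event, b: exposed without event,
    c: unexposed with event, d: unexposed without event.
    All four cells must be greater than zero.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("All cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's formula).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not taken from the study).
or_est, ci_low, ci_high = odds_ratio_wald_ci(10, 90, 20, 80)
```

An interval that excludes 1, such as the reported 0.09–0.90 for the lowest DBP group on the first service day, corresponds to statistical significance at the 5% level.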
Discussion
This paper describes methods for digitizing military paper-based documents. The cohort of men born in 1950 was selected for this review, covering approximately 60% of all men born in Norway that year. Because of the military selection, we assumed that the digitized cohort represented the healthy and normally skilled population. This assumption was strengthened by the fact that few of the men in the digitized cohort died before 2021, and by the finding of normally distributed health data that were comparable to data found in population health surveys in the same age group.
The files belonging to those who died, were deemed unfit for military service, became officers or military police, or refused military service for ethical reasons were stored in archives that were not included in our review. An extended review of more archives could further increase the generalizability of the digitized cohort. Unfortunately, as there has been no absolute safeguarding of the military archives, we must consider that some of the personnel files and some of the documents stored within the files may have been lost.
Overall, we found that the information we selected for digitization was consistently recorded in the military health documents and was accurately transcribed. This indicates that these documents are well-preserved and easily readable. Yet, missing recordings from the health examinations conducted on the first service day are a limitation, as measurements from two separate times would have provided an opportunity for quality assessments and for determining true changes in health between two time points. We speculate that the doctors who examined the conscripts on the first service day recorded the measurements only if they differed from those taken during the CBHE.
In 1968–1969, when the men in the digitized cohort were summoned to CBHE, the medical examination procedures were much the same as they are today, but the setting and locations differed. Currently, there are 10 stationary locations for conscript examinations in Norway; however, in the archived material, we registered 141 different locations across the country. While some of the locations were stationary and located in military facilities, the majority operated on a temporary basis. This arrangement may have introduced some variability since military doctors had to travel with measurement equipment to different locations, and a larger number of medical staff conducted the health examinations and recorded the results. Nevertheless, the regional differences in the clinical measurements (not presented) were consistent with known differences in height and weight observed in other surveys [6,20,21].
To determine whether the data were accurate enough for research purposes, we used a range of methods, including examining inconsistencies within the transcribed and already registered data, comparing data with results in similar populations, and conducting statistical analyses. The data on the digitized 1950 cohort demonstrated high reliability and validity. This was supported by the minimal discrepancies between measures recorded at two different time points in military health examinations for the same individuals, alignment with existing records in the NAFHR and other health surveys conducted for similar male birth cohorts in Norway [17–19], and demonstrated predictive accuracy of CVD mortality.
Several studies have been carried out to investigate Nordic conscription databases and utilize conscription records for epidemiological studies [10,11,22–26]. Despite the differences in the aims of the studies and the inclusion criteria, it is worth noting that the findings from our statistical analysis aligned with the outcomes of these prior studies [11,25]. However, our findings did not show the expected consistent association between higher SBP and DBP levels and CVD mortality, and differed from previous studies that revealed the influence of both SBP and DBP on CVD mortality [10,26].
We noticed that the process of measuring and recording blood pressure in military examination settings might have been imprecise. Clinicians might have been instructed to round their records or might have had preferences for specific numbers. Notably, in measurements taken at CBHE, as depicted in Figure 1, the clustering of SBP and DBP values around certain numbers suggests rounding to the nearest 5 or 10 mm Hg, potentially leading to an underestimation of associations [10]. There might be differences in the competence requirements and procedures between the military health personnel at CBHE and those who examined conscripts on the first service day. It might be that the measurements of SBP and heart rate at the CBHE are of somewhat lower quality than the measurements collected on the first service day or in a health survey.
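Clustering of this kind is often quantified as terminal digit preference. The sketch below computes the share of readings ending in each final digit and the combined share ending in 0 or 5. It is an illustration rather than the method used in this study, and the example readings are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def terminal_digit_shares(values: Iterable[float]) -> Dict[int, float]:
    """Share of readings ending in each final digit (0-9)."""
    counts = Counter(int(value) % 10 for value in values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("No readings supplied")
    return {digit: counts.get(digit, 0) / total for digit in range(10)}


def heaping_share(values: Iterable[float], digits: Sequence[int] = (0, 5)) -> float:
    """Combined share of readings whose final digit is in `digits`."""
    shares = terminal_digit_shares(values)
    return sum(shares[digit] for digit in digits)


# Hypothetical readings, for illustration only.
example_sbp = [120, 120, 125, 118, 122, 130, 80, 80, 85, 76]
share_0_or_5 = heaping_share(example_sbp)
```

If final digits were evenly distributed, about 20% of readings would end in 0 or 5, so a markedly higher share points to rounding or number preference.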
Conclusion
This review included health records stored at the National Archives of Norway from the 1950 male birth cohort of Norway. The assessment of the transcribed material showed conformity with the source documents. The results support that the digitization of military personnel files can enhance a national health registry by the inclusion of information that has been lost or was not included in the registry.
The military personnel files and documents that have been handed over to the National Archives must be considered unique data sources, both in the Norwegian context and the international context. This study aimed to demonstrate not only how archived records could be transformed into data for health research but also how their validity and reliability could be tested for use as health data. In doing so, this study can serve as an example for future research or tasks that plan to update existing registries or build new ones.
Supplemental Material
Supplemental material, sj-docx-1-sjp-10.1177_14034948251350504 for Digitizing paper-based military health records from Norwegian males born in 1950 – Assessments of data quality and applicability in research by Kristine Vejrup, Hye Jung Choi, Leif Å. Strand, Inger Ariansen and Elin A. Fadum in Scandinavian Journal of Public Health
Footnotes
Acknowledgements
We extend our sincere gratitude to the National Archives of Norway and our colleagues for their invaluable efforts, cooperation, and contribution in this project.
List of abbreviations
BMI: Body mass index
CBHE: Conscription board health examination
CI: Confidence interval
cm: Centimeter
CVD: Cardiovascular disease
DBP: Diastolic blood pressure
GA: General ability
ICD: International Classification of Diseases
kg: Kilogram
mm Hg: Millimeter of mercury
NAFHR: Norwegian Armed Forces Health Registry
SBP: Systolic blood pressure
Availability of data and materials
The datasets generated and/or analyzed are not publicly available but are available from the NAFHR with necessary approval.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was funded by the Norwegian Ministry of Defense. The funder had no role in the data collection, analysis, or interpretation of the data or writing of the manuscript.
Ethical approval
This investigation, as a quality control project, was not required to obtain approval from the Regional Committees for Medical and Health Research Ethics. The work was authorized by the Regulations for the NAFHR and was carried out by employees of the NAFHR who are authorized to process the data without the data subjects’ consent according to the Health Register Act, regulations, and security manual for the NAFHR.
Consent for publication
Not applicable.
Supplemental material
Supplemental material for this article is available online.
References
