Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has a high transmissibility profile that favors the accumulation of mutations along its genome and, in turn, the emergence of new variants. In this context, haplotype studies have made it possible to map specific genomic regions and, combined with other approaches, to track phylogenetic changes. During the COVID-19 pandemic, it became evident that home environments favored the circulation of SARS-CoV-2. In this study, we evaluated 1,407 individuals positive for SARS-CoV-2, among whom we identified 53 families, in the period from June 2021 to February 2023. Epidemiological data were collected from E-SUS Notifica and SIVEP-Gripe. Genetic material was extracted using a commercial kit, the viral load was evaluated, and the viral genomes were sequenced using the Illumina MiSeq methodology. The circulation of 3 variants and their respective subvariants was detected. The Delta variant accounted for the highest proportion of cases (45%), followed by Omicron (43%) and Gamma (11%). Some families were infected by different subvariants, indicating different sources of infection. The haplotype network showed a distribution divided into 6 large clusters that were established according to the genetic characteristics observed by the algorithm, and 224 parsimony-informative sites were found. In addition, 92% of subjects were symptomatic and 8% asymptomatic. The secondary attack rate of this study was 8.32%. Therefore, we can infer that the home environment favors the spread of SARS-CoV-2, so it is of paramount importance to carry out genomic surveillance in specific groups such as households.
Introduction
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has become a public health issue, with more than 767 million cases worldwide. 1 In evolutionary terms, SARS-CoV-2 presents a high transmissibility rate that favors the accumulation of mutations that impact the number of variations at specific sites of the genome.2,3
In addition, the spread of SARS-CoV-2 has shown that most cases of infection with this virus occur in the home environment, characterized as a place that allows intense and prolonged contact between residents.4,5
In this context, the study of haplotypes has allowed the mapping of specific regions of the genome. 6 Haplotype determination, coupled with the tracking of phylogenetic changes, has become increasingly relevant for classifying new lineages and understanding the dynamics of infections among close populations, such as those in the household environment. The technique favors the visualization of the behavior of certain groups of organisms because it uses metrics different from those of traditional phylogenies to establish distances.5,7
Studies related to intrahousehold transmission have shown the emergence of new haplotypes in the household environment even with very close infections.8,9 In this study, we relate the main haplotypic markers and dynamics of infections involving the Omicron, Delta, and Gamma variants in individuals who share the same family environment, through the method of determining haplotypes associated with phylogeny, using data from complete viral sequences from the far west of the Amazon.
Methods
Place of study
This study was carried out at Laboratório de Virologia Molecular da Fundação Oswaldo Cruz de Rondônia-FIOCRUZ/RO in partnership with Rede Genômica da FIOCRUZ.
Ethical aspects
The study was approved by the Research Ethics Committee of the Centro de Pesquisa em Medicina Tropical de Rondônia-CEP/CEPEM with opinion No.4.637.465.
Study population and epidemiological data
A cohort of 1405 individuals positive for SARS-CoV-2 was selected from June 2021 to February 2023. Biological samples were collected at reference centers in different municipalities in the state of Rondônia. Epidemiological data were collected from SIVEP-Gripe and E-SUS.
Nucleic acid extraction and RT-qPCR
Nasopharyngeal swab samples were forwarded to the Laboratório de Virologia Molecular (FIOCRUZ/RO), where viral RNA was extracted using the QIAamp® Viral RNA Mini Kit (QIAGEN, Germany), according to the manufacturer’s instructions. Viral RNA was subjected to the quantitative RT-qPCR assay developed by Queiroz et al. 10 to determine the viral load of SARS-CoV-2. Biological samples with Ct values <25 for the target were selected for genomic sequencing.
Complete genome sequencing of SARS-CoV-2
Genomic sequencing was performed using Illumina MiSeq and NextSeq platforms (Illumina, San Diego, CA, USA) and COVIDSEQ kit as previously reported. 11
Mutation analysis and lineage determination
Severe acute respiratory syndrome coronavirus 2 genomes were classified into lineages using the available software Pangolin COVID-19 Lineage Assigner v2.1.7 12 and mutations were analyzed with Nextclade v.1.5.4. 13 Nextclade implements several quality controls (QCs) to flag potentially problematic sequences due to errors during sequencing or assembly. Sequences that produced mediocre and poor metrics were excluded from mutation analyses.
Phylogenetic and statistical analysis
High-quality whole reference genomes (>29 kb, <1% N) of the variants of concern (VOCs) Gamma, Delta, and Omicron sampled in Brazil (n = 138) were downloaded from the GISAID EpiCoV database on September 11, 2022. The sequences generated in this study (n = 95) and the retrieved sequences were aligned using MAFFT v.7.487. 14 The best-fitting model of nucleotide substitution (GTR + G + I) was selected using ModelFinder 15 and the phylogenetic tree was reconstructed using the maximum likelihood method in the program IQ-TREE v.2.1.3. 16 Branch support values were obtained using Ultrafast Bootstrap with 1000 replicates. The tree was visualized and edited with FigTree v.1.4.4. 17 Statistical analysis and graphics were performed using R v4.0.3 software.
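The genome quality criteria stated above (length >29 kb and <1% N) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the function names and default thresholds are hypothetical and only mirror the two criteria given in the text.

```python
def n_fraction(seq: str) -> float:
    """Return the fraction of ambiguous 'N' bases in a sequence."""
    if not seq:
        return 1.0  # treat an empty sequence as entirely uninformative
    return seq.upper().count("N") / len(seq)


def passes_quality(seq: str,
                   min_length: int = 29_000,
                   max_n_fraction: float = 0.01) -> bool:
    """Apply the two criteria from the text: >29 kb and <1% N.

    The parameter names and defaults are assumptions for illustration.
    """
    return len(seq) > min_length and n_fraction(seq) < max_n_fraction


# Toy sequences (synthetic, not real genomes)
good = "ACGT" * 7500                   # 30,000 bases, no N
too_many_n = "ACGT" * 7000 + "N" * 2000  # 30,000 bases, about 6.7% N
too_short = "ACGT" * 100               # 400 bases

results = {
    "good": passes_quality(good),
    "too_many_n": passes_quality(too_many_n),
    "too_short": passes_quality(too_short),
}
```

Only the first toy sequence satisfies both criteria; the other two fail on N content and on length, respectively.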
Analysis of the secondary attack rate among households
The secondary attack rate was estimated from the number of infected household contacts with the overall value of cases per household. Age group, vaccination, and other criteria were not considered. 18
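The ratio reported in this study can be reproduced with a short calculation. The sketch below is illustrative only: it assumes the rate is the number of household-linked participants (117) divided by all SARS-CoV-2-positive individuals in the cohort (1405), as written in the Results, and the function name is hypothetical. A conventional secondary attack rate would instead divide secondary cases by exposed household contacts, which would require contact counts.

```python
def attack_rate_percent(household_cases: int, total_positives: int) -> float:
    """Return household-linked cases as a percentage of all positives.

    Assumption: this mirrors the ratio given in the Results (117/1405),
    not the conventional contact-based secondary attack rate.
    """
    if total_positives <= 0:
        raise ValueError("total_positives must be positive")
    return 100.0 * household_cases / total_positives


# Counts taken from the study: 117 participants in 53 families, 1405 positives
rate = attack_rate_percent(117, 1405)
print(f"{rate:.1f}%")  # about 8.3%
```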
Haplotypes analysis
For this study, the TCS method, named after its creators Templeton, Crandall, and Sing, was used. 19 By default, it builds a distance matrix between individuals (identifying haplotypes), then identifies the shortest paths between groups, and thus establishes network relationships. Because of the particularities of the statistical model used, this methodology can infer microevolutionary events. 20
The application of the TCS method for the inference of the haplotype network was conducted using the software PopART, 21 a tool that allows the construction, visualization, and editing of haplotype networks through statistical inference models, enabling the comparison of different network conformations from the same dataset.
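The first two steps described above, identifying haplotypes and building a distance matrix between them, can be sketched in Python as follows. This is not the TCS algorithm itself and not PopART's implementation: it only collapses identical aligned sequences into haplotypes and counts pairwise differences. The function names and the toy alignment are hypothetical, and gaps and ambiguous bases are not given any special treatment.

```python
from itertools import combinations


def collapse_haplotypes(aligned: dict[str, str]) -> dict[str, list[str]]:
    """Group sample names by identical aligned sequence (one group = one haplotype)."""
    haplotypes: dict[str, list[str]] = {}
    for sample, seq in aligned.items():
        haplotypes.setdefault(seq.upper(), []).append(sample)
    return haplotypes


def hamming(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def distance_matrix(haps: list[str]) -> dict[tuple[int, int], int]:
    """Return pairwise differences between haplotypes, keyed by index pair."""
    return {(i, j): hamming(haps[i], haps[j])
            for i, j in combinations(range(len(haps)), 2)}


# Toy alignment (synthetic sequences, hypothetical sample names)
toy = {
    "fam01_a": "ACGTAC",
    "fam01_b": "ACGTAC",  # identical to fam01_a, so same haplotype
    "fam02_a": "ACGTTC",
    "fam03_a": "TCGTTC",
}
groups = collapse_haplotypes(toy)
haps = list(groups)
dists = distance_matrix(haps)
```

In the toy data, the two members of the hypothetical family fam01 share a haplotype, while the remaining samples sit one or two differences away. A network method then connects haplotypes along the shortest such paths.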
Results
Epidemiological data and secondary attack rate
A total of 1405 SARS-CoV-2-positive individuals were analyzed between June 24, 2021 and February 17, 2023, and 53 families were identified, each comprising between 2 and 4 individuals, for a total of 117 study participants (117/1405). The individuals were aged between 1 and 82 years, with a mean of 38 years; 55% (64/117) were male and 45% (53/117) female.
The cases of intradomiciliary infection were distributed across 20 municipalities of the state, covering 38% (20/52) of the entire territory. The capital, Porto Velho, accounted for the largest number of households, with 32% (17/53), followed by the municipalities of Ariquemes, Alvorada do Oeste, Chupinguaia, Teixeirópolis, Jaru, and Ji-Paraná, each with 6% (3/53) of the families. The municipalities of Cacoal, Cerejeiras, Machadinho do Oeste, Seringueiras, and Urupá each had 4% (2/53) of the families, while Castanheiras, Guajará-Mirim, Itapuã do Oeste, Presidente Médici, São Miguel do Guaporé, Vilhena, Candeias do Jamari, and Novo Horizonte do Oeste each corresponded to 2% (1/53) of the families. The secondary attack rate of the study population was 8.32% (117/1405).
The most prevalent symptoms reported by individuals were cough 63% (74/117), headache 56% (66/117), fever 55% (64/117), runny nose 44% (51/117), and sore throat 32% (37/117), followed by the less prevalent olfactory disorders 9% (11/117), taste disorders 9% (11/117), and dyspnea 4% (5/117). In addition, 8% (9/117) of cases were asymptomatic.
A total of 12% (14/117) of the individuals had comorbidities. Reported diseases included long-term respiratory diseases, heart disease, advanced-stage chronic kidney disease, obesity, diabetes, and hypertension. Regarding the vaccination profile of the study population, 3% had completed the vaccination schedule with 4 doses, 12% (14/117) had received 3 doses, 35% (41/117) had received 2 doses, 27% (32/117) were classified as partially vaccinated, having received only one dose, and 23% (27/117) were not vaccinated. Vaccination data of the study population compared with identified subvariants are shown in Figure 1.

Figure 1. Bar plot showing lineages and vaccination data. Family 42 was removed for not containing vaccination data.
Phylogenetic analysis
Three variants of SARS-CoV-2 were traced among the families. Through genomic sequencing analyses, the variants Gamma, Delta, and Omicron were confirmed, being the only variants characterized in this study. For the construction of the phylogenetic tree and haplotype network, 95 high-quality sequences were included.
The Delta variant accounted for the largest number of families, 45% (24/53), with 3 subvariants: AY.43 in 50% (12/24) of the families, followed by AY.99.2 in 46% (11/24) and AY.122 in 4% (1/24). The Omicron variant was found in 43% (23/53) of the families and was the variant with the highest number of subvariants in intradomiciliary infections, namely BA.1 in 26% (6/23), BA.1.1 in 13% (3/23), BA.1.1.1 in 9% (2/23), and BA.1.14.1 in 22% (5/23) of the families.
The Gamma variant was responsible for the lowest number of cases, 11% (6/53) of the intradomiciliary cases, with the following subvariants: P.1 in 50% (3/6), P.1.14 in 17% (1/6), and P.1.4 in 33% (2/6) of the families (Figure 2).

Figure 2. Maximum likelihood (ML) phylogenetic tree including isolated representatives of the Gamma, Delta, and Omicron variants. Maximum likelihood phylogenetic tree showing 95 sequences with “good” quality in Nextclade referring to 44 families obtained in this study and 138 reference genomes retrieved from GISAID. The tree was rooted with the most ancestral sequence (EPI_ISL_402123). Gamma, Omicron, and Delta are shaded in red, blue, and green, respectively.
The subvariants BA.2.36, BA.5.2.1, and BA.5.1 were each identified in only one family member; overall, 11.32% (6/53) of the families showed infection with different subvariants.
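Families whose members carry different subvariants can be flagged by grouping lineage assignments per family. The following Python sketch is only an illustration of that bookkeeping: the function name, the family labels, and the toy records are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict


def families_with_mixed_lineages(records: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Return families in which more than one distinct lineage was assigned.

    `records` holds (family_id, lineage) pairs, one per sequenced individual.
    """
    lineages_by_family: dict[str, set[str]] = defaultdict(set)
    for family, lineage in records:
        lineages_by_family[family].add(lineage)
    return {family: sorted(lineages)
            for family, lineages in lineages_by_family.items()
            if len(lineages) > 1}


# Toy records (hypothetical families, not the study's assignments)
toy_records = [
    ("famA", "AY.43"), ("famA", "AY.43"),
    ("famB", "BA.1"), ("famB", "BA.5.1"),
    ("famC", "P.1"),
]
mixed = families_with_mixed_lineages(toy_records)
n_families = len({family for family, _ in toy_records})
share_mixed = 100.0 * len(mixed) / n_families
```

In the toy records, only the hypothetical family famB is flagged, which corresponds to one of three families.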
Haplotype network
The analysis of SARS-CoV-2 sequences from individuals of the same family through a haplotype network was carried out to examine the possible different sources of infection and the evolution of the virus, considering the geographical context and the dates of detection of the virus in individuals. The haplotype network showed a distribution divided into 6 major clusters that were established according to the genetic characteristics observed by the algorithm, and which are also in agreement with the classification inferred by Nextclade. As expected, individuals from the same family who were infected with the same SARS-CoV-2 strain were grouped in a single cluster; however, some individuals from the same family were grouped in different haplogroups and others were not grouped in any cluster (Figure 3).

Figure 3. Haplotype network containing the Gamma, Delta, and Omicron variant clades and their respective subvariants. Haplotype network containing the strains of the Gamma, Delta, and Omicron variants that circulated in the State of Rondônia in the period corresponding to the study, each represented by different colors.
A parsimony-informative site contains at least 2 types of nucleotides, at least 2 of which occur at that site at least twice. In this study, 224 parsimony-informative sites were found, which are plotted in green in Figure 4:

Figure 4. Distribution of parsimony-informative sites along the SARS-CoV-2 genome. The parsimony-informative sites are shown in green at the positions where they were located in the SARS-CoV-2 genome.
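The criterion for a parsimony-informative site (at least 2 nucleotide types, at least 2 of them occurring at least twice) can be checked column by column in an alignment. The Python sketch below is a minimal illustration and not the software used in this study: the function names and toy alignment are hypothetical, and ignoring N, gap, and "?" characters is an assumption.

```python
from collections import Counter


def is_parsimony_informative(column: str, ignore: str = "N-?") -> bool:
    """True if at least two character states each occur at least twice."""
    counts = Counter(base for base in column.upper() if base not in ignore)
    return sum(1 for n in counts.values() if n >= 2) >= 2


def parsimony_informative_sites(alignment: list[str]) -> list[int]:
    """Return 0-based positions of parsimony-informative columns."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    return [i for i in range(length)
            if is_parsimony_informative("".join(seq[i] for seq in alignment))]


# Toy alignment (synthetic): column 2 is variable but holds a singleton,
# so it is not parsimony informative
toy_alignment = [
    "ACGTA",
    "ACGTT",
    "ATGAA",
    "ATCAT",
]
sites = parsimony_informative_sites(toy_alignment)
print(sites)  # [1, 3, 4]
```

Column 0 is invariant and column 2 contains a single C against three G, so neither counts; columns 1, 3, and 4 each have two states that occur twice.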
Discussion
The COVID-19 pandemic resulting from the high spread of SARS-CoV-2 has been relevant in recent years, showing a worrying scenario in Brazil, mainly in the Amazon due to successive introductions of variants and viral strains.22-24
The secondary attack rate obtained in this study was 8.32%, lower than that reported in a study carried out in 2022 in the State of Rondônia. 25 This decrease may be related to the reduction in testing for SARS-CoV-2 and the advance of vaccination.
The symptomatology presented by the study population follows the clinical pattern of COVID-19. 26 It is important to emphasize the low frequency of manifestations such as olfactory and taste disorders and dyspnea, which were more prevalent in infected individuals at the beginning of the pandemic. 27
The different comorbidities reported in this study population are already characterized as factors contributing to the development of severe cases of the disease.28-30 However, most individuals were vaccinated, and vaccination is an important protective factor against severe SARS-CoV-2 infection and is associated with a decreased risk of mortality. 31
Regarding the profile of variants associated with infections in vaccinated individuals, the predominance of the BA.1 lineage and other lineages carrying important mutations that confer higher transmission and immune and vaccine escape is notable. 32 This can be observed in the increase in cases in the State of Rondônia between January and March 2022, when the circulation of the Omicron variant was predominant. 33 The Gamma, Delta, and Omicron variant clades had a wide distribution of cases, 34 demonstrating the ability of the variants that were in circulation to cause intra-household infections, as shown in some studies.35,36 It is noteworthy that few studies have correlated intra-household infections with the Gamma, Delta, and Omicron clades as demonstrated here.
In general, the construction of haplotype networks for SARS-CoV-2 has demonstrated that the technique is able to recognize clusters in large sample groups,37,38 in this case with the ability to consider phylogeography, as well as in small groups of samples. 39
In this study, the haplogroups formed were not derived from a predominant haplogroup, as seen in other studies, 39 in which a single event is followed by dispersion into sub-clusters of haplotypes. Although there are clusters shaped according to lineages, our results show that there are no determinant haplotypes that concentrate many individuals. Instead, the clusters were more discrete and limited to family profiles, which demonstrates the great adaptive capacity of the virus, already evidenced in other studies.9,40 As expected, individuals from the same family who were infected with the same SARS-CoV-2 strain were grouped in a single cluster; however, some individuals from the same family were grouped in different haplogroups and others did not group in any cluster. This can be explained by the presence of point mutations in the genome of these variants that were decisive for haplogroup assignment.
Although SARS-CoV-2 has a proofreading mechanism in its replication process, mutations can still occur in the virus genome, increasing its transmissibility and its ability to escape the host immune response. 41 It is also important to note that the process of sequencing and assembling genomes has a crucial influence on the inference of haplotype networks, as well as on phylogenetic inference.
We observed that individuals from 6 families (fam42, fam44, fam45, fam46, fam49, and fam52) were infected with different strains of SARS-CoV-2, which suggests that members of the same family acquired the infection from different sources. One factor that could explain infections from different sources is behavioral habits combined with the relaxation of restriction measures. It is noteworthy that these infections occurred only among the Omicron sublineages (BA.*). Compared to the Gamma and Delta variants, Omicron is characterized by a high number of accumulated mutations, high transmissibility, and a subsequent increase in the number of cases of the disease.42,43
Conclusion
We can infer that the intradomiciliary environment is a favorable place for the spread of SARS-CoV-2, since it allows intense and prolonged contact between those who share the same domestic environment. In view of this, it is important to carry out genomic surveillance in more vulnerable groups, such as household contacts.
Supplemental Material
Supplemental material, sj-xlsx-1-bbi-10.1177_11779322241266354 for Haplotypic Distribution of SARS-CoV-2 Variants in Cases of Intradomiciliary Infection in the State of Rondônia, Western Amazon by Karolaine Santos Teixeira, Márlon Grégori Flores Custódio, Gabriella Sgorlon, Tárcio Peixoto Roca, Jackson Alves da Silva Queiroz, Ana Maisa Passos-Silva, Jessiane Ribeiro and Deusilene Vieira in Bioinformatics and Biology Insights
Footnotes
Acknowledgements
This study was developed by a group of researchers from Laboratório de Virologia Molecular da Fundação Oswaldo Cruz, in Rondônia, with financial support from the Fiocruz Genomic Network, Departamento de Ciência e Tecnologia (DECIT), Fundação para o Desenvolvimento das Ações Científicas e Tecnológicas da Pesquisa do Estado de Rondônia—FAPERO, Programa de Pesquisa para o SUS (PPSUS), as well as Instituto Nacional de Ciência e Tecnologia de Epidemiologia da Amazônia Ocidental—INCT- EpiAmo who have been important contributors to scientific development in the Amazon region. Collaboration from Coordenação de Aperfeiçoamento Pessoal de Nível Superior—CAPES, from whom some authors received financial aid (scholarships) during the production of this study, the Vice president of Vigilância em Saúde e Laboratórios de Referências of Fiocruz, Instituto de Biologia Molecular do Paraná (IBMP) and Laboratório Central de Saúde Pública de Rondônia (LACEN/RO) were essential for the development of the study.
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by Fundação Oswaldo Cruz de Rondônia—FIOCRUZ/RO (PROEP 2021 Process: VPGDI-008-FIO-21-2-17), Departamento de Ciência e Tecnologia (DECIT), Fundação para o Desenvolvimento da Ação Científica e Tecnológica e à Pesquisa do Estado de Rondônia—FAPERO (Process: 01133100038-0000.72 / 2016; Public bid invitation: 012/2016 PRO-RONDÔNIA and PPSUS 001/2021 Process: 350.095.442.048.526.000.000) and by Instituto Nacional de Epidemiologia da Amazônia Ocidental—INCT EpiAmO. FGN is a CNPq fellow. Departamento de Ciência e Tecnologia (DECIT) of the Brazilian MoH, US/CDC and OPAS, Brazilian office.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Contributions
Conceptualization: KST, MGFC, GOS, and DV; Data curation: KST, MGFC, and GSO; Formal analysis: KST, GOS, MGFC, TPR, AMSP, JASQ, and DV; Funding Acquisition: DV; Investigation: Methodology: KST, MGFC, GSO, TPR, and DV; Project Administration: DV; Supervision: DV; Writing–original draft: KST, MGFC, GSO, TPR, JR, AMSP, JASQ, ALFS, and DV; Writing–review & editing: MGFC and DV; All authors have read and agreed to the published version of the manuscript.
Data Availability
All the SARS-CoV-2 genomes generated and analyzed in this study are available in the EpiCoV database in GISAID under the following ID numbers: EPI_ISL_11112665-11112679, EPI_ISL_11112681-11112700, EPI_ISL_11112702-11112704, EPI_ISL_11112706-11112719, EPI_ISL_11112721-11112725, EPI_ISL_11112727-11112746, EPI_ISL_11112748-11112760, EPI_ISL_11112762, EPI_ISL_11112764-11112768, EPI_ISL_11112770, EPI_ISL_11112772-11112776, EPI_ISL_11112778-11112789, EPI_ISL_11112791-11112798, EPI_ISL_11622642-11622700, EPI_ISL_11622702-11622725, EPI_ISL_5030021, EPI_ISL_6575689-6575706, EPI_ISL_6575708, EPI_ISL_6575710-6575739, EPI_ISL_8623163-8623164, EPI_ISL_8623166-8623256, EPI_ISL_8623258-8623269, EPI_ISL_9414682-9414748, EPI_ISL_9414750-9414774, EPI_ISL_9636793-9636797, EPI_ISL_9636798-9636804 and EPI_ISL_9636805-9636878. The list of accession IDs may be found in the attached file in the supplemental materials.
Supplemental Material
Supplemental material for this article is available online.
References
