Abstract
Male factor infertility contributes to nearly half of all infertility cases, yet timely semen analysis is often hindered by logistical and psychosocial barriers. The standard recommendation for in-clinic analysis within 1 hr post-ejaculation limits access, prompting interest in at-home semen collection and mail-in testing solutions. This prospective study evaluates the Give Legacy At-Home Semen Collection Kit for its ability to preserve semen quality over time and assesses whether delayed analyses (up to 48 hr post-ejaculation) can reliably predict baseline semen parameters. Thirty participants provided 60 semen samples analyzed across five timepoints (T0: 30 min, T1: 1 hr, T2: 6 hr, T3: 24 hr, T4: 30 hr). Total and progressive motility declined over time, with the greatest reductions after 6 hr. At baseline, high total motile sperm count (TOMO) individuals exhibited significantly greater motility than those in the low TOMO group and retained relatively higher motility across time points. Predictive models using delayed data (T1–T4) accurately estimated baseline motility values, with strong performance metrics (e.g., progressive motility model: mean absolute error [MAE] = 7.31, root mean squared error [RMSE] = 11.15, r = .86, intraclass correlation coefficient [ICC] = .74). These findings suggest that semen degradation during transit is predictable and that baseline motility can be reconstructed with clinically acceptable accuracy. This study supports the clinical utility of at-home semen collection and mail-in analysis for male fertility assessment. Predictive modeling enables diagnostic insights even when testing is delayed, expanding access while maintaining diagnostic integrity. Future research should explore broader populations and integrate additional fertility biomarkers.
Introduction
Male infertility accounts for approximately 30% of infertility cases in couples, and male factors are associated with up to 50% of cases overall (Fainberg & Kashanian, 2019). Despite its prevalence, male fertility evaluation is often delayed due to logistical, psychological, or cultural barriers (Boivin et al., 2007). Traditional semen analysis requires in-clinic sample collection and analysis within a short time frame, typically within 1 hr post-ejaculation per World Health Organization (WHO) 5th edition recommendations (WHO, 2010). This constraint can limit access and convenience for patients, impeding early diagnosis and intervention.
The advent of mail-in semen collection technologies offers a potential solution, allowing men to collect specimens privately at home and ship them for centralized laboratory analysis. However, the challenge lies in the temporal degradation of semen parameters—particularly motility—during the shipping process, which can span 24 to 48 hr. Prior research has documented predictable declines in motility over time and relatively stable concentration values (Samplaski et al., 2021; Ventimiglia et al., 2015).
This study evaluates the Give Legacy At-Home Semen Collection Kit’s ability to preserve semen quality over time and assesses the viability of predicting early post-ejaculation semen parameters based on delayed analyses. Using a structured sampling and analysis protocol, we compare semen quality metrics over five timepoints, train predictive models, and validate the reliability of reconstructed baseline values from delayed data.
Materials and Methods
Study Design and Setting
This is a prospective observational study conducted at Give Legacy’s clinical and central laboratories in Hoboken, NJ and San Antonio, TX, respectively. Participants were recruited through social media and outreach to educational institutions. Informed consent was obtained in accordance with IRB approval.
Participant Recruitment and Demographics
A total of 30 male participants contributed 60 semen samples and 1,596 total data points. Participants ranged in age from 22 to 35 years (M = 28.26, SD = 4.05). Participants reflected a diverse ethnic background, including Asian (14.8%), Hispanic/Latino (14.8%), Indian (14.8%), American (7.4%), Caucasian (7.4%), Portuguese (7.4%), and Vietnamese (7.4%). Additional representation included European, Kazakh, Singaporean, South Asian, Taiwanese, and White ethnicities (each 3.7%). Most participants resided in New Jersey (48.1%) or New York (48.1%). Educational attainment was high, with 44.4% holding postgraduate degrees, 37.0% reporting a 4-year undergraduate degree, and 18.5% having completed high school.
Shipping Environment
The kit included a sealed sterile container with 12 to 13 mL of modified HTF (Human Tubal Fluid) medium with gentamicin, buffered with HEPES (N-2-hydroxyethylpiperazine-N-2-ethane sulfonic acid). Samples were shipped overnight Monday through Wednesday.
Research and Development Phase
Samples were tested for total motility and progressive motility at baseline (T0), followed by analysis at T1 (1 hr), T2 (6 hr), T3 (24 hr), and T4 (30 hr). In-clinic data from kits shipped overnight (24–48 hr) were used to validate the predictive motility model.
In-lab and in-clinic samples were maintained and tested according to standard operating procedures (SOPs) using Hamilton Thorne IVOS II computer-assisted semen analysis (CASA) systems, with results interpreted against WHO 5th edition reference ranges.
Data Collection and Management
Data elements included participant ID, demographics, ejaculate volume, timing of ejaculation, transport and testing timestamps, semen parameters, and test method metadata. Quality control included repeat testing for abnormal results, de-identification of data, and secure storage in encrypted Give Legacy servers.
Raw data were recorded in Microsoft Excel and imported into R (version 4.5.0) in wide format, where each row represented a unique semen sample. The total motile sperm count (TMSC), also known as the TOMO score, was calculated using the formula:
TOMO = Volume × Concentration × Motility
TOMO scores at baseline were then categorized into two clinically meaningful groups: low TOMO (<20 million) and high TOMO (≥20 million), based on fertility-related thresholds supported by existing guidelines.
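The TOMO calculation and grouping described above can be sketched in a few lines. This is an illustrative sketch rather than the study's own code (the analysis was run in R): the function names, the units (mL, million/mL), and the conversion of motility from a percentage to a fraction are assumptions, and the sample values are hypothetical.

```python
# Threshold separating the low and high TOMO groups, in millions (from the text).
TOMO_THRESHOLD_MILLION = 20.0


def tomo_score(volume_ml: float, concentration_million_per_ml: float,
               motility_percent: float) -> float:
    """Total motile sperm count in millions: volume x concentration x motility.

    Assumption: motility is supplied as a percentage and converted to a fraction.
    """
    return volume_ml * concentration_million_per_ml * (motility_percent / 100.0)


def tomo_group(score_million: float) -> str:
    """Return 'low' for scores below 20 million and 'high' otherwise."""
    return "low" if score_million < TOMO_THRESHOLD_MILLION else "high"


# Hypothetical sample: 3.0 mL, 40 million/mL, 55% total motility.
score = tomo_score(3.0, 40.0, 55.0)
print(round(score, 1), tomo_group(score))
```

A score of exactly 20 million falls in the high group, matching the ≥20 million cutoff stated above.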
Training of Predictive Model
Predictive models were trained using linear mixed models on the later timepoints (T1–T4) to predict total motility and progressive motility. Validation was extended to samples shipped between 24 and 48 hr, simulating at-home sample degradation. In addition, reverse prediction models were developed to estimate baseline (T0) semen quality parameters, enabling assessment of how well later values could infer early clinical characteristics.
Model performance was evaluated using standard predictive metrics: mean absolute error (MAE), root mean squared error (RMSE), Pearson correlation coefficient (r), and intraclass correlation coefficient (ICC) for absolute agreement. Predictive validity was further examined using Bland–Altman plots, which visualized mean bias and 95% limits of agreement between predicted and observed values (Bland & Altman, 1986).
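For readers less familiar with these metrics, the sketch below shows how MAE, RMSE, Pearson r, and the Bland–Altman bias with 95% limits of agreement are conventionally computed from paired observed and predicted values. It is a minimal standard-library illustration, not the study's R code; the function names and the five data points are hypothetical.

```python
import math
from statistics import mean, stdev


def mae(observed, predicted):
    """Mean absolute error between paired values."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def rmse(observed, predicted):
    """Root mean squared error between paired values."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def bland_altman(observed, predicted):
    """Mean bias and 95% limits of agreement (bias +/- 1.96 SD of differences)."""
    diffs = [p - o for o, p in zip(observed, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical observed and predicted T0 motility values (%).
observed = [60, 45, 30, 72, 55]
predicted = [58, 50, 28, 70, 61]
print(mae(observed, predicted), rmse(observed, predicted))
print(pearson_r(observed, predicted), bland_altman(observed, predicted))
```

MAE and RMSE are in the same units as motility (percentage points), which is why they can be read directly against clinical thresholds, whereas r is unitless and insensitive to a constant offset between predicted and observed values.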
Concentration was excluded as a predictor, consistent with prior studies showing its temporal stability (Cooper et al., 2010).
Validation of Delayed Testing
Delayed sample metrics at T1–T4 were used to reconstruct expected T0 values and verify measured T5 values. Comparisons used paired analyses and were stratified by TOMO score, with clinically relevant cutoffs at 20 million (TOMO = Volume × Concentration × Motility). The models’ ability to accurately reconstruct early values was assessed using correlation and agreement analyses.
Descriptive statistics for semen parameters across all timepoints are provided in Table 1. The parameters include volume, total motility and progressive motility, morphology, and DNA fragmentation. Data for total motility and progressive motility were collected at five time intervals following ejaculation: 30 min (T0), 1 hr (T1), 6 hr (T2), 24 hr (T3), and 30 hr (T4). Figure 1 presents the distributional patterns of progressive motility and total motility across T1–T4 intervals. Progressive motility decreased from a mean of 48.10 (SD = 22.15) to 34.10 (SD = 19.97), whereas total motility declined from 58.81 (SD = 22.63) to 45.80 (SD = 23.29) over the same period. These trends indicate a consistent reduction in semen quality over time, with the most marked decline occurring after the 6-hr timepoint.
Table 1. Descriptive Statistics for Semen Parameters Across Timepoints.
Note. n = 60. Values represent the mean (M), standard deviation (SD), median (Mdn), first quartile (Q1), and third quartile (Q3) for each semen parameter measured at five post-ejaculation timepoints: 30 min (T0), 1 hr (T1), 6 hr (T2), 24 hr (T3), and 30 hr (T4).

Figure 1. Distribution of Progressive Motility and Total Motility Across Time.
Statistical Analysis
All statistical analyses were conducted in R (version 4.5.0), with significance set at α = .05. Both wide and long formats were used to meet different analytic needs. Descriptive statistics (frequencies and percentages) were used to summarize participant characteristics in the wide-format dataset. Distributions of key variables were assessed using histograms and Shapiro–Wilk tests. As assumptions of normality were violated, Wilcoxon signed-rank tests were applied to compare paired measures between T0 and T4 for total motility and progressive motility.
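The Wilcoxon signed-rank test used for the T0 versus T4 comparison ranks the absolute paired differences and sums the ranks by sign. The sketch below computes only the two rank sums to show the mechanics; it is not the study's R code, the function name and data are hypothetical, and a p-value would in practice come from a statistics package.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus) for paired samples.

    Zero differences are dropped and tied absolute differences
    receive the average of the ranks they span.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of tied absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical paired total motility values (%) at T0 and T4.
t0 = [60, 45, 30, 72, 55, 40]
t4 = [50, 40, 31, 60, 45, 40]
w_plus, w_minus = wilcoxon_signed_rank(t0, t4)
print(w_plus, w_minus)
```

A large imbalance between the two rank sums, as when most samples decline from T0 to T4, is what drives a significant result.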
For longitudinal analysis, linear mixed models (LMMs) were fitted using restricted maximum likelihood (REML). Sample ID was treated as a random intercept to account for repeated measures. Model performance was evaluated using marginal R², conditional R², and ICC. Estimated marginal means (EMMs) were plotted over time to illustrate group-wise trends. Boxplots display the distribution of two key semen parameters in Figure 1.
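A random-intercept model of this kind can be written in general form as follows. The notation is an assumption introduced here for illustration: y_ij is the motility of sample i at timepoint j, x_ij collects the fixed-effect covariates (presumably timepoint, TOMO group, and age, although the exact specification is not stated), u_i is the sample-level random intercept, and σ_f² is the variance of the fixed-effect predictions. The R² expressions follow the commonly used variance-partition definitions; the text does not say which definition was applied.

```latex
y_{ij} = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)

\mathrm{ICC} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2},
\qquad
R^2_{\text{marginal}} = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_u^2 + \sigma_\varepsilon^2},
\qquad
R^2_{\text{conditional}} = \frac{\sigma_f^2 + \sigma_u^2}{\sigma_f^2 + \sigma_u^2 + \sigma_\varepsilon^2}
```

Under this formulation, the ICC is the share of residual variance attributable to stable between-sample differences, and the gap between conditional and marginal R² reflects how much the random intercept adds beyond the fixed effects.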
Table 2 presents the EMMs for total motility across timepoints, evaluated by TOMO group and overall, adjusting for age. At baseline (T0), overall motility was 53.5% (SE = 2.5). By T4 (~30 hr), motility declined to 40.3% (SE = 2.5), representing a 24.7% reduction.
Table 2. Estimated Marginal Means and Percentage Change in Total Motility by TOMO Group and Timepoint (Adjusted for Age).
Group-specific patterns showed a similar trend. In the low TOMO group, motility dropped from 39.1% at T0 to 25.2% at T4 (−35.5%). In the high TOMO group, motility declined from 67.9% at T0 to 55.3% at T4 (−18.6%).
Table 3 presents the EMMs for progressive motility across timepoints, analyzed by TOMO group and overall, with age included as a covariate. At baseline (T0), overall progressive motility was 42.7% (SE = 2.4). This value declined to 29.1% (SE = 2.4) at T4 (~30 hr), corresponding to a 31.9% reduction from baseline. Within-group trends showed a similar decline. In the low TOMO group, progressive motility decreased from 28.0% at T0 to 15.8% at T4 (−43.6%). In the high TOMO group, progressive motility declined from 57.4% at T0 to 42.4% at T4 (−26.1%).
Table 3. Estimated Marginal Means and Percentage Change in Progressive Motility by TOMO Group and Timepoint (Adjusted for Age).
Note. EMM = estimated marginal mean. SE = standard error. Percentage change (Δ% vs. T0) reflects the relative change from baseline (T0). T4 represents the final clinical observation (~30 hr).
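The percentage change reported in the text is the relative change from the T0 estimate. As a worked check, the sketch below recomputes it from the T0 and T4 EMMs quoted in the text; the function name and dictionary layout are illustrative assumptions.

```python
def percent_change(baseline: float, later: float) -> float:
    """Relative change from baseline in percent (negative means a decline)."""
    return (later - baseline) / baseline * 100.0


# (T0, T4) estimated marginal means in percent, as reported in the text.
total_motility = {
    "overall": (53.5, 40.3),
    "low TOMO": (39.1, 25.2),
    "high TOMO": (67.9, 55.3),
}
progressive_motility = {
    "overall": (42.7, 29.1),
    "low TOMO": (28.0, 15.8),
    "high TOMO": (57.4, 42.4),
}

for name, table in (("total", total_motility), ("progressive", progressive_motility)):
    for group, (t0, t4) in table.items():
        print(f"{name:12s} {group:10s} {percent_change(t0, t4):6.1f}%")
```

Because the change is relative to each group's own baseline, the low TOMO group shows a larger percentage decline even though its absolute drop in percentage points is similar to that of the high TOMO group.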
Discussion
This study investigated the efficacy of the Give Legacy At-Home Semen Collection Kit in preserving semen quality over time and in predicting early post-ejaculation semen parameters from delayed analyses. The findings indicate that delayed analysis of samples collected with the kit can predict early post-ejaculation values with acceptable accuracy, and they align with previous work (Hamilton et al., 2015; Samplaski et al., 2021) showing linear degradation of motility and stability of concentration.
The results show that total motility and progressive motility decline consistently over time, with the most marked reduction occurring after the 6-hr timepoint. At T0, total motility is higher in the high TOMO group than in the low TOMO group (67.9% vs. 39.1%). Over time, motility declines in both groups, but the relative decline differs between them.
At T2 and T3, both groups experience a decrease in motility compared with their respective baselines. By T4, the reduction in motility is more pronounced in both groups; however, the percentage decrease is greater in the low TOMO group (−35.5% at T4) than in the high TOMO group (−18.6% at T4). These findings suggest that while total motility declines over time in both the low and high TOMO groups, individuals with higher initial motility tend to maintain relatively higher levels throughout the observation period.
The reverse prediction models developed by training on T1–T4 to estimate baseline (T0) semen quality parameters also demonstrate high predictive validity, as evidenced by strong correlation and agreement metrics.
Predictive Model Accuracy
To assess whether motility values observed at later timepoints could be used to retrospectively estimate initial motility (T0: 30 min post-ejaculation), a linear mixed model was developed using data from T1 to T4 (Table 4). For total motility, the model yielded strong predictive performance, with MAE = 9.36, RMSE = 13.62, and Pearson r = .79. Agreement between predicted and observed T0 values was acceptable, ICC = .62, 95% CI = [0.42, 0.75], p < .001.
Table 4. Model Accuracy for Predicting Total and Progressive Motility at T0 from Later Time Points.
Note. MAE = mean absolute error; RMSE = root mean squared error; ICC = intraclass correlation coefficient for agreement; Pearson r = correlation between predicted and observed scores. Higher ICC values indicate stronger agreement, whereas higher Pearson r values reflect stronger correlation between observed and predicted scores. Models trained on T1–T4 data (n = 228) predicted T0 using clinic observations.
To evaluate the predictive capacity of estimating progressive motility at T0 using subsequent time points, a linear mixed model was fitted. The model, trained on data from T1 (1 hr) to T4 (30 hr), yielded an MAE of 7.31, an RMSE of 11.15, and a Pearson correlation coefficient of r = .86. The ICC indicated moderate agreement between predicted and observed values (ICC = .74, 95% CI = [0.50, 0.79], p < .001).
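The gap between Pearson r and the ICC arises because an absolute-agreement ICC penalizes systematic offsets between predicted and observed values, whereas r does not. The sketch below computes a single-measure, two-way ICC for absolute agreement, often written ICC(A,1), from the usual ANOVA mean squares. The choice of this particular ICC form is an assumption, since the text specifies only "absolute agreement", and the function name and data are hypothetical.

```python
def icc_absolute_agreement(observed, predicted):
    """Single-measure two-way ICC for absolute agreement, ICC(A,1).

    Treats observed and predicted values as k = 2 "raters" scoring n samples.
    """
    n, k = len(observed), 2
    rows = list(zip(observed, predicted))
    grand = (sum(observed) + sum(predicted)) / (n * k)
    row_means = [(o + p) / k for o, p in rows]
    col_means = [sum(observed) / n, sum(predicted) / n]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in rows for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values: predictions shifted by a constant 10 points.
observed = [10.0, 20.0, 30.0]
shifted = [20.0, 30.0, 40.0]
print(icc_absolute_agreement(observed, observed))
print(icc_absolute_agreement(observed, shifted))
```

In the shifted example the two series are perfectly correlated, yet the ICC falls below 1 because the constant bias enters the denominator through the between-column mean square.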
Conclusion
This study systematically evaluated temporal changes in two semen quality parameters, total motility and progressive motility, using an at-home collection kit with delayed laboratory analysis. Across both outcomes, a consistent decline was observed from the 30-min baseline to the post-transit point. Linear mixed models confirmed that time was a robust predictor of deterioration, and the TOMO score was a significant stratifier of semen quality across timepoints. To assess retrospective estimation capabilities, predictive models were trained on later timepoints (T1–T4) to reconstruct initial semen quality at T0. Model performance showed acceptable levels of error and correlation for motility-based measures. Overall, these findings underscore the feasibility of using predictive modeling frameworks to infer baseline semen quality from delayed or transit-affected samples. However, predictive performance remains constrained by both biological variability and the effects of sample handling. Incorporating stratification factors like the TOMO score enhances model interpretability and clinical utility, particularly when estimating semen quality in nonclinical settings.
These results have important implications for understanding how baseline motility influences motility patterns over extended periods, such as transit lasting up to ~48 hr.
Future research should explore the mechanisms underlying the differences between low and high TOMO individuals, taking into account covariates such as age, which was adjusted for in this analysis. In addition, larger, multi-region cohorts, incorporation of additional fertility biomarkers, and real-world patient usage simulations should be considered to further validate and expand the applicability of these findings.
The Give Legacy At-Home Semen Collection Kit maintains clinically relevant semen quality metrics for up to 48 hr post-ejaculation. Predictive models trained on later timepoints can estimate baseline values with acceptable accuracy, enabling at-home semen collection for fertility evaluations without compromising diagnostic integrity.
This innovation holds promise to expand male fertility testing access and overcome traditional barriers associated with clinic-based testing.
Footnotes
Ethical Considerations
The present study protocol was reviewed and approved by the WCG institutional review board (Protocol No. 20253287). Informed consent was obtained from all subjects at enrollment.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Unika Alexander and Khaled Kteily are full-time employees of Give Legacy. Dr. Ramy Abou Ghayda is a part-time employee of Give Legacy. Dr. Denny Sakkas and Dr. Francisco Arredondo serve as scientific advisors to Give Legacy. None of the authors exerted any influence on this work in relation to the company or its products. The authors have not received any financial or non-financial rewards related to this study, and Give Legacy had no role in the design, methodology, data collection, analysis, interpretation of results, or writing of this manuscript, either directly or indirectly.
