Abstract
Martian rock and regolith samples are being collected and cached by NASA’s Perseverance rover, with the goal of returning them to Earth as early as the mid-2030s. Upon return, samples would be housed in a sample receiving facility under biological containment to prevent exposing Earth’s biosphere to any potential biohazards that might be present. Samples could be released from high containment for scientific investigations if they are found to be safe or are sterilized. The Sample Safety Assessment Protocol Tiger Team (SSAP-TT) was convened by the Sample Receiving Project between August 2023 and August 2024 and tasked with the development of a Sample Safety Assessment Protocol (SSAP). The result of this work is a proposed three-step protocol, supported by Bayesian statistical hypothesis testing, to assess the risk that returned samples contain modern martian biology that could represent a biohazard. The proposed protocol outlines procedures to determine whether the samples could be safely released from high containment without sterilization or require a “hold and review” step. This article presents the central concept of the SSAP approach: comparing returned samples to the abiotic baseline. Organic molecules, which exist throughout the solar system, can have either biotic or abiotic origins. However, biotically produced organic molecules exhibit distinct complexity, distribution, and abundance characteristics that differentiate them from those formed through abiotic chemistry. The proposed protocol would examine the organic inventory of returned samples by using multiple techniques, including morphological and spectral assessments, to determine whether any signals exceed the abiotic baseline; that is, whether any part of the organic molecular inventory cannot be explained solely by abiotic chemical synthesis. This approach provides a rigorous, yet feasible, safety assessment protocol by using modern techniques while minimizing sample consumption.
We also identify key areas for future research and development, which include detection limits and further characterization of the martian abiotic background.
Introduction
Approximately 3–4 billion years ago, when life arose on Earth, Mars was likely more habitable than it is today, with a thicker atmosphere and liquid water at the surface (Carr, 1987; Carr and Head III, 2003; Schmidt et al., 2022). The discovery of evidence of a past atmosphere and water led to the hypothesis that Mars could have hosted life in its ancient past, and signs of this life may be recorded in the ancient martian geologic record. In 2020, NASA’s Mars 2020 Perseverance Rover was launched with four goals, which included the search for signs of ancient microbial life and caching samples for potential future return to Earth (Farley et al., 2020). Perseverance was sent to Jezero Crater, which hosts an ancient delta (∼3.5–3.9 Ga; Fassett and Head, 2008; Goudge et al., 2015) that may contain evidence of past microbial life.
Perseverance has an analytical payload that includes several instruments capable of assaying for signs of past life. However, payload capacity, power requirements, and harsh conditions limit the number, size, and resolution of onboard instruments and sample preparation conditions (compared to those available on Earth). Therefore, the search for life (ancient or modern) at Jezero Crater requires returning samples to Earth for analysis with high-resolution state-of-the-art instrumentation (NASEM, 2007, 2023). The detection of martian life would be one of the most important scientific discoveries of our time. Even in the event that evidence of life is not found, information gained from returned martian samples would revolutionize our understanding of Mars and the solar system (iMOST et al., 2019).
Perseverance is collecting and caching a scientifically selected set of geological and atmospheric samples for return to Earth. At the time of publication, 30 samples have been collected: 27 rock cores, two regolith samples, and one atmospheric sample. These samples include igneous and sedimentary rocks, each with a mass of ∼10–27 g (Zorzano et al., 2024). NASA and the European Space Agency (ESA) are planning a joint Mars Sample Return (MSR) campaign to return those samples from the surface of Mars to Earth. If the samples contain evidence of ancient (billions of years old) fossilized life, but no evidence of modern life, they would not be considered hazardous to Earth’s biosphere. Although the surface of Jezero Crater is not considered currently habitable for extant life (Farley et al., 2020; NRC, 1997; Rummel et al., 2014), there is a low, though not zero, chance that extant martian life could be present in the samples, which could represent a risk to Earth’s biosphere. Because of the potential risk, Mars is classified as a Category V restricted Earth return planet by the Committee on Space Research (COSPAR) and NASA’s Office of Planetary Protection (COSPAR Panel on Planetary Protection, 2024; Office of Safety and Mission Assurance, 2021; United Nations, 1967), and returned samples would be opened in a biologically secure high-containment facility (Box 1) (Atlas, 2008; Beaty et al., 2009; COSPAR, 2008).
Background and History of Sample Safety Assessment
If it is safe to do so, samples should be released from the high-containment facility to enable analysis with the most up-to-date and cutting-edge technology available. Due to the number, size, and cost of the instruments required to achieve the proposed MSR science objectives, it is not feasible to equip the sample receiving facility (SRF) with the full breadth of equipment necessary for sample analyses. Therefore, to release the samples, a test protocol is needed to assess the risk of potential biohazards in returned samples within a given risk tolerance threshold and determine the requirements for release. The proposed test protocol discussed in this article provides such a Sample Safety Assessment Protocol (SSAP).
In September 2023, the SSAP Tiger Team (SSAP-TT) convened to propose an SSAP for the planned MSR campaign. The result was a feasible three-step protocol (Fig. 1), supported by a Bayesian statistical framework, to assess the level of risk that the returned samples (200–600 g total) contain biohazards that could harm Earth’s biosphere. The SSAP-TT objectives, methods, and findings were originally presented as an internal NASA/ESA deliverable (unpublished) and will be presented across three separate publications. A detailed description of the proposed statistical methodology, which includes estimates of the required number of subsamples per sample tube derived from measurements in Step 1, will comprise one of these publications (Cressie et al., article in preparation). This article describes the second step, and the description of the third step will make up the third publication (McDonnell et al., article in preparation).

Overview of the proposed Sample Safety Assessment Protocol showing the workflow and decision-making processes. In the proposed protocol, Steps 1 and 2 would be required for all samples. If evidence of biotic processes is not found at a to-be-declared threshold of acceptable risk, samples may “pass” the protocol for release from high containment. If samples do not pass the protocol at Step 2, Step 3 would be invoked to determine whether the putative biotic signal is from ancient martian biology or modern contamination (either of which results in passing the protocol) or from modern martian biology, in which case the samples would be held for further review.
The proposed SSAP builds upon recent resources such as the Astrobiology Strategy documents (NASEM, 2019) and the 2023 Planetary Science Decadal Survey (NASEM, 2023). It extends the work chartered by COSPAR under the Sample Safety Assessment Framework (SSAF) analysis, which began in 2018 and was published in 2022 (Kminek et al., 2022).
Limiting the number of assumptions about putative martian biology and its hazardous potential in the proposed SSAP is an important component of the protocol. Specifically, these assumptions relate to (1) the ability to cause harm, (2) standard definitions of “life,” and (3) similarities between Earth and martian biology. First, it is not feasible to assess all the ways in which martian biology could be hazardous to the biosphere; therefore, the protocol treats any modern martian biology as if it were hazardous. Second, the SSAP does not rely on a standard definition of “life.” For example, viruses and prions on Earth would be detected by the protocol, though they do not necessarily fit the canonical definitions of terran life (Bartlett and Wong, 2020; Chou et al., 2024). Third, the SSAP must be able to detect biology that is different from known life on Earth. The protocol includes but does not assume, for example, the use of DNA or RNA as information molecules. Instead, it is designed to detect agnostic biosignatures expected from carbon-based biology, which includes (but is not limited to) specific diagnostic molecules produced by “life as we know it” (Buckner et al., 2024; Grefenstette et al., 2024; Johnson et al., 2018).
The proposed protocol is centered on a concept called the “abiotic baseline.” The abiotic baseline concept is not new (NASEM, 2007, 2019, 2023), and comparison of abiotic and biotic signals has been integrated into many studies relevant to both Mars and Earth (e.g., Gillen et al., 2023; Sephton and Carter, 2014; Sherwood Lollar et al., 2002; Webster et al., 2018). The use of the abiotic baseline in the proposed SSAP represents a major advance in our understanding of how to address sample safety assessments. This approach uses agnostic interrogation of the entire molecular signal rather than relying on individual biomarkers or traditional assays based on life as we know it. Combined with Bayesian design and statistical hypothesis testing, which are used to estimate the amount of subsample required and to determine the presence or absence of biotically produced organic material (Cressie et al., article in preparation), this approach enables an effective and comprehensive safety protocol that uses modern techniques without consuming an excessive amount of sample.
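The role of Bayesian reasoning can be illustrated with a minimal sketch: a prior probability that a sample contains detectable biotic material is updated after several subsample measurements all return no signal above the abiotic baseline. The function names, the independence of subsample tests, and every numerical value are illustrative assumptions; this is not the SSAP-TT statistical methodology, which is described by Cressie et al. (article in preparation).

```python
def posterior_after_negatives(prior, sensitivity, n_negatives,
                              false_positive_rate=0.0):
    """P(biotic material present | n negative results), by Bayes' rule.

    Hypothetical simplification: each subsample test is independent and
    detects biotic material, when present, with probability `sensitivity`.
    """
    # Probability of seeing only negatives if biotic material is present
    p_neg_given_present = (1.0 - sensitivity) ** n_negatives
    # Probability of seeing only negatives if it is absent
    p_neg_given_absent = (1.0 - false_positive_rate) ** n_negatives
    numerator = prior * p_neg_given_present
    denominator = numerator + (1.0 - prior) * p_neg_given_absent
    return numerator / denominator


def subsamples_needed(prior, sensitivity, risk_threshold, max_n=1000):
    """Smallest number of negative subsample results that brings the
    posterior probability down to `risk_threshold` or below."""
    for n in range(max_n + 1):
        if posterior_after_negatives(prior, sensitivity, n) <= risk_threshold:
            return n
    raise ValueError("risk threshold not reachable within max_n subsamples")


if __name__ == "__main__":
    # Illustrative values only; real priors and sensitivities are not given here.
    print(subsamples_needed(prior=0.5, sensitivity=0.5, risk_threshold=0.1))
```

Under these assumptions, a lower per-test sensitivity or a stricter risk threshold increases the number of subsamples required, which is the trade-off against sample consumption that the text describes.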
The complexity, distribution, and abundance of organic molecules produced by biotic processes are fundamentally different from what can be produced by abiotic chemistry (Cronin and Walker, 2016). Biotic processes select, assemble, and concentrate molecules into larger molecular structures, whereas abiotic chemistry produces a large number of monomers or short polymers (Box 2; Fig. 2). The SSAP proposes to interrogate the molecular inventory of returned samples, which would reveal whether chemical properties are consistent with abiotic processes (i.e., the abiotic baseline defined by the chemistry and geology of Mars) or if biotic chemistry is needed to explain sample characteristics (i.e., the abiotic baseline is exceeded). These analyses need to be performed in combination with mineralogical and textural analyses to collect sufficient information and context for interpretations. The proposed SSAP consists of the following three steps (as shown in Fig. 1). The abiotic baseline, which is the central component of the protocol and the focus of this publication, is represented in Step 2, described as follows (Fig. 2).
Important Definitions and Acronyms

Abiotic baseline primer. Brief introduction to organic molecules, patterns of organic molecules generated by biotic and abiotic processes, and signals that differentiate between biotically and abiotically produced organic molecules. Parts of the figure are adapted from Chou et al. (2024) and Chapter 14 of NASEM (2023).
There are two possible outcomes of the protocol. If no evidence of modern martian biology is found, samples “pass” the protocol and may be released from high containment for analysis and curation at institutions outside the SRF. On the other hand, any modern or recent martian biology would be treated as a potential hazard. If this signal is detected, the protocol calls for “hold and review.” It should be noted that contamination knowledge and control procedures would be present across all steps of the protocol but are not detailed in this article because they are being developed and written by a separate panel of experts (Sessions et al., 2025).
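The decision logic described above and in the overview of the protocol workflow can be sketched as a small function. The enum and function names are hypothetical, and the sketch reduces each step to a single yes/no result; the actual protocol would rest on the statistical tests and thresholds defined elsewhere.

```python
from enum import Enum


class Step3Finding(Enum):
    """Possible sources of a putative biotic signal (names are illustrative)."""
    ANCIENT_MARTIAN_BIOLOGY = "ancient martian biology"
    MODERN_CONTAMINATION = "modern contamination"
    MODERN_MARTIAN_BIOLOGY = "modern martian biology"


class Outcome(Enum):
    PASS = "pass: may be released from high containment"
    HOLD_AND_REVIEW = "hold and review"


def ssap_outcome(baseline_exceeded, step3_finding=None):
    """Return the protocol outcome for one sample.

    baseline_exceeded: result of Step 2 (is the abiotic baseline exceeded?).
    step3_finding: result of Step 3, required only when Step 2 is not passed.
    """
    if not baseline_exceeded:
        # No evidence of biotic processes: the sample passes at Step 2.
        return Outcome.PASS
    if step3_finding is None:
        raise ValueError("Step 3 is required when the abiotic baseline is exceeded")
    if step3_finding is Step3Finding.MODERN_MARTIAN_BIOLOGY:
        # Any modern martian biology is treated as a potential hazard.
        return Outcome.HOLD_AND_REVIEW
    # Ancient martian biology or modern contamination passes the protocol.
    return Outcome.PASS


if __name__ == "__main__":
    print(ssap_outcome(True, Step3Finding.MODERN_CONTAMINATION).value)
```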
This article details the scientific rationale behind the abiotic baseline approach and applies it to the expected MSR samples. It provides a practical assessment of the measurements required to define an abiotic baseline for MSR samples. Subsequent publications will describe the statistical methodology, sample heterogeneity, and assessment of biological activity.
State-of-the-art life detection strategies in low-biomass environments on Earth and in extraterrestrial materials rely on the presence and distribution of complex organic molecules that cannot be produced by abiotic chemistry (e.g., Steele et al., 2000). Organic molecules are present throughout the solar system (e.g., in meteorites, comets, and the interstellar medium) and can be produced by abiotic chemistry or through biological processes (Eigenbrode et al., 2018; Schmitt-Kopplin et al., 2023; Steele et al., 2012, 2016, 2018, 2022; Zeichner et al., 2023). Nonbiological chemical reactions can lead to the formation of organic molecules through various mechanisms such as impact events, metal catalysts, or photochemical reactions, which provide the necessary energy to drive these processes (Ruf et al., 2019; Treiman, 2003). In environments where water is present, such as on early Mars, interactions between water and rock (specifically through processes such as serpentinization and mineral carbonation) can also produce reactive intermediates. These intermediates can then react with other substrates, leading to the formation of organic molecules via mechanisms such as the Sabatier reaction, Fischer-Tropsch-type reactions, or the reverse water-gas shift reaction (e.g., Fig. 2) (Sharma et al., 2023; Steele et al., 2022). The types of organic molecules synthesized in an abiotic environment depend on factors such as the composition of starting material (e.g., hydrogen, carbon, oxygen, and nitrogen sources), energy sources, catalysts, and environmental conditions. Simple chemical models (e.g., Schulz-Flory) (Flory, 1936) can predict the distribution and abundance of monomers and short polymers formed through abiotic chemistry.
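As an illustration of how such a simple model yields an abiotic expectation, the sketch below computes the Schulz-Flory mole fraction of chains of length n, x_n = (1 − α)·α^(n−1), where α is the chain-growth probability. The value of α and the function names are assumptions chosen for illustration; they are not parameters reported for martian samples.

```python
def schulz_flory_mole_fractions(alpha, max_n):
    """Mole fraction of chains with n = 1..max_n units under the
    Schulz-Flory model: x_n = (1 - alpha) * alpha**(n - 1).

    alpha is the chain-growth probability (0 < alpha < 1); the value
    used in any real comparison would have to be fitted or assumed.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return [(1.0 - alpha) * alpha ** (n - 1) for n in range(1, max_n + 1)]


def ratio_to_baseline(observed, alpha):
    """Ratio of observed mole fractions to the modeled abiotic expectation
    for each chain length (values well above 1 exceed the modeled baseline)."""
    expected = schulz_flory_mole_fractions(alpha, len(observed))
    return [obs / exp for obs, exp in zip(observed, expected)]


if __name__ == "__main__":
    for n, x in enumerate(schulz_flory_mole_fractions(alpha=0.5, max_n=6), start=1):
        print(n, round(x, 4))
```

The modeled abundances decline smoothly with chain length, so an inventory concentrated at particular longer chain lengths would stand out against this expectation.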
On Earth, life selects a modest subset of monomers and assembles them into large molecular structures such as membranes, proteins, ribosomes, and information polymers such as nucleic acid chains. The result is a relatively small number of concentrated, repetitive, and complex molecules that are characteristic of biotic chemistry. Although we cannot assume that (putative) biology elsewhere in the solar system builds the same molecular structures as Earth life, an inventory of organic molecules with signatures of biotic processes described above (concentrated, repetitive, and complex) can be considered agnostic evidence of biology. Since these features rely on fundamental principles known about life, they represent distinct features that are discernable from abiotic processes (e.g., Chou et al., 2024). These concepts translate into the abiotic baseline, whereby the molecular inventory of samples is characterized and then assessed for signs of concentration or complexity (Fig. 2).
At its simplest, the abiotic baseline is the baseline or “noise” produced by abiotic (nonlife) chemistry in a given environment. Life produces a biotic “signal” that rises above the abiotic baseline. For any measurement, the ratio of signal to noise determines the robustness of, and confidence in, the measurement. Figure 3 provides a theoretical example of these principles. Figure 3A shows a high signal-to-noise ratio where signal I rises significantly above the baseline or background and is thus reliable. In Fig. 3A, signal II is unreliable because it is indistinguishable from the noisy baseline. By contrast, in Fig. 3B, signal II is detected at the same magnitude as in Fig. 3A, but the result is robust in this context, relative to the lower (quieter) baseline. A “signal” or measurement alone is insufficient; rather, it must be evaluated in the context of the baseline. For complex samples from Earth, Mars, or other bodies, the abiotic baseline is defined not as the instrumental baseline noise of the proposed measurements alone, but as the background or “noise” created by non-life-associated processes for a given environment. It follows that potential signs of life can only be confidently identified when compared to the “abiotic baseline” or “noise” produced by nonlife chemistry.
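The point that the same signal can be unreliable against a noisy baseline and robust against a quiet one can be shown with a short calculation. The sketch below uses one common definition of signal-to-noise ratio (signal height above the mean baseline, divided by the baseline standard deviation); the baseline readings and signal value are invented for illustration, and the sketch covers only the instrumental analogy.

```python
from statistics import mean, stdev


def signal_to_noise(signal, baseline):
    """Signal height above the mean baseline, in units of the
    baseline standard deviation (one common SNR definition)."""
    return (signal - mean(baseline)) / stdev(baseline)


# Hypothetical baseline readings in arbitrary units
noisy_baseline = [8, 12, 9, 11, 10, 13, 7]
quiet_baseline = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.7]

signal_ii = 14  # the same signal magnitude in both settings

if __name__ == "__main__":
    print("noisy baseline SNR:", round(signal_to_noise(signal_ii, noisy_baseline), 2))
    print("quiet baseline SNR:", round(signal_to_noise(signal_ii, quiet_baseline), 2))
```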

There is an important difference between the analogy to an instrumental signal-to-noise ratio and the “noise” of the abiotic baseline: the latter is not random, unlike the noise depicted in Fig. 3A and B. For instance, abiotic organic chemistry is not randomly variable but produces distinct patterns of carbon compounds independent of life. Selecting target compounds that have resolvably different patterns created by biotic versus abiotic processes is fundamental to the approach described in this protocol. Where life occurs, it does not use the full inventory of organic chemicals available in the abiotic background environment; instead, it uses a more limited alphabet of organic species. Hence, the diversity and distribution of life’s signatures can be recognized when they are found in concentrations and patterns that are different from the abiotic, or nonlife, background (e.g., Buckner et al., 2024; Dorn et al., 2011; NASEM, 2007; Steele, 2016; Steele et al., 2007).
Recognition of an abiotic baseline has been a powerful catalyst in modern biosignature research. Previously, the implicit or explicit principle was that any signal produced by life could be a “biosignature,” that is, a key measurement in life detection. Thus, the presence of an individual molecule, isotopic signature, morphology, or other parameter that was often associated with terran life was relied on for life detection, whether in early Earth studies, in the terran subsurface, or elsewhere in the universe. These features are only reliable biosignatures by comparison to the abiotic baseline (NASEM, 2023).
The abiotic baseline is well-suited for life detection strategies because it is agnostic (Box 2), and it minimizes assumptions about the forms and characteristics of (putative) non-Earth biology. We do not know what molecules might be used by potential martian biology, so applying an agnostic approach and interrogating the whole molecular signal rather than relying on specific molecules or traditional microbiological assays is better suited to the search for biology that may be substantially different from Earth life. In this way, the SSAP is not constrained by any formal definition of “life.” Dormant forms (e.g., spores) or other potentially hazardous biotic entities such as viruses and prions (or putative martian equivalents) can be integrated into such an agnostic protocol.
The SSAP does assume that potential martian biology would be carbon-based, as specified in the terms of reference. It also assumes that life, regardless of origin, must be able to replicate and use, transport, and store energy. For carbon-based life, this necessitates the construction of complex organic molecules from simpler building blocks, such as free amino acids, nucleosides, and carbohydrates. Earth life uses only a fraction of the available building blocks to build these macromolecules. The assumption of the SSAP is that life is selective and polymerizes small simple organic molecules to form large complex ones. These properties (selectivity and polymerization) are detectable attributes central to the logic proposed in this protocol.
An example of Earth life’s selectivity is chirality. Abiotic processes generally produce molecules (such as amino acids) that are racemic; that is, there is a 1:1 distribution of the l- and d- enantiomers with no strong preference for synthesis of either form (e.g., Glavin et al., 2020). In contrast, life as we know it selects and creates different relative concentrations of l- and d- enantiomers depending on the compound class in question. In Fig. 4, we show an example of hypothetical liquid chromatography mass spectrometry (LC-MS) chromatograms. In this example, life preferentially produces l-alanine, creating a distinctly nonracemic mixture with an excess of l-alanine to d-alanine—a “biosignature” (Fig. 4B) clearly distinct from the abiotic baseline where the two isomers appear in an approximately 50:50 ratio (Fig. 4A). Figure 4C provides an example of an agnostic biosignature—a pattern of d-alanine preference that is different from both the abiotic baseline (Fig. 4A) and the expectation of l-alanine preference from known Earth life (Fig. 4B).

Illustration of hypothetical organic signals relevant to the SSAP. The amino acid alanine (ala) can be biotic or abiotic, and a way to determine its provenance is to evaluate the enantiomeric excess.
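The enantiomeric excess referred to here is conventionally computed as ee = 100 × (L − D) / (L + D), where L and D are the measured amounts (for example, chromatographic peak areas) of the two enantiomers. The sketch below applies this formula; the peak areas, the tolerance used to call a mixture racemic, and the function names are illustrative assumptions.

```python
def enantiomeric_excess(l_area, d_area):
    """Enantiomeric excess in percent: positive for an L excess,
    negative for a D excess, zero for a racemic mixture."""
    total = l_area + d_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * (l_area - d_area) / total


def describe(ee_percent, racemic_tolerance=5.0):
    """Label a result; the 5% tolerance is a hypothetical placeholder,
    not a threshold defined by the protocol."""
    if abs(ee_percent) <= racemic_tolerance:
        return "approximately racemic (consistent with abiotic baseline)"
    return "L excess" if ee_percent > 0 else "D excess"


if __name__ == "__main__":
    # Hypothetical L and D peak areas for three scenarios
    for l_area, d_area in [(50, 50), (90, 10), (10, 90)]:
        ee = enantiomeric_excess(l_area, d_area)
        print(l_area, d_area, ee, describe(ee))
```

Both a strong L excess and a strong D excess depart from the racemic expectation, which is why the measure suits an agnostic comparison.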
Searching for evidence of biology on Mars would require (1) understanding martian chemistry and (2) knowing the distribution of organic molecules expected from solely abiotic processes (based on previous detections of organic molecules on Mars, inferences from meteorite studies, and potentially laboratory or analog environments on Earth). For the following discussion of the abiotic baseline, we present a scenario where sample heterogeneity has been characterized (Step 1 of the SSAP) and the abiotic baseline approach is being applied to representative subsamples (Step 2 of the SSAP). Many methods exist for detecting and characterizing biotically produced organic material and could potentially be used to determine whether biotic processes contribute to the chemical inventory of returned samples (e.g., Kminek et al., 2022). The measurements that the SSAP-TT proposes to characterize the organic inventory of returned samples, guided by textural information to determine whether the abiotic baseline is exceeded, are based on four criteria: (1) within instrument capabilities and resolution, measurements must contribute to our understanding of the diversity of organic molecules present in returned samples; (2) measurements must be “agnostic,” that is, tests must evaluate patterns expected from any potential modern biology, not just “life as we know it”; (3) data produced must be statistically robust (i.e., low false negative and false positive error rates) with potential for a high ratio of signal (life) to noise (abiotic baseline); and (4) measurements should minimize the sample mass consumed.
No single method captures the potential chemical diversity of returned samples, and confidence is increased with several lines of evidence from multiple measurements. Therefore, we identified a combination of robust, established methods that best enable us to achieve primary SSAP objectives while minimizing sample use. Many of the techniques we suggest in characterizing the abiotic baseline are also used by the rovers. However, laboratory-based measurements have much higher sensitivity and lower detection limits than in-flight measurements. The recommended measurements are divided into three tasks. Tasks A and B are recommended for all subsamples. Task C is provided as a contingency in case A and B results are inconclusive. These methods fall into two general categories: spatially resolved (Task A) and bulk (Tasks B and C). Using both spatially resolved and bulk techniques together enables a high degree of characterization of the organic inventory and its provenance, which is necessary to adequately interrogate samples for evidence of recent biology. A workflow and summary of the proposed measurements and instrumentation are in Fig. 5 and Box 3. Detailed information is provided in subsequent sections.

Summary of the proposed Step 2 of the SSAP, showing its organization into three sequential tasks.
Evaluation Points for Abiotic Baseline Assessment Workflow
Ultimately, the collective data from the abiotic baseline comparison would yield a broad view of organic speciation, elemental composition and mineralogy, and molecular organic characteristics in context with mineralogy. Each data type informs the interpretation of data from other instruments. These overlapping datasets, and the robust physicochemical picture they produce, are necessary to adequately perform biohazard assessment. Though the specific measurements and workflows recommended by the SSAP were guided by hazard assessment, these data are equally valuable from a scientific perspective, providing an unprecedented understanding of Mars and guiding downstream investigations.
Rationale, introduction to imaging methods, and sample preparation
The inclusion of spatially resolved methods (Task A) in the SSAP overcomes some of the limitations associated with bulk methods (Tasks B and C). Bulk methods obscure fine-scale variation because organic inventories are averaged across milligrams of sample. The spatial context of the organic material (i.e., where it is located in the sample, including what structures or minerals the organic matter is associated with) is lost. For bulk methods, the organic molecules detected are dictated by the extraction technique or solvent (e.g., thermal [pyrolysis], hot water, dichloromethane, methanol), none of which extracts all of the organics present in the sample. In some cases, imaging can provide resolution down to the equivalent of a single bacterial cell, which is at or below the current limits of detection (LOD) for bulk methods (e.g., Schie and Huser, 2013; Serrano et al., 2014).
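The averaging effect of bulk methods can be illustrated with a toy map of organic abundance in which a single pixel holds a strong, localized signal. The grid size, abundance values, and threshold below are invented for illustration and carry no physical units.

```python
def bulk_average(grid):
    """Mean abundance over the whole map, as a bulk measurement would report."""
    values = [v for row in grid for v in row]
    return sum(values) / len(values)


def hotspots(grid, threshold):
    """(row, column) positions whose abundance exceeds the threshold,
    as a spatially resolved measurement could report."""
    return [(r, c)
            for r, row in enumerate(grid)
            for c, value in enumerate(row)
            if value > threshold]


def make_grid(size, background, hotspot_position, hotspot_value):
    """Uniform background map with one localized hotspot (toy data)."""
    grid = [[background] * size for _ in range(size)]
    r, c = hotspot_position
    grid[r][c] = hotspot_value
    return grid


if __name__ == "__main__":
    grid = make_grid(size=10, background=1.0,
                     hotspot_position=(3, 7), hotspot_value=50.0)
    print("bulk average:", bulk_average(grid))
    print("hotspots above 5.0:", hotspots(grid, threshold=5.0))
```

In this toy case the bulk average stays close to the background level even though one location is fifty times more abundant, while the spatially resolved view reports both the hotspot and its position.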
The proposed protocol begins with an initial microscopic description of subsamples (e.g., lithology, grain size, aqueous alterations) (Table 1). At this point in the protocol, the samples would have previously undergone low-resolution imaging in the Basic Characterization (BC) phase as part of curation and to guide subsampling. The initial measurements would be limited by containment in BC vessels. Additionally, drill cuttings and dust could interfere, interiors would be inaccessible, and samples could not be manipulated during BC measurements. Out of BC vessels, subsampling would provide access to sample interiors and prepared surfaces. The ability to target regions of interest and obtain higher spatial and spectral resolution is necessary for discriminating, differentiating, and characterizing sample components. This description would enable investigators to check whether higher microscale heterogeneity exists than previously thought and to adjust subsampling strategies accordingly. If a putative biotic signal were detected, additional imaging could be conducted prior to downstream destructive analyses.
Summary of the Proposed Imaging Measurements
For mass allocation rationale, see section 5.4. “Data from analogs informing mass estimates.”
BC = Basic Characterization; DUV = deep ultraviolet; PE = preliminary examination.
High-resolution mapping spectroscopy and spectrometry are powerful tools for analyses of meteoritic (including martian meteorites) and terrestrial materials (Steele et al., 2018). They are crucial for assessing heterogeneity at the micron and submicron scales, which corresponds to variation in potential habitability, organic chemistry, and biomaterial preservation. Although trace signs of life can be difficult to locate at such fine scales, lower-resolution imaging can be used to identify regions for high-resolution scans, such as microfractures or organic hotspots.
Four general types of imaging are recommended in Task A: (1) light microscopy, (2) scanning electron microscopy (SEM), (3) optical spectroscopy (Raman/deep ultraviolet [DUV]), and (4) imaging mass spectrometry (IMS). In spectroscopic imaging, intense light (typically from a laser) is used to illuminate a specimen. Molecules in the sample “scatter” the light in different ways that depend on the wavelength of light and molecule characteristics. For IMS, a highly focused beam ionizes molecules at the sample surface, and the resulting ions are then detected by mass spectrometry. By combining microscopy and optical spectroscopy, organic and mineral distributions can be mapped onto optical images (Table 1). These instruments can detect the molecular composition of morphological features (i.e., the distribution of organic material within and surrounding cell-like structures), spatial associations between organic molecules and known mineral catalysts, the distribution of organic molecules in highly variable samples, and small organic-rich regions in otherwise low organic content samples. These instruments are minimally destructive, and some of the Task A material could potentially be used for Tasks B and C, though to be conservative, sample mass calculations do not assume reuse of Task A material.
The preparation of subsamples for imaging depends on sample lithology (e.g., basalt, sandstone, regolith), and multiple preparations would likely be done for each subsample. Lower-resolution imaging could be conducted on intact rough surfaces by using adaptive focusing, but a flat surface is required to obtain high accuracy and precision for some types of measurements. For these measurements, we assume that it would be necessary to prepare grain mounts and/or fresh fracture surface mounts. Drill cuttings and dust could be prepared for imaging, which represents an additional data source that does not require use of the main sample. Each subsample would need to be imaged at low magnification (to establish sample context and to target areas for further investigation) and high magnification (for high-resolution characterization) using several types of microscopy (e.g., transmitted, reflected, and plane polarized) on slabs and polished grain mounts.
High-resolution assessment of chemical heterogeneity, specifically organo-mineral associations and organic associations with physical structure, would be performed using Raman/DUV spectroscopy and IMS. Both provide information about the spatial distribution of sample constituents, but they have different strengths and limitations. The complementary information obtained from optical spectroscopy and IMS can yield a more complete picture than is available by either technique alone. The combination of these techniques has been used to characterize organic species in association with mineral catalysts on several martian meteorites, revealing a previously unknown abiotic synthesis mechanism in martian materials. In operation, it is preferable to triage a sample with Raman spectroscopy to pinpoint hotspots of organic carbon and to gain context mineralogy for the sample and the organic material. These hotspots can then be precisely targeted by IMS techniques, and a fuller picture of the molecular character of any Raman signals can be gained.
Microscopy should be performed on the samples to assess spatial heterogeneity, document textural and mineralogical phases, and target regions of interest. Different forms of microscopy would be necessary for this task. Reflected light microscopy would be needed to assess mineral phase heterogeneity, surface features of interest, and porosity. Transmitted light microscopy would allow for assessment of the presence of inclusions/gross morphologies, distribution of opaque phases (spinel grains, etc.). Additionally polarized light microscopy would be necessary for preliminary mineralogical analysis of the sample. Although polarized and transmitted light microscopy are traditionally performed on thin sections, here we recommend their use on polished grain mounts, due to contamination concerns in the SRF (Sessions et al., 2025).
In addition to traditional microscopy, the SSAP recommends SEM to provide ultra-high-resolution images of the samples. SEM is a microscopic technique that scans samples with a focused electron beam; the beam causes the sample to emit secondary electrons, which are detected to image sample topography. SEM is typically paired with energy dispersive spectroscopy, which provides elemental information. The much higher resolution of SEM compared to light microscopy and optical spectroscopy allows interrogation of mineral structures and sample topography, including cracks or crevices that might preserve organic molecules or morphological features indicative of biology, such as potential microbial cells and biofilms (Steele et al., 2012).
Subsamples for SEM should be selected from areas of interest identified by light microscopy and Raman mapping; these areas can be subsampled and mounted for analysis by SEM with energy-dispersive X-ray spectrometry, thus limiting the size of samples imaged to submilligram levels. During SEM imaging, submicron elemental mapping of carbon, nitrogen, and common rock-forming elements can be performed to give chemical context to the samples by combining high-resolution imaging with energy-dispersive X-ray analysis. Samples analyzed by SEM are typically coated to increase conductivity and to improve resolution on samples that readily charge. However, for some subsamples, coating may need to be avoided to prevent contamination or to preserve material for downstream analysis, so mitigation strategies such as variable pressure systems should be investigated.
Raman spectroscopy has a mapping capability that provides high-resolution mineral and organic material identification. The inclusion of confocal systems (recommended by SSAP) increases optical resolution and enables reconstruction of three-dimensional images. At its highest magnification (3 pixels per micron), Raman can detect single microbial cells (Schie and Huser, 2013; Serrano et al., 2014; Steele et al., 2016) and has been used to map the distribution of organic material and terrestrial contaminating organisms to submicron features in martian samples (Steele et al., 2012). Raman systems have been adapted to remain in focus on rough samples and, therefore, can be used at a range of resolutions and depths within a sample, enabling imaging and analyses within fractures and porous materials, dependent on material (e.g., Neuville et al., 2014). Systems for high-resolution Raman mapping contain high-performance, high-magnification light microscopy with X, Y, and Z extended focal imaging and mapping. Although we discuss microscopy and optical spectroscopy somewhat separately, they can be integrated into the same instrumentation system. Such systems using piezo stage controllers, in combination with more traditional mechanical stage controllers, are capable of mapping from many centimeters to the nanometer size range with great precision, and both Raman and infrared spectroscopy can achieve resolution at or below (using tip-enhanced techniques) the diffraction limit of the laser/light excitation used (Steele et al., 2016).
In practice, Raman spectroscopy can identify functional groups at very high spatial resolutions and has been used, for example, to detect organic material in microbial fossils and pigments in viable single-celled bacteria (Jehlička et al., 2014; Steele et al., 2016). The pattern of peaks produced can be indicative of individual organic molecules, but in practice, viable life tends to produce a distribution of peaks representative of proteins, nucleotides, and cell wall components (e.g., Marshall et al., 2006; Serrano et al., 2014). In fossil life, these components are no longer present as individual peaks but as D and G bands (Pasteris and Wopenka, 2003), which are not useful in distinguishing biotic material from abiotic material, since they are also a signature of insoluble organic material in ancient geological samples from Earth, carbonaceous meteorites, and martian macromolecular carbon (Bower et al., 2013, 2016; Steele et al., 2016). In practice for the SSAP, the presence of D and G bands and no other signs of organics would be an indicator that no biohazards are present in the sample. But other spatially resolved techniques, such as IMS, are necessary to complement Raman analyses to reach an acceptable level of certainty.
Infrared (IR) spectroscopy is another commonly used technique for assessing the organic functional groups and mineral bands in geological samples and is complementary to Raman spectroscopy. IR can, for example, be used to detect organic compound classes such as alkanes, organic acids, and amides (Bullen et al., 2008) and is typically more sensitive to polar organic molecules than Raman spectroscopy (Olcott Marshall and Marshall, 2015). IR and, in recent years, nano-IR have been used extensively to map minerals and organic material in fossils and meteorites (Dominguez et al., 2014; Matrajt et al., 2004; Yesiltas et al., 2020, 2024). In the SRF, IR could be installed on the same instrument as Raman spectroscopy, allowing for colocated analyses. A variety of micro- and nano-IR techniques are available, which have different spatial resolutions, and these should be considered when selecting SRF instrumentation. For instance, micro Fourier transform infrared spectroscopy (µFTIR) has a spatial resolution of ∼10 µm, which does not reach single-cell size for some microorganisms (Suzuki et al., 2025). Alternatively, optical–photothermal infrared (O-PTIR) spectroscopy can reach ∼0.5 µm and has been shown to be able to detect trace organics related to microbial organisms in Mars analog samples using thick sections (Suzuki et al., 2025).
Imaging Mass Spectrometry
IMS (e.g., laser desorption ionization [LDI], time of flight secondary ion mass spectrometry [ToF-SIMS] and desorption electrospray ionization [DESI]) enables the analysis of individual organic molecules sputtered from a sample surface by a focused ionization beam (i.e., laser, ions, solvent) into a mass spectrometer. These techniques are highly sensitive and, depending on the type of instrument, can supply highly spatially resolved mass spectrometry (∼100 nm) over a very wide mass range (up to 10,000 Daltons). Because of these capabilities, IMS techniques are used to measure specific organic (and in some cases inorganic) molecular distributions including polyaromatic hydrocarbons, thiophenes, alkanes, lipids, and amino acids in samples such as meteorites, bacterial biofilms, and fossils (Beech et al., 1999; Dunham et al., 2016; Greenwalt et al., 2013; Kuznetsov et al., 2015; Ma et al., 2023; McKay et al., 1996; Meng et al., 2023; Siljeström et al., 2017, 2022; Steele et al., 2001, 2012, 2018; Stephan, 2001; Stephan et al., 2003; Taylor et al., 2021; Toporski et al., 2002; Toporski and Steele, 2004). As such, IMS would complement extraction-based analyses (i.e., Tasks B and C) by providing contextual molecular data mapped to the mineralogy and morphology of any features of interest.
IMS is an area of active development, including its application to microbial detection, and precise instrumentation would likely be determined in the future based on technological advances. Commercial benchtop IMS systems (e.g., ToF-SIMS, LDI) already exist that could be used to obtain spatially resolved organic mass spectroscopic information, though improvements in terms of sensitivity, mass range, mass resolution, and spatial scale are needed. One promising IMS technology for the identification of intact biomolecules at the necessary sensitivity, mass range, mass resolution, and spatial scale is a laser coupled to an Orbitrap with postionization/ion trap signal enhancement and deconvolution. The high-resolution mass accuracy of the Orbitrap would allow the precise chemical formula to be calculated from the measured peaks and the separation of organic and inorganic contributions to be detected and mapped (Kotowska et al., 2025; Ray et al., 2024). If further developed, these types of instruments would be practical for use in a high-containment setting like the SRF.
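The step from a high-accuracy mass measurement to a chemical formula can be illustrated with a minimal sketch. It is not the SSAP's or any vendor's algorithm. It simply enumerates C/H/N/O compositions whose monoisotopic mass falls within a ppm tolerance of a measured value. The function names, element limits, and the 5 ppm tolerance are assumptions chosen for illustration. The sketch works on neutral masses and applies no valence or isotope-pattern checks, which a real workflow would need, along with adduct and charge-state handling.

```python
from itertools import product

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def neutral_mass(counts):
    """Monoisotopic mass of a neutral composition, e.g. {"C": 2, "H": 5}."""
    return sum(MONO[el] * n for el, n in counts.items())


def candidate_formulas(measured_mass, ppm_tol=5.0, max_counts=None):
    """Brute-force search for compositions within ppm_tol of measured_mass.

    max_counts bounds the search space; the defaults are arbitrary
    illustrative limits, not recommended values.
    """
    if max_counts is None:
        max_counts = {"C": 10, "H": 22, "N": 4, "O": 6}
    elements = list(max_counts)
    window = measured_mass * ppm_tol * 1e-6  # absolute tolerance in u
    hits = []
    for combo in product(*(range(max_counts[el] + 1) for el in elements)):
        counts = dict(zip(elements, combo))
        mass = neutral_mass(counts)
        if mass > 0 and abs(mass - measured_mass) <= window:
            formula = "".join(
                f"{el}{n if n > 1 else ''}" for el, n in counts.items() if n
            )
            ppm_error = (mass - measured_mass) / measured_mass * 1e6
            hits.append((formula, mass, ppm_error))
    return hits


# Example: a neutral mass close to that of glycine (C2H5NO2).
for formula, mass, ppm_error in candidate_formulas(75.03203):
    print(formula, round(mass, 5), round(ppm_error, 2))
```

The tighter the mass accuracy, the fewer compositions survive the tolerance window, which is why high-resolution analyzers help separate organic from inorganic contributions at the same nominal mass.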
These spatially resolved techniques are not currently sensitive to isomerization or capable of quantitative measurements, thus requiring solvent extraction of bulk samples and mass spectrometry techniques for these important data. The combination of in situ spectroscopy/spectrometry and bulk analysis can provide quantitative data of the molecular composition, its context within the matrix (within specific morphological features or associated with catalysts known to facilitate abiotic synthesis reactions), and the distribution of key isomers, which will be essential in evaluating the resultant data.
Task B: Organic Molecule Characterization Using Bulk Methods for Comparison to the Abiotic Baseline
Rationale, introduction to analysis methods, and sample preparation
Task B contains the primary measurements for bulk organic molecule characterization and comparison of the organic inventory of MSR samples to the abiotic baseline (Table 2). The three recommended measurements rely on mass spectrometry, which is invaluable for identifying organic species associated with both biotic and abiotic chemistry due to its high sensitivity, ability to detect a diversity of organic molecules, and ability to target specific molecules of interest. Though these techniques are destructive, they are the primary techniques for understanding the organic composition of carbonaceous meteorites, martian organics (both in meteorites and on Mars), asteroid return samples, and organic matter preserved in geologic and environmental samples on Earth, and for detecting biology against an abiotic background (e.g., Eigenbrode et al., 2018; Freissinet et al., 2015; Glavin et al., 2013; Hallmann et al., 2022; Millan, Teinturier, et al., 2021; Millan, Williams, et al., 2022; Stern et al., 2022).
Summary of Proposed Organic Characterization Measurements
GC-MS and LC-MS instrumentation can be customized based on target analytes and desired resolution (e.g., the addition of an ion trap mass analyzer, column selection).
Polar and nonpolar extractions would be performed on the same material. Extraction order, solvents, and conditions TBD (see “Recommended Research and Development Priorities” section).
For mass allocation rationale, see the section entitled “Sample Mass Calculations for the Abiotic Baseline.”
GC-MS = gas chromatography mass spectrometry; LC-MS = liquid chromatography mass spectrometry.
The need for several instruments is due to variable analyte sensitivity. Generally, LC-MS is used for polar organic molecules, while gas chromatography mass spectrometry (GC-MS) is used for nonpolar organic molecules that are volatile or semi-volatile. GC-MS and LC-MS both use chromatography coupled with mass spectrometry to first separate complex mixtures of molecules by retention time according to either volatility (GC-MS) or polarity (LC-MS) and then provide mass spectra to identify those separated compounds. For example, LC-MS is necessary for the detection of peptides, nucleic acid polymers, and even large signaling molecules, as well as for assessing chirality, since these compounds are polar, highly functionalized, and have high molecular weights that make them nonvolatile (i.e., not GC-amenable). When GC-MS or LC-MS is operated in a survey (nontargeted) mode called a full scan, the resulting chromatograms provide mass spectra for a wide range of analytes (5–2,000 m/z).
Extraction solvents and conditions affect which analytes are recovered from the sample matrix. LC-MS can use water as the mobile phase, thus separating potential biomolecules in the same solvent that an organism is likely to use for metabolic functioning. However, we also recommend allocating a portion of the polar extract for acid hydrolysis to determine if any bound chemical precursors (e.g., polymers) were originally present in the extract (see later discussion for Task C).
Meteorites are good analogs to better understand organic compounds that are detectable using these techniques. Compound classes such as amino acids, branched fatty acids and aldehydes, olefins, and polyaromatic hydrocarbons have been detected in meteorites (e.g., Schmitt-Kopplin et al., 2023; Glavin et al., 2018). On Earth, these techniques are also used to find biomarkers (organic compounds of biological origin) that are typically lipids or diagenetic products of lipids. Some examples of Earth-based biomarkers are hopanoids (bacteria), sterols (eukaryotes), and glycerol dialkyl glycerol tetraether lipids (archaea). Given these are Earth-based nonagnostic compounds, we do not expect to find these specific compounds in the martian samples. Potential agnostic indicators of biological activity that these instruments are capable of detecting include amino acids with a preferential chirality and the presence of alkanes or fatty acid chains of distinct lengths or branching patterns.
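One of the agnostic indicators named above, a preferential chirality in amino acids, is commonly summarized as an enantiomeric excess. The sketch below uses the standard definition, ee = 100 × (L − D) / (L + D), applied to the measured abundances of the two enantiomers of a single amino acid. The function name and the example values are illustrative assumptions. No decision threshold is implied, since the protocol evaluates such measurements against the abiotic baseline rather than against a fixed cutoff.

```python
def enantiomeric_excess(l_abundance, d_abundance):
    """Percent enantiomeric excess of the L form over the D form.

    Inputs are abundances of the two enantiomers in the same units
    (e.g., integrated peak areas or concentrations). The result is 0 for
    a racemic mixture, +100 for pure L, and -100 for pure D.
    """
    if l_abundance < 0 or d_abundance < 0:
        raise ValueError("abundances must be non-negative")
    total = l_abundance + d_abundance
    if total == 0:
        raise ValueError("at least one enantiomer must be detected")
    return 100.0 * (l_abundance - d_abundance) / total


# Hypothetical peak areas for one amino acid in one extract.
print(enantiomeric_excess(75.0, 25.0))
```

In practice the value would be reported with its measurement uncertainty and compared across replicates, because small excesses can fall within analytical error.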
Stepwise pyrolysis GC-MS
Stepwise pyrolysis (py-GC-MS) to analyze insoluble organic material would complement extraction-based techniques, such as GC-MS and LC-MS. This method uses stepwise heating to release compounds from their macromolecular matrix and volatilizes them so they can be analyzed with GC-MS. Importantly, py-GC-MS is onboard the Curiosity Rover in the Sample Analysis at Mars (SAM) instrument suite and was included on Viking (albeit at lower resolutions). Py-GC-MS data from SAM and Viking are the only compound-specific organic measurements collected on Mars. The samples analyzed by SAM and Viking are analogous to Perseverance samples because they are sourced from similar depths in the martian regolith/bedrock, are of similar ages, and lack alteration signals found in meteorites. Pyrolysis GC-MS can be performed using the same GC-MS as used for solvent extracts with an added front end that can perform pyrolysis. This technique is typically performed on the remaining solid sample residue after organic extractions for GC-MS and LC-MS. In our flow, we propose that pyrolysis of insoluble material be undertaken on the residue that has been solvent extracted. Pyrolysis of non-solvent-extracted material could also be undertaken if desired or deemed necessary for the statistical model, which may help in comparing analyses performed at Gale Crater by the SAM instrument.
It is also possible to add a derivatizing agent to pyrolysis to make the compounds more GC amenable and more readily detectable. A Mars-relevant derivatizing agent is tetramethylammonium hydroxide (TMAH), which is commonly used for fatty acid analysis and has been used on Mars at Gale Crater (Williams et al., 2021). TMAH methylates fatty acid carboxyl groups to make them more volatile and thus more readily detectable than nonmethylated fatty acids. Compounds so far detected on Mars by using derivatization pyrolysis include aromatic and polyaromatic molecules (with and without chlorination) and sulfur compounds like thiols (Eigenbrode et al., 2018; Freissinet et al., 2015; Millan, Teinturier, et al., 2021; Millan, Williams, et al., 2022; Szopa et al., 2020). Other compound classes detectable with pyrolysis, if present, include alkanes, alkenes, ketones, alcohols, and polysaccharides.
Task C: Large Molecule and Peptide Characterization (Contingent)
Task C would be triggered if a biotic signal that rises above the abiotic baseline is detected or if there is not sufficient certainty to make that determination based on outputs from Tasks A and B. This includes a lack of statistical certainty and/or observations that suggest further investigation is warranted. Task C would characterize polymers and other large molecules using liquid chromatography high-resolution mass spectrometry (LC-HRMS) with the ability to operate in tandem mass spectrometry/mass spectrometry (MS/MS) mode (Table 3). Polymers are rare in extraterrestrial material. For example, although the amino acid diversity in meteorite samples is high, peptides (chains of two or more amino acids) are very rare. Specifically, the only peptides that have been detected in meteoritic samples are diglycine and a cyclic glycine polymer (Shimoyama and Ogasawara, 2002); however, these were in low abundance (pmol/g). No peptides with three or more amino acids have been identified with confidence (Parker et al., 2023), although abiotic synthesis of polypeptides has been demonstrated in laboratory experiments (Comte et al., 2023; Imai et al., 1999; Krasnokutski et al., 2024).
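The trigger condition for Task C described above reduces to a short piece of decision logic. The sketch below is only a restatement of that paragraph in code. The function and parameter names are hypothetical, and it says nothing about how the statistical model decides whether a signal exceeds the abiotic baseline or whether certainty is sufficient.

```python
from typing import Optional


def task_c_triggered(
    signal_above_baseline: Optional[bool],
    further_investigation_warranted: bool = False,
) -> bool:
    """Return True if the contingent Task C analyses would be run.

    signal_above_baseline:
        True  -> Tasks A and B found a signal exceeding the abiotic baseline
        False -> Tasks A and B found no such signal, with sufficient certainty
        None  -> Tasks A and B could not decide with sufficient certainty
    further_investigation_warranted:
        True if other observations suggest a closer look is needed.
    """
    undetermined = signal_above_baseline is None
    return (
        undetermined
        or bool(signal_above_baseline)
        or further_investigation_warranted
    )


print(task_c_triggered(None))   # insufficient certainty
print(task_c_triggered(False))  # confident null result
```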
Summary of the Contingent Organic Characterization Measurements Proposed as Part of Abiotic Baseline
For mass allocation rationale, see the section entitled “Sample Mass Calculations for the Abiotic Baseline.”
LC-HRMS = liquid chromatography high-resolution mass spectrometry; MS/MS = tandem mass spectrometry/mass spectrometry.
LC-HRMS with the ability to operate in MS/MS mode enables structural characterization of large molecules by measuring the masses of their fragments. Fragment mass data allow the building blocks of a large molecule to be identified, and these can then be used to reconstruct its intact molecular structure. MS/MS analysis is also used to resolve and identify compounds that coelute, which is a challenge in the analysis of complex organic matter found in geologic materials (Radović et al., 2016).
The extensive body of reported LC-HRMS analyses of both biological and meteoritic extracts allows very specific comparison with previous observations. As with any mass spectrometry, LC-HRMS requires careful analysis of the masses generated, in the context of the separation chemistry, to correctly assign a molecule’s identity; this process is greatly improved by the use of MS/MS to interrogate the fragments produced from particular parent masses. False positives may result from reliance on spectral libraries that overrepresent biological molecules compared to geomolecules.
Diverse nucleobases, including those that are associated with biological processes, are often found in meteorites (Callahan et al., 2011; Martins et al., 2008; Oba et al., 2023), but there is no evidence of polymerization. While there are no reported observations of nucleosides in meteoritic samples, there is experimental evidence that points toward prebiotic nucleoside synthetic pathways relying on meteoritic material (Saladino et al., 2015), as well as nucleoside phosphorylation reactions that can be performed on meteorite samples (Bizzarri et al., 2020). Any potential detection of a nucleobase dimer, nucleoside, or nucleotide would suggest the need for further investigation or optimized sample preparation to capture these species. We also note that sugar polymers or other multisubunit repeating molecules would be detected by this analysis.
Information obtained from ensuring Earth’s safety from a potential martian biohazard overlaps with proposed MSR scientific objectives of searching for evidence of life (e.g., Carrier et al., 2025). The proposed measurements and data that would be generated by both safety and scientific objectives are similar, in some cases identical, and they share the common goal of detecting life in the same samples. We recommend that both should be undertaken cooperatively, as the complementary data sets will be useful for science and policy. This approach would maximize sample utilization for science objectives and Planetary Protection goals (e.g., Kminek et al., 2022).
Lessons from terrestrial analogs
Sample mass estimates for characterizing the abiotic baseline are shown in Tables 1–3. The SSAP abiotic baseline approach was designed to be flexible, enabling it to be updated as technology advances and more information becomes available. Consequently, the precise sample masses necessary for analysis will evolve, and requirements may decrease as technology develops. Obtaining precise mass estimates will require: (1) testing the recommended instruments on highly representative compositional and physical analogs (e.g., Thorpe et al., 2024) and (2) optimizing protocols for low mass inputs. Currently, these data do not exist. While there is a rich body of literature on meteorites, asteroids, and low-biomass terrestrial environments, many knowledge gaps remain. Below, we highlight important considerations that influenced our conclusions and estimates.
Low-biomass samples from Earth are typically collected in large enough quantities to compensate for low analyte concentrations and/or detection limits. The limits of instrument performance on terrestrial analogs have not been challenged using small amounts (mg) of sample material combined with the low analyte concentrations that are expected in returned samples. Data are typically derived from grams rather than mg of material (e.g., compare Amashukeli et al., 2007, with Zeichner et al., 2023). The unique problem for SSAP is that we expect low analyte concentrations and minimal sample mass, and we require replicate analyses. Another limitation of terrestrial analog studies in this context is that analytes targeted by these investigations address specific research objectives that differ from MSR. They often include DNA-based methods, which have the advantage of amplifying the signal if needed (Dragone et al., 2021), but limit direct comparisons to SSAP.
Despite the aforementioned differences, knowledge gained from terrestrial analogs has informed SSAP and MSR efforts. Examples of relevant studies include characterization of low concentration substrates in carbon-limited environments (e.g., 0.001–0.78 ppb PAHs and 1–70 ppb amino acids in the Atacama Desert) (Amashukeli et al., 2007; Mörchen et al., 2019, 2024) and investigation of life at the limits of survival (Azua-Bustos et al., 2012; Ford et al., 2024; Goordial et al., 2016; Horstmann et al., 2024). Future studies of Earth analogs relevant to the MSR samples and testing grounds for analytical techniques will likely play a crucial role in technique development and the interpretation of data generated from returned samples. Low-biomass terrestrial analogs with similar lithology as the MSR samples have been exposed to terran life, and they provide the opportunity to understand life’s impact on its environment, which will be necessary to interpret the data from returned samples. Life alters its environment by using substrate from both abiotic and biotic sources for carbon and energy while generating waste products and necromass. These processes likely result in a signal or “fingerprint” that is different from a physicochemically matched sample that has not been acted upon by generations of life. It will be important to understand the limitations of analog studies of organic matter, as Earth-based alteration processes (such as plate tectonics and burial) will have significant effects on what organic molecules are preserved, differentiating them from Mars samples (e.g., Teece et al., 2024). This argues for a broad spectrum of Earth analogs, including those in Earth’s ancient rocks and in the subsurface, where the overprinting of abiotic characteristics by biological signals is lowest (e.g., slow life biomes) (Ford et al., 2024; Hoehler and Jørgensen, 2013; Onstott et al., 2014; Sherwood Lollar et al., 2024; Trembath-Reichert et al., 2017).
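The combined effect of low concentration and small sample mass can be made concrete with a short calculation. The sketch below converts a concentration in ppb to the absolute analyte mass present in a sample. It assumes that ppb is by mass (1 ppb = 1 ng of analyte per g of sample). The 50 mg input mass is used only as an illustrative milligram-scale value, and the function name is hypothetical. It also ignores extraction efficiency and splitting of extracts among instruments, both of which would reduce the amount actually delivered to a detector.

```python
def analyte_mass_ng(sample_mass_mg, concentration_ppb):
    """Analyte mass (ng) in a sample, assuming ppb by mass (ng per g)."""
    sample_mass_g = sample_mass_mg / 1000.0
    return sample_mass_g * concentration_ppb


SAMPLE_MG = 50.0  # illustrative milligram-scale input

# Concentration ranges quoted above for the Atacama Desert.
ranges_ppb = {
    "amino acids": (1.0, 70.0),
    "PAHs": (0.001, 0.78),
}

for analyte, (low, high) in ranges_ppb.items():
    lo_ng = analyte_mass_ng(SAMPLE_MG, low)
    hi_ng = analyte_mass_ng(SAMPLE_MG, high)
    print(f"{analyte}: {lo_ng:.5f} to {hi_ng:.3f} ng in {SAMPLE_MG:.0f} mg")
```

Under these assumptions, a 50 mg sample at the quoted Atacama concentrations would contain roughly 0.05 to 3.5 ng of amino acids and well under 0.1 ng of PAHs, which illustrates why detection limits at low mass inputs are an open question.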
Lessons from extraterrestrial analogs
As with terrestrial samples, studies of meteorites and returned asteroid material have different research goals and analyte targets compared with SSAP. For example, OSIRIS-REx (a mission that returned samples from asteroid Bennu to Earth in 2023) objectives included a comprehensive analysis of soluble organics, requiring orders of magnitude more material than would be available to SSAP (Lauretta et al., 2023). One common and important difference is the relative excess of carbon in most of these samples compared to the low concentrations expected in MSR samples (Sharma et al., 2023). Asteroids Ryugu (target of the Hayabusa2 sample return mission) and Bennu are rich in organics (e.g., total organic carbon of 4.5 wt % in Bennu) (Lauretta et al., 2024; Nakamura et al., 2022), and most meteoritic organic chemistry studies are on carbonaceous chondrites (e.g., Elsila et al., 2016; Steele et al., 2016). These differences lead to uncertainties when attempting to project required sample masses for SSAP.
Though we cannot project the exact mass required for safety assessment, the practical experience gained from extraterrestrial analogs to martian samples has guided SSAP recommendations and estimates. For meteorites and returned asteroid samples, material is limited to what has either fallen to Earth or has been brought back from space. Consequently, scientific analyses have required using the smallest possible amount of these samples while still ensuring reliable results. This constraint necessitates sensitive analytical techniques, sample preparation, and handling protocols (Aléon-Toppani et al., 2021; Chan et al., 2020; Dworkin et al., 2018; Genge et al., 2025; Naraoka et al., 2019; Steele et al., 2022; Summons et al., 2014). As a result, research on extraterrestrial material has been pushing the boundaries of what is possible in mass-constrained samples for years (e.g., Callahan et al., 2014; Koga et al., 2024; Simkus et al., 2019; Steele et al., 2012, 2018). Importantly, these data indicate that the proposed protocol can be accomplished by consuming just a small fraction of returned material (Table 4).
Selected Examples of Sample Safety Assessment Protocol-Relevant Analyses of Meteorite and Returned Asteroid Material Including Sample Mass Consumption, Instrumentation, and Analytes
The Hayabusa mission to asteroid Itokawa returned less than 1 g of sample to Earth. Since their sample material is so limited, no bulk organic characterizations like those recommended by SSAP have been performed. However, imaging methods that are recommended by SSAP (e.g., Raman spectroscopy) have provided detailed information about the organic content (Chan et al., 2021).
Material allocated for Bennu triage.
Combining the information derived from analog studies and expert elicitation (Table 4), the SSAP-TT concluded that a practical lower limit for generating quantitative organic concentrations in Task B using today’s instrumentation is ∼50 mg for each subsample, though this estimate could decrease with future instrument development. To account for uncertainties, we propose a mass allocation range of 50–200 mg per subsample (Tables 1–3). For high spatial resolution imaging (Task A), we propose a 20 mg allocation per subsample. For Task C, which would only be triggered if evidence of a biotic signal is detected, we propose a 100–200 mg allocation per subsample. There are two arguments in SSAP’s favor relative to these estimates. First, meteorite and asteroid research has been pushing the boundaries of sample mass requirements, producing reliable quantification of organic material, often using sub-50 mg inputs (see selected examples in Table 4). Second, based on current proposed sample return dates, the SSAP measurements would begin in approximately 10–15 years. Over that time, instrument performance improvements and protocol optimization can be expected to decrease limits of detection (LOD) and mass input requirements.
The subsample sizes for Task B of the SSAP could be increased to 200 mg if necessary. This would not be ideal from the perspective of sample conservation, but it exceeds the sample sizes used in Table 4 and is consistent with masses used for LC-MS analysis of Atacama Desert samples (Aerts et al., 2020).
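The per-subsample allocations proposed above (20 mg for Task A, 50–200 mg for Task B, and 100–200 mg for the contingent Task C) can be combined in a simple budget calculation. The sketch below is illustrative only. It takes the ∼20 g per-tube mass and the "10% or less" consumption estimate quoted elsewhere in this article as inputs, and asks how many subsamples would fit within that budget. The number of subsamples and replicates per tube is not specified here, so it is treated as the unknown. All function and variable names are hypothetical.

```python
# Proposed per-subsample allocations in mg as (low, high) bounds.
TASK_MG = {"A": (20, 20), "B": (50, 200), "C": (100, 200)}


def per_subsample_mg(include_task_c):
    """Total (low, high) mass per subsample for Tasks A+B, optionally +C."""
    tasks = ["A", "B"] + (["C"] if include_task_c else [])
    low = sum(TASK_MG[t][0] for t in tasks)
    high = sum(TASK_MG[t][1] for t in tasks)
    return low, high


def max_subsamples(tube_mass_mg, budget_percent, mg_per_subsample):
    """Whole subsamples that fit within a percentage of one tube's mass."""
    budget_mg = tube_mass_mg * budget_percent // 100
    return budget_mg // mg_per_subsample


TUBE_MG = 20_000      # ~20 g per tube
BUDGET_PERCENT = 10   # illustrative budget based on the "10% or less" estimate

for with_c in (False, True):
    low, high = per_subsample_mg(with_c)
    label = "A+B+C" if with_c else "A+B"
    print(
        label,
        f"{low}-{high} mg per subsample,",
        max_subsamples(TUBE_MG, BUDGET_PERCENT, high),
        "to",
        max_subsamples(TUBE_MG, BUDGET_PERCENT, low),
        "subsamples within budget",
    )
```

Under these assumptions, the budget for one tube accommodates between 9 and 28 subsamples for Tasks A and B alone, and between 4 and 11 if Task C is triggered for every subsample. Actual consumption would depend on replicate requirements and on the statistical design.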
Sample return missions from asteroids Bennu, Ryugu, and Itokawa provide experience that has informed SSAP, including providing a framework for how to plan and execute returned sample analyses (Chan et al., 2021; Dworkin et al., 2018; Lauretta et al., 2023; Schmitt-Kopplin et al., 2023). The total mass of Bennu samples (121.6 g) is greater than what is expected for any one of the martian samples (∼20 g per tube) (Lauretta et al., 2024), so the OSIRIS-REx soluble analysis investigation has budgeted more material than SSAP for peptide and nucleobase analysis (see the OSIRIS-REx Sample Analysis Plan: Lauretta et al., 2023). However, this does not imply that the SSAP would require sample amounts equivalent to Bennu allocations. OSIRIS-REx objectives require a comprehensive suite of analytes that include compound classes not necessary for safety assessment of returned martian samples.
OSIRIS-REx conducted an organic molecule triage study to determine which lithologies are worth consuming larger quantities of sample. We recommend that similar studies to select target analytes and optimize protocols for SSAP be performed. The mass allocation for Bennu triage comprises 20 mg for EA-IRMS (elemental analyzer–isotope ratio mass spectrometry), 2 mg for pyrolysis GC-MS, 2 mg for micro Fourier transform infrared spectroscopy, 20 mg for nucleobases using LC-HRMS, and the remaining mass for LC-MS and LC-HRMS for chiral amino acids, amines, and ammonia and GC-MS for carboxylic acids (Lauretta et al., 2023).
Planetary Protection Strategy and the Abiotic Baseline
Traditional Planetary Protection strategies
Since the Viking missions (launched in 1975), Planetary Protection policy for forward contamination has relied on detecting and counting live organisms as a measure of life detection (Benardini and Lalime, 2025; Office of Safety and Mission Assurance, 2022). This strategy was the logical result of available scientific knowledge and technology of the time, which depended on culturing and/or microscopy to find and identify microorganisms. As science and technology have progressed, Planetary Protection protocols have been updated to include tests such as endospore detection, DNA sequencing, ATP detection, and viability assays. However, direct interrogation of microorganisms remains a central focus. In that context, an accepted measurement would be to assay for biological cells and look for a result of <1 cell.
The past 10–20 years have seen a revolution in the ability to interrogate and understand microbial communities in their environmental context. Most microbial life cannot be readily grown in the laboratory, so notable advances include methods that enable the study of the unculturable majority (Pachter, 2007) and that provide a better understanding of biology that blurs the line between living and nonliving (e.g., viruses and prions). The rapidly growing fields of astrobiology and geomicrobiology have played an important role in these developments, advancing our ability to detect life in low-biomass terrestrial analogs, understand the environmental context in which it occurs, and determine how life, in turn, alters its environment. Many of these updated approaches have the added benefit of being agnostic. We do not know what molecules might be used by potential martian biology, so interrogating the whole molecular signal rather than relying on single molecules or traditional microbiological assays is better suited to the search for biology that may be substantially different from Earth life.
Recent work addressing backwards Planetary Protection (Box 2), including the SSAF and SSAP, shares a philosophical approach, which is to search for indicators of life expected from biological processes (Kminek et al., 2022). Both the SSAF and SSAP agree that, if modern martian biology is absent, the sample would be considered safe for release. Regarding the abiotic baseline approach presented in this article, both the SSAF and SSAP center around a test sequence that examines three-dimensional sample structure with spatially resolved chemical and mineralogical information, coupled with bulk analyses of organic molecules that includes a targeted search for large molecules.
Currently, there is no community-agreed-upon protocol for sample safety assessment. The SSAF established the framework, and we present a proposed protocol to be vetted by the scientific community, with the goal of adoption and implementation. Since scientific technology is expected to advance between publication of this document and sample return, and community input has the potential to improve aspects of the protocol, the SSAP proposes a responsive implementation/assessment approach to inform decision-making on sample release from containment. Its flexibility enables us to take full advantage of recent scientific and technological progress and incorporate community feedback. This proposal is consistent with recent developments in NASA’s Planetary Protection policy that allow missions to use novel methods to meet Planetary Protection requirements (Benardini and Lalime, 2025).
Is the ability to detect single cells necessary for the SSAP?
Molecular geochemistry that targets organic chemistry and inorganic chemistry and describes their codistributions takes advantage of the integrated biotic signal or fingerprint from the prior generations that have inhabited a location. Using a molecular geochemical approach would represent a major expansion of the traditional Planetary Protection strategy. This biotic fingerprint produces volumetrically and compositionally larger signals than is possible for single-cell assays, improving our ability to detect biology in low-biomass environments and to avoid false negatives from cell counting or culturing approaches. It is also unlikely that a single viable cell, virus, or prion (or the potential martian equivalents) would exist in isolation in the returned samples. The replication processes needed for survival through geologic time tend to produce multiple organisms (along with their biotic “fingerprint”) rather than isolated cells.
Conclusions and Future Directions
This article describes the central proposed strategy for conducting a safety assessment on samples returned from Mars. Applying the abiotic baseline concept to MSR enables interrogation of the biotic signal from potential martian biology and provides a mechanism for conducting safety assessment. We estimate that the entire protocol could consume 10% or less of returned material. The version of the SSAP proposed in this article builds on the initial draft biohazard test protocol proposed by Rummel et al. (2002) published 22 years ago, which was significantly updated and modernized by Kminek et al. (2022) in the SSAF. Key components of the SSAF that were adopted by the SSAP include high-resolution spatial and spectral imagery, a focus on the molecular patterns of organic molecules, and agnostic methods of life detection. The SSAP further develops the approach to safety assessment by incorporating the abiotic baseline framework, which is partially enabled by advances in our understanding of organic synthesis in nature, biosignatures (including the possibility of agnostic ones), and improvements in instrumentation capability. This framework updates how measurements are used to identify signatures of modern biology and provides “off-ramps” at different stages of the protocol if evidence of biology is not detected, allowing samples to be released from high containment. The SSAP also builds upon the Bayesian statistical framework initially proposed in the SSAF, starting with Bayesian design followed by Bayesian analysis, including statistical hypothesis testing, which will be described in an upcoming publication (Cressie et al., article in preparation).
Recommended Research and Development Priorities
As of this writing, the exact timeline for potentially returning samples to Earth from Mars has not been established, but the years 2035–2040 have been proposed—this is 11–16 years from the time of publication. In that time, advances in the field of astrobiology and significant improvements in instrumentation technology are anticipated. The proposed SSAP was developed with the flexibility to incorporate scientific and technological advances. Successful application of the proposed SSAP would require a concerted research and development strategy, which includes a continued focus on the use and development of state-of-the-art science and technology. Below, we outline recommended high-priority research and development objectives related to the abiotic baseline approach presented here.
Objective 1: Establish the martian abiotic baseline
To compare the MSR samples to the martian abiotic baseline, it will be necessary to better characterize the abiotic baseline. The effect of surface processes (e.g., radiation exposure) on organic molecules and the abiotic baseline will be an important consideration because returned samples would be from the martian surface. Better characterization of the baseline would require (1) characterizing the expected signal-to-noise ratios in biotic versus abiotic environments; (2) identifying the abiotic characteristics that define the baseline (nonlife) framework against which potential life signals will be evaluated; and (3) understanding how the abiotic baseline would present when altered by the surface processes on Mars. Existing data can help build this knowledge. A variety of investigations have explored the organic inventory of Mars through in situ robotic exploration and analysis of meteorites (Callahan et al., 2011; Eigenbrode et al., 2018; Freissinet et al., 2015; Glavin et al., 2013; Millan, Teinturier, et al., 2021; Millan, Williams, et al., 2022; Steele et al., 2016, 2018; Williams et al., 2021). These data should be synthesized and explored to understand molecular diversity and concentrations to establish the expected abiotic baseline of Mars. In addition, the data from Mars should be compared with organic inventories from other bodies that definitively do not contain life (e.g., Bennu, the Moon, Ryugu) to better understand whether this is a representative background for the solar system.
Objective 2: Develop and optimize sample preparation protocols, instrument methods, and strategies for data analysis
The expected low concentration of organic carbon in returned samples (e.g., Sharma et al., 2023) combined with the importance of preserving sample material will require determination and optimization of the LOD. The LOD of an instrument is the lowest concentration of an analyte that can be distinguished from background noise. Multiple factors influence the LOD, such as sample characteristics, analyte targets, extraction solvents, instrument selection, and instrument settings, all of which will need to be investigated in the context of the SSAP. Establishing and reducing the LOD will require the use of analogs matched to MSR samples, such as meteorites, those from the Mars Sample Return Analog Collection (e.g., Thorpe et al., 2024), and material from other low-biomass areas identified as Mars analogs (e.g., Atacama Desert). Having the most sensitive instruments to support the detection and identification of a potential biohazard is essential to characterize the samples returned from Mars.
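As an illustration of the LOD definition above, the following minimal Python sketch estimates an LOD from replicate blank measurements. It assumes the widely used convention of placing the threshold k = 3 standard deviations above the mean blank signal, and a linear calibration slope to convert signal to concentration. The function names, the blank values, and the slope are hypothetical and are not part of the SSAP; the appropriate LOD definition for each instrument would need to be determined in the research described here.

```python
import statistics


def lod_signal_threshold(blank_signals, k=3.0):
    """Smallest signal treated as distinguishable from background.

    Assumption: threshold = mean(blank) + k * SD(blank), with k = 3 by convention.
    """
    mean_blank = statistics.mean(blank_signals)
    sd_blank = statistics.stdev(blank_signals)  # sample standard deviation
    return mean_blank + k * sd_blank


def lod_concentration(blank_signals, slope, k=3.0):
    """Convert the blank noise into a concentration LOD.

    Assumption: the instrument response is linear, with `slope` given in
    signal units per unit concentration.
    """
    if slope <= 0:
        raise ValueError("calibration slope must be positive")
    return k * statistics.stdev(blank_signals) / slope


if __name__ == "__main__":
    # Hypothetical replicate blank readings (arbitrary signal units).
    blanks = [10.0, 12.0, 11.0, 9.0, 13.0]
    # Hypothetical calibration slope (signal units per ng/g).
    slope = 2.0
    print("signal threshold:", lod_signal_threshold(blanks))
    print("LOD (ng/g):", lod_concentration(blanks, slope))
```

Because the estimate depends directly on the blank variability and the calibration slope, it would change with the sample matrix, extraction solvent, and instrument settings listed above, which is why analog-matched blanks matter.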
While we have provided a proposed strategy for bulk molecular organic characterization (i.e., GC-MS, LC-MS, pyrolysis GC-MS, and LC-HRMS) and discussed classes of potential analytes (e.g., amino acids, nucleobases, fatty acids, polymers), we have deliberately refrained from specifying particular target analytes or sample preparation protocols. Different sample extraction and preparation methods (e.g., type of extraction, extraction solvents, temperature settings, hydrolysis, derivatization agents, or desalting procedures) target different categories of organic molecules (e.g., Boulesteix et al., 2023; Lau et al., 2010; Simkus et al., 2019). Extraction methods for SSAP samples will likely be similar to those used for meteorites and for samples from the asteroids Ryugu and Bennu (Oba et al., 2023; Yabuta et al., 2023), but we lack sufficient data to make informed choices at this juncture. In addition, it would be advantageous to determine an appropriate sequential extraction protocol that targets compound classes of interest on the same sample aliquot to reduce sample mass usage. These determinations should ultimately be made by a future team of experts specifically convened to focus on this issue.
Organic extraction protocols will need further research using Mars-relevant samples to investigate how different extraction methodologies might affect the resultant analytes, which would need to be studied in combination with the other proposed techniques. For example, one potential line of evidence could be derived from comparing amino acid abundance in acid-hydrolyzed versus nonhydrolyzed polar extracts. Amino acids released after acid hydrolysis of a water extract could indicate that amino acids were originally bound in peptides. However, these amino acids could have been bound to other organic matter or even generated from the nitrile groups on other molecules during the hydrolysis reaction. Further analysis (including in situ analytical studies) would be required to identify the source of the amino acid monomers. Another potential line of evidence would be repeating masses of subunits that indicate a polymer. Further analysis would pinpoint the identity of the monomers and the nature of the parent macromolecule.
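The idea of repeating subunit masses can be made concrete with a short Python sketch. It searches a peak list for the longest series of peaks separated by a constant mass difference, which is one simple way a polymer-like pattern might be flagged for follow-up. The function name, the tolerance, the minimum series length, and the synthetic peak list are all assumptions made for illustration; the peaks are spaced by about 57.02 Da, the monoisotopic mass of a glycine residue. This is not a method specified by the SSAP, and real spectra would also require handling of adducts, isotopes, and charge states.

```python
def find_repeat_series(masses, min_length=4, tol=0.01, min_unit=10.0):
    """Return (unit_mass, series) for the longest evenly spaced peak series.

    Assumptions: `masses` are singly charged peak masses in Da, `tol` is the
    allowed deviation per step, and spacings below `min_unit` are ignored.
    Returns None when no series reaches `min_length` peaks.
    """
    peaks = sorted(masses)
    best_unit, best_series = None, []
    for i, start in enumerate(peaks):
        for second in peaks[i + 1:]:
            unit = second - start
            if unit < min_unit:
                continue
            series = [start, second]
            # Extend the series while a peak exists one unit further on.
            while True:
                target = series[-1] + unit
                match = next((p for p in peaks if abs(p - target) <= tol), None)
                if match is None:
                    break
                series.append(match)
            if len(series) > len(best_series):
                best_unit, best_series = unit, series
    if len(best_series) < min_length:
        return None
    return best_unit, best_series


if __name__ == "__main__":
    # Synthetic peak list: five evenly spaced peaks plus three unrelated peaks.
    peak_list = [150.5, 200.000, 257.021, 290.3, 314.043, 371.064, 402.7, 428.086]
    result = find_repeat_series(peak_list)
    if result is not None:
        unit, series = result
        print(f"candidate repeat unit: {unit:.3f} Da over {len(series)} peaks")
```

A hit from a search like this would only indicate that a repeating mass exists. As the text notes, further analysis would be needed to identify the monomers and the parent macromolecule.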
Though the measurements proposed by the SSAP are well established, data analysis strategies will need to be developed and optimized. Data that would be generated represent multiple data types from several instruments (e.g., mass-to-charge ratios of many compounds, imaging results), and some of the outputs are potentially highly multivariate. We concur with the SSAF (Kminek et al., 2022) that experiments and simulations should be conducted with similar instruments on a variety of analogs to build the measurement populations on which to base statistical models and analyses.
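To illustrate how measurement populations from analogs could feed a statistical comparison against the abiotic baseline, the sketch below applies Bayes' rule to a single scalar feature. It is deliberately simplistic: it assumes one hypothetical feature (for example, a molecular complexity score), Gaussian populations fitted to abiotic and biotic analog measurements, and an arbitrary prior. It is not the Bayesian design and analysis developed for the SSAP, which is to be described separately, and all names and numbers are invented for illustration.

```python
import math
import statistics


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and SD at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posterior_biotic(x, abiotic_values, biotic_values, prior_biotic=0.01):
    """Posterior probability that measurement x came from the biotic population.

    Assumptions: each analog population is Gaussian, and `prior_biotic` is a
    prior chosen by the analyst (the default here is arbitrary).
    """
    like_abiotic = gaussian_pdf(
        x, statistics.mean(abiotic_values), statistics.stdev(abiotic_values)
    )
    like_biotic = gaussian_pdf(
        x, statistics.mean(biotic_values), statistics.stdev(biotic_values)
    )
    numerator = prior_biotic * like_biotic
    denominator = numerator + (1.0 - prior_biotic) * like_abiotic
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical feature values measured on abiotic and biotic analogs.
    abiotic = [1.0, 2.0, 3.0]
    biotic = [7.0, 8.0, 9.0]
    for measurement in (2.0, 5.0, 8.0):
        p = posterior_biotic(measurement, abiotic, biotic, prior_biotic=0.5)
        print(f"feature = {measurement}: P(biotic | data) = {p:.4f}")
```

In practice the outputs are multivariate and come from several instruments, so any operational model would need to be considerably richer than this one-feature example.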
The SSAP-TT identified the need for an IMS instrument capable of providing high-resolution, geospatially resolved mass-specific information. However, the exact type of IMS requires future research because optimization of analytical and spatial resolution is needed to meet SSAP requirements. Future research should determine how to identify areas of interest in the returned samples, the specific imaging components required for colocation, and the type of ionization energy best suited for the samples in order to accurately detect groups of cells or polymers. There are several types of IMS that could be appropriate for the SSAP, but they need further investigation. For example, ToF, DESI, and LDI IMS ionize compounds in different ways, and each can be coupled to different mass spectrometry systems. Characterization of analog samples with different types of IMS would enable a better understanding of which IMS instrumentation has the most efficient ionization mechanisms for the sample matrices and can reach sufficient detection limits.
Significant advances in artificial intelligence and machine learning methods are also expected and could be important for the next generation of data visualization, analysis, and archiving. Research and development in these areas would enable the use of the most modern and best tools available. Finally, future research should perform specific triage studies to determine which types of instruments are best suited to the task. For example, would an Orbitrap be required for GC-MS or LC-MS, or is tandem MS/MS needed to reach the required resolution?
SSAP-TT Membership
Brooke M. Ahern (US Army DEVCOM Chemical Biological Center, Gunpowder, MD, USA); David W. Beaty (NASA Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA); Nicolle Baird (Division of High-Consequence Pathogens and Pathology, Centers for Disease Control and Prevention, National Center for Emerging and Zoonotic Infectious Diseases, Atlanta, GA, USA); Noel Cressie (School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, Australia; NASA Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA); Richard E. Davis (Texas State University/NASA Johnson Space Center); Katherine L. French (Central Energy Resources Science Center, US Geological Survey, Denver, CO, USA); Mihaela Glamoclija (Department of Earth and Environmental Sciences, Rutgers University, Newark, NJ, USA); Heather V. Graham (Solar System Exploration Division, NASA Goddard Space Flight Center, Greenbelt, MD, USA); Kimberly B. Hummel (Division of High-Consequence Pathogens and Pathology, Centers for Disease Control and Prevention, National Center for Emerging and Zoonotic Infectious Diseases, Atlanta, GA, USA); Rachel Mackelprang (California State University Northridge, Northridge, CA, USA); Lisa E. Mayhew (Department of Geological Sciences, University of Colorado Boulder, Boulder, CO, USA); Gerald McDonnell (Johnson & Johnson, Inc.); John McQuiston (Division of High-Consequence Pathogens and Pathology, Centers for Disease Control and Prevention, National Center for Emerging and Zoonotic Infectious Diseases, Atlanta, GA, USA); William Page (NASA Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA); Neil Pearce (Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK); Aaron Regberg (NASA Johnson Space Center); David A. Relman (Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA); Mark Sephton (Department of Earth Science and Engineering, Imperial College London, London, UK); Barbara Sherwood Lollar (Department of Earth Sciences, University of Toronto, Toronto, Ontario, Canada); Timothy B. Shirey (NASA Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA); Sandra Siljeström (Department of Methodology, Textiles and Medical Technology, RISE Research Institutes of Sweden, Stockholm, Sweden); Andrew Steele (Carnegie Institution for Science, Earth and Planets Laboratory, Washington, DC, USA); Bronwyn L. Teece (NASA Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA); Jessica Vanhomwegen (Laboratory for Urgent Response to Biological Threats, Institut Pasteur, Paris, France); Mary Beth Wilhelm (Space Science and Astrobiology Division, NASA Ames Research Center, Moffett Field, CA, USA).
Acknowledgments
The authors gratefully acknowledge multiple inputs from or discussions with the members of the Measurement Definition Team (MDT) during the course of preparation of this analysis. Of particular significance were discussions with the MDT facilitators and cochairs: Brandi Carrier, Tim Haltigin, Elliot Sefton-Nash, and Chris Herd (note that Heather Graham is a cochair of the MDT, but also a coauthor of this report). Discussion with the following colleagues contributed to various aspects of this report (listed alphabetically): Morgan Anderson, Walter Alvarado, Tanja Bosak, Denise Buckner, Ben Clark, Brian Clement, Andy Czaja, Sam Edwin, Kate Freeman, Danny Glavin, Kristo Kriechbaum, Michael Meyer, Victoria Orphan, Maggie Osburn, Alan Pearse, Lori Shiraishi, Alvin Smith, Gabriella Weiss, Amy Williams, and Maria-Paz Zorzano. A very helpful detailed review of an interim version of this analysis was provided by the MSR Campaign Science Group (MCSG). Members included: Gerhard Kminek, Lindsay Hays, Audrey Bouvier, Andy Czaja, Nicolas Dauphas, Lydia Hallis, Rachel Harris, Ernst Hauber, Laura Rodriguez, Susanne Schwenzer, Kimberly Tait, Michael Thorpe, Tomo Usui, Michael Velbel, and Maria-Paz Zorzano (note that Kate French, Jessica Vanhomwegen, and Andrew Steele were also members of MCSG at the time, but since they are also coauthors of this article, they are not included in the above list). The SSAP-TT is deeply appreciative of the careful thinking and challenges that came back to us via the review process. The individuals who particularly helped us included Megan Ansdell, Nick Benardini, Ben Clark, Brian Clement, Jason Dworkin, Sam Edwin, Karen Gelmis, Brandon Hatcher, Lindsay Hays, Jonathan Hobbs, Doug Isbell, Erin Lalime, Cara Magnabosco, Richard Mattingly, Francis McCubbin, Michael Meyer, Elisha Moore, Karen Olsson-Francis, Laura Ratliff, John Rummel, Alex Sessions, Silvio Sinibaldi, Andy Spry, and Elizabeth Trembath-Reichert.
Additional constructive institutional reviews were provided by JPL (Doug Isbell), CDC, and USGS (Eli Moore). For the following coauthors, US Government sponsorship (from several different institutions) is acknowledged: From the Jet Propulsion Laboratory/California Institute of Technology: D. W. Beaty, B. Shirey, B. L. Teece, William Page; From the Centers for Disease Control and Prevention: John R. McQuiston, Kim Hummel, Nicole Baird; from Johnson Space Center: Richard E. Davis, Aaron Regberg; from Goddard Space Flight Center: Heather Graham; from the US Geological Survey: Katherine French; from the Department of Defense: Brooke Ahern; from Ames Research Center: Mary Beth Wilhelm. The decision to implement Mars Sample Return will not be finalized until NASA’s completion of the National Environmental Policy Act (NEPA) process. This document is being made available for planning and information purposes only. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US Government. © 2025.
Authors’ Contributions
The entire SSAP-TT contributed to discussions surrounding the SSAP. B.L.T., D.W.B., H.V.G., G.M., B.S.L., S.S., A.S., and R.M. conceived and developed Step 2. B.L.T. and R.M. wrote the initial draft of the manuscript with significant input from B.S.L. All other authors contributed to reviewing and editing. R.M. generated the figures.
Author Disclosure Statement
No competing financial interests exist.
Funding Information
S.S. acknowledges funding from the Swedish National Space Agency (contracts 2021-00092 and 2024-00240). A portion of this report was written at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration (contract number 80NM0018D0004).
Associate Editor: Sherry L. Cady
