Abstract
This study examined the effect of the amino acid composition of protein capsids on virus inactivation using ultraviolet (UV) irradiation and titanium dioxide photocatalysis, and physical removal via enhanced coagulation using ferric chloride. Although genomic damage is likely more extensive than protein damage for viruses treated using UV, proteins are still substantially degraded. All amino acids demonstrated significant correlations with UV susceptibility. The hydroxyl radicals produced during photocatalysis are considered nonspecific, but they likely cause greater overall damage to virus capsid proteins relative to the genome. Oxidizing chemicals, including hydroxyl radicals, preferentially degrade amino acids over nucleotides, and the amino acid tyrosine appears to strongly influence virus inactivation. Capsid composition did not correlate strongly to virus removal during physicochemical treatment, nor did virus size. Isoelectric point may play a role in virus removal, but additional factors are likely to contribute.
Introduction
Mitigation of enteric pathogens is an essential part of drinking water treatment. Historically, pathogens have been well controlled in the developed world using multibarrier treatment approaches relying on a combination of physical removal and chemical inactivation processes. However, pathogens continue to be the greatest detriment to global human health, especially in developing regions. 1 In particular, adequate detection and control of viruses remain a challenge in modern water treatment. 2 Viruses are generally more difficult to detect and more resistant to treatment processes in comparison with other pathogenic microorganisms such as bacteria since they often exhibit slower inactivation kinetics and less physical removal due to their smaller size. 3
Although a great deal of empirical data exists, the exact mechanisms governing virus removal via adsorption4,5 and inactivation6,7 in water treatment processes remain unclear. Environmentally relevant (nonenveloped) viruses consist of a nucleic acid genome enclosed by a protein capsid. As the outer exposed surface, the protein capsid may play a major role in virus removal and inactivation, yet analysis of the role of proteins in water treatment processes has lagged behind studies of nucleic acids. 8 This study examined the influence of the amino acid composition of protein capsids on virus inactivation and removal. Particular emphasis was afforded to the high research priority viruses identified on the US Environmental Protection Agency's Contaminant Candidate List (CCL) and their potential surrogates.
The CCL is updated approximately every five years in order to identify contaminants that are not yet regulated but which are known or believed to occur in public water systems and are thus high-priority targets for research and data collection. 9 To date, each version of the CCL has included four families of viruses, as listed in Supplementary Table 1. Adenoviruses, caliciviruses (eg, norovirus [NV]), and at least a subset of enteroviruses (eg, coxsackievirus and echovirus) have been on each CCL, demonstrating the need to better understand the risks that these emerging viruses pose to human health.
Viruses and key physical, chemical, and genetic characteristics (Adapted from Refs. 4, 5, 16, 20, 56).
GC content determined as sum of G and C basepairs divided by the total number of basepairs determined from sequence obtained from the NCBI 45 online database.
Pyrimidine content determined as the percent of pyrimidine basepairs (C, T or U) divided by the total number of basepairs determined from sequence obtained from the NCBI 45 online database.
Dimerization value calculated using doublets and triplets in the genomic sequence obtained from the NCBI 45 online database, as described by Kowalski et al. 46
Not tested in lab experiments of physical removal or inactivation.
Not CCL viruses. FCV and MNV are laboratory surrogates for human caliciviruses (human NV). Poliovirus shares many similarities with coxsackievirus and echovirus, and was used for comparative purposes.
Average theoretical pI calculated in this study using protein analyses.
Mandel 75 hypothesized that Polio1 proteins may exist in two conformational states (A-form and B-form), thereby resulting in two isoelectric points, 7.0 and 4.5.
Adenovirus is the most ultraviolet (UV)-resistant enteric pathogen and, as such, has directly impacted drinking water regulations, including the Long Term 2 Enhanced Surface Water Treatment Rule (LT2) and the Groundwater Rule.10,11 NV, the most infamous of the caliciviruses, is the leading cause of nonbacterial gastroenteritis in the world and continues to pose a challenge as measures of disinfection efficacy are impeded by difficulties culturing the virus in vitro. 12 The enterovirus family includes poliovirus, which is perhaps the most widely studied of all viruses. 4 However, far less is known regarding the occurrence, prevalence, health effects, and treatment of non-polio enteroviruses, which has led to their inclusion on the CCL. 4
This study provides a deeper understanding of the role that virus capsid proteins (based on their amino acid composition) play in influencing virus treatability in water treatment processes, which will assist in the development of treatment strategies to better mitigate CCL viruses. The treatment processes evaluated included UV irradiation, titanium dioxide (TiO2) photocatalysis, and enhanced coagulation. Each of these processes is accruing greater interest in the face of emerging water treatment challenges. UV is highly effective against Cryptosporidium, which is the focus of the LT2 regulation. TiO2 photocatalysis is capable of mineralizing recalcitrant organic compounds, including micropollutants such as endocrine-disrupting chemicals, several of which are also on the CCL. Enhanced coagulation is identified as a best available technology for reduction of potentially carcinogenic disinfection byproducts. The following provides a brief introduction to viral inactivation and physical removal mechanisms associated with these treatment processes.
Disinfection
Elucidation of the mechanisms of bacterial inactivation via common water disinfection processes has recently advanced; however, there is not yet an understanding of the fundamental mechanisms governing virus inactivation. 2 Existing explanations are often widely variable, ambiguous, or even contradictory.6,13 This may be exacerbated by the dramatic differences observed in disinfection kinetics for related viruses, which suggests that even minor variations in structure or genome can substantially influence virus susceptibility to different disinfectants. 13 The two main classifications of disinfection processes used in water treatment are (1) chemical oxidants (free chlorine, chloramines, chlorine dioxide, and ozone) and (2) UV photolysis.
Oxidants
For strong oxidizers such as free chlorine and ozone, protein damage is suggested to be the dominant mechanism of viral inactivation. 14 These oxidants are most likely to degrade the following protein sites: N-terminal amino acids as well as free amine, aromatic, and/or organosulfur side chains of C, H, K, M, W, or Y amino acids. 14 Protein damage may inhibit the ability of a virus to bind to cells (eg, poliovirus, hepatitis A virus, and feline calicivirus [FCV]) 15 or affect post-binding life cycle processes (eg, adenovirus). 2 More detailed determination of the molecular mechanisms by which disinfectants inactivate viruses will assist in the identification of novel approaches to detect and control viruses in water. 2
UV
In contrast to chemical oxidants, UV degrades nucleobases several-fold to orders of magnitude faster than amino acids (of which only F, W, and Y exhibit significant absorption in the germicidal UV range, whether in their zwitterion form or as part of a peptide or protein).7,16 However, the specific determinants responsible for UV resistance of viruses have not been identified, nor is there a general rule for UV resistance based on nucleic acid or amino acid composition or structure. 3
As resistance to disinfection may be assisted by the ability of some viruses to photorepair (eg, adenovirus), optimal disinfection strategies must damage more than just viral DNA. 11 Thus, enhanced UV disinfection approaches using either polychromatic medium pressure (MP) UV or UV-based advanced oxidation processes (AOPs) are of increasing interest.10,11,17–20 Both approaches may improve inactivation through more widespread viral protein damage: MP UV through increased amino acid absorbance across the range of wavelengths and AOPs through the generation of highly reactive hydroxyl radicals (HO•).
AOPs
TiO2 photocatalysis is one type of AOP that relies on UV irradiation of TiO2 particles to generate radicals. While this process has traditionally been used for organic chemical destruction, there has also been interest in using it to inactivate microorganisms.1,18–23 Multiple reactive oxygen species may be generated via photocatalysis, including HO• as well as superoxide (O2–•). It appears that HO• is mainly responsible for virus inactivation, as demonstrated by studies of MS2. 19 The main mechanism of inactivation by HO• is yet to be elucidated: damage to the capsid protein, the genome, or a combination thereof.10,24 Determining the extent of capsid protein damage would be beneficial to understanding virus inactivation using AOPs. 10
Physical removal
Physicochemical treatment processes such as coagulation, flocculation, sedimentation, and filtration have been shown to be capable of effectively removing a variety of microorganisms, including viruses, bacteria, and protozoa, from drinking water.4,5,25–31 Log reductions (LR) of viruses typically range from ~0.25 to 3 (but can exceed 6 using electrocoagulation-microfiltration), as shown in Supplementary Table 2. The primary mechanism for virus removal during physicochemical water treatment is believed to be adsorption and charge neutralization,4,5 with significant influence from hydrophobic interactions.32,33 This effectively destabilizes colloidal virus suspensions such that they can be physically separated from the effluent via gravity sedimentation or filtration. Sweep flocculation has been reported to be the dominant virus destabilization mechanism during coagulation, with charge neutralization playing a secondary role.32,34 While many studies have focused on virus adsorption to granular media,33,35–37 and some have examined virus adsorption during coagulation processes,4,5,32,34 virus adsorption is a complex process for which the underlying mechanisms remain poorly understood.33,36,38–41 Particle adsorption is generally believed to be a function of characteristics such as size, shape, and isoelectric point (pI) or surface charge.26,38,42 As pI and surface charge arise from the capsid protein composition and structure of a virus, the impact of virus proteins on physical removal is of great relevance to improved design of water treatment processes. However, direct analysis of the role of proteins in virus adsorption has not been reported.
Theoretical virus pI values determined using varying pK values.
Materials and Methods
This study assessed the impact of virus proteins, that is, the capsid's amino acid composition, on removal and inactivation during water treatment processes. The treatment processes evaluated included low pressure (LP, λ = 254 nm) UV disinfection, as described by Mayer et al 43 and Gerrity; 16 TiO2 photocatalysis, as described by Gerrity et al; 20 and enhanced coagulation, as described by Abbaszadegan et al 5 and Mayer et al. 4 Briefly, UV experiments were conducted using a bench-scale collimated beam apparatus. The collimated beam contained a 46-cm, 15 W LP mercury arc bulb (Model G15T8, Ushio). An IL1700 research radiometer with a SED005W sensor and NS254 narrowband filter (International Light) was used to measure incident UV light intensity at the surface of the sample. The average adjusted intensity (after accounting for collimated beam correction factors)16,20 was ~0.13 mW/cm2. Viruses were spiked in 14 mL of buffered demand-free (BDF) water in 60 × 15 mm quartz petri dishes, with continuous stirring using a Teflon-coated magnetic stir bar. The UV fluence for each UV and AOP sample was calculated as the average intensity multiplied by the time of UV exposure. The collimated beam photocatalysis experiments were performed with a dose of 1 mg/L TiO2 (Degussa P25) due to decreased efficiency at higher doses. 20 With this low TiO2 dose, the primary treatment mechanism is assumed to be UV photolysis, with some additional treatment provided by oxidation.
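The fluence calculation described above (average intensity multiplied by exposure time) can be illustrated with a minimal sketch. The function names are hypothetical, and the intensity is taken to be the already-corrected average value; the collimated beam correction factors themselves are not reproduced here.

```python
def uv_fluence(intensity_mw_cm2: float, exposure_s: float) -> float:
    """Return UV fluence in mJ/cm^2 (mW/cm^2 x s = mJ/cm^2)."""
    if intensity_mw_cm2 < 0 or exposure_s < 0:
        raise ValueError("intensity and exposure time must be non-negative")
    return intensity_mw_cm2 * exposure_s


def exposure_time(target_fluence_mj_cm2: float, intensity_mw_cm2: float) -> float:
    """Return the exposure time (s) needed to reach a target fluence."""
    if intensity_mw_cm2 <= 0:
        raise ValueError("intensity must be positive")
    return target_fluence_mj_cm2 / intensity_mw_cm2


# Average adjusted collimated beam intensity reported in the text (mW/cm^2).
COLLIMATED_BEAM_INTENSITY = 0.13

if __name__ == "__main__":
    # A 5-minute exposure is an illustrative value, not one taken from the study.
    print(uv_fluence(COLLIMATED_BEAM_INTENSITY, 300.0))
    print(exposure_time(40.0, COLLIMATED_BEAM_INTENSITY))
```

At ~0.13 mW/cm2, exposure times on the order of minutes correspond to fluences of tens of mJ/cm2.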
The collimated beam, as well as a Photo-Cat Lab pilot-scale reactor (Purifics), was used for photocatalysis experiments. As performance in photocatalytic reactors has been observed to vary substantially with reactor design, 20 bench- and pilot-scale reactors were comparatively assessed. The Photo-Cat Lab included eight 75-W LP mercury arc bulbs arranged in series, with the annular configuration of the bulbs providing a flowpath of ~3 mm. The average UV intensity was ~7.0 mW/cm2. In contrast to the collimated beam, a dose of 400 mg/L suspended Degussa P25 TiO2 was used for the Photo-Cat Lab experiments. With this relatively high TiO2 dose, the primary treatment mechanism is assumed to be oxidation, as demonstrated by a rutile TiO2 control, 20 with some additional treatment provided by UV photolysis. The TiO2 was separated from the effluent with a submicron pore-size ceramic membrane filter. The system was operated in batch mode with a recirculation flowrate of 25 L/minute and a total volume of ~15 L of dechlorinated tap water spiked with viruses. Dechlorination was achieved with UV irradiation prior to spiking the viruses and TiO2.
Enhanced coagulation experiments were performed using a Phipps & Bird PB-700 jar test apparatus containing 1.5 L sample/jar. Viruses were spiked into environmental surface waters at concentrations of ~10^6 plaque forming units/mL of each bacteriophage or 10^3.5–10^5 50% tissue culture infective dose (TCID50)/mL of each mammalian virus. Tests were performed using optimal enhanced coagulation conditions (as defined by removal of dissolved organic carbon, described previously),4,5 consisting of 40 mg/L FeCl3 (Sigma Chemical Co.), 0.4 mg/L cationic polymer (poly(diallyldimethylammonium chloride), polyDADMAC, Clarifloc 350, Polydyne, Inc.), and pH adjustment to <6.5 using 1 N HCl. Following dosing, jars were mixed at 100 rpm for one minute, followed by 40 rpm for 10 minutes, 20 rpm for 10 minutes, and 0 rpm for 30 minutes. Each jar test was repeated twice and samples were analyzed in triplicate for bacteriophages, while single assays were performed for mammalian viruses due to the time and material limitations of in vitro cell culture analysis.
For each treatment condition tested, the LR of viruses was calculated by comparing the concentration of infectious viruses in the initial control to the concentration following treatment. All experimental results reported here represent the mean values of 2–6 measures, as summarized in Supplementary Table 3. When feasible, that is, no host cell cross-infectivity, viruses were typically tested simultaneously; otherwise, separate experiments were performed, as indicated in Supplementary Table 3. Positive and negative controls were included in each set of assays.
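As a minimal sketch of the LR calculation described above, the base-10 logarithm of the ratio of initial to post-treatment infectious virus concentration can be computed and averaged across replicates. The function names and example concentrations are illustrative assumptions, and the handling of non-detects is not addressed.

```python
import math
from statistics import mean


def log_reduction(initial_conc: float, treated_conc: float) -> float:
    """Return LR = log10(C0 / C) for concentrations in the same units."""
    if initial_conc <= 0 or treated_conc <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(initial_conc / treated_conc)


def mean_log_reduction(pairs) -> float:
    """Return the mean LR over replicate (C0, C) measurements."""
    return mean(log_reduction(c0, c) for c0, c in pairs)


if __name__ == "__main__":
    # Hypothetical replicate measurements in plaque forming units/mL.
    replicates = [(1.0e6, 1.0e3), (1.2e6, 2.4e3)]
    print(round(mean_log_reduction(replicates), 2))
```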
Viruses
A number of viruses (both CCL and surrogates including animal viruses and bacteriophages) spanning a range of physical, chemical, and genomic characteristics were included in this study, as listed in Table 1. Although 3-D organoid cell culture models are promising, 12 there is currently no established cell culture model for human NV infectivity. Thus, FCV and murine norovirus (MNV) are widely used surrogates of NV treatability. 44 Additionally, bacteriophages are commonly used as surrogates for studies of waterborne human pathogens as they share many similarities (ie, size, shape, genetic composition, and structure, as shown in Table 1) but are much easier and faster to assay.4,5 All viruses analyzed are nonenveloped, icosahedral particles with linear genome topology (with the exception of phi-X174, which exhibits circular topology).
Virus stocks were propagated and quantified using standard methods, as described previously.4,5,20,43 Briefly, the viruses and host cells (as specified in Table 1) were obtained from the American Type Culture Collection, with the exception of poliovirus and BGM cells, which were kindly provided by Dr. Charles Gerba of the University of Arizona. Bacteriophage stocks were propagated and assayed using the double agar layer (DAL) method. For propagation, 10 mL of BDF was added to the surface of the plates and allowed to incubate for one hour. The supernatant was collected and centrifuged at 4°C at 1,200 x g for 15 minutes to remove cellular debris. To minimize organic content, the stocks were purified using two successive polyethylene glycol (PEG, MW 8000) precipitations followed by a Vertrel XF (Micro Care Marketing Services) extraction and resuspension in BDF. Purified stocks were stored at 4°C until use.
The CCL and animal viruses were propagated and assayed using conventional in vitro cell culture techniques. The cells were cultured in 1x minimum essential medium (MEM) supplemented with 1.5 g/L sodium bicarbonate, 15 mM HEPES, 2 mM L-glutamine, 0.1 mM non-essential amino acids, 1 mM sodium pyruvate, 100 µg/mL antimycotic, and 100 mg/mL kanamycin sulfate. The media used for the PLC cells contained 10% fetal bovine serum (FBS, Hyclone), the media for the CRFK cells contained 10% equine serum (Hyclone), and the media used for the BGM cells contained 5% FBS. For virus propagation, the cells were inoculated with ~10^6 TCID50/mL and incubated at 37°C until at least 90% infected. A series of three freeze/thaw cycles was used to release the viruses from the cells. Then, as described for the bacteriophages, the supernatant was collected and purified using centrifugation, PEG precipitations, and a Vertrel extraction. Purified stocks were stored at −80°C until use.
Following experiments, bacteriophages were quantified using the DAL method. Quantification of mammalian viruses was based on in vitro cell culture techniques, with some variations in the assay used for the different experiments performed (as noted in Supplementary Table 3). Most commonly used was the Karber TCID50 method, wherein cells were grown in 24-well trays, which were inoculated with 0.1 mL/well of each sample. Each sample dilution was used to inoculate four replicate wells. The trays were incubated in a 5% CO2 incubator at 37°C and examined daily for up to 14 days using a light microscope to detect the presence of cytopathogenic effects. The Karber statistical approach was used to estimate the concentration at which 50% of the inoculated wells (TCID50) were positive for infection. For the UV experiments, viruses were assayed using an integrated cell culture-quantitative [reverse transcriptase] polymerase chain reaction (ICC-q[RT]PCR) assay.16,43 Briefly, the ICC-q[RT]PCR method relies on a 24-hour period of post-inoculation incubation at 37°C to differentiate between infectious and noninfectious viruses, followed by quantification using q[RT]PCR with virus-specific primers and probes, as detailed by Gerrity 16 and Mayer et al. 43 For the enteroviruses, the conventional plaque assay was compared with ICC-qRTPCR in parallel to validate results. For the plaque assay, cells were cultured to confluency in 25 cm2 flasks and then inoculated with 1 mL virus sample. The flasks were incubated at 20°C for one hour with gentle rocking every 15 minutes, followed by addition of 4 mL/flask of 1:1 1x MEM with 2% FBS and 1% agar overlay. After 48 hours in a 37°C, 5% CO2 incubator, plaques were visualized by removing the agar overlay, fixing the cells with ethanol, and staining using a solution of 8% (wt/v) crystal violet and 20% (v/v) ethanol in nanopure water.
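The Karber estimate of the 50% endpoint can be sketched as follows. The exact variant used in the study is not specified, so the common Spearman-Karber form is assumed here. The four wells per dilution and 0.1 mL inoculum follow the text; the tenfold dilution series and the well counts in the example are hypothetical.

```python
import math


def karber_log10_endpoint(log10_first_dilution: float, log10_step: float,
                          positives, wells_per_dilution: int = 4) -> float:
    """Return log10 of the 50% endpoint dilution (Spearman-Karber form).

    positives lists the number of CPE-positive wells at each successive
    dilution, starting from the most concentrated dilution (assumed fully
    positive) and ending at a dilution with no positive wells.
    """
    proportions = [p / wells_per_dilution for p in positives]
    return log10_first_dilution - log10_step * (sum(proportions) - 0.5)


def tcid50_per_ml(log10_endpoint: float, inoculum_ml: float = 0.1) -> float:
    """Convert the endpoint dilution to a titer in TCID50/mL."""
    return 10 ** (-log10_endpoint) / inoculum_ml


if __name__ == "__main__":
    # Hypothetical tray: dilutions 10^-1 to 10^-5, four wells each.
    endpoint = karber_log10_endpoint(-1.0, 1.0, [4, 4, 4, 2, 0])
    print(endpoint, tcid50_per_ml(endpoint))
```

In this hypothetical tray, half of the wells are positive at the 10^-4 dilution, so the estimated endpoint falls at that dilution.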
Compositional analysis: nucleic and amino acids
The genome length and composition of each virus were obtained from the National Center for Biotechnology Information (NCBI) online database 45 using the GenBank accession IDs summarized in Supplementary Table 4. The nucleotide composition was analyzed to determine the absolute quantity of each type of nucleotide, the number of pyrimidines, the GC content, and the dimerization value (Dv). The Dv was calculated using the number of doublets and triplets in combination with RNA/DNA dimer proportionality constants, as described by Kowalski et al. 46 Each of these genome-related parameters was statistically evaluated to elucidate its role in virus removal and inactivation.
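The composition metrics described above can be sketched directly from a genome sequence string. This is an illustrative sketch only: the function name is hypothetical, ambiguous base codes are simply ignored, and the Dv calculation is omitted because it depends on the proportionality constants of Kowalski et al, 46 which are not reproduced here.

```python
from collections import Counter


def nucleotide_summary(sequence: str) -> dict:
    """Return base counts, GC content (%), and pyrimidine content (%)."""
    counts = Counter(sequence.upper())
    # Only unambiguous bases are counted; U covers RNA genomes.
    bases = {b: counts[b] for b in "ACGTU"}
    total = sum(bases.values())
    if total == 0:
        raise ValueError("sequence contains no recognized bases")
    gc = bases["G"] + bases["C"]
    pyrimidines = bases["C"] + bases["T"] + bases["U"]
    return {
        "length": total,
        "counts": bases,
        "gc_percent": 100.0 * gc / total,
        "pyrimidine_percent": 100.0 * pyrimidines / total,
    }


if __name__ == "__main__":
    # Short hypothetical fragment, not a real viral sequence.
    print(nucleotide_summary("ACGUACGGUC"))
```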
The amino acid composition of the virus capsid proteins was identified using NCBI 45 sequences. The nucleic acid sequence of the capsid proteins was translated to an amino acid sequence using the ExPASy Proteomics Server. 47 The ExPASy ProtParam tool was used to quantify the number of each amino acid in the virus capsid. The hypothetical surface density 20 of each amino acid was calculated using the average virus diameter from the values reported in Table 1. Each of these protein-related parameters was statistically evaluated to elucidate its role in virus removal and inactivation. Additionally, the extent of virus removal and inactivation as a function of the relative abundance of groups of amino acids clustered by functional groups was assessed. The groups were categorized based on structural differences in the amino acids: amino acids with aliphatic R-groups (A, G, I, L, and V), non-aromatic amino acids with hydroxyl R-groups (S and T), amino acids with sulfur-containing R-groups (C and M), acidic amino acids and their amides (N, D, E, and Q), basic amino acids (R, K, and H), amino acids with aromatic rings (F, W, and Y), and imino acids (P). 48
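The amino acid counting and grouping described above can be sketched as follows. The group membership follows the text. The surface density function, however, is an assumption: it divides the residue count by the surface area of a sphere of the average virus diameter, which may differ from the formulation actually used. All names are hypothetical.

```python
import math
from collections import Counter

# Groups as categorized in the text (one-letter amino acid codes).
AMINO_ACID_GROUPS = {
    "aliphatic": "AGILV",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "acidic_and_amides": "NDEQ",
    "basic": "RKH",
    "aromatic": "FWY",
    "imino": "P",
}


def amino_acid_counts(protein_sequence: str) -> Counter:
    """Count the 20 standard amino acids in a capsid protein sequence."""
    standard = set("".join(AMINO_ACID_GROUPS.values()))
    return Counter(aa for aa in protein_sequence.upper() if aa in standard)


def group_fractions(counts: Counter) -> dict:
    """Return the relative abundance of each amino acid group."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {name: sum(counts[aa] for aa in members) / total
            for name, members in AMINO_ACID_GROUPS.items()}


def surface_density(count: int, diameter_nm: float) -> float:
    """Assumed form: residues per nm^2 of a sphere of the given diameter."""
    return count / (math.pi * diameter_nm ** 2)


if __name__ == "__main__":
    counts = amino_acid_counts("MASNFTQFVLVDNGGTGDVTVAPSNFANGVAEW")  # hypothetical
    print(group_fractions(counts)["aromatic"])
    print(surface_density(counts["Y"], 27.0))
```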
Relative contribution to reaction
The relative contribution to reaction (ie, degradation during disinfection) of the proteins and genomes was calculated for each of the viruses using a simplistic approach to predict the major sites of damage. The number of each relevant nucleotide and amino acid was multiplied by the respective kinetic rate constant for the various disinfectants (Supplementary Table 5), and the products were summed for each virus.6,8
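The summation described above can be sketched as a weighted sum of residue counts and rate constants, normalized to give the genome and protein shares. The rate constants and counts in the example are arbitrary placeholders rather than the values in Supplementary Table 5, and the function name is hypothetical.

```python
def relative_contribution(genome_counts: dict, capsid_counts: dict,
                          k_nucleotide: dict, k_amino_acid: dict) -> dict:
    """Return the fractional contribution of genome and protein to reaction.

    Each count is multiplied by the rate constant for the disinfectant of
    interest; residues without a listed rate constant contribute zero.
    """
    genome = sum(n * k_nucleotide.get(base, 0.0)
                 for base, n in genome_counts.items())
    protein = sum(n * k_amino_acid.get(aa, 0.0)
                  for aa, n in capsid_counts.items())
    total = genome + protein
    if total == 0:
        raise ValueError("no reactive sites with nonzero rate constants")
    return {"genome": genome / total, "protein": protein / total}


if __name__ == "__main__":
    # Placeholder counts and rate constants for illustration only.
    shares = relative_contribution(
        genome_counts={"G": 900, "C": 950},
        capsid_counts={"Y": 400, "W": 150},
        k_nucleotide={"G": 1.0, "C": 0.5},
        k_amino_acid={"Y": 10.0, "W": 20.0},
    )
    print(shares)
```

As noted for the approach itself, such a sum ignores structural accessibility, chain reactions, and repair, so it indicates only the likely major sites of damage.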
Isoelectric point
The pI of a virus arises from its protein capsid, which consists of weakly acidic and basic functional groups that are ionized when the virus is suspended in water, resulting in a net surface charge, which is dependent on the pH of the suspension.49,50 The amino acid composition of each virus capsid was used to calculate the theoretical pI using the Henderson–Hasselbalch equation (Equation 1).
The theoretical pI can be determined by solving for pH in Equation 2 since it occurs at the pH at which the molecule is neutrally charged.51,52
pKn = Acid dissociation constant of negatively charged amino acids (D, E, C, Y, and the C-terminus [COOH])
pKp = Acid dissociation constant of positively charged amino acids (K, R, H, and the N-terminus [NH2]).
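In their standard textbook forms, which are assumed here to correspond to Equations 1 and 2, the Henderson–Hasselbalch relation and the net-charge balance at the pI can be written as shown. The symbols Z, N_p, and N_n (the net charge and the numbers of each positively and negatively charged group) are notation introduced for this sketch.

```latex
% Henderson-Hasselbalch relation for an ionizable group HA <-> A^- + H^+
\mathrm{pH} = \mathrm{p}K + \log_{10}\frac{[\mathrm{A^-}]}{[\mathrm{HA}]}

% Net charge Z as a function of pH; the theoretical pI is the pH at which Z = 0
Z(\mathrm{pH}) = \sum_{p} \frac{N_p}{1 + 10^{\,\mathrm{pH} - \mathrm{p}K_p}}
               - \sum_{n} \frac{N_n}{1 + 10^{\,\mathrm{p}K_n - \mathrm{pH}}} = 0
```

The first sum gives the protonated (positively charged) fraction of each basic group and the second gives the deprotonated (negatively charged) fraction of each acidic group.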
Equation 2 was used to calculate the theoretical pI for each of the viruses by solving for pH using the number of each of the charged amino acids and terminal groups and their respective dissociation constants (pK). Unfortunately, there is no consensus with respect to the pK of the amino acids. Many different pI calculators are available (eg, EMBL Gateway, ExPASy ProtParam, etc.), but they often predict different pI values because the pK values utilized in their algorithms vary from one source to another. 51 For this study, the pI of each virus was calculated using a range of reported pK values (as well as the minimum, maximum, and mean values), as listed in Supplementary Table 6.
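Solving for the pH of zero net charge can be sketched with a simple bisection, assuming the usual charge model in which each ionizable group contributes its Henderson–Hasselbalch fraction. The pK values and group counts in the example are illustrative placeholders, not those of Supplementary Table 6, and the function names are hypothetical.

```python
def net_charge(ph: float, positive_groups, negative_groups) -> float:
    """Net charge at a given pH; groups are (count, pK) pairs."""
    # Protonated fraction of basic groups carries a +1 charge.
    pos = sum(n / (1.0 + 10.0 ** (ph - pk)) for n, pk in positive_groups)
    # Deprotonated fraction of acidic groups carries a -1 charge.
    neg = sum(n / (1.0 + 10.0 ** (pk - ph)) for n, pk in negative_groups)
    return pos - neg


def theoretical_pi(positive_groups, negative_groups,
                   lo: float = 0.0, hi: float = 14.0, tol: float = 1e-4) -> float:
    """Find the pH where the net charge is zero by bisection."""
    if (net_charge(lo, positive_groups, negative_groups) < 0
            or net_charge(hi, positive_groups, negative_groups) > 0):
        raise ValueError("no zero crossing between the pH bounds")
    # Net charge decreases monotonically with pH, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, positive_groups, negative_groups) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Placeholder (count, pK) pairs for K, R, H, N-terminus and D, E, C, Y, C-terminus.
    positive = [(6, 10.5), (4, 12.5), (2, 6.0), (1, 9.0)]
    negative = [(5, 3.9), (7, 4.1), (1, 8.3), (3, 10.1), (1, 2.3)]
    print(round(theoretical_pi(positive, negative), 2))
```

Repeating the calculation with each set of reported pK values would yield the range of theoretical pI values for a given capsid.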
Phylogenetic analysis
Comparative treatability across virus families was further explored through phylogenetic analysis. Nucleic acid sequences for the complete virus genomes were obtained from the NCBI database 45 and aligned using MEGA 6 software with Clustal W (DNA weight matrix: Clustal W 1.6). A phylogenetic tree based on the similarity (or distance) of the genomic sequences was constructed using the Neighbor-Joining method in MEGA 6.53,54 Phylogeny testing was conducted using the bootstrap method, and any bootstrap score >70 was generally considered reliable.
Statistics
Pearson product-moment correlation coefficients (r) were used to assess the degree of linear dependence between virus characteristics (amino acid content, nucleotide content, size, pI, etc.) and virus susceptibility to removal or inactivation. GraphPad Prism software was used to calculate r, the coefficient of determination (R2), and the two-tailed P value at the 95% confidence level.
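For illustration, the correlation statistics can be reproduced outside GraphPad Prism with a short sketch. It computes r, R2, and the t statistic with n − 2 degrees of freedom, from which the two-tailed P value follows via Student's t distribution (not computed here). The function names and data are hypothetical.

```python
import math


def pearson_r(x, y) -> float:
    """Return the Pearson product-moment correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length samples with n >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """Return t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    # Hypothetical data: a capsid characteristic vs. LR for four viruses.
    characteristic = [0.4, 0.9, 1.5, 2.1]
    lr = [1.8, 2.4, 2.6, 3.3]
    r = pearson_r(characteristic, lr)
    print(round(r, 3), round(r * r, 3), round(t_statistic(r, len(lr)), 2))
```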
Results and Discussion
Figure 1 illustrates the relative resistance of a range of viruses to inactivation using UV irradiation or TiO2 photocatalysis as well as resistance to physical removal using enhanced coagulation. For photocatalytic treatment of viruses, the pilot-scale Photo-Cat Lab reactor outperformed the bench-scale collimated beam, as described by Gerrity et al. 20 When using higher TiO2 doses in the collimated beam, treatment efficacy decreased dramatically, which limited TiO2 doses to ~10 mg/L. 20 With this limitation, the collimated beam photocatalysis data are more consistent with UV disinfection (Fig. 1A) rather than oxidation. However, presumably due to the superior mixing and hydraulics of the pilot-scale Photo-Cat Lab reactor, higher doses of TiO2 (eg, 400 mg/L) could be used. In control experiments with 400 mg/L of rutile TiO2 (an inefficient photocatalyst), limited virus inactivation was observed. This suggests that UV irradiation was responsible for only a small portion of the inactivation in the Photo-Cat Lab experiments. Instead, HO• oxidation was likely responsible for the majority of viral damage and inactivation. The higher TiO2 dose and greater level of HO• oxidation are important because they allow for the oxidation of many chemical contaminants. The energy input for virus inactivation using photocatalysis is often considerably less than the energy requirements for organic chemical degradation. 10 One possible explanation is that virus inactivation could stem from damage to only a few capsid proteins. 10 To expand on this concept, the results in Figure 1 can be used to elucidate the potential influence of viral capsid proteins on removal and inactivation, as described in the following sections.

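Figure 1. Normalized virus resistance to inactivation (UV irradiation and TiO2 photocatalysis) and physical removal (enhanced coagulation).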
Disinfection
For a virus to maintain its infectious capacity, integrity of both the capsid and genome is required.6,15 The capsid controls virus–host cell recognition and binding, while the genome carries the information needed to build new viruses. Different types of disinfectants may target disparate virion components. In order to assess target selectivity for viral damage due to disinfection, the relative extent of genome reactions compared with capsid protein reactions6,8 was calculated for each of the viruses. Similar approaches have been used to assess the relative significance of one oxidant over another (eg, ozone vs. HO•) during chemical oxidation. 55 Using this approach, the abundance of each relevant nucleotide or amino acid was multiplied by the respective reaction rate constant shown in Supplementary Table 5. It is important to note that the calculated values ignore other factors of importance to oxidant/target reactivity, including the influence of higher level organization, chain reactions due to the intermediates generated by oxidation of protein subunits, and potential for damage repair.6,8 Despite these limitations, this simple model can yield indications of the relative importance of potential damage to a particular virus component. 8 The results for MS2 and enteric Ad are featured in Figure 2. MS2 is the most widely used virus for assessments of drinking water treatment, and adenoviruses provide an interesting challenge of extreme UV resistance (from several fold up to 60 times greater resistance than other enteric viruses). 56

Figure 2. Theoretical relative contribution to reaction for genome and protein components of MS2 and enteric adenovirus.
Oxidants
Analysis of the relative contributions to reaction clearly demonstrates that viruses are expected to be inactivated largely due to damage to the protein capsid when exposed to free chlorine or ozone. The trend of extensive protein damage outweighing genomic degradation was consistently observed for all viruses as a result of free chlorine or ozone oxidation. This is in agreement with the degradation kinetics of the basic virus constituents: rates of reaction of these disinfectants are several orders of magnitude greater for amino acids relative to genomic material. Oxidative damage can occur at both the protein backbone and the amino acid side chains, with the extent of damage varying among oxidants. Some disinfectants target specific residues, while others give rise to widespread, relatively nonspecific damage. 8 In general, the amino acids which are most susceptible to oxidative damage include W, Y, H, M, C, F, and K. 8
As observed for free chlorine and ozone, oxidative damage to proteins caused by HO• exceeds genomic damage. This is in agreement with Bounty et al, 10 who found that enhanced adenovirus inactivation using HO• was not due to increased DNA damage, suggesting capsid damage was responsible. However, in the case of HO•, genomic damage begins to play a more appreciable role compared with more conventional oxidants, ranging from ~10%–35% of the relative reaction rate, as shown in Figure 3. As the strongest known oxygen-based oxidant, 57 HO• is far more reactive (and less selective) than other oxidizing disinfectants. This high reactivity serves as the foundation of AOPs such as TiO2 photocatalysis. Though typical AOP applications focus on the destruction of recalcitrant organic chemicals, HO• can also provide disinfection capabilities. It reacts with virtually all biological molecules, including amino acids and nucleic acids, 8 at rates several orders of magnitude higher than for free chlorine or ozone.

Figure 3. Theoretical relative contribution to reaction for genome and protein components.
Hydroxyl radicals are considered nonspecific, and all amino acids are susceptible to HO• degradation during photocatalysis. 57 However, proteins are reportedly more susceptible to radical-induced cleavage at specific amino acids, particularly A, G, and P.58,59 Other studies report that Y is particularly sensitive, followed by H. 57 Therefore, viruses with high levels of these amino acids in their protein capsids may be more susceptible to photocatalysis. The trend of generally increasing LR with increasing absolute quantities (and hypothetical surface densities) of these amino acids appears to support the reported target specificities. However, no statistically significant correlations between HO• susceptibility (as shown for photocatalysis in Fig. 1) and absolute abundance of any amino acid or nucleotide were identified (P > 0.05). Additionally, there were no correlations between photocatalytic disinfection and virus size (both in terms of diameter and genome length), pI, or groups of amino acids. The full statistical analysis is summarized in Supplementary Table 7.
The only parameters which demonstrated significant influence on HO•-based inactivation were the hypothetical amino acid surface densities of Q, I, L, T, and Y (Photo-Cat Lab data). Though the analysis as a whole is generally indicative of nonspecific HO• reactions, the prospective relationship with Y is a commonality among these data and others. 57 As a coarse measure of process performance, if the linear relationship between inactivation and hypothetical Y surface density (LR = 0.79Y + 1.57 for the Photo-Cat Lab data) were to hold true, adenoviruses would be expected to be efficiently inactivated (>3 LR), whereas NV would experience ~2 LR. This suggests that NV may be on the more resistant end of the oxidative disinfectant spectrum, similar to MS2. While recognizing that this is a crude estimation based on a limited dataset (n = 4), this result is interesting as cell culture limitations currently impede direct assessments of NV susceptibility to disinfection strategies. Future analysis of this hypothesis would be of great relevance to treatment applications.
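The arithmetic behind this coarse estimate can be sketched using the reported regression (LR = 0.79Y + 1.57). The function names are hypothetical, and Y is taken in whatever surface density units underlie the regression, which are not restated here.

```python
SLOPE = 0.79       # LR per unit hypothetical Y surface density (Photo-Cat Lab data)
INTERCEPT = 1.57   # LR at zero Y surface density


def predicted_lr(y_surface_density: float) -> float:
    """Return the LR predicted by the reported linear relationship."""
    return SLOPE * y_surface_density + INTERCEPT


def y_density_for_lr(target_lr: float) -> float:
    """Return the Y surface density at which the regression gives target_lr."""
    return (target_lr - INTERCEPT) / SLOPE


if __name__ == "__main__":
    # Y surface density above which more than 3 LR would be predicted.
    print(round(y_density_for_lr(3.0), 2))
    # Y surface density corresponding to ~2 LR.
    print(round(y_density_for_lr(2.0), 2))
```

Because the regression rests on only four viruses, such values are best read as an order-of-magnitude guide rather than a prediction.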
As virus capsid proteins appear to be the critical molecular target for oxidative inactivation, there is potential to further develop heterogeneous disinfection processes such as TiO2 photocatalysis, which selectively inactivate viruses at interfaces.1,2 This could offer a prospective advantage of controlling the formation of potentially carcinogenic disinfection byproducts associated with strong oxidants.1,2
UV Photolysis
Unlike oxidizing chemicals, UV penetration is independent of virus structure, so its efficacy is not limited by the availability of solvent accessible areas. 13 Moreover, UV reaction rates for nucleobases exceed those for amino acid residues. Thus, UV irradiation is known to directly damage internal moieties, 14 as indicated by the dominance of the genome's relative contribution to reaction shown in Figure 3. However, Figure 3 also indicates that protein damage does still occur, and it can be significant. 15 This damage may contribute to inactivation7,60 by facilitating access to interior structures 13 or rendering the virus unable to recognize and bind to host cells. 15 Additional research is needed to establish the relationship between extent of capsid/genome damage and inactivation. Another consideration with photolysis is that photochemical radical formation may offer additional potential for amino acid damage beyond direct photolysis.7,14,16,61
Virus capsids comprised of high percentages of aromatic amino acids could reduce UV penetration and damage to the viral nucleic acids, thereby increasing UV resistance. The role of amino acids in photolytic viral inactivation was assessed using statistical analysis of the dependence of UV susceptibility (as shown in Fig. 1) on capsid composition. The analysis revealed that the efficacy of UV disinfection is significantly correlated to the absolute abundance of each amino acid as well as all amino acid groups (P < 0.05). Interestingly, when hypothetical amino acid surface density was evaluated, only R, N, Q, M, S, and Y correlated to inactivation. A full summary of the statistical analyses is provided in Supplementary Table 7. If the same rough estimation of inactivation based on a linear relationship with hypothetical Y surface density is applied to the UV data, a fluence of ~28 mJ/cm2 would be required for 4 LR of NV. This value is similar to those reported for FCV and MNV and suggests that NV would be readily inactivated by UV disinfection.
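As an illustration of how a correlation of this kind might be screened at the 0.05 level, the sketch below computes a Pearson correlation coefficient and the corresponding t statistic with n − 2 degrees of freedom. The specific statistical test used in this study is not restated in this section, so the choice of Pearson correlation is an assumption, and the function names are hypothetical. The critical values are standard two-sided t values for α = 0.05.

```python
import math

# Two-sided critical t values at alpha = 0.05, keyed by degrees of freedom (n - 2).
T_CRIT_05 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447,
             7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length with n >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def is_significant(x, y):
    """True if |t| exceeds the two-sided critical value at alpha = 0.05."""
    n = len(x)
    return abs(t_statistic(pearson_r(x, y), n)) > T_CRIT_05[n - 2]
```

With only four viruses in a dataset (2 degrees of freedom), |t| must exceed 4.303, which corresponds to |r| of roughly 0.95. This is one reason the estimates based on small datasets are described as coarse.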
A number of different indicators of UV susceptibility have been proposed, virtually all of which are based on nucleic acid composition and structure. Genome size,62,63 GC content, 64 pyrimidine content, and prevalence of pyrimidine doublets and/or triplets (Dv) 46 have been suggested to relate to UV susceptibility. Each of these parameters was determined for the viruses in this study, as listed in Table 1. Statistical analysis revealed that the nucleotides A, C, and G were positively correlated to the UV fluence required to achieve 4 LR (P < 0.05). Additionally, the analyses showed that the size of the virus strongly correlated with UV susceptibility, both in terms of virion diameter and genome length. As either of these parameters increases, the abundance of amino acids and/or nucleic acids is likely to increase, and since these quantities correlate strongly to inactivation, it follows that size would positively correlate to UV susceptibility. A notable exception to this trend is adenovirus, which is relatively large in size, but extremely resistant to UV disinfection. However, this resistance may stem from its dsDNA genome (a trait which is unique among enteric viruses), which could facilitate photorepair of UV damage.
All nucleotides absorb UV, but the pyrimidines (C, U, and T) are considered to be more reactive than purine bases, meaning they exhibit the highest propensity for resultant damage.14,46,65 The formation of pyrimidine dimers has been proposed as the primary mechanism responsible for viral inactivation via UV irradiation. Genomic sequences with a high dimerization potential include pyrimidine doublets (TT, TC, CT, and CC) and triplets composed of single purines combined with pyrimidine doublets. 46 The presence of more hydrogen bonds means GC basepairs are stronger, which could make it harder to form dimers, thereby providing some degree of UV resistance. However, statistical analysis of the nucleic acid-related parameters GC ratio, pyrimidine content, and dimerization value demonstrated no significant correlation to UV treatability (P > 0.05). A full summary of the statistical analyses is provided in Supplementary Table 7.
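The nucleic acid parameters discussed above can be derived directly from a genome sequence. The following minimal sketch computes GC fraction, pyrimidine fraction, and the number of pyrimidine doublets (TT, TC, CT, and CC) for a sequence string. It does not reproduce the dimerization value (Dv) of the cited work, 46 whose exact definition is not given here; overlapping doublets are counted, triplets are ignored, and the function names are hypothetical.

```python
PYRIMIDINE_DOUBLETS = {"TT", "TC", "CT", "CC"}


def _clean(seq: str) -> str:
    """Upper-case the sequence and map RNA uracil (U) to T for counting."""
    seq = seq.upper().replace("U", "T")
    if not seq:
        raise ValueError("sequence must not be empty")
    return seq


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    s = _clean(seq)
    return sum(base in "GC" for base in s) / len(s)


def pyrimidine_fraction(seq: str) -> float:
    """Fraction of bases that are pyrimidines (C, T, or U)."""
    s = _clean(seq)
    return sum(base in "CT" for base in s) / len(s)


def count_pyrimidine_doublets(seq: str) -> int:
    """Count overlapping adjacent pyrimidine pairs (TT, TC, CT, CC)."""
    s = _clean(seq)
    return sum(s[i:i + 2] in PYRIMIDINE_DOUBLETS for i in range(len(s) - 1))


if __name__ == "__main__":
    demo = "ATTCGCCA"  # short made-up sequence, not a viral genome
    print(gc_fraction(demo), pyrimidine_fraction(demo),
          count_pyrimidine_doublets(demo))
```

For double-stranded genomes, the complementary strand would need to be evaluated as well; that step is omitted from this sketch.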
To further explore the relationship between virus genome and treatability, a phylogenetic analysis was performed, the results of which are shown in Figure 4. The virus families align with treatability more closely for some types of treatment than for others. For example, using the pilot-scale photocatalysis reactor, MS2 and fr demonstrated greater resistance than did phi-X174 and PRD1. The phylogeny supports this grouping. Although minor differences in composition and structure are known to greatly impact virus resistance to disinfection, 13 one may hypothesize that FCV would exhibit a photocatalytic response more similar to PRD1 and phi-X174 rather than MS2 and fr based on phylogenetic similarities. While the approach warrants considerable caution, the use of the linear relationship based on hypothetical Y surface density does suggest that hypothetical FCV inactivation (2.4 LR) would be similar to PRD1 and phi-X174 (2.6 and 2.4 LR, respectively), whereas greater resistance is observed for MS2 and fr (1.8 and 2 LR, respectively). Likewise, the familial similarity within Adenoviridae and Picornaviridae is reflected by similar UV susceptibilities within the groups. However, for physical removal, no phylogenetic-related trends were observed. These findings suggest that virus resistance to inactivation is inherently influenced by genomic composition, while physical removal is unrelated to the genome.

Phylogenetic analysis showing the relationships of taxa based on genome sequence analysis. The code following each virus name represents the GenBank Accession ID for the NCBI 45 complete genome sequence. The optimal tree with the sum of branch lengths = 8.13292181 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) are shown next to the branches. 76 The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The analysis involved 12 nucleotide sequences. All positions containing gaps and missing data were eliminated. There were a total of 3,018 positions in the final dataset.
Physical removal
As virus removal relies on physical interactions with the surrounding environment through adsorption and charge neutralization, parameters such as size, shape, and pI or surface charge are generally thought to control removal.26,38,42
Using the maximum LR from optimized enhanced coagulation experiments (Fig. 1), virus size was not found to correlate to degree of removal (P > 0.05). Particle diameter of waterborne enteric viruses is likely to be on the order of ~30 nm, with an approximate range of 20–70 nm. For this size range, particle transport efficiencies do not vary tremendously, as indicated by models of the relative importance of mechanisms of virus collision during coagulation and accumulation during filtration (Supplementary Figs. 1 and 2). Together, these models reveal that for the typical enteric virus size range, macroscale mechanisms such as mixing intensity and sedimentation dominate particle collision and transport. Although collision frequency and transport efficiencies vary by approximately an order of magnitude for particles between 20 and 70 nm, the more typical diameters of 20–30 nm do not differ greatly, which supports the finding that virus size is not an important determinant in removal. Studies in sandy soils have found that for larger particles (>60 nm in diameter), size appears to be the overriding factor for adsorption, but for smaller particles, pI dominates removal. 36
The pI of a virion, which depends on the amino acid composition of the protein capsid, is believed to strongly influence viral adsorption in soil and water matrices. At pH values above the pI, viruses have a net negative charge, but below the pI, the charge is positive. Thus, lower pH values will promote better adsorption of viruses to negatively charged mineral surfaces (ie, sediments).66–74 A virus with a higher pI should demonstrate a higher degree of adsorption since it has a weaker repulsive force relative to the negatively charged water or soil matrix. Unlike the negatively charged soil matrices analyzed in the majority of virus adsorption studies, both positively and negatively charged species are present in water during coagulation. This may facilitate sorption of both positively and negatively charged species. For example, the pI of fr is relatively high (8.9–9.0), meaning that its surface is positively charged in most naturally occurring pH ranges. This enables fr to sorb to the surface of negatively charged particles such as natural organic matter (NOM) and Fe(OH)4–. In the same pH range, other bacteriophages such as MS2 and PRD1 (pI of 3.5–3.9 and 3.0–4.2, respectively) will be negatively charged, thereby allowing them to sorb to positively charged species such as Fe(OH)2+. In this respect, negatively charged viruses such as MS2 and PRD1 may compete with ubiquitous NOM (also negatively charged) for sorption to positively charged species. Alternatively, the positively charged fr will encounter modest adsorption competition for negatively charged species. This supports the relatively high removal of fr in comparison to PRD1, but does not necessarily hold true for all observed virus removals.
The pI has been determined experimentally for a variety of microorganisms. However, difficulties propagating and analyzing human/animal viruses have impeded experimental determinations of virus pIs, so limited experimental pI data are available for human/animal viruses. In this study, theoretical virus pIs were calculated using the amino acid composition of each virus capsid and the respective pK values, as listed in Supplementary Table 6. The resulting values are listed in Table 2. The average theoretical pIs and the available empirical data are plotted in Figure 5 in increasing order. When treatment conditions (ie, jar test, coagulant dose, pH adjustment) were assessed separately using all available replicate measures of virus removal, some datasets exhibited strong correlations between pI and removal, 4 while others did not.5,56 Using the pooled maximum LR values shown in Figure 1, no significant correlation between virus pI and removal was identified (P > 0.05). As a number of variables impact coagulation/flocculation/sedimentation processes, it is advisable to separate the datasets for statistical analysis. In any case, while pI may play a role in the physical removal of viruses, there are likely other important factors, and the exact mechanisms cannot be identified here.
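A theoretical pI of this kind can be obtained by summing the Henderson–Hasselbalch charge contributions of the ionizable groups in a sequence and locating the pH at which the net charge is zero. The sketch below illustrates the idea for a single protein chain. The pK values are approximate, textbook-style values assumed for illustration and are not the values of Supplementary Table 6; the function names are hypothetical, and the capsid's multiple protein copies, folding, and chemical modifications are ignored.

```python
# Approximate pK values (assumed for illustration; not Supplementary Table 6).
PK_POSITIVE = {"N_TERM": 8.6, "K": 10.8, "R": 12.5, "H": 6.5}
PK_NEGATIVE = {"C_TERM": 3.6, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of one protein chain at a given pH (Henderson-Hasselbalch)."""
    seq = sequence.upper()
    charge = 0.0
    for group, pk in PK_POSITIVE.items():
        count = 1 if group == "N_TERM" else seq.count(group)
        charge += count / (1.0 + 10.0 ** (ph - pk))   # protonated fraction, +1 each
    for group, pk in PK_NEGATIVE.items():
        count = 1 if group == "C_TERM" else seq.count(group)
        charge -= count / (1.0 + 10.0 ** (pk - ph))   # deprotonated fraction, -1 each
    return charge


def isoelectric_point(sequence: str) -> float:
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    lo, hi = 0.0, 14.0
    for _ in range(60):  # net charge decreases monotonically with pH
        mid = (lo + hi) / 2.0
        if net_charge(sequence, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Made-up peptide for demonstration; not a capsid protein sequence.
    print(round(isoelectric_point("MKDEGHYR"), 2))
```

Different published pK sets can shift the calculated pI noticeably, so the choice of pK values should be reported alongside any theoretical pI.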

Mean virus isoelectric points (pI), including theoretical and empirical values. Theoretical pI values were calculated using the capsid protein sequence of each virus in combination with the Henderson–Hasselbalch equation. The error bars represent ±1 standard deviation.
For this assessment of the influence of capsid proteins on the physical removal of viruses, the theoretical pI values are useful since empirical values are not available for many viruses. However, the resulting pI may differ significantly from the experimental pI as the calculations do not account for protein structure, which may shield some amino acids from surface exposure, nor are chemical modifications considered (amino acids can be phosphorylated, methylated, acetylated, etc.). These conformational variations can dramatically alter surface charge. 51 A comparison of average theoretical pIs versus average known experimentally derived pIs is provided in Figure 6. As shown, there is an approximate 1:1 correlation; however, the low R2 value illustrates the substantial differences that can occur between theoretical and empirical values. It appears that the theoretical calculation may perform poorly for viruses with experimentally determined pIs toward the extremes (eg, MS2 with a low pI and fr with a high pI). This makes intuitive sense since the algorithm does not account for protein structure or chemical modifications. It follows that theoretical pIs should be regarded cautiously. Recognizing these limitations, however, the theoretical pI can be used as a starting point for those viruses for which the experimental pI has not yet been determined.

Comparison of theoretical and experimental isoelectric point (pI) values. Each point represents the mean pI, while error bars represent ±1 standard deviation.
In addition to assessing the protein-derived pI's effect on removal, the influence of the capsid's amino acid composition was evaluated. The fractional composition of the virus capsids is illustrated as a percentage of each amino acid group in Figure 7. Supplementary Figure 3 shows the fraction of each individual amino acid. As shown in the figures, and confirmed by statistical analysis, no correlation was observed (P > 0.05). Future analyses of surface charge and physical removal accounting for the influence of higher order structure, for example, using X-ray diffraction to identify exterior amino acids, 37 would help to shed additional light on the complex mechanisms underlying virus adsorption and removal.

Amino acid group composition of virus capsid. There was no statistical relationship between the amino acid composition and the physical removal of the viruses during enhanced coagulation (P > 0.05).
Conclusion
This study examined the relationship between virus capsid proteins and susceptibility to water treatment processes, specifically UV irradiation, TiO2 photocatalysis, and enhanced coagulation. Determination of the molecular mechanisms by which viruses are inactivated and physically removed will assist in the identification of novel approaches to control viruses in water. 2
It was shown that oxidizing disinfectants such as free chlorine, ozone, and HO• primarily target amino acids in the protein capsid of the virus. However, as the oxidative potential increases, disinfectants are less selective and genomic damage begins to play an increasingly appreciable role. For HO•-based AOP disinfection, abundance of the amino acid Y appears to closely relate to virus susceptibility.
In the case of UV disinfection, while the genomic composition strongly influences inactivation, capsid protein composition was also shown to be relevant. Although nucleotide sequence has been suggested as an indicator of UV susceptibility (ie, GC ratio, pyrimidine content, and dimerization potential), these factors demonstrated less influence on virus treatability than absolute composition. Virus inactivation by UV irradiation was strongly correlated to virus size (both capsid diameter and genome length), all amino acids, and groups of amino acids classified by functional groups.
Coarse estimates based on the hypothetical surface density of Y suggest that NV (for which in vitro assessments of susceptibility to disinfection are currently unavailable) may be relatively resistant to oxidation, but susceptible to UV irradiation.
Physical removal of viruses correlated poorly with all nucleotide and amino acid parameters. Moreover, no correlation was identified for physical parameters such as virus size. When distilled into separate datasets, pI did show some correlation with removal, but the trends were inconsistent. Thus, it is believed that while pI may play a role in the physical removal of viruses, there are likely other important factors, and the exact mechanisms cannot be identified here.
This study showed that both virus genome and protein composition influence disinfection potential, while pI may play a role in physical removal. Though genome and protein sequences provide insight into virus treatability, they alone are unlikely to allow for accurate prediction of susceptibility. 6 Future studies addressing the links between protein composition and structure would considerably advance understanding of virus treatability. Additionally, the relationship between disinfectant-related damage and inactivation should be explored to better understand which virus components and functions6,13 to target for more effective and efficient treatment strategies designed to mitigate the risks posed by viral pathogens on a mechanistic basis.
Footnotes
Acknowledgments
The authors gratefully acknowledge the laboratory effort and invaluable research guidance from Dr. Hodon Ryu throughout this work. Additionally, we thank Tony Powell of Purifics for use of the Photo-Cat Lab reactor.
Author Contributions
Conceived and designed the experiments: BKM and DWG. Analyzed the data: BKM, DWG, and YY. Wrote the first draft of the manuscript: BKM. Contributed to the writing of the manuscript: YY, DWG, and MA. Agreed with manuscript results and conclusions: BKM, YY, DWG, and MA. Jointly developed the structure and arguments for the paper: BKM and YY. Made critical revisions and approved the final version: BKM, YY, DWG, and MA. All the authors reviewed and approved the final manuscript.
