Abstract
The cell’s ability to sense and respond to specific stimuli is a complex system derived from precisely regulated protein-protein interactions. Some of these protein-protein interactions are mediated by the recognition of linear peptide motifs by protein modular domains. BRCT (BRCA1 C-terminal) domains and their linear motif counterparts, which contain phosphoserines, are one such pairwise interaction system that seems to have evolved to serve as a surveillance system to monitor threats to the cell’s genetic integrity. Evidence indicates that BRCT domains found in tandem can cooperate to provide sequence-specific binding of phosphorylated peptides, as is the case for the breast and ovarian cancer susceptibility gene BRCA1 and the PAX transcription factor–interacting protein PAXIP1. Particular attention has been paid to tandem BRCT domains as “readers” of signaling events in the form of phosphorylated serine moieties induced by the activation of DNA damage response kinases ATM, ATR, and DNA-PK. However, given the diversity of tandem BRCT-containing proteins, questions remain as to the origin and evolution of this domain. Here, we discuss emerging views of the origin and evolving roles of tandem BRCT domain repeats in the DNA damage response.
The DNA Damage Response
During the life of a cell, its DNA is constantly subjected to damage. This damage can be from external sources, such as chemical carcinogens and ionizing radiation, or from internal sources such as errors in replication. 1 Cells have evolved a network of interconnecting pathways to promptly fix these errors and prevent illegitimate changes to DNA from being propagated to daughter cells. This network can be collectively called the DNA damage response (DDR) and includes a wide array of overlapping systems that recognize abnormal DNA structures such as single- and double-stranded breaks, topological distortions of the double helix, or DNA adducts. 2,3 An important feature of this network in eukaryotes is the coordination of DNA repair with a tight control of cell cycle progression. 4,5 In addition, cells in metazoan organisms will undergo apoptosis as a fail-safe mechanism when they cannot properly fix the damage. 3 Defects in this network can lead to a series of diseases, in particular, cancer. 6,7
Many proteins involved in DNA repair are found in all superkingdoms (Archaea, Bacteria, Eukaryota), but one particular feature found primarily in eukaryotes is the integration of DNA repair processes with cell cycle check points and apoptosis. 8 Indeed, a problem that has captured the attention of many researchers is how organisms have evolved more complex signaling networks during the course of evolution. 9
A Modular Domain View of the DDR
Signaling events are normally mediated by a system of interacting partners, which rely on the recognition of linear motifs, usually present in unstructured regions of proteins, by modular domains. 10 Even a cursory analysis of the DDR reveals that the system is driven by a series of phosphorylations and ubiquitinations and that there is an overrepresentation of 3 particular protein domains: BRCT, FHA, and 14-3-3. 11
Modular domains that can be found in combination with a large number of other modular domains are sometimes called promiscuous. 12,13 This is the case of the BRCT and FHA domains, while the 14-3-3 domain can be considered a lonely domain in the sense that it does not occur with other domains. 9 Here, we will focus our discussion on the BRCT domains.
BRCT Domains
Elements of the DNA repair machinery typically exist as large macromolecular complexes that incorporate a diverse array of proteins required for the faithful detection and resolution of the lesion. Outside of the obvious need for DNA binding components of the complex, adaptor proteins (and their domains) are prevalent and mediate the specific protein-protein interactions required for the organization of the complexes. Present in many of the known scaffold proteins involved in the DDR, the BRCT domain is an important component of adaptor proteins involved in DNA repair.
Originally identified in the C-terminal region of the breast and ovarian cancer susceptibility protein BRCA1, known to be truncated or mutated in cases of hereditary breast and ovarian cancer, 14 BRCT domains have been found in a number of different proteins in species ranging from bacteria to humans. 15,16 Although it contains no intrinsic enzymatic activity, the BRCT domain is essential for orchestrating a variety of enzymatic reactions involved in repair and check point regulation at sites of DNA damage. 17-22 The domain itself is relatively small, consisting of approximately 90 to 100 amino acids. 14 Concordantly, most if not all of the proteins that have been found to contain BRCT domains are intimately connected to the regulation of DNA integrity. 15,16
BRCT domains occur in proteins as single (e.g., PARP1) or multiple units (e.g., TOPBP1). Interestingly, in proteins in which they occur as multiple units, BRCTs tend to be organized as pairs of closely juxtaposed units (e.g., BRCA1). Domains that occur as single isolated units function in a variety of homodimerization or heterodimerization interactions, leading to the formation of protein complexes that play important roles in DNA damage signaling, repair, and coordination of cell cycle check points. 23 Tandem BRCT domains can be found in BRCA1 (Fig. 1), LIG4, TOPBP1, PAXIP1 (PTIP), ECT2, NBN, MCPH1, TP53BP1, XRCC1, BARD1, MDC1, and ANKRD32 and seem to function as one structural unit. Although not all of these proteins have yet been systematically analyzed, several of them have been shown to recognize phosphorylated peptides. 24-26 Interestingly, the phosphopeptide recognition pocket of the BRCA1 tandem BRCTs was independently identified by mutation analysis in breast cancer, highlighting the importance of this function to tumor suppression. 27 It is not uncommon for these 2 subclasses of BRCT domains (single or tandem) to occur in the same protein; such is the case for TOPBP1 and MCPH1.

Figure 1. Tandem BRCT domains from BRCA1. Space-filling structure derived from PDB entry 1JNX. 29
The BRCT domain presents as a globular α/β fold composed of a 4-stranded central β sheet flanked by 3 α helices (α1 and α3 on one side with α2 on the opposite face). 28,29 There is also a strong conservation of the amino acid tryptophan in helix α3 in most, but not all, BRCT domains. 15 When present as a tandem pair, the individual domains fold in a head-to-tail fashion connected by a linker region that can vary in structure and composition. 29-33 Interestingly, the BRCT domain includes a NAD(P)-binding Rossmann fold, 34 a protein architecture commonly found in several different superfamilies 35 and in ancient proteins. 36
While the phosphorylated serine is normally recognized by a groove in one of the BRCT units, the specificity is conferred by the amino acid located in the +3 position of the phosphopeptide (with the serine being the 0 position). 19,25,33,37-40 In BRCA1, the ability to recognize phosphopeptides is mediated by conserved residues S1655 and K1702 in the first BRCT domain, which allow hydrogen bonding to the pS/T residue. A hydrophobic pocket composed of F1704, M1775, and L1839 in the second BRCT domain of BRCA1, residues that form the core of the fold, provides the specificity for phenylalanine at the +3 position. 25,37,38,40 Indeed, this type of “2-knob” interaction has been shown to be essential for BRCA1 binding to the DDR proteins CtIP, BACH1, and ABRAXAS, as well as for other interactions, including MDC1 binding to the C-terminal tail of γH2AX and PAXIP1 binding to phosphorylated TP53BP1. 25,33,37,38,40-42 However, because of the diversity of the proteins in which tandem BRCT domains exist and the specific functions of each one, it is not surprising that the phosphopeptides they bind differ as well. For instance, it was found that the MDC1 tandem domain prefers tyrosine at the +3 position. 33,39 Differences in the sequence of amino acids around the pS/T residue allow for tight regulation of the interactions, which is likely further fine-tuned by spatial and temporal regulation of specific kinases and phosphatases.
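The position rule described above (serine at position 0, specificity residue at +3) can be illustrated with a minimal sketch. It assumes only the two preferences stated in the text (phenylalanine for BRCA1, tyrosine for MDC1); the function name, the lookup table, and the toy peptides are hypothetical, and a sequence scan cannot tell whether a serine is actually phosphorylated in the cell.

```python
import re

# Assumed +3 preferences, taken only from the two examples given in the text.
PLUS3_PREFERENCE = {"BRCA1": "F", "MDC1": "Y"}


def find_candidate_sites(sequence, reader):
    """Return 1-based positions of Ser residues whose +3 residue matches the reader.

    This flags candidate S-x-x-(F/Y) sites only; phosphorylation state is not
    encoded in a plain sequence and must be established experimentally.
    """
    plus3 = PLUS3_PREFERENCE[reader]
    # Lookahead keeps overlapping candidates: Ser, any two residues, then +3.
    pattern = re.compile(r"S(?=..%s)" % plus3)
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


# Made-up toy peptides for illustration, not real protein sequences.
toy_phe = "AAGSPTFNKAA"
toy_tyr = "GGASQEYGG"

brca1_like_sites = find_candidate_sites(toy_phe, "BRCA1")
mdc1_like_sites = find_candidate_sites(toy_tyr, "MDC1")
print(brca1_like_sites, mdc1_like_sites)
```

A scan of this kind only narrows the list of candidate sites; the residues surrounding the pS/T and the kinase responsible would still have to be considered, as noted in the paragraph above.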
Importantly, not all interactions between tandem BRCT domains and their partners are mediated by phosphorylation events. The example best characterized by comprehensive structural analysis is that of TP53BP1 binding to the tumor suppressor p53, which occurs on the face opposite that of the hydrophobic cleft of the tandem BRCT domains of TP53BP1. 30,32 Because these domains are globular, the conserved phosphopeptide binding pockets occupy only a small portion of their available surface area. This clearly leaves room for a number of other potential surface interactions to occur. Therefore, caution should be exercised in classifying these domains solely as phosphopeptide binding modules.
Evolution of BRCT Domains
The tandem BRCT domain organization has undergone significant evolutionary expansion from prokaryotes to higher order metazoans (Fig. 2). Although the specific mechanisms that fostered this expansion remain speculative, interesting correlations can be made between the phyletic distribution of tandem BRCT proteins and aspects of DNA repair and cell cycle check points in diverse lineages.

Figure 2. Taxonomic distribution of tandem BRCT-containing proteins. Distant orthologs of human tandem BRCT-containing proteins were identified in different taxa and curated individually. Orthologs were retrieved using human tandem BRCT-containing proteins (including LIG4, BRCA1, TOPBP1, PAXIP1, ECT2, NBN, MCPH1, TP53BP1, XRCC1, BARD1, MDC1, and ANKRD32) to perform saturated 60 and reverse 61 BLAST searches on a nonredundant database (NCBI), allowing for small changes in conserved domain architecture. Common BLAST (NCBI) and conserved domain analyses were performed locally using HMMER and BLAST manager software (FAT [functional analysis tool] developed by our group) and the PFAM database. 62 All results were manually curated. Taxonomy tree organization was based on NCBI taxonomy, although some taxa were omitted (a complete list of taxa used can be provided upon request); branches do not show evolutionary distances. Dashed lines on the tree show indirect taxa links. *Only taxa Alveolata (txid 33630), Amoebozoa (txid 554915), and Euglenozoa (txid 33682) were used for the Protozoa division. Blue dashed lines indicate major transitions to eukaryotes and to metazoans.
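The caption does not spell out how the forward and reverse searches were combined, so the following is only a generic sketch of a reciprocal-best-hit filter, one common way to pair a forward search with a reverse search. The function name and all identifiers in the toy tables are hypothetical, and the best-hit tables are assumed to have been parsed from search output beforehand.

```python
def reciprocal_best_hits(forward_best, reverse_best):
    """Keep query/hit pairs that pick each other as best hit in both directions.

    forward_best: human protein -> best hit in the target taxon
    reverse_best: target-taxon protein -> best hit back in the human set
    """
    pairs = {}
    for query, hit in forward_best.items():
        # A pair is retained only if the reverse search returns the original query.
        if reverse_best.get(hit) == query:
            pairs[query] = hit
    return pairs


# Toy tables with placeholder identifiers (not real accessions).
forward_best = {"LIG4": "taxonA_prot1", "XRCC1": "taxonA_prot2"}
reverse_best = {"taxonA_prot1": "LIG4", "taxonA_prot2": "LIG3"}

candidate_orthologs = reciprocal_best_hits(forward_best, reverse_best)
print(candidate_orthologs)
```

In this toy case only the LIG4 pair survives, because the second hit maps back to a different human protein. Candidates retained by such a filter would still need the domain architecture check and manual curation described in the caption.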
The minimal complement of BRCT-containing proteins in the human genome is 24 (Mesquita and Monteiro, unpublished results). Of those proteins, 12 present with more than one BRCT unit (Fig. 2). We hypothesize that these 12 proteins might therefore make up the core of the phosphoserine recognition system during the DDR.
One of the earliest analyses of DDR-associated domain architecture by Aravind et al. did not find any instances of the BRCT domain in Archaea, although it was evident in the carboxy terminus of bacterial DNA ligases. 8 Additionally, the presence of the BRCT domain in Trypanosomes (which is not considered part of the early radiation of eukaryotes) prompted speculation that horizontal gene transfer from prokaryotes could be the mechanism by which eukaryotes acquired BRCT domains. 8 However, instances of BRCT domains have since been found in Archaea. 43 A search of currently deposited sequences also identifies the presence of a BRCT domain in proteins with similarity to the human LIG4 in several Archaea genera (e.g., Halorhabdus, Haloarcula, Natronomonas, Haloferax, Halorubrum, and Haloquadratum). The presence of BRCT domain in all 3 superkingdoms suggests that vertical descent of this domain from the last common ancestor, or cenancestor, could explain its phyletic distribution. In addition, 2 “transition points,” the eukaryotic/prokaryotic division and the emergence of the metazoans, can be identified (Fig. 2).
Interestingly, SH2 domains are present almost exclusively in metazoans, suggesting that their origin is close to the emergence of multicellularity. 25 Although the tandem BRCT domain clearly has roots in single-cell organisms, multicellularity requires a more complex and tightly controlled protection of DNA integrity, as a genetic insult to a single cell can be detrimental to the whole organism. As organisms expand in complexity, so too does the number of tandem BRCT domain–containing proteins. This could be attributed to the cell’s penchant for highly redundant systems that provide added insurance of successful completion of vital cellular processes such as chromatin regulation, DNA damage repair, and cell cycle check points.
It is likely that the first occurrence of the tandem BRCT domain came from the duplication of a single domain, either by genetic duplication or by domain shuffling between 2 different genes. Hints of a possible origin of tandem BRCT domains can be extrapolated from the existence of an apparent third, less defined, subclass of BRCT domains found in RFC1, PARP1, and NAD+-dependent bacterial DNA ligases. These BRCTs are defined by replacement of the highly conserved tryptophan residue found in helix α3 as well as several other specific structural alterations. 44 Most striking is that some of the individual domains within this subclass display a capacity to bind DNA, with binding mediated by the phosphate binding pocket of the BRCT domain and an N-terminally located helix outside the BRCT domain. 44-46 Compellingly, in the case of RFC1 binding to the 5′-phosphate of DNA, there is excellent conservation of both the 3-dimensional structure and the chemical nature of the phosphate binding site found in other BRCT domains. Furthermore, there is conservation between the DNA binding residues of RFC1 and bacterial NAD+-dependent ligases, and mutation of these residues in the ligases severely affects their ability to bind DNA, suggesting conservation in the mode of DNA binding. 44,46 These studies suggest that the BRCT domain could have originated as a DNA binding motif through recognition of the 5′-phosphate moiety and that binding to phosphorylated peptides may have developed later.
To probe further into this problem, we generated a tree comparing BRCTs (not the full proteins) from 1) the E. coli ligase, 2) all Archaea BRCTs (all of them from DNA ligases), and 3) all 24 BRCT-containing human proteins (Fig. 3). Surprisingly, the RFC1 BRCT clustered with the Archaea BRCTs, followed by DBF4 and PARP1. These results strengthen the notion raised in the previous paragraph, although a more comprehensive and exhaustive analysis is needed to fully investigate these issues. Importantly, the presence of the BRCT domain in Archaea does not necessarily exclude the possibility of horizontal gene transfer as an evolutionary process that could have contributed to the expansion of this domain.

Figure 3. RFC1 BRCT clusters with Archaea BRCTs. RFC1 BRCT is indicated by an arrow. Blue, yellow, and red lines indicate major branches of BRCT domains. The tree compares single BRCT units from the E. coli ligase, all BRCTs from Archaea (all derived from DNA ligases), and BRCTs from all 24 human BRCT proteins. BRCTs were aligned with PRALINE. For bootstrap analysis, 1,000 replicates were created using the partial jackknife method, in which 1% of positions were randomly omitted (SEQBOOT, PHYLIP package). Distance matrix calculations were performed using the PMB matrix (BLOCKS conserved domain-based matrix) and the software PROTDIST (PHYLIP package). Trees were calculated for each replicate using E. coli as a root and the software NEIGHBOR. The consensus tree and percentage support were calculated using software consensus (PHYLIP package), visualized with TREEVIEW, and edited with GIMP to include percentage support (numbers included at branching points). Numbers appended to protein names identify BRCT domains. For example, PAXIP1 36 represents the third BRCT repeat out of 6 BRCTs contained in PAXIP1. Archaea BRCTs are identified by protein gi numbers.
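The resampling step named in the caption can be sketched as follows. This is a minimal illustration of a delete-fraction jackknife over alignment columns, not a reimplementation of SEQBOOT, whose exact sampling scheme and options may differ; the function name, the toy alignment, and the fixed random seed are assumptions made for the example.

```python
import random


def jackknife_replicate(alignment, omit_fraction, rng):
    """Return a copy of the alignment with a random fraction of columns removed.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths)
    omit_fraction: fraction of columns to drop, e.g. 0.01 for 1%
    rng: a random.Random instance, passed in so runs are reproducible
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    n_omit = round(n_cols * omit_fraction)
    # The same columns are dropped from every sequence so the alignment stays in register.
    omitted = set(rng.sample(range(n_cols), n_omit))
    kept = [i for i in range(n_cols) if i not in omitted]
    return {name: "".join(seq[i] for i in kept) for name, seq in alignment.items()}


# Toy 200-column alignment (made-up sequences).
toy_alignment = {"seqA": "ACDE" * 50, "seqB": "ACDF" * 50}
rng = random.Random(0)
replicates = [jackknife_replicate(toy_alignment, 0.01, rng) for _ in range(5)]
print([len(rep["seqA"]) for rep in replicates])
```

Each replicate would then be passed through the distance and tree-building steps, and the fraction of replicates recovering a given branch gives the percentage support shown at the branching points.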
Of all human tandem BRCT proteins, DNA ligase IV (LIG4) contains the earliest identifiable ortholog found in all 3 superkingdoms, although found as a single unit in Archaea and bacteria (Fig. 2). In early prokaryotes and Archaea, where there is less pressure to maintain genomic fidelity and error-prone repair could aid in adaptability of the species, the error-prone NHEJ pathway appears to be the first to utilize the tandem BRCT domain. LIG4 complexes with XRCC4 at sites of double-stranded breaks through recruitment by DNA-PK and KU70/KU86 and completes the final step in repair by ligating the 2 strands of DNA.
Interestingly, in the case of the tandem BRCT domain of LIG4, it is not the domains themselves but rather the linker between the domains that mediates the important interaction with XRCC4. 18 Indeed, this mode of interaction is conserved in other tandem BRCT domains, as in the phospho-independent binding of the TP53BP1 tandem BRCT to p53, which utilizes the linker region between the BRCT domains for binding. 30,32 However, the linker region does not always directly bind to the interaction partner. Indeed, each of the BRCT domains found in the tandems of both BRCA1 and MDC1 confers sequence binding specificity for phosphopeptides. 38 Despite their conservation in structure and composition, it is clear that their scaffolding properties are utilized in a diverse fashion within the BRCT family. It is also important to note that BRCT domains have been identified in plants 47 and that a BRCT-related fold has been observed in the T antigen helicase domain of polyoma viruses. 48
This apparent plasticity in binding functionality is likely to be a driving force in the utilization of these domains as adaptor proteins in the DNA damage response. Furthermore, it is tempting to speculate that combination of the most ancient singleton BRCT domains with other non-BRCT regions (or with other duplicated BRCT units) led to the emergence of new binding interfaces not present in the single domains alone that could confer binding selectivity and specificity. This ability to specifically bind new proteins may have provided the basis for the expansion of this motif as a tandem domain in a single protein.
Perhaps more perplexing is how these conserved domains, with their diverse modes of mediating interactions, do not appear to have branched out into other cellular processes, especially those that also involve serine and threonine kinase signaling. The use of the BRCT domain most removed from DNA damage has been described in ECT2 control of cleavage furrow formation. 49 However, since this process involves coordination of cytokinesis with chromosome segregation, it is evident that this BRCT domain still functions in maintaining genome fidelity. Compared with the FHA and 14-3-3 scaffolding domains involved in the DDR, which participate in a number of diverse cellular pathways, BRCT domains have been relatively confined to the DDR. This could possibly be explained by evolutionary determinants that prevent their utilization in other processes or by constraints on recognizing consensus regions that are preferred by kinases involved in the DDR.
Conclusion
Recently, approaches to understand the origin and evolution of cellular signal transduction mechanisms have yielded important insights when “writers” (kinases), “erasers” (phosphatases), and “readers” (modular domains) are analyzed as a 3-component signaling system. 9,50-52 These approaches have focused on a well-characterized system involving protein tyrosine kinases, tyrosine phosphatases, and SH2 domain–containing proteins that has a prominent role in growth factor receptor signaling. However, only in the last 15 years has a clearer picture of the DNA damage response network begun to be delineated. 53 Presently, we have accumulated significant knowledge on the kinases involved in signaling proximal to DNA damage sites such as ATM, ATR, and DNA-PK, 54-56 and the main effector kinases CHK1 and CHK2, 57,58 but our knowledge of other kinases and phosphatases involved in DNA damage signaling is still rudimentary. 59
Just like the Roman Praetorian Guard, BRCT domains function as elite bodyguards keeping close watch over the integrity of the emperor (the cell’s DNA). Also, just like the Praetorian Guard, which eventually had its role expanded from the battlefield to politics, BRCT domains seem to have evolved from participating in DNA repair to also negotiating cell cycle check points. Here, we presented a rough analysis of the occurrence of BRCT domains in diverse lineages. This analysis suggests that the RFC1 BRCT might be the functional prototype of tandem BRCT domains. However, this analysis also highlights several limitations in our understanding and suggests several questions that need to be answered to illuminate the role and evolution of the DNA damage response.
Footnotes
Acknowledgements
The authors thank Hanna Silva Condelo for help with data mining, Yun Liu for help with the figures, and J.N. Mark Glover and Rachel Karchin for helpful suggestions. This article is dedicated to the memory of Saburo Hanafusa.
The author(s) declared no potential conflicts of interest with respect to the authorship and/or publication of this article.
This work was supported by the National Institutes of Health [grant nos. CA148112, CA116167, and CA11809] and by the US Army Medical Research and Materiel Command, National Functional Genomics Center project [grant no. W81XWH-08-2-0101]. The opinions, interpretations, conclusions, and recommendations are those of the author(s) and are not necessarily endorsed by the US Army. N.T.W. is supported by a Florida Breast Cancer Foundation fellowship.
