A Pilot Screen of a Novel Peptide Hormone Library Identified Candidate GPR83 Ligands

Abstract

The identification of novel peptide hormones by functional screening is challenging because posttranslational processing is frequently required to generate biologically active hormones from inactive precursors. We developed an approach for functional screening of novel potential hormones by expressing them in endocrine host cells competent for posttranslational processing. Candidate preprohormones were selected by bioinformatics analysis, and stable endocrine host cell lines were engineered to express the preprohormones. The production of mature hormones was demonstrated by including the preprohormones insulin and glucagon, which require the regulated secretory pathway for production of the active forms. As proof of concept, we screened a set of G-protein-coupled receptors (GPCRs) and identified protein FAM237A as a specific activator of GPR83, a GPCR implicated in central nervous system and regulatory T-cell function. We identified the active form of FAM237A as a C-terminally cleaved, amidated 9 kDa secreted protein. The related protein FAM237B, which is 64% homologous to FAM237A, demonstrated similar posttranslational modification and activation of GPR83, albeit with reduced potency. These results demonstrate that our approach is capable of identifying and characterizing novel hormones that require processing for activity.

Keywords

hormone screening GPCR GPR83 FAM237A FAM237B

Introduction

There are more than 80 known human peptide hormones that mediate a wide variety of physiological processes, including reproduction, appetite, metabolism, growth, behavior, cardiovascular function, and electrolyte balance.^1,2 The diversity and importance of hormone function in health and disease have prompted searches for new peptides that may offer novel insights into homeostatic mechanisms and novel targets for therapeutics.

The first peptide hormones were discovered more than 75 years ago by arduous biochemical strategies involving stepwise purification from large amounts of tissues, guided by activity in cell-based assays or animal models. The biologic activity of hormones often requires extensive posttranslational processing from inactive precursor proteins. One critical modification is the sequential cleavage of precursors in a sequence- and tissue-specific manner by a family of endoproteases known as the prohormone convertases.³ A well-known example is preproopiomelanocortin,⁴ which contains as many as eight cleavage sites that are used by convertases to generate at least 10 different biologically active peptides, depending on the tissue of origin. In addition to proteolytic cleavage, hormones may also require a variety of other posttranslational modifications for activity, including N-terminal acetylation, C-terminal amidation, formation of N-terminal pyroglutamyl residues (pyrrolation), tyrosine sulfation, phosphorylation, glycosylation, disulfide bond formation, and lipidation.⁴ Methods for predicting cleavage and modification sites from sequence information have been described and used successfully to identify novel peptide hormones,^5–7 but the challenge of producing the novel hormones in the correctly processed form at scale means that characterization is limited to one or a small number of candidate hormones.

Our approach to the challenge of assessing predicted candidate hormones at scale was to develop a process to discover novel hormones from a library of predicted preprohormone genes that were identified using an algorithm. Host cell lines competent for regulated secretion were engineered to express the candidate peptide hormones, which were then screened for function in cell-based assays along with known controls. As proof of concept for our approach, we screened the novel hormone library for the ability to activate a set of G-protein-coupled receptors (GPCRs), which were selected by giving preference to orphan receptors as determined by IUPHAR,⁸ because the receptors for many known hormones are GPCRs.⁹ We successfully identified and characterized two novel, related peptides as potential activating ligands for GPR83.

Materials and Methods

Identification of Candidate Novel Preprohormone Genes

A supervised classifier based on Random Forests of Decision Trees¹⁰ was used to identify potential prohormone convertase substrates. Two training sets of 350 different sequences each were constructed, consisting of 8-mer amino acid sequences derived from unique human proteins from the UniProt database.¹¹ Each 8mer contained centrally positioned residues known to be cleaved by convertases (GRR, KK, KR, RR, RXKR, RXRR, GKR, GKR, and/or RXXR, where X denotes any residue). The positive control set contained sequences that are known to be cleaved, while the negative control set contained sequences that have the characteristic residues but are known not to be cleaved. Associated with each 8-mer were 20 different attributes derived from the following: two Hidden Markov models representing 108 dibasic sites cleaved by the convertases PC1/3 and PC2; nine Hidden Markov models representing each of the characteristic cleavage sites listed above; residue statistics on the four residues immediately surrounding the cleavage sites; and secondary structure analysis of the full-length protein in the region around the cleavage site. The test set consisted of a random selection of 10% of the sequences removed from the two training sets. This out-of-sample set was used to test the overall performance of the classifier, and the algorithm was modified as necessary. The process was repeated until there was an insignificant change in the classification error rate. All sequences were then classified with an overall score ranging from 0 to 1, where 0 is not cleaved and 1 is cleaved. Based on the out-of-sample test set, a baseline score of 0.4 was set and all genes with a score of ≥0.4 and not previously characterized as preprohormones were chosen for further analysis. 
We identified 56 genes from this set that were additionally conserved across species and co-expressed in tissues with the prohormone processing enzymes carboxypeptidase E, PC1/3, or PC2, as determined using the GTEx portal (https://www.gtexportal.org/home/). The complete set of 56 candidate preprohormone genes identified, along with the corresponding cloned nucleotide sequences, is shown in Supplemental Table S1.
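The candidate-selection steps above can be illustrated with a minimal sketch. It is not the authors' classifier: it only shows how 8-mer windows around dibasic sites might be extracted, how a random 10% out-of-sample set can be held back, and how the 0.4 score cutoff is applied. The function names, the 3 + 2 + 3 window layout, and the fixed random seed are assumptions made for illustration, and only the dibasic motifs (KK, KR, RR) are scanned.

```python
import random

# Dibasic motifs named in the Methods; the longer motifs (RXKR, RXRR, RXXR, ...)
# are left out of this sketch for brevity.
DIBASIC = ("KK", "KR", "RR")


def dibasic_windows(seq, flank=3):
    """Return (index, 8-mer) pairs with a dibasic pair in the centre of each window.

    The 3 + 2 + 3 window layout is an assumption, not the authors' stated layout.
    """
    hits = []
    for i in range(flank, len(seq) - flank - 1):
        if seq[i:i + 2] in DIBASIC:
            hits.append((i, seq[i - flank:i + 2 + flank]))
    return hits


def hold_out(items, fraction=0.10, seed=0):
    """Randomly remove `fraction` of the items as an out-of-sample test set."""
    rng = random.Random(seed)
    n_test = round(len(items) * fraction)
    test_idx = set(rng.sample(range(len(items)), n_test))
    train = [x for i, x in enumerate(items) if i not in test_idx]
    test = [x for i, x in enumerate(items) if i in test_idx]
    return train, test


def select_candidates(scores, cutoff=0.4):
    """Keep genes whose classifier score is at or above the cutoff."""
    return sorted(gene for gene, score in scores.items() if score >= cutoff)
```

With a training set of 350 sequences, `hold_out` sets aside 35 sequences for testing and leaves 315 for training; the attribute extraction and the Random Forest itself are outside the scope of this sketch.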

Table 1.

Candidate Preprohormone Genes Cloned into STC-1 and AtT20-PC2 Comprising the Peptide Hormone Library.

Gene Symbol^a	Protein Name^a	Protein Type^b	Counterstructure(s)^c	Example References^d
C10orf25	Uncharacterized protein C10orf25	Uncharacterized	No
C11orf94	Uncharacterized protein C11orf94	Uncharacterized	No
C12orf73	Uncharacterized protein C12orf73	Uncharacterized	No
C15orf61	Uncharacterized protein C15orf61	Uncharacterized	No
C17orf67	Uncharacterized protein C17orf67	Uncharacterized	No
C17orf77	Uncharacterized protein C17orf77	Uncharacterized	No
C2orf72	Uncharacterized protein C2orf72	Uncharacterized	No
C5orf38	Protein CEI	Uncharacterized	No
C5orf64	Uncharacterized protein C5orf64	Uncharacterized	No
C6orf226	Uncharacterized protein C6orf226	Uncharacterized	No
C7orf69	Uncharacterized protein C7orf69	Uncharacterized	No
C8orf82	UPF0598 protein C8orf82	Uncharacterized	No
CSAG1	Putative chondrosarcoma-associated gene 1 protein	Uncharacterized	No
EXOC3-AS1	Uncharacterized protein EXOC3-AS1	Uncharacterized	No
FAM180A	Protein FAM180A	Uncharacterized	No
FAM237A	Family with sequence similarity 237 member A	Uncharacterized	No
LOC100131496	Uncharacterized LOC100131496	Uncharacterized	No
METTL24	Methyltransferase-like protein 24	Uncharacterized	No
MSANTD1	Myb/SANT DNA binding domain containing 1	Uncharacterized	No
MYRFL	Myelin regulatory factor-like protein	Uncharacterized	No
RTL8A	Retrotransposon Gag-like protein 8A	Uncharacterized	No
SMIM13	Small integral membrane protein 13	Uncharacterized	No
TEPP	Testis, prostate, and placenta-expressed protein	Uncharacterized	No
THEM6	Protein THEM6	Uncharacterized	No
TMEM178B	Transmembrane protein 178B	Uncharacterized	No
ALKAL1	ALK and LTK ligand 1	Secreted	LTK and ALK	31
ALKAL2	ALK and LTK ligand 2	Secreted	LTK and ALK	31
ANGPTL8	Angiopoietin-like protein 8	Secreted	ANGPTL3	32
ECRG4	ECRG4 augurin precursor	Secreted	Multiple	33, 34
OSTN	Osteocrin	Secreted	NPR-C	35
C1QTNF12	Adipolin	Secreted	No	36
CCDC3	Coiled-coil domain containing 3	Secreted	No	37
KRTDAP	Keratinocyte differentiation-associated protein	Secreted	No	38
METRN	Meteorin, glial cell differentiation regulator	Secreted	No	39
MZB1	Marginal zone B and B1 cell-specific protein	Secreted	No	40
ODAPH	Odontogenesis-associated phosphoprotein	Secreted	No	41
PLAC9	Placenta-specific protein 9	Secreted	No	42
PRXL2A	Peroxiredoxin-like 2A	Secreted	No	43
SPACA7	Sperm acrosome-associated protein 7	Secreted	No	44
UCMA	Unique cartilage matrix-associated protein	Secreted	No	45
NF1P4	Neurofibromin 1 pseudogene 4	Other (pseudogene)	No
PYY3	Peptide YY, 3	Other (pseudogene)	No
LOC100132798	Similar to hCG2042756	Other (NCBI removed)	No
LOC100287141	Hypothetical protein	Other (NCBI removed)	No
LOC100293085	Hypothetical protein	Other (NCBI removed)	No
LOC100293335	Hypothetical protein	Other (NCBI removed)	No
DEPP1	Protein DEPP1	Intracellular	No	46
NUPR2	Nuclear protein 2	Intracellular	NUPR1	47
SDHAF4	Succinate dehydrogenase complex assembly factor 4	Intracellular	SDH1	48

^a Gene symbol and protein names as designated in https://www.uniprot.org/.¹¹

^b “Secreted” means that there is published experimental evidence supporting secretion of the protein; “uncharacterized” means that the expression and secretion of the predicted protein have not been experimentally verified.

^c Receptors or binding partners for which experimental evidence exists.

^d References that provide key support for protein secretion or binding partners.

Endocrine Cell Line Sourcing and Culture

AtT20 mouse pituitary cells stably transfected with prohormone convertase 2 (AtT20-PC2) were generously provided by Dr. Richard Mains (University of Connecticut). STC-1 mouse intestinal cells were purchased from ATCC (CRL-3254). HEK293-6E (293-6E) cells (licensed from National Research Council Canada),¹² which lack expression of the prohormone convertases PC1/3 and PC2, were used for the expression of proteins through the constitutive secretion pathway as controls. AtT20-PC2, STC-1, and 293-6E cells were propagated in Dulbecco’s modified Eagle’s medium (DMEM) containing 10% fetal bovine serum (FBS; Corning, Corning, NY) and 100 U/mL penicillin and 100 μg/mL streptomycin (Thermo Fisher, Waltham, MA).

Generation of Preprohormone-Expressing Cell Library

DNA sequences encoding each of the 56 full-length candidate human preprohormones and the positive controls preproglucagon, preproinsulin, and preprochemerin in the PiggyBac expression vector (Lonza, Basel, Switzerland) were electroporated with PBase transposase vector at a 1:1 ratio into AtT20-PC2 and STC-1 cells using the appropriate Lonza Nucleofector kit, following the manufacturer’s protocol. After 48 h, puromycin selection was added (2 µg/mL for AtT20-PC2 and 10 µg/mL for STC-1). Stable cell lines were successfully generated for 49 of the 56 candidate preprohormone genes, but the remaining 7 genes could not be stably transfected into either cell line (Suppl. Table S1). Each stable cell line was passaged three to five times, expanded to 100–200 million cells, and then frozen at 10 million cells/mL in 90% FBS (Corning) and 10% DMSO (ATCC, Manassas, VA). For screening stocks, 100 µL aliquots of the cell lines were frozen in screw-capped 2D-barcoded tubes (Thermo) and rearrayed into 96-well racks that contain one tube for each candidate preprohormone in the library. Preproinsulin-, preproglucagon-, and preprochemerin-expressing and PiggyBac empty vector-transfected cells were included in each screening rack as controls. The expression of each human gene was verified by qPCR with gene-specific primers and SYBR Green reagent (Qiagen, Hilden, Germany, QuantiTect SYBR Green PCR Kit 204145) following the manufacturer’s recommended protocol. The cDNA pool from each cell line was generated with the QuantiTect Reverse Transcription Kit (Qiagen 205311).

Generation of Conditioned Media from Candidate Preprohormone-Expressing Cells for Screening

AtT20-PC2 and STC-1 cells expressing each of the candidate preprohormones and the controls were thawed, seeded in tissue culture-treated 96-well plates, and passaged every 3–4 days. Cell densities were determined by measurement of supernatant fluorescence at 560 nm excitation and 590 nm emission after a 3 h incubation with Alamar Blue dye (Thermo Fisher DAL1025) and were normalized across cell lines by adjusting culture seeding volumes accordingly during passaging. Normalization for at least three passages was necessary to maximize regulated secretion of hormones for assays, as assessed using positive controls. To induce hormone secretion, the cells were grown to near confluence, and on the day of the assay the cell medium was changed to Opti-MEM (Thermo Fisher 31985070) or DMEM containing 0.5% bovine serum albumin (BSA; Thermo Fisher 15260037), 100 U/mL penicillin, 100 μg/mL streptomycin, and 5 mM N⁶,2′-O-dibutyryladenosine 3′,5′-cyclic monophosphate (dibutyryl cAMP, Sigma, St. Louis, MO, D0260) for 3 h at 37 °C. The supernatants were then collected, centrifuged, transferred into a 96-well deep well microplate (VWR, Radnor, PA, 75870-796), and immediately assayed for GPCR activity as described in “GPCR Activity Assays and Screening.” The candidate preproproteins were also transiently expressed in 293-6E cells and 48 h conditioned media collected for assays as described previously.¹³

GPCR Activity Assays and Screening

GPCRs that have a higher probability of binding to peptide or protein ligands were identified by constructing phylogenetic trees using multiple sequence alignments of selected regions of 816 GPCR-encoding genes from UniProt¹¹ and identifying the GPCRs that lack a confirmed ligand and cluster together with GPCRs that have known peptide ligands. The trees were constructed based on the sequences of domains that may influence ligand specificity, including transmembrane domains, exposed loops, cytoplasmic loops, N-terminal sequences prior to the first transmembrane domain, and C-terminal sequences that follow the seventh transmembrane domain. Seventy-three GPCRs that clustered together with GPCRs known to have protein ligands within one or more of these analyses were prioritized for screening. Eurofins (Luxembourg) DiscoveRx PathHunter eXpress cell lines for 31 of these receptors were available for screening. The complete set of PathHunter eXpress cell lines screened for activity or used as positive controls is listed in Supplemental Table S2.

The DiscoveRx PathHunter eXpress assay protocol was modified to adjust cell seeding density, media, reagent, and treatment volumes for screening. The modified reagent and treatment volumes tested were based in part on recommendations from the DiscoveRx technical support team. Optimization was carried out using the GCGR assay (DiscoveRx 93-0241E2), purified glucagon (Sigma-Aldrich G3157-2MG), and glucagon secreted from AtT20-PC2 and STC-1 cell lines or 293-6E as described in the “Generation of Conditioned Media” method section above. The extra cell volume required for automated liquid handling necessitated seeding a reduced cell density compared with the vendor protocol (1740 cells per well vs the recommended 2000 cells per well in a 384-well microplate). In a side-by-side test, the reduced cell density gave a dose-dependent response with similar EC₅₀, fold induction, and noise compared with the recommended cell density, albeit with a reduced absolute signal and background (data not shown). The conditioned media from the three cell types (AtT20-PC2, STC-1, and 293-6E) were confirmed to be compatible with the PathHunter eXpress assay by testing a glucagon dose–response in each and showing acceptable EC₅₀, fold induction, and noise compared with the recommended assay media (data not shown).

For each GPCR a single-use vial of the corresponding DiscoveRx PathHunter eXpress cell line was thawed and diluted in the DiscoveRx PathHunter Cell Plating Reagent (93-0563R1B) recommended for each cell line to 8.7 × 10⁴ cells/mL. The resuspended cells were pipetted into a sterile 384-individual-well reservoir with pyramid bottom (Thomas Scientific, Swedesboro, NJ, 1145M87), and 20 µL aliquots were seeded in white opaque 384-well plates (Corning 3570) by a Bravo Automated Liquid Handling Platform (Agilent, Santa Clara, CA). The plates were manually sealed with Breathe-Easy adhesive (Sigma-Aldrich Z380059), and after an overnight incubation at 37 °C in 5% CO₂, 20 µL of each hormone-conditioned medium was applied to duplicate wells using the automated liquid handler. After a 90 min incubation at 37 °C, 5% CO₂, 20 µL of freshly prepared detection reagent (DiscoveRx 93-0001L) was added per well. The plates were incubated at room temperature in the dark for a further 90 min, and luminescence was read on an Envision 2103 with 0.2 s integration. For a hormone-expressing cell line to be considered a hit, its corresponding conditioned supernatant had to increase the assay signal ≥3 sigma above background in both duplicate wells. The median background signal and standard deviation were calculated for each cell type (293-6E, AtT20, or STC-1) separately using all wells derived from the conditioned media plate. Up to 12 orphan GPCR DiscoveRx PathHunter eXpress cell lines were tested in a single round of screening, along with one or more of the following as positive controls: the glucagon receptor GCGR (DiscoveRx 93-0241E2) and the chemerin receptors GPR1 (DiscoveRx 93-0335E2) and CMKLR1 (DiscoveRx 93-0313E2) (Suppl. Table S2).
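The seeding arithmetic and the hit criterion described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the screen: the function names are hypothetical, the sample standard deviation is assumed (the text does not say which estimator was used), and the input is assumed to hold the duplicate-well signals for a single host cell type.

```python
from statistics import median, stdev


def cells_per_well(density_per_ml, volume_ul):
    """Cells delivered to a well: density (cells/mL) x volume (uL) / 1000."""
    return density_per_ml * volume_ul / 1000


def call_hits(wells, n_sigma=3.0):
    """Flag samples whose signal clears the threshold in BOTH duplicate wells.

    wells: {sample_name: (replicate_1, replicate_2)} for one host cell type.
    Background is the median of all wells on the plate; sigma is the sample
    standard deviation of those same wells (an assumption of this sketch).
    """
    signals = [value for pair in wells.values() for value in pair]
    threshold = median(signals) + n_sigma * stdev(signals)
    hits = sorted(name for name, pair in wells.items() if min(pair) >= threshold)
    return threshold, hits
```

For example, `cells_per_well(8.7e4, 20)` gives the 1740 cells per well quoted in the assay optimization. Using the median for background keeps a small number of strong activators from shifting the baseline, although they still inflate the standard deviation.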

Insulin and Glucagon Assays

Mature insulin was measured by enzyme-linked immunosorbent assay (ELISA; Mercodia, Uppsala, Sweden, 10-1113-01), following the manufacturer’s protocol. Glucagon and proglucagon protein levels were measured by ELISA (Bio-Techne R&D Systems, Minneapolis, MN, DY1249), following the manufacturer’s protocol. Insulin activity was measured in the rat hepatoma cell line H4IIE (ATCC CRL1548) grown to passage 13 in Eagle’s minimum essential medium (EMEM) with 10% fetal bovine serum. H4IIE cells were starved overnight in EMEM/0.1% BSA and then treated for 10 min at 37 °C, 5% CO₂, with supernatants from AtT20-PC2, STC-1, and 293T cells expressing preproinsulin. The supernatants were diluted in a 1:1 ratio with starvation media prior to treatment, and supernatants from cells expressing vector only with and without exogenous recombinant insulin were used as controls. After treatment, cells were washed with phosphate-buffered saline (PBS), lysed, and phospho-Akt measured according to the R&D Systems ELISA protocol (cat. DYC887). Glucagon activity was measured using the DiscoveRx PathHunter eXpress cells expressing glucagon receptor, as described in “GPCR Activity Assays and Screening” and seeding 2000 cells per well.

Purification of Active FAM237A and FAM237B from AtT20-PC2 Cells

FAM237A-expressing and PiggyBac empty vector-transfected (control) AtT20-PC2 cells were grown in 15 cm plates, and secretion was stimulated by changing the medium to DMEM containing 0.05% BSA, penicillin/streptomycin, and 2 mM barium chloride (Sigma 529591) and incubating at 37 °C for 3 h. FAM237A-expressing and control supernatants were subjected to parallel, identical purification protocols, monitored at each step with the DiscoveRx PathHunter eXpress GPR83 activity assay using the protocol described in “GPCR Activity Assays and Screening” (DiscoveRx 93-0441E2A) and by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE) on a 4%–12% Bis-Tris gel (Bio-Rad, Hercules, CA) in MES running buffer, stained with Sypro Ruby (Thermo Fisher S12000). The supernatants were collected, centrifuged, diluted with 2 volumes of 50 mM Tris (pH 9) and 1 M urea, run over Q HP anion-exchange columns (GE) in 20 mM Tris (pH 9.0) and 1 M urea, and eluted with a linear gradient from 50 mM to 1 M NaCl. This and all subsequent purification steps were performed on an AKTA Purifier (GE, Boston, MA). Aliquots of fractions were dialyzed against 20 mM Tris (pH 8.0) and 150 mM NaCl, and active fractions were pooled and acidified with 1% trifluoroacetic acid (TFA; Sigma T6508). The corresponding fractions from the control supernatants were also pooled and acidified. The pooled samples were run over a 10 × 250 mm C18 column (Higgins Analytical Proto300) in 0.1% TFA, eluting with a linear gradient from 10% to 80% acetonitrile. Aliquots of the fractions were lyophilized and resuspended in 20 mM Tris (pH 8.0) and 150 mM NaCl. Active fractions were pooled and diluted with 0.1% TFA to approximately 10% acetonitrile and then run over a 4.6 × 100 mm C4 column (Vydac 214TP104) and eluted over a linear gradient from 10% to 80% acetonitrile. Active fractions were pooled and lyophilized. FAM237B was purified by the same protocol.

Characterization of AtT20-PC2-Derived FAM237A and FAM237B

For DTT reduction, purified FAM237A protein or control fractions from AtT20-PC2 cells were incubated for 30 min at room temperature with or without 5 mM dithiothreitol (DTT; Sigma). A 25 mM concentration of iodoacetamide (IAA; Sigma) was added to both samples and incubated at room temperature for another 30 min. All of the samples were dialyzed into 20 mM Tris (pH 8.0) and 150 mM NaCl prior to testing in the DiscoveRx PathHunter eXpress GPR83 activity assay. The intact mass of purified FAM237A protein from AtT20-PC2 cells was determined by electrospray ionization mass spectrometry at the University of California, San Francisco Sandler-Moore Mass Spectrometry Core Facility. For N-terminal sequencing, the FAM237A and FAM237B proteins were reduced, run on an SDS-PAGE gel, and transferred to Sequi-Blot PVDF (Bio-Rad 1620184). The relevant bands were cut out and processed by the Tufts University Core Facility for sequencing.

Expression and Purification of PAM

Peptidylglycine α-amidating monooxygenase (PAM) amino acids 1–864 were expressed with a C-terminal human IgG1 Fc tag (PAM-Fc) from the pTT5 vector. CHO-3E7¹⁴ cells were grown in CD DG44 medium (Thermo Fisher) supplemented with 8 mM L-glutamine and 0.18% poloxamer 188 (Corning) and transiently transfected with polyethylenimine (PEI; Polyplus, New York, NY):PAM-Fc DNA at a 5:1 ratio. Twenty-four hours after transfection, the culture was fed with 1% tryptone N1 (Organotechnie, La Courneuve, France) and then the supernatant was harvested 5–6 days later by centrifugation. The supernatant was filtered and then run over a HiTrap Protein A HP column (GE) in 500 mM NaCl, eluting on a linear gradient from 10 mM phosphate (pH 7.4) to 100 mM glycine (pH 2.7). Fractions were analyzed by SDS-PAGE, pooled, dialyzed into 25 mM Tris (pH 7.5) and 150 mM NaCl, aliquoted, and frozen.

Generation of Active FAM237A and FAM237B from E. coli

The nucleotide sequence of FAM237A corresponding to amino acid residues 34–114, or FAM237B amino acid residues 25–113, with a start codon added, was cloned into the pET-24a vector and then transformed into Rosetta E. coli cells (Sigma 70954) following the manufacturer’s protocol. A colony was grown in Terrific Broth with shaking at 37 °C to an OD600 of approximately 0.5, and then 1 mM isopropyl β-d-1-thiogalactopyranoside (IPTG; Sigma) was added. The cells were grown shaking at 37 °C for 4 more hours, and then the pellets were harvested by centrifugation and frozen. The pellets were thawed the next day, resuspended in PBS, and then lysed by microfluidization. The lysate was centrifuged and the pelleted inclusion bodies and cell debris were collected. This pellet was resolubilized in 8 M urea and 5 mM DTT for 1 h rotating at room temperature and then centrifuged. The solubilized protein was collected, filtered, and then run over a Q HP column (GE) in 10 mM Tris (pH 8), 7 M urea, and 1 mM DTT, eluting over a linear gradient from 0 to 1 M NaCl. The fractions were analyzed by SDS-PAGE and then pooled and diluted seven- to eightfold in 100 mM Tris (pH 8.0) with 1 M urea, 100 mM arginine, 1 mM EDTA, 1 mM reduced glutathione, and 0.5 mM oxidized glutathione and stirred at 4 °C for 2 days to refold. The refolded material was centrifuged and filtered and the soluble fraction was loaded onto a Source 15RPC column (GE) in 20 mM ammonium bicarbonate (pH 8.0) and eluted with a linear gradient from 10% to 70% acetonitrile. Fractions containing FAM237A (or FAM237B) by SDS-PAGE were pooled and lyophilized together. To amidate the expressed hormone, we adapted a protocol from Bio-Techne R&D Systems (4837-AM). Briefly, pooled material was resuspended in 50 mM Tris (pH 7.0) and mixed with a dilution series of PAM-Fc in 50 mM Tris (pH 7.0) or buffer alone as a negative control. 
A 0.1 μM concentration of copper chloride, 5 mM ascorbic acid, and a 1/10 dilution of catalase (Sigma) were added and the reaction rotated overnight at room temperature. The next day the amidation reaction was quenched by adding 1 mM EDTA and 100 mM Tris (pH 9), loaded onto a Source 15RPC column, and eluted as above. Aliquots of fractions were lyophilized and analyzed by SDS-PAGE and the DiscoveRx PathHunter eXpress GPR83 activity assay. This E. coli-produced protein was compared with the protein purified from AtT20-PC2 cells on a 2.1 × 250 mm Zorbax C18 column on an Agilent HPLC in 0.1% TFA, running a 40 min gradient from 40% to 60% acetonitrile.

IP₁ Accumulation Assay

293-6E cells were transiently transfected with the nucleotide sequence corresponding to the full-length open reading frame for GPR83 (amino acid residues 1–423) in the PiggyBac vector (or empty vector as a control) by mixing 1:1 with the PBase transposase vector using a 2:1 PEI:DNA ratio. The transfected 293-6E cells were cultured shaking in FreeStyle 293 medium (Thermo Fisher) for 24 h, fed with 0.5% tryptone N1 (Organotechnie), cultured for another 24 h, and then seeded in suspension at 40,000 cells in 7 µL of IP₁ stimulation buffer (IP-One HTRF assay, Cisbio, Codolet, France, 62IPAPEB) per well in an opaque low-volume, 384-well plate (Greiner, Monroe, NC, 784075). FAM237A, FAM237B, and control fractions generated from E. coli not expressing the ligands were diluted in IP₁ stimulation buffer at twice the final concentration in the assay, and 7 µL was added to the cells immediately after seeding. The plate was sealed and incubated for 60 min at 37 °C, 5% CO₂, and IP₁ levels were measured versus an IP₁ standard curve according to the manufacturer’s instructions.

Results

Identification of Candidate Novel Preprohormone Genes

We developed a method based on the Random Forest of Decision Trees algorithm¹⁰ to identify genes that may encode novel prohormone convertase substrates. Random Forests is a powerful method for distinguishing classes of individuals based on knowledge of a finite number of individual attributes. The method involves supervised learning with test, training, and unknown datasets containing entries corresponding to the same list of attributes. A significant advantage of this method is that Random Forests are robust to noise and errors. The method as described in “Identification of Candidate Novel Preprohormone Genes,” Materials and Methods, correctly identified all 88 known convertase substrates. This analysis was combined with requirements that the amino acid sequence of the candidate preprohormones be conserved across species and the corresponding genes be co-expressed with the prohormone convertases. Fifty-six candidate preprohormone genes meeting all of these criteria were identified, and 49 were successfully used to construct the screening library as described below (Table 1). Vectors encoding the predicted open reading frames for the remaining seven candidate genes failed to generate stable transfectants (Suppl. Table S1).

Most of the original 56 candidate novel preprohormone genes still have limited or no publications supporting function, and only 8 have five or more gene-specific references in PubMed as of December 2019. Of the 49 genes within the screening library, 40 encode proteins that are either functionally uncharacterized (25) or have been shown to be secreted (15). Of the 15 with experimental evidence of secretion, only 5 have reported receptors or binding partners. Of the remaining nine genes, the National Center for Biotechnology Information (NCBI) has removed four and designated two as pseudogenes, although this does not rule out that one or more of these six genes encode functional protein(s). Transcripts from both pseudogenes are detected in the Genotype-Tissue Expression (GTEx) project database (www.gtexportal.org). The final three library genes have been reported to encode proteins localized to intracellular organelles. The gene annotations for the screening library are summarized in Table 1.
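As a quick consistency check on the annotation counts above, the categories reported for Table 1 can be tallied. The dictionary keys are labels chosen for this sketch; the numbers are those stated in the text.

```python
# Annotation categories for the screening library, as counted in the text.
library = {
    "uncharacterized": 25,
    "secreted": 15,
    "ncbi_removed": 4,
    "pseudogene": 2,
    "intracellular": 3,
}

candidates_identified = 56          # genes passing all bioinformatic criteria
secreted_with_partner = 5           # secreted proteins with a reported counterstructure

library_size = sum(library.values())
not_stably_transfected = candidates_identified - library_size

print(f"library size: {library_size}")
print(f"candidates without a stable line: {not_stably_transfected}")
print(f"secreted proteins lacking a reported partner: "
      f"{library['secreted'] - secreted_with_partner}")
```

The totals agree with the 49 stable cell lines and the 7 genes that could not be stably transfected.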

Construction and Validation of Novel Hormone-Expressing Endocrine Host Cells

To ensure that the candidate preprohormones in our library were properly processed and secreted in active form, we stably transfected them into cell lines that are known to contain the regulated secretion pathway and to express active hormones upon stimulation with secretagogues, such as cAMP (Fig. 1A). Because a single host cell type may not correctly process every preprohormone to all possible products, we used two different endocrine host cell lines derived from different tissues to express each gene. This approach increased the possibility that novel peptides would be correctly processed, while mitigating background effects of endogenous hormones expressed by either host cell line. AtT20 cells are derived from a mouse pituitary tumor and endogenously express the prohormone convertase PC1/3 but not PC2. For the library, an AtT20 variant (AtT20-PC2) was used that had been stably transfected with PC2 convertase in order to provide a more complete complement of convertases in a single cell line.¹⁵ The second host cell line, STC-1, is a mouse intestinal neuroendocrine tumor cell line that expresses PC1/3, PC2, cholecystokinin, and secretin. As a control, each candidate preprohormone was also expressed through the constitutive secretion pathway in 293-6E cells, which do not express the convertases.

Figure 1.

The peptide hormone library produces processed, active hormones and was used for GPCR functional screening. (A) Host cell lines AtT20-PC2 and STC-1 were stably transfected to generate cells expressing each candidate preprohormone. The processed hormones remained stored within secretory granules until secretion was induced for 3 h with cAMP, at which point the supernatant with the active hormone was collected and used immediately for assays. (B,C) The process described in A was used to collect supernatants from AtT20-PC2 and STC-1 cells stably transfected with preproinsulin or preproglucagon. 293-6E cells were transiently transfected with the same sequences, and supernatant was collected after 48 h of constitutive secretion. The conditioned supernatant from all three cell types was assayed for (B) insulin and (C) glucagon protein expression (left panels) and biological activity (right panels) as described in “Insulin and Glucagon Assays,” Materials and Methods. AtT20-PC2 and STC-1 cells secreted active, mature insulin protein, but 293-6E did not produce mature hormone by ELISA or activity assay. The glucagon protein secreted by 293-6E cells (C, left panel) had no activity (C, right panel), indicating that it was unprocessed precursor. In contrast, the secreted protein from AtT20-PC2 and STC-1 cells was active. (D) The peptide hormone library was screened for the ability to activate a panel of GPCRs using the PathHunter eXpress cell lines described in “GPCR Activity Assays and Screening,” Materials and Methods. The GPR83 reporter screen results are shown for all peptide hormones produced in STC-1 and AtT20-PC2: each bar represents a well from a 384-well microplate treated with conditioned supernatants from a single STC-1 (light gray) or AtT20-PC2 (black) cell line. Each cell line supernatant was tested in duplicate wells. As indicated by asterisks, GPR83 was significantly activated (≥3 sigma or approximately 1.25-fold above median background) by duplicate FAM237A-conditioned supernatants derived from AtT20-PC2 (*, 2.5- to 3-fold above background) and STC-1 (**, 1.4-fold above background). The dashed horizontal bar indicates the values above which the signal is ≥3 sigma for conditioned media from either AtT20-PC2 or STC-1 cell lines.

To validate that this approach produced properly processed and active hormones, we stably transfected preproinsulin and preproglucagon into AtT20-PC2 and STC-1 cells, collected cAMP-stimulated conditioned media supernatants, and assayed them for expressed protein and for bioactivity as described in “Insulin and Glucagon Assays,” Materials and Methods. Conditioned media supernatants from 293-6E cells expressing preproinsulin and preproglucagon were included in the assay as a negative control, because 293-6E cells lack regulated secretion pathway machinery. 293-6E produced proinsulin protein (data not shown) but no detectable mature insulin protein ( Fig. 1B , left panel) and negligible insulin bioactivity ( Fig. 1B , right panel). In contrast, both AtT20-PC2 and STC-1 cells produced abundant mature, active insulin protein ( Fig. 1B ). Using an ELISA that detects both immature proglucagon and mature glucagon, all 3 glucagon-expressing cell lines produced detectable glucagon protein ( Fig. 1C , left panel), but only AtT20-PC2 and STC-1 cells produced bioactive glucagon ( Fig. 1C , right panel). As another control, we used preprochemerin, the hormone ligand for the GPCRs GPR1 and CMKLR1 that is known to be processed into bioactive form by serine proteases expressed by all three cell lines.¹⁶ In contrast to insulin and glucagon, 293-6E as well as AtT20-PC2 and STC-1 cells produced bioactive chemerin (data not shown). These data demonstrate that the host cell lines used to express the candidate novel preprohormone library were capable of properly processing exogenous preprohormones into mature, active hormones that can be assayed in conditioned supernatants. Although we were unable to confirm that the 49 candidate novel preprohormones were properly processed by the host cells, we validated that each stable cell line expressed preprohormone mRNA by qPCR as described in “Generation of Preprohormone-Expressing Cell Library,” Materials and Methods.

Screening of the Candidate Novel Preprohormone Library for Ligands of Orphan GPCRs

Because the receptors for many hormones are GPCRs, we screened the candidate novel preprohormone library for the ability to activate a set of 31 GPCRs ( Suppl. Table S2 ), giving preference to receptors that lacked a confirmed endogenous ligand at the time of screening. Because only a subset of GPCRs have peptides as ligands, the GPCRs selected for screening were enriched by phylogenetic tree analysis, as described in “GPCR Activity Assays and Screening,” Materials and Methods, for receptors that have sequence similarity to known peptide receptors, and may therefore be more likely to be peptide receptors themselves.

Activation of the GPCRs was measured using the PathHunter eXpress cell-based reporter system, which relies on enzyme fragment complementation. CHO or HEK-293 cells were engineered to express (1) the GPCR of interest tagged at the C-terminus with an inactive fragment of the enzyme β-galactosidase, and (2) β-arrestin tagged with the remaining (also inactive) fragment of β-galactosidase. Upon activation, GPCR binding to β-arrestin allows enzyme complementation to occur between the β-galactosidase fragments, resulting in a chemiluminescent signal in the presence of the detection reagents. Each line was confirmed by the manufacturer to express tagged GPCR and β-arrestin by lysing the cells to force enzyme complementation and detecting significant chemiluminescent signal above parental controls. The 31 GPCRs with DiscoveRx PathHunter eXpress cell lines were assayed for ligand-induced signaling after treatment with duplicate conditioned supernatants from AtT20-PC2, STC-1, and 293-6E cells expressing the 49 candidate preprohormones or controls. Assay optimization and screening are described in “GPCR Activity Assays and Screening,” Materials and Methods.

Supernatants from AtT20-PC2 and STC-1 cells expressing the protein FAM237A induced GPR83 activity 2.8- and 1.4-fold above background, respectively, in the screen ( Fig. 1D ). No other candidate novel peptide hormone induced a significant, reproducible change in activation of GPR83 ( Fig. 1D ) or any other GPCR included in the screen, aside from the positive control GPCRs (data not shown). The predicted amino acid sequence for FAM237A ( Fig. 2A ) revealed the presence of several features highly conserved across species, including a potential convertase cleavage site, C-terminal amidation site, and two cysteines. A search of the human genome revealed a homologous gene FAM237B that retains the motifs in FAM237A ( Fig. 2B ), and we generated AtT20-PC2 cell lines to express the FAM237B predicted open reading frame. Partially purified conditioned supernatants from this cell line activated GPR83 ( Suppl. Fig. S1 ). FAM237B lacked a public transcript entry until recently and was thus absent from the set of unique proteins derived from the public databases that was used to identify the candidate preprohormone genes in the library.
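The hit criterion used in the screen (duplicate wells at or above 3 sigma over median background) can be illustrated with a minimal Python sketch. The function name `call_hits`, the well values, and the candidate names are hypothetical. The use of a median-absolute-deviation estimate of sigma is an assumption made here so that strong hits do not inflate the spread; the source does not state how sigma was calculated.

```python
from statistics import median

def call_hits(signals, n_sigma=3.0):
    """Return (hits, threshold) for replicate well signals keyed by supernatant name.

    A supernatant is called a hit only if every replicate well is at or above
    median background + n_sigma * sigma.
    """
    wells = [v for reps in signals.values() for v in reps]
    background = median(wells)
    # Assumption: robust sigma from the median absolute deviation (MAD),
    # scaled by 1.4826 to approximate a standard deviation for normal data.
    mad = median([abs(v - background) for v in wells])
    sigma = 1.4826 * mad
    threshold = background + n_sigma * sigma
    hits = {}
    for name, reps in signals.items():
        if all(v >= threshold for v in reps):
            # Report the more conservative replicate as fold over background.
            hits[name] = min(reps) / background
    return hits, threshold

# Hypothetical luminescence readings (arbitrary units), duplicate wells each.
example_signals = {
    "cand_01": [1000, 1040],
    "cand_02": [960, 1010],
    "cand_03": [1020, 980],
    "cand_04": [990, 1030],
    "cand_05": [2800, 2750],  # both replicates elevated
    "cand_06": [1400, 1000],  # only one replicate elevated
}

hits, threshold = call_hits(example_signals)
print(hits, round(threshold, 1))
```

Requiring both replicates to pass mirrors the emphasis in the text on significant, reproducible activation; a single elevated well, as in `cand_06`, is not called.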

Figure 2.

Sequence conservation of FAM237A and FAM237B. (A) Alignment of the predicted amino acid sequences of FAM237A across species. The greatest conservation is within the predicted mature protein, which is indicated by the box. (B) Alignment of predicted amino acid sequences of human FAM237A and FAM237B, which are 30% identical and 64% homologous. The predicted mature proteins are indicated by boxes and were confirmed by N-terminal sequencing and mass spectrometry of the mature active forms. A second box highlights the predicted amidation site GRR in both protein sequences. Cysteines in the predicted mature sequences are indicated with arrows: two cysteines are conserved and a third is present only in FAM237B.

Purification and Characterization of the Active Species of FAM237A and FAM237B

We selected FAM237A-expressing AtT20-PC2 cells to purify the active species of FAM237A, because dose-dependent GPR83 activity was consistently greater in AtT20-PC2 cell supernatants than in STC-1 supernatants ( Fig. 3A ). Supernatants from 293-6E cells transiently transfected with cDNA corresponding to the full-length sequence of protein FAM237A were inactive ( Fig. 3A ), suggesting either insufficient expression of secreted protein or that posttranslational processing by cells containing the regulated secretion pathway is necessary for activity.

Figure 3.

Partial purification and characterization of the active form of FAM237A protein from AtT20-PC2 cells. (A) Serial dilutions of FAM237A-conditioned supernatants derived from AtT20-PC2 (open circles), STC-1 (open squares), or 293-6E (closed triangles) host cells were assayed in the PathHunter eXpress GPR83 cell line. The AtT20-PC2-conditioned supernatant had significant dose-dependent activity (unpaired t test). The STC-1-conditioned supernatant had significant activity (unpaired t test), which was quickly lost upon dilution, and no activity was observed in the 293-6E-conditioned supernatants. (B) The C18 reversed-phase chromatography A₂₈₀ profiles of FAM237A-conditioned supernatant (darker trace) and control supernatants from AtT20-PC2 cells that do not express FAM237A (lighter trace) are depicted on the same scale, but offset to aid in viewing. The two traces are largely identical, except for fractions E1 and E2 (black bar), where the FAM237A supernatant shows a protein peak (darker arrow) that is absent from the control supernatant (lighter arrow). (C) Sypro Ruby-stained, reducing SDS-PAGE of the fraction E1 containing the FAM237A-specific protein peak from B has a prominent band (arrow) between the 3.5 and 10 kDa markers that is absent in the corresponding control fraction. (D) GPR83-stimulating activity as measured by the PathHunter eXpress assay is present in the E1 and E2 fractions (black bar), corresponding to the peak highlighted by the arrow in B. (E) The intact mass of active, partially purified FAM237A from AtT20-PC2 cells was determined by electrospray ionization mass spectrometry. Two main species of 9346.64 and 9288.55 Da were identified, which match the predicted masses of disulfide-bonded FAM237A amino acid residues 34–114 and C-terminally amidated residues 34–113, respectively.

Conditioned supernatants of FAM237A-expressing AtT20-PC2 cells were fractionated by serial chromatography as described in “Purification of Active FAM237A and FAM237B from AtT20-PC2 Cells,” Materials and Methods, using the PathHunter eXpress GPR83 activity assay to identify active fractions throughout the purification. FAM237A-expressing cells were stimulated with barium chloride to enhance release of active hormone, and the conditioned supernatants were subjected to sequential anion-exchange, C18, and C4 reversed-phase chromatography. Conditioned supernatants from AtT20-PC2 cells transfected with control (empty) vector were harvested and subjected to the same purification steps as a control. The A₂₈₀ profile of fractions from the C18 reversed-phase step for FAM237A-expressing supernatants had a peak in fractions E1 and E2 that was absent from the control ( Fig. 3B , arrow and black bar). The FAM237A fractions contained a 9 kDa band identified by SDS-PAGE of fraction E1 that was not observed in the control fraction ( Fig. 3C , arrow). When the C18 reversed-phase fractions were tested in the PathHunter eXpress GPR83 assay, a peak of activity was observed in fractions E1 and E2, corresponding to the presence of the A₂₈₀ peak and 9 kDa band ( Fig. 3D , black bar). After further purification with C4 reversed-phase chromatography, the intact mass of this protein was determined by electrospray ionization mass spectrometry ( Fig. 3E ). Two main species of 9346.64 and 9288.56 Da were identified that match the predicted masses of disulfide-bonded FAM237A amino acids 34–114 or C-terminally amidated 34–113, respectively. N-terminal sequencing (not shown) determined that the 9 kDa peptide starts at histidine 34 ( Fig. 2A , N-terminal start of boxed sequence). To determine if a disulfide bond is present and required for FAM237A activity, purified protein or control fractions were reduced with DTT, alkylated with iodoacetamide, and assayed for activity ( Suppl. Fig. S2 ). Reduced and alkylated FAM237A was inactive, indicating that a disulfide bond is required for activity. FAM237B was expressed, purified, and characterized in a similar manner to FAM237A, demonstrating a similar A₂₈₀ peak corresponding to the active fractions and a 9 kDa band in SDS-PAGE ( Suppl. Fig. S1 ). The N-terminus of the mature FAM237B protein was also confirmed with N-terminal sequencing as indicated in Figure 2B .
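The assignment of the two intact masses to a glycine-extended form (residues 34–114) and an amidated form (residues 34–113) can be checked with a short worked calculation. Converting a C-terminal glycine-extended peptide to the corresponding peptide amide removes a net C₂H₂O₂ from the peptide. The sketch below is illustrative only: it uses rounded standard average atomic masses, the helper `formula_mass` is a hypothetical name, and the observed masses are the values quoted in the text above.

```python
# Rounded standard average atomic masses (Da); an approximation for illustration.
AVG_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def formula_mass(formula):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(AVG_MASS[element] * count for element, count in formula.items())

# Peptide-Gly-OH -> peptide-NH2: the peptide loses a net C2H2O2.
expected_shift = formula_mass({"C": 2, "H": 2, "O": 2})

# Intact masses quoted in the text for the two main species (Da).
mass_gly_extended = 9346.64   # assigned to residues 34-114
mass_amidated = 9288.56       # assigned to amidated residues 34-113
observed_shift = mass_gly_extended - mass_amidated

print(f"expected shift: {expected_shift:.3f} Da")
print(f"observed shift: {observed_shift:.3f} Da")
print(f"difference:     {observed_shift - expected_shift:.3f} Da")
```

The heavier species is therefore the one that still carries the C-terminal glycine, and the lighter species is the amidated product, which matches the order given in the text.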

Generation of Purified, Recombinant, Bioactive FAM237A and FAM237B from E. coli

To confirm that we had accurately defined the structural features critical for induction of GPR83 activity, FAM237A (residues 34–114) and FAM237B (residues 25–113) were expressed in E. coli and processed in vitro. The proteins were harvested from E. coli inclusion bodies, solubilized, partially purified by anion-exchange chromatography, refolded, and further purified on a Source 15RPC column. Lysates from E. coli transformed with empty vector were used as controls and subjected to the identical purification and refolding protocol. Because mass spectrometry suggested that C-terminally amidated protein was present in the protein purified from AtT20-PC2 cells, the purified, E. coli-produced material was amidated with varying amounts of a PAM-Fc fusion protein generated as described in Materials and Methods and then repurified on the Source 15RPC column. The final E. coli-purified FAM237A material was a single, approximately 9 kDa band by SDS-PAGE ( Fig. 4A ) that had an elution profile on reversed-phase high-performance liquid chromatography (HPLC) indistinguishable from that of protein purified from AtT20-PC2 supernatants ( Fig. 4B ). The bioactivity of the protein as measured by the PathHunter eXpress GPR83 assay was completely dependent on PAM-Fc ( Fig. 4C ). Because FAM237B contains a third, presumably unpaired, cysteine at position 52 that could interfere with proper disulfide formation during refolding, an additional version of FAM237B containing a serine substituted at that position was also expressed, purified, and tested for activity. The C52S FAM237B was monomeric and had similar activity to the wild-type protein (data not shown). These data taken together suggest that the active forms of FAM237A and FAM237B are cleaved from inactive precursors and require C-terminal amidation and disulfide bond formation.

Figure 4.

Generation of purified, active recombinant FAM237A expressed in E. coli requires amidation. FAM237A was expressed in E. coli, refolded and purified as described in “Generation of Active FAM237A and FAM237B from E. coli,” Materials and Methods. (A) Nonreducing SDS-PAGE of purified material stained with Coomassie. (B) E. coli-produced FAM237A was indistinguishable from AtT20-PC2-purified FAM237A on an analytical C18 reversed-phase HPLC column. (C) GPR83 activation in the PathHunter eXpress assay by E. coli-produced, refolded FAM237A with no modification or enzymatically amidated with varying concentrations of PAM-Fc protein as described in “Expression and Purification of PAM,” Materials and Methods. No activity was observed in the FAM237A protein lacking amidation (no PAM, solid diamonds). Activity was maximal or near maximal in FAM237A treated with 1 or 4 µg/mL PAM (solid triangles or x symbols, respectively). Intermediate activity was observed for FAM237A treated with 0.25 µg/mL PAM (solid squares).

Identification of GPR83 Signaling Pathways Activated by FAM237A/B

Native sequence GPR83 was transfected into 293-6E host cells, which lack endogenous GPR83, and the cells were stimulated with a dose titration of E. coli-purified, bioactive FAM237A, FAM237B, or negative control. The cells were then assayed by using the IP-One HTRF assay as described in “IP₁ Accumulation Assay,” Materials and Methods, for production of IP₁, an indirect measure of the generation of the second messenger IP₃, which is produced downstream of GPR83 coupling to Gαq subunits. Both FAM237A ( Fig. 5A ) and FAM237B ( Fig. 5B ) dose-dependently stimulated the accumulation of IP₁ in GPR83-expressing cells (solid circles), but not in vector-transfected 293-6E cells (solid triangles). The EC₅₀ for FAM237A was approximately 3.2 nM. Control fractions taken from purification of empty vector-transformed E. coli (open squares) were inactive on GPR83-expressing cells at dilutions matching FAM237A and FAM237B samples. FAM237B was at least 100-fold less potent under these conditions, and we were unable to produce sufficiently concentrated FAM237B to demonstrate a saturable dose–response curve. Neither ligand significantly increased or reduced cAMP levels in GPR83-expressing cells when forskolin and IBMX were used to modulate cAMP (data not shown). These data indicate that FAM237A and FAM237B stimulate signaling through native sequence GPR83 via Gαq subunits, generating IP₃ as a second messenger.

Figure 5.

FAM237A and FAM237B stimulate GPR83-dependent IP₁ accumulation. 293-6E cells were transiently transfected with native GPR83 or empty vector, treated in duplicate with the indicated concentrations of (A) FAM237A or (B) FAM237B purified from E. coli, or a matching dilution series of mock-purified, empty vector-transformed E. coli, and assayed for IP₁ accumulation by using the IP-One HTRF assay as described in “IP₁ Accumulation Assay,” Materials and Methods. (A) FAM237A induced a dose-dependent, saturable increase in IP₁ specifically on GPR83-expressing cells (solid circles), but not vector-transfected cells (solid triangles). The mock-purified control (open squares) had no activity in the assay. The EC₅₀ for the IP₁ response on the GPR83-expressing cells was calculated by using a four-parameter variable slope for a log(agonist) versus response curve to be 3.2 nM (95% CI 1.6–9.8 nM), excluding the response at the highest dose of 860 nM due to the apparent Hook effect. (B) FAM237B also induced a dose-dependent increase in IP₁ specifically in the GPR83-expressing cells (solid circles); however, the response was not saturated at the highest dose. NA, media alone, cells not treated with FAM237A, FAM237B, or the mock purification control. Error bars indicate range.
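The four-parameter variable-slope model named in the legend above can be sketched as follows to show what the EC₅₀ represents. This is not the authors' fitting procedure: only the 3.2 nM EC₅₀ comes from the text, while the bottom and top plateaus, the Hill slope of 1, and the dose series are hypothetical values chosen for illustration.

```python
def four_pl(conc_nm, bottom, top, ec50_nm, hill):
    """Four-parameter variable-slope model for an agonist dose-response curve.

    The response rises from `bottom` to `top`, and equals the midpoint of the
    two plateaus when conc_nm == ec50_nm.
    """
    return bottom + (top - bottom) / (1.0 + (ec50_nm / conc_nm) ** hill)

# EC50 from the text; plateaus and Hill slope are hypothetical.
params = dict(bottom=100.0, top=1000.0, ec50_nm=3.2, hill=1.0)

# Hypothetical dose series (nM). A top dose showing a hook effect would be
# dropped before fitting, as was done for the 860 nM point in the legend.
doses_nm = [0.1, 0.32, 1.0, 3.2, 10.0, 32.0, 100.0]
responses = [four_pl(dose, **params) for dose in doses_nm]

midpoint = four_pl(params["ec50_nm"], **params)
for dose, response in zip(doses_nm, responses):
    print(f"{dose:7.2f} nM -> {response:7.1f}")
print(f"response at EC50: {midpoint:.1f}")
```

A curve that has not reached its upper plateau, as described for FAM237B, leaves `top` and therefore the EC₅₀ poorly constrained, which is why no EC₅₀ is reported for that ligand.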

FAM237A Expression in Normal Human Tissues

FAM237A and GPR83 mRNA expression was assessed in normal tissues by using the GTEx project database (www.gtexportal.org). Consistent with published reports, GPR83 is expressed broadly in the human brain,¹⁷ particularly in the cerebellar regions, as well as in the testis and thyroid gland ( Suppl. Fig. S3 , upper panel). FAM237A transcript was expressed at lower levels in the brain in an overlapping pattern compared with GPR83, and was also detected in the pituitary, testis, and heart ( Suppl. Fig. S3 , lower panel). An analysis of a single-cell transcriptome profiling dataset from human pancreatic islet cells¹⁸ showed that FAM237A was expressed in a subset of alpha cells that also express glucagon ( Suppl. Fig. S4 ).

Discussion

We describe here a process that was developed to identify novel peptide hormones. Our approach was to generate a set of novel preprohormone candidate genes based on known peptide hormone properties, express candidate proteins in host cell lines with a regulated secretory pathway capable of processing preprohormones into mature, active peptides, and functionally screen for hormone activity using GPCR cell-based reporter assays. Once a novel hormone activity is identified, the host cell lines can be scaled up to purify and characterize the mature active form, which is a key advantage given that standard expression hosts lack the posttranslational processing machinery required for many hormones, hormones often have restricted or very low expression endogenously, and synthesis is often not feasible depending on the posttranslational modifications. Expressing each candidate gene in two cell lines increased the likelihood that the hormone was properly processed and minimized the effect of endogenous expression of known hormones by the host cells, which may confound assay results. It is still possible that some candidate genes are not fully processed in these hosts, which can be addressed by incorporating one or more host lines derived from other tissues into the library. For the seven candidate genes that failed to generate stable lines in either host, preliminary troubleshooting suggested that expression of these genes interfered with host cell expansion in some way. For these clones, in addition to a new host cell, an inducible or weaker promoter might mitigate this issue and enable stable pool production, and cloning into more than one vector should be considered for these and future additions to the library.

We demonstrated the utility of our library by identifying a posttranslationally modified form of FAM237A protein, as well as its homolog FAM237B, that is capable of activating GPR83 via the Gαq signaling pathway. This is the first report of a biological function for FAM237A or FAM237B, showing the ability of our approach to identify novel biology. GPR83 has been shown to signal via the Gαq pathway,^19,20 which is consistent with the activation by FAM237A. The ability of FAM237A and FAM237B proteins to activate GPR83 was dependent on specific modifications that are characteristic of peptide hormones and only made by host cells containing the regulated secretion pathway. This supports their identities as prohormones and further demonstrates the utility of the system for identifying novel hormone activities, as standard expression hosts such as 293-6E did not produce active protein. Compared with the STC-1 cell line, AtT20-PC2 cells produced more active FAM237A protein, which could be because these cells originate from the pituitary, where FAM237A expression has been detected. We do not know if the active form of FAM237A or FAM237B described here corresponds to the fully mature endogenous form in humans, because we used a cell line derived from a murine host. However, the EC₅₀ for FAM237A activation of transiently expressed GPR83 was estimated as 3.2 nM, which is within the range typically observed for other hormone–receptor pairs.⁸ Further studies are needed to confirm that FAM237A (and FAM237B) is a physiologically relevant GPR83 ligand, including generation of GPR83 binding curves, measurement of specific activity in cells endogenously expressing GPR83, and characterization of the mature peptide form in a cell that endogenously expresses FAM237A. GPR83 has been reported to form hetero-oligomers with other GPCRs, including GPR171,²⁰ GHSR, MC3R, and MC4R;²¹ thus, the ability of FAM237A and FAM237B to activate GPR83 hetero-oligomers should also be assessed. The E. coli-produced FAM237B had significantly reduced potency compared with FAM237A in activating GPR83, and comparison of their sequences revealed an unpaired cysteine at position 52 that is not conserved in FAM237A. Unpaired cysteines can reduce the activity of E. coli-produced proteins; however, a C52S substitution had no effect on FAM237B activity. The reduced potency of FAM237B may instead reflect a physiological role distinct from activation of GPR83 homo-oligomers.

The identification of an additional candidate peptide ligand for GPR83 should further facilitate characterization of GPR83 biology. GPR83 has been reported to be activated by multiple modalities, including basal activity regulated by the N-terminal extracellular domain,^21,22 Zinc(II),¹⁹ and most prominently the ligand neuroendocrine peptide PEN.²⁰ PEN is a peptide hormone with no significant homology to FAM237A (2% identity of mature forms by Clustal Omega¹¹). It is derived from the precursor proSAAS, encoded by the PCSK1N gene, which is processed into at least five peptides: SAAS, GAV, PEN, bigLEN, and littleLEN. PEN binding to GPR83 induces a dose-dependent reduction in cAMP and an increase in PLC activity, indicating that PEN induces GPR83 to signal through Gαq and the inhibitory Gαi G-proteins.²⁰ Zinc(II) induces activation of GPR83 signaling via Gαq but not Gαi or Gαs.¹⁹ The dose-dependent increase in IP₁ induced by FAM237A is consistent with GPR83 signaling through Gαq. In contrast, we were unable to detect a consistent, dose-dependent increase or decrease in cAMP levels in response to FAM237A. While the signaling pathways downstream of GPR83 appear to be context or ligand dependent, GPR83 activation in response to FAM237A is consistent with the published modalities.

The expression pattern of FAM237A in public transcript datasets is also consistent with a role in GPR83 biology. In normal human tissues, GPR83 and FAM237A expression overlap within the brain, as does expression of the published ligand PEN, which is reported to be robustly expressed in the hypothalamus.²⁰ GPR83 expression is largely conserved in human and mouse, with the strongest levels within the brain, including cerebellar, hypothalamic, hippocampal, and amygdaloid regions.¹⁷ The expression of FAM237A in the pituitary and pancreatic islet alpha cells is potentially consistent with a role in modulating GPR83 function in the thyroid or in metabolism, respectively.

GPR83 expressed in the central nervous system may play a role in neurological functions, including emotion, learning, reward processing, and metabolism.^17,23 Knockdown of GPR83 in the hypothalamic preoptic area reduces core body temperature and elevates circulating levels of adiponectin.²⁴ Analysis of GPR83 knockout mice has suggested that the receptor may be involved in stress, reward and learning,²⁵ and the regulation of systemic energy metabolism.²⁶ GPR83 is also expressed in mouse FoxP3⁺ regulatory T cells (Tregs)¹⁷ and has been implicated as a mediator of immunosuppressive Treg formation or function during inflammation. Transfer of GPR83-overexpressing CD4⁺ T cells into mouse models of inflammation is associated with reduction in inflammation and expression of Treg markers in the adoptively transferred cells.²⁷ This suggests the possibility that FAM237A and/or PEN may be upregulated by inflammation in vivo to activate GPR83 and induce peripheral generation of Tregs, either directly or indirectly.

The fact that GPR83 is the only receptor for which we identified a novel candidate ligand as part of our initial screen may result from a combination of biological and technical causes. The GPCRs in the screen were selected by sequence-based clustering with GPCRs that have confirmed peptide ligands. Despite this enrichment, it is likely that some of these GPCRs do not have peptide ligands and that the candidate hormones in our collection have receptors that are not GPCRs, as is the case for insulin. The GPCR PathHunter eXpress cell lines were not validated to have plasma membrane expression of active GPCRs, and so may lack sufficient surface receptors for a detectable response. The reporter cell line limitations could be mitigated by using different reporter lines and validating surface expression.

It has been noted that the rate of novel peptide discovery and GPCR de-orphanization has slowed, which has been attributed to a number of causes, including the complexity of signaling pathways and posttranslational processing.⁹ Our success rate in identifying novel ligand–receptor interactions is consistent with other GPCR de-orphanization attempts using β-arrestin recruitment as a readout.²⁸ The modest success rate of such screens may reflect distinct signaling mechanisms of the remaining orphan GPCRs or the pleiotropic nature of GPCR signaling.²⁹ This suggests that orthogonal assays will be needed to further interrogate the library. Techniques such as label-free dynamic mass redistribution that are agnostic to the downstream signaling pathway have been successfully used to identify GPCR ligands in a variety of cellular formats in a high-throughput manner.³⁰ Such techniques would enable GPCRs to be screened in a pooled fashion or in primary cells, which would also allow for the identification of ligands for hetero-oligomers. Given our success in identifying a novel activator for GPR83, future screens should incorporate a broader array of GPCRs, particularly those of therapeutic relevance, to increase the likelihood of identifying novel ligand–receptor interactions.

Beyond direct identification of novel ligand–receptor interactions, we believe that our peptide hormone library is compatible with most in vitro assays for the identification of novel candidate hormone functions, particularly for activities that can only be detected with endogenous receptors under more physiological conditions. To identify hormones that induce complex physiological phenomena, such as behavior or metabolism, the peptide hormone cell line collection is adaptable for in vivo approaches by implantation of host cell lines via alginate beads into appropriate murine model systems to assess the effects on behavior or metabolism. We believe that our approach can contribute to the future characterization of novel hormones, and that the novel hormone we identified, FAM237A, may play an important role in GPR83 function in the central nervous, endocrine, metabolic, or immune system.

Supplemental Material

Supplemental material, Supplemental_Material_for_Peptide_Hormone_Platform_Identified_Candidate_GPR83_Ligands_by_Sallee_et_al for A Pilot Screen of a Novel Peptide Hormone Library Identified Candidate GPR83 Ligands by Nathan A. Sallee, Ernestine Lee, Atossa Leffert, Silvia Ramirez, Arthur D. Brace, Robert Halenbeck, W. Michael Kavanaugh and Kathleen M. C. Sullivan in SLAS Discovery

Footnotes

Acknowledgements

The authors thank DiscoveRx, Jackie Chan, Thomas Bray, Shawn Russell, Grayson Kochi, Kaumudi Bhawe, Nallakkan Arvindan, John Cesarek, Elizabeth Bosch, Hongbing Zhang, and Jin Zhou for technical assistance, and Richard Mains and Lewis T. Williams for guidance and advice. The data used for the convertase co-expression analysis were obtained from the GTEx portal, which is supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The work described in this manuscript was funded by Five Prime Therapeutics Inc.

Supplemental material is available online with this article.

Authors’ Note

Kathleen M. C. Sullivan is now affiliated with ChemoCentryx.

Declaration of Conflicting Interests

The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: All authors were employed by Five Prime Therapeutics, and their research and authorship of this article was completed within the scope of their employment with Five Prime Therapeutics.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

References

1. Nussey; Whitehead, S. Endocrinology: An Integrated Approach; Oxford, 2001.

2. Vivoli; Lindberg. Prohormone Convertase 1/3. In Handbook of Biologically Active Peptides, 2nd Ed.; Kastin, A. J., Ed.; Academic Press: San Diego, CA, 2013, pp 1789–1796.

3. Seidah, N. G.; Prat. The Biology and Therapeutic Targeting of the Proprotein Convertases. Nat. Rev. Drug Discov. 2012, 11, 367–383.

4. Takahashi; Mizusawa. Posttranslational Modifications of Proopiomelanocortin in Vertebrates and Their Biological Significance. Front. Endocrinol. 2013, 4, 143.

5. Mirabeau; Perlas; Severini; et al. Identification of Novel Peptide Hormones in the Human Proteome by Hidden Markov Model Screening. Genome Res. 2007, 17, 320–327.

6. Sonmez; Zaveri, N. T.; Kerman, I. A.; et al. Evolutionary Sequence Modeling for Discovery of Peptide Hormones. PLoS Comput. Biol. 2009, 5, e1000258.

7. Gustincich; Batalov; Beisel, K. W.; et al. Analysis of the Mouse Transcriptome for Genes Involved in the Function of the Nervous System. Genome Res. 2003, 13, 1395–1401.

8. Armstrong, J. F.; Faccenda; Harding, S. D.; et al. The IUPHAR/BPS Guide to Pharmacology in 2020: Extending Immunopharmacology Content and Introducing the IUPHAR/MMV Guide to Malaria Pharmacology. Nucleic Acids Res. 2020, 48, D1006–D1021.

9. Ozawa; Lindberg; Roth; et al. Deorphanization of Novel Peptides and Their Receptors. AAPS J. 2010, 12, 378–384.

10. Breiman; Cutler. Random Forests. http://www.stat.berkeley.edu/~breiman/RandomForests (accessed June 8, 2020).

11. UniProt. UniProt: A Worldwide Hub of Protein Knowledge. Nucleic Acids Res. 2019, 47, D506–D515.

12. Durocher; Perret; Kamen. High-Level and High-Throughput Recombinant Protein Production by Transient Transfection of Suspension-Growing Human 293-EBNA1 Cells. Nucleic Acids Res. 2002, 30, E9.

13. Lin; Lee; Hestir; et al. Discovery of a Cytokine and Its Receptor by Functional Screening of the Extracellular Proteome. Science 2008, 320, 807–811.

14. Raymond; Robotham; Kelly; et al. Production of Highly Sialylated Monoclonal Antibodies. In Glycosylation; Petrescu, S., Ed.; InTech: Rijeka, Croatia, 2012, p 397.

15. Zhou; Bloomquist, B. T.; Mains, R. E. The Prohormone Convertases PC1 and PC2 Mediate Distinct Endoproteolytic Cleavages in a Strict Temporal Order during Proopiomelanocortin Biosynthetic Processing. J. Biol. Chem. 1993, 268, 1763–1769.

16. Zabel, B. A.; Allen, S. J.; Kulig; et al. Chemerin Activation by Serine Proteases of the Coagulation, Fibrinolytic, and Inflammatory Cascades. J. Biol. Chem. 2005, 280, 34661–34666.

17. Lueptow, L. M.; Devi, L. A.; Fakira, A. K. Targeting the Recently Deorphanized Receptor GPR83 for the Treatment of Immunological, Neuroendocrine and Neuropsychiatric Disorders. Prog. Mol. Biol. Transl. Sci. 2018, 159, 1–25.

18. Segerstolpe; Palasantza; Eliasson; et al. Single-Cell Transcriptome Profiling of Human Pancreatic Islets in Health and Type 2 Diabetes. Cell Metab. 2016, 24, 593–607.

19. Muller; Kleinau; Piechowski, C. L.; et al. G-Protein Coupled Receptor 83 (GPR83) Signaling Determined by Constitutive and Zinc(II)-Induced Activity. PloS One 2013, 8, e53347.

20. Gomes; Bobeck, E. N.; Margolis, E. B.; et al. Identification of GPR83 as the Receptor for the Neuroendocrine Peptide PEN. Sci. Signal 2016, 9, ra43.

21. Muller; Berkmann, J. C.; Scheerer; et al. Insights into Basal Signaling Regulation, Oligomerization, and Structural Organization of the Human G-Protein Coupled Receptor 83. PloS One 2016, 11, e0168260.

22. Martin, A. L.; Steurer, M. A.; Aronstam, R. S. Constitutive Activity among Orphan Class-A G Protein Coupled Receptors. PloS One 2015, 10, e0138463.

23. Fakira, A. K.; Peck, E. G.; Liu; et al. The Role of the Neuropeptide PEN Receptor, GPR83, in the Reward Pathway: Relationship to Sex-Differences. Neuropharmacology 2019, 157, 107666.

24. Dubins, J. S.; Sanchez-Alavez; Zhukov; et al. Downregulation of GPR83 in the Hypothalamic Preoptic Area Reduces Core Body Temperature and Elevates Circulating Levels of Adiponectin. Metab. Clin. Exp. 2012, 61, 1486–1493.

25. Vollmer; Ghosal; Rush; et al. Attenuated Stress-Evoked Anxiety, Increased Sucrose Preference and Delayed Spatial Learning in Glucocorticoid-Induced Receptor-Deficient Mice. Genes Brain Behav. 2013, 12, 241–249.

26. Muller, T. D.; Muller, C. X.; et al. The Orphan Receptor Gpr83 Regulates Systemic Energy Metabolism via Ghrelin-Dependent and Ghrelin-Independent Mechanisms. Nat. Commun. 2013, 4, 1968.

27. Hansen; Loser; Westendorf, A. M.; et al. G Protein-Coupled Receptor 83 Overexpression in Naive CD4+CD25- T Cells Leads to the Induction of Foxp3+ Regulatory T Cells In Vivo. J. Immunol. 2006, 177, 209–215.

28.

Southern

Cook

J. M.

Neetoo-Isseljee

; et al. Screening Beta-Arrestin Recruitment for the Identification of Natural Ligands for Orphan G-Protein-Coupled Receptors. J. Biomol. Screen. 2013, 18, 599–609.

29.

Hauser

A. S.

Gloriam

D. E.

Brauner-Osborne

; et al. Novel Approaches Leading Towards Peptide GPCR De-Orphanisation. Br. J. Pharmacol. 2020, 177, 961–968.

30.

Schroder

Schmidt

Blattermann

; et al. Applying Label-Free Dynamic Mass Redistribution Technology to Frame Signaling of G Protein-Coupled Receptors Noninvasively in Living Cells. Nat. Protoc. 2011, 6, 1748–1760.

31.

Zhang

Pao

L. I.

Zhou

; et al. Deorphanization of the Human Leukocyte Tyrosine Kinase (LTK) Receptor by a Signaling Screen of the Extracellular Proteome. Proc. Natl. Acad. Sci. U.S.A. 2014, 111, 15741–15745.

32.

Quagliarini

Wang

Kozlitina

; et al. Atypical Angiopoietin-Like Protein That Regulates ANGPTL3. Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 19751–19756.

33.

L. W.

Y. Y.

X. Y.

; et al. A Novel Tumor Suppressor Gene ECRG4 Interacts Directly with TMPRSS11A (ECRG1) to Inhibit Cancer Cell Growth in Esophageal Carcinoma. BMC Cancer 2011, 11, 52.

34.

Podvin

Dang

Meads

; et al. Esophageal Cancer-Related Gene-4 (ECRG4) Interactions with the Innate Immunity Receptor Complex. Inflamm. Res. 2015, 64, 107–118.

35.

Moffatt

Thomas

Sellin

; et al. Osteocrin Is a Specific Ligand of the Natriuretic Peptide Clearance Receptor That Modulates Bone Growth. J. Biol. Chem. 2007, 282, 36454–36462.

36.

Enomoto

Ohashi

Shibata

; et al. Adipolin/C1qdc2/CTRP12 Protein Functions as an Adipokine That Improves Glucose Metabolism. J. Biol. Chem. 2011, 286, 34552–34558.

37.

Kobayashi

Fukuhara

Taguchi

; et al. Identification of a New Secretory Factor, CCDC3/Favine, in Adipocytes and Endothelial Cells. Biochem. Biophys. Res. Commun. 2010, 392, 29–35.

38.

Tsuchida

Bonkobara

McMillan

J. R.

; et al. Characterization of Kdap, a Protein Secreted by Keratinocytes. J. Invest. Dermatol. 2004, 122, 1225–1234.

39.

Nishino

Yamashita

Hashiguchi

; et al. Meteorin: A Secreted Protein That Regulates Glial Cell Differentiation and Promotes Axonal Extension. EMBO J. 2004, 23, 1998–2008.

40.

Zhang

Chen

Sairam

M. R.

Novel Hormone-Regulated Genes in Visceral Adipose Tissue: Cloning and Identification of Proinflammatory Cytokine-Like Mouse and Human MEDA-7: Implications for Obesity, Insulin Resistance and the Metabolic Syndrome. Diabetologia 2011, 54, 2368–2380.

41.

Parry

D. A.

Brookes

S. J.

Logan

C. V.

; et al. Mutations in C4orf26, Encoding a Peptide with In Vitro Hydroxyapatite Crystal Nucleation and Growth Activity, Cause Amelogenesis Imperfecta. Am. J. Hum. Genet. 2012, 91, 565–571.

42.

Ouyang

Y. Z.

Qin

X. H.

; et al. Placenta-Specific 9, a Putative Secretory Protein, Induces G2/M Arrest and Inhibits the Proliferation of Human Embryonic Hepatic Cells. Biosci. Rep. 2018, 38, BSR20180820.

43.

Guo

Z. C.

; et al. Adipocyte-Derived PAMM Suppresses Macrophage Inflammation by Inhibiting MAPK Signalling. Biochem. J. 2015, 472, 309–318.

44.

Korfanty

Toma

Wojtas

; et al. Identification of a New Mouse Sperm Acrosome-Associated Protein. Reproduction 2012, 143, 749–757.

45.

Viegas

C. S. B.

Costa

R. M.

Santos

; et al. Gla-Rich Protein Function as an Anti-Inflammatory Agent in Monocytes/Macrophages: Implications for Calcification-Related Chronic Inflammatory Diseases. PloS One 2017, 12, e0177829.

46.

Salcher

Hagenbuchner

Geiger

; et al. C10ORF10/DEPP, a Transcriptional Target of FOXO3, Regulates ROS-Sensitivity in Human Neuroblastoma. Mol. Cancer 2014, 13, 224.

47.

Lopez

M. B.

Garcia

M. N.

Grasso

; et al. Functional Characterization of Nupr1L, a Novel p53-Regulated Isoform of the High-Mobility Group (HMG)-Related Protumoral Protein Nupr1. J. Cell. Physiol. 2015, 230, 2936–2950.

48.

Van Vranken

J. G.

Bricker

D. K.

Dephoure

; et al. SDHAF4 Promotes Mitochondrial Succinate Dehydrogenase Activity and Prevents Neurodegeneration. Cell Metab. 2014, 20, 241–252.
