Abstract
The Gene Ontology (GO) initiative is a collaborative effort that uses controlled vocabularies for annotating genetic information. We here present AGENDA (
Introduction
The emergence of novel genetic techniques and the exponential accumulation of genomic data have increased the need for bioinformatics tools.1,2 Biological ontologies facilitate the handling of complex biological data and contribute to interoperability across multiple data sources.3,4 The Gene Ontology (GO) database summarizes information about the molecular functions, cellular components, and biological processes related to gene products.5 Many tools have been created to search, browse, and analyze the GO database.6 Many of these tools accept only a single gene or GO term as input, which hampers systematic comparisons between the GO annotations associated with different GO terms and genes. Complex biological questions that involve, for example, more than one biological process or molecular function cannot be addressed if only one GO term is considered. Similarly, when elucidating a certain biological mechanism, sets of genes rather than single genes are often the focus, raising the need to access the GO associations of multiple genes simultaneously. Another limitation in accessing the GO database is that most programs (eg, EasyGO,7 GOstat,8 Onto-Express9) produce a short list of significantly enriched GO terms10,11 but do not allow users to query particular GO terms independently of enrichment. Such queries are of interest when one wants to know which of the genes linked to one GO term are also associated with a second, user-defined term.
Here we present AGENDA (
Methods
AGENDA is a web-based application built on the Apache web server, with server-side scripts that issue complex SQL queries. The content of the internal MySQL server is obtained from the GO database archive.13 HTML pages are created dynamically with PHP and CSS, and JavaScript supports user interactivity. The platform-independent program was successfully tested for cross-browser compatibility on common web browsers. Charts are created dynamically using Google Chart Tools,14 and query results are downloadable as CSV files. The application is accessible at http://bioagenda.uni-goettingen.de. The source code and documentation are freely available for download under the GNU GPL license from http://sourceforge.net/projects/bioagenda.
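The query-then-export flow described above can be illustrated with a minimal sketch. It is not AGENDA's code: Python and an in-memory SQLite database stand in for PHP and MySQL, the three-table schema is a hypothetical simplification of a GO-style relational store, and the gene symbols are placeholders.

```python
import csv
import io
import sqlite3

# Hypothetical, heavily simplified schema (not the real GO database schema).
SCHEMA = """
CREATE TABLE term (id INTEGER PRIMARY KEY, acc TEXT, name TEXT);
CREATE TABLE gene_product (id INTEGER PRIMARY KEY, symbol TEXT, species TEXT);
CREATE TABLE association (term_id INTEGER, gene_product_id INTEGER);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO term VALUES (?, ?, ?)", [
        (1, "GO:0006915", "apoptotic process"),
        (2, "GO:0005739", "mitochondrion"),
    ])
    conn.executemany("INSERT INTO gene_product VALUES (?, ?, ?)", [
        (1, "GENE_A", "human"),      # placeholder symbols, not real annotations
        (2, "GENE_B", "human"),
        (3, "GENE_C", "zebrafish"),
    ])
    conn.executemany("INSERT INTO association VALUES (?, ?)", [
        (1, 1), (1, 2), (1, 3), (2, 1),
    ])
    return conn


def genes_for_term(conn, acc, species):
    """Return sorted gene symbols annotated to one term in one species."""
    rows = conn.execute(
        """
        SELECT DISTINCT gp.symbol
        FROM association AS a
        JOIN term AS t ON t.id = a.term_id
        JOIN gene_product AS gp ON gp.id = a.gene_product_id
        WHERE t.acc = ? AND gp.species = ?
        ORDER BY gp.symbol
        """,
        (acc, species),
    ).fetchall()
    return [row[0] for row in rows]


def to_csv(acc, symbols):
    """Serialize a query result as CSV text, one gene per row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["go_accession", "gene_symbol"])
    for symbol in symbols:
        writer.writerow([acc, symbol])
    return buf.getvalue()
```

Calling `to_csv("GO:0006915", genes_for_term(build_demo_db(), "GO:0006915", "human"))` yields a header row followed by one row per matching gene. The query uses bound parameters rather than string concatenation, which is the usual safeguard when user input reaches SQL.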
Results
AGENDA offers simple and advanced modes of retrieving GO information that are described below. 12 organisms are supported:
Apart from simple queries that focus on only one GO term or gene, two types of batch queries are supported. First, different user-defined GO terms can be queried simultaneously using the GO Slimmer, a method that uses parent-child relationships between GO terms to compare gene lists of interest with lists that are annotated to GO terms. GO Slimmer identifies the overlap between these lists and produces “GO Slims” that quantify the overlap for each GO term. In AGENDA, gene lists of interest are always related to certain GO terms, but GO Slims can also be produced if different gene sets of interest, such as whole genomes of specific organisms, are used as query input.16,17 Second, queries of different user-defined GO terms can be combined via Boolean operators (AND, OR, NOT). Each query option is represented by a separate web page in the program. Data from one page can be fully transferred to another so that different types of queries can be linked. The web pages of the program include input fields, output fields, and charts for visualizing the results.
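The slimming idea can be sketched as follows. This is an illustration under stated assumptions rather than AGENDA's implementation: the toy ontology, the term identifiers (prefixed `T:`), the gene names, and the function names are all hypothetical. Annotations are propagated from each term to its ancestors through the parent-child relationships, and the overlap with a gene list of interest is then reported per slim term.

```python
from collections import defaultdict

# Hypothetical toy ontology: child term -> set of direct parent terms.
PARENTS = {
    "T:mito_inner_membrane": {"T:mitochondrion"},
    "T:mitochondrion": {"T:cell_part"},
    "T:nucleus": {"T:cell_part"},
    "T:cell_part": set(),
}

# Hypothetical direct annotations: gene -> set of terms.
ANNOTATIONS = {
    "GENE_A": {"T:mito_inner_membrane"},
    "GENE_B": {"T:nucleus"},
    "GENE_C": {"T:mitochondrion", "T:nucleus"},
    "GENE_D": {"T:cell_part"},
}


def ancestors(term, parents):
    """Return the term itself plus all terms reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def genes_by_term(annotations, parents):
    """Map each term to the genes annotated to it or to any descendant."""
    index = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            for anc in ancestors(term, parents):
                index[anc].add(gene)
    return index


def slim(genes_of_interest, slim_terms, annotations, parents):
    """For each slim term, list the genes of interest that fall under it."""
    index = genes_by_term(annotations, parents)
    wanted = set(genes_of_interest)
    return {term: sorted(index[term] & wanted) for term in slim_terms}
```

With the toy data, `slim(["GENE_A", "GENE_B", "GENE_C"], ["T:mitochondrion", "T:nucleus"], ANNOTATIONS, PARENTS)` assigns `GENE_A` to the mitochondrion term even though it is annotated only to a child term, which is the effect of using parent-child relationships.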
GO terms can be queried in AGENDA using accession numbers, names or synonyms (if any). When querying apoptotic proteins, for example, “GO:0006915” (accession number), “apoptotic process” (name), or “apoptosis” (synonym) can be typed in as the input. In a similar manner, a gene product can be queried using its symbol, full name or synonyms (if any). For example, “TP53” (symbol), “Cellular tumor antigen p53” (full name) and “P53” (synonym) are all accepted when querying human TP53. A detailed user guide describing this query expansion and other features of AGENDA is available as a web page (http://bioagenda.uni-goettingen.de/userguide.php).
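One way to picture this query expansion is a lookup table in which the accession number, name, and every synonym of a term all point to the same accession. The sketch below is an assumption about how such matching could work, not a description of AGENDA's internals; in particular, the case-insensitive comparison and the function names are hypothetical. The single example record uses the values given in the text.

```python
# One example record, using the accession, name, and synonym from the text.
TERMS = [
    {"acc": "GO:0006915", "name": "apoptotic process", "synonyms": ["apoptosis"]},
]


def build_lookup(terms):
    """Map every accepted spelling of a term (lowercased) to its accession."""
    lookup = {}
    for term in terms:
        for key in [term["acc"], term["name"], *term["synonyms"]]:
            lookup[key.strip().lower()] = term["acc"]
    return lookup


def resolve(query, lookup):
    """Return the accession matching the query, or None if nothing matches."""
    return lookup.get(query.strip().lower())
```

With `lookup = build_lookup(TERMS)`, the inputs `"GO:0006915"`, `"apoptotic process"`, and `"apoptosis"` all resolve to the same accession, so downstream queries need to handle only one identifier type.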
Case study
Many forms of cancer arise from alterations in apoptosis.18
The Gene Ontology database can, for example, be used to find out which genes are implicated in apoptosis (GO:0006915) in humans, and which of the respective gene products localize to mitochondria (GO:0005739), the nucleus (GO:0005634), and the plasma membrane (GO:0005886). Using simple queries only, the cellular localization of each of the 1771 human genes associated with apoptosis would need to be accessed individually. Using the GO Slimmer page of AGENDA, all these GO terms can be accessed simultaneously and the respective information obtained with a single mouse click (Fig. 1). Boolean queries, in turn, make it possible to assess, for example, which of the human apoptosis genes are associated with mitochondria or the nucleus but not the plasma membrane, combining all three Boolean operators in one query. By simply exchanging the name of the species, genes of eg, zebrafish or

Figure 1. Screenshot of the “GO Slimmer” option in AGENDA. The screenshot displays the cellular localizations of human genes that are implicated in apoptosis.
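The Boolean query from the case study (apoptosis AND (mitochondrion OR nucleus) NOT plasma membrane) maps directly onto set operations. The sketch below assumes that each GO term has already been resolved to a set of gene symbols; the gene names are placeholders rather than real annotations, and the function name is hypothetical.

```python
# Placeholder gene sets per GO term; real sets would come from the database.
APOPTOSIS = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}   # GO:0006915
MITOCHONDRION = {"GENE_A", "GENE_C"}                   # GO:0005739
NUCLEUS = {"GENE_B", "GENE_C", "GENE_E"}               # GO:0005634
PLASMA_MEMBRANE = {"GENE_C", "GENE_D"}                 # GO:0005886


def boolean_query(apoptosis, mitochondrion, nucleus, plasma_membrane):
    """Apoptosis genes in mitochondria OR nucleus, but NOT plasma membrane."""
    in_either = mitochondrion | nucleus            # OR  -> set union
    candidates = apoptosis & in_either             # AND -> set intersection
    return sorted(candidates - plasma_membrane)    # NOT -> set difference
```

With the placeholder sets, `boolean_query(APOPTOSIS, MITOCHONDRION, NUCLEUS, PLASMA_MEMBRANE)` keeps `GENE_A` and `GENE_B`. `GENE_C` is excluded by the NOT clause and `GENE_E` by the AND clause.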
Discussion
We have presented a novel tool for accessing and mining GO data. While its simple search options are similar to the standard services provided by the AmiGO browser,19 AGENDA employs a new interface for performing complex queries that include different GO terms and species. The GO content of AGENDA is updated regularly using MySQL dump files that are downloaded manually from the GO database archive.13 To keep AGENDA synchronized with the latest GO database releases, we plan to implement an automated update. AGENDA is under active development to suit the needs of the research community. For example, AGENDA currently does not support the upload and analysis of user-specified gene lists; our future goal is to enable the uploading of such lists for in-depth analysis. Further planned developments include the expansion of queries to all species in the GO database and the implementation of AJAX (Asynchronous JavaScript and XML) to further simplify the usage of AGENDA. AGENDA is an open source application that is freely available for non-commercial use. As the size and value of the Gene Ontology grow steadily together with our understanding of cellular mechanisms, the impact of tools for browsing and mining GO data becomes more apparent. AGENDA complements the existing bioinformatics tools for mining the GO database and provides new functions for accessing GO information.
Author Contributions
Conceived and designed the experiments: GO, MCG. Analysed the data: GO, QL. Wrote the first draft of the manuscript: GO. Contributed to the writing of the manuscript: QL, MCG. Agree with manuscript results and conclusions: GO, QL, MCG. Jointly developed the structure and arguments for the paper: GO, QL, MCG. Made critical revisions and approved final version: GO, QL, MCG. All authors reviewed and approved the final manuscript.
Funding
This work was supported by the NRW IGS GFG (International Graduate School in Genetics and Functional Genomics) (to G.O.) and the DFG (Deutsche Forschungsgemeinschaft) (Go 1092/1-1) (to M.C.G.).
Disclosures and Ethics
As a requirement of publication author(s) have provided to the publisher signed confirmation of compliance with legal and ethical obligations including but not limited to the following: authorship and contributorship, conflicts of interest, privacy and confidentiality and (where applicable) protection of human and animal research subjects. The authors have read and confirmed their agreement with the ICMJE authorship and conflict of interest criteria. The authors have also confirmed that this article is unique and not under consideration or published in any other publication, and that they have permission from rights holders to reproduce any copyrighted material. Any disclosures are made in this section. The external blind peer reviewers report no conflicts of interest.
Acknowledgments
We thank the GWDG (Gesellschaft für wissenschaftliche Datenverarbeitung mbH Göttingen) for technical support. We acknowledge the Gene Ontology Consortium as the source of Gene Ontology data and Google for providing its Charts API infrastructure.
