Abstract
The
Introduction
New, large genomic data sets are providing more in-depth insights into the diagnosis and treatment of disease. In the past decade, new and innovative methods have continued to add value to the underlying data and uncover the secrets of the genome. Visual data inspection by experienced researchers is an important quality control element in the analytical process. Additionally, data visualization helps one to prioritize downstream analysis and verification steps. Unfortunately, this part of the process is tedious and time consuming, and the increasing volumes of high-throughput sequencing data of various types and platforms are proving to be a major analytical challenge. Here, we report a visualization tool that allows researchers to explore their data at a very rapid speed and significantly reduce the burden of reviewing tens and hundreds of thousands of variant calls. Areas with systematic read errors can be quickly identified, and inefficient attempts to verify results in noisy regions can be avoided.
Features and Methods
Alview is a fast and portable visualization tool. The core code interfaces with Heng Li et al.'s SAMtools library 1 for parsing BAM files. The program is written in platform-independent C. Peculiarities specific to an operating system are isolated with #ifdef preprocessor directives; for instance, where Microsoft Visual C provides only alternate support for a portable operating system interface (POSIX) standard function, a handcrafted workaround using the native interface is supplied.
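The isolation technique described above can be illustrated with a minimal sketch. It is not taken from the Alview source: the wrapper name portable_strcasecmp is hypothetical, and the case-insensitive string comparison is only one example of a POSIX function that Microsoft Visual C exposes under a different native name (_stricmp).

```c
#include <string.h>

#ifdef _MSC_VER
/* Microsoft Visual C: POSIX strcasecmp() is not provided,
 * so fall back on the native _stricmp() from <string.h>. */
int portable_strcasecmp(const char *a, const char *b)
{
    return _stricmp(a, b);
}
#else
/* POSIX systems (Linux, BSD, Mac OS X): use the standard function. */
#include <strings.h>
int portable_strcasecmp(const char *a, const char *b)
{
    return strcasecmp(a, b);
}
#endif
```

The rest of the program calls only the wrapper, so the platform difference stays confined to one place in the source.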
For graphical user interface (GUI) frameworks, Alview uses the WIN32 interface for Windows, the GTK2 interface for Linux and BSD Unix-based systems, and Cocoa for Apple Mac OS X. SAMtools 1 is written to POSIX standards, but different Microsoft Visual compilers provide varying levels of support for these UNIX-style standards. As a result, the source code for third-party libraries that were modified for Windows is provided to facilitate compiling and linking Alview on Windows.
The main code for Alview, in the file
Alview can also be compiled as a webserver daemon that uses the common gateway interface (CGI) 3 standard. The CGI version produces interactive HTML output and uses dynamic HTML5 4 features, including zoom-in by selection via the jQuery 5 library. The CGI webserver version of Alview loads a list of BAM files to which access is permitted from a user-maintained text file, so custom lists of BAM files of interest are easy to generate and use. The source code is free and open to modification so that users and local system operators can implement their own security.
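A permitted-file check of this kind could look roughly like the following sketch. It is an illustration under stated assumptions, not the Alview implementation: the function name bam_is_permitted, the one-path-per-line file layout, and the exact-match rule are all hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define LIST_LINE_MAX 4096

/* Returns 1 if `requested` exactly matches one line of the open list
 * file, 0 otherwise. Assumed layout: one BAM path per line. */
int bam_is_permitted(FILE *list, const char *requested)
{
    char line[LIST_LINE_MAX];

    while (fgets(line, sizeof line, list) != NULL) {
        size_t n = strcspn(line, "\r\n");
        int complete = (line[n] != '\0') || feof(list);

        line[n] = '\0';
        if (!complete) {
            /* Overlong line: discard the remainder so that a fragment
             * of it can never match a request. */
            int c;
            while ((c = fgetc(list)) != EOF && c != '\n')
                ;
            continue;
        }
        if (line[0] == '\0')
            continue;               /* skip blank lines */
        if (strcmp(line, requested) == 0)
            return 1;
    }
    return 0;
}
```

Exact string matching means that a request containing path tricks such as "../" is rejected unless that literal string appears in the list, which keeps the check conservative.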
The Alview CGI webserver version provides modifiable URL access, so that, for instance, cells in a spreadsheet can link to viewable results for any sample or location. A user-generated custom HTML file can link to specific samples and regions. Stand-alone Alview accepts parameters that specify the BAM file name and genomic coordinates. Invoking Alview in a script can create a slideshow of interesting regions. For example, fields in a single nucleotide polymorphism (SNP) detection output file can be used to specify a series of calls to Alview that generate an image for each purported polymorphism or mutation. The results can be quickly and easily reviewed by researchers. Users can generate text to annotate the slideshow images. A template is provided for command-line creation of slideshows. The burden of reviewing tens and hundreds of thousands of mutation calls can therefore be significantly reduced.
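As a sketch of the scripting step, the helper below turns one SNP position into a viewing window that a script could pass along with the BAM file name. The function name format_region and the flank parameter are hypothetical, and the "chrom:start-end" notation is the common SAMtools-style region string; whether Alview's command line accepts coordinates in exactly this form is an assumption, so the README and the provided template remain the reference.

```c
#include <limits.h>
#include <stdio.h>

/* Writes "chrom:start-end" for a window of `flank` bases on each side
 * of the 1-based position `pos`. The start is clamped to 1.
 * Returns 0 on success, -1 on invalid input or if `buf` is too small. */
int format_region(char *buf, size_t size, const char *chrom,
                  long pos, long flank)
{
    long start, end;
    int n;

    if (pos < 1 || flank < 0 || flank > LONG_MAX - pos)
        return -1;

    start = pos - flank;
    if (start < 1)
        start = 1;
    end = pos + flank;

    n = snprintf(buf, size, "%s:%ld-%ld", chrom, start, end);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A driver script would read the chromosome and position fields from each line of the SNP output file, call such a helper, and emit one viewer invocation per call.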
The source code is available at GitHub. 6 The README file there links to selected executables and to complete download packages that include the associated reference genome data. A live webserver version of Alview for examining public human cancer short-read datasets is available at https://cgwb.nci.nih.gov/cgi-bin/alview. The core source code for Alview is in the public domain, and it uses some libraries released under permissive free software licenses. Alview source code and executables for several operating systems are available at the National Cancer Institute (NCI)/National Cancer Informatics Program's (NCIP's) GitHub site: https://github.com/NCIP/alview. Developers may modify Alview as they wish. NCI retains the copyrights to “National Cancer Institute” and associated images, which may not be used in forked projects.
Results
Alview provides a solid substructure that allows for various types of access to short-read data across different operating systems. Figure 1 demonstrates the various navigation and information buttons available in the web version of Alview and shows how selection via mouse provides zoom-in capabilities. Alview is a trim, fast, precise tool and complements existing programs such as the Integrative Genomics Viewer (IGV), 7 BamView, 8 and GBrowse 2.0. 9 The benefits of Alview are extreme speed and a sharp focus on exploring short reads.

Figure 1. Information and navigation in Alview. The upper left shows the original view; the lower right shows the zoomed-in view obtained by mouse drag to examine a SNP. Various navigation buttons and information blocks assist in browsing BAM files.
Alview should not be compared with other programs solely on benchmarks. Confounding factors include operating system cache effects and internet congestion. Different implementation philosophies can influence memory usage and performance but provide useful alternative paths to solving similar problems.
IGV provides much more functionality than Alview by supporting many input file types other than BAM sequence read files. IGV's Java implementation provides
IGV requires registration to download the version that runs from local disk, whereas Alview does not. Desktop IGV may require internet access for full, simple operation, whereas Alview does not require a network connection (though it may open external webpages at the user's request). Alview does not log any user activity. On a Windows 7 machine with an Intel Core i5-2400 CPU at 3.10 GHz and 8 GB RAM, restarts of IGV v2.3 took from 12 to 18 seconds. Restarts of Alview took a small fraction of one second. For a small view of a genomic region, the Java Platform SE Binary for IGV took up 292 MB of memory, while Alview took up 11 MB.
Author Contributions
Design and coding: RF. Design and testing: CN, CH, CY, YH, MA, XB. Project management: DM. All authors reviewed and approved the final manuscript.
