Abstract
Laboratory informatics is defined as the specialized application of information technology to optimize and extend laboratory operations. Rising with the tide of informatics in general, laboratory informatics is one of the fastest growing areas of laboratory-related technology. However, this technological growth has outstripped the expertise of the ones who stand to gain most from it: scientists and other end users in the laboratory. This gap could be bridged by specialists in laboratory informatics who are well grounded in the scientific basis of lab operations, yet also trained in informatics and its particular applications in the lab. However, formal educational programs in laboratory informatics are lacking. To set an educational agenda, laboratory informatics must be delineated as a field, and based on that delineation, a curriculum must be developed that meets the standards of higher education. To address these issues, this paper gives an overview of informatics, describes the context of laboratory informatics in this setting, explains the emergence of laboratory informatics as a distinct field, sets the place for laboratory informatics in higher education by suggesting the nature and scope of the curriculum, and briefly describes the laboratory informatics initiative at Indiana University.
Introduction
The use of the term
Laboratory informatics is defined as the specialized application of information technology to optimize and extend laboratory operations. It encompasses data acquisition, lab automation, instrument interfacing, laboratory networking, data processing, specialized data management systems (such as chromatography data systems), laboratory information management systems, scientific data management (including data mining and data warehousing), and knowledge management (including the use of electronic laboratory notebooks). Laboratory informatics has risen with the tide of informatics in general and is one of the fastest growing areas of laboratory-related technology. This growth is fueled by the commoditization of hardware and software, the integration of laboratory instrumentation and data systems, and the convergence of automation and information technologies. However, this technological growth has outstripped the expertise of the ones who stand to gain most by it: scientists and other end users in the laboratory. This gap could be bridged by specialists in laboratory informatics who are well grounded in the scientific basis of lab operations yet are also trained in informatics and its particular applications in the lab. However, formal educational programs in laboratory informatics are lacking. To set an educational agenda, laboratory informatics must be delineated as a field, and on the basis of that delineation, a curriculum must be developed that meets the standards of higher education, consistent with the typical expectations for related programs in science and engineering.
To address these issues, this paper gives an overview of informatics, describes the context of laboratory informatics in this setting, explains the emergence of laboratory informatics as a distinct field, sets the place for laboratory informatics in higher education by suggesting the nature and scope of the curriculum, and briefly describes the laboratory informatics initiative at Indiana University.
An Overview of Informatics
Categorization of Informatics
To categorize informatics, it must be placed in the larger context of all computer-related activities. We start with a two-dimensional field, the axes of which range from

Fig. 1. Mapping of computer-related activities.
This field has four quadrants. Quadrant I concerns computers and humans, in both cases on the small scale: the one-on-one experience of an individual working with a desktop computer, for example. This quadrant is the realm of human-computer interaction, the formal study of how people use computers and ways to improve such use. Quadrant II deals with computers on the small scale and humans on the large scale: many people using computers individually, as occurs with e-business (online auctions, banking, and shopping) and, increasingly, with online education and distance learning. The dynamics in Quadrant III concern large-scale computer-based systems and large numbers of people, with emphasis on the effect of pervasive computing and media technology on people. The fields of communication, media studies, and social informatics are most interested in these dynamics. Finally, in Quadrant IV, the interplay involves individuals or small groups dealing with large-scale computing; for example, a molecular biologist working with impossibly large data sets or a chemist trying to locate compound information that is hidden somewhere in hundreds of heterogeneous databases. This is the home of bioinformatics, cheminformatics, and health (or medical) informatics, to name the best-known fields. However, this quadrant is also home to fields outside of science (music informatics, to give but one example) that confront the same issues.
The fields in Quadrant IV all have binomial names in which prefixes, adjectives, or adjectival nouns signifying the underlying discipline are used to modify the term
It is important to note that informatics is not simply additive in nature; that is, it is not just some domain (discipline-specific) knowledge plus some information technology; rather, it is the synthesis of domain knowledge with specialized tools that have been developed or optimized to advance the underlying discipline in some specific way.
Commonalities and Differences within Informatics
Informatics has five major functions, all of them acting upon data: transformation, integration, analysis, synthesis, and representation. In
A comparative analysis of informatics brings out the commonalities of the subsumed fields. One shared characteristic is the highly specialized use of information technology for discipline-specific purposes, in which the discipline comes first and the technology second. Another characteristic is that all fields must confront the management of massive amounts of heterogeneous data, which calls for high-end solutions in networking, data storage, and so on. Third, these fields all contribute to research in their underlying disciplines by providing analysis of data in completely new ways, often turning from a priori prediction to a posteriori discovery. Last, the fields within informatics put emphasis on the novel use of data visualization, both for primary analysis and secondary dissemination. (In the case of music informatics, of course, this so-called visualization can be temporal as well as spatial.)
However, in addition to commonalities between these fields, there are also differences. This might come as a surprise, considering that all of these fields draw upon fundamentally the same technology. In fact, the differences between these fields are more significant than their similarities. The reasons for this are professional, technical, and even cultural, based on such factors as the types of data used in the field, the particular problems that challenge each field, and what pursuits are most valued by their practitioners. Although this is obvious when comparing bioinformatics with, say, music informatics, it is true even for those fields that might appear more closely aligned. The intentions and goals of bioinformatics, for example, differ entirely from those of health informatics, as do the kinds of tools used and their methods of application. (Here,
Some specific examples include the following: (1) bioinformatics uses highly specialized software and information systems in the study of genomics and proteomics, but these specific tools are of no use in geoinformatics; (2) in health informatics, mobile networking and bedside access to information (say, electronic medical records accessible by tablet PCs) are of great interest, but these are of little or no interest in bioinformatics; and (3) data pipelining and high-throughput screening are technologies that are important in cheminformatics but are unimportant in music informatics. In general, although the fields within informatics share the same basic enabling technology, they are surprisingly distinct and different from each other. The significant difference between these fields has practical implications for informatics education, which are discussed further in this paper.
The Origin of Informatics
There is an increasing disparity between

Fig. 2. The growing information gap.
Structural Hierarchy of Informatics
The central purpose of informatics is the transformation of data into information and of information into knowledge. A
In developing his theory of information, Shannon (summarized by Boisot and Canals 9 ) assigned three levels of
The Emerging Field of Laboratory Informatics
Within the realm of informatics, laboratory informatics is rapidly emerging as a distinct specialty. The following sections delineate this field's content on the basis of a model of laboratory data flow, organize this content according to the structural hierarchy of informatics, and suggest a curriculum for laboratory informatics based on this content.
The Scope of Laboratory Informatics
Laboratory informatics can be approached from the standpoint of data flow in the laboratory. In the diagram shown in Fig. 3, the overall data flow is from left to right in three major stages that loosely correspond to the structural hierarchy of informatics:

Fig. 3. Data flow in the laboratory.
On the basis of this data flow model, subjects falling under the purview of laboratory informatics can be grouped as in Table 1.
Table 1. Subject content of laboratory informatics
The Impetus for Laboratory Informatics
The final output of any laboratory, operating in any industry, is information. Like the other fields of informatics, laboratory informatics arose from a widening information gap. This gap was caused by three things: the astonishing developments in information technology; the convergence of laboratory instrumentation, which allows potentially seamless data flow; and the explosion of data created by unprecedented advances in laboratory automation. Laboratory instruments have evolved from manually operated, stand-alone devices into digitally controlled, network-aware systems that are at once autonomous and cooperative. Laboratory automation has advanced to the point that even common, bench-top instruments have automated functions in everything from sample handling to data analysis. Consequently, modern scientific output has extended the informational demands of labs far beyond simple data processing. Software applications for various laboratory operations, once hugely expensive and difficult to use, are evolving into affordable, off-the-shelf packages. This puts sophisticated information systems within reach of even the smallest labs. Furthermore, in the pharmaceutical and other industries, federal agency regulations to maintain data integrity, security, and validity are driving laboratories to adopt solutions provided by new developments in laboratory informatics. Finally, the relentless push to improve productivity through increased automation will justify further expenditures in laboratory informatics.
An Educational Model for Laboratory Informatics
Laboratory informatics is rapidly developing as a field, heretofore without benefit of formal educational support. Of the people now working in a capacity related to laboratory informatics, most have taken a long and winding road to their current position. As pioneers in an emerging field, they have had to obtain knowledge and skills in this area on an as-needed basis. They might have a formal education in a laboratory-based science, with information technology grafted on, or they might be information technologists who have specialized in supporting laboratory operations, with a passing understanding of the underlying science. Neither situation is ideal. Yet the long-term demand for these specialists will continue to increase. The time is right to bring laboratory informatics into higher education and establish a formal curriculum.
In informatics, the best educational model is one in which individuals already possessing a formal education in a discipline are trained in informatics as it applies to that discipline. This works best at the graduate level. Students can enter a graduate program in informatics with undergraduate degrees in a related discipline. Because they enter the program with a solid knowledge base in the discipline, their studies can concentrate on subjects related to informatics in general and the relevant discipline-specific informatics in particular. For example, in the case of laboratory informatics, an individual with an undergraduate degree in a laboratory-based science (e.g., chemistry, microbiology, or clinical laboratory science) could seek a graduate degree in laboratory informatics. The alternate pathway would be for someone trained in information technology to obtain further education in a laboratory-based science as well as specific training in laboratory informatics. However, this alternate pathway is not equivalent, because the individual must obtain far more education in a laboratory-based science to be adequately knowledgeable in the discipline. Usually this is a case of impossible prerequisites. To take a course in biochemistry, for example, one must have taken organic chemistry, which in turn requires a course or two in general chemistry, and so on.
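The prerequisite chain described above can be made concrete with a small sketch. It assumes a simplified, hypothetical prerequisite map containing only the three courses named in the text, and uses Python's standard `graphlib` module to derive an order in which the courses could be taken. The course names and the `course_sequence` helper are illustrative only and do not describe any actual degree requirements.

```python
from graphlib import TopologicalSorter

# Hypothetical prerequisite map: each course maps to the set of courses
# that must be completed before it. Only the chain named in the text is shown.
prerequisites = {
    "biochemistry": {"organic chemistry"},
    "organic chemistry": {"general chemistry"},
    "general chemistry": set(),
}


def course_sequence(prereqs):
    """Return the courses in an order that respects every prerequisite."""
    return list(TopologicalSorter(prereqs).static_order())


sequence = course_sequence(prerequisites)
print(sequence)
```

Even in this minimal case, a student reaches biochemistry only after two earlier courses; a realistic map for a full science degree would be considerably deeper, which is the point of the argument against the alternate pathway.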
The level of graduate education for laboratory informatics, that is, the type of degree obtained, should be based on the intended result. For laboratory informatics, emphasis would be on professional practice, which is consistent with one of the major objectives of a Master of Science degree. Typically, this degree requires 30 to 36 credit hours, consisting of both coursework and advanced, individual work for so-called capstone credit, most often a research project and subsequent thesis.
The curriculum in laboratory informatics should be set by the general requirements for the degree and by the field's subject content. The general requirements include those subjects that anyone with an advanced degree in informatics, regardless of their particular field, would be expected to know. These subjects include a general overview of informatics that spans all of its subsumed fields; information theory; information management (including database design); data security; informatics project management; and an introduction to informatics research. In addition, students should be aware of the social and economic impact of the information revolution.
For the core curriculum in laboratory informatics, much of its content is derived from mapping the general data flow in the laboratory (Fig. 3) and is organized by the structural hierarchy of informatics (Table 1). In addition, there are subjects that surround the data flow that deal with the conditions under which data flow occurs. Such subjects include regulatory compliance, systems validation, and quality assurance and control.
Laboratory informaticians need a solid technical understanding of current laboratory software applications, corporate databases and information technology systems, operation support software, manufacturing support software, and the general classes of equipment that are used in analytical, research, and production laboratories. Their technical skills should include the ability to configure and maintain hardware operations relating to all aspects of data management. Programming skills should include the ability to design, program, test, debug, and modify programs in languages used in the laboratory. Project management skills, including establishing requirements and specifications for software and hardware implementation, are also necessary.
Last, graduate education at its best is not simply more coursework continued after undergraduate study. The single greatest factor that distinguishes graduate education from undergraduate education is its requirement for advanced individual scholarship, research, or professional practice. For laboratory informatics to be accepted in academe, it must be held to the same academic standards as its counterparts in science, engineering, and computer science. Moreover, one of the hallmarks of a true profession is that its own practitioners contribute to the professional body of knowledge through scholarship and research. Therefore, it is essential that graduate study in laboratory informatics include this component. This activity should culminate in a final document, the thesis.
Here, then, is a summary of possible subjects to be included in a curriculum for a graduate program in laboratory informatics. (A listed subject does not necessarily represent a separate course.)
Prerequisite:
Bachelor's degree in chemistry or other laboratory-based science
General requirements:
Overview of informatics
Information theory
Information representation
Information organization
Information management
Social and economic impact of information
Database structures and models
Interfaces and networks
Data security
Informatics project management
Major requirements:
Programming
Data acquisition
Laboratory automation
Instrument interfacing
Laboratory networking
Data transfer protocols
Data processing
Instrument-specific data systems
Data integration systems
Database management systems
Laboratory information management systems
Scientific data management
Statistics and data analysis
Data mining
Scientific visualization
Electronic laboratory notebooks
Scientific dissemination
Data archiving and warehousing
Knowledge management
Thesis requirements:
Introduction to informatics research
Individual scholarship in laboratory informatics
Individual research in laboratory informatics
Individual professional practice in laboratory informatics
Thesis preparation
Thesis defense (or presentation)
In 2002, the Indiana University School of Informatics founded on its Indianapolis campus the Laboratory Informatics Graduate Program, the first of its kind in the country. Curriculum development for this program was supported by a grant from the Alfred P. Sloan Foundation under its Professional Science Masters Program. The first cohort of graduate students was admitted in the fall of 2003. Efforts are currently underway to expand the curriculum to include clinical laboratory informatics. The Laboratory Informatics Graduate Program is a positive first step in establishing laboratory informatics in higher education. Its structure and current list of courses are as follows.
Common Informatics Core (6 credit hours [cr]):
INFO I501 Introduction to Informatics (3 cr)
INFO I502 Information Management (3 cr)
Laboratory Informatics Core (12 cr):
CHEM 699 Chemical Information Technology (3 cr)
INFO I510 Data Acquisition and Laboratory Automation (3 cr)
INFO I511 Laboratory Information Management Systems (3 cr)
INFO I512 Scientific Data Management and Analysis (3 cr)
Electives (12 cr total; some examples below):
CHEM 621 Advanced Analytical Chemistry (3 cr)
CHEM 629 Chromatography (3 cr)
CSCI 503 Operating Systems (3 cr)
CSCI 504 Concepts in Computer Organization (3 cr)
CSCI 536 Computer Networks (3 cr)
CSCI 541 Database Systems (3 cr)
CSCI 590 Topics in Computer Science (1–3 cr)
INFO I503 Social Impact of Information Technology (3 cr)
INFO I505 Informatics Project Management (3 cr)
INFO I540 Data Mining for Security (3 cr)
INFO I550 Legal & Business Issues in Informatics (3 cr)
INFO I553 Independent Study in Chemical Informatics (1–3 cr)
INFO I575 Informatics Research Design (3 cr)
INFO I590 Topics in Informatics (1–3 cr)
STAT 511 Statistical Methods I (3 cr)
STAT 513 Statistical Quality Control (3 cr)
Capstone (6 cr):
INFO I693 Thesis/Project (1–6 cr; can be repeated for a total of 6 credits)
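As a quick arithmetic check on the structure listed above, the following sketch tallies the credit hours of the four curriculum blocks and compares the total with the 30 to 36 credit hours described earlier as typical for a Master of Science degree. The credit values are taken from the list; the variable names are illustrative only.

```python
# Credit hours per curriculum block, as listed in the program structure.
blocks = {
    "Common Informatics Core": 6,
    "Laboratory Informatics Core": 12,
    "Electives": 12,
    "Capstone": 6,
}

# The four Laboratory Informatics Core courses, each worth 3 credit hours.
core_courses = {
    "CHEM 699": 3,
    "INFO I510": 3,
    "INFO I511": 3,
    "INFO I512": 3,
}

total_credits = sum(blocks.values())
core_credits = sum(core_courses.values())

# Typical range for a Master of Science degree, as stated in the text.
within_typical_range = 30 <= total_credits <= 36

print(f"Total credit hours: {total_credits}")
print(f"Laboratory Informatics Core matches its courses: "
      f"{core_credits == blocks['Laboratory Informatics Core']}")
print(f"Within the typical 30 to 36 range: {within_typical_range}")
```

The total comes to 36 credit hours, at the upper end of the typical range, with the capstone accounting for one sixth of the program.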
Conclusion
It is widely recognized that we have entered a new historical epoch, the Information Age, which is as significant and climactic as the Industrial Revolution that preceded it. Economists speak of the new information economy, in which knowledge workers have become the dominant labor force, cosmologists frame their theories of the universe in terms of its information content, and biologists categorize organisms according to their genetic information. Once considered the ancillary descriptor of phenomena, information itself has become the central object of study. Higher education is embracing this revolution by forming new schools and departments, or expanding existing ones, that offer programs in computer science, information technology, information science, and informatics. Laboratory informatics will certainly be part of this academic transformation, to the benefit of laboratories everywhere.
