Abstract
The Clinical Data Interchange Standards Consortium has developed a Laboratory Model for laboratory data that is generated during the conduct of clinical trials. The Laboratory Model is the first step in proposing standards for the interchange of clinical trial laboratory data. Standards will decrease the time and resources required by stakeholders in the pharmaceutical development process (pharmaceutical companies, biotechnology companies, contract research organizations and laboratories). Standardization will therefore contain costs as well as improve data quality.
FULL TEXT
Anyone who has worked with clinical trial data knows how much time and effort is involved in developing database specifications and structure; validating programs; exporting data from one system, importing into another, and verifying the transfer; then applying quality control procedures to the data at the receiving end. Such transfers occur continually in clinical trials and are extremely costly and time consuming. 1 This is especially true of data generated in clinical laboratories to support the conduct of clinical trials. Laboratory data may account for as much as 60–80% of the data generated during the conduct of a clinical trial. There are often many players in the laboratory chain: reference, specialty or performing laboratory, central laboratory, contract research organization (CRO), pharmaceutical company and pharmaceutical partners. Data transfers generally occur to one or more parties on multiple occasions over the course of a trial. Each transfer involves a variety of personnel: programmers, clinical data analysts, quality assurance auditors, clinical reviewers and laboratory personnel.
Some central clinical laboratories are now required to support hundreds of different interfaces in order to accommodate the varying data requirements of pharma and biotech companies and CROs. Each database format requires initial development and validation, staff training, formal maintenance and appropriate quality control.
Pharmaceutical companies, biotechnology companies, regulators, CROs, clinical investigators, technology providers and clinical laboratories (essentially everyone involved in conducting or supporting the conduct of clinical trials for drug research and development) would benefit from a common interchange standard for clinical trial data, especially clinical laboratory data. 1 One study estimates that the drug development industry spends more than $156 million annually to support data transfers to and from electronic data capture providers, clinical laboratories and CROs in global clinical trials (not including integration for merging companies or electronic regulatory submissions). The development of clinical data interchange standards is the mandate of the Clinical Data Interchange Standards Consortium (CDISC).
CDISC began as a small group of interested parties who organized a panel discussion on this topic in 1997 at an industry conference. In 1998, interested parties were invited to form a Special Interest Advisory Committee (SIAC) within the Drug Information Association (DIA). From these modest beginnings, CDISC has developed into a not-for-profit organization with hundreds of active participants and liaison groups in Europe and Japan. More than 75 companies, including pharmaceutical and biotechnology companies, technology and software developers and clinical laboratories, provide funding as corporate sponsors and members. Over the last five years, many companies involved in clinical research have realized that the maintenance and support of proprietary database structures and data models are counterproductive to the timely generation of high quality data. 1 Thus, support of CDISC continues to grow.
The open nature of CDISC is at the heart of its operations. CDISC's multidisciplinary, vendor-neutral approach, independent of implementation strategy and platform, contributes to the applicability and acceptance of the industry standards it is developing. The work of CDISC is accomplished through multidisciplinary volunteer teams that have focused on the development of models to support and facilitate a number of different types of electronic data interchange, including:
General clinical trial data from multiple acquisition sources into an operational database (Operational Data Model or ODM)
Clinical laboratory data from a performing, reference or central laboratory into an operational database at a sponsor pharmaceutical or biotechnology company or CRO (Laboratory Data Model)
Operational data from a sponsor pharmaceutical or biotechnology company to a regulatory agency (Submission Data Model or SDS)
Datasets to support statistical reviews of electronic regulatory submissions (Analysis Dataset Model or ADaM)
The focus of this article is the development of the CDISC Laboratory Model, which is now available for review and comment on the CDISC website.
The CDISC LAB Team was formed in August 2000 and currently has representatives from four pharmaceutical companies, four central laboratories, two CROs and one technology application developer. Membership includes expertise from a variety of technical disciplines including clinical laboratory medicine, data systems, information technology, programming and software development. The group has sought to include both academic and clinical perspectives. At this time, the LAB Team has representation from the United States and two European countries.
The LAB Team's mission was to develop a standard model for the acquisition and interchange of clinical trial laboratory data, to test the model with representative laboratory data to ensure functionality, and to explore other opportunities to improve clinical trial laboratory data processing through the development of standards. One of the team's first considerations was determining what other data standards already exist and why they have not been adopted to handle laboratory data generated during the conduct of clinical research. Certainly, other standards have been proposed, including those of the American Society for Testing and Materials (ASTM), Health Level 7 (HL7), the Association for Clinical Data Management (ACDM) and X12. However, judging by their lack of support and adoption in the drug development process, these existing standards appear to have limited applicability to the large volumes of data that are generated during the conduct of clinical trials. Most often, the primary application of existing standards is the handling of general healthcare information or transactional data transfers.
The structure of the previously existing standards does not accommodate the specific requirements of clinical trial laboratory data. Often, other standards are inefficient and difficult to use. They may demonstrate a lack of flexibility that makes their adoption for the various demands of clinical trials difficult. Other standards have inadequate field definitions, leading to too much variability in their adoption, and population rules often do not match the structures and inter-relationships of clinical trial data. In addition, some of these other standards are not well known and are certainly not supported by the drug development community. In the absence of industry standards, each company involved in drug development has developed its own. One of the largest central clinical laboratories has estimated that it has used 1200 different data formats to serve its clients.
After determining that no existing standard was adequate for clinical trial needs, the LAB Team next considered the requirements for a new laboratory data transfer standard. To ensure acceptance, the new standard must:
For all stakeholders, provide a clear business advantage
For sponsors, reduce the cost, complexity and resources required to manage lab data
For central laboratories, reduce the number of data formats that must be supported, thereby reducing costs and resource requirements
For CROs, reduce the hours and number of personnel required to support multiple transfer formats, thereby lowering costs and increasing their marketability
For all players, carry reasonable and justifiable costs for converting to the new standard
The LAB Team then worked to define what the industry means by “clinical laboratory data” in order to build a superset of data items that fully describes the laboratory elements of a clinical trial to the satisfaction of the stakeholders involved. This superset of fields constitutes the substance of the model.
Next, the structure of those data items was defined in terms of data type, length, default values, standards of representation, code lists and whether each item was optional or required. Wherever possible, appropriate existing standards were employed; one of the guiding principles of CDISC is working with other professional groups to encourage maximum sharing of information and minimum duplication of effort. This combined information constitutes the Laboratory Model that is now available for review, comment and use on the CDISC website.
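As a minimal sketch of what such a structural definition might look like in practice, the Python fragment below describes a data item by type, maximum length, default value, code list and required or optional status, and checks a raw value against that description. The class, the field names, the lengths and the code values are illustrative assumptions and are not taken from the published Laboratory Model.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class FieldDef:
    """One data item: type, length, default, code list, required or optional."""
    name: str
    data_type: type                      # str, int or float in this sketch
    max_length: int
    required: bool
    default: Optional[str] = None
    code_list: Optional[FrozenSet[str]] = None


def validate(field: FieldDef, raw: Optional[str]) -> List[str]:
    """Return the problems found in one raw text value (empty list if none)."""
    if raw is None or raw == "":
        if field.default is not None:
            raw = field.default          # fall back to the declared default
        elif field.required:
            return [f"{field.name}: required value is missing"]
        else:
            return []                    # optional and absent: nothing to check
    problems = []
    if len(raw) > field.max_length:
        problems.append(f"{field.name}: longer than {field.max_length} characters")
    try:
        field.data_type(raw)             # e.g. float("5.4")
    except ValueError:
        problems.append(f"{field.name}: not a valid {field.data_type.__name__}")
    if field.code_list is not None and raw not in field.code_list:
        problems.append(f"{field.name}: '{raw}' is not in the code list")
    return problems


# Hypothetical definitions, for illustration only.
SEX = FieldDef("subject_sex", str, 1, required=False,
               code_list=frozenset({"M", "F", "U"}))
RESULT = FieldDef("numeric_result", float, 20, required=True)
```

Under these assumed definitions, validate(RESULT, "5.4") reports no problems, while validate(SEX, "X") reports that the value is not in the code list.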
The LAB Team decided that the model should have a main core designed to handle “simple” laboratory data with the classic “one test, one result” data structure such as routine Chemistry and Hematology testing. More complex testing, such as that required for Microbiology or Pharmacogenomics, would require the development of extensions which would be added later. The version of the Laboratory Model currently available on the CDISC website handles routine testing only. The LAB Team is now working on the Microbiology and Pharmacogenomic extensions.
As the LAB Team continued its work, the notion of a multilayer model developed, in which the first layer would be the content and above it would sit an implementation layer. The team wished to develop the content in a manner that would remain constant even if the method of implementation were to change. The advantage of this approach is enhanced flexibility: the model is not dependent on any single implementation method and, if different implementations are used, the content of the model remains the same, so the same standards still apply. The default implementation for the Laboratory Model is bar-delimited ASCII, but SAS and XML implementations are also supported.
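The separation of content from implementation can be illustrated with a short Python sketch in which one content record is written out in two ways, as a bar-delimited line and as XML, and read back unchanged from either. The field names, their order and the XML element names are hypothetical; the actual file layouts are defined in the CDISC documentation, not here. Because this article does not describe an escaping rule for the delimiter, the sketch simply rejects values that contain a bar.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical content layer: an ordered list of field names.
FIELDS = ["study_id", "site_id", "subject_id", "test_id", "result", "units"]


def to_bar_delimited(record: Dict[str, str]) -> str:
    """Implementation 1: one record as a bar-delimited text line."""
    values = [record.get(name, "") for name in FIELDS]
    if any("|" in value for value in values):
        raise ValueError("value contains the delimiter; no escaping rule assumed")
    return "|".join(values)


def from_bar_delimited(line: str) -> Dict[str, str]:
    """Read a bar-delimited line back into the same content record."""
    parts = line.split("|")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def to_xml(record: Dict[str, str]) -> str:
    """Implementation 2: the same content as a small XML element."""
    root = ET.Element("LabResult")       # element name is an assumption
    for name in FIELDS:
        ET.SubElement(root, name).text = record.get(name, "")
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Dict[str, str]:
    """Read the XML form back into the same content record."""
    root = ET.fromstring(text)
    return {name: (root.findtext(name) or "") for name in FIELDS}


record = {"study_id": "S001", "site_id": "101", "subject_id": "0007",
          "test_id": "GLUC", "result": "5.4", "units": "mmol/L"}
```

Both from_bar_delimited(to_bar_delimited(record)) and from_xml(to_xml(record)) return the original record, which is the property the layered design is meant to preserve: the content stays the same whichever implementation carries it.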
The CDISC Laboratory Model design was based on an analysis of why existing standards are little used. Certain assumptions were made regarding what an ideal model should be, thus providing a set of goals or specification for the new model, which can be summarized as follows:
The model should offer clear advantages over other clinical lab data interchange standards
The model should be designed specifically to handle clinical trial lab data
The structure and content of the model should be intuitive and clearly understandable
The model should be sufficiently flexible so that it could be applied to all types of lab data, keeping pace with testing evolution, but not sacrificing simplicity to cope with outliers
The first release of the model should be as comprehensive as possible to avoid the need for continual updates
The model should not be limited to a single specific implementation and thereby risk rejection due to technical incompatibility
The development of the model should concentrate first on content and then implementation
The model should accommodate controlled flexibility to support differing preferences
The model should support data interchange between all types of players in the industry
The model should use existing standards and draw upon the work and experience of existing standards organizations
A separate model for the interchange of reference range data should be developed, based on the main Laboratory Model
The model should support both incremental and cumulative data interchange
The model should support the interchange of data from one or more studies in a single data file
The model should support complex (e.g., Microbiology) as well as very long (e.g., gene sequences) test results
The model must incorporate identification and tracking of specimens at various levels (e.g., kit, specimen within kit)
The model must permit identification of timed collections as well as planned versus actual collection times
The model must permit the differentiation of collection versus receipt versus reporting identifiers
The model must support a variety of collection and reporting units
The model must support a variety of reporting flags and descriptors
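One of the goals above, support for both incremental and cumulative interchange, can be sketched in a few lines of Python. The sketch assumes that a cumulative file carries every record to date and so replaces what the receiver holds, while an incremental file carries only new or changed records and is merged by key. The record key used here (subject, accession, test) is a hypothetical choice for illustration and is not the key defined by the Laboratory Model.

```python
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str, str]               # (subject, accession, test), assumed
Store = Dict[Key, str]                   # key -> reported result


def load_cumulative(records: Iterable[Tuple[Key, str]]) -> Store:
    """A cumulative file holds everything to date, so it becomes the store."""
    return dict(records)


def apply_incremental(store: Store, records: Iterable[Tuple[Key, str]]) -> Store:
    """An incremental file holds only new or changed records, merged by key."""
    updated = dict(store)                # leave the caller's store untouched
    updated.update(records)              # later values overwrite earlier ones
    return updated


store = load_cumulative([(("0007", "A1", "GLUC"), "5.4")])
merged = apply_incremental(store, [(("0007", "A1", "GLUC"), "5.5"),
                                   (("0007", "A1", "NA"), "140")])
```

After the incremental file is applied, merged holds the corrected glucose result and the new sodium result, while store still holds the original cumulative load.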
The superset of data fields in the CDISC Laboratory Model is separated into ten logical levels as follows:
Good Transmission Practice
Study
Site
Subject
Visit
Accession
Container
Panel
Test
Result
These levels were chosen because they follow the recognizable hierarchy of clinical laboratory data generated during the conduct of clinical trials.
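To make the hierarchy concrete, the following Python sketch groups flat result records into nested dictionaries, one layer per level from Study down to Test, with the results stored at the leaves. The key names and sample values are assumptions for illustration; the Good Transmission Practice level is left out because this article does not describe its fields.

```python
from typing import Any, Dict, List

# Level names follow the article; the first level (Good Transmission Practice)
# is omitted, and "result" is stored at the leaves.
LEVELS = ["study", "site", "subject", "visit",
          "accession", "container", "panel", "test"]


def nest(records: List[Dict[str, str]]) -> Dict[str, Any]:
    """Group flat records into nested dicts, one dict layer per level."""
    tree: Dict[str, Any] = {}
    for rec in records:
        node = tree
        for level in LEVELS[:-1]:
            node = node.setdefault(rec[level], {})
        # Innermost layer maps each test to its list of results.
        node.setdefault(rec["test"], []).append(rec["result"])
    return tree


base = {"study": "S001", "site": "101", "subject": "0007", "visit": "V1",
        "accession": "A1", "container": "C1", "panel": "CHEM"}
records = [dict(base, test="GLUC", result="5.4"),
           dict(base, test="NA", result="140")]
tree = nest(records)
```

Here the two sample records share every level down to the panel, so tree["S001"]["101"]["0007"]["V1"]["A1"]["C1"]["CHEM"] holds both tests side by side.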
Detailed information on the CDISC Laboratory Model and on each of these levels is available on the website. Visitors to the site may access or download a number of documents relating to the Laboratory Model, including:
a spreadsheet with the content of the Laboratory Data Model
a Reference Range spreadsheet
a Word document with supporting documentation and explanation of the CDISC Laboratory Model
Although each type of standard data model developed by CDISC is unique, the CDISC process for developing standards involves certain common steps. In accordance with this process, after internal and external focused reviews, a review version of the developing standard is posted for public review and comment. After incorporation of comments received from this public review, a released or production version of the standard is posted on the website as Version 1.0.
In accordance with this process, LAB Team members tested the CDISC Laboratory Model, and the model was modified based on the results of that testing. Those who tested the Laboratory Model reported that the model is easy to use because the structure and levels of the model are clear and logical. Thus, for central clinical laboratories, population of the data fields is straightforward and unambiguous. Clear relationships could be seen between model requirements and the location of data within clinical data management systems (CDMS). Pharmaceutical companies that tested the model reported that data was easily extracted from the files and that the logical organization of the data and the ease of its extraction allowed straightforward translation into existing technical infrastructures and applications.
The Laboratory Model was then sent to a Review Committee of industry experts, and additional modifications were made based on their feedback. The Laboratory Model was then posted for public review and comment on the CDISC website, and a final round of modifications was made before its posting as the first production version of the model, Version 1.0.
For next steps, as mentioned, the LAB Team is working on extensions to the model to handle more complex data types as well as extensions for various approaches to implementation of the model. In addition, the team will prospectively collect implementation case study metrics from sponsors, CROs, central laboratories and technology vendors who work with us while adopting the Laboratory Model.
CDISC is working closely with Health Level 7 (HL7) and has helped to form a Technical Committee within HL7 to explore areas of common interest and collaboration. This Technical Committee is called Regulated Clinical Research and Information Management (RCRIM). Work is focusing on harmonizing CDISC models with the HL7 reference information model. This is currently an area of focus for the CDISC LAB Team. The collaboration with HL7 is of particular interest to the LAB Team as it permits us to capitalize on HL7's 15 years of experience in standards development and ANSI accreditation. It is hoped that close collaboration will result in accreditation for the CDISC Laboratory Model.
We encourage all those interested or involved in laboratory data interchange in support of clinical research to visit the CDISC website and learn about the organization and, specifically, the work of the LAB Team. We invite your feedback and would welcome your involvement in our work.
