Abstract
The U.S. National Cancer Institute's Division of Cancer Epidemiology and Genetics (DCEG) conducts population-based and interdisciplinary research to discover the genetic and environmental determinants of cancer. Many DCEG studies are large, multi-institutional, and long-term, with national and international study sites involved in multiple research steps. Current information technology challenges in such epidemiological studies include: (1) management and harmonization of a multitude of data types (demographic, environmental, biospecimen, laboratory, analytic, molecular, etc.); (2) unprecedented amounts of data; (3) efficient data mining to derive insights into disease etiology; and (4) secure collaboration between study management systems. If not adequately addressed, these challenges will increase the cost of performing studies and slow publication. DCEG is examining its current data management practices to make better use of recent advances in information technology and enhance its scientific program. This analysis is providing strategic guidance for enhancing interoperability among current data systems, further automating specimen management practices, defining metadata strategies that allow for better cross-study comparability and reusability, and planning the integration of new technologies in support of DCEG's epidemiology research. Early results from the effort include better communication of information technology requirements between contractors and investigators, as well as progress on several focused data interoperability projects, including Web services transactions for biorepository interoperability and improved analytic support using data warehouses.
