Sage Journals: Discover world-class research

Abstract

Clinicians have limited exposure to patients with rare gastrointestinal diseases. This hampers clinical studies and impedes the accumulation of scientific knowledge on the natural disease course, treatment outcomes and prognosis in these patients. A registry study is an excellent method to detect patterns at an aggregate level that cannot be discovered in individual cases. This paper describes a template for creating a successful international registry for rare diseases. We focus mainly on rare hepatic diseases, but the lessons from this paper apply to other fields of medicine as well.

Keywords

Clinical registry, database, gastrointestinal disease, liver disease, liver transplantation, practical guide, rare diseases, rare liver disorders, registry design

Introduction

Increasing our knowledge about rare liver disorders is imperative; a rare disorder is commonly defined as one that affects <1 in 2000 citizens.¹ Because most physicians are not exposed to large numbers of patients with a given rare disease, their knowledge of its natural course, treatment response and prognosis is incomplete. These difficulties limit our understanding and are an obstacle to research efforts to improve the outlook of patients with rare diseases.

Registries may be the answer to the lack of solid evidence. By definition, a registry is an organized system that uses observational study methods to collect existing or uniform clinical data from individual patients.² A registry offers a unique opportunity to conduct research on populations and conditions that are not generally studied in clinical trials, yet are important to clinical decision-makers.³

The steps in creating a registry study do not differ much from the implementation of a clinical trial. All the fundamental elements, such as design, study population, timeline and data management, are likewise present. Unlike for clinical trials, however, there is no standard guidance on how to design a registry. A helpful open access resource is Registries for Evaluating Patient Outcomes: A User's Guide,² from the Agency for Healthcare Research and Quality.

The purpose of this article is to provide a methods-based paper on how to develop an effective clinical registry for rare hepatic disorders (Table 1). The most important aspects of the decision process are discussed in view of our own experiences, and illustrated by examples from successful rare liver disease registries in the literature. The lessons from our paper can be applied to other fields of medicine as well.

Table 1.

Main aspects in the design of a clinical registry.

1.	Objectives	• A variety of aims can be addressed by registry studies, for example to study:
		∘ Natural history;
		∘ Epidemiology;
		∘ Quality of life;
		∘ Long-term efficacy;
		∘ Safety;
		∘ Cost-effectiveness.
2.	Study population	• Define the target population;
		∘ Do not apply too strict inclusion criteria.
		• Avoid specialist center bias by including patients from general as well as specialist centers.
3.	Design	• Set up an international collaboration in order to include a large study population.
		• Create a web-based data management system.
4.	Data collection	• Identify a (small) core dataset of the most relevant data variables;
		∘ Use peer review and experts in the field;
		∘ If possible, include PROMs.
		• Involve patients for inclusion of self-reported data.
5.	Data quality	• Verify reliability and reproducibility of data collection;
		∘ Double entry a proportion of all data;
		∘ Cross-validate self-reported data;
		∘ Establish a monitoring committee;
		∘ Formulate validation rules in the data management system.
		• Handle missing data;
		∘ Apply an imputation technique.
6.	Privacy	• Every patient needs to have an anonymous research number.

Objectives

The most important task before initiating a registry study is to define the main goal. Dividing your main goal into specific objectives and outcome measures will help you to decide on the best registry design. Registry studies can be created to address a broad spectrum of questions. We will illustrate this by using several examples that demonstrate the impact of international multicenter databases on clinically relevant issues (Table 2).

Table 2.

Examples of multi-country liver disease registries

Name	Founding country	Participating countries	Size (∼)	Website
Hepatitis delta registry	Germany	11	UK	http://hepatitis-delta.org/
DILI registry	Spain	1	901 cases 864 patients	http://www.spanishdili.uma.es/index.php/es/
Spanish Latin American DILI Network³²	Spain	9	190 cases 181 patients	–
PLD registry	The Netherlands	4	>500 patients	–
European liver transplant registry¹⁰	France/Germany/UK	27	106,849 patients 118,441 LTx	http://www.eltr.org/
International PSC Study Group⁹	Norway	>17	7312 patients	http://www.ipscsg.org/
Autoimmune pancreatitis³³	USA/Japan	10	1064 patients	–
International PBC Study group^18,19	The Netherlands	20	>6000 patients	http://globalpbc.com/

DILI: drug-induced liver injury; LTx: liver transplantation; PBC: primary biliary cirrhosis; PLD: polycystic liver disease; PSC: primary sclerosing cholangitis.

Natural course, quality of life and epidemiology

One of the goals of a registry could be to study the natural course of a disease and its associated factors. We designed a polycystic liver disease (PLD) registry with exactly this in mind. PLD is a disorder in which patients progressively develop liver cysts. To date, information on the natural course of PLD is lacking, as are answers to questions such as which factors predict an aggressive disease course. This registry will help us to elucidate the behavioral risk factors for disease and assess the differences in treatment choices between countries.^4,5

The UK Primary Biliary Cirrhosis (UK-PBC) collaboration is an excellent example of a network that has already established a large, successful national registry.⁶ Primary biliary cirrhosis (PBC) is a rare disease (with a prevalence of 30 per 100,000 individuals in the population) with a highly variable phenotype and a high prevalence among women (the male to female ratio is 1:10).⁷ The sheer size of this registry makes it possible to study the clinical profile seen in the subgroup of male PBC patients. In addition, this consortium recently developed the UK-PBC risk score to assess prognosis in PBC patients.⁶ Finally, this registry makes it possible to map the natural history of the disease in the total PBC population, to link genetic susceptibility with phenotype and outcome, and to study the impact of PBC on the patients’ quality of life.^7,8

Registry studies also facilitate studies on incidence and prevalence. A requirement is that they sample cases from a confined geographical area. Studies from the Primary Sclerosing Cholangitis (PSC) Study Group are a fine example: all PSC patients in an area of six adjacent provinces, comprising 50% of the Dutch population, were identified.⁹

Long-term efficacy

A registry is well suited to studying the long-term efficacy of therapeutic interventions. Indeed, the relative probability of death and graft loss after primary liver transplantation (LTx) is difficult to estimate for a number of rare liver disorders. This is the rationale for the European Liver Transplant Registry,¹⁰ which collects data on death and graft loss as rare outcome measures in 8,840 transplanted patients.

Safety

A patient registry can be used to investigate safety, by collecting data on the unexpected adverse events of drugs. Drug-induced liver injury (DILI) is the most cited reason why approved drugs are withdrawn from the market by the US Food and Drug Administration (FDA).¹¹ Bromfenac and troglitazone are two well-known examples of drugs that were withdrawn because of severe hepatotoxicity that became apparent in the post-approval period.¹² A specific registry, such as the Spanish DILI Registry, collects real-life data on drug safety and therefore allows a better estimate of the magnitude of a drug's side effects, in terms of incidence or prevalence.

Cost-effectiveness

Registries are also a tool to investigate cost-effectiveness, which has become an important aspect of the market access package for novel interventions. The National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) database has been used to measure the comparative effectiveness and cost-effectiveness of treatment modalities for hepatocellular carcinoma (HCC). This has resulted in a clear picture of the costs of treatment modalities (LTx, chemotherapy, radiation, resection or no treatment) over various HCC stages, in relation to survival (effectiveness).¹³ Registries such as the SEER database can also be used to address other related questions.¹⁴

Materials and methods

Study population

Target population

The purpose of a registry is a key factor in determining the target population. This is the population for whom the results are relevant and which is, at the same time, the source of the registry data. The actual registry population is a mere reflection (and probably a fraction) of the complete patient population. Only in the case of an extremely rare disease is it possible to reach a coverage rate that approaches completeness. For example, the Dutch national Multiple Endocrine Neoplasia Type 1 database has been able to capture >90% of the total patient population in The Netherlands.¹⁵ This contrasts with the situation in PBC, where the UK-PBC group has managed to include approximately 25% of all PBC patients in the UK.⁷

In order to appreciate the variability in phenotypic presentation of a disorder such as PLD, it is paramount to sample a large number of patients who are followed for a considerable time period. For PLD, we have found it difficult to arrive at a watertight disease definition. A cut-off for the number of cysts (such as the presence of >20 liver cysts) is rather arbitrary and is not always strictly used by physicians. Some PLD mutation carriers (who will most likely develop the disease phenotype with time) do not have the required number of cysts and may be asymptomatic at the time of inclusion. Overly strict inclusion criteria increase the risk of excluding relevant patient populations, which leads to sampling bias and compromises the external validity of the results. Therefore, it is key to consider the consequences of inclusion criteria that are too strict.

For some diseases, there is a wide variation in terms of the disease complexity and the treatment strategies used between university and general district hospitals. In view of this, the UK-PBC consortium managed to include thousands of patients from general centers, as well as specialist centers across the entire UK.⁷ This resulted in a geographically representative cohort, avoiding specialist center bias.

A large epidemiological study in PSC patients highlights the influence of selection and/or referral bias in population-based studies. The median survival until liver transplantation or PSC-related death was 13.2 years in tertiary referral centers, while transplant-free survival was 21.3 years in the total cohort (p < 0.0001). This highlights that it is paramount to collect data from university and general district hospitals, as well as tertiary referral centers, for an accurate assessment of survival in uncommon diseases such as PSC.⁹

Design

International collaboration

National and international collaboration is crucial in order to collect a large study population. Isolated PLD is a rare liver disease with a prevalence of 1 in 158,000 people; PLD may also occur in the context of autosomal dominant polycystic kidney disease, which carries a prevalence of 1 in 1000.^16,17 Currently, our local registry includes approximately 500 patients. We used our professional network, established for clinical trials, in order to achieve a larger study population. Promoting your registry online or through presentations increases the visibility of the project and enables collaboration with international researchers.
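To make the need for collaboration concrete, the prevalence figures above can be converted into expected case counts. The following is a minimal sketch only: the population size of 17 million is an assumed round number for a hypothetical country, and the function name is illustrative.

```python
# Illustrative arithmetic: expected number of prevalent cases in a population.

def expected_cases(population: int, one_in: int) -> float:
    """Expected prevalent cases for a prevalence of 1 in `one_in` people."""
    return population / one_in

# ASSUMPTION: a hypothetical country with 17 million inhabitants.
ASSUMED_POPULATION = 17_000_000

# Prevalence figures as quoted in the text.
isolated_pld = expected_cases(ASSUMED_POPULATION, 158_000)
adpkd = expected_cases(ASSUMED_POPULATION, 1_000)

print(f"Isolated PLD: about {isolated_pld:.0f} expected cases")
print(f"ADPKD: about {adpkd:.0f} expected cases")
```

Under this assumption, a single country would hold only on the order of a hundred patients with isolated PLD, which illustrates why pooling across countries is needed to reach a sizeable cohort.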

The global PBC Study Group is a multicenter collaboration between 15 centers that have developed a registry, including the medical information of almost 5000 PBC patients in Europe and North America, based on individual databases.¹⁸ These data were used to develop a validated scoring system to predict transplant-free survival in ursodeoxycholic acid-treated PBC patients and to elucidate predictors for development of HCC.^19,20 International successes like these emphasize that combining several national databases constitutes a unique opportunity to obtain the power to execute studies.

International cohort studies facilitate our understanding of heterogeneity in rare diseases, by stratification of the at-risk groups. Risk stratification helps to identify the patient subgroups with low and high risk profiles, and allows us to select the patients who have the greatest potential to benefit from treatment.^21,22 The GLOBE-score is a validated risk stratification tool that predicts transplant-free survival of PBC patients who were treated with ursodeoxycholic acid, leading to more stratified and evidence-based individualized care.¹⁹

Stakeholders

In the process of creating a registry, it is pivotal to consider the target audience for whom the outcomes matter. The identification of stakeholders is key to help determine the objectives of a registry, as they have an essential role in using or disseminating the results from a registry. Patients, physicians, scientific societies, insurance companies, hospital staff and policymakers, who may have a vested interest in the development of the registry, should be involved, as they are needed for public support. Key success factors are engagement (i.e. active influence on the shaping of the registry) and long-term commitment. This can be achieved by organizing open sessions with different stakeholders, to introduce the concept of a registry in an early phase of registry development. In addition, it is important to motivate all parties by making the benefits of the registry visible. For example, authorship is important for the visibility of individual participants, and it is advisable to set up agreements on authorship early in the process of registry development.

Data management

A reliable data management system is essential. Direct communication between electronic patient records and registries would be ideal for the collection of registry data, as it saves money and time. Since most hospital systems are not yet set up to accommodate this, the most accurate and reliable method to collect data is through the creation of a web-based data management system. Though costs are higher in comparison to a non-electronic data management system, it enhances quality; as validation rules can be formulated that allow monitoring of data integrity. The host of a web-based registry can determine which roles the data collectors will have in the electronic environment. Every role comes with its own responsibilities. There can be a role for the patients, in order to complete a questionnaire, or for researchers who collect their medical data. Another benefit of a web-based registry is that it allows decentralized data entry; and thus, the possibility to collect data internationally. Examples of electronic international registries are the Hepatitis C virus-TARGET and the Hepatitis Delta International Network, both of which were used for longitudinal observational studies.^23,24 An electronic registry is a financial investment, but in view of quality monitoring and efficiency, it will certainly pay off.
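The idea that each role in the electronic environment carries its own responsibilities can be sketched as a simple permission table. This is a hypothetical illustration: the role names, the permissions and the `is_allowed` helper are assumptions and do not describe any particular registry platform.

```python
from enum import Enum, auto


class Permission(Enum):
    COMPLETE_QUESTIONNAIRE = auto()
    ENTER_CLINICAL_DATA = auto()
    ASSIGN_ROLES = auto()


# ASSUMPTION: hypothetical roles; a real registry host defines its own.
ROLE_PERMISSIONS = {
    "patient": {Permission.COMPLETE_QUESTIONNAIRE},
    "researcher": {Permission.ENTER_CLINICAL_DATA},
    "host": {Permission.ENTER_CLINICAL_DATA, Permission.ASSIGN_ROLES},
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Return True if the role carries the permission; unknown roles get none."""
    return permission in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("patient", Permission.COMPLETE_QUESTIONNAIRE))
print(is_allowed("patient", Permission.ENTER_CLINICAL_DATA))
```

Keeping the mapping in one table makes it straightforward for the host to review who can do what when new centers join.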

Timeline

Registries can have a fixed or open-ended timeline, depending on the overall purpose of the registry. Most studies using a registry as an observational method have begun as open-ended projects, without a pre-defined stopping point. If continuation of the registry does not add any valuable information to the already captured data, the registry should be terminated and its data reported.²

Data collection

Data elements

Data collection is a time-consuming process; and it is essential to consider all data elements that are central to the objective of the registry, to avoid the collection of high volumes of data with limited value. What helps in this process is to divide the main goals into specific objectives, subdividing further into measurable outcomes.² For example, our goal is to study the natural and clinical course of PLD, so one important objective is to obtain information on the determinants associated with treatment. As such, we need to include at least the following elements: current age, gender, age at diagnosis, date of first treatment and treatment strategy. We used an expert panel in order to capture all the relevant variables. Ultimately, a small number of the most important variables remained.²⁵
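A small core dataset such as the one listed above could be represented as a typed record. The sketch below is illustrative only: the class name, field names, the derived helper and the example values are assumptions, not the actual structure of the PLD registry.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CoreRecord:
    """Hypothetical core dataset for one registry participant."""
    research_number: str                       # anonymous code, never a name
    current_age: int                           # years
    gender: str
    age_at_diagnosis: int                      # years
    date_of_first_treatment: Optional[date]    # None if not yet treated
    treatment_strategy: Optional[str]          # None if not yet treated

    def years_since_diagnosis(self) -> int:
        """Derived value; computed rather than stored to avoid redundancy."""
        return self.current_age - self.age_at_diagnosis


# Fictitious example values for illustration only.
record = CoreRecord("X-0001", 52, "female", 44, date(2012, 3, 1), "strategy A")
print(record.years_since_diagnosis())
```

Deriving values such as disease duration from stored elements, rather than collecting them separately, is one way to keep the number of collected variables small.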

Self-reported data by patients and patient-reported outcome measures

Variables in patient registries can be collected by researchers or physicians, depending on the origin of the data. Another option is to involve patients in this process. The UK-PBC group has utilized this concept, as the authors used self-reported information from a large national cohort of PBC patients (n = 2353). For items such as age at diagnosis and therapy for PBC, it is recommended to cross-check the self-reported data with the medical record data; in the UK-PBC study, this cross-check showed a high correlation, suggesting a high level of accuracy of the self-reported data.⁷
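A cross-check of self-reported items against the medical record can be summarized with an agreement statistic. The sketch below uses percentage agreement and Cohen's kappa for a categorical item; these measures are chosen here for illustration and are not necessarily the statistic used by the UK-PBC group. The data are fictitious.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of paired observations on which both sources agree."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / n ** 2
    if expected == 1.0:          # both sources use a single identical category
        return 1.0
    return (observed - expected) / (1 - expected)


# Fictitious item: "currently on therapy?" per patient, from two sources.
self_reported = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
medical_record = ["yes", "yes", "no", "yes", "no", "yes", "yes", "yes"]

agreement = percent_agreement(self_reported, medical_record)
kappa = cohens_kappa(self_reported, medical_record)
print(f"agreement={agreement:.3f}, kappa={kappa:.3f}")
```

For continuous items such as age at diagnosis, a correlation coefficient or the share of values within a tolerance would be more natural choices.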

As the patient’s view on their health status and treatment preferences has obtained a central position in the choice of treatment strategies, it is desirable to include the patient-reported outcome measures (PROMs). PROMs are ideal instruments to measure health gain.^26,27 This development is endorsed, as illustrated by the guidance on PROMs that is offered by the US FDA.²⁸ Web-based questionnaires are an ideal modality to collect PROMs.²⁹

Data quality

Data quality and monitoring

All elements that are included in a registry should be pre-defined, so that during data collection it is clear to the data collector which information should be entered. For our PLD registry, we performed a pilot study to test whether all definitions were interpreted in the same manner. Two researchers collected data from the medical records of the same patients, and the results were compared. This allowed us to clarify obscurities and vague definitions, and to add some missing questions or variables. Several options are available to verify the reliability and reproducibility of data. The gold standard for data entry is the double entry of 5–10% of all patient points, to check and verify.³⁰ An even better option is to include a quality and control committee, for central and/or local monitoring, in order to guarantee the quality of data. Such a committee should monitor electronic data collection and visit different sites for quality checks. By formulating validation rules in the electronic data management system, incorrect or inconsistent data (for instance, pre-menopausal status in men) can be easily found and rectified.
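Validation rules and a double-entry check could look roughly as follows. This is a minimal sketch under stated assumptions: the field names, the second rule (age at diagnosis not exceeding current age), the 10% default fraction and all function names are illustrative, not the rules of an actual registry.

```python
import random


def validate(record):
    """Return a list of rule violations for one record (empty if none)."""
    errors = []
    # Rule taken from the example in the text.
    if record.get("sex") == "male" and record.get("menopausal_status") is not None:
        errors.append("menopausal status recorded for a male patient")
    # ASSUMPTION: an additional plausibility rule, added for illustration.
    age, age_dx = record.get("current_age"), record.get("age_at_diagnosis")
    if age is not None and age_dx is not None and age_dx > age:
        errors.append("age at diagnosis exceeds current age")
    return errors


def sample_for_double_entry(record_ids, fraction=0.10, seed=0):
    """Draw a reproducible random subset of records to be entered twice."""
    ids = list(record_ids)
    k = max(1, round(len(ids) * fraction))
    return sorted(random.Random(seed).sample(ids, k))


def discrepancies(first_entry, second_entry):
    """Fields whose values differ between two independent entries."""
    fields = first_entry.keys() | second_entry.keys()
    return {f: (first_entry.get(f), second_entry.get(f))
            for f in fields if first_entry.get(f) != second_entry.get(f)}


bad = {"sex": "male", "menopausal_status": "pre",
       "current_age": 50, "age_at_diagnosis": 40}
print(validate(bad))
print(discrepancies({"age": 50, "cysts": 25}, {"age": 50, "cysts": 52}))
```

Flagged records and discrepant fields would then be resolved against the source medical record rather than corrected automatically.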

Handling missing data

Registry data are often routinely collected and therefore bear the risk of incompleteness. There are several options to deal with this during data analysis. Imputation, a statistical method that replaces missing data with substituted values, may be applied here. There are several imputation techniques, but multiple imputation, which generates several imputed data sets and averages the outcomes across them, is the most popular. The main advantage of multiple imputation is that the sample size and variability are preserved.³¹ The global PBC studies adjusted for missing data by multiple imputation, which did not affect the results.^18,19
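The averaging step of multiple imputation is commonly carried out with Rubin's rules, which combine the estimate from each imputed data set and add the between-imputation variability to the variance. The sketch below assumes that each imputed data set has already been analysed separately; the function name and the numbers are illustrative.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)            # average of the m estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Fictitious results from m = 3 imputed data sets.
pooled, total = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.05, 0.03])
print(f"pooled estimate={pooled:.3f}, total variance={total:.4f}")
```

The between-imputation term is what keeps the uncertainty caused by the missing values visible in the final standard error, in contrast to filling in a single value.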

Privacy: anonymous data entry

Anonymous data entry in research is important, particularly for rare disease registries, as the patients may be traced back easily. According to privacy rules, patient names should be substituted by specific codes. We used anonymous codes for all the PLD patients in our registry, with separate codes for their country and hospital. In order to trace back patients during follow-up, we use a decoding list for every center, including the research number, gender, birth date and hospital number. Care must be taken to check registries for double inclusion of patients. This can be done by checking the names and, if needed, the data of patients with similar birth dates.
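Assigning anonymous research numbers and screening a decoding list for possible double inclusion could be sketched as follows. The code format, the field names and the matching on identical birth date plus gender are assumptions made for illustration; candidates found this way would still need manual review at the center that holds the decoding list.

```python
import secrets
from collections import defaultdict


def new_research_number(country, hospital, existing):
    """Create a random code prefixed by country and hospital codes.

    `existing` is the set of codes already issued; it is updated in place.
    """
    while True:
        code = f"{country}-{hospital}-{secrets.token_hex(4).upper()}"
        if code not in existing:
            existing.add(code)
            return code


def possible_duplicates(decoding_list):
    """Group research numbers that share birth date and gender."""
    groups = defaultdict(list)
    for entry in decoding_list:
        key = (entry["birth_date"], entry["gender"])
        groups[key].append(entry["research_number"])
    return [codes for codes in groups.values() if len(codes) > 1]


issued = set()
first = new_research_number("NL", "01", issued)
second = new_research_number("NL", "01", issued)

# Fictitious decoding list kept locally at one center.
decoding_list = [
    {"research_number": "A", "birth_date": "1970-01-01", "gender": "F"},
    {"research_number": "B", "birth_date": "1970-01-01", "gender": "F"},
    {"research_number": "C", "birth_date": "1980-05-05", "gender": "M"},
]
print(possible_duplicates(decoding_list))
```

Because the codes are random rather than derived from patient details, they cannot be reversed without the decoding list, which stays at the participating center.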

Conclusions

Registries in medical science offer a clear opportunity to fill important gaps in knowledge about rare diseases through national and international collaboration. This paper provides a framework for the development of a clinical registry and covers the important aspects that need attention during this process.

Footnotes

Funding

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Conflict of interest

None declared.

References

1. Rare Diseases: Understanding this public health priority, www.eurordis.org/sites/default/files/publications/Fact_Sheet_RD.pdf (2005, accessed 16 May 2009).

2. Gliklich, Dreyer. Registries for evaluating patient outcomes: A user’s guide, 2nd ed. (Prepared by Outcome DEcIDE Center [Outcome Sciences, Inc. d/b/a Outcome] under Contract No.HHSA29020050035I TO3). AHRQ Publication No.10-EHC049. Rockville, MD: Agency for Healthcare Research and Quality, September 2010.

3. Dreyer, Garner. Registries for robust evidence. J Am Med Ass 2009; 302: 790–791.

4. Gevers, Drenth. Diagnosis and management of polycystic liver disease. Nat Rev Gastroenterol Hepatol 2013; 10: 101–108.

5. Temmerman, Missiaen, Bammens. Systematic review: The pathophysiology and management of polycystic liver disease. Aliment Pharmacol Ther 2011; 34: 702–713.

6. Carbone, Sharp, Flack. The UK-PBC Risk Scores: Derivation and validation of a scoring system for long-term prediction of end-stage liver disease in primary biliary cirrhosis. Hepatology 2015; doi: 10.1002/hep.28017.

7. Carbone, Mells, Pells. Sex and age are determinants of the clinical phenotype of primary biliary cirrhosis and response to ursodeoxycholic acid. Gastroenterology 2013; 144: 560–569.

8. Mells, Pells, Newton. Impact of primary biliary cirrhosis on perceived quality of life: The UK-PBC national study. Hepatology 2013; 58: 273–283.

9. Boonstra, Weersma, Van Erpecum. Population-based epidemiology, malignancy risk and outcome of primary sclerosing cholangitis. Hepatology 2013; 58: 2045–2055.

10. Schramm, Bubenheim, Adam. Primary liver transplantation for autoimmune hepatitis: A comparative analysis of the European Liver Transplant Registry. Liver Transpl 2010; 16: 461–469.

11. Navarro, Rossi. Drug-induced liver injury: Its pathophysiology and evolving diagnostic tools. Aliment Pharmacol Ther 2011; 34: 11–20.

12. Lee. Drug-induced hepatotoxicity. N Engl J Med 2003; 349: 474–485.

13. Shaya, Breunig, Seal. Comparative and cost-effectiveness of treatment modalities for hepatocellular carcinoma in SEER-Medicare. Pharmacoeconomics 2014; 32: 63–74.

14. Eggert, McGlynn, Duffy. Fibrolamellar hepatocellular carcinoma in the USA, 2000–2010: A detailed report on frequency, treatment and outcome based on the Surveillance, Epidemiology, and End Results (SEER) database. Unit Europ Gastroenterol J 2013; 1: 351–357.

15. De Laat, Pieterman, Van den Broek. Natural course and survival of neuroendocrine tumors of thymus and lung in MEN1 patients. J Clin Endocrinol Metab 2014; 99: 3325–3333.

16. Spithoven, Kramer, Meijer. Renal replacement therapy for autosomal dominant polycystic kidney disease (ADPKD) in Europe: Prevalence and survival; an analysis of data from the ERA-EDTA Registry. Nephrol Dial Transplant 2014; 29: S15–25.

17. Bae, Zhu, Chapman. Magnetic resonance imaging evaluation of hepatic cysts in early autosomal-dominant polycystic kidney disease: The Consortium for Radiologic Imaging Studies of Polycystic Kidney Disease Cohort. Clin J Am Soc Nephrol 2006; 1: 64–69.

18. Lammers, Van Buuren, Hirschfield. Levels of alkaline phosphatase and bilirubin are surrogate end points of outcomes of patients with primary biliary cirrhosis: An international follow-up study. Gastroenterology 2014; 147: 1338–1349.

19. Lammers, Hirschfield, Corpechot. Development and validation of a scoring system to predict outcomes of patients with primary biliary cirrhosis receiving ursodeoxycholic acid therapy. Gastroenterology 2015; 1–9; doi: 10.1053/j.gastro.2015.07.061.

20. Trivedi, Lammers, Van Buuren. Stratification of hepatocellular carcinoma risk in primary biliary cirrhosis: A multicentre international study. Gut 2015; 1–9; doi: 10.1136/gutjnl-2014-308351.

21. Trivedi, Bruns, Cheung. Optimising risk stratification in primary biliary cirrhosis: AST/platelet ratio index predicts outcomes independent of the ursodeoxycholic acid response. J Hepatol 2014; 60: 1249–1258.

22. Trivedi, Corpechot, Pares. Risk stratification in autoimmune cholestatic liver diseases: Opportunities for clinicians and trialists. Hepatology 2015; doi: 10.1002/hep.28128.

23. Wedemeyer H. Hepatitis Delta International Network, http://hepatitis-delta.org/assets/DownloadPage/000000/HDINEN-Protocol-Hepatitis-Delta-International-Network-OFFICIAL-FORM.pdf.

24. Hepatitis C Therapeutic Registry and Research Network, www.hcvtarget.org/.

25. Solomon, Henry, Hogan. Evaluation and implementation of public health registries. Public Health Rep 1991; 106: 142–150.

26. Black. Patient reported outcome measures could help transform healthcare. Brit Med J 2013; 346: f167–f167.

27. Alrubaiy, Hutchings, Williams. Assessing patient reported outcome measures: A practical guide for gastroenterologists. Unit Europ Gastroenterol J 2014; 2: 463–470.

28. US Department of Health and Human Services FDA Center for Drug Evaluation and Research, US Department of Health and Human Services FDA Center for Biologics Evaluation and Research, et al. Guidance for industry. Patient-reported outcome measures: Use in medical product development to support labeling claims: Draft guidance. Health Qual Life Outcomes 2006; 4: 79.

29. Younossi, Stepanova, Nader. Patient-reported outcomes in chronic hepatitis C patients with cirrhosis treated with sofosbuvir-containing regimens. Hepatology 2014; 59: 2161–2169.

30. Paulsen, Overgaard, Lauritsen. Quality of data entry using single entry, double entry and automated forms processing: An example based on a study of patient-reported outcomes. PLoS One 2012; 7: e35087–e35087.

31. Walani, Cleland. The multiple imputation method: A case study involving secondary data analysis. Nurse Res 2015; 22: 13–19.

Creating an effective clinical registry for rare diseases
