  • Regular Article

Usefulness of a large automated health records database in pharmacoepidemiology

Abstract

Objectives

In the present study, using a large automated health records database, we investigated the incidence of cardio-cerebrovascular events, diabetes new-onset events, and dialysis initiation events in hypertensive patients, and examined the effects of antihypertensive medications on these incidences.

Materials and methods

We conducted a search of an automated health records database that contained anonymous information from the health insurance claims and the results of laboratory tests at 15 medical facilities across Japan. The study cohort was defined as patients who were diagnosed with hypertension and who visited a medical institution in the registration period. Events were defined by diagnosis, medication history, and laboratory test results.

Results

We obtained a cohort of 20,686 patients diagnosed with hypertension. The mean (standard deviation, SD) age in the cohort was 67.9 (13.2) years, and the follow-up period was 2.56 (1.42) years. The total incidence rates per 1,000 person-years in the present study population showed good agreement with rates in reported cohort studies: 8.10 (5.6–11.1) for cerebrovascular events, 1.27 (0.5–7.4) for cerebral hemorrhage, 6.57 (4.6–8.9) for cerebral infarction, 0.46 (0.1–1.0) for subarachnoid hemorrhage, and 1.75 (1.6–4.4) for myocardial infarction. The standardized incidence rates of cardio-cerebrovascular events, diabetes new-onset events, and dialysis initiation events were 9.73, 20.94, and 5.99 events/1,000 person-years, respectively.

Conclusions

In terms of the incidence of the investigated events in hypertensive patients, the study results suggested that the automated health records database data were as valid and reliable as data from other epidemiological studies.

Introduction

In Japan, prospective cohort studies based on the drug registry have been conducted as post-marketing surveillance according to Good Post-marketing Study Practice. As these studies were designed by pharmaceutical companies to investigate their own marketed products, they have not included comparative arms and were conducted over different periods of time.

In the USA and Europe, databases are compiled from prescription information and electronic health records and are used for post-marketing surveillance of medical products [1]. In Japan, database services that use claim information are available. The Claims Database of the Japan Medical Data Center is compiled from the claims information of health insurance societies [2]. The Claims Database can provide disease names and prescription records even if a patient switches to another health clinic, but it cannot provide test results or patient outcomes. By contrast, large automated health records databases that use electronic medical charts include patient outcomes and test results, which increases the accuracy of the information. Such databases therefore have potential advantages for epidemiological studies. However, collecting data in a unified format from many medical institutions requires considerable effort.

In the present study, we conducted a preliminary study using an automated health records database to demonstrate the usefulness of such a database in pharmacoepidemiology. Using this database, we focused on cardio-cerebrovascular events in hypertensive patients for the main purpose of validating it. The selection of these diseases was based on the assumption that the database can be validated by comparing the results of this study with other epidemiological data on cardio-cerebrovascular events. We then calculated the incidence of diabetes in hypertensive patients and in those patients who initiated dialysis, and examined the effect of antihypertensive medications on these incidences. Our aim is to assess the usability of large automated health records databases through these analyses.

Materials and methods

Database

In the present study, we conducted a search of an automated health records database provided by Medical Data Vision Co., Ltd. (MDV). The database is an electronic health records-based database that contains anonymous information from the health insurance claims for about one million patients since January 2003 and on the results of blood tests and other laboratory tests for about 410,000 patients since January 2004 at 15 medical facilities across Japan. This database contains anonymized information, including patient background information such as age, gender, and relevant medical department, as well as disease name on the prescription, and information on medications, surgery, injections, tests, diagnosis procedure combination (DPC) claims, and results of blood tests and other laboratory tests.

Study cohort

The study cohort was defined as patients in the database who were diagnosed with hypertension and who visited a medical institution at least once during the period from January to December 2006, which was taken as the registration period. The start date of follow-up was set when an antihypertensive drug was prescribed for the first time in the registration period. The cohort was followed up for the incidence of various events to December 2009. The follow-up for the cohort was initiated on the date of first visit and was continued until the date of the final visit. Completed cases were defined as those for which there were observations from 2006 to December 2009, while withdrawn cases were defined as those where there was dropout prior to December 2009. Hypertensive patients in the database were identified according to the International Classification of Diseases, 10th revision (ICD-10) (Supplementary Table 1). However, patients were excluded from the study cohort if they had been administered any antitumor drug or anti-human immunodeficiency virus (HIV) drug before or on the day of registration. Antitumor drugs and anti-HIV drugs were defined according to the first 4 digits of the YJ code, which is a computerized receipt code.
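The inclusion and exclusion rules above can be sketched in code. The following is a minimal illustration only, not the authors' implementation: the `Patient` record, the function names, and the choice of the first visit in 2006 as the registration date are assumptions, and the ICD-10 and YJ-code prefixes are hypothetical placeholders (the actual ICD-10 codes are listed in Supplementary Table 1, and the YJ prefixes are not given in the text).

```python
from dataclasses import dataclass
from datetime import date

# Registration window taken from the text.
REG_START, REG_END = date(2006, 1, 1), date(2006, 12, 31)

# Hypothetical placeholders, NOT the study's actual code lists.
HYPERTENSION_ICD10_PREFIXES = ("I10",)
EXCLUDED_YJ_PREFIXES = ("XXXX",)  # stands in for antitumor / anti-HIV prefixes


@dataclass
class Patient:
    patient_id: str
    diagnoses: list      # list of (date, icd10_code)
    visits: list         # list of date
    prescriptions: list  # list of (date, yj_code)


def registration_date(p):
    """Assumed here to be the first visit inside the registration window."""
    in_window = [d for d in p.visits if REG_START <= d <= REG_END]
    return min(in_window) if in_window else None


def in_cohort(p):
    """Apply the inclusion and exclusion criteria to one patient."""
    reg = registration_date(p)
    if reg is None:
        return False  # no visit during the registration period
    has_hypertension = any(
        code.startswith(HYPERTENSION_ICD10_PREFIXES) for _, code in p.diagnoses
    )
    # Exclusion: listed drug given before or on the day of registration,
    # matched on the first 4 digits of the YJ code.
    excluded = any(
        d <= reg and yj[:4] in EXCLUDED_YJ_PREFIXES for d, yj in p.prescriptions
    )
    return has_hypertension and not excluded
```

Matching on a four-character prefix mirrors the way drug classes are defined in the text; only the prefix values would need to be replaced with the real code lists.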

Ethical approval

This study was approved prior to implementation, after the protocol was reviewed by the ethics committee of Kyoto University Graduate School of Medicine. Individual patients could not be identified because all the collected data, including the name of each medical institution, were anonymized and unlinkable.

Data collected from the database

The following information was collected from the study cohort database: each subject’s gender, age, date of registration, follow-up end date (day of last visit before December 2009), date of diagnosis with hypertension (day on which the patient was diagnosed with hypertension for the first time), complications, medications, blood test results, and presence/absence of extracorporeal dialysis. Complications and diseases diagnosed before or on the day of registration, except for having past history of cardio-cerebrovascular disease, were classified according to ICD-10 into diabetes, hyperlipidemia, cerebrovascular disease, myocardial infarction, angina pectoris, cardiac failure, renal disease, hyperuricemia, and other macroangiopathy (Supplementary Table 1). We classified patients into six groups based on their medications, according to the first 4 digits of the YJ code at time of registration: the angiotensin II receptor blocker (ARB) group, calcium channel blocker (CCB) group, angiotensin-converting enzyme inhibitor (ACE inhibitor) group, other antihypertensive drugs (others), a two-drug-use group, and an over-two-drug-use group. Other medications recorded included antidiabetic drugs, antihyperlipidemic drugs, and antiplatelet/anticoagulant drugs, and these were recorded according to the first 4 digits of the YJ code. Blood test result information collected was HbA1c, fasting blood sugar, total cholesterol, high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, serum creatinine, and triglycerides. Apparently abnormal values in blood test results were excluded from the analysis.
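The six-group classification can be expressed as a small function. This is a sketch under stated assumptions: the class labels and the function name `medication_group` are illustrative, the mapping from YJ-code prefixes to drug classes is not shown because the text does not list it, and drugs are counted by class rather than by individual product, which may differ from how the authors counted.

```python
# Illustrative class labels mapped to the single-drug group names in the text.
SINGLE_CLASS_GROUPS = {
    "ARB": "ARB",
    "CCB": "CCB",
    "ACE inhibitor": "ACE inhibitor",
    "other": "others",
}


def medication_group(classes):
    """Return the treatment group for the antihypertensive classes in use
    at registration, or None if no antihypertensive drug is recorded."""
    distinct = set(classes)
    unknown = distinct.difference(SINGLE_CLASS_GROUPS)
    if unknown:
        raise ValueError(f"unknown drug class: {sorted(unknown)}")
    if not distinct:
        return None
    if len(distinct) == 1:
        return SINGLE_CLASS_GROUPS[next(iter(distinct))]
    if len(distinct) == 2:
        return "two-drug use"
    return "over-two-drug use"
```

For example, `medication_group({"ARB", "CCB"})` falls into the two-drug-use group, while a patient on an ARB, a CCB, and an ACE inhibitor falls into the over-two-drug-use group.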

Definition of outcomes

For cerebrovascular events, data on cerebral infarctions except cardiogenic embolism (CI), hypertensive intracerebral hemorrhage (ICH), and subarachnoid hemorrhage (SAH) were extracted from the database by ICD-10 code based on the diagnosis information at hospitalization, and the first day of diagnosis after registration was determined to be the date of onset. For cardiovascular events, data on myocardial infarction, angina pectoris, and cardiac failure were extracted from the database by ICD-10 code based on the diagnosis information at hospitalization, and the first day of diagnosis after registration was determined to be the date of onset. Cardio-cerebrovascular events were defined as combinations of cerebrovascular events and cardiovascular events, and the data were analyzed with the date of onset defined as the day when either a cerebrovascular event or a cardiovascular event occurred for the first time.

For events of new-onset diabetes, a treatment-based definition and a definition based on the results of HbA1c tests were used. The diabetes mellitus (DM) medication event was defined as one where an antidiabetic drug was newly administered during the follow-up period, and the date of event onset was defined as the day on which the treatment with the antidiabetic drug was started. Antidiabetic drugs were defined according to their YJ code (first 4 digits). “HbA1c 6.5%” was defined as an event where the HbA1c level was measured at 6.5% or greater, and the date of event onset was defined as the day when the measurement was performed. In addition, “HbA1c 6.1%,” which is a new criterion for the diagnosis of diabetes, was defined as an event where the HbA1c level was measured at 6.1% or greater. As HbA1c was measured in the concomitant use of antidiabetic drugs, these events defined by HbA1c measurement indicated patients for whom blood glucose control with antidiabetic therapy was difficult. Patients who had been diagnosed with diabetes before registration were excluded from the analysis of events of new-onset diabetes.
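The three diabetes event definitions reduce to finding the earliest qualifying date during follow-up. The sketch below illustrates that logic; the function names and input shapes are assumptions, and treating only dates strictly after registration as follow-up is also an assumption, since the text does not state how same-day records were handled.

```python
from datetime import date


def first_hba1c_event(measurements, registration, threshold):
    """Earliest follow-up date with HbA1c (%) at or above the threshold.

    measurements: iterable of (date, hba1c_percent).
    Use threshold=6.5 for "HbA1c 6.5%" and threshold=6.1 for "HbA1c 6.1%".
    """
    dates = [d for d, value in measurements if d > registration and value >= threshold]
    return min(dates) if dates else None


def first_dm_medication_event(antidiabetic_dates, registration):
    """Earliest follow-up date on which an antidiabetic drug was prescribed."""
    dates = [d for d in antidiabetic_dates if d > registration]
    return min(dates) if dates else None
```

Both functions return `None` when no event occurs, in which case the patient would be censored at the end of follow-up.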

Newly introduced dialysis events were defined as those where dialysis was performed in patients who had not undergone dialysis before registration, and the date of event onset was defined as the day on which dialysis was performed for the first time. Data analysis was also carried out on the incidence of newly introduced dialysis events in patients diagnosed with diabetes at registration.

Analysis

The crude incidence rate of each event (per 1,000 person-years) was calculated from the number of events and the total follow-up period. In addition, the standardized incidence rates by age and gender were calculated according to 2005 census figures. For each event, the adjusted hazard ratio with 95% confidence interval (95% CI) for each usage pattern of antihypertensive drugs was calculated by Cox regression analysis of the time to event onset. Covariates included gender, age, inpatient/outpatient status at registration, complications (diabetes, hyperlipidemia, cerebrovascular disease, myocardial infarction, angina pectoris, cardiac failure, renal disease, hyperuricemia, and macroangiopathy except cardio-cerebrovascular disease), antihypertensive drugs at registration (CCB, ARBs, other single-use drugs, combination use of two drugs, and combination use of over two drugs), and other medications (antidiabetic drugs, antihyperlipidemic drugs, and antiplatelet or anticoagulant drugs) at registration. Cox proportional hazards regression was performed using SAS software version 9.2 (SAS Institute Inc., Cary, NC, USA).
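The two rate calculations can be illustrated as follows. This is a sketch, not the authors' SAS code: it assumes direct standardization (the text says only that rates were standardized by age and gender to the 2005 census), and the function names and stratum keys are hypothetical.

```python
def crude_rate_per_1000(events, person_years):
    """Crude incidence rate per 1,000 person-years."""
    return 1000.0 * events / person_years


def directly_standardized_rate(strata, standard_population):
    """Weight each stratum-specific rate by that stratum's share of a
    standard population (here it would be the 2005 census).

    strata: {stratum: (events, person_years)}, e.g. keyed by (gender, age band)
    standard_population: {stratum: population count}
    """
    total = sum(standard_population[key] for key in strata)
    return sum(
        standard_population[key] / total * crude_rate_per_1000(events, py)
        for key, (events, py) in strata.items()
    )
```

With two made-up strata, one with 10 events in 1,000 person-years and one with 30 events in 1,000 person-years, and a standard population split 3:1 between them, the standardized rate is 0.75 × 10 + 0.25 × 30 = 15 per 1,000 person-years, whereas the crude rate is 20. A standardized rate well below the crude rate, as reported in this study, is what one would expect when the cohort is older than the reference population.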

Results and discussion

Cohort background

We obtained from the database a cohort of 20,686 patients diagnosed with hypertension during the period from January to December 2006. The proportion of completed cases at December 2009 was 59.5% (12,299), and the mean (SD) follow-up period was 2.56 (1.42) years. Due to the way hospital health records are handled, the database does not include patient information after a patient moves to another clinic or hospital. Observations were available for almost 60% of patients at December 2009, and the mean observation period in the study cohort was 2.56 years against a 3-year observation period; this suggests that most patients in the cohort continued therapy at the same hospital and did not move to another clinic. Patients with chronic hypertension generally do not move to other medical sites. In such cases, the data in this database can be applied to a population-based survey.

In the present study, the mean age (SD) was 67.9 (13.2) years. The cohort consisted of 48.0% female patients and 52.0% male patients, and 25.6% inpatients and 74.4% outpatients. For treatment of hypertension, ARBs were used in 45.8% of patients, CCB in 60.4%, ACE inhibitors in 10.3%, and other antihypertensive drugs in 34.7% (including overlaps). Antidiabetic drugs were administered in 13.7% of patients, antihyperlipidemic drugs in 22.6%, and antiplatelet or anticoagulant drugs in 33.8% (Table 1). The data on 13,310 patients remaining after excluding patients diagnosed with diabetes at registration were used for the analysis of the onset of diabetes. The data from 20,205 patients remaining after excluding patients undergoing dialysis at registration were used for the analysis of newly introduced dialysis.

Table 1 The distribution of demographic variables in the cohorts

Investigation of the validity of the frequency of cerebrovascular events

Based on diagnosis information at hospitalization, 122 patients with ICH, 748 with CI, and 23 with SAH were identified as having events. After excluding overlap cases, cerebrovascular events were identified in 869 patients. In addition, 1,368 patients with angina pectoris, 192 with myocardial infarction, and 1,849 with cardiac failure were identified as having cardiovascular events. In Japan, electrocardiography is routinely performed at hospitalization, and disease names such as angina pectoris and cardiac failure are often used as diagnosis names for health insurance treatment. Because a recorded diagnosis of angina pectoris or cardiac failure was therefore not clear evidence of the condition, we evaluated only the 192 patients with myocardial infarction as having a cardiovascular event. In total, 1,038 patients had cardiovascular or cerebrovascular events after excluding overlap cases (Table 2). The crude incidence rate of cardiovascular and cerebrovascular events was 20.08 events/1,000 person-years. The standardized incidence rate by gender and age was 9.73 events/1,000 person-years.
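As a rough consistency check on the reported figures, the total person-years at risk can be back-calculated from the event count and the crude rate. The snippet below is illustrative arithmetic only; the variable names are assumptions, and the person-years value is implied by the reported numbers rather than stated in the text.

```python
events = 1038            # patients with a cardio-cerebrovascular event
crude_rate = 20.08       # reported events per 1,000 person-years
cohort_size = 20686      # patients in the cohort

# rate = 1000 * events / person_years  =>  person_years = 1000 * events / rate
implied_person_years = 1000 * events / crude_rate
mean_time_at_risk = implied_person_years / cohort_size

print(f"implied person-years: {implied_person_years:,.0f}")
print(f"mean time at risk per patient: {mean_time_at_risk:.2f} years")
```

This gives roughly 51,700 person-years, or about 2.5 years at risk per patient, which is in line with the reported mean follow-up of 2.56 years given that time at risk ends at the first event.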

Table 2 Incidence rates for cerebrovascular or cardiovascular event patients

To examine how representative the database population was, we compared the disease frequencies with data from a previous epidemiological survey in Japan (Table 2). The Hisayama survey is a widely cited, large-scale cohort study in Japan. For hypertensive patients in the Hisayama survey, Arima et al. [3] reported standardized incidence rates of the cerebrovascular events of ICH, CI, and SAH for each grade of hypertension. In our present study, the data have not been compared by severity of hypertension, nor evaluated by subtype of CI. However, given that the cohort comprised outpatients being treated for hypertension, if selection bias is assumed to exist only toward patients with grade 1 or grade 2 hypertension, there were no significant deviations. The JIKEI-Heart study is a randomized clinical trial comparing valsartan with non-ARB treatment in 3,081 Japanese hypertensive patients [4]. The KYOTO-Heart study is a randomized clinical trial comparing valsartan with non-ARB treatment in 3,031 uncontrolled Japanese hypertensive patients [5]. The results from the present database research also showed no significant deviations from these clinical trial results. These results suggest that the population in the database used in this study is comparable to those of previous epidemiological surveys and clinical studies in Japan, which supports its validity.

Investigation of the validity of the frequency of diabetes and dialysis initiation events

The number of DM medication events, in which new administration of an antidiabetic drug was counted as an event, was determined to be 660 (Table 3). The crude incidence rate of DM medication events was 20.42 events/1,000 person-years. The incidence rate of diabetes onset events standardized by gender and age was 20.94 events/1,000 person-years. The number of HbA1c 6.5% events was determined to be 456. The standardized incidence rate was 10.85 events/1,000 person-years. The number of HbA1c 6.1% events was determined to be 778. The standardized incidence rate was 16.63 events/1,000 person-years. The total of the incidence rates for these three groups in our study was less than that in the KYOTO-Heart study.

Table 3 Incidence rates for diabetic events and newly introduced dialysis

The number of patients with the event of newly introduced dialysis was determined to be 365 (Table 3). The standardized incidence rate was 5.99 events/1,000 person-years. The number of hypertensive patients with a complication of diabetes at registration in whom dialysis was newly introduced was 230. The standardized incidence rate was 12.08 events/1,000 person-years. The incidence rate for newly introduced dialysis in our study was higher than that in the JIKEI-Heart study and the KYOTO-Heart study. One reason for the deviation may be attributable to a possible high capture rate in our study, because the database included data from several dialysis therapy hospitals, while hospitals in the above clinical trials did not (Table 3).

Risk of cerebrovascular or cardiovascular event and hypertension treatment at registration

The adjusted hazard ratio (95% CI) versus CCB for cardiovascular and cerebrovascular events was 0.91 (0.74–1.12) for ARBs, 1.02 (0.83–1.25) for other antihypertensive drug use, 0.98 (0.83–1.16) for use of two antihypertensive drugs, and 1.17 (0.94–1.45) for use of three or more antihypertensive drugs at registration (Table 4). The hazard ratios for ARBs and for use of two drugs versus CCB were below 1.0, although neither was statistically significant. Of patients using two antihypertensive drugs, 65.0% used ARBs (4,174/6,417). Our results from the database could provide support for the evidence that ARB medications are associated with reduced cerebrovascular or cardiovascular events compared with CCB in hypertensive patients.
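To make the statement about statistical significance concrete, an approximate z statistic and two-sided p value can be recovered from each reported hazard ratio and its 95% CI. This is a sketch that assumes the intervals are Wald-type (symmetric on the log scale), which is the usual form for Cox regression output but is not stated in the text; the function name is hypothetical, and the results are approximate because the published values are rounded.

```python
import math


def wald_z_and_p(hr, lower, upper, z_crit=1.959964):
    """Approximate z and two-sided p from a hazard ratio and its 95% CI,
    assuming the CI is exp(log(HR) +/- z_crit * SE)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Hazard ratios versus CCB as reported in the text.
REPORTED = {
    "ARB": (0.91, 0.74, 1.12),
    "other antihypertensive drugs": (1.02, 0.83, 1.25),
    "two drugs": (0.98, 0.83, 1.16),
    "three or more drugs": (1.17, 0.94, 1.45),
}

for label, (hr, lower, upper) in REPORTED.items():
    z, p = wald_z_and_p(hr, lower, upper)
    print(f"{label}: z = {z:.2f}, p = {p:.2f}")
```

Every interval includes 1.0, so none of the comparisons reaches significance at the 5% level under this approximation.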

Table 4 Hazard ratios for cerebrovascular or cardiovascular events

This study is the first pharmacoepidemiologic study to use a large automated health records database in Japan. Health records databases are useful in evaluating clinical event rates in hypertensive patients. Using the database and adjusting for confounding factors, it is possible to evaluate the relationship between antihypertensive drug medications and clinical events. Carrying out database surveys is an important method in pharmacoepidemiologic research, and we believe that database research can apply to other chronic disease surveys. However, there are some limitations with database surveys using automated health records databases. Firstly, although the data on complications, treatment history, and laboratory test results in the database were adjusted to the degree possible, it may not have been possible to completely adjust for confounding by indication, whereby a particular antihypertensive drug is administered to high-risk patients. As the database does not have vital sign measurements, such as blood pressure, weight, body mass index (BMI), and heart rate, this is a particularly important limitation in hypertension research. Secondly, the database has hospital-based data and not population-based data. In our case for hypertension, although there were few patients who moved to other clinics or hospitals, in other chronic disease surveys this point should be checked carefully. Other limitations were absence of confirmation of the clinical diagnosis, lack of random assignment to treatment groups, and lack of some information.

In conclusion, although it may be difficult to perform valid analyses in studies of highly specialized diseases and treatments due to selection biases, the present study indicates that automated health records databases in Japan can be applied to pharmacoepidemiologic studies of general diseases treated in various medical institutions. In the future, it will be possible to conduct comparative studies between treatment groups and nontreatment groups.

References

  1. Johansson S, Wallander MA, de Abajo FJ, Garcia Rodriguez LA. Prospective drug safety monitoring using the UK primary-care General Practice Research Database: theoretical framework, feasibility analysis and extrapolation to future scenarios. Drug Saf. 2010;33:223–32.

  2. Kimura S, Sato T, Ikeda S, Noda M, Nakayama T. Development of a database of health insurance claims: standardization of disease classifications and anonymous record linkage. J Epidemiol. 2010;20:413–9.

  3. Arima H, Tanizaki Y, Yonemoto K, Doi Y, Ninomiya T, Hata J, Fukuhara M, Matsumura K, Iida M, Kiyohara Y. Impact of blood pressure levels on different types of stroke: the Hisayama study. J Hypertens. 2009;27:2437–43.

  4. Mochizuki S, Dahlof B, Shimizu M, Ikewaki K, Yoshikawa M, Taniguchi I, Ohta M, Yamada T, Ogawa K, Kanae K, Kawai M, Seki S, Okazaki F, Taniguchi M, Yoshida S, Tajima N. Valsartan in a Japanese population with hypertension and other cardiovascular disease (Jikei Heart Study): a randomised, open-label, blinded endpoint morbidity–mortality study. Lancet. 2007;369:1431–9.

  5. Sawada T, Yamada H, Dahlof B, Matsubara H. Effects of valsartan on morbidity and mortality in uncontrolled hypertensive patients with high cardiovascular risks: KYOTO HEART Study. Eur Heart J. 2009;30:2461–9.

Conflict of interest

This study was sponsored by Nippon Boehringer Ingelheim Co., Ltd. The cost of using the database was borne by Nippon Boehringer Ingelheim Co., Ltd., and the data were lent to Kyoto University. The study executors, Hirokuni Hashikata, Akio Koizumi, and Kouji Harada, do not have any conflict of interest with Nippon Boehringer Ingelheim Co., Ltd. and Medical Data Vision Co., Ltd. Tatsuo Kagimura is an employee of Nippon Boehringer Ingelheim Co., Ltd. Masaki Nakamura is an employee of Medical Data Vision Co., Ltd.

Author information

Corresponding author

Correspondence to Akio Koizumi.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Table 1 (DOC 44 kb)

About this article

Cite this article

Hashikata, H., Harada, K.H., Kagimura, T. et al. Usefulness of a large automated health records database in pharmacoepidemiology. Environ Health Prev Med 16, 313–319 (2011). https://doi.org/10.1007/s12199-010-0201-y

  • DOI: https://doi.org/10.1007/s12199-010-0201-y