An international perspective on health data collection and usage
By Yair Babad and Joe Allbright
The authors are affiliated with the American Academy of Actuaries Health Practice International Committee.
This article explores how health care data is collected, used, and shared within and between countries. There are many benefits to having a robust health care data infrastructure: quality of care, efficiency, coordination, monitoring, etc. As the size and breadth of the data grows, though, so does the potential for its misuse. Threats to data security and individual privacy, and the potential for abuse, loom especially large when it comes to health data. Technological, social, and political considerations further complicate the way health care infrastructure is constructed, used, and secured. National governments and health systems have balanced these priorities in different ways, each reflecting the values of their institutions and the trust of their populations.
The effective use of health care data is closely linked to the cultural perspective of the country: Does it favor private enterprise or national systems, transparency and accountability or trust in institutions, emphasis on the individual or the collective good? This is closely related to who owns the collected data (e.g., does it belong to the patients from whom it was collected, to the providers of health care, or to the public?), and how that data may be stored, protected, used, or shared. In most cases, the health system will already reflect these preferences.
Health data is only as effective and useful as its structure allows it to be. Conflicting or inadequate structures place obstacles in the way of usefulness, but do not necessarily preclude it, as there may be many ways to overcome or compensate for these challenges. In many cases, national (or international) standards have been adopted to allow health data systems to integrate and communicate with one another. There continue to be pockets of health data that, intentionally or otherwise, cannot be joined to others. This is also a choice that many countries must make: follow international standards or create a solution tailored to their specific needs?
The following sections emphasize how health care data is collected and used today. This is a highly technical field, yet we will endeavor to remain at the “30,000-foot” perspective and set aside technical issues and specific details of health care data collection, storage, security and privacy, use, measurement, and assessment, as well as the resulting complexity, efficiency, and efficacy of the related health care systems.
Overview of Health Data
Health data refers to all data pertaining to the past, current, or future physical or mental health status of an individual or a population, and to the provision and outcomes of care. This is the information that health information technology (HIT) is designed to collect, store, and share. Due to the prevalence of HIT in 21st-century health care, nearly all interactions between individuals and the health care system produce some sort of recorded health data. This may take many forms, including electronic health records (EHR), which document:
- personal identification, family, wellness, and medical history and genetic information;
- patient-provider interactions, diagnostics, testing, treatments, and outcomes;
- administrative or financial records containing non-clinical data such as hospital discharges or claims billing; and
- health surveys that could reveal information about population health.
Health data may also be paired with regulatory, environmental, socioeconomic, or behavioral information to drive deeper insights.
The most direct and primary use of health data is in delivering patient, community, and population care and well-being. Health data allows care providers to log notes, prescribe treatment, order tests, advise patients, and support clinical decision-making. There are also many secondary uses of health data that may not directly benefit the patient but may improve care for all. These include tracking provider quality and outcomes, public health initiatives, pandemic response, regulation, health care and safety net planning, and clinical research.
The international adoption of digital technology has continued to enhance the scope, quality, and volume of health data. A robust HIT infrastructure generates undeniable improvements in the way data is shared and care is delivered and monitored, but as these systems grow in both size and importance, there are new risks and unintended consequences to consider. Disagreements concerning the ownership of the health data raise ethical and legal concerns around security and privacy, and may obstruct data sharing. Different standards and policies have emerged over time that now prevent some data systems from communicating. These problems are compounded by a patchwork of regulation; conflicting priorities among private, national, and international actors; and the underlying variation in health care systems around the world.
Data Standardization and Electronic Health Records
Countries have supported the development of EHR systems to better facilitate the efficient and effective use and sharing of health data, while also reducing the costs of health care systems and simplifying their management. The adoption of EHRs enables data sharing, improves the quality and scope of primary health data, and expands the secondary uses of health data. The national use of EHRs has grown steadily, especially among upper-middle and high-income countries.
Standardization helps to make optimal use of an EHR system, especially one with national or transnational data sharing. Consistent procedures must be established and followed throughout the process: the coding of medical conditions and ailments, and the structure, input, collection, storage, security, and use of the data must all be carefully considered. These standards are often created and maintained nationally. Over half of the countries that have a universal EHR system have developed legislation to govern its use.
International organizations have been instrumental in lowering the barriers to universal adoption of EHRs, and particularly in the coding of medical conditions and ailments. The international practice of medical coding today, in fact, grew from the application of mortality codes in 17th-century England. This has evolved into tools such as the International Classification of Diseases (ICD) codes maintained by the World Health Organization (WHO). Now in its 10th revision, the ICD has tens of thousands of unique alphanumeric codes. The system is continuously being expanded; ICD-10, which has several variants, is scheduled to be replaced by ICD-11 in 2022.
The standardization of EHR structures and procedures, in contrast, is mostly formalized and regulated through national legislation, predominantly in developed countries. Even the existence of such legislation does not assure obstacle-free EHR activities. A 2015 WHO study that included 63 mostly developed countries found that over 50% of upper-middle- and high-income countries had adopted national EHR systems, while adoption rates were much lower in lower-middle-income (35%) and low-income (15%) countries. The WHO nonetheless notes steady growth in the adoption of EHR systems. The most frequently cited barriers to implementation of EHRs were a lack of funding, infrastructure, capacity, and legal frameworks. Because the countries studied were presumably relatively developed, adding other countries, particularly less-developed ones, would probably make the lack of funding, legislation, capacity, and infrastructure even more pronounced.
Indeed, EHR interoperability is still hindered even in a developed country such as the U.S., despite its 1996 Health Insurance Portability and Accountability Act (HIPAA) and its 2009 Health Information Technology for Economic and Clinical Health Act (HITECH). Together, these acts set aside funds for the creation of a nationwide network of EHRs and regulate the use and disclosure of protected health information by specified entities, including health providers, health care clearinghouses, health plans, and datastores of health care data. The U.S. is nonetheless plagued by high integration costs, a lack of consistent patient identification across Health Information Exchanges, unresolved questions of data ownership and payers’ participation in data sharing, and inconsistent communication standards. From the WHO report noted above, as well as from the 2019 Black Book Market Research report, it is clear that national levels of EHR adoption vary considerably among countries.
This variation is also evident in the level of EHR acceptance by primary care physicians in various countries, as presented in Figure 4.
The European Union (EU) promotes a Single Market Strategy regarding EHR and ePrescribing, stipulating three strategic pillars:
- Secure access and sharing of personal health information across borders … (with) full interoperability of member states’ EHRs and a European exchange;
- Connect and share health data to enable research, better diagnostic and improved health; and
- Strengthen citizen empowerment … through eHealth solution and new care models.
In November 2018, the EU issued a roadmap for an EHR standard format initiative. In April 2020, the industry association DigitalEurope declared that “we need decisive EU action to harmonize conditions for health-data processing across Member States … creating a Common European Health Data Space. COVID-19, for example, reminded that access to health data for scientific research is still subject to various rules and interpretations.”
Health Data Collection and Its Secondary Use
All countries, their health care systems, many industry and research organizations, and international entities collect health-related data as well as other personal data. Thus, there is seemingly no lack of health data, but, as noted above, sharing it and using it effectively are often very hard or even impossible. Further, many countries maintain centralized databases and data stores for some of their health data. Many international companies, such as Google, collect and store personal data all over the world, raising fears and prompting regulatory actions to limit their power. In the U.S., for example, Google’s “Project Nightingale” gathers personal health data on millions of Americans. The WHO, on the other hand, is a minor player in health care data collection, except in special cases (e.g., pandemics such as COVID-19).
The main Organisation for Economic Co-operation and Development (OECD) Health database includes more than 1,200 indicators covering all aspects of health systems for the 36 OECD member countries, as well as key partners. It guides participating nations as to “governing data for better health and healthcare.” The data is collected from participating nations through a detailed questionnaire. For U.S. health data and data sets for secondary analysis, useful sources include the Medical Information Mart for Intensive Care (MIMIC) database, the Johns Hopkins Welch Medical Library Guides, the Federal Health Information Centers and Clearinghouses, Healthdata.gov, and the databases maintained by the CDC.
Some other notable health data stores include the National Library of Medicine (NLM), which fuels data-driven discovery and innovation, and the Global Health Data Exchange, the world’s most comprehensive catalog of surveys, censuses, vital statistics, and other health-related data.
The major concerns that pushed DigitalEurope to recommend a Common European Health Data Space are all related to the sharing of health data; specifically, concerns about the privacy of the data, patients’ trust, and ethical rules governing the use of the data. At issue is the secondary use of health and social data, i.e., “the use of the data for purposes other than the primary reason for which the data were originally saved. These include scientific, medical and epidemiological research; statistics; development and innovation; authorities’ planning, steering, supervision, and reporting; teaching; and knowledge management.”
In May 2016, the EU adopted a regulation on the protection of natural persons with regard to the processing of personal data (the General Data Protection Regulation, or GDPR). In March 2019, the Committee of Ministers of the Council of Europe issued a recommendation on the protection of health-related data “to provide member States with guidelines … in order to guarantee respect for the rights and fundamental freedoms of every individual, in particular the right to privacy and to protection of personal data” as required by Article 8 of the Convention for the Protection of Human Rights and Fundamental Freedoms (ETS No. 5, “European Convention on Human Rights”). Other countries, such as Israel and Finland, have similar regulations.
One of the major tools to assure the anonymity of health data, and thus reduce the privacy threat, is data deidentification. In this process, identifiers are removed from the health information. According to the Privacy Rule of HIPAA, there are two alternative methods of health care data deidentification. In the first, expert determination, a qualified expert determines which identifiers to remove. The second is a safe harbor method wherein a group of “specific individual identifiers are removed as well as absence of actual knowledge by the covered entity that the remaining information could be used alone or in combination with other information to identify the individual.”
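To make the safe harbor idea concrete, the following is a minimal Python sketch of identifier removal. It is an illustration under stated assumptions rather than a compliant implementation: the field names (`name`, `zip`, `admission_date`, and so on) and the `low_population_zip3` parameter are hypothetical, and the sketch covers only a handful of the identifier categories listed in the HHS guidance.

```python
from datetime import date

# Hypothetical field names covering only a few identifier categories.
DIRECT_IDENTIFIERS = {"name", "ssn", "phone", "email", "street_address",
                      "medical_record_number"}


def deidentify(record, low_population_zip3=frozenset()):
    """Return a copy of `record` with some identifiers removed or coarsened."""
    # Drop fields that identify a person directly.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Keep only the year of a date of service.
    if "admission_date" in out:
        out["admission_year"] = out.pop("admission_date").year

    # Keep only the first three ZIP digits; mask sparsely populated areas.
    if "zip" in out:
        zip3 = str(out.pop("zip"))[:3]
        out["zip3"] = "000" if zip3 in low_population_zip3 else zip3

    # Group ages above 89 into a single category.
    if isinstance(out.get("age"), int) and out["age"] > 89:
        out["age"] = "90+"

    return out


# Entirely fictional example record.
record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "12345",
    "age": 93,
    "admission_date": date(2019, 3, 14),
    "diagnosis_code": "J10.1",
}
print(deidentify(record))
```

The clinical content (here, the diagnosis code) survives, while direct identifiers are dropped and dates, geography, and extreme ages are coarsened. A real process would also need the "no actual knowledge" condition quoted above, which no code can check.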
Over the past several years, a new threat to the privacy of personal and health data has emerged: big data applications. These provide new methods to collect, collate, analyze, and apply data from multiple sources, including but not restricted to health data, on an unprecedented scale. In essence, information about most (if not all) activities people perform through electronic means, communications, and the internet can be collected, used, and analyzed. When combined with health data, big data can significantly enhance the usefulness of that data and open many new health care options, while at the same time it can effectively eliminate the concept of personal privacy. Data deidentification, as envisioned by HIPAA, may be insufficient to assure the anonymity of the personal data.
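One way to see why deidentification may fall short is to count how many records share the same combination of remaining attributes. The sketch below uses the k-anonymity idea, a measure this article does not itself name and which is brought in here only for illustration. The column names and rows are hypothetical.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same
    quasi-identifier values (the 'k' in k-anonymity)."""
    if not records:
        return 0
    groups = Counter(
        tuple(row[q] for q in quasi_identifiers) for row in records
    )
    return min(groups.values())


# Fictional, already "deidentified" rows: no names or record numbers remain.
rows = [
    {"zip3": "123", "birth_year": 1980, "sex": "F", "diagnosis": "E11"},
    {"zip3": "123", "birth_year": 1980, "sex": "F", "diagnosis": "J45"},
    {"zip3": "123", "birth_year": 1975, "sex": "M", "diagnosis": "I10"},
]

print(smallest_group_size(rows, ["zip3"]))
print(smallest_group_size(rows, ["zip3", "birth_year", "sex"]))
```

With only the three-digit ZIP, all three rows are indistinguishable. Once birth year and sex are added, one row becomes unique, so anyone who can obtain those attributes from another source could link that person to a diagnosis.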
Health Challenges and WHO Guidelines
In an increasingly globalized economy, health risks in one country may more easily migrate worldwide, thus requiring open sharing of information. Communicable diseases, such as COVID-19, influenza, Ebola, and AIDS, clearly demonstrate the value of international data in epidemiology and health policy. Other health-related factors that span borders could include drug use, malnutrition, environmental change, technology, and new treatments. Even so, cross-country information sharing is to a large extent still in its infancy, with most conditions and responses being managed entirely within national borders.
As threats are shared, so too is the responsibility to act; the United Nations (UN) declared this the “decade of action,” advocating national funding to address gaps in health systems and infrastructures, as well as providing support to the most vulnerable countries. In January 2020, the WHO, concerned that “leaders are failing to invest enough resources in core health priorities and systems,” released a list of 13 urgent health challenges for the next decade:
- Addressing Climate Crisis
- Delivering Health Amid Conflict
- Fighting Health Care Inequality
- Expanding Access to Medicines
- Stopping Infectious Diseases
- Preparing for Epidemics
- Guarding Against Dangerous Products
- Investing in Health Workers
- Protecting the World’s Youth
- Gaining Public Trust
- Harnessing New Technologies
- Protecting Lifesaving Medicines
- Keeping Health Care Clean
These challenges cover a wide spectrum of medical, safety, and social issues, but all require an expanded collection, analysis, use, and sharing of health data within and among countries. The WHO stressed that investment in public health is a political choice for a whole country and not just the health sector, that health is an investment in the future, and (as COVID-19 has shown us) that a viral pandemic can bring national economies to their knees and be far more deadly than terrorist attacks.
Comparing National Health Systems
The preceding discussion emphasized how a robust health care infrastructure can be used directly to administer care and direct resources. Here we see how data can be used to evaluate and compare outcomes across systems. While one can create international standards for health data, there is no international model for how to structure a health system. Each country has created and reformed its health care system to meet its unique needs, priorities, and constraints. Specifically, each nation must ask:
- What is the role of government in providing a safety net?
- Which populations are covered, which services are covered, and how much is paid for by the individual?
- How is a health care system governed, administered, and financed (publicly and/or by private health insurance)?
- How is care delivered (public or private hospitals)?
- What is being done to ensure quality of care throughout the country, reduce disparities, and promote care coordination, innovations, and reforms?
Most nations make use of health data to monitor population health and to assess the cost, quality, efficiency, and effectiveness within their health systems. Assessing a health system as a whole, however, requires international points of comparison.
The simplest health metrics to collect and compare are per-capita health outcomes (mortality, life expectancy, disease prevalence); these can be well defined and are nearly all collected for national purposes already. When more detail is available, it’s possible to evaluate Disability- or Quality-Adjusted Life Years (DALYs/QALYs), or general health status across populations. The widespread use of ICD codes provides a standard international language to identify health conditions. The Global Burden of Disease (GBD) study is able to combine clinical and economic data to evaluate the cost, in years of healthy life lost, of diseases and risk factors.
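The arithmetic behind these summary measures is simple, even though the inputs are hard to estimate. The Python sketch below assumes the commonly used definitions: DALYs as years of life lost to premature death plus years lived with disability, and QALYs as years of life weighted by a health utility between 0 and 1. The function names and all numbers are hypothetical, and published estimates involve refinements not shown here.

```python
def dalys(deaths, remaining_life_expectancy, cases, disability_weight):
    """Disability-adjusted life years = YLL + YLD.

    YLL: deaths times remaining life expectancy at the age of death.
    YLD: cases times a disability weight between 0 (full health) and 1.
    """
    years_of_life_lost = deaths * remaining_life_expectancy
    years_lived_with_disability = cases * disability_weight
    return years_of_life_lost + years_lived_with_disability


def qalys(periods):
    """Quality-adjusted life years for (years, utility) pairs,
    where utility runs from 0 (death) to 1 (full health)."""
    return sum(years * utility for years, utility in periods)


# Hypothetical condition: 100 deaths with 20 years of life lost each,
# plus 1,000 cases living with a disability weight of 0.25.
print(dalys(deaths=100, remaining_life_expectancy=20.0,
            cases=1000, disability_weight=0.25))

# Hypothetical patient: 4 years at utility 0.75, then 2 years at 0.5.
print(qalys([(4, 0.75), (2, 0.5)]))
```

DALYs count health lost, so lower is better, while QALYs count health gained, so higher is better.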
Access to care (number of doctors or nurses per capita, available hospital beds) and quality of care (survival rates, wait times, avoidable mortality) require a more robust data infrastructure to collect and share. Wealthier nations (such as those within the OECD) are far more likely to have detailed annual quality metrics than a country with a less-developed health system.
The true challenge comes when we try to evaluate cost data between health systems. Because currencies and price levels differ, costs must be compared on a purchasing power parity (PPP) or percent-of-GDP basis. Another potential measure is to look at costs as a percent of health expenditure, though that requires a consistent definition of what constitutes a “health expense” (voluntary supplemental coverage? dental care? long-term care? public health initiatives? foreign aid?). Further, these costs may be tracked far differently in a nation with a national health service than in one with private hospitals and insurance companies.
A System of Health Accounts (SHA 2011) is a standard established by the OECD, WHO, and Eurostat to create precise definitions and descriptions of health care financial flows for international and national entities from an expenditure perspective. Health funding flows from the payor, through a funding scheme, to a care provider and the functions it provides.
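The two normalizations mentioned above can be sketched in a few lines of Python. The function names and figures are hypothetical. The PPP conversion factor is assumed to be expressed as local currency units per international dollar, which is how such factors are commonly published.

```python
def to_international_dollars(local_amount, ppp_factor):
    """Convert a local-currency amount to international dollars.

    ppp_factor: local currency units per international dollar (assumed).
    """
    if ppp_factor <= 0:
        raise ValueError("ppp_factor must be positive")
    return local_amount / ppp_factor


def share_of_gdp(health_spending, gdp):
    """Health spending as a percentage of GDP (same currency and year)."""
    if gdp <= 0:
        raise ValueError("gdp must be positive")
    return 100.0 * health_spending / gdp


# Hypothetical country: 6,000 local units spent per capita,
# with a PPP factor of 1.5 local units per international dollar.
print(to_international_dollars(6000, 1.5))

# Hypothetical totals: 50 billion in health spending, 500 billion GDP.
print(share_of_gdp(50.0, 500.0))
```

Neither measure resolves the definitional question of what counts as a health expense; both simply put whatever has been counted on a comparable scale.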
In 2000, the WHO published a ranking of 191 national health systems based on 1997 data. This report was based on a composite index reflecting the health system goals: health, measured by disability-adjusted life expectancy (DALE) (50%); responsiveness (25%); and fair financial contribution (25%). Despite remarkable efforts to compile and adjust the data, the methods and conclusions from the report sparked wide criticism and debate. The task of distilling a national health system into a single rank proved too complex and too controversial for the WHO, which has declined to publish one since. Other organizations (including the Commonwealth Fund, the Peterson-KFF Health System Tracker, US News and World Report, and the Euro Health Consumer Index) still publish rankings periodically, generally with a mix of quantitative and qualitative analysis.
The following comparison of selected international health systems, as adapted from “Insights from Worldwide Health Care Systems,” demonstrates the complexities of establishing a health care ranking.
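A composite index of this kind is a weighted average of component scores. The Python sketch below uses the 50/25/25 weights quoted above. It assumes each component has already been rescaled to a common 0 to 100 range, and the country names and scores are invented. It is not a reconstruction of the WHO's actual method, which involved considerably more adjustment.

```python
# Weights as quoted in the text; component names are illustrative labels.
WEIGHTS = {"health": 0.50, "responsiveness": 0.25, "fair_financing": 0.25}


def composite_score(scores, weights=WEIGHTS):
    """Weighted average of component scores on a common 0-100 scale."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


# Invented scores for two fictional countries.
countries = {
    "Country A": {"health": 80, "responsiveness": 60, "fair_financing": 40},
    "Country B": {"health": 70, "responsiveness": 80, "fair_financing": 70},
}

ranking = sorted(countries,
                 key=lambda c: composite_score(countries[c]),
                 reverse=True)
for name in ranking:
    print(name, composite_score(countries[name]))
```

Country B outranks Country A here despite a lower health score, which hints at why the choice of weights drew so much debate: a different weighting would reverse the order.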
Data can be an extraordinarily powerful enabler. If used well, it can enable efficiency, coordination, quality, and innovation within and among national health systems. When shared and communicated well, it can generate insights for health policy, public health, and research. It’s exactly this usefulness, though, that makes health data so challenging. With each new use, each stakeholder, each technological advancement, there is another potential constraint on how the data should be configured and shared. As health systems and information technology are both notoriously dynamic, significant work is required simply to maintain a functioning and secure data infrastructure.
COVID-19 has been an extraordinary lesson in the value of national and international health care data during a public health emergency. Tens of thousands of lives depend on the timeliness and accuracy of health data. Both health policy and public sentiment depend on its clear communication. As data and modeling experts, health actuaries are often well positioned to advise on the best practices in establishing, maintaining, and using health data.
 Source: “Primary and Secondary Uses of Health Data”, U. Srinivasan, CMCRC, 5/2017, https://flyingblind.cmcrc.com/researchers/primary-and-secondary-uses-health-data.
 WHO 2016 report, https://www.who.int/goe/publications/global_diffusion/en/
 “HIPAA and HITECH”, HIPAA Journal, 2020, https://www.hipaajournal.com/hipaa-and-hitech/.
 “4 Reasons Why EHR interoperability is a Mess”, G. Barrick, 6/17/2019, https://datica.com/blog/reasons-ehr-interoperability-is-a-
 “2019 State of the Global EHR & Healthcare IT Adoption”, Black Book Market Research, 2019.
 “Adoption of Electronic Medical Records and ePrescribing”, in Part II Chapter 8 (https://www.oecd-ilibrary.org/docserver/health_glance_eur-2018-56-en.pdf?expires=1591009380&id=id&accname=guest&checksum=F18B12B3DEB94B38426E98494E8C9760) of “Health at a Glance: Europe 2018”, OECD, https://www.oecd-ilibrary.org/docserver/health_glance_eur-2018-en.pdf?expires=1591009067&id=id&accname=guest&checksum=F210D8F2D378335E105B9B0582987550.
 “EHR – standard formats”, Initiative and roadmap, 11/2018, https://ec.europa.eu/info/law/better-regulation/have-your-say/initiatives/1999-European-Electronic-Health-Record-EHR-Exchange-Format.
 “DIGITALEUROPE recommendation on health data-processing”, Digital Europe, 4/2020, https://www.digitaleurope.org/wp/wp-content/uploads/2020/04/200409Digitalhealth_Issues_Datasharing_DIGITALEUROPE_PositionPaper-1.pdf.
 http://oecdobserver.org/news/fullstory.php/aid/5780/Governing_data_for_better_health_and_healthcare.html. With scope and subjects specified in https://www.oecd.org/health/health-statistics.htm and in https://data.oecd.org/health.htm.
 “Secondary use of health and social data”, Ministry of Social Affairs and Health, Finland, Act on Secondary Use of Health and Social Data, https://stm.fi/en/secondary-use-of-health-and-social-data.
 Regulation (EU) 2016/679 and Directives (EU) 2016/680 and 2016/681 of the European Parliament, Official Journal of the EU, Vol. 59, 4 May 2016, https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=
 “Recommendation CM/Rec(2019)2 of the Committee of Ministers to member States on the protection of health-related data”, Council of Europe, Committee of Ministers, 3/2019, https://www.apda.ad/sites/default/files/2019-03/CM_Rec%282019%292E_EN.pdf.
 “New Israeli comprehensive draft regulations for health data secondary use”, 10/2019, https://techpolicy.org.il/health-data-secondary-use/.
 “Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule”, US Department of Health and Human Services, https://www.hhs.gov/hipaa/for-professionals/privacy/special-topics/de-identification/index.html#rationale. As the language of the safe harbor demonstrates, this rule is complex, and hard to legislate and apply.
 See, e.g., the WHO’s “Big data in global health: improving health in low- and middle-income countries”, 2015, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4339829/; “Big Data and Health Care: Challenges and Opportunities for Coordinated Policy Development in the EU”, Health Systems and Reform, 1(4): 285-300, 2015, https://www.tandfonline.com/loi/khsr20; “Big Data Fear Plagues German Healthcare”, Handelsblatt Today, Modern Medicine, 3/2016, https://www.handelsblatt.com/today/companies/modern-medicine-big-data-fear-plagues-german-healthcare/23536824.html?ticket=ST-1725028-1blQykzfc2AvAkYkQyQv-ap3; “Israel to launch Big Data health project amid privacy concerns”, Jerusalem Post, 4/2019, https://www.jpost.com/health-science/israel-to-launch-big-data-health-project-547043; and many more.
 “Urgent Health Challenges for the Next Decade”, WHO, 1/13/2020, https://www.who.int/news-room/photo-story/photo-story-detail/urgent-health-challenges-for-the-
 World Health Organization. The world health report: health systems financing: the path to universal coverage. World Health Organization, 2010.
 https://www.who.int/health-accounts/methodology/sha2011.pdf and its 2017 revised edition in https://www.oecd.org/publications/a-system-of-health-accounts-2011-9789264270985-en.htm
 “Measuring Overall Health System Performance for 191 Countries”, GPE Discussion Paper Series # 30, WHO EIP/GPE/EQC, www.who.int/healthinfo/paper30.pdf
 Allbright J, Mateja S, “Insights from Worldwide Health Care Systems: An introduction”, Contingencies, Jul 2017.