Big Data Meets Healthcare Data Analytics

Taylor Lake
Making the Most of Your Treasure Trove of Electronic Data

Big Data is More Than an Exercise in Terms

It used to be, before the electronic era, that you’d measure your business data storage requirements in square feet. Today, however, we measure in “bytes” instead of feet. As a computer user, for example, you may have more documents and electronic files on your hard drive or in cloud storage than you have in physical filing cabinets. Gigabytes and terabytes are your units of measurement. Let’s take this one step further and think of your organization’s data access and storage needs. Gigabytes and terabytes won’t cut it anymore; it’s time to learn new words: Petabyte… Exabyte… Zettabyte… Yottabyte. There are more, but you get the point. Each new level is 1,024 times the storage capacity of the one that precedes it. If you aren’t familiar with these terms already, the way your electronic data storage needs are multiplying even as you read this means you soon will be. But for now, we’ll keep it simple: “Big data” is here, and if you work in the healthcare industry, chances are you’re already having to deal with it.
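
To make the 1,024-times ladder concrete, here is a minimal Python sketch. The names UNITS, bytes_in, and ratio are made up for this illustration, and the sketch assumes the binary convention described above, in which each unit is 1,024 times the previous one.

```python
# Storage units in ascending order; each is 1,024 times the one before it.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def bytes_in(unit: str) -> int:
    """Return the number of bytes in one of the named unit."""
    return 1024 ** UNITS.index(unit)


def ratio(larger: str, smaller: str) -> int:
    """Return how many of `smaller` fit into one `larger`."""
    return bytes_in(larger) // bytes_in(smaller)


if __name__ == "__main__":
    # Print the ladder from gigabyte upward.
    for unit in UNITS[3:]:
        print(f"1 {unit} = {bytes_in(unit):,} bytes")
    print("terabytes in one petabyte:", ratio("petabyte", "terabyte"))
```

Moving up one rung always multiplies capacity by 1,024, so a yottabyte is 1,024 to the fifth power gigabytes.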

What is Big Data?

No single definition exists for big data. One of the prevalent ways to describe it uses “The Four V’s”: volume (yottabytes and beyond), variety (large amounts of unstructured data supplementing structured data), velocity (information is added to and updated frequently), and veracity (considering the variable degrees of certainty of the information).

If we look at several definitions for big data, we can glean some common elements among them:

  • Massive amounts of information in digital form, often surpassing what is available through randomized clinical trials. In 2011, this amounted to about 150 exabytes for the healthcare field generally; today, this volume has already moved into the zettabyte realm and is still growing, now by up to about 50 percent annually.
  • Multiple information sources, not always readily compatible with each other. They include but aren’t confined to clinical data, pharmacy and prescription information, laboratory results, patient records, professional journals, insurance information, patient monitoring devices, and even social media. Some of this information is the result of government electronic recordkeeping requirements. Much of it exists because recent improvements in computer technology provide the capability to drive the shift from paper-based to electronic formats.
  • Conventional methods of storing and managing information are unequal to the task of handling big data. Information overload means that much potentially relevant data is only partly utilized or unused altogether.
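
As a rough illustration of what growth at that pace implies, the Python sketch below compounds the 2011 figure of about 150 exabytes at 50 percent a year. A steady 50 percent rate every year is an assumption made only for this arithmetic, and the function name years_to_reach is hypothetical.

```python
def years_to_reach(start: float, target: float, annual_growth: float) -> int:
    """Whole years of compound growth needed for `start` to reach `target`."""
    if start <= 0 or annual_growth <= 0:
        raise ValueError("start and annual_growth must be positive")
    years = 0
    volume = start
    while volume < target:
        volume *= 1 + annual_growth  # one more year of growth
        years += 1
    return years


START_EXABYTES = 150.0          # the 2011 estimate cited above
ANNUAL_GROWTH = 0.50            # assumed constant; the text says "up to about"
ZETTABYTE_IN_EXABYTES = 1024.0  # one zettabyte expressed in exabytes

if __name__ == "__main__":
    years = years_to_reach(START_EXABYTES, ZETTABYTE_IN_EXABYTES, ANNUAL_GROWTH)
    print(f"Years to pass one zettabyte: {years}")
```

Under these assumptions the volume passes one zettabyte in five years, which shows how quickly double-digit annual growth exhausts a storage plan sized for today.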

The advantages of digital data over paper – like speed, convenience, and ready availability – make the transition to big data inevitable. The important question is: what can you do with it?

Data Analytics Answers the Big Data Challenge

Fortunately, advances in artificial intelligence that parallel those behind the growth of big data offer remarkable ways to make use of it that were unavailable only a few years ago. This methodology is “data analytics.”

Modern medicine is data-driven. Data analytics touches on all facets of the healthcare industry: patients, providers, insurers, and government legislators and regulators. As a concept, data analytics is not new; reviewing a patient’s paper records to make informed predictions and decisions is analyzing data. The problem is that with the massive amounts of electronic data that are now available, relying on people and earlier forms of regression analysis-based computer programs to make sense of it all is becoming harder to do.

The key to analyzing big data is machine learning, especially in the form of deep learning artificial neural networks. These artificial intelligence systems are, like the healthcare industry, data-driven. And the way they learn is well suited to the digital data explosion: the more data you feed them, the better they tend to perform.

Another driver of data analytics is the need to find ways to make healthcare less expensive. Providing better service while reducing costs requires management to wring every possible efficiency out of the system. Using artificial intelligence as an aid in finding better ways to measure performance is essential, because more emphasis is being given to preventive care and to value-based, as opposed to fee-for-service, health insurance reimbursements.

Examples of Data Analytics in Use

Data analytics is already well established in healthcare. By the end of this decade, its cumulative value in the United States may be more than $18 billion. So, based on what’s already being done, what can you do with big data and data analytics?

You Can Save Money and be More Efficient

Costs connected to healthcare in the United States are more than $3 trillion annually, or about one-sixth of the country’s gross domestic product. Data analytics plays a vital role in keeping these costs down:

  • Reducing administrative costs. Costs connected to administering healthcare, as opposed to providing the care itself, amount to one quarter of total costs. For example, hospitals have used data analytics to predict admission rates and patient volume.
  • Reducing costs from fraud and abuse.
  • Encouraging vertical integration and consolidation of electronic sources such as data warehouses.
  • Reducing the need to practice “cookbook” or “defensive” medicine in the form of unnecessary tests and procedures.

You Can Make Better Decisions and Predictions

Artificial intelligence, through its ability to detect patterns and make predictions, enhances organizational efficiency. These patterns can help identify patients who are more susceptible to infections during inpatient stays, who are at risk of missing follow-up appointments and may need increased attention, or who may be nearing the end of life. Other benefits of AI-based healthcare data analytics include:

  • Improving availability of patient records and treatment records among care providers and first responders.
  • Enabling better preventive care and patient wellness through personalized approaches like telemedicine, real-time patient alerts, and predictive analytics based on patient risk factors, which carries the secondary benefit of helping to reduce overall costs.
  • Supporting clinical decision making. Having more data available in consolidated form, such as an application dashboard with a patient’s information, makes it easier for health care practitioners to know exactly what treatment measures have already been introduced or need to be undertaken, and reduces the incidence of adverse events like medication errors.

Limitations on Data Analytics

Because data analytics relies to a considerable degree on artificial intelligence, it is subject to the limits of deep learning systems, which are still at an early stage of development. This means that challenges remain in the underlying technology that need to be overcome before healthcare data analytics can reach its full potential.

Gathering data from multiple sources into a coherent whole is difficult

It would be nice if all the data necessary to manage a big data solution were contained in a single data warehouse, but today the data is scattered among sources that are unrelated to one another and lack common standards. Furthermore, both the Federal and state governments and their agencies have roles to play in generating and managing healthcare data. Another way to describe this issue is to think of it as “data cleaning”: the need not only to pull all of the information together, but also to purge the whole of irrelevant, incorrect or otherwise corrupt data.
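
A minimal Python sketch of that kind of data cleaning follows. Everything specific in it is an assumption made for illustration: the field names (patient_id, name, birth_date), the two toy sources, and the rule of keeping the first record seen for each patient. Real healthcare data would call for far more careful matching and validation.

```python
from datetime import date


def normalize(record: dict) -> dict:
    """Put one raw record into a common shape (hypothetical field names)."""
    return {
        "patient_id": str(record.get("patient_id", "")).strip().upper(),
        "name": " ".join(str(record.get("name", "")).split()).title(),
        "birth_date": str(record.get("birth_date", "")).strip(),
    }


def is_valid(record: dict) -> bool:
    """Reject records with no ID or with an unparseable or future birth date."""
    if not record["patient_id"]:
        return False
    try:
        born = date.fromisoformat(record["birth_date"])
    except ValueError:
        return False
    return born <= date.today()


def clean(*sources: list) -> list:
    """Merge sources, drop invalid rows, keep the first record per patient ID."""
    seen = {}
    for source in sources:
        for raw in source:
            record = normalize(raw)
            if is_valid(record) and record["patient_id"] not in seen:
                seen[record["patient_id"]] = record
    return list(seen.values())


# Toy data standing in for two unrelated systems.
clinic = [
    {"patient_id": " a100 ", "name": "jane  doe", "birth_date": "1980-02-29"},
    {"patient_id": "A101", "name": "John Roe", "birth_date": "not recorded"},
]
pharmacy = [
    {"patient_id": "A100", "name": "Jane Doe", "birth_date": "1980-02-29"},
    {"patient_id": "", "name": "Unknown", "birth_date": "1975-01-01"},
    {"patient_id": "a102", "name": "ALEX POE", "birth_date": "1990-07-15"},
]

if __name__ == "__main__":
    for row in clean(clinic, pharmacy):
        print(row)
```

In this toy run, the duplicate entry for patient A100 collapses into one record, and the rows with a missing ID or an unreadable birth date are dropped.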

Storing mass amounts of data can be a logistical and security challenge

Healthcare records and other big data information consist of millions of records, images and other data that can tax any IT department beyond its limits. Another possible problem for organizations that store and manage their own data is “data silos,” which are bodies of data unconnected to the entire body of information.

These considerations have led most healthcare industry participants to resort to cloud-based storage, but that raises concerns of its own, most notably whether the cloud storage provider is adequately aware of and can comply with privacy and security requirements, such as HIPAA protections, and how well it can safeguard patient information from security breaches.

Continued testing is still necessary to ensure predictive accuracy

When you are drawing conclusions based on big data analysis, not everything you learn will be helpful, because not every commonality will be relevant. For example, if you are doing a predictive analysis for people likely to experience heart disease in the next year, in addition to factors that are likely causes (obesity, smoking, inactivity), you can also receive results that aren’t (being married, living in a particular zip code, or making more than $100,000 in income). Falsification testing is imperative to make sure that the conclusions you draw from your data are less subject to statistical errors, and that the relationships you find are causal and not merely coincidental.
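
One simple check in this spirit is a permutation test, sketched below in Python. It is swapped in here as an illustration and is not a complete falsification-testing procedure: it only asks whether an association could plausibly be due to chance, and it says nothing about causality. The function names and the deliberately exaggerated toy data are assumptions made for the example.

```python
import random


def rate_difference(exposed, outcomes):
    """Outcome rate among people with the factor minus the rate among those without.

    Assumes both groups are non-empty.
    """
    with_factor = [o for e, o in zip(exposed, outcomes) if e]
    without = [o for e, o in zip(exposed, outcomes) if not e]
    return sum(with_factor) / len(with_factor) - sum(without) / len(without)


def permutation_p_value(exposed, outcomes, n_permutations=2000, seed=0):
    """Share of random relabelings whose gap is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(rate_difference(exposed, outcomes))
    shuffled = list(outcomes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between factor and outcome
        if abs(rate_difference(exposed, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Exaggerated toy data for 20 hypothetical people (1 = yes, 0 = no).
heart_disease = [1] * 10 + [0] * 10
smoker = [1] * 10 + [0] * 10   # lines up perfectly with the outcome
zip_code_flag = [1, 0] * 10    # same outcome rate in both groups

if __name__ == "__main__":
    print("smoker p-value:", permutation_p_value(smoker, heart_disease))
    print("zip code p-value:", permutation_p_value(zip_code_flag, heart_disease))
```

A small p-value means the pattern is unlikely to be a fluke of this sample, while a large one suggests the "finding" is noise. Even a small p-value leaves the causal question open, which is why further testing remains necessary.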

Data Analytics Is What You Make of It

If the current trend of double-digit percentage growth in electronic data volume continues, then what you do with your big data will bear directly on the prosperity if not the survival of your business – and the lives and health of everyone who depends on the healthcare system.

Failing to stay abreast of the expansion of big data is not just a waste of a potentially critical resource. If your market and your competition are keeping pace with the data revolution, you will not be standing still next to them. You’ll be falling behind.

In specific and measurable ways, data analytics is already making a strong impression on healthcare delivery and administration. Data scientists and healthcare professionals alike know how it can be made even better, and its evolution continues. Choosing the right data analytics integration partner is key to implementing your own strategy to not just cope with big data, but to harness it and put it to work for you.