The Challenges of Unstructured Healthcare Data
During every patient encounter, healthcare providers base their decisions on hundreds of discrete data signals in order to provide effective treatment. As care teams assess, diagnose, and treat patients, they create and add to each patient’s medical record with a range of data: discrete lab results, qualitative descriptions, transcripts of their opinions and decisions, and more. It’s no surprise the industry is facing a monumental challenge: how to unlock all of this unstructured clinical data for new insights that drive innovation and improve patient care.
A Tale of Two Data Types: Structured & Unstructured
When healthcare practitioners began storing and managing data digitally, they (like people in most industries) used structured data. This type of data is clearly defined and recorded in specific ways that machines can understand. Examples include relational databases and spreadsheets that contain names, dates, currency fields, text with character limits, and numeric or alphanumeric fields for insurance numbers and for ICD-10 and CPT codes. People entering structured data into databases or spreadsheets have clear rules on what data to enter and how to enter it. For example, insurance companies require strict compliance from healthcare providers on how to submit claims for reimbursement. This consistency makes structured data relatively easy to search and analyze.
Electronic health record (EHR) systems are designed to use structured data, and the Structured Data Capture (SDC) standards initiative is working toward the goal of interoperability so healthcare organizations can share data for patient safety, clinical research, public health reporting, and healthcare coverage.
Unfortunately, not all healthcare data fits neatly into a field on a spreadsheet. Based on IDC research, the industry recognizes that at least 80 percent of healthcare data is unstructured. This category of data includes typed and written text, photos, radiological images, pathology slides, video, audio, streaming device data, PDF files, faxes, PowerPoint slides, and emails. Unstructured data is valuable, but it’s difficult to store, search, and analyze, and harder still to share with collaborative teams, especially if they work in different healthcare organizations.
Barriers Associated with Unstructured Healthcare Data
Processing unstructured data to make it actionable and accessible will mean finding ways to overcome numerous challenges:
Interpreting Handwriting
There’s a gold mine of information in notes handwritten by patients or recorded by physicians, nurses, and other care providers. Natural language processing (NLP), a type of artificial intelligence (AI), may be the key to making these notes usable in computing systems. Beyond the character recognition that converts handwriting into text, NLP can draw some conclusions and understand intent, such as interpreting abbreviations and misspellings.
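To make the idea of interpreting abbreviations and misspellings concrete, here is a deliberately simple Python sketch. It is not how a production NLP system works: it assumes the handwriting has already been converted to text, and the lookup tables (ABBREVIATIONS, KNOWN_TERMS) and the function name normalize_note are hypothetical, invented for illustration. Real clinical NLP relies on much larger vocabularies and on context.

```python
import difflib
import re

# Hypothetical lookup tables, for illustration only.
ABBREVIATIONS = {
    "pt": "patient",
    "hx": "history",
    "bp": "blood pressure",
    "sob": "shortness of breath",
}
KNOWN_TERMS = ["hypertension", "diabetes", "pneumonia"]


def normalize_note(text):
    """Expand known abbreviations and correct near-miss spellings of known terms."""
    tokens = re.findall(r"[a-z]+", text.lower())  # keep letters only, drop punctuation
    normalized = []
    for token in tokens:
        if token in ABBREVIATIONS:
            normalized.append(ABBREVIATIONS[token])
            continue
        # Only try fuzzy matching on longer words to avoid spurious corrections.
        if len(token) >= 6 and token not in KNOWN_TERMS:
            close = difflib.get_close_matches(token, KNOWN_TERMS, n=1, cutoff=0.8)
            if close:
                normalized.append(close[0])
                continue
        normalized.append(token)
    return " ".join(normalized)
```

Calling normalize_note("Pt reports SOB, hx of hypertenson") expands the three abbreviations and corrects the misspelled final word to "hypertension". A dictionary approach like this breaks down quickly on ambiguous abbreviations, which is where context-aware NLP earns its keep.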
Reading Radiological Images
Reviewing and analyzing X-ray, CAT, MRI, ultrasound, and other types of medical images requires the skill of experienced, licensed professionals. It may also require taking past procedures, patient history, and other variables into account to distinguish healthy tissue from abnormalities. Machine learning algorithms can “learn” from archived images to recognize patterns and help interpret the results of radiological scans.
Creating Metadata
It’s possible to preprocess unstructured data for use in an EHR or other system that requires structured data. For example, a technician or physician could describe an image using codes or keywords, and that metadata could be entered into the computing system, adding information and making the image searchable. The process is time-consuming, however, and would benefit from automation.
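As a rough illustration of how keyword metadata makes an otherwise opaque file searchable, the sketch below builds a small in-memory index that maps each keyword to the files tagged with it. The class name MetadataIndex, the file names, and the tags are all hypothetical; a real deployment would more likely store this metadata in a database or alongside the image archive.

```python
from collections import defaultdict


class MetadataIndex:
    """Maps each keyword to the set of file ids tagged with it."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, file_id, tags):
        """Record the keywords a technician or physician assigned to a file."""
        for tag in tags:
            self._by_tag[tag.strip().lower()].add(file_id)

    def search(self, *tags):
        """Return the sorted ids of files that carry every requested keyword."""
        if not tags:
            return []
        matches = [self._by_tag.get(tag.strip().lower(), set()) for tag in tags]
        return sorted(set.intersection(*matches))


# Example usage with made-up file names and keywords.
index = MetadataIndex()
index.add("img-001.dcm", ["knee", "x-ray", "joint replacement"])
index.add("img-002.dcm", ["hip", "x-ray", "joint replacement"])
knee_images = index.search("knee", "joint replacement")
```

The search itself is trivial once the tags exist. The expensive part is assigning them, which is why automating the tagging step matters.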
Accommodating File Sizes
Unstructured data generally requires more storage than structured data: unstructured data sets are often measured in terabytes and petabytes, not gigabytes. Handling them requires storage infrastructure with greater capacity.
Performing Searches
Without metadata, some forms of unstructured data would be impossible to search (e.g., images of a specific type of joint replacement or audio from a conference). Machine learning models have the potential to learn what such content looks or sounds like and then find what healthcare providers need.
Using Streaming Data
Unstructured data isn’t necessarily static; it can be dynamic. Internet of Things (IoT), edge computing, and AI can be used to analyze data in real time and trigger actions or send alerts.
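One simple way to picture real-time analysis that triggers alerts is a rolling average over incoming device readings. The Python sketch below is only an illustration under stated assumptions: the function name monitor, the (timestamp, value) reading format, and the threshold and window values are hypothetical and carry no clinical meaning.

```python
from collections import deque


def monitor(readings, threshold, window=3):
    """Yield (timestamp, rolling_average) whenever the average of the
    last `window` readings rises above `threshold`."""
    recent = deque(maxlen=window)  # oldest reading drops off automatically
    for timestamp, value in readings:
        recent.append(value)
        if len(recent) == window:
            average = sum(recent) / window
            if average > threshold:
                yield timestamp, average


# Example usage with made-up readings from a hypothetical device.
stream = [(0, 80), (1, 90), (2, 100), (3, 130), (4, 140)]
alerts = list(monitor(stream, threshold=110))
```

Because monitor is a generator, it can consume readings as they arrive rather than waiting for a complete data set. Averaging over a window, rather than reacting to a single reading, is one way to keep a lone noisy value from raising an alert.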
Managing Massive Data Volumes
Increasing the volume of data by using unstructured sources can create overload for healthcare practitioners. They will need applications that manage and prioritize data so their attention is focused where it’s needed most.
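A priority queue is one basic building block for this kind of application. The sketch below, written in Python with the standard heapq module, always surfaces the most urgent item first. The class name WorklistQueue, the sample items, and the urgency scores are hypothetical; deciding how to score urgency is the hard part and is not addressed here.

```python
import heapq
import itertools


class WorklistQueue:
    """Returns the most urgent item first; ties come out in arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()

    def push(self, item, urgency):
        # heapq is a min-heap, so negate urgency to pop the highest score first.
        heapq.heappush(self._heap, (-urgency, next(self._arrival), item))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


# Example usage with made-up items and scores.
worklist = WorklistQueue()
worklist.push("routine lab result", urgency=1)
worklist.push("critical potassium value", urgency=9)
worklist.push("new imaging report", urgency=5)
first = worklist.pop()
```

Here the critical value is returned first even though it was added second, so attention goes where it is needed most regardless of how much lower-priority data has piled up.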
Dealing With Existing Records
There are vast stores of unstructured healthcare data that could be relevant to patients receiving care today but are currently inaccessible to EHRs. Healthcare providers will need not only to manage the unstructured data generated now and in the future, but also to leverage unstructured data from the past.
Accept the Challenge
Your healthcare clients are actively looking for solutions that will allow them to extract value from unstructured data. But at a time when the industry is facing economic pressures, new regulatory requirements, and a paradigm shift to value-based care, they need to find practical ways to do it. Do you have answers?