Machine Learning and EMRs: Mining Electronic Medical Records for Cancer Treatment Decisions Using AI to Predict Treatments

The Gist

New studies indicate that using machine learning to analyze electronic medical records and clinician notes shows promise for identifying bias in treatment assignments and for addressing incomplete medical records.

At AKASA, we are constantly exploring the intersection of machine learning (ML), artificial intelligence (AI), and healthcare. Given how quickly these fields evolve, we invest heavily in ongoing research to stay at the cutting edge.

In addition to investing in research and publishing peer-reviewed articles, AKASA frequently hosts experts in the field to present to our technical teams.

As part of this ongoing learning series, we recently hosted Jiaming Zeng, Ph.D., a postdoctoral researcher in the Center of Computational Health at IBM. She presented her research on leveraging electronic medical records (EMRs) to improve decision-making in oncology by adapting causal inference, ML, and natural language processing (NLP).

Dr. Zeng’s research on leveraging electronic medical records to improve decision-making in oncology by adapting causal inference, machine learning, and natural language processing, has great potential. At AKASA, we’re always looking to push the limits of what’s possible with patient data and machine learning, and Dr. Zeng shows one of many such possibilities with this incredible research.

~ Byung-Hak Kim, AI Technology Lead at AKASA

Mining Electronic Medical Records for Cancer Treatment Decisions

Cancer continues to be one of the leading causes of death worldwide. Since 1990, we have averted roughly 2.1 million deaths for men and one million deaths for women thanks to advancements in cancer treatments and improvements in early diagnosis.

However, the increased number of treatment options means clinicians face more difficult decisions when determining a course of treatment for a given patient, which increases demand for tools that assist in this decision-making. The gold standard for determining whether one treatment is better than another is the randomized controlled trial (RCT). Unfortunately, RCTs can be expensive and time-consuming.

Given the healthcare industry’s broad adoption of EMRs, the medical community has massive datasets that can present clinicians and researchers with large amounts of observational information. While some tools currently use this data, they fall short in some ways. Improving upon these tools is the focus of Jiaming’s research.

Comparative Effectiveness Challenges

When working to determine which treatment is better using observational data, clinicians are faced with two primary challenges: selection bias in treatment assignments and incomplete treatment records for patients.

Failure to adjust for selection bias can undermine the reliability of observational data in any application. Missing patient records reduce the size of the cohort available for study. Addressing these challenges is critical in developing practical, reliable data sources upon which to base treatment decisions.

Identifying Selection Bias in Treatment Assignments

A critical step in determining an approach to cancer treatment is weighing the benefits of a treatment against its potential costs, such as deciding whether to treat a given type of cancer with surgery or radiation, or to monitor it further rather than taking more invasive measures.

This first study focused on using ML techniques to identify any biases present in current RCT data for multiple cancers, focusing primarily on bladder cancer.

Jiaming and her team built a set of covariates from the EMRs using Bag of Words, then trained a treatment prediction model and survival outcome model, and used Lasso to identify any intersections between these two models. These intersections identified potential sources of bias, called confounders. They then performed survival analysis by training a Cox PH model on these intersections. Finally, her team compared these survival analysis results against an established RCT.
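The selection step described above can be pictured with a small sketch. This is not the team's implementation: it assumes a simple whitespace bag of words, uses a plain linear Lasso (solved by coordinate descent) as a stand-in for both the treatment prediction model and the survival outcome model, and leaves out the Cox PH step entirely. The notes, the words "frail", "smoker", and "referral", the outcome score, and the penalty value are all invented for illustration.

```python
from collections import Counter


def bag_of_words(notes):
    """Return (vocab, rows); rows[i][j] counts how often vocab[j] appears in notes[i]."""
    counts = [Counter(note.lower().split()) for note in notes]
    vocab = sorted(set().union(*counts))
    return vocab, [[float(c[word]) for word in vocab] for c in counts]


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values inside the band become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(X, y, penalty, sweeps=200):
    """Coordinate-descent Lasso on centered data.

    Minimizes (1 / 2n) * ||y - Xw||^2 + penalty * ||w||_1.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    columns = [[row[j] - means[j] for row in X] for j in range(p)]
    y_mean = sum(y) / n
    residual = [value - y_mean for value in y]
    w = [0.0] * p
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            norm = sum(v * v for v in col) / n
            if norm == 0.0:  # a word that appears equally in every note carries no signal
                continue
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, penalty) / norm
            delta = new_w - w[j]
            residual = [r - c * delta for c, r in zip(col, residual)]
            w[j] = new_w
    return w


def selected_terms(vocab, weights):
    """Words whose Lasso coefficient is non-zero."""
    return {word for word, weight in zip(vocab, weights) if weight != 0.0}


def potential_confounders(notes, treatment, outcome, penalty=0.05):
    """Words selected by BOTH the treatment model and the outcome model."""
    vocab, X = bag_of_words(notes)
    treatment_terms = selected_terms(vocab, lasso(X, treatment, penalty))
    outcome_terms = selected_terms(vocab, lasso(X, outcome, penalty))
    return treatment_terms & outcome_terms


# Invented toy cohort: "frail" drives both treatment and outcome,
# "referral" drives treatment only, "smoker" drives outcome only.
notes, treatment, outcome = [], [], []
for frail in (0, 1):
    for smoker in (0, 1):
        for referral in (0, 1):
            words = ["patient"] + ["frail"] * frail + ["smoker"] * smoker + ["referral"] * referral
            notes.append(" ".join(words))
            treatment.append(1.0 if frail or referral else 0.0)
            outcome.append(2.0 * frail + 1.0 * smoker)

print(sorted(potential_confounders(notes, treatment, outcome)))  # ['frail']
```

Only the word that both models select survives the intersection, which is the sense in which the intersections flag potential confounders. A real pipeline would likely use a proper tokenizer, a classifier for the treatment model, and a survival model for the outcome.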

Traditional RCTs indicate that monitoring is much better than surgery for bladder cancer, as surgery has a higher mortality rate. However, the confounders identified during this study suggest that patients with bladder cancer or existing bladder issues are more likely to receive surgery, as bladder cancer doesn’t respond as well to radiation. Bladder cancer patients also tend to be older and have additional medical issues, hence a higher death rate.

In other words, because patients in worse overall health are more likely to receive surgery, this bias can confound the data behind the surgery vs. monitoring decision when considering treatment for bladder cancer.
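A small arithmetic sketch shows why this kind of confounding matters. The counts below are invented purely for illustration and are not taken from the study: they assume patients in poor health are steered toward surgery, and that within each health group the death rate is the same for both treatments.

```python
# (treatment, health) -> (deaths, patients); every count here is invented.
cohort = {
    ("surgery", "poor"): (24, 80),
    ("surgery", "good"): (2, 20),
    ("monitoring", "poor"): (6, 20),
    ("monitoring", "good"): (8, 80),
}


def crude_risk(treatment):
    """Death rate for a treatment, ignoring patient health."""
    deaths = sum(d for (t, _), (d, n) in cohort.items() if t == treatment)
    patients = sum(n for (t, _), (d, n) in cohort.items() if t == treatment)
    return deaths / patients


def standardized_risk(treatment):
    """Death rate for a treatment, reweighted to the health mix of the whole cohort."""
    total = sum(n for _, n in cohort.values())
    risk = 0.0
    for health in ("poor", "good"):
        stratum_size = sum(n for (_, h), (_, n) in cohort.items() if h == health)
        deaths, patients = cohort[(treatment, health)]
        risk += (stratum_size / total) * (deaths / patients)
    return risk


for name in ("surgery", "monitoring"):
    print(name, f"crude={crude_risk(name):.2f}", f"adjusted={standardized_risk(name):.2f}")
# surgery crude=0.26 adjusted=0.20
# monitoring crude=0.14 adjusted=0.20
```

In this toy cohort the crude comparison makes surgery look almost twice as deadly, yet the gap disappears once the comparison accounts for who received which treatment.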

We have developed a method that offers a coherent and adaptable process to identify sources of bias from textual data. And although here we have applied it in a medical sense, we really believe that this can be easily applied to any other context where there’s textual data that you wish to use.

~ Jiaming Zeng, Ph.D., Researcher, Center of Computational Health at IBM

Using NLP to Identify Cancer Treatments and Address Incomplete Medical Records

Incomplete treatment records present a problem when building a large enough cohort to study.

Currently, cancer registries are the definitive resource for treatment analytics. Registries record only the initial treatment decisions, and they can require hours of manual labor to be usable in large-scale studies. Even with this time expense, the records tend to be incomplete, especially when tracking the outcomes of these treatments.

To reduce the amount of manual effort and to close any gaps in the records themselves, Jiaming explored using NLP to analyze EMRs, specifically focusing on clinicians’ notes and using them to fill gaps in treatment outcome data.

They built three different data sources:

A baseline data set, with treatment groups assigned by billing code
A structured data source, built using a supervised model
An unstructured data source, built from clinical notes

The team then compared the results that all three models returned.

Similar results were produced when these models were applied to both prostate and esophageal cancers. The baseline data is serviceable, but not great. A slight improvement is observable when the structured data is included with the baseline.

However, the most significant improvement was observed when the structured, unstructured, and baseline data were all included. This indicates that the structured and the unstructured data are both valuable in filling in the gaps that the conventional systems of record can contain.
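To make the idea of filling record gaps from clinician notes concrete, here is a minimal sketch. It is not the team's NLP model: it swaps in a hypothetical keyword matcher, and the record layout, field names, patterns, and example notes are all assumptions made for illustration.

```python
import re

# Hypothetical keyword patterns; a real system would use a trained NLP model.
TREATMENT_PATTERNS = {
    "surgery": re.compile(r"\b(prostatectomy|esophagectomy|resection)\b", re.I),
    "radiation": re.compile(r"\b(radiation|radiotherapy|brachytherapy)\b", re.I),
}


def treatments_in_note(note):
    """Return the set of treatments whose keywords appear in a note."""
    return {name for name, pattern in TREATMENT_PATTERNS.items() if pattern.search(note)}


def fill_missing(records, notes_by_patient):
    """Fill a missing treatment only when the notes point to exactly one treatment."""
    filled = []
    for record in records:
        treatment, source = record["treatment"], "registry"
        if treatment is None:
            mentions = set()
            for note in notes_by_patient.get(record["patient_id"], []):
                mentions |= treatments_in_note(note)
            if len(mentions) == 1:
                treatment, source = mentions.pop(), "notes"
            else:
                source = "unresolved"  # no mention, or conflicting mentions
        filled.append({**record, "treatment": treatment, "source": source})
    return filled


# Invented example records and notes.
records = [
    {"patient_id": 1, "treatment": "surgery"},
    {"patient_id": 2, "treatment": None},
    {"patient_id": 3, "treatment": None},
]
notes_by_patient = {
    2: ["Patient completed external beam radiotherapy."],
    3: ["Discussed prostatectomy versus radiotherapy."],
}

for row in fill_missing(records, notes_by_patient):
    print(row["patient_id"], row["treatment"], row["source"])
# 1 surgery registry
# 2 radiation notes
# 3 None unresolved
```

Keyword matching ignores negation and context (for example, "declined radiation"), which is one reason a learned model over the notes is the more realistic choice. Tracking the source of each value also keeps note-derived entries distinguishable from registry entries.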

Why This Research Matters

Providing clinicians with the highest-quality data to inform their treatment decisions for cancer patients is of utmost importance. Jiaming’s research applies modern ML methods to improve the data on which those decisions are made, and it is a powerful illustration of how new data analysis techniques can directly result in a better quality of care for patients.

If you’re interested in using AI and ML to improve the healthcare industry, AKASA is always looking for top talent.

Join the AKASA Engineering Team Today

WRITTEN BY

Zeke Bergeron

Zeke Bergeron is a senior technical program manager at AKASA. He has worked at companies ranging from small startups to large banks, in roles including technical writing, product operations, knowledge management, and Agile program management. Bergeron works closely with engineering to accelerate their ability to deliver and helps guide the organization towards a healthy and efficient approach to internal knowledge management.