Delivering Cost Savings with Real-Time Claims Data Processing on Google Cloud
Anand Mogusala
Senior Manager, Data Engineering

Real-Time Adjudication for Health Insurance Claims delivers “…potential savings of $15 per claim on average or a total of $45 billion annually…” 

Importance of Real-Time Data Processing

In today’s digital business landscape, consumers expect instantaneous responses and services. Lack of access to real-time information can have significant negative impacts like missed opportunities, compliance issues and higher operational costs. 

Real-time data processing is a game-changer for the insurance industry, especially in claims management, where it delivers superior customer experiences. The ability to process claims data in real time offers significant advantages, such as:

  • Enhanced operational efficiency
  • Improved customer satisfaction
  • Better risk management
  • Improved fraud detection
  • Regulatory compliance

Real-time data processing is a critical component of modern data architecture. Among various cloud-native services, Google Cloud Dataflow stands out as a powerful, fully managed service for stream and batch data processing, enabling real-time processing. Here we will explore how GCP Dataflow can be leveraged for real-time data ingestion and processing.

Real-time data processing using GCP Dataflow for a health insurance firm

For one of our insurance clients, approximately 3.5 million claims are received for processing daily. However, the existing data processing framework could handle only 80% of these claims, leaving the remaining 20% at risk for overpayment, fraud, and non-compliance. Real-time data processing was therefore seen as a critical architectural component, as delays translate directly into dollars lost to overpayments and compliance issues.

The following diagram depicts our approach to achieving real-time claims processing using GCP services, built around Pub/Sub, Dataflow, and BigQuery. Incoming claims are published by an MQ listener to a Pub/Sub topic, and a Dataflow job reads them from the corresponding subscription. The job, packaged as a JAR file containing the necessary transformations, processes the data and loads the transformed records into a BigQuery streaming table in real time. Subsequent steps read the data from this table for further processing.
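
The sketch below illustrates what such a pipeline can look like in Apache Beam (Java, consistent with the JAR-based deployment described above). It is a minimal sketch, not the client's actual code: the project, subscription, and table names and the claim fields are placeholders, and ParseClaimFn stands in for the real transformation logic.

    import com.google.api.services.bigquery.model.TableRow;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
    import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.options.StreamingOptions;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;

    public class ClaimsStreamingPipeline {

      // Placeholder parser: splits a delimited claim record into a few illustrative columns.
      static class ParseClaimFn extends DoFn<String, TableRow> {
        @ProcessElement
        public void processElement(@Element String message, OutputReceiver<TableRow> out) {
          String[] fields = message.split("\\|", -1);
          out.output(new TableRow()
              .set("claim_id", fields[0])
              .set("member_id", fields[1])
              .set("claim_amount", Double.parseDouble(fields[2])));
        }
      }

      public static void main(String[] args) {
        StreamingOptions options =
            PipelineOptionsFactory.fromArgs(args).withValidation().as(StreamingOptions.class);
        options.setStreaming(true); // unbounded Pub/Sub input requires a streaming pipeline

        Pipeline pipeline = Pipeline.create(options);

        pipeline
            // Pull raw claim messages from the subscription fed by the MQ listener.
            .apply("ReadClaims", PubsubIO.readStrings()
                .fromSubscription("projects/my-project/subscriptions/claims-input-sub"))
            // Apply the transformation logic packaged in the pipeline JAR.
            .apply("ParseClaim", ParDo.of(new ParseClaimFn()))
            // Stream the parsed rows into the BigQuery table read by downstream steps.
            .apply("WriteToBigQuery", BigQueryIO.writeTableRows()
                .to("my-project:claims_dataset.claims_stream")
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER));

        pipeline.run();
      }
    }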

With the introduction of this architecture, we can now process almost all the received claims data in real time. This enables the client to be notified of compliance issues and to stop payments upfront.

Steps involved in the build

  • Moved the incoming messages into a Pub/Sub topic through a NiFi and MQ listener setup
  • Created a JAR file with the required transformation logic
  • Created a Dataflow process to read claims from the Pub/Sub subscription and process them using the JAR file built with the required transformations
  • Created BigQuery tables to hold the parsed data (a schema sketch follows this list)
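
Complementing the pipeline sketch above, the snippet below shows one way the streaming BigQuery table could be declared so that Dataflow creates it on the first run. The dataset, table, and column names are illustrative only; the production table carries far more claim attributes.

    import com.google.api.services.bigquery.model.TableFieldSchema;
    import com.google.api.services.bigquery.model.TableRow;
    import com.google.api.services.bigquery.model.TableSchema;
    import java.util.Arrays;
    import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;

    public class ClaimsTableSetup {

      // Illustrative schema for the streaming claims table.
      static TableSchema claimsSchema() {
        return new TableSchema().setFields(Arrays.asList(
            new TableFieldSchema().setName("claim_id").setType("STRING").setMode("REQUIRED"),
            new TableFieldSchema().setName("member_id").setType("STRING"),
            new TableFieldSchema().setName("claim_amount").setType("FLOAT"),
            new TableFieldSchema().setName("received_ts").setType("TIMESTAMP")));
      }

      // Supplying the schema lets Dataflow create the table on the first run instead of
      // requiring it to exist beforehand (CREATE_NEVER in the earlier sketch).
      static BigQueryIO.Write<TableRow> claimsSink() {
        return BigQueryIO.writeTableRows()
            .to("my-project:claims_dataset.claims_stream")
            .withSchema(claimsSchema())
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND);
      }
    }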

Pub/Sub 

Pub/Sub (publish-subscribe) is a messaging platform where messages are exchanged between independent entities (publishers and subscribers) through a message broker. This decouples the entities producing the data (publishers) from those consuming the data (subscribers), enabling scalable, flexible, and robust communication systems.

Key features include:
  • Automatically scales to handle growing workloads without requiring manual intervention.
  • Globally distributed system ensuring low latency and high availability.
  • End-to-end encryption and IAM-based access control for secure message exchanges.
  • Supports push and pull delivery models to cater to different application needs.
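
To make the publisher/subscriber model concrete, here is a minimal sketch of publishing a claim message with the Google Cloud Pub/Sub Java client; in the architecture above, this role is played by the NiFi/MQ listener setup. The project and topic names are placeholders.

    import com.google.cloud.pubsub.v1.Publisher;
    import com.google.protobuf.ByteString;
    import com.google.pubsub.v1.PubsubMessage;
    import com.google.pubsub.v1.TopicName;

    public class ClaimsPublisherExample {
      public static void main(String[] args) throws Exception {
        // Hypothetical project and topic that receive incoming claims.
        TopicName topic = TopicName.of("my-project", "claims-input");
        Publisher publisher = Publisher.newBuilder(topic).build();
        try {
          // Each claim record becomes one message; subscribers (e.g. the Dataflow job) pull it later.
          PubsubMessage message = PubsubMessage.newBuilder()
              .setData(ByteString.copyFromUtf8("CLM-001|MBR-42|1250.00"))
              .build();
          publisher.publish(message).get(); // block until the broker acknowledges the message
        } finally {
          publisher.shutdown();
        }
      }
    }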

Dataflow

Google Cloud Dataflow is a managed service for executing Apache Beam pipelines. It provides a unified programming model for both batch and stream processing, making it a versatile solution for a wide range of data processing needs. With Dataflow, we can build pipelines that read from data sources, transform and process the data, and write the results to the desired destinations, all in real time.

Key features include:
  • Scales to handle varying volumes of data, ensuring that pipelines can manage both peak loads and regular traffic efficiently.
  • Process and analyze data with minimal delay, enabling quicker decision-making.
  • Supports multiple data sources and sinks, including Pub/Sub, BigQuery, Cloud Storage, and more, allowing seamless integration into architecture.
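
For completeness, the following sketch shows how a Beam pipeline such as the one above could be pointed at the managed Dataflow service through DataflowPipelineOptions. The project, region, and bucket values are placeholders and would normally be supplied as command-line arguments when the pipeline JAR is launched.

    import org.apache.beam.runners.dataflow.DataflowRunner;
    import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    public class LaunchOnDataflow {
      public static void main(String[] args) {
        // Placeholder project, region, and bucket; in practice these are usually passed
        // on the command line (--project, --region, --gcpTempLocation).
        DataflowPipelineOptions options =
            PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);
        options.setRunner(DataflowRunner.class);   // execute on the managed Dataflow service
        options.setProject("my-project");
        options.setRegion("us-central1");
        options.setStreaming(true);                // keep the job running for unbounded Pub/Sub input
        options.setGcpTempLocation("gs://my-bucket/temp");

        // The Pub/Sub-to-BigQuery transforms from the earlier sketch would be attached here
        // before calling pipeline.run().
        Pipeline pipeline = Pipeline.create(options);
      }
    }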

ASCENDION AVA+ Data Onboard Express (DeX) enables rapid onboarding of data into a data lake by leveraging metadata-driven ingestion, Gen AI-assisted data validation, and configurable orchestration, delivering higher productivity and reduced TCO.

ASCENDION AVA+ DeX offers flexible and scalable Data Engineering design patterns with a variety of cloud-agnostic and cloud-native services from GCP, Azure, and AWS. Key benefits delivered by ASCENDION AVA+ DeX include:

  • Achieve up to 60% reduction in data onboarding time and effort through automated Data ingestion
  • Utilize Metadata-Driven Reusable Assets to expedite Enterprise-wide adoption
  • Deploy standardized artifacts for simplified maintenance
  • Gain instant access to data with an Event-Driven Architecture
  • Facilitate data analysis with streamlined Data Cataloging