Insurance Data Intelligence Platform for Financial Services
Automating claims processing with the Databricks Data Intelligence Platform
What is the Databricks Data Intelligence Platform for Financial Services?
It's the only enterprise data platform that lets you leverage all your data, from any source and on any workload, to deliver more engaging customer experiences driven by real-time data, at the lowest cost.
Simple
One platform and a single governance/security layer for your data warehousing and AI, to accelerate innovation and reduce risk. No need to stitch together multiple solutions with disparate governance and high complexity.
Open
Built on open source and open standards. You own your data and avoid vendor lock-in, with easy integration with external solutions. Being open also lets you share your data with any external organization, regardless of their data stack or vendor.
Multicloud
Multicloud is a key strategic consideration for FS companies, with 88% of single-cloud FS customers adopting a multicloud architecture (reference). The Lakehouse provides one consistent data platform across clouds and gives companies the ability to process data wherever they need.

Accelerate claims processing by automating insight generation and decisions
We'll showcase how the Lakehouse can accelerate claims resolution and reduce fraud risk for car crashes. In the traditional flow, claims transit through an operational system such as Guidewire and an analytical system such as Databricks, as shown in the diagram below.
Here is the application we will implement:

We will consume information from our main operational system:
- Customer profile information, including policy details
- Customer telematics: an app that collects driving metrics (location, speed, turning and braking habits...)
- Claims details: the claim information, including damage photos from the car crash
This information is then analyzed and returned to our operational system to act on the claim:
- Claims images are analyzed with AI to ensure consistency with the declaration
- A rule engine reviews the claim and customer profile, providing recommendations and taking automated actions
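Such a rule engine can be sketched as an ordered list of rules evaluated against the joined claim and customer record. The sketch below is illustrative only; field names like `policy_valid` and the specific rules are assumptions, not the demo's actual schema.

```python
# Minimal rule-engine sketch: each rule is a (name, predicate, action) triple
# evaluated in order against a joined claim + policy record.
# All field names and thresholds here are illustrative assumptions.

RULES = [
    ("invalid_policy", lambda c: not c["policy_valid"], "reject"),
    ("severity_mismatch", lambda c: c["reported_severity"] != c["assessed_severity"], "investigate"),
    ("over_policy_limit", lambda c: c["claim_amount"] > c["policy_limit"], "investigate"),
]

def score_claim(claim: dict) -> str:
    """Return the first matching rule's action, or approve by default."""
    for name, predicate, action in RULES:
        if predicate(claim):
            return action
    return "approve"

claim = {
    "policy_valid": True,
    "reported_severity": "minor",
    "assessed_severity": "minor",
    "claim_amount": 3200,
    "policy_limit": 10000,
}
print(score_claim(claim))  # routine claim, no rule fires -> "approve"
```

A consistent, in-limit claim falls through to automatic approval, while any mismatch routes the case to a human investigator.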
Typical Challenges in Claims Processing:
* Insurance companies have to constantly innovate to beat the competition
* Customer retention and loyalty can be a challenge, as people are always shopping for more competitive rates, leading to churn
* Fraudulent transactions can erode profit margins
* Processing Claims can be very time consuming
Required Solution:
Improve the claims management process for faster claims settlement, lower claims processing costs, and quicker identification of possible fraud. Smart Claims demonstrates automation of key components of this process on the Lakehouse, for higher operational efficiency and to aid human investigation.
How
* How can operational costs be managed so as to offer lower premiums and be competitive, yet remain profitable?
* How can customer loyalty and retention be improved to reduce churn?
* How can process efficiency be improved to reduce the response time to customers on the status/decision of their claims?
* How can funds and resources be released in a timely manner to deserving parties?
* How can suspicious activities be flagged for further investigation?
Why
* Faster approvals lead to better customer NPS scores and lower operating expenses
* Detecting and preventing fraudulent scenarios leads to a lower leakage ratio
* Improving customer satisfaction leads to a lower loss ratio
What is Claims Automation?
* Automating parts of the claims processing pipeline to reduce dependence on human personnel, especially in mundane, predictable tasks
* Augmenting existing claims data with additional info/insights to aid and expedite human investigation, e.g. recommending the next best action
* Providing greater explainability of the situation/case for better decision making in the human workflow
* Serving as a sounding board to avoid human error/bias, as well as providing an audit trail for personnel in claims roles
Smart Claims DEMO: personas involved along the steps & high-level flow:

1. Ingest various data sources to create a claims data lake. Curate the data and enrich it by joining it with additional information such as policy, telematics, accident and location data
2. Build an ML model to assess the severity of vehicle damage in an accident, and a rules engine to score the case as either a routine scenario that should be addressed quickly or an inconsistent one that requires further investigation
3. Visualize your business insights
4. Provide an easy and simple way to run the workflows periodically and securely share these insights with business stakeholders and claims investigators
1: Data Engineering - Ingesting data & building our FS Data Intelligence Platform Database

Our claims analysis requires huge volumes of data of all types (structured, semi-structured and unstructured) arriving at different velocities. For example, telematics data from a vehicle app is both high-volume and high-velocity, and mostly structured.
First, we need to ingest and transform claim, policy, party, and accident data sources to build our database.
Open the [Spark Declarative Pipelines SQL Pipeline notebook]($./01-Data-Ingestion/01.1-SDP-Ingest-Policy-Claims). This will create an SDP pipeline running in batch or streaming.
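To make the pipeline's job concrete, here is a pure-Python sketch of the kind of bronze-to-silver step it performs: keep well-formed claim records and enrich them with policy details. This is a simplified illustration only; the demo itself declares this logic as an SDP SQL pipeline, and the field names below are assumptions.

```python
# Sketch of a bronze -> silver step: drop malformed claim records and
# enrich the survivors by joining in policy data. Field names are
# illustrative; the actual demo expresses this as an SDP SQL pipeline.

raw_claims = [  # "bronze": records as they arrive from the operational system
    {"claim_id": "C1", "policy_id": "P1", "amount": 4500},
    {"claim_id": "C2", "policy_id": None, "amount": 900},  # malformed: no policy
]
policies = {"P1": {"holder": "Jane Doe", "limit": 10000}}  # policy lookup table

def to_silver(claims, policies):
    """Keep claims with a valid policy reference and join in policy details."""
    out = []
    for c in claims:
        if c["policy_id"] not in policies:  # expectation: valid policy reference
            continue
        out.append({**c, **policies[c["policy_id"]]})
    return out

silver = to_silver(raw_claims, policies)
print(silver)
```

In the real pipeline, the "valid policy reference" check would be an SDP quality expectation rather than an `if` statement.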
Security and Governance with Unity Catalog
Unity Catalog provides centralized governance of all data and AI assets on Databricks.
It not only enables easy data discovery but also protects data from unauthorized access through ACLs at the table, row, and column level, ensuring PII remains masked even for the claims investigation officer.
In addition, it provides audit trails of who accessed what data, which is critical in regulated industries.
Once our data is ingested, we can simply grant access to our data science team so they can explore the data and start building models.
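The column-masking idea can be illustrated in plain Python: privileged roles see the raw value, everyone else sees a redacted one. This is only a sketch of the concept; in practice Unity Catalog enforces this declaratively via column masks, and the role name below is hypothetical.

```python
# Conceptual sketch of column masking: privileged roles see the raw
# value, all other roles get a redacted version. Unity Catalog applies
# this declaratively; the role name here is a hypothetical example.

def mask_pii(value: str, role: str) -> str:
    """Return the full value only for privileged roles; mask it otherwise."""
    if role in {"compliance_officer"}:
        return value
    return "***" + value[-2:]  # keep last two characters for reference

print(mask_pii("555-12-9876", "claims_investigator"))  # masked
print(mask_pii("555-12-9876", "compliance_officer"))   # full value
```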
2: AI - Detect the severity of the damage from the car accident

The Lakehouse offers a wide range of modeling capabilities to cater to ML practitioners' needs, from AutoML to classical ML, deep learning and GenAI models, including end-to-end lifecycle management of these models.
Now that our data is ready and secured, let's create a model to predict the severity of the damage to the vehicle from the accident data. The intent here is to assess whether the reported severity matches the actual damage, and computer vision algorithms can aid in automating this task.
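Whatever model is used, its raw output still has to be turned into a decision. Below is a hedged sketch of that post-processing step: mapping class probabilities from a damage-classification model (not shown here) to a severity label, deferring low-confidence cases to a human. The labels and threshold are illustrative assumptions.

```python
# Sketch of post-processing a damage-classification model's output:
# pick the most likely severity label, and defer to human review when
# the model is not confident. Labels and threshold are illustrative.

SEVERITY_LABELS = ["minor", "moderate", "severe"]

def assess_severity(probs, threshold=0.6):
    """Map class probabilities to a severity label, or defer when unsure."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return "needs_human_review"
    return SEVERITY_LABELS[best]

print(assess_severity([0.1, 0.2, 0.7]))    # confident -> "severe"
print(assess_severity([0.4, 0.35, 0.25]))  # low confidence -> deferred
```

The deferral path matters in claims: a wrong automated severity is costlier than routing an ambiguous image to an investigator.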
Open the [02.1-Model-Training]($./02-Data-Science-ML/02.1-Model-Training) notebook to start training a model to recognize claims severity from images.
Once the accident severity level is automatically detected as part of our ingestion pipeline, we can apply dynamic rules to start a first level of classification and analysis. Jump to [02.3-Dynamic-Rule-Engine]($./02-Data-Science-ML/02.3-Dynamic-Rule-Engine) for more details.
3: BI - Visualize Business Insights


As a data warehouse, the platform provides reporting and dashboarding capabilities. The DB SQL dashboard is a convenient place to bring together all the key metrics and insights for consumption by different stakeholders, along with alerting capabilities, all of which leads to a faster decisioning process.
* Faster approvals lead to better customer NPS scores and lower operating expenses
* Detecting and preventing fraudulent scenarios leads to a lower leakage ratio
* Improving customer satisfaction leads to a lower loss ratio
Open the [03-BI-Data-Warehousing-Smart-Claims]($./03-BI-Data-Warehousing/03-BI-Data-Warehousing-Smart-Claims) notebook to learn more about Databricks SQL Warehouse and explore your Smart Claims dashboards: Summary Report Dashboard | Investigation Dashboard
4: Deploying and orchestrating the full workflow
While our data pipeline is almost ready, we're missing one last step: orchestrating the full workflow in production.
With Databricks Lakehouse, there is no need to manage an external orchestrator to run your job. Databricks Workflows simplifies all your jobs, with advanced alerting, monitoring, branching options etc.
Open the [05-Workflow-Orchestration]($./05-Workflow-Orchestration/05-Workflow-Orchestration-Smart-Claims) notebook to schedule or access your workflow (data ingestion, model retraining, dashboard updates, etc.)
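A multi-task workflow of this kind is ultimately just a job definition with task dependencies and a schedule. The sketch below shows an illustrative payload in the general shape accepted by the Databricks Jobs API; the job name, task keys, notebook paths and cron schedule are all hypothetical, not the demo's actual values.

```python
# Illustrative multi-task job definition in the general shape of the
# Databricks Jobs API. All names, paths and the schedule are hypothetical
# examples, not the demo's real configuration.

job = {
    "name": "smart-claims-workflow",
    "tasks": [
        {"task_key": "ingest",
         "notebook_task": {"notebook_path": "/Demos/01.1-SDP-Ingest-Policy-Claims"}},
        {"task_key": "retrain_model",
         "depends_on": [{"task_key": "ingest"}],
         "notebook_task": {"notebook_path": "/Demos/02.1-Model-Training"}},
        {"task_key": "refresh_dashboard",
         "depends_on": [{"task_key": "retrain_model"}],
         "notebook_task": {"notebook_path": "/Demos/03-BI-Data-Warehousing-Smart-Claims"}},
    ],
    "schedule": {"quartz_cron_expression": "0 0 6 * * ?", "timezone_id": "UTC"},
}

# The depends_on links make the three tasks run strictly in sequence:
order = [t["task_key"] for t in job["tasks"]]
print(order)
```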
Conclusion
Databricks Lakehouse combines the best characteristics of a data lake and a data warehouse, and rests on the tenets of openness, simplicity and multicloud support to ensure you make the most of your infrastructure and your investment in data and AI initiatives.
* It provides a unified architecture where
* All data personas work collaboratively on a single platform, contributing to a single pipeline
* All big data architecture paradigms, including streaming, ML, BI, DE and Ops, are supported on a single platform - no need to stitch services!
* End-to-end workflow pipelines are easier to create, monitor and maintain
* Multi-task workflows accommodate multiple task types (notebooks, SDP, ML tasks, SQL dashboards) and support repair-and-run and compute sharing
* SDP pipelines offer quality constraints and a faster path to flip dev workloads to production
* Robust, scalable, and fully automated via REST APIs, thereby improving team agility and productivity
* Supports all BI and AI workloads
* Models are created and managed with MLflow for easy reproducibility and auditability
* Parameterized dashboards that can access all data in the lake and can be set up in minutes