2400 Meadowbrook Parkway, Duluth, GA 30096 | +1 770-493-5588

Case Study

Designed and Developed Data Lake Platform using AWS S3 and Apache Spark

About the Client

A big data platform that centralizes data from different sources, including payers, providers, and clinicians.

Client Need

  • Had acquired more than 20 companies over the last decade, leaving multiple products with the same use case running in siloed environments and data replicated in multiple locations
  • Required centralized data storage for data consumption and analysis
  • Wanted to accelerate business growth by building new products and insights and by enabling AI and Machine Learning capabilities

Our Solution

  • Designed and developed a Data Lake Platform using AWS S3 and Apache Spark to ingest and process millions of transactions across data types such as Claims, Payments, Eligibility, Clinical, and Imaging
  • Developed a Pipeline Development Kit that makes it easy to create data pipelines and quickly onboard tenants onto the platform
  • Created an Orchestration Development Kit using Apache Airflow for scheduling AWS EMR data pipelines
  • Developed generic data pipelines that extract and store data so end users can search and retrieve their respective healthcare transactions using Elasticsearch
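
As a rough illustration of how a pipeline kit might turn tenant onboarding into a matter of configuration, the sketch below derives S3 locations for each tenant and data type from a small declarative spec. Everything in it is hypothetical: the `PipelineSpec` class, the `onboard_tenant` helper, the `raw` zone name, the bucket name, and the key layout are assumptions made for illustration, not the platform's actual design.

```python
from dataclasses import dataclass
from datetime import date

# Data types named in the case study; the lowercase identifiers are assumed.
DATA_TYPES = ("claims", "payments", "eligibility", "clinical", "imaging")


@dataclass(frozen=True)
class PipelineSpec:
    """Hypothetical declarative description of one tenant's ingest pipeline."""

    tenant: str
    data_type: str
    bucket: str = "example-data-lake"  # assumed bucket name

    def __post_init__(self):
        # Reject data types the platform does not know about.
        if self.data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {self.data_type}")

    def s3_uri(self, zone: str, day: date) -> str:
        """Build a date-partitioned S3 URI for the given zone (assumed layout)."""
        return (
            f"s3://{self.bucket}/{zone}/tenant={self.tenant}/"
            f"type={self.data_type}/dt={day.isoformat()}/"
        )


def onboard_tenant(tenant: str, data_types=DATA_TYPES) -> list[PipelineSpec]:
    """Onboarding a tenant reduces to generating one spec per data type."""
    return [PipelineSpec(tenant, dt) for dt in data_types]


specs = onboard_tenant("acme-health", ["claims", "eligibility"])
for spec in specs:
    print(spec.s3_uri("raw", date(2024, 1, 15)))
```

In a layout like this, a Spark job could read from and write to the generated URIs, so adding a tenant would not require new pipeline code.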


  • Diverse and Ubiquitous Data amounting to 4 Petabytes of Cross-Enterprise Financial, Operational, and Clinical Data
  • Cost Savings of $400K annually by using S3 intelligent storage tier options and creating object lifecycle rules
  • Build Once – Use Multiple Times Framework – Operational Efficiency, Rapid Development
  • Faster Onboarding – Reduced Time to Market, New Growth Opportunities
  • Foundation for Integrated Products, Linked Cross Functional Data and enablement of Machine Learning and AI
  • Authoritative source of Large Data Sets
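
The storage cost savings listed above came from S3 storage tiers and object lifecycle rules. A minimal sketch of what such a rule could look like follows, assuming the "intelligent storage tier" refers to the S3 Intelligent-Tiering storage class. The `build_lifecycle_rules` helper, the `raw/` prefix, the bucket name, and the day counts are hypothetical; the dictionary follows the shape accepted by boto3's `put_bucket_lifecycle_configuration`, which is shown only in a comment so the sketch runs without AWS credentials.

```python
import json


def build_lifecycle_rules(prefix: str, tiering_after_days: int, expire_after_days: int) -> dict:
    """Return a lifecycle configuration dict in the shape the S3 API expects."""
    if expire_after_days <= tiering_after_days:
        raise ValueError("expiration must come after the storage class transition")
    return {
        "Rules": [
            {
                "ID": f"tier-and-expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to Intelligent-Tiering after the given number of days.
                "Transitions": [
                    {"Days": tiering_after_days, "StorageClass": "INTELLIGENT_TIERING"}
                ],
                # Delete objects once they reach the (assumed) retention period.
                "Expiration": {"Days": expire_after_days},
                # Clean up abandoned multipart uploads, which otherwise accrue cost.
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    }


config = build_lifecycle_rules("raw/", tiering_after_days=30, expire_after_days=365)
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, this could be applied with:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-data-lake", LifecycleConfiguration=config
# )
```

The actual prefixes, transition ages, and retention periods would depend on access patterns and compliance requirements that the case study does not describe.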

Tools & Technology

Amazon S3 Bucket
Apache Airflow
DevOps
HashiCorp Terraform
