
Case Study

Implemented an automated solution for resource configuration, deployment, and scheduling

Client Background

The client is an American media measurement and analytics company that provides marketing data and analytics to enterprises, media and advertising agencies, and publishers.

Client Need

  • Needed consultation on evaluating tools and approaches for cloud adoption. The objective was to offload computing from the existing, outdated on-premises MapR cluster to the cloud.
  • Needed a solution custom-built for their live data (the largest module) to support evaluation and decision-making.
  • Needed an automated solution for resource configuration, deployment, scheduling, and scaling.
  • Needed the ability to process incoming incremental data (10 TB or more) more efficiently.

Our Solution

  • Provided a cloud-optimized, on-demand spin-up solution for offloading computation, together with a Snowflake-based reporting solution.
  • Each week, 5 TB or more of data is extracted from the on-premises MapR cluster and placed in S3 by shell scripts and the AWS CLI, run as Airflow jobs.
  • Based on the size of the data copied over, an AWS EMR cluster is spun up using CloudFormation templates and the AWS CLI to execute Spark and Pig scripts.
  • Processed output from EMR is pushed to S3 buckets for persistence.
  • The AWS EMR cluster has auto-scaling enabled and is terminated once processing completes.
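
The size-based spin-up and teardown described above could be sketched roughly as follows. This is an illustrative sketch only, not the delivered implementation: the sizing rule (`TB_PER_CORE_NODE`, `MIN_CORE_NODES`, `MAX_CORE_NODES`), the stack and template names, and the `CoreInstanceCount` template parameter are all hypothetical. The only real interfaces assumed are the `aws cloudformation deploy` and `aws cloudformation delete-stack` CLI commands.

```python
import math
import shlex

# Hypothetical sizing rule; none of these numbers come from the case study.
TB_PER_CORE_NODE = 0.5   # assumed data volume one core node can handle
MIN_CORE_NODES = 2
MAX_CORE_NODES = 40


def core_nodes_for(data_size_tb: float) -> int:
    """Pick an initial core-node count from the size of the copied data."""
    if data_size_tb <= 0:
        raise ValueError("data_size_tb must be positive")
    wanted = math.ceil(data_size_tb / TB_PER_CORE_NODE)
    # Clamp to the assumed lower and upper bounds.
    return max(MIN_CORE_NODES, min(MAX_CORE_NODES, wanted))


def deploy_command(stack_name: str, template_file: str, data_size_tb: float) -> list[str]:
    """Build the CloudFormation call that would create the EMR stack."""
    return [
        "aws", "cloudformation", "deploy",
        "--stack-name", stack_name,
        "--template-file", template_file,
        # CoreInstanceCount is a hypothetical parameter of the template.
        "--parameter-overrides",
        f"CoreInstanceCount={core_nodes_for(data_size_tb)}",
    ]


def teardown_command(stack_name: str) -> list[str]:
    """Build the call that would remove the stack once processing is done."""
    return ["aws", "cloudformation", "delete-stack", "--stack-name", stack_name]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    print(shlex.join(deploy_command("weekly-emr", "emr-cluster.yaml", 5.0)))
    print(shlex.join(teardown_command("weekly-emr")))
```

In a setup like the one described, a scheduler such as Airflow would run the deploy command after the weekly S3 upload and the teardown command after the Spark and Pig steps finish, so the cluster only exists while there is work to do.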

Key Benefits

  • Provided a cost-efficient, on-demand solution for computation on the AWS platform.
  • Added value by recommending the best-suited resource types and configurations for a cost-efficient, optimal solution.
  • Offloaded jobs that took 48 hours on the on-premises servers to the cloud, where they complete within 24 hours.

Tools & Technologies

Amazon S3
Apache Pig
Apache Spark
AWS CloudFormation
Amazon EMR
MapR
Apache Airflow
Python
R
PowerShell
Snowflake
Bash
