Case Study

Enabling complex joins between datasets and faster iteration of ML models for an AI tech company

Client Background

The client is an AI technology firm that helps large enterprises integrate data from a variety of sources and build their machine learning models faster and with greater precision.

Client Need

  • The application hosts hundreds and thousands of datasets (either free or paid) sourced from thousands of providers.
  • Enable the enterprise users, decision scientists, and data analysts to upload their organizational datasets.
  • Facilitate joins between the uploaded transactional/non-transactional datasets and the other publicly hosted datasets.
  • These joins should be executed within a few seconds for a seamless user experience.

Our Solution

  • Implemented a responsive, modern frontend app for data scientists using React and Redux.
  • Designed and implemented all middle-tier services, including the APIs and the data access layer, in Python with Django.
  • Wrote an ANSI SQL code generator in Python that takes all user selections into account, connects with the metadata system, and generates the final query that runs on Snowflake.
  • Built search and recommendation systems on Neo4j that help users find features pertinent to their own uploaded datasets.
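
To illustrate the SQL code generator described above, here is a minimal sketch of how user selections might be turned into an ANSI join query. It is not the client's implementation: the `JoinSpec` structure, the `build_join_query` function, and the in-memory `metadata` dictionary (standing in for the metadata system) are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the metadata system: table name -> known columns.
Metadata = dict[str, set[str]]

ALLOWED_JOIN_TYPES = {"INNER", "LEFT", "RIGHT", "FULL"}


@dataclass
class JoinSpec:
    """One user-selected join between two datasets (hypothetical structure)."""
    left_table: str
    right_table: str
    left_key: str
    right_key: str
    join_type: str = "INNER"


def quote_ident(name: str) -> str:
    """Quote an identifier ANSI-style, doubling any embedded double quotes.

    Note: Snowflake treats double-quoted identifiers as case-sensitive.
    """
    return '"' + name.replace('"', '""') + '"'


def qualified(table: str, column: str) -> str:
    """Return a quoted table.column reference."""
    return f"{quote_ident(table)}.{quote_ident(column)}"


def build_join_query(base_table: str,
                     columns: list[tuple[str, str]],
                     joins: list[JoinSpec],
                     metadata: Metadata) -> str:
    """Build a SELECT ... JOIN statement from validated user selections."""

    def check(table: str, column: str) -> None:
        # Reject anything the metadata does not know about.
        if column not in metadata.get(table, set()):
            raise ValueError(f"unknown column {table}.{column}")

    if not columns:
        raise ValueError("select at least one column")
    for table, column in columns:
        check(table, column)

    select_list = ", ".join(qualified(t, c) for t, c in columns)
    lines = [f"SELECT {select_list}", f"FROM {quote_ident(base_table)}"]

    for join in joins:
        join_type = join.join_type.upper()
        if join_type not in ALLOWED_JOIN_TYPES:
            raise ValueError(f"unsupported join type {join.join_type!r}")
        check(join.left_table, join.left_key)
        check(join.right_table, join.right_key)
        lines.append(
            f"{join_type} JOIN {quote_ident(join.right_table)} "
            f"ON {qualified(join.left_table, join.left_key)} = "
            f"{qualified(join.right_table, join.right_key)}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical example data: an uploaded "orders" dataset joined to a
    # hosted "weather" dataset on a shared "zip" column.
    metadata = {"orders": {"order_id", "zip"}, "weather": {"zip", "temp_c"}}
    print(build_join_query(
        "orders",
        [("orders", "order_id"), ("weather", "temp_c")],
        [JoinSpec("orders", "weather", "zip", "zip", "LEFT")],
        metadata,
    ))
```

Validating every table and column against metadata before emitting SQL, and quoting identifiers rather than interpolating them raw, keeps user selections from producing malformed or unsafe queries. A production generator would presumably also handle filters, aggregations, and the math functions applied across joined datasets.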

Tools & Technologies

Numpy, Django, Redux, React, AWS, Snowflake

Key Benefits

  • Snowflake runs complex joins between large datasets, including joins that apply various math functions, within seconds, even when the output contains billions of rows.
  • Snowflake automatically adds clusters as the number of concurrent queries grows with the workload.
  • Data scientists can iterate on their models quickly and reach higher accuracy levels, since they now save a significant amount of time finding the most relevant features.