
Case Study

Enabling complex joins between datasets and faster iteration of ML models for an AI tech company

Client Background

The client is an AI technology firm that helps large enterprises integrate data from a variety of sources and build machine learning models that are both faster to develop and more precise.

Client Need

  • The application hosts hundreds and thousands of datasets (either free or paid) sourced from thousands of providers.
  • Enable enterprise users, decision scientists, and data analysts to upload their organizational datasets.
  • Facilitate joins between the uploaded transactional or non-transactional datasets and the publicly hosted datasets.
  • Execute these joins within a few seconds for a seamless user experience.

Our Solution

  • Implemented a responsive, modern frontend app for data scientists using React and Redux.
  • Designed and implemented all middle-tier services, including the APIs and data access layer, in Python with Django.
  • Wrote ANSI SQL code generator in Python that considers all user selections, connects with the metadata system, and generates the final query that runs on Snowflake.
  • Built search and recommendation systems on Neo4j that help users find features pertinent to their own uploaded datasets.
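
To illustrate the query-generation step, here is a minimal sketch of how user selections might be turned into an ANSI SQL join. It is not the client's actual generator: the `JoinSpec` class, the `build_join_query` function, and the table and column names are all hypothetical, and the metadata lookup and the Snowflake connection are left out.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical rule: only plain identifiers are accepted, so user input
# can never inject SQL through a table or column name.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_ident(name: str) -> str:
    """Validate an identifier and wrap it in ANSI double quotes."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return f'"{name}"'


@dataclass
class JoinSpec:
    """Hypothetical container for one user's join selections."""
    left_table: str
    right_table: str
    keys: list[tuple[str, str]]   # (left column, right column) pairs
    left_columns: list[str]       # columns to keep from the left table
    right_columns: list[str]      # columns to keep from the right table
    join_type: str = "INNER"


def build_join_query(spec: JoinSpec) -> str:
    """Render a JoinSpec as a single ANSI SQL SELECT statement."""
    join_type = spec.join_type.upper()
    if join_type not in {"INNER", "LEFT", "RIGHT", "FULL"}:
        raise ValueError(f"unsupported join type: {spec.join_type!r}")
    if not spec.keys:
        raise ValueError("at least one join key pair is required")

    select = [f"l.{quote_ident(c)}" for c in spec.left_columns]
    select += [f"r.{quote_ident(c)}" for c in spec.right_columns]
    if not select:
        raise ValueError("at least one output column is required")

    on = " AND ".join(
        f"l.{quote_ident(a)} = r.{quote_ident(b)}" for a, b in spec.keys
    )
    return (
        f"SELECT {', '.join(select)}\n"
        f"FROM {quote_ident(spec.left_table)} AS l\n"
        f"{join_type} JOIN {quote_ident(spec.right_table)} AS r\n"
        f"ON {on}"
    )


if __name__ == "__main__":
    # Example selections; the dataset names are made up for illustration.
    spec = JoinSpec(
        left_table="sales",
        right_table="weather",
        keys=[("zip", "zip_code")],
        left_columns=["amount"],
        right_columns=["temp"],
    )
    print(build_join_query(spec))
```

Validating identifiers before quoting them keeps the generated text safe to run even though it is assembled from user input. Double-quoted identifiers are case-sensitive in Snowflake, so a real generator would presumably take the exact names from the metadata system.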

Key Benefits

  • Snowflake executes complex joins between large datasets, including joins that apply various math functions, within seconds, even when the output runs to billions of rows.
  • Snowflake automatically creates additional clusters as the number of concurrent queries grows with the workload.
  • Data scientists spend significantly less time finding the most relevant features, so they can iterate on their models quickly and reach higher accuracy levels.

Tools & Technologies

NumPy
Django
Redux
React
AWS
Snowflake
