// case study

Summary

Objective: Onboard datasets to a new Smart Data Platform
Industry: Real Estate
Stack: Spring Boot, Amazon Elastic MapReduce (EMR), Apache Spark

Strategy

Enfuse.io pairs joined a newly formed Pipeline team to design, architect, and build the pipeline. We built the first pipeline on Hadoop, a technology the client's engineers had no experience with. By pairing with the client's engineers and embedding in their team, we were able to deliver on the vision while training the team. Soon after completing the first pipeline, we realized that the source systems were quite similar, so we proposed a design for a generic pipeline onboarding tool that would cut the time to develop and onboard new data from weeks or months to hours or days.

Context

The client, a real estate data broker, held data in disparate source systems. There was no way to join these data sources, either within the organization or with any product on the market, so they approached us to build a Smart Data Platform where data could be onboarded, enriched, and consumed. The initial effort was to extract, transform, and load three data sources.

Results

The client processes more than 1.5 TB of data daily and can quickly enrich that data with new data assets as they become available, yielding rich and precise inputs for ML and AI models.