Data Pipeline Development Services for Robotics
Geniusee's data science department assisted a product robotics company with developing and implementing a data pipeline solution. Our client specializes in the robotics industry, providing a wide range of services: the robots themselves, warehouses as a service, data pipelines, and real-time data streaming into data lakes. This whole digital transformation offering is aimed at optimizing other companies' processes.
Head of business operations
Data and robotics company
Geniusee delivers high-quality products and sets reasonable project timelines to ensure they deliver to the client's requirements. Their team of engineers is excellent and rectifies bugs quickly. The partners communicate regularly — Geniusee is available 24/7 and uses Jira to track project progress.
Business context:
Our client reached out because they needed to improve the data processing system used by their robots. Our data engineering expertise allowed us to offer a data pipeline solution that collects information in real time into a data warehouse and allows robots and resources to be utilized more efficiently.
Key challenges:
- Ensure data quality and build a data collection process that runs without delays, just in time;
- Ensure the completeness of gathered information;
- Improve system stability;
- Build a cost-effective cloud solution that can collect 10 GB of data per second;
- Tailor infrastructure that performs independently of data volume.
Work approach
Batch processing
We processed data in batches. Batching comes in different sizes: mini-batches, which can contain just a couple of samples, or full batches, which can cover days of data.
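As a rough illustration (not the client's actual code), the sketch below applies the same aggregation to mini-batches of a few samples and to one full batch; the telemetry fields and batch size are assumptions.

```python
from typing import Iterable

def process_batch(records: Iterable[dict]) -> dict:
    """Aggregate one batch of robot telemetry records (hypothetical schema)."""
    records = list(records)
    return {
        "count": len(records),
        "avg_payload_kg": sum(r.get("payload_kg", 0.0) for r in records) / max(len(records), 1),
    }

def mini_batches(records: list[dict], size: int = 2):
    """Yield small batches of a couple of samples each."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

telemetry = [{"payload_kg": 1.2}, {"payload_kg": 0.8}, {"payload_kg": 1.5}, {"payload_kg": 1.1}]

# Mini-batching: a couple of samples at a time.
for batch in mini_batches(telemetry):
    print(process_batch(batch))

# Full batching: everything collected over a longer period (e.g. days) at once.
print(process_batch(telemetry))
```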
Streaming
Stream processing handles data sample by sample as it arrives. Instead of building up periodic backlogs, the system processes everything in real time.
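For contrast, here is a minimal stream-processing sketch built around a Kafka consumer (Kafka appears in the integrations list below); the topic name, broker address, and event fields are illustrative assumptions, and the actual client library may differ.

```python
import json

from kafka import KafkaConsumer  # kafka-python; one of several possible client libraries

# Topic and broker are placeholders for illustration only.
consumer = KafkaConsumer(
    "robot-telemetry",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

# Each record is handled the moment it arrives, so no backlog builds up.
for message in consumer:
    event = message.value
    print(f"processed event from {event.get('device_id')} at offset {message.offset}")
```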
Experimental approach
Working with clients who create emerging technologies pushes us to explore new horizons. As a result of our collaboration, we implemented some fresh ideas.
Constant self-learning
This project demanded up-to-date data science technologies in a rapidly changing market. As the landscape reshaped itself, we had to adapt quickly to meet our client's needs.
Process
On this project, our team provided data pipeline development services: data collected in real time is processed into various data lakes, then analyzed and presented as metrics and dashboards. The complexity of the work came from the multiple data sources involved: different IoT devices and additional tools such as barcode scanners, CRM systems, products, and third-party services.
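As an illustration of the landing step only, a raw event from any source can be written into a partitioned area of the data lake before analysis; the S3-compatible store, bucket name, partition layout, and event fields below are all assumptions.

```python
import json
from datetime import datetime, timezone

import boto3  # assumes an S3-compatible data lake; the actual store may differ

s3 = boto3.client("s3")
BUCKET = "robotics-data-lake"  # illustrative bucket name

def land_event(source: str, event: dict) -> None:
    """Write one raw event into a source- and date-partitioned area of the data lake."""
    now = datetime.now(timezone.utc)
    key = f"raw/{source}/dt={now:%Y-%m-%d}/{now:%H%M%S%f}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(event).encode("utf-8"))

# Events can come from IoT devices, barcode scanners, CRM systems, or third-party services.
land_event("barcode-scanner", {"sku": "A-1042", "warehouse": "W3"})
land_event("iot-device", {"device_id": "robot-17", "battery_pct": 84})
```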
Project Tech Stack
As our client serves traders as end customers, we had to choose only fast and secure technologies for warehousing robotics data pipelines. To satisfy those criteria, we decided to use cloud capabilities and microservices solutions, ensuring high performance and efficient resource allocation.
List of technologies:
Geniusee Team
As the project centered on data engineering and required significant expertise to create the pipelines, we decided to staff it with senior specialists only, with our Head of Data Science and Data Engineering as the lead.
Product team
Development team
- Backend engineers
- DevOps engineers
- Tech lead
- UI/UX designer
Features
System monitoring
We implemented a continuous monitoring service that regularly checks several data metrics, such as the time of the last data extraction, to verify that all system components work correctly. If the monitoring system detects a malfunction, it sends an automated notification via multiple communication channels such as e-mail or Slack.
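One such check, sketched below with an assumed lag threshold and a placeholder Slack webhook URL, compares the time of the last extraction against an allowed maximum and posts an alert when it is exceeded; the real service covers more metrics and channels.

```python
from datetime import datetime, timedelta, timezone

import requests  # Slack incoming webhook; e-mail notifications would go through SMTP instead

SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder
MAX_EXTRACTION_LAG = timedelta(minutes=15)  # illustrative threshold

def check_last_extraction(last_extraction_at: datetime) -> None:
    """Send a Slack alert if the most recent data extraction is older than the allowed lag."""
    lag = datetime.now(timezone.utc) - last_extraction_at
    if lag > MAX_EXTRACTION_LAG:
        requests.post(
            SLACK_WEBHOOK_URL,
            json={"text": f"Data extraction is {lag} behind (threshold {MAX_EXTRACTION_LAG})."},
            timeout=10,
        )
```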
System of automated quality control
We based this system on a batch data processing job. Records from the original source are counted, including empty records from data providers, and compared against production counts over different periods. Once the collected information is verified, the data analytics stage begins.
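The core of the check can be pictured as a per-period count comparison; the record schema and period granularity below are assumptions.

```python
from collections import Counter

def count_by_period(records: list[dict]) -> Counter:
    """Count records per period (e.g. per day); empty records still count toward the total."""
    return Counter(r.get("period", "unknown") for r in records)

def find_mismatches(source_records: list[dict], production_records: list[dict]) -> dict:
    """Return periods where the production count differs from the source count."""
    source_counts = count_by_period(source_records)
    prod_counts = count_by_period(production_records)
    return {
        period: (source_counts[period], prod_counts.get(period, 0))
        for period in source_counts
        if source_counts[period] != prod_counts.get(period, 0)
    }

# Any mismatch blocks the analytics stage until the discrepancy is resolved.
print(find_mismatches(
    [{"period": "2024-05-01"}, {"period": "2024-05-01"}, {"period": "2024-05-02"}],
    [{"period": "2024-05-01"}, {"period": "2024-05-02"}],
))  # {'2024-05-01': (2, 1)}
```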
Microservices infrastructure
Software development using microservices involves creating small, independent services that communicate through well-defined APIs. Small, self-contained teams are responsible for these services. Unlike a traditional monolithic architecture, microservices are separate units that can be developed, updated, deployed, and scaled independently, so software can be updated more frequently, improving reliability, uptime, and performance.
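As a toy sketch of one such unit (FastAPI is an assumption, and the endpoints and data are placeholders), each service exposes a small, well-defined API that monitoring and other services call instead of reaching into its internals.

```python
from fastapi import FastAPI  # hypothetical choice of web framework

app = FastAPI(title="extraction-service")

@app.get("/health")
def health() -> dict:
    """Health probe used by monitoring and by other services."""
    return {"status": "ok"}

@app.get("/extractions/latest")
def latest_extraction() -> dict:
    """Public API other services call; placeholder data for illustration."""
    return {"extracted_at": "2024-05-01T12:00:00Z", "records": 128}
```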
Third-party integrations
- Kafka
- CRM systems
- In-house databases
- IoT devices
Data pipeline alerts
System alerts are triggered if the data pipeline is down, stuck, or receiving no data. This prevents a complete shutdown and keeps the system stable.
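The three alert conditions can be read as simple rules over heartbeat and record timestamps; the thresholds and field names below are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HEARTBEAT_TIMEOUT = timedelta(minutes=5)   # illustrative thresholds
STALE_AFTER = timedelta(minutes=30)

def pipeline_alert(last_heartbeat: Optional[datetime],
                   last_record_at: Optional[datetime],
                   records_in_window: int) -> Optional[str]:
    """Return an alert reason, or None when the pipeline looks healthy."""
    now = datetime.now(timezone.utc)
    if last_heartbeat is None or now - last_heartbeat > HEARTBEAT_TIMEOUT:
        return "pipeline is down"       # the process stopped reporting at all
    if last_record_at is None or now - last_record_at > STALE_AFTER:
        return "pipeline is stuck"      # running, but not producing new records
    if records_in_window == 0:
        return "pipeline has no data"   # sources stopped sending anything
    return None
```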