Tri Gunawan, Developer

Business Automation Architect — 12+ years building ERP, AI-driven platforms, and enterprise solutions that deliver measurable ROI.

© 2026 Tri Gunawan. All rights reserved.

Data Engineering
10 min read | 1 Data Engineer Lead, 2 Data Engineers, 1 Analytics Engineer

Modern Data Platform

Building a scalable data engineering pipeline processing 1B+ events daily with Airflow, dbt, and ClickHouse.

Overview

We built a modern data platform that ingests data from 20+ sources, processes more than a billion events daily, and provides sub-second analytics for business users. The platform enables self-service analytics while maintaining data quality and governance.

Challenges

  • Data siloed across multiple systems (ERP, CRM, E-commerce)
  • No single source of truth for business metrics
  • Slow query performance on large datasets
  • Manual ETL processes prone to errors
  • Limited visibility into data quality issues

Solutions

  • Debezium for real-time change data capture (CDC) from PostgreSQL
  • Apache Airflow for workflow orchestration
  • dbt for the transformation layer, with 97 version-controlled models
  • ClickHouse for real-time analytics
  • Data quality monitoring with Great Expectations
  • Self-service BI with Metabase

Implementation

Data Ingestion Layer

Set up Debezium for real-time change data capture (CDC) from PostgreSQL. Built connectors for databases, APIs, and file sources.
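
As an illustration of the ingestion setup, here is a minimal sketch of a registration payload for Debezium's PostgreSQL connector. It assumes Debezium runs on Kafka Connect, which the case study does not state. The function name, host, credentials, table names, and topic prefix are all hypothetical; the property keys are the ones Debezium's PostgreSQL connector documents.

```python
import json


def build_postgres_cdc_connector(name, host, dbname, user, password,
                                 tables, topic_prefix, port=5432):
    """Return a Kafka Connect registration payload for Debezium's PostgreSQL connector.

    All argument values are supplied by the caller; nothing here reflects
    the actual configuration used on the project.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
            # pgoutput is PostgreSQL's built-in logical decoding plugin
            "plugin.name": "pgoutput",
            "database.hostname": host,
            "database.port": str(port),
            "database.user": user,
            "database.password": password,
            "database.dbname": dbname,
            # Debezium 2.x name; older releases used "database.server.name"
            "topic.prefix": topic_prefix,
            # Comma-separated schema.table names to capture
            "table.include.list": ",".join(tables),
        },
    }


# Hypothetical values, for illustration only
payload = build_postgres_cdc_connector(
    name="erp-cdc",
    host="db.internal",
    dbname="erp",
    user="cdc_user",
    password="change-me",
    tables=["public.orders", "public.customers"],
    topic_prefix="erp",
)
print(json.dumps(payload, indent=2))
```

Under that assumption, the payload would be sent as JSON to the Kafka Connect REST API (POST /connectors), and in practice the password would come from a secret store rather than a literal.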

Orchestration & ETL

Deployed Airflow with custom operators. Created 100+ DAGs for various data pipelines.
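
The core idea behind an orchestrated pipeline is that tasks run in dependency order and transient failures are retried. The sketch below shows that idea with the Python standard library only. It is not Airflow code and not the project's DAGs; the task names and the run_pipeline helper are hypothetical.

```python
import time
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies, max_retries=2, retry_delay=0.0):
    """Run zero-argument callables in dependency order, retrying failures.

    tasks:        dict mapping task name -> callable
    dependencies: dict mapping task name -> set of upstream task names
    Returns the task names in the order they completed.
    """
    graph = {name: dependencies.get(name, set()) for name in tasks}
    completed = []
    # static_order() yields every task after all of its upstream tasks
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(max_retries + 1):
            try:
                tasks[name]()
                break
            except Exception:
                if attempt == max_retries:
                    raise  # retries exhausted: surface the failure
                time.sleep(retry_delay)
        completed.append(name)
    return completed


# Hypothetical three-step pipeline where the transform fails once
attempts = {"transform": 0}


def extract():
    pass


def transform():
    attempts["transform"] += 1
    if attempts["transform"] == 1:
        raise RuntimeError("transient failure")


def load():
    pass


order = run_pipeline(
    {"extract": extract, "transform": transform, "load": load},
    {"transform": {"extract"}, "load": {"transform"}},
)
print(order)  # ['extract', 'transform', 'load']
```

An orchestrator such as Airflow adds scheduling, distributed execution, and alerting on top of this basic ordering and retry behavior.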

Transformation Layer

Implemented dbt models with testing and documentation. Created standardized metrics definitions.
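
A dbt data test is a query that selects failing rows, and the test passes when the query returns none. The sketch below mirrors the shape of dbt's built-in not_null and unique tests, using an in-memory SQLite database so it can run anywhere. The table fct_orders, its columns, and the failing_rows helper are hypothetical, not taken from the project's 97 models.

```python
import sqlite3

# Queries that return the rows violating each rule (zero rows means "pass")
NOT_NULL_SQL = "SELECT * FROM {table} WHERE {column} IS NULL"
UNIQUE_SQL = (
    "SELECT {column}, COUNT(*) AS n FROM {table} "
    "WHERE {column} IS NOT NULL "
    "GROUP BY {column} HAVING COUNT(*) > 1"
)


def failing_rows(conn, template, table, column):
    """Run one test query and return the rows that violate the rule."""
    return conn.execute(template.format(table=table, column=column)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fct_orders (order_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO fct_orders VALUES (?, ?)",
    [(1, 10.0), (2, 25.5), (2, 25.5), (None, 7.0)],  # one duplicate, one NULL
)

null_failures = failing_rows(conn, NOT_NULL_SQL, "fct_orders", "order_id")
duplicate_failures = failing_rows(conn, UNIQUE_SQL, "fct_orders", "order_id")

print("not_null failures:", len(null_failures))
print("unique failures:", duplicate_failures)
```

In a dbt project these checks would normally be declared on the column in a YAML properties file rather than written by hand; the point here is only what the declared tests verify.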

Analytics Layer

Deployed ClickHouse cluster for OLAP queries. Built materialized views for common aggregations.
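
In ClickHouse, a materialized view runs its SELECT over each newly inserted block and writes the result to a target table, so common aggregates are maintained at insert time instead of being recomputed per query. The sketch below simulates that behavior in plain Python. The class names and the (day, event_type) key are hypothetical, and the simulation merges partial results immediately, whereas ClickHouse merges them in the background.

```python
from collections import defaultdict


def aggregate_block(events):
    """Pre-aggregate one inserted block, like the SELECT of a materialized view."""
    partial = defaultdict(int)
    for event in events:
        partial[(event["day"], event["event_type"])] += 1
    return partial


class SummingTarget:
    """Stand-in for an aggregate table that sums rows sharing the same key."""

    def __init__(self):
        self.rows = defaultdict(int)

    def merge_block(self, partial):
        for key, count in partial.items():
            self.rows[key] += count


class RawEvents:
    """Stand-in for the raw events table with attached materialized views."""

    def __init__(self, views):
        self.rows = []
        self.views = views

    def insert(self, events):
        self.rows.extend(events)
        # Each view sees only the new block, never the full table
        for target in self.views:
            target.merge_block(aggregate_block(events))


daily_counts = SummingTarget()
raw = RawEvents(views=[daily_counts])

raw.insert([
    {"day": "2026-01-01", "event_type": "order"},
    {"day": "2026-01-01", "event_type": "order"},
    {"day": "2026-01-01", "event_type": "refund"},
])
raw.insert([{"day": "2026-01-01", "event_type": "order"}])

print(daily_counts.rows[("2026-01-01", "order")])  # 3
```

Dashboards then read the small aggregate table rather than scanning raw events, which is the usual reason to build such views for common aggregations.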

Results

  • Daily Events: 1B+ (processed reliably with automatic scaling)
  • Query Latency: <100 ms (P99 on complex analytical queries)
  • Data Freshness: <5 min (from source to analytics-ready)
  • Pipeline Reliability: 99.9% (with automatic retry and alerting)
  • Cost Reduction: -40% (compared to cloud data warehouse)

Tech Stack

  • Debezium CDC
  • Apache Airflow
  • dbt Core
  • ClickHouse
  • PostgreSQL
  • Python
  • Great Expectations
  • Metabase
  • Docker
  • Dokploy

Timeline: 6 months

Team: 1 Data Engineer Lead, 2 Data Engineers, 1 Analytics Engineer