Real-Time Financial Data Pipeline
Kafka → Spark Streaming → Bronze/Silver/Gold → Analytics
- Handled late-arriving and out-of-order events with event-time processing
- Prevented duplicate records with idempotent writes
- Recovered safely from failures via checkpoints
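The three techniques above can be illustrated with a minimal, library-free sketch. It is not the project's actual Spark code: the `Event` fields, the `IdempotentSink` class, and the 60-second lateness budget are all hypothetical names and values chosen for illustration. The sketch tracks a watermark from event time, upserts by a unique key so replays cannot create duplicates, and restores state from a checkpoint before replaying.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    event_id: str    # hypothetical unique key per event
    event_time: int  # when the event happened (epoch seconds), not when it arrived
    amount: float


@dataclass
class IdempotentSink:
    allowed_lateness: int = 60  # assumed lateness budget, in seconds
    max_event_time: int = 0     # highest event time seen so far
    rows: dict = field(default_factory=dict)
    dropped: list = field(default_factory=list)

    def watermark(self) -> int:
        # Events older than this are considered too late to accept.
        return self.max_event_time - self.allowed_lateness

    def write(self, event: Event) -> bool:
        if event.event_time < self.watermark():
            self.dropped.append(event.event_id)
            return False
        self.max_event_time = max(self.max_event_time, event.event_time)
        # Upsert keyed by event_id: writing the same event twice is a no-op.
        self.rows[event.event_id] = event
        return True

    def checkpoint(self) -> dict:
        # Copy the state so later writes do not mutate the saved checkpoint.
        return {
            "max_event_time": self.max_event_time,
            "rows": dict(self.rows),
            "dropped": list(self.dropped),
        }

    @classmethod
    def restore(cls, state: dict, allowed_lateness: int = 60) -> "IdempotentSink":
        return cls(
            allowed_lateness,
            state["max_event_time"],
            dict(state["rows"]),
            list(state["dropped"]),
        )


def replay_demo() -> IdempotentSink:
    events = [
        Event("a", 100, 1.0),
        Event("b", 160, 2.0),
        Event("c", 120, 3.0),  # out of order, but within the lateness budget
        Event("b", 160, 2.0),  # duplicate delivery
        Event("d", 90, 4.0),   # older than the watermark, so it is dropped
    ]
    sink = IdempotentSink()
    for event in events[:3]:
        sink.write(event)
    saved = sink.checkpoint()

    # Simulate a crash: restore from the checkpoint and replay with overlap.
    recovered = IdempotentSink.restore(saved)
    for event in events[2:]:
        recovered.write(event)
    return recovered
```

Replaying event `c` after the restore overlaps with what was already written, which is the situation idempotent writes are meant to make harmless.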
Batch • Streaming • Reliability
Data Engineer with 6 months of internship experience at Intuit. I focus on upstream → downstream systems, debugging, data quality, and reliability.
Kafka → Spark Streaming → Bronze/Silver/Gold → Analytics
Multiple sources → Incremental ingestion → Spark → Warehouse
Quality checks → Metrics → Dashboards → Alerts
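As a rough illustration of the quality checks → metrics → alerts flow, the sketch below computes two simple metrics over a batch of rows and flags any that exceed a threshold. The function names, column names, and threshold values are assumptions made for this example, not details taken from the original pipeline.

```python
def quality_metrics(rows, key, required):
    """Compute row count, null rate, and duplicate rate for a batch of dict rows."""
    total = len(rows)
    if total == 0:
        return {"row_count": 0, "null_rate": 0.0, "duplicate_rate": 0.0}
    # A row counts as null if any required column is missing or None.
    null_rows = sum(1 for r in rows if any(r.get(c) is None for c in required))
    # Duplicates are rows whose key value was already seen in this batch.
    distinct_keys = len({r.get(key) for r in rows})
    return {
        "row_count": total,
        "null_rate": null_rows / total,
        "duplicate_rate": (total - distinct_keys) / total,
    }


def breached(metrics, thresholds):
    """Return the names of metrics that exceed their threshold, in sorted order."""
    return sorted(name for name, limit in thresholds.items() if metrics[name] > limit)


# Hypothetical batch: one row has a null amount, one repeats an id.
batch = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": 7.5},
    {"id": 3, "amount": 7.5},
]
metrics = quality_metrics(batch, key="id", required=["id", "amount"])
alerts = breached(metrics, {"null_rate": 0.1, "duplicate_rate": 0.5})
```

In a setup like this, the metrics dictionary would typically feed a dashboard, while the `alerts` list would drive notifications.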
Python, SQL, Spark, Airflow, Kafka
AWS (S3, Athena), Data Warehousing, Data Modeling, Data Quality