Kafka Backfill Playbook: Accessing Historical Data

Event-driven architectures with Kafka have become a standard way of building modern microservices. At first, everything works smoothly - services communicate via events, state is rebuilt from event streams, and the system scales well. But as your data grows, you face an inevitable challenge: what happens when you need to access historical events that are no longer in Kafka?

1. The Problem: Finite Retention & The Need for Backfills

In a perfect world, we would keep every event log in Kafka forever. In the real world, however, storing an ever-growing history on high-performance broker disks is prohibitively expensive. ...

September 25, 2025 · 8 min · 1594 words · Nejc Korasa

Stream unzip files in S3 with Java

I’ve been spending a lot of time with AWS S3 recently, building data pipelines, and have run into a surprisingly non-trivial challenge: unzipping files in an S3 bucket. A few minutes with Google and StackOverflow made it clear that many others have faced the same issue. I’ll explain a few options for handling the unzipping, as well as the end solution, which led me to build nejckorasa/s3-stream-unzip. To sum up: S3 does not support unzipping files in place, and the AWS SDK has no built-in unzip API. To unzip, you therefore need to download the files from S3, unzip them, and upload the decompressed files back. ...

October 22, 2022 · 4 min · 761 words · Nejc Korasa