About

Over 15+ years of building distributed data platforms that processed multiple exabytes of data, I spent a lot of time wading through docs, obscure blogs, and trial and error just to identify best practices and real trade-offs.

Most learning resources lacked depth, were optimized for SEO, or were vendor-funded articles that only ever covered the positives.

So I started Start Data Engineering to fix that: high-quality, project-based content that’s affordable and actually prepares you for the work. Over 20,000 engineers have used these guides to level up their skills.

Everything here is built around three things:

  1. Content you can act on immediately, not just read and forget
  2. Code-first lessons that force you to learn by doing
  3. Honest trade-off discussions so you know when to use what you learn

Questions or feedback? I read every email, so reach out anytime.
