10 Skills to Ace Your Data Engineering Interviews

Oct 11, 2021 · 6 min read

Preparing for a data engineering interview and overwhelmed by all the tools and concepts? Then this post is for you. In this post, we go over the most common tools and concepts you need to know to ace your data engineering interviews.

What is a staging area?

Oct 5, 2021 · 3 min read

Wondering what a staging area is and why you need one for your data pipelines? Then this post is for you. In this post, we go over what exactly a staging area is and why it is crucial for data pipelines.

What is a Data Warehouse?

Oct 3, 2021 · 5 min read

Unclear what a data warehouse is or when to use one? Then this post is for you. In this post, we go over what a data warehouse is, why you need one, and the differences between using an OLTP database and an OLAP database as a data warehouse.

How to Scale Your Data Pipelines

Sep 16, 2021 · 4 min read

Confused by all the tools and frameworks available to scale your data pipeline? Then this post is for you. In this post, we go over what scaling is, the different types of scaling, and how to choose scaling strategies for your data pipelines. By the end of this post, you will be able to come up with the correct scaling strategy for any data pipeline.

Understand & Deliver on Your Data Engineering Task

Aug 29, 2021 · 7 min read

Want to deliver on your data engineering tasks with confidence? Then this post is for you. In this post, we go over a list of steps that you can use to understand what your assigned work is, why it matters, and how to deliver great work.

4 Key Patterns to Load Data Into A Data Warehouse

Aug 17, 2021 · 5 min read

Unsure how to load data into a data warehouse? Then this post is for you. In this post, we go over 4 key patterns to load data into a data warehouse. These patterns can help you build resilient and easy-to-use data pipelines. Level up as a data engineer and deliver usable data faster!

How to Validate Datatypes in Python

Jul 21, 2021 · 5 min read

Frustrated with handling data type conversion issues in Python? Then this post is for you. In this post, we go over a reusable data type conversion pattern using Pydantic. We will also go over the caveats involved in using this library.

Designing a Data Project to Impress Hiring Managers

Jun 25, 2021 · 9 min read

Frustrated that hiring managers are not reading your GitHub projects? Then this post is for you. In this post, we discuss a way to impress hiring managers by hosting a live dashboard with near real-time data. We will also go over coding best practices such as project structure, automated formatting, and testing to make your code professional. By the end of this post, you will have deployed a live dashboard that you can link to from your resume and LinkedIn.

How to make data pipelines idempotent

May 13, 2021 · 4 min read

Unable to find practical examples of idempotent data pipelines? Then this post is for you. In this post, we go over a technique that you can use to make your data pipelines professional and make data reprocessing a breeze.

Writing memory efficient data pipelines in Python

Apr 26, 2021 · 7 min read

Working with a dataset that is too large to fit in memory? Then this post is for you. In this post, we will write memory-efficient data pipelines using Python generators. We also cover the common generator patterns you will need for your data pipelines.
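
To give a flavor of the idea, here is a minimal sketch of a generator-based pipeline. It is not taken from the post itself: the function names (read_rows, keep_large_orders, total_amount), the column names, and the sample data are all hypothetical, chosen only to illustrate how generators let each row flow through the pipeline one at a time instead of loading the whole dataset into memory.

```python
import csv
import io
from typing import Iterable, Iterator


def read_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed CSV row at a time instead of building a full list."""
    yield from csv.DictReader(lines)


def keep_large_orders(rows: Iterable[dict], min_amount: float) -> Iterator[dict]:
    """Filter step: pass along only rows whose amount meets the threshold."""
    for row in rows:
        if float(row["amount"]) >= min_amount:
            yield row


def total_amount(rows: Iterable[dict]) -> float:
    """Sink step: consume the stream and keep only a running total."""
    return sum(float(row["amount"]) for row in rows)


# Hypothetical sample data; in practice this could be an open file handle,
# which is also an iterable of lines.
sample = io.StringIO("order_id,amount\n1,10.5\n2,99.0\n3,250.0\n")

# Nothing is read until total_amount starts pulling rows through the chain.
pipeline = keep_large_orders(read_rows(sample), min_amount=50.0)
result = total_amount(pipeline)
print(result)
```

Because each step yields rows lazily, only one row needs to be held in memory at any moment, regardless of how large the input is.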
