DEV Community

# dataengineering

Posts

Building ETL/ELT Pipelines For Data Engineers.

5
Comments
2 min read
What's new and noteworthy on AWS - Summer 2023 edition

4
Comments
24 min read
Automating Talend Jobs Using Apache Airflow .

7
Comments
3 min read
Data-aware Scheduling in Airflow: A Practical Guide with DAG Factory

Comments
6 min read
Automating Data Pipeline Deployment on AWS with Terraform: Utilizing Lambda, Glue, Crawler, Redshift, and S3

Comments 1
8 min read
A mage on the Hero’s Journey: a fantasy epic on how a startup rose from the ashes

6
Comments
9 min read
What is data engineering and a B.I architecture

5
Comments
6 min read
Feature Engineering Has a Language Problem

1
Comments
15 min read
Debugging Python Data Pipelines

Comments
3 min read
How To Create Dataflow Job with Scio

2
Comments
8 min read
Using pyspark to stream data from coingecko API and visualise using dash

Comments
6 min read
AWS Redshift: Robust and Scalable Data Warehousing

2
Comments
6 min read
Stream data processing with Mage

2
Comments
8 min read
PyJaws: A Pythonic Way to Define Databricks Jobs and Workflows

3
Comments
1 min read
Class to Airflow Custom Operator

Comments
3 min read
How to pivot data using Dynamic SQL in SQL Server

5
Comments 4
3 min read
How to clone tables in BigQuery

2
Comments
1 min read
kafka: event driven microservices

2
Comments
6 min read
Getting started with Apache Flink: A guide to stream processing

1
Comments
8 min read
How to rotate data using Pivot & Unpivot operators

3
Comments 2
3 min read
Apply CDC From MySQL To Clickhouse on local environment

2
Comments
6 min read
Mage Battlegrounds: Craft insights from real-time customer behavior analysis

2
Comments
2 min read
Apache Flink vs Apache Spark: A detailed comparison for data processing

2
Comments
5 min read
Abstract Configurations

1
Comments
3 min read
Apache Flink episode 1: A comprehensive introduction

1
Comments
6 min read
Data sources episode 2: AWS S3 to Postgres Data Sync using Singer

2
Comments
4 min read
Data sources episode 1: Common data sources in modern pipelines

1
Comments
6 min read
Handling NULL in the DBs

5
Comments 1
2 min read
Unleashing the Magic of Job Schedulers: How to Tame Your Code and Save Your Sanity

4
Comments
3 min read
Scraper Function to Airflow DAG

1
Comments 1
3 min read
Code optimization

Comments
2 min read
From Class to Abstract Classes

1
Comments
3 min read
Deep Drive SQL ( part 01 )

Comments
10 min read
SQL 102:Intermediate SQL

Comments
10 min read
From Functional to Class: a look at SOLID coding

1
Comments
3 min read
Hadoop Migration: How we pulled this off together

Comments
8 min read
Quick Detour on Unit Testing with PyTest

1
Comments
3 min read
Trigger Azure Data Factory Pipeline from Event Grid (Using Webhook Endpoint)

Comments 2
4 min read
Bootstrapped to Functional

1
Comments
3 min read
AWS Cloud9 for Data Engineers

1
Comments
5 min read
The Pyramid of Alerting

6
Comments 1
6 min read
Batch Processing vs Stream Processing: Why Batch is dying and Streaming takes over

Comments
14 min read
Introduction to Data Version Control

Comments
6 min read
Structure Query Language

6
Comments
2 min read
Using python dictionary in data engineering.

2
Comments 2
2 min read
How we mastered dbt: A true story

7
Comments
14 min read
Important Questions related to Data Engineering

2
Comments
1 min read
Python functions and lambda functions in data engineering.

2
Comments
3 min read
Data Wrangling in Python: Tips and Tricks

Comments
3 min read
Website Monitoring using AWS Lambda and Aurora

2
Comments
4 min read
Apache Airflow - Deep Dive | All you need to know about Airflow

6
Comments
20 min read
How I Decreased ETL Cost by Leveraging the Apache Arrow Ecosystem

Comments
6 min read
Data Platform Architecture Types

1
Comments
9 min read
Integrating a Web API with Datastore Emulator

1
Comments
4 min read
Creating Data Pipelines as DAGs in Apache Airflow (Part 1)

Comments
6 min read
SQL101: Introduction to SQL

Comments 2
14 min read
Data Pipelines with Great Expectations | Introduction

2
Comments
2 min read
22 Best DataOps Tools To Optimize Your Data Management and Observability In 2023

16
Comments 1
30 min read
Building a Data Lakehouse for Analyzing Elon Musk Tweets using MinIO, Apache Airflow, Apache Drill and Apache Superset

13
Comments 2
8 min read
Nesting Columns like a Pro: A Guide to Mastering Nested Structs in PySpark

Comments
4 min read