DEV Community

# spark

Posts

Top 12 Spark Interview Problems for Data Engineers, With Answers

10 min read
Read-Write ETL on NAS Data with EMR Serverless Spark — No Cluster, No Copy

10 min read
Stream Processing Continuum: Golang Sockets to Flink and Spark Pipelines

1
36 min read
The Data Refinery: Why Apache Spark is the Engine Behind Real-World Big Data Use Cases

2 min read
Why My Spark Container Keeps Exiting — Docker PID 1 and the Daemon Trap

Comments 1
5 min read
Fentanyl Poverty: Building a Big Data Pipeline to Map America's Overdose Epidemic

5
Comments 4
3 min read
Understanding Join Strategies in PySpark (With Real-World Insights)

2 min read
Stopping Spark Structured Streaming jobs via external signals

3 min read
Spark Performance Masterclass: Delta Lake Optimization Cheatsheet

8 min read
Streaming Pipeline Kit: Streaming Patterns & Best Practices

6 min read
Apache Spark in Plain English: The Engine Behind Databricks

5 min read
Spark Optimization Playbook: Adaptive Query Execution AQE Tuning Guide

5 min read
Spark ETL Framework: ETL Patterns Guide — Spark ETL Framework

3 min read
Building an open-source vendor-neutral lakehouse

1
5 min read
Real-Time Data Streaming with Apache Kafka and Spark

3
7 min read