Orchestrating Airflow DAGs with GitHub Actions – A Lightweight Approach to Data Curation Across Spa | Post date: October 25, 2024 | Post author: Alex Merced | Post categories: airflow, airflow-deployment, apache-spark, dbt, dremio, github, github-actions, snowflake
What The Heck is Apache Polaris? | Post date: September 11, 2024 | Post author: Shawn Gordon | Post categories: apache-iceberg, apache-polaris, apache-polaris-explained, apache-spark, data-space, databricks, snowflake, what-is-apache-polaris
Accelerating Write-Intensive Data Workloads on AWS S3 | Post date: September 10, 2021 | Post author: Bin Fan | Post categories: apache-spark, aws-s3, caching, cloud, data-orchestration, performance, software-development, storage
Share Large Amounts of Live Data With Delta Sharing and Docker | Post date: September 3, 2021 | Post author: Frank Munz | Post categories: apache-spark, delta-lake, linux-foundation, machine-learning, open source, pandas, programming, python
How to Authenticate Kafka Using Kerberos (SASL), Spark, and Jupyter Notebook | Post date: July 19, 2021 | Post author: Artem Gogin | Post categories: apache-spark, jupyter-notebook, kafka, kerberos, programming, pyspark, spark, spark-streaming
Analyzing Dogecoin Tweet Sentiment in Real Time | Post date: May 25, 2021 | Post author: Merlin | Post categories: apache-kafka, apache-spark, cryptocurrency, data-analytics, dogecoin, real-time-processing, stream-processing, twitter-sentiment-analysis
Introduction to Delight: Spark UI and Spark History Server | Post date: May 8, 2021 | Post author: Jean-Yves "JY" Stephan | Post categories: apache-spark, big-data, data-engineering, data-science, monitoring, open source, spark-history-server, spark-ui
Apache Spark Ecosystem | Post date: April 30, 2021 | Post author: Anello | Post categories: apache, apache-spark, big-data, data, spark
The DeltaLog: Fundamentals of Delta Lake [Part 2] | Post date: March 18, 2021 | Post author: Adi Polak | Post categories: apache-spark, beginners-guide, big-data-engineer, data-engineering, delta-lake, delta-lake-fundamentals, deltalog, hackernoon-top-story
ACID Transactions: Fundamentals of Delta Lake – Part 1 | Post date: March 6, 2021 | Post author: Adi Polak | Post categories: acid-transactions-delta-lake, apache-spark, big-data, delta-lake, delta-lake-fundamentals, deltalog-acid-transactions, hackernoon-top-story, scala