Data Engineering Projects for Beginners



This content originally appeared on DEV Community and was authored by Ramses Alexander Coraspe

Hi everyone,

I am a little bit obsessed with data engineering, and lately I have been working on several open source projects on the topic. Here is a list of the repositories and the technologies used in each one. If you decide to go deeper into this fun world, these repositories could serve as a guide.

❤ means "I like this one"

Tracking your Uber Rides and Uber Eats expenses through a data engineering process

Technologies and skills:

Python, Docker, Apache Airflow, AWS Redshift, Power BI, Data modelling, Task scheduling, ETL and ELT processes, Data warehousing, Cloud

Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag

Technologies and skills:

Python, Docker, Big Data, Cloud, BigQuery, Workflow Engines, GCP, Task scheduler, Google Cloud Platform, Dataproc cluster, GCS, Google Cloud Storage, Redis, DAG, Parallel Processing, Apache Spark

Building Big Data Pipelines in the Cloud with AWS EMR

Technologies and skills:

Python, PySpark, AWS EMR, Task scheduling, IaC, EC2 instances, Apache Spark, Cloud

Building a Lossless Data Compression and Data Decompression Pipeline

Technologies and skills:

Python, Data compression, BZIP2, Parallel programming
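
As a rough illustration of the idea behind this project (a sketch, not the repository's actual code), the snippet below compresses independent chunks with Python's standard bz2 module and checks that decompression restores the input byte for byte. The chunk size and the helper names split, compress_chunks and decompress_chunks are assumptions made for this example; a thread pool is used only to keep the sketch short.

```python
import bz2
from concurrent.futures import ThreadPoolExecutor


def split(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut the input into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_chunks(data: bytes, chunk_size: int = 64 * 1024) -> list[bytes]:
    """Compress each chunk independently so the work can be spread over workers."""
    with ThreadPoolExecutor() as pool:
        # pool.map keeps the results in the same order as the input chunks.
        return list(pool.map(bz2.compress, split(data, chunk_size)))


def decompress_chunks(blocks: list[bytes]) -> bytes:
    """Decompress every block and join them back into the original bytes."""
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(bz2.decompress, blocks))


if __name__ == "__main__":
    original = b"data engineering " * 10_000
    blocks = compress_chunks(original)
    restored = decompress_chunks(blocks)
    # Lossless means the round trip returns exactly the input.
    print(restored == original, len(original), sum(len(b) for b in blocks))
```

Compressing chunks independently trades a little compression ratio for the ability to process them in parallel and in any order.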

Learn how to dockerize an Apache Spark Standalone Cluster

Technologies and skills:

Python, Jupyter Notebook, Apache Spark, Docker, docker-compose, Hive

Dockerizing and Consuming an Apache Livy environment

Technologies and skills:

Python, Big Data, Docker, docker-compose, Apache Livy, Apache Spark, PostgreSQL, PySpark, Jupyter Notebook

Design, Development and Deployment of a simple Data Pipeline

Technologies and skills:

Python, Data modelling, Docker, docker-compose, PostgreSQL, Data pipeline, FastAPI

Dockerizing a Python Script for Faster Web Scraping

Technologies and skills:

Python, Docker, SQLite, Dockerfile, Web scraping, Data pipeline, FastAPI

Understanding Similarity Measures for Text Analysis

Technologies and skills:

Python, Machine Learning, Similarity measures, Distance metrics, Text Analysis
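
To give a flavour of what a similarity measure looks like in practice, here is a minimal sketch of two common ones, cosine similarity over word counts and Jaccard similarity over word sets. It is not taken from the repository; the whitespace tokenizer and the function names are assumptions chosen to keep the example dependency-free.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the two bag-of-words count vectors."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def jaccard_similarity(a: str, b: str) -> float:
    """Shared distinct words divided by all distinct words."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    x = "spark runs batch jobs"
    y = "spark runs streaming jobs"
    print(cosine_similarity(x, y), jaccard_similarity(x, y))
```

Cosine similarity takes word frequency into account, while Jaccard only looks at which words are present, so the two can rank the same pair of texts differently.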

Learn how to build a content-based Movie Recommender System

Technologies and skills:

Python, Machine Learning, TF-IDF, Cosine similarity, BM25, BERT, NLP, word2vec, Text Analysis, recsys

A Text Analysis of Speeches

Technologies and skills:

Python, Machine Learning, NLP, word2vec, Text Analysis, Sentiment Analysis, PCA, t-SNE, Word Embeddings, Text Preprocessing, Web scraping, Data Visualization, Mexico

Dropout Students Prediction

Technologies and skills:

R, Genetic algorithm, Neural Networks, K-Means, Clustering, Machine Learning

I will be working on more complex projects in the coming months using modern data stacks.


Ramses Alexander Coraspe | Sciencx (2022-06-15T23:40:58+00:00) Data Engineering Projects for Beginners. Retrieved from https://www.scien.cx/2022/06/15/data-engineering-projects-for-beginners/
