March 11, 2025

Automated ETL Workflow with Airflow – MySQL to PostgreSQL

For this data engineering project, I implemented a testing workflow for Airflow DAGs. The project creates more than five tables in MySQL, extracts the data to PostgreSQL, and manages the pipeline with dynamic DAGs. Jobs run at minute 15 of every second hour between 9 AM and 9 PM, on the Fridays of week 1 and week 3 of each month.
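The schedule is easier to check as a predicate than as a single cron string, since standard cron matches when either field matches once both day-of-month and day-of-week are restricted. Below is a minimal sketch in plain Python, not the project's actual DAG code. It assumes "week 1" means days 1 to 7, "week 3" means days 15 to 21, and that the 21:15 run is included; the name `is_scheduled` is hypothetical.

```python
from datetime import datetime

# Assumption: runs start at 09:15 and repeat every 2 hours up to 21:15.
RUN_HOURS = range(9, 22, 2)  # 9, 11, 13, 15, 17, 19, 21
RUN_MINUTE = 15
FRIDAY = 4  # datetime.weekday() counts Monday as 0


def is_scheduled(moment: datetime) -> bool:
    """Return True if `moment` matches the schedule described above."""
    # Assumption: week 1 covers days 1-7 and week 3 covers days 15-21.
    in_week_1_or_3 = 1 <= moment.day <= 7 or 15 <= moment.day <= 21
    return (
        moment.weekday() == FRIDAY
        and in_week_1_or_3
        and moment.hour in RUN_HOURS
        and moment.minute == RUN_MINUTE
    )
```

For example, Friday 7 March 2025 at 09:15 matches, while Friday 14 March 2025 at the same time does not, because it falls in week 2.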


ETL Pipeline with Apache Airflow: Extract, Transform, and Load Data into SQLite

This data engineering project builds an ETL pipeline with Airflow. Data is extracted from CSV sources, saved as CSV in the `data/` folder, and loaded into SQLite. The `BranchOperator` selects the extraction task. The project includes the DAGs, the staging folder, the SQLite results, and screenshots of the pipeline.
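The load step can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the project's code: the helper `load_csv_into_sqlite`, the table `staging_cities`, and the sample rows are all hypothetical, and an in-memory database stands in for the real SQLite file.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(csv_text: str, conn: sqlite3.Connection, table: str) -> int:
    """Create `table` from the CSV header and insert every data row.

    Returns the number of rows inserted. Values are stored as text.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    rows = list(reader)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return len(rows)


# Hypothetical staged CSV content, as it might sit in the data/ folder.
staged = "id,city\n1,Jakarta\n2,Bandung\n"
conn = sqlite3.connect(":memory:")
inserted = load_csv_into_sqlite(staged, conn, "staging_cities")
```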



SQL Movie Database Analysis: Genres, Ratings & Actors

This SQL project explores a movie database with queries that find characters, analyze genres, and determine actor involvement. Key features include identifying the most popular genre, the highest and lowest rated movies, and actors who appear in multiple films. The queries rely on joins, grouping, and aggregate functions.
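Two of these queries can be sketched against a toy schema. Everything here is an assumption made for illustration: the tables `movies`, `actors`, and `roles`, their columns, and the sample rows are hypothetical, and "most popular genre" is taken to mean the genre with the most movies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample data, not the project's actual database.
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, genre TEXT, rating REAL);
CREATE TABLE actors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE roles (movie_id INTEGER, actor_id INTEGER, character_name TEXT);
INSERT INTO movies VALUES
    (1, 'Film A', 'Drama', 8.1), (2, 'Film B', 'Drama', 6.4), (3, 'Film C', 'Comedy', 7.0);
INSERT INTO actors VALUES (1, 'Actor X'), (2, 'Actor Y');
INSERT INTO roles VALUES (1, 1, 'Lead'), (2, 1, 'Villain'), (3, 2, 'Sidekick');
""")

# Grouping plus an aggregate: the genre with the most movies.
most_popular_genre = conn.execute("""
    SELECT genre, COUNT(*) AS movie_count
    FROM movies
    GROUP BY genre
    ORDER BY movie_count DESC
    LIMIT 1
""").fetchone()

# Join plus HAVING: actors who appear in more than one film.
repeat_actors = conn.execute("""
    SELECT a.name, COUNT(DISTINCT r.movie_id) AS films
    FROM actors AS a
    JOIN roles AS r ON r.actor_id = a.id
    GROUP BY a.id, a.name
    HAVING COUNT(DISTINCT r.movie_id) > 1
""").fetchall()
```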



YouTube Supervised Learning

This project predicts YouTube video views with regression models, after data analysis and preprocessing on the YouTube Statistics dataset. It involves feature engineering, model selection (Linear Regression, Random Forest), and evaluation with the RMSE and R² metrics to measure prediction accuracy. The complete code and documentation are on my GitHub: https://github.com/hijirdella/YouTube-Supervised-Learning
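For reference, both evaluation metrics can be computed by hand. The sketch below uses only the standard library and made-up view counts; the function names and sample values are hypothetical and are not taken from the repository.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only.
actual_views = [100.0, 200.0, 300.0, 400.0]
predicted_views = [110.0, 190.0, 310.0, 390.0]
```

A lower RMSE means smaller errors in the units of the target (views), while an R² closer to 1 means the model explains more of the variance in the actual values.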
