Portfolio

ETL Pipeline with Apache Airflow: Extract, Transform, and Load Data into SQLite

This data engineering project builds an ETL pipeline with Apache Airflow. Data is extracted from CSV sources, staged as CSV files in the `data/` folder, and loaded into SQLite. The `BranchOperator` selects which extraction task runs. The project includes the DAGs, the staging folder, the SQLite results, and screenshots of the pipeline.
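The extract, transform, load, and branch steps described above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's DAG: the sample CSV text, the `staging` table, the task ids, and every function name are hypothetical, and the intermediate CSV file in `data/` is skipped for brevity. In Airflow, a function like `choose_extract_task` would typically serve as the `python_callable` of a `BranchPythonOperator`.

```python
import csv
import io
import sqlite3

# Hypothetical sample input standing in for a real CSV source.
SAMPLE_CSV = "id,name,amount\n1,alice,10.5\n2,bob,\n3,carol,7\n"


def choose_extract_task(source_type):
    """Return the id of the extraction task to run, as a branch callable would."""
    return "extract_from_csv" if source_type == "csv" else "extract_skipped"


def extract(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with a missing amount and convert column types."""
    return [
        (int(row["id"]), row["name"].title(), float(row["amount"]))
        for row in rows
        if row["amount"]
    ]


def load(records, conn):
    """Write the transformed records into a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO staging VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(source_type, conn):
    """Run extract, transform, and load only if the branch selects extraction."""
    if choose_extract_task(source_type) != "extract_from_csv":
        return 0
    records = transform(extract(SAMPLE_CSV))
    load(records, conn)
    return len(records)
```

Calling `run_pipeline("csv", sqlite3.connect(":memory:"))` loads the two complete rows and skips the row with a missing amount; any other source type takes the skip branch and loads nothing.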



SQL Movie Database Analysis: Genres, Ratings & Actors

This SQL project queries a movie database to find characters, analyze genres, and determine actor involvement. Key queries identify the most popular genre, the highest- and lowest-rated movies, and actors who appear in multiple films, using SQL joins, grouping, and aggregate functions.
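The kinds of queries listed above can be sketched against a small in-memory SQLite database. The schema (`movies`, `actors`, `roles`), the sample rows, and the reading of "most popular genre" as the genre with the most movies are all assumptions made for illustration; the project's actual schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and sample rows, for illustration only.
conn.executescript("""
CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT, genre TEXT, rating REAL);
CREATE TABLE actors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE roles  (movie_id INTEGER, actor_id INTEGER);
INSERT INTO movies VALUES
    (1, 'Film A', 'Drama', 8.1),
    (2, 'Film B', 'Drama', 6.4),
    (3, 'Film C', 'Comedy', 7.0);
INSERT INTO actors VALUES (1, 'Actor X'), (2, 'Actor Y');
INSERT INTO roles VALUES (1, 1), (2, 1), (3, 2);
""")

# Most popular genre, taken here to mean the genre with the most movies.
top_genre = conn.execute("""
    SELECT genre, COUNT(*) AS n
    FROM movies
    GROUP BY genre
    ORDER BY n DESC
    LIMIT 1
""").fetchone()

# Highest- and lowest-rated movies.
best = conn.execute(
    "SELECT title FROM movies ORDER BY rating DESC LIMIT 1"
).fetchone()[0]
worst = conn.execute(
    "SELECT title FROM movies ORDER BY rating ASC LIMIT 1"
).fetchone()[0]

# Actors who appear in more than one film: join, group, then filter with HAVING.
repeat_actors = conn.execute("""
    SELECT a.name, COUNT(DISTINCT r.movie_id) AS films
    FROM actors AS a
    JOIN roles AS r ON r.actor_id = a.id
    GROUP BY a.id, a.name
    HAVING COUNT(DISTINCT r.movie_id) > 1
""").fetchall()
```

`HAVING` filters after aggregation, which is why the film count can be tested there and not in a `WHERE` clause.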



YouTube Supervised Learning

This project predicts YouTube video views with regression models, following data analysis and preprocessing of the YouTube Statistics dataset. It covers feature engineering, model selection (Linear Regression, Random Forest), and evaluation with the RMSE and R² metrics to assess prediction accuracy.

Code and documentation: the complete code and documentation are on my GitHub: https://github.com/hijirdella/YouTube-Supervised-Learning
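The two evaluation metrics named above can be computed by hand, which makes clear what each one measures. This is a minimal sketch with made-up view counts, not the project's evaluation code; a library such as scikit-learn would normally supply these metrics.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: the typical prediction error, in the target's units."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: share of target variance explained by the model."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted view counts.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

model_rmse = rmse(y_true, y_pred)    # 10.0: every prediction is off by 10 views
model_r2 = r2_score(y_true, y_pred)  # 1 - 400 / 50000 = 0.992
```

RMSE is reported in the same units as the target, so it is easy to interpret for view counts, while R² is unitless and allows comparison across models trained on the same data.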


Data Engineering Project: Building a Telco Churn Analysis Pipeline

An end-to-end Telco Churn data engineering project using Docker, Python, Apache Airflow, PostgreSQL, and Spark. The project automates data ingestion, processing, and storage in a single pipeline: Airflow orchestrates the workflow, Spark handles batch processing, and PostgreSQL serves as the primary database.

Datasets: Telco Customer Churn (dibimbing) and Telco Customer Churn (Kaggle)
Code: https://github.com/hijirdella/Telco-Churn-Data-Pipeline
Dashboard: Looker Studio
