Bus Travel Data Warehouse

Introduction

In modern travel businesses, data plays a critical role in tracking operations, analyzing customer behavior, and optimizing resources. A well-designed data warehouse enables efficient reporting and analytics. This project focuses on designing a Star Schema for a Bus Travel Data Warehouse using Kimball’s methodology. It involves creating dimension tables, fact tables, and handling Slowly Changing Dimensions (SCDs) to support historical tracking and business intelligence.


 

Business Context

A bus travel agency manages various services, including travel bookings, customer interactions, package offerings, and resource allocations (buses and drivers). The primary goal of this data warehouse is to provide insights into:

  • Customer behavior and demographics

  • Sales trends and travel bookings

  • Performance of different travel packages

  • Operational efficiency of buses and drivers


 

Data Modeling Approach

This project follows Kimball’s methodology, which emphasizes a dimensional model for easy querying and reporting. The key steps include:

  1. Identifying Entities

  2. Designing the Source Tables (Operational Database)

  3. Creating the Star Schema (Data Warehouse Schema)

  4. Identifying Slowly Changing Dimensions (SCDs)

  5. Populating the Data Warehouse with Data

Step 1: Identifying Key Entities

The main entities in the data warehouse are:

  • Customers: Tracking personal details for marketing and customer engagement.

  • Travel Orders: Capturing transactions related to bookings and payments.

  • Packages: Representing different travel offerings (destinations, pricing, etc.).

  • Drivers: Managing driver information and their experience level.

  • Buses: Recording details of available buses.

Step 2: Source Table Design (Operational Database)

The source tables store transactional data and are normalized for OLTP (Online Transaction Processing). Key tables include:

  • Customer: Stores customer details.

  • TravelOrders: Stores booking details, including order status and total amount.

  • Package: Stores information about travel packages.

  • Driver: Stores driver details including experience and license numbers.

  • Bus: Records available buses and their respective brands.

Step 3: Designing the Star Schema

A Star Schema is designed with a central fact table surrounded by dimension tables:

Fact Table: TravelOrderFact

  • OrderItemID (PK)

  • OrderID (FK to TravelOrders)

  • CustomerID (FK to CustomerDim)

  • OrderDateKey (FK to DateDim)

  • OrderStatus

  • TotalAmount

  • PackageID (FK to PackageDim)

  • DriverID (FK to DriverDim)

  • BusID (FK to BusDim)

  • QuantityPassenger

  • TravelDuration

Dimension Tables:

  1. CustomerDim – Stores customer details

  2. PackageDim – Stores package details

  3. DriverDim – Stores driver details

  4. BusDim – Stores bus details with reference to BrandDim

  5. BrandDim – Stores bus brand details

  6. DateDim – Stores time-related attributes for trend analysis

Step 4: Handling Slowly Changing Dimensions (SCDs)

Some dimensions require historical tracking, making SCD Type 2 a suitable choice for:

  • CustomerDim: Captures changes in customer details over time.

  • PackageDim: Tracks updates in travel package pricing and destinations.

  • DriverDim: Records experience level changes over time.


 

Step 5: Populating the Data Warehouse

Data is extracted from operational tables and transformed into the Star Schema:

  1. Dimension Tables are populated first to maintain referential integrity.

  2. The DateDim table is pre-filled with a range of dates.

  3. Fact Table is loaded last, linking dimensions using foreign keys.

Powered By EmbedPress

Benefits of This Data Warehouse

  • Faster Query Performance: Simplifies analytical queries using a denormalized schema.

  • Enhanced Business Intelligence: Supports sales trends, package performance, and customer analysis.

  • Historical Tracking: Allows businesses to track changes over time using SCD Type 2.

  • Scalability: The model can be expanded with additional dimensions or fact tables.

Conclusion

This project demonstrates how Kimball’s methodology can be applied to a bus travel agency’s data warehouse to enable efficient reporting, business analysis, and decision-making. The Star Schema ensures optimized query performance, and the use of Slowly Changing Dimensions supports historical data tracking. With this implementation, the business can enhance its customer insights, package optimization, and operational efficiency.

🔗 Find the complete query and documentation on my GitHub:

Leave a Comment

Your email address will not be published. Required fields are marked *