Hotel Price Prediction

Hotel Price Prediction Machine Learning

Hotel Price Prediction System

This project predicts hotel room prices on Booking.com for major cities in Saudi Arabia, with prices given in Saudi Riyal (SAR). The model is trained on real hotel listing data and considers practical factors such as the number of beds, average customer rating, total number of reviews, and room size. By focusing on these key predictors, the system aims for realistic and reliable price forecasts.

The results can be valuable for both travelers and policymakers. For tourists, they provide insight into fair pricing and help with travel planning; for the Ministry of Tourism and other stakeholders, they support monitoring price trends, detecting anomalies, and promoting competitive and transparent pricing across regions.

Overview

Field                    Details
Project Name             Booking.com Hotel Price Prediction in KSA
Language/s Used          Python
Version (Recommended)    Python 3.8+
Type                     Web Application (Machine Learning)

Why this project

  • Helps tourists and the government anticipate likely room prices across major cities in KSA.
  • Supports price regulation and competitiveness by providing model-driven benchmarks.
  • Surfaces data errors and outliers that distort market signals.
  • Addresses unstable or inconsistent search results by grounding decisions in a unified dataset and model.

Data Description

The dataset was built by scraping Booking.com hotel listings and consolidating them into a structured CSV used by the app and notebooks. It contains the following fields:

  • hotel_name
  • location (e.g., Riyadh, Jeddah, Medina)
  • room_type (e.g., Suite, Room)
  • price (SAR)
  • per_night (stay basis)
  • beds (integer)
  • rating (1–10 scale)
  • rating_title (text label of the rating)
  • number_of_ratings (review count)
  • Size (room area in m²)
  • Log_number_of_ratings (derived)
  • Log_price (derived)

These variables power the regression analysis and the interactive predictions in the app.
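
As a rough illustration of how the two derived columns could be produced, the sketch below builds a tiny table with the same column names and adds Log_price and Log_number_of_ratings. The sample rows are invented for the example, and the choice of a natural log for price and log1p for review counts (so that zero reviews stays defined) is an assumption; the source does not state which log variant was used.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows that mimic the schema described above.
# They are NOT taken from the project's dataset.
rows = [
    {"hotel_name": "Example Hotel A", "location": "Riyadh", "room_type": "Suite",
     "price": 850.0, "beds": 2, "rating": 8.7, "number_of_ratings": 1200, "Size": 45.0},
    {"hotel_name": "Example Hotel B", "location": "Jeddah", "room_type": "Room",
     "price": 320.0, "beds": 1, "rating": 7.9, "number_of_ratings": 0, "Size": 22.0},
]
df = pd.DataFrame(rows)


def add_log_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the frame with the two derived log columns added."""
    out = frame.copy()
    # Assumption: natural log of the nightly price in SAR.
    out["Log_price"] = np.log(out["price"])
    # Assumption: log1p keeps listings with zero reviews finite.
    out["Log_number_of_ratings"] = np.log1p(out["number_of_ratings"])
    return out


enriched = add_log_columns(df)
print(enriched[["price", "Log_price", "number_of_ratings", "Log_number_of_ratings"]])
```

Working on a copy leaves the raw columns untouched, so the original price in SAR stays available for reporting.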

Design and Modeling Approach

The project follows a clear regression pipeline:

  1. Fetch
    To avoid seasonality bias, Booking.com listings were scraped iteratively and consolidated. Scraping utilities in the repository rely on lightweight tools and selectors tailored to hotel listing pages.
  2. Clean
    The data underwent standard cleaning: removing duplicates, handling NaNs, normalizing whitespace in categorical values, and aligning column names. Only features useful for modeling were retained.
  3. Preprocessing
    To place features on comparable scales and stabilize relationships with price, the following transformations are applied:
    • Feature scaling with RobustScaler and StandardScaler
    • “Gaussianizing” transforms where helpful: log, Box-Cox, and polynomial expansions for non-linear effects
  4. Modeling
    Multiple regressors were explored in the notebooks. The best-performing configuration used a Random Forest Regressor, evaluated with a train/test split and repeated cross-validation to check generalization. Reported results include:
    • Test set performance ≈ 96%
    • Mean Absolute Error (MAE) ≈ 0.1974 (on the transformed target)
    These figures indicate a strong fit for the chosen features in the sampled cities. Because the MAE is measured on the transformed target, it is not expressed in SAR.
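
The clean, preprocess, and model steps above can be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual notebook code. The data is synthetic, the feature list and hyperparameters are assumptions, and only RobustScaler plus a log target are shown out of the transforms mentioned.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import RepeatedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

# Synthetic stand-in data (NOT the project's dataset), using the same column names.
rng = np.random.default_rng(0)
n = 300
raw = pd.DataFrame({
    "beds": rng.integers(1, 5, n),
    "rating": rng.uniform(5.0, 10.0, n),
    "number_of_ratings": rng.integers(1, 5000, n),
    "Size": rng.uniform(15.0, 120.0, n),
})
raw["price"] = np.exp(
    4.5 + 0.25 * raw["beds"] + 0.08 * raw["rating"] + 0.01 * raw["Size"]
    + rng.normal(0.0, 0.1, n)
)

# Clean: drop duplicate rows and rows with missing values.
clean = raw.drop_duplicates().dropna().reset_index(drop=True)

# Preprocess: log-transform the skewed review count and the target (assumed transforms).
X = clean[["beds", "rating", "number_of_ratings", "Size"]].copy()
X["number_of_ratings"] = np.log1p(X["number_of_ratings"])
y = np.log(clean["price"])

# Model: robust scaling followed by a random forest, wrapped in one pipeline
# so the scaler is refit inside every cross-validation fold.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = make_pipeline(RobustScaler(), RandomForestRegressor(n_estimators=100, random_state=42))

# Repeated K-fold cross-validation on the training split (5 folds x 3 repeats).
cv = RepeatedKFold(n_splits=5, n_repeats=3, random_state=42)
cv_mae = -cross_val_score(model, X_train, y_train, cv=cv, scoring="neg_mean_absolute_error")

# Final fit and held-out evaluation; MAE is on the log-price scale.
model.fit(X_train, y_train)
test_mae = mean_absolute_error(y_test, model.predict(X_test))
test_r2 = model.score(X_test, y_test)
print(f"CV MAE: {cv_mae.mean():.4f}  test MAE: {test_mae:.4f}  test R^2: {test_r2:.3f}")
```

Tree ensembles are largely insensitive to feature scaling, so the scaler here mainly mirrors the described pipeline and keeps the preprocessing reusable if a linear model is swapped in for comparison.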

Web Application (What You Can Do)

The repository ships with a simple web interface that makes the model usable without opening a notebook:

  • Interactive Inputs: Adjust the core drivers (beds, number_of_ratings, rating, and optionally room size) to get an immediate predicted price in SAR.
  • Instant Predictions: The interface displays the predicted room price using the trained model and the same preprocessing applied during development.
  • Model Explainability with SHAP:
    • Summary plot to see which features most influence pricing across the dataset
    • Bar view for global importance comparisons
      These visuals help policymakers and analysts justify pricing decisions and understand model behavior.
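
One way the SHAP views described above might be generated is sketched below. The helper names are hypothetical. The example assumes a fitted tree-based regressor (the bare forest, not a scaling pipeline) and that the shap package is installed. The ranking helper reproduces what the bar view summarizes: the mean absolute SHAP value per feature.

```python
import numpy as np


def global_importance(shap_values, feature_names):
    """Rank features by mean absolute SHAP value, largest first."""
    mean_abs = np.abs(np.asarray(shap_values)).mean(axis=0)
    order = np.argsort(mean_abs)[::-1]
    return [(feature_names[i], float(mean_abs[i])) for i in order]


def explain_model(model, X):
    """Compute SHAP values for a fitted tree model and draw both plots.

    Hypothetical helper: `model` is assumed to be a fitted tree ensemble
    such as RandomForestRegressor, and `X` the feature matrix it was trained on.
    """
    import shap  # imported lazily so the ranking helper works without shap

    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    shap.summary_plot(shap_values, X)                   # summary (beeswarm) plot
    shap.summary_plot(shap_values, X, plot_type="bar")  # global importance bars
    return shap_values


# Illustrative SHAP matrix (2 samples x 3 features); the numbers are made up.
demo_values = np.array([[0.2, -0.1, 0.05],
                        [-0.4, 0.3, 0.0]])
ranking = global_importance(demo_values, ["beds", "rating", "Size"])
print(ranking)
```

Taking the absolute value before averaging matters: positive and negative contributions would otherwise cancel and understate a feature's influence.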

Available Features

  • Streamlined Streamlit web UI for interactive price prediction in SAR
  • End-to-end regression pipeline with Robust/Standard scaling and log/Box-Cox transforms
  • RandomForestRegressor training and validation with repeated K-fold cross-validation
  • SHAP-based feature importance (summary and bar charts) integrated into the app
  • CSV dataset (reg22.csv) aligned to the fields listed above
  • Exploratory notebooks for data analysis and model comparison
  • Lightweight scraping utilities with a minimal requirements file for extractor tooling
  • Saved model artifacts/notebooks for reproducibility and quick experimentation
  • Project report and presentation files for stakeholder communication
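
For orientation, the Streamlit UI listed above could be wired up roughly as follows. This is a sketch under stated assumptions, not the repository's app. The artifact path "model.joblib", the feature order, and the log transforms (log1p on the review count, exp to undo a log-price target) are all hypothetical and would need to match whatever preprocessing the trained model actually expects.

```python
import numpy as np

# Assumed feature order; it must match the order used at training time.
FEATURES = ["beds", "rating", "number_of_ratings", "Size"]


def predict_price_sar(model, beds, rating, number_of_ratings, size):
    """Apply the assumed preprocessing and return a price in SAR."""
    row = np.array([[beds, rating, np.log1p(number_of_ratings), size]], dtype=float)
    log_price = float(model.predict(row)[0])
    # Assumption: the model was trained on the natural log of the price.
    return float(np.exp(log_price))


def main():
    """Streamlit entry point; call main() at module level in app.py
    and launch with `streamlit run app.py`."""
    import joblib
    import streamlit as st

    model = joblib.load("model.joblib")  # hypothetical saved model artifact

    st.title("Hotel Price Prediction (SAR)")
    beds = st.number_input("Beds", min_value=1, max_value=10, value=2, step=1)
    rating = st.slider("Rating", min_value=1.0, max_value=10.0, value=8.0, step=0.1)
    reviews = st.number_input("Number of ratings", min_value=0, value=500, step=1)
    size = st.number_input("Room size (m²)", min_value=5.0, value=30.0, step=1.0)

    if st.button("Predict"):
        price = predict_price_sar(model, beds, rating, reviews, size)
        st.metric("Predicted price per night", f"{price:,.0f} SAR")
```

Keeping the prediction logic in a plain function, separate from the widgets, makes it possible to unit-test the preprocessing without starting the web app.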

Tools and Libraries

  • Language: Python
  • Scraping: requests, selector utilities
  • EDA: Pandas, NumPy, Matplotlib, Seaborn
  • Preprocessing/Modeling: scikit-learn, SciPy, statsmodels, pylab
  • Explainability & Visualization: SHAP, Plotly Express, Missingno, Yellowbrick, Sweetviz
  • Interface: Streamlit
