Project Overview

Using AI to predict the outcomes of NBA games.

This project aims to streamline NBA game prediction by focusing on advanced AI prediction models rather than extensive data collection and management. My previous project, NBA Betting, tried to build a comprehensive feature set for predicting NBA games through extensive data collection. That approach benefited from various industry-derived metrics, but the cost and complexity of managing the data collection were too high. This project instead focuses on a core data set, such as play-by-play data, and leverages deep learning (DL) and generative AI (GenAI) to predict game outcomes.

Current State

The project is in active development, with a complete data collection pipeline and basic prediction engines. The current system processes multiple seasons of play-by-play (PBP) data through a full PBP → GameStates → Features → Predictions pipeline. The web app provides a simple interface for displaying games with current scores and predictions.
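The PBP → GameStates → Features → Predictions flow can be pictured with a minimal sketch. The play dictionary keys, the `GameState` fields, and all three function names are hypothetical and chosen only for illustration; they are not taken from the project's code, and the "predictor" is a trivial average standing in for a real model.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Running score after one play (hypothetical structure)."""
    home_score: int
    away_score: int
    seconds_remaining: int


def pbp_to_game_states(plays):
    """Turn raw play-by-play rows into a running sequence of game states."""
    states = []
    home, away = 0, 0
    for play in plays:
        if play["team"] == "home":
            home += play["points"]
        else:
            away += play["points"]
        states.append(GameState(home, away, play["seconds_remaining"]))
    return states


def final_state_to_features(states):
    """Summarize the last game state as a flat feature dict."""
    final = states[-1]
    return {
        "points_for": final.home_score,
        "points_against": final.away_score,
        "margin": final.home_score - final.away_score,
    }


def predict_home_score(prior_game_features):
    """Placeholder predictor: mean of points scored in prior games."""
    scores = [f["points_for"] for f in prior_game_features]
    return sum(scores) / len(scores)


# Toy input: three scoring plays (illustrative data, not real NBA data).
plays = [
    {"team": "home", "points": 2, "seconds_remaining": 2860},
    {"team": "away", "points": 3, "seconds_remaining": 2841},
    {"team": "home", "points": 3, "seconds_remaining": 2820},
]
states = pbp_to_game_states(plays)
features = final_state_to_features(states)
prediction = predict_home_score([features])
```

Each stage consumes only the output of the previous one, which is what would let a later DL or GenAI engine replace the feature and prediction steps without touching data collection.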

Project Flowchart

The project is built around a few key components:

Future Goals

Foundational Model Outline
  1. Data Sourcing: Focus on a minimal number of data sources that fundamentally describe basketball. Currently, we use play-by-play data from the NBA API. In the future, incorporating video and tracking data would be interesting, though these require considerably more resources and access.
  2. Prediction Engine: This is the core of the project and the current development focus. The current prediction engine options will be replaced with a DL and GenAI-based engine, which should reduce data parsing and feature engineering while scaling to more complex outcomes, including individual player performance.
  3. Data Storage: Future data storage will integrate more seamlessly with the prediction engine. It will combine the current SQL-based data used for the API and web app with vector-based storage for retrieval-augmented generation (RAG) GenAI models.
  4. Web App: This is the project's front end, displaying the games for the selected date along with current scores and predictions. The interface will remain simple while usability is gradually improved. A separate GenAI chat will be added in the future to allow users to interact with the prediction engine and modify individual predictions based on their preferences.

Guiding Principles

Project Guiding Principles

Web App

Screenshots: Web App Home Page, Web App Game Details

Prediction Engines

Currently, a few basic prediction engines are used to predict the outcomes of NBA games. They serve as placeholders for the more advanced DL and GenAI engines that will be implemented in the future. The current engines use ML models to make pre-game predictions of the home and away scores, which are then used to calculate the home team's win percentage and margin. Once a game has started, updated predictions combine the current game score, the time remaining, and the pre-game predictions.
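As a rough illustration of the two steps described above, the sketch below derives a margin and win probability from predicted scores, then blends a live score with the pre-game prediction. The formulas are assumptions, not the project's actual method: the win probability uses a normal distribution over the margin with an assumed standard deviation of 12 points, the in-game update simply scales the pre-game scores by the fraction of regulation time left, and overtime is ignored.

```python
import math


def pregame_summary(pred_home, pred_away, margin_std=12.0):
    """Return (home margin, home win probability) from predicted scores.

    margin_std is an assumed spread of actual margins around the
    predicted margin; it is not a value taken from the project.
    """
    margin = pred_home - pred_away
    # Normal CDF evaluated at margin / margin_std.
    win_prob = 0.5 * (1.0 + math.erf(margin / (margin_std * math.sqrt(2.0))))
    return margin, win_prob


def updated_scores(pred_home, pred_away, cur_home, cur_away,
                   seconds_remaining, game_seconds=2880):
    """Project final scores from the live score plus scaled pre-game scores."""
    # Clamp to [0, 1] so bad clock values cannot produce negative time.
    frac_remaining = max(0.0, min(1.0, seconds_remaining / game_seconds))
    proj_home = cur_home + pred_home * frac_remaining
    proj_away = cur_away + pred_away * frac_remaining
    return proj_home, proj_away


# Halftime example with made-up numbers.
proj_home, proj_away = updated_scores(110, 105, 60, 50, seconds_remaining=1440)
margin, win_prob = pregame_summary(proj_home, proj_away)
```

With no time remaining the projection collapses to the current score, and before tip-off (zero points, full clock) it reduces to the pre-game prediction.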

Current Prediction Engines

Performance Metrics

The current metrics are based on pre-game predictions for the home and away team scores, along with downstream metrics such as win percentage and margin. These simple predictors currently outperform the baseline predictor.

A more challenging baseline based on the Vegas spread will be added when the DL and GenAI models are implemented.
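For readers who want to reproduce this kind of comparison, here is a minimal sketch of two metrics that fit the description: mean absolute error on predicted scores and accuracy of the predicted winner. The metric choice, function names, and numbers are illustrative assumptions; the project may compute its metrics differently.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def win_accuracy(predicted_margins, actual_margins):
    """Share of games where the predicted home margin has the right sign."""
    correct = sum(
        (p > 0) == (a > 0) for p, a in zip(predicted_margins, actual_margins)
    )
    return correct / len(actual_margins)


# Made-up home scores for two games, for illustration only.
score_mae = mean_absolute_error([110, 100], [112, 97])

# Made-up home margins for three games; the second winner is mispredicted.
accuracy = win_accuracy([5, -3, 2], [1, 4, 7])
```

Running the same functions on a baseline predictor's output gives the reference values that a model has to beat.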

Prediction Engine Performance Metrics

Installation and Usage

Requirements

Installation

Clone the repository and run the automated setup:

git clone https://github.com/NBA-Betting/NBA_AI.git
cd NBA_AI
python setup.py

The setup script will:

  1. Create a virtual environment
  2. Install all dependencies
  3. Download the database and trained models from GitHub Releases
  4. Create your .env configuration file
  5. Verify the installation

Running the Web App

# Activate the virtual environment
source venv/bin/activate

# Start the web app
python start_app.py

Visit http://localhost:5000 to view games and predictions.

Command Line Options

# Use a specific predictor
python start_app.py --predictor=Tree

# Enable debug mode
python start_app.py --debug

# Set log level
python start_app.py --log_level=DEBUG

Available predictors: Baseline, Linear, Tree, MLP, Ensemble
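The flags above could be parsed along the following lines. This is a sketch using the standard `argparse` module, not the contents of `start_app.py`: the flag names and predictor list come from this README, while the defaults (`None` for the predictor, `INFO` for the log level) and the set of accepted log levels are assumptions.

```python
import argparse

# Predictor names as listed in this README.
PREDICTORS = ["Baseline", "Linear", "Tree", "MLP", "Ensemble"]


def build_parser():
    """Build a parser for the three documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Start the web app (sketch).")
    parser.add_argument(
        "--predictor",
        choices=PREDICTORS,
        default=None,  # assumption: fall back to a configured default
        help="Prediction engine to use.",
    )
    parser.add_argument(
        "--debug",
        action="store_true",
        help="Enable debug mode.",
    )
    parser.add_argument(
        "--log_level",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
        default="INFO",  # assumption
        help="Logging verbosity.",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--predictor=Tree", "--log_level=DEBUG"])
```

Restricting `--predictor` with `choices` means a typo such as `--predictor=tree` fails immediately with a usage message rather than at prediction time.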

Historical Data

The default setup downloads only the current season. A database with seasons 2023-2024 through 2025-2026 is available from GitHub Releases as NBA_AI_2023_2025.sqlite.

To use it, update your .env:

DATABASE_PATH=data/NBA_AI_2023_2025.sqlite
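To confirm that the path resolves to a usable database, a short check like the one below can help. It is a sketch under two assumptions: that `DATABASE_PATH` has already been loaded into the process environment (Python does not read `.env` files on its own), and that listing table names is a sufficient sanity check. The helper names are hypothetical, and no table names are assumed.

```python
import os
import sqlite3
from pathlib import Path


def open_database(default_path="data/NBA_AI_2023_2025.sqlite"):
    """Open the SQLite file named by DATABASE_PATH in read-only mode."""
    path = Path(os.environ.get("DATABASE_PATH", default_path)).resolve()
    # mode=ro makes SQLite raise an error for a missing file instead of
    # silently creating a new, empty database at that path.
    return sqlite3.connect(f"{path.as_uri()}?mode=ro", uri=True)


def list_tables(conn):
    """Return the names of all tables in the connected database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `list_tables(open_database())` from the repository root should print a non-empty list if the downloaded file is in place.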

Usage Notes

api:
  valid_seasons:
    - "2023-2024"
    - "2024-2025"
    - "2025-2026"
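A season string can be checked against this list before it is used in a query. The sketch below hard-codes the three seasons shown above rather than reading them from a config file, and the function name is hypothetical.

```python
import re

# Seasons copied from the configuration shown above.
VALID_SEASONS = ["2023-2024", "2024-2025", "2025-2026"]


def is_valid_season(season, valid_seasons=VALID_SEASONS):
    """Check that a season looks like 'YYYY-YYYY' and is in the allowed list."""
    match = re.fullmatch(r"(\d{4})-(\d{4})", season)
    if match is None:
        return False
    start_year, end_year = int(match.group(1)), int(match.group(2))
    # A season spans two consecutive calendar years.
    return end_year == start_year + 1 and season in valid_seasons
```

The format check catches shorthand such as `2024-25`, which would otherwise fail the membership test without a clear reason.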

Development Note

Development has been slower than expected as I've been focused on other projects. I'm now refocusing on NBA AI and will be more responsive to issues and messages. Thanks for your patience!