Project Overview

Using AI to predict the outcomes of NBA games.

This project aims to streamline NBA game prediction by focusing on advanced AI prediction models rather than extensive data collection and management. My previous project, NBA Betting, tried to build a comprehensive feature set for predicting NBA games through extensive data collection. That approach benefited from various industry-derived metrics, but the cost and complexity of managing the data collection were too high. This project instead starts from a core data set, such as play-by-play data, and leverages deep learning and GenAI to predict game outcomes.

Current State

The project is currently in the early stages of development, with a basic prediction engine that uses simple models like ridge regression, XGBoost, and a basic MLP. The prediction engine is limited to basic game score predictions and win percentages. The web app provides a simple interface for displaying games for the selected date along with current scores and predictions. Fortunately, this is as complicated as the project should become. The goal is to gradually integrate most pieces of the Database Updater and part of the Games API logic into a single prediction engine. This will allow for a more streamlined process and a more capable prediction engine.

Project Flowchart

The project is built around a few key components: the Database Updater, the Games API, the Prediction Engine, and the Web App.

Future Goals

Foundational Model Outline
  1. Data Sourcing: Focus on a minimal number of data sources that fundamentally describe basketball. Currently, we use play-by-play data from the NBA API. In the future, incorporating video and tracking data would be interesting, though these require considerably more resources and access.
  2. Prediction Engine: This is the core of the project and will be the development focus until the 2024-2025 season begins. The current prediction engine options will be replaced with a DL and GenAI-based engine, allowing for decreased data parsing and feature engineering while also scaling to predict more complex outcomes, including individual player performance.
  3. Data Storage: Future data storage will more seamlessly integrate with the prediction engine. The storage requirements will combine the current SQL-based data used for the API and web app with more advanced vector-based storage for RAG-based GenAI models.
  4. Web App: This is the project's front end, displaying the games for the selected date along with current scores and predictions. The interface will remain simple while usability is gradually improved. A separate GenAI chat will be added in the future to allow users to interact with the prediction engine and modify individual predictions based on their preferences.

Guiding Principles

Project Guiding Principles

Web App

Web App Home Page

Web App Game Details

Prediction Engines

Currently, a few basic prediction engines are used to predict the outcomes of NBA games. These serve as placeholders for the more advanced DL and GenAI engines that will be implemented in the future. The current engines use ML models to make pre-game predictions for the home and away scores. These score predictions are then used to calculate the win percentage and margin for the home team. In-game predictions (updated after the game starts) are based on a combination of the current game score, the time remaining, and the pre-game predictions.
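The flow described above can be sketched roughly as follows. This is an illustration only, not the project's actual code: the function names, the normal-distribution assumption used to turn a margin into a win probability, the 12-point standard deviation, and the linear time-weighted blend are all assumptions introduced here.

```python
import math


def pregame_summary(home_score: float, away_score: float, margin_std: float = 12.0):
    """Derive the home margin and home win probability from predicted scores.

    Assumes the final margin is roughly normal around the predicted margin.
    margin_std is an illustrative value, not a project constant.
    """
    margin = home_score - away_score
    # P(final margin > 0) for a normal distribution centered on the predicted margin.
    win_prob = 0.5 * (1.0 + math.erf(margin / (margin_std * math.sqrt(2.0))))
    return margin, win_prob


def updated_scores(pre_home, pre_away, cur_home, cur_away,
                   minutes_remaining, game_minutes=48.0):
    """Blend the current score with the pre-game prediction.

    Hypothetical rule: points still to come are projected at the pre-game
    scoring pace, scaled by the fraction of regulation time remaining.
    """
    frac_left = max(0.0, min(1.0, minutes_remaining / game_minutes))
    home = cur_home + pre_home * frac_left
    away = cur_away + pre_away * frac_left
    return home, away
```

With this rule, the updated prediction equals the pre-game prediction at tip-off and converges to the actual score as time runs out. The updated scores can be passed back through `pregame_summary` to refresh the margin and win probability.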

Current Prediction Engines

Performance Metrics

The current metrics are based on pre-game predictions for the home and away team scores, along with downstream metrics such as win percentage and margin. These simple predictors currently outperform the baseline predictor.

A more challenging baseline based on the Vegas spread will be added once the DL and GenAI models are implemented.
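As a rough illustration of how such a comparison could be made, the sketch below computes two simple metrics that can be applied to both a model and a baseline. The choice of mean absolute error and win-pick accuracy is an assumption, as are the function names. The metrics and baseline the project actually uses are not specified here.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def win_accuracy(actual_margins, predicted_margins):
    """Share of games where the predicted home margin has the correct sign."""
    if not actual_margins or len(actual_margins) != len(predicted_margins):
        raise ValueError("inputs must be non-empty and the same length")
    correct = sum(
        (a > 0) == (p > 0) for a, p in zip(actual_margins, predicted_margins)
    )
    return correct / len(actual_margins)
```

Running both functions on the model's predictions and on the baseline's predictions for the same games gives directly comparable numbers. A lower error and a higher accuracy than the baseline would indicate the model adds value.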

Prediction Engine Performance Metrics

Installation and Usage

Step 1: Clone the Repository

Clone the repository to your local machine using the following command:

git clone https://github.com/NBA-Betting/NBA_AI.git

Navigate to the project directory:

cd NBA_AI

Step 2: Create a Virtual Environment

Create a virtual environment:

python -m venv venv

Activate the virtual environment:

source venv/bin/activate

Step 3: Install Dependencies

Install the required dependencies:

pip install -r requirements.txt

Step 4: Set Up Environment Variables

Copy the .env.template file to .env:

cp .env.template .env

Open the .env file in your preferred text editor and set the necessary values:

# .env
# Flask secret key (Optional, Flask will generate one if not set)
# WEB_APP_SECRET_KEY=your_generated_secret_key

# Project root path (Mandatory)
PROJECT_ROOT=/path/to/your/project/root

Replace /path/to/your/project/root with the actual path to the root directory of your project on your local machine. You can leave WEB_APP_SECRET_KEY commented out if you want Flask to generate it automatically.
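To sanity-check the file before starting the app, a small standalone script along these lines may help. It is a sketch under assumptions: the helper names are hypothetical, it uses only the standard library, and the project itself may load the .env file differently.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(values: dict, required=("PROJECT_ROOT",)) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        missing = missing_required(parse_env_text(env_file.read_text()))
        print("Missing:", missing if missing else "none")
```

Run it from the project root. A commented-out WEB_APP_SECRET_KEY is ignored, which matches its optional status.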

Step 5: Configure the Database

By default, the configuration will point to the empty database (data/NBA_AI_BASE.sqlite). If you want to use the pre-populated 2023-2024 season data:

  1. Download the SQLite database zip file from the GitHub release page:
    • Go to the Releases page of the repository.
    • Find the latest release (e.g., v0.1).
    • Download the NBA_AI_2023_2024.zip file attached to the release.
  2. Extract the zip file:
    unzip path/to/NBA_AI_2023_2024.zip -d data
  3. Update the config.yaml file to point to the extracted database:
    database:
      path: "data/NBA_AI_2023_2024.sqlite"  # <<< Set this to point to the database you want to use.
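To confirm that the configured path points to a usable database, a quick check like the following can be run. This is a minimal sketch: `list_tables` is a hypothetical helper, not part of the project, and it relies only on Python's built-in sqlite3 module.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path: str) -> list:
    """Return the table names in a SQLite file, or raise if the file is missing."""
    if not Path(db_path).is_file():
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, `list_tables("data/NBA_AI_2023_2024.sqlite")` should return a non-empty list if the extraction succeeded. A FileNotFoundError suggests the path in config.yaml does not match where the file was unzipped.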

Step 6: Run the Application

Run the application using the start_app.py file in the root directory:

python start_app.py

Accessing the Application

Once the application is running, you can access it by opening your web browser and navigating to:

http://127.0.0.1:5000/
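If the page does not load, a short script can confirm whether anything is answering at that address. The sketch below assumes the default address shown above. `app_is_up` is a hypothetical helper that uses only the standard library.

```python
import urllib.error
import urllib.request


def app_is_up(url: str = "http://127.0.0.1:5000/", timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a successful or redirect status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

A False result while start_app.py is running may mean the app is bound to a different host or port than the one assumed here.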

Usage Notes

    api:
      valid_seasons:
      - "2023-2024"
      - "2024-2025"