Fergus-MW/AI-Engine

AI Engine

GPU infrastructure management system for deploying isolated GPU instances on Crusoe Cloud with a web-based dashboard.

Overview

AI Engine manages GPU resources for teams. It provides:

  • Web-based dashboard for team and GPU allocation management
  • Automated Terraform deployment to Crusoe Cloud
  • Isolated GPU environments with persistent storage
  • Real-time resource tracking and bin packing visualization

Architecture

Frontend (Next.js)

  • Modern React-based web dashboard
  • Real-time GPU quota tracking
  • Visual bin packing algorithm demonstration
  • Dark mode support
  • Located in /frontend

Backend (Terraform)

  • Infrastructure as Code for Crusoe Cloud
  • Automated VM provisioning with GPU allocation
  • SSH key generation and management
  • Located in the repository root (main.tf)

API Routes

  • /api/teams - Team CRUD operations
  • /api/deploy - Trigger infrastructure deployment
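
As a rough illustration of how a script might talk to these routes, the sketch below builds JSON requests with Python's standard library without sending them. The HTTP methods and the payload fields (`name`, `gpus`) are assumptions made for the example and are not documented here; the route handlers under frontend/app are the authority on the real request shapes. `build_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # dev server address from Quick Start


def build_request(route, payload=None):
    """Build (but do not send) a JSON request for a dashboard API route.

    Uses GET when there is no payload and POST otherwise. Both the
    methods and the payload fields are assumptions for illustration.
    """
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        BASE_URL + route,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST" if data is not None else "GET",
    )


# Hypothetical payload: the field names are illustrative only
new_team = build_request("/api/teams", {"name": "vision", "gpus": 4})
deploy = build_request("/api/deploy", {})
# To actually send a request: urllib.request.urlopen(new_team)
```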

Prerequisites

  • Node.js >= 18
  • Terraform >= 1.0
  • Python >= 3.7
  • Crusoe Cloud account with API credentials
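
To confirm the command-line prerequisites are installed, a small check like the following can help. It is only a sketch: it tests that the executables are on PATH and does not verify the minimum versions listed above. The executable names (`node`, `npm`, `terraform`, `python3`) are the usual ones on Linux and macOS and are assumed here; `missing_tools` is a hypothetical helper.

```python
import shutil

# Usual executable names; adjust for your platform (assumption)
REQUIRED_TOOLS = ("node", "npm", "terraform", "python3")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisite tools found on PATH")
```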

Quick Start

1. Configure Crusoe Cloud

# Create ~/.crusoe/config
[default]
access_key_id = your_access_key
secret_access_key = your_secret_key
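
Before deploying, it can be useful to check that this file parses and contains both keys. The sketch below uses Python's standard `configparser`. It assumes only the `[default]` section and the two key names shown above; `load_crusoe_credentials` is a hypothetical helper, not part of this repository.

```python
import configparser
from pathlib import Path


def load_crusoe_credentials(path=None, profile="default"):
    """Return (access_key_id, secret_access_key) from a Crusoe config file.

    Hypothetical helper: relies only on the INI layout shown above.
    """
    config_path = Path(path) if path else Path.home() / ".crusoe" / "config"
    parser = configparser.ConfigParser()
    # read() silently skips missing files, so check its result explicitly
    if not parser.read(config_path):
        raise FileNotFoundError(f"No config file found at {config_path}")
    if profile not in parser:
        raise KeyError(f"Missing [{profile}] section in {config_path}")
    section = parser[profile]
    required = ("access_key_id", "secret_access_key")
    missing = [key for key in required if not section.get(key)]
    if missing:
        raise KeyError(f"Missing keys in [{profile}]: {', '.join(missing)}")
    return section["access_key_id"], section["secret_access_key"]
```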

2. Install Dependencies

# Frontend
cd frontend
npm install

# Python scripts
# No pip install is needed: json, argparse, datetime, and pathlib
# are all part of the Python standard library.

3. Run the Application

# Start the web dashboard
cd frontend
npm run dev

Access the dashboard at http://localhost:3000

4. Deploy Infrastructure

Use the "Deploy" button in the web interface or run:

./scripts/deploy.sh

Features

Team Management

  • Add teams with custom GPU allocations (1, 2, 4, 8, or 10 GPUs)
  • Specify team members and custom ports
  • Real-time GPU quota tracking
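
The allocation rules above can be sketched as a small validation step. This is illustrative only: the function name, the `total_quota` parameter, and the shape of the `existing` mapping are assumptions rather than the dashboard's actual code. Only the allowed sizes (1, 2, 4, 8, 10) come from the list above.

```python
ALLOWED_GPU_COUNTS = (1, 2, 4, 8, 10)  # sizes listed under Team Management


def check_allocation(requested, existing, total_quota):
    """Return the GPUs left after granting `requested`, or raise ValueError.

    existing: mapping of team name -> GPUs already allocated (assumed shape).
    total_quota: total GPUs available across all teams (assumed parameter).
    """
    if requested not in ALLOWED_GPU_COUNTS:
        raise ValueError(
            f"GPU count must be one of {ALLOWED_GPU_COUNTS}, got {requested}"
        )
    used = sum(existing.values())
    remaining = total_quota - used - requested
    if remaining < 0:
        raise ValueError(
            f"Request for {requested} GPUs exceeds quota: {total_quota - used} left"
        )
    return remaining
```

For example, with a quota of 16 GPUs and teams already holding 2 and 8, a request for 4 leaves 2 GPUs free, while a request for 8 is rejected.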

Infrastructure

  • Each team gets dedicated NVIDIA L40S GPUs
  • Pre-configured with development tools
  • Persistent storage across restarts
  • SSH and Jupyter notebook access

Resource Visualization

  • Bin packing algorithm shows optimal VM placement
  • Visual representation of host utilization
  • Real-time allocation feedback
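
This README does not say which bin packing heuristic the dashboard uses. As one possible illustration, the sketch below implements first-fit decreasing, a common heuristic that is fast but not guaranteed to find the optimal packing. The default `host_capacity` of 10 GPUs per host is an assumption chosen to match the largest allocation size, and the team names are made up.

```python
def first_fit_decreasing(requests, host_capacity=10):
    """Place team GPU requests onto hosts using first-fit decreasing.

    requests: mapping of team name -> GPU count.
    host_capacity: GPUs per host (assumed value, not taken from the project).
    Returns a list of hosts, each a dict of team -> GPUs placed on it.
    """
    hosts = []  # each entry: {"free": int, "teams": {name: gpus}}
    # Largest requests first; ties broken by name for deterministic output
    for team, gpus in sorted(requests.items(), key=lambda kv: (-kv[1], kv[0])):
        if gpus > host_capacity:
            raise ValueError(
                f"{team} needs {gpus} GPUs but a host holds {host_capacity}"
            )
        for host in hosts:
            if host["free"] >= gpus:
                break  # first host with enough room
        else:
            # No existing host fits, so open a new one
            host = {"free": host_capacity, "teams": {}}
            hosts.append(host)
        host["teams"][team] = gpus
        host["free"] -= gpus
    return [host["teams"] for host in hosts]


placement = first_fit_decreasing(
    {"vision": 8, "nlp": 4, "robotics": 4, "audio": 2, "infra": 1, "web": 1}
)
for index, teams in enumerate(placement):
    print(f"host {index}: {teams} ({sum(teams.values())} GPUs used)")
```

With these sample requests (20 GPUs in total) the heuristic fills two 10-GPU hosts completely, which is the kind of host utilization the visualization is meant to show.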

Project Structure

AI-Engine/
├── frontend/              # Next.js web application
│   ├── app/              # App router pages and API routes
│   ├── components/       # UI components
│   └── public/           # Static assets
├── data/                 # Team configuration data
├── scripts/              # Deployment and utility scripts
├── main.tf              # Terraform infrastructure
└── README.md

Access Information

After deployment:

  • SSH keys: keys/[team]_private_key
  • Connection scripts: outputs/connect_[team].sh
  • Jupyter notebooks: http://[team_ip]:8888
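
For scripting, the naming patterns above can be collected in one place. This is a minimal sketch: `access_info` is a hypothetical helper, and the team's IP address must be obtained separately after deployment.

```python
def access_info(team, team_ip):
    """Build the post-deployment access details for one team.

    Follows the naming patterns listed above; `team_ip` is supplied
    by the caller and is not looked up here.
    """
    return {
        "ssh_key": f"keys/{team}_private_key",
        "connect_script": f"outputs/connect_{team}.sh",
        "jupyter_url": f"http://{team_ip}:8888",
    }
```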
