π ParaCodex: A Profiling-Guided Autonomous Coding Agent for Reliable Parallel Code Generation and Translation
A framework for translating code between serial and parallel programming models (OpenMP, CUDA, HIP, SYCL, OpenCL) using AI agents, with automated performance optimization and correctness verification. Supports Rodinia, NAS, HeCBench, ParEval, and custom benchmarks.
ParaCodex implements a complete pipeline for:
- π Code Translation: Converting between serial C/C++ code and parallel implementations (OpenMP, CUDA, HIP, SYCL, OpenCL) using AI agents
- π Cross-Parallel Translation: Translating between different parallel programming models (e.g., CUDA β OpenMP)
- β‘ Performance Optimization: Multi-stage optimization with GPU offloading and profiling
- β Correctness Verification: Automated testing to ensure numerical equivalence
paracodex/
βββ pipeline/ # Core translation and optimization pipeline
β βββ agents/ # AI agent implementations
β βββ prompts/ # AI prompts for each translation path
β βββ utils/ # Pipeline utilities (config, logging, codex runner)
β βββ gate_sdk/ # GATE SDK for correctness verification
β βββ scripts/ # Helper scripts
β βββ webapp/ # Web interface (primary way to run ParaCodex)
β β βββ app.py # Flask server
β β βββ start.sh # Convenience startup script
β β βββ static/ # HTML/CSS/JS frontend
β βββ package.json # Node.js dependencies (@openai/codex-sdk)
β βββ AGENTS.md # Agent role definitions
βββ paracodex_skills/ # Codex CLI skills for each translation path
β βββ paracodex/ # Skills installed to ~/.codex/skills/ on setup
βββ workdirs/ # Working directories for each benchmark suite
β βββ serial_omp_rodinia_workdir/ # Rodinia benchmark workspace
β βββ serial_omp_nas_workdir/ # NAS benchmark workspace
β βββ serial_omp_hecbench_workdir/ # HeCBench workspace
β βββ cuda_omp_pareval_workdir/ # ParEval CUDA/OpenMP workspace
βββ results/ # Results and performance data
βββ setup_environment.sh # Main environment setup script
βββ install_nvidia_hpc_sdk.sh # Automated NVIDIA HPC SDK installer
βββ verify_environment.sh # Environment verification tool
βββ kill_gpu_processes.py/sh # GPU process management utilities
βββ requirements.txt # Python dependencies
- Multi-Agent Pipeline: Specialized AI agents for translation, optimization, and verification
- Serial-to-Parallel Translation: Converting serial code to OpenMP and CUDA implementations
- Cross-Parallel Translation: Translating between different parallel programming models
- GPU Offloading: Automatic translation to OpenMP with GPU acceleration
- CUDA Implementation: Direct CUDA kernel generation and optimization
- 2-Stage Process: Systematic optimization from correctness to performance
- GPU Profiling: Integration with NVIDIA Nsight Systems (nsys) for detailed analysis
- Performance Tracking: Continuous monitoring of optimization progress
- GATE SDK Integration: Automated numerical correctness checking
- Supervisor Agent: AI-powered code repair and correctness enforcement
- Numerical Equivalence: Ensures translated code produces identical results
- Browser-Based UI: No CLI required β select source directory, pick API pair, click Start
- Live Log Streaming: Watch the agent work in real time
- Job Management: Run multiple translations, inspect artifacts per job
- Node.js 22+ and npm: Required for the Codex agent engine
- Python 3.8+: For the pipeline and web server
- OpenAI Codex CLI: Installed automatically by
setup_environment.sh - OpenAI API Access:
- Pro/Plus Users:
codex login(recommended) - API Key Users: Set
OPENAI_API_KEYenvironment variable
- Pro/Plus Users:
- NVIDIA HPC SDK: For OpenMP GPU offloading (
nvc++compiler) - NVIDIA Nsight Systems (nsys): For GPU profiling (bundled with HPC SDK)
- NVIDIA GPU: With OpenMP offloading support
# Clone the repository
git clone https://github.com/Scientific-Computing-Lab/ParaCodex.git
cd paracodex
# Make scripts executable
chmod +x setup_environment.sh install_nvidia_hpc_sdk.sh verify_environment.sh
# Step 1: Install NVIDIA HPC SDK (if not already installed)
sudo ./install_nvidia_hpc_sdk.sh
source ~/.bashrc
# Step 2: Run the main environment setup
# This installs all dependencies AND copies the ParaCodex Codex skills
./setup_environment.sh
# Step 3: Verify your installation
./verify_environment.shsetup_environment.sh handles:
- β Node.js / npm check
- β
Codex CLI install (
@openai/codex) - β
Python dependencies (
requirements.txt) - β
Pipeline Node.js dependencies (
pipeline/package.json) - β
Webapp Python dependencies (
pipeline/webapp/requirements.txt) - β
ParaCodex Codex skills β
~/.codex/skills/paracodex/ - β NVIDIA HPC SDK / nsys / GPU checks
- β OpenAI API key / Codex login check
sudo ./install_nvidia_hpc_sdk.sh
source ~/.bashrc- Downloads NVIDIA HPC SDK v25.7 (~3GB)
- Installs to
/opt/nvidia/hpc_sdk - Configures
PATHand environment variables in~/.bashrc - Tests OpenMP CPU and GPU offload support
Option 1: Login with OpenAI Pro Account (Recommended)
codex login
# or with API key:
echo $OPENAI_API_KEY | codex login --with-api-keyOption 2: Environment Variable
export OPENAI_API_KEY='your-api-key-here'
echo "export OPENAI_API_KEY='your-api-key-here'" >> ~/.bashrcParaCodex is run through its web interface:
cd pipeline/webapp
bash start.shThen open http://localhost:5000 in your browser.
- Source Directory β Browse to the folder containing the code you want to translate
- Source API β The parallel API currently in your code (e.g.
serial,cuda) - Target API β The API to translate to (e.g.
omp,hip) - Click Start Pipeline and watch live logs stream in
- Inspect translated code and profiling artifacts per job
| From | To |
|---|---|
| serial | omp, cuda, ocl, hip, sycl |
| cuda | omp, hip, sycl, ocl |
| omp | cuda, serial |
| β¦ | β¦ |
| Engine | Status |
|---|---|
| Codex (OpenAI) | β Supported |
| OpenCode | π Coming soon |
Each benchmark suite has a pre-configured workdir containing the source kernels, GATE SDK, and golden reference implementations:
| Suite | Workdir | Kernels |
|---|---|---|
| Rodinia | workdirs/serial_omp_rodinia_workdir/ |
nw, srad, lud, b+tree, backprop, bfs, hotspot |
| NAS | workdirs/serial_omp_nas_workdir/ |
cg, ep, ft, mg |
| HeCBench | workdirs/serial_omp_hecbench_workdir/ |
26 kernels |
| ParEval | workdirs/cuda_omp_pareval_workdir/ |
4 kernels (CUDAβOMP) |
Point the webapp's Source Directory to workdirs/<suite>/data/src/ for each benchmark.
Main dependency installer and environment checker. Run this after cloning.
Automated installer for NVIDIA HPC SDK v25.7 with OpenMP GPU offload support.
Verifies all dependencies are installed and shows version info.
./verify_environment.shTerminates stale GPU processes.
./kill_gpu_processes.shsudo ./install_nvidia_hpc_sdk.sh
source ~/.bashrcIf installation fails: check disk space (need 10GB+), sudo privileges, and internet connection (~3GB download).
npm install -g @openai/codexcodex login # Pro users
# or
export OPENAI_API_KEY='your-api-key'export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin:$PATH
export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/lib:$LD_LIBRARY_PATH
echo 'export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin:$PATH' >> ~/.bashrc
source ~/.bashrcFLASK_RUN_PORT=8080 python pipeline/webapp/app.pypip install --upgrade pip
pip install -r requirements.txt
pip install -r pipeline/webapp/requirements.txtNote: ParaCodex is designed for research and development with multiple benchmark suites. Ensure you have an NVIDIA GPU with OpenMP offloading support and NVIDIA HPC SDK before use.
Supported Benchmarks: Rodinia (nw, srad, lud, b+tree, backprop, bfs, hotspot), NAS (cg, ep, ft, mg), HeCBench (26 kernels), ParEval (4 CUDAβOMP kernels).
