botAGI/README.md

botAGI — LLMOps / AI Platform Engineer · self-hosted LLM/RAG · serving · vector search · observability

I build self-hosted, private LLM/RAG platforms and the tooling around them: local inference, document ingestion, model fine-tuning, serving benchmarks, and source-grounded agent tooling. No cloud, no vendor lock-in.

Python · PyTorch · vLLM · llama.cpp · Ollama · Hugging Face

Dify · Open WebUI · RAGFlow · n8n · Pydantic · Phoenix

Qdrant · Milvus · Weaviate · Elasticsearch

FastAPI · PostgreSQL · MySQL · Redis · Celery · MinIO

Docker · Ansible · Traefik · NGINX · Authelia · Portainer · Prometheus · Grafana · Linux

NVIDIA · AMD · Vulkan

Focus

  • LLM serving: vLLM, llama.cpp, Ollama — on NVIDIA GB10 (DGX Spark) and AMD Strix Halo
  • RAG: vector DBs (Qdrant/Weaviate/Milvus), document parsing (Docling), chunking, reranking, evaluation
  • AI platform: Docker Compose, k3s/Ansible, monitoring (Prometheus/Grafana/Loki), security hardening, Day-2 ops
  • ML lifecycle: distillation, PEFT/LoRA, GGUF quantization, local deployment, evaluation
  • Backend: Python, FastAPI, PostgreSQL, Redis, Celery

Selected projects

| Project | What it proves | Stack |
|---|---|---|
| morpheus-ai | Source-grounded truth layer for agents: compile state, verify claims, sign receipts | Python, ed25519, MCP, FastAPI, 678 tests |
| AGmind | One-command private LLM/RAG platform, validated on DGX Spark GB10 | Docker Compose, vLLM, Dify, RAGFlow, monitoring |
| AGmind-ML | Full RAG-model lifecycle: distill → LoRA → GGUF → llama.cpp | PEFT, GGUF, Vulkan, eval, Hugging Face |
| AGmind64 | Self-hosted LLM/RAG on AMD Strix Halo / x86_64 | llama.cpp Vulkan/ROCm, Qdrant, Dify |

Writing

How I work with AI

I use AI as a force multiplier. I own architecture, security, and deployment decisions, and I verify everything against real hardware and real data — including the parts AI can't reason about (driver regressions, hardware memory behavior, network topology).

Links

Pinned

  1. AGmind (Public)

    Private LLM/RAG platform in one command for NVIDIA DGX Spark / GB10 (arm64). Validated on real hardware.

    Shell 22 3

  2. morpheus-ai (Public)

    WAKE.md for AI agents: compile project state so agents stop starting cold.

    Python 6

  3. AGmind64 (Public)

    Self-hosted LLM/RAG stack in one command — AMD Strix Halo / x86_64 (ROCm/Vulkan, Docker Compose)

    Python 5 1

  4. AGmind-ML (Public)

    Fine-tuned local RAG models. First: agmind-rag-splitter-ru — a Russian context-aware document splitter (T-lite-it-2.1 LoRA, distillation, GGUF, AMD Vulkan). 100% valid JSON / boundary-F1@±1 0.821.

    Python 5

  5. llmtrend (Public)

    AI trend monitoring platform for Hugging Face, GitHub, and arXiv with local LLM reports.

    Python 5

  6. AGmind-macos (Public)

    One-command local AI/RAG installer for macOS (Metal): Dify, Open WebUI, Ollama, Weaviate/Qdrant, Postgres, Redis. 230 tests.

    Shell 5
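
The AGmind-ML entry above quotes two evaluation figures for the splitter model: the share of outputs that are valid JSON, and boundary-F1@±1. The exact definitions used in that repository are not given here, so the following is only a minimal sketch under a common reading of those metrics: a predicted split boundary counts as correct if it falls within one position of a not-yet-matched reference boundary. The function names `valid_json_rate` and `boundary_f1` and the sample inputs are illustrative, not taken from the project.

```python
import json


def valid_json_rate(outputs):
    """Fraction of model outputs (strings) that parse as JSON."""
    if not outputs:
        return 0.0
    ok = 0
    for text in outputs:
        try:
            json.loads(text)
            ok += 1
        except json.JSONDecodeError:
            pass
    return ok / len(outputs)


def boundary_f1(predicted, reference, tolerance=1):
    """F1 over boundary positions with a +/- tolerance window.

    Assumption: boundaries are integer positions (e.g. sentence indices),
    and each reference boundary can be matched by at most one prediction.
    """
    pred = sorted(set(predicted))
    ref = sorted(set(reference))
    matched = 0
    i = j = 0
    # Greedy one-to-one matching over the two sorted lists.
    while i < len(pred) and j < len(ref):
        if abs(pred[i] - ref[j]) <= tolerance:
            matched += 1
            i += 1
            j += 1
        elif pred[i] < ref[j]:
            i += 1
        else:
            j += 1
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Two of three predictions land within one position of a reference boundary.
    print(boundary_f1([4, 7, 15], [3, 7, 12]))
    print(valid_json_rate(['{"chunks": [1, 2]}', "not json"]))
```

With `tolerance=0` the same function reduces to exact-match boundary F1, which makes it easy to see how much of a score comes from the ±1 window.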