Available for new projects

Building intelligent systems
that actually ship.

AI development, full-stack engineering, and infrastructure consulting.
From prototype to production — no hype, just results.

550GB GPU VRAM Fleet
1,014 Codebases Indexed
112 Gb/s RDMA Mesh
Zero Cloud Dependency

Engineer first.
Consultant second.

I'm Charles Chen, a software engineer and AI developer. I run my own multi-node GPU cluster and build production AI systems from scratch. I operate 550 GB of VRAM across NVIDIA RTX PRO 6000 Blackwell and DGX Spark hardware, interconnected via a 200 Gbps QSFP mesh with verified RoCE v2 RDMA. I built my own fleet control plane in Rust, indexed 1,014 codebases into a 37 GB knowledge system, and run everything local-first with zero cloud dependency.

I don't just talk about AI — I build the infrastructure, write the control planes, design the knowledge pipelines, and deploy the models. My fleet runs 24/7 serving inference at 110+ tok/s per GPU. Whether you need a private LLM deployment, a knowledge extraction system, or someone to architect your GPU infrastructure — I've already built it at scale for myself.

On-Prem GPU Fleet: 3x RTX PRO 6000 + 2x DGX Spark
Local-First AI: Private inference, no cloud dependency
Rust + Python: High-performance systems & AI pipelines

Tech Stack

AI / ML

PyTorch, Transformers, LLMs, RAG, Agentic AI, MCP, vLLM, Qdrant, tree-sitter, NVFP4/FP8, CUDA 13

Backend

Rust, Python, Node.js, Tokio, PostgreSQL, SQLite, Redis

Frontend

React, TypeScript, Next.js, Tailwind

Infrastructure

NVIDIA Blackwell, DGX Spark, vLLM, SGLang, RDMA/RoCE, NCCL, Docker, systemd, Cloudflare

What I build.

End-to-end engineering across the AI and software stack.

AI & Machine Learning

Custom model training, fine-tuning, RAG pipelines, and LLM integration. I run my own GPU fleet and deploy models locally — no cloud bills, no data leaving your network, full control.

  • Custom LLM fine-tuning (DPO, RLHF, SFT)
  • RAG systems & semantic knowledge bases
  • Multi-agent AI architectures
  • Model quantization & inference optimization

Full-Stack Development

End-to-end application development from database design to polished UIs. I build fast, reliable software with clean architecture that your team can maintain.

  • Web applications & APIs
  • Database architecture & optimization
  • Real-time systems & microservices
  • Performance engineering

Infrastructure & DevOps

On-prem GPU clusters, networking, and deployment automation. I build the same infrastructure I run daily — multi-node GPU fleets with automated health monitoring, auto-restart, and zero-downtime serving.

  • Multi-GPU cluster design & deployment
  • High-speed interconnects (QSFP, InfiniBand)
  • Fleet management & automated monitoring
  • Local-first AI infrastructure (no cloud lock-in)

Technical Consulting

Architecture reviews, technology strategy, and team mentoring. I help engineering teams make better technical decisions and ship faster.

  • Architecture review & planning
  • AI strategy & feasibility analysis
  • Code audits & tech debt reduction
  • Team training & mentoring

OCR & Document Processing

Extract structured data from PDFs, images, and scanned documents using local OCR and LLM pipelines. No cloud APIs and no data leaving your network: everything runs on-premises with GPU acceleration.

  • Local OCR with GPU acceleration
  • PDF & image data extraction
  • Legal & financial document processing
  • Layout analysis & table extraction

Embeddings & Semantic Search

Build local semantic search systems with vector embeddings, hybrid retrieval (BM25 + cosine similarity), and intelligent ranking. Power your apps with meaning-aware search that runs entirely on your hardware.

  • Vector embedding pipelines
  • Hybrid search (FTS5 + vector)
  • Knowledge base construction
  • Semantic deduplication & ranking
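
To illustrate the hybrid retrieval idea described above, here is a minimal sketch in Python. It is not the production pipeline: the function names, the min-max normalization, and the `alpha` weighting are assumptions chosen for illustration, and the two-dimensional vectors stand in for real embeddings. It scores documents with a small Okapi BM25 implementation and with cosine similarity, then fuses the two normalized scores.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so lexical and semantic scores are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_tokens, query_vec, docs_tokens, doc_vecs, alpha=0.5):
    """Return document indices ordered by a weighted blend of BM25 and cosine.

    alpha is a hypothetical tuning knob: 1.0 is purely lexical, 0.0 purely semantic.
    """
    lexical = min_max(bm25_scores(query_tokens, docs_tokens))
    semantic = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * l + (1 - alpha) * s for l, s in zip(lexical, semantic)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)


# Toy data: token lists plus made-up 2-D "embeddings".
docs_tokens = [
    ["gpu", "cluster", "rdma"],
    ["tax", "deduction", "rules"],
    ["gpu", "inference", "latency"],
]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

print(hybrid_rank(["gpu", "rdma"], [1.0, 0.0], docs_tokens, doc_vecs))  # prints [0, 2, 1]
```

In a real deployment the lexical side would more likely come from an existing full-text index (for example SQLite FTS5) and the vectors from an embedding model and vector store. The weighted sum here is only one fusion strategy among several.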

What I've built.

Real projects, real infrastructure, all running in production.

Rust AI CLI

Klaus Code — AI Development CLI

Full-featured AI CLI in Rust with streaming REPL, autonomous tool calling, plan mode, and multi-agent orchestration. Plugin system with marketplace, MCP client integration, and fleet-aware endpoint discovery via Enterprise. Workspace architecture with separate core library and CLI binary. v2.1.0.

Rust, Tokio, SQLite, MCP, vLLM, Streaming

DGX Ecosystem

DGX Code — NVIDIA Fleet CLI

Klaus fork specialized for the NVIDIA DGX ecosystem. Jensen personality, Volt theme, fleet management commands, MCP client, ACP server. Manages GPU allocation, model deployment, and inference across DGX Spark nodes. v0.9.0.

Rust, DGX Spark, MCP, ACP, Fleet Mgmt

Tax AI

Clawoitte — AI Tax Optimization

Agentic tax optimization system with three layers: L0 (categorize), L1 (personal → business reclassification), L2 (cross-category optimization). Four A2A specialist agents handle retirement, deductions, credits, and categorization. LLM-powered reasoning suggests savings that static rules miss. 278 tests, 15 profiles.

Python, LLMs, A2A Protocol, SQLite

Open Source

Second Opinion — Claude Code Plugin

Open-source Claude Code plugin that assembles 5 dynamic expert reviewers to critique AI-generated plans. Reviewers are domain-matched with constructive and adversarial perspectives. Two-pass review with severity tracking and delta reporting. v1.3.0, published on GitHub.

Python, MCP, Claude Code, LLMs

Mobile

Tutorify — Voice-First Study App

Flutter mobile app for voice-first learning with STT/TTS, voice commands, and AI-powered tutoring. Students interact by speaking — the app transcribes, processes via LLM, and speaks back. v1.1.0.

Flutter, Dart, STT/TTS, LLMs

Legal AI

Legal Practice AI Platform

Full-stack AI platform for a law firm — document analysis, case research automation, and client management. Custom LLM pipelines with domain-specific fine-tuning for legal document understanding.

LLMs, Fine-tuning, React, Node.js, PostgreSQL

E-Commerce

Amazon FBA & Merch Pipeline

End-to-end Amazon product management with SP-API and Ads API integration. Merch design pipeline with AI image generation, inventory tracking, and automated listing optimization.

Python, Amazon SP-API, Ads API, Image Gen

Web Development

Custom Websites & Web Applications

Professional website design and development — from marketing landing pages to full web applications. Responsive design, SEO, Cloudflare deployment, custom domains, and ongoing maintenance.

HTML/CSS/JS, React, Cloudflare Pages, SEO

Let's build
something great.

Have a project in mind? I'm always interested in hearing about new challenges — whether it's a greenfield AI project, a complex infrastructure problem, or scaling an existing system.