Pi Web UI: A Browser Interface for the Pi Coding Agent
A full-stack web interface that puts the Pi coding agent in the browser — with system-level access, session history, and model switching through a local LiteLLM proxy.