#strix-halo

by Zetaphor

Local LLM Infrastructure on Strix Halo

How LiteLLM, llama-swap, and Lemonade Server compose into a unified local inference platform — routing dozens of models across GPU and NPU through a single API endpoint, accessible anywhere via Tailscale and a local reverse proxy.

by Zetaphor

Benchmarking VoxCPM2 on Strix Halo

Running a 2B parameter tokenizer-free TTS model in both Python and C++ on AMD's integrated GPU — near-real-time speech synthesis on CPU, and the Vulkan crash that stopped GPU acceleration in its tracks.

by Zetaphor

LoopMaker Web

A browser-based AI music generation tool powered by ACE-Step, ported to Linux for local generation on AMD Strix Halo hardware.