#amd

by Zetaphor

Running DramaBox on Strix Halo

Getting Resemble AI's expressive TTS model running on AMD Strix Halo with no NVIDIA hardware. TheRock gfx1151 nightlies, the bitsandbytes preview for ROCm, reduced step counts, and torch.compile bring the 3.3B DiT from a real-time factor (RTF) of 4.0 down to 1.75.

by Zetaphor

Benchmarking Echo-TTS on Strix Halo

Running a diffusion-based TTS model on AMD's Strix Halo: patching CUDA-only code for CPU, discovering a bf16 GPU hang on gfx1151, and finding a hybrid GPU/CPU trick that beats every other TTS model I've tested.

by Zetaphor

Local LLM Infrastructure on Strix Halo

How LiteLLM, llama-swap, and Lemonade Server compose into a unified local inference platform, routing dozens of models across GPU and NPU through a single API endpoint, accessible anywhere via Tailscale and a local reverse proxy.

by Zetaphor

Benchmarking VoxCPM2 on Strix Halo

Running a 2B parameter tokenizer-free TTS model in both Python and C++ on AMD's integrated GPU, near-real-time speech synthesis on CPU, and the Vulkan crash that stopped GPU acceleration in its tracks.
