mlcweb is a lightweight launcher script that runs an MLC LLM model server and Open WebUI side by side, inside a self-managed Python virtual environment.
The goal is to make setting up an mlc-llm environment as painless as possible.
- Automatically sets up a Python virtual environment at `~/.mlcweb/venv`
- Installs or updates:
  - `torch`, `torchvision`, `torchaudio`
  - `mlc-llm-nightly-cpu`, `mlc-ai-nightly-cpu`
  - `open-webui`
- Uses `mlc_llm serve` to launch your chosen model (default: Qwen3-8B)
- Opens http://localhost:8080 in your browser when ready
- Gracefully handles `Ctrl+C` to stop both processes
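The `Ctrl+C` handling above can be sketched with a shell `trap`. This is an illustrative reconstruction, not the actual mlcweb source: the `sleep` commands stand in for the model server and Open WebUI, and the variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Hedged sketch of the shutdown logic: start two background processes,
# remember their PIDs, and kill both when SIGINT (Ctrl+C) arrives.
sleep 30 & MODEL_PID=$!
sleep 30 & WEBUI_PID=$!

cleanup() {
  kill "$MODEL_PID" "$WEBUI_PID" 2>/dev/null
  STOPPED=yes
}
trap cleanup INT TERM

# Simulate a Ctrl+C after one second so the sketch is self-contained.
( sleep 1; kill -INT $$ ) &
wait 2>/dev/null
echo "shutdown complete: STOPPED=$STOPPED"
```

Because the trap handler runs in the main shell, it can reach both saved PIDs and stop the processes together instead of leaving one orphaned.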
Note: PyTorch CPU wheels are used because both MLC and Open WebUI require `torch` as a dependency, but actual model inference is handled by MLC via Vulkan, not PyTorch.
Before installing, make sure your system has:
- Python 3.11 or 3.12
- `curl`
- A Vulkan-compatible GPU + drivers
- `xdg-open`
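The requirements above can be checked up front with a short script. This is just a convenience sketch, not part of the installer; `vulkaninfo` (from the `vulkan-tools` package) is assumed here as a stand-in check for working Vulkan drivers.

```shell
#!/usr/bin/env bash
# Illustrative prerequisite check for mlcweb (not part of the installer).
missing=0
for tool in curl xdg-open vulkaninfo; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "missing: $tool"
    missing=1
  fi
done

# Python 3.11 or 3.12 is required.
pyver=$(python3 -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")' 2>/dev/null)
case "$pyver" in
  3.11|3.12) echo "python $pyver OK" ;;
  *) echo "unsupported python: ${pyver:-not found}"; missing=1 ;;
esac

[ "$missing" -eq 0 ] && echo "all prerequisites satisfied"
```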
Run the following command to install the launcher into `/usr/local/bin/mlcweb`:

```shell
curl -fsSL https://raw.githubusercontent.com/JackBinary/mlcweb/refs/heads/main/install-mlcweb.sh | sudo bash
```

This will download the launcher script and make `mlcweb` available globally.
Run with the default model:

```shell
mlcweb
```

You can explore MLC-compatible models here: https://huggingface.co/mlc-ai
To use a different model:

```shell
mlcweb HF://mlc-ai/Qwen3-14B-q4f16_1-MLC
```

You can also pass additional arguments to `mlc_llm` (see the MLC documentation):

```shell
mlcweb HF://mlc-ai/Qwen3-32B-q4f16_1-MLC --overrides "tensor_parallel_shards=2"
```

Note: You must always specify a model as the first argument; the script treats the first argument as an alternate model.
This project is licensed under the Apache License 2.0.