
vAquila Control Center
Operate your local vLLM runtime with confidence. Launch models, monitor VRAM, inspect logs, and validate inference from one reliable control surface.
At a glance:
Managed containers: 1 (all vAquila containers)
Running now: 1 (currently serving requests)
Cached models: 2 (ready from the local HF cache)
GPU utilization
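For the GPU utilization tile, here is a minimal sketch of how such a reading can be taken on the host, assuming an NVIDIA GPU with nvidia-smi on the PATH. The query fields are standard nvidia-smi options; the wiring around them is illustrative, not vAquila's actual implementation. The same memory.used/memory.total pair also backs the VRAM monitoring mentioned above.

import subprocess

def gpu_stats():
    """Query per-GPU utilization and VRAM via nvidia-smi (assumes NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    stats = []
    for line in out.strip().splitlines():
        idx, util, used, total = (field.strip() for field in line.split(","))
        stats.append({"gpu": int(idx), "util_pct": int(util),
                      "vram_used_mib": int(used), "vram_total_mib": int(total)})
    return stats

for s in gpu_stats():
    print(f"GPU {s['gpu']}: {s['util_pct']}% util, "
          f"{s['vram_used_mib']}/{s['vram_total_mib']} MiB VRAM")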
Deployment
Launch a vLLM container with explicit runtime knobs. Configure ports, context lengths, timeouts, and more.
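As an illustration of those knobs, here is one way a launch could be assembled with the stock vllm/vllm-openai Docker image. The container name, model, and cache path are placeholders; --model, --port mapping, and --max-model-len are standard vLLM server options, and the remaining knobs follow the same pattern.

import subprocess
from pathlib import Path

def launch_vllm(model: str, port: int = 8000, max_model_len: int = 8192,
                name: str = "vaquila-demo") -> str:
    """Start a detached vLLM OpenAI-compatible server container; return its ID."""
    hf_cache = Path.home() / ".cache" / "huggingface"
    cmd = [
        "docker", "run", "-d", "--name", name,
        "--gpus", "all",                  # expose the host GPUs
        "--ipc=host",                     # shared memory for model loading
        "-p", f"{port}:8000",             # host port -> container's API port
        "-v", f"{hf_cache}:/root/.cache/huggingface",  # reuse the local HF cache
        "vllm/vllm-openai:latest",
        "--model", model,
        "--max-model-len", str(max_model_len),  # explicit context-length knob
    ]
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout.strip()

Calling launch_vllm("Qwen/Qwen2.5-0.5B-Instruct", port=8100) returns the new container's ID, after which the OpenAI-compatible endpoint at http://localhost:8100/v1 can be probed to validate inference.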
Background Jobs
Track asynchronous launches and drill into both task initialization logs and raw container output in real time.
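A sketch of the raw-output half, built on Docker's standard docker logs -f follower; the container name is hypothetical, and a real dashboard would multiplex this into its log pane rather than print to stdout.

import subprocess

def stream_logs(container: str):
    """Yield container output lines as they arrive (follows until interrupted)."""
    proc = subprocess.Popen(["docker", "logs", "-f", container],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            text=True)
    try:
        for line in proc.stdout:
            yield line.rstrip()
    finally:
        proc.terminate()

for line in stream_logs("vaquila-demo"):
    print(line)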
Host Metrics
View a detailed breakdown of CPU usage, RAM allocation, and logical-core distribution per model.
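Host-wide figures like these can come from psutil, a common cross-platform choice. This sketch reports totals only: per-model attribution would additionally require mapping each container's PIDs (via docker stats or cgroups), which is beyond a few lines.

import psutil

def host_metrics() -> dict:
    """Snapshot host-wide CPU, RAM, and core figures for a metrics panel."""
    vm = psutil.virtual_memory()
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),            # mean over 1 s
        "per_core_percent": psutil.cpu_percent(interval=1, percpu=True),
        "logical_cores": psutil.cpu_count(logical=True),
        "ram_used_gib": round(vm.used / 2**30, 2),
        "ram_total_gib": round(vm.total / 2**30, 2),
    }

print(host_metrics())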
vAquila Enterprise
Scale your local AI infrastructure across teams. Advanced security, compliance, and orchestration built for production.
SSO & SAML
Secure authentication that integrates directly with your corporate identity providers.
Role-Based Access Control (RBAC)
Granular permissions: control who can launch, view, or stop specific models.
Multi-Node Clusters
Deploy and orchestrate vLLM instances across multiple remote GPU servers simultaneously.
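In its simplest form, such orchestration amounts to fanning the single-host launch out over SSH, as in the sketch below. The host names and model are placeholders, and a production deployment would use proper cluster orchestration rather than raw ssh; this only illustrates the fan-out shape.

import subprocess
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["gpu-node-1", "gpu-node-2"]  # placeholder SSH hosts

def remote_launch(host: str, model: str) -> str:
    """Start a vLLM container on a remote GPU server over SSH; return host + ID."""
    remote_cmd = ("docker run -d --gpus all -p 8000:8000 "
                  "vllm/vllm-openai:latest --model " + model)
    result = subprocess.run(["ssh", host, remote_cmd],
                            capture_output=True, text=True, check=True)
    return f"{host}: {result.stdout.strip()}"

with ThreadPoolExecutor() as pool:
    for line in pool.map(lambda h: remote_launch(h, "Qwen/Qwen2.5-0.5B-Instruct"),
                         HOSTS):
        print(line)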