
vAquila Control Center
Operate your local vLLM runtime with confidence. Launch models, monitor VRAM, inspect logs, and validate inference from one reliable control surface.
At a glance:
Managed containers: 1 (all vAquila containers)
Running now: 1 (currently serving requests)
Cached models: 2 (ready from the local HF cache)
GPU utilization
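For the GPU utilization tile, here is a minimal sketch of how such a reading can be taken on the host, assuming an NVIDIA GPU with nvidia-smi on the PATH. The query fields are standard nvidia-smi options; the wiring around them is illustrative, not vAquila's actual implementation. The same memory.used/memory.total pair also backs the VRAM monitoring mentioned above.

import subprocess

def gpu_stats():
    """Query per-GPU utilization and VRAM via nvidia-smi (assumes NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    stats = []
    for line in out.strip().splitlines():
        idx, util, used, total = (field.strip() for field in line.split(","))
        stats.append({"gpu": int(idx), "util_pct": int(util),
                      "vram_used_mib": int(used), "vram_total_mib": int(total)})
    return stats

for s in gpu_stats():
    print(f"GPU {s['gpu']}: {s['util_pct']}% util, "
          f"{s['vram_used_mib']}/{s['vram_total_mib']} MiB VRAM")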
Deployment
Launch a vLLM container with explicit runtime knobs. Configure ports, context lengths, timeouts, and more.
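As an illustration of those knobs, here is one way a launch could be assembled with the stock vllm/vllm-openai Docker image. The container name, model, and cache path are placeholders; --model, --port mapping, and --max-model-len are standard vLLM server options, and the remaining knobs follow the same pattern.

import subprocess
from pathlib import Path

def launch_vllm(model: str, port: int = 8000, max_model_len: int = 8192,
                name: str = "vaquila-demo") -> str:
    """Start a detached vLLM OpenAI-compatible server container; return its ID."""
    hf_cache = Path.home() / ".cache" / "huggingface"
    cmd = [
        "docker", "run", "-d", "--name", name,
        "--gpus", "all",                  # expose the host GPUs
        "--ipc=host",                     # shared memory for model loading
        "-p", f"{port}:8000",             # host port -> container's API port
        "-v", f"{hf_cache}:/root/.cache/huggingface",  # reuse the local HF cache
        "vllm/vllm-openai:latest",
        "--model", model,
        "--max-model-len", str(max_model_len),  # explicit context-length knob
    ]
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout.strip()

Calling launch_vllm("Qwen/Qwen2.5-0.5B-Instruct", port=8100) returns the new container's ID, after which the OpenAI-compatible endpoint at http://localhost:8100/v1 can be probed to validate inference.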
Background Jobs
Track asynchronous launches and drill into both task initialization logs and raw container output in real time.
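A sketch of the raw-output half, built on Docker's standard docker logs -f follower; the container name is hypothetical, and a real dashboard would multiplex this into its log pane rather than print to stdout.

import subprocess

def stream_logs(container: str):
    """Yield container output lines as they arrive (follows until interrupted)."""
    proc = subprocess.Popen(["docker", "logs", "-f", container],
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            text=True)
    try:
        for line in proc.stdout:
            yield line.rstrip()
    finally:
        proc.terminate()

for line in stream_logs("vaquila-demo"):
    print(line)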
Host Metrics
View a detailed breakdown of CPU usage, RAM allocation, and logical-core distribution per model.
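Host-wide figures like these can come from psutil, a common cross-platform choice. This sketch reports totals only: per-model attribution would additionally require mapping each container's PIDs (via docker stats or cgroups), which is beyond a few lines.

import psutil

def host_metrics() -> dict:
    """Snapshot host-wide CPU, RAM, and core figures for a metrics panel."""
    vm = psutil.virtual_memory()
    return {
        "cpu_percent": psutil.cpu_percent(interval=1),            # mean over 1 s
        "per_core_percent": psutil.cpu_percent(interval=1, percpu=True),
        "logical_cores": psutil.cpu_count(logical=True),
        "ram_used_gib": round(vm.used / 2**30, 2),
        "ram_total_gib": round(vm.total / 2**30, 2),
    }

print(host_metrics())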
vAquila Enterprise
Scale your local AI infrastructure across teams. Advanced security, compliance, and orchestration built for production.
SSO & SAML
Secure authentication that integrates directly with your corporate identity providers.
Role-Based Access Control (RBAC)
Granular permissions: control who can launch, view, or stop specific models.
Multi-Node Clusters
Deploy and orchestrate vLLM instances across multiple remote GPU servers simultaneously.
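In its simplest form, such orchestration amounts to fanning the single-host launch out over SSH, as in the sketch below. The host names and model are placeholders, and a production deployment would use proper cluster orchestration rather than raw ssh; this only illustrates the fan-out shape.

import subprocess
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["gpu-node-1", "gpu-node-2"]  # placeholder SSH hosts

def remote_launch(host: str, model: str) -> str:
    """Start a vLLM container on a remote GPU server over SSH; return host + ID."""
    remote_cmd = ("docker run -d --gpus all -p 8000:8000 "
                  "vllm/vllm-openai:latest --model " + model)
    result = subprocess.run(["ssh", host, remote_cmd],
                            capture_output=True, text=True, check=True)
    return f"{host}: {result.stdout.strip()}"

with ThreadPoolExecutor() as pool:
    for line in pool.map(lambda h: remote_launch(h, "Qwen/Qwen2.5-0.5B-Instruct"),
                         HOSTS):
        print(line)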