Building an AI-ready infrastructure with Vates VMS
Artificial Intelligence (AI) is changing industries, from generative AI models in finance and healthcare to predictive maintenance in manufacturing. However, building a suitable infrastructure to support these workloads is more complex than simply adding GPUs to existing servers.
Organizations must balance flexibility, scalability, and performance while meeting regulatory and data-privacy requirements and keeping costs under control. This is especially important for sensitive sectors like government, healthcare, and fintech, where running AI workloads in the public cloud raises sovereignty and security concerns.
That’s where the choice of infrastructure matters, and it’s exactly where Vates VMS can help: running your own AI workloads on your own hardware while keeping complete control over your infrastructure TCO.
Why on-prem matters more than ever
If you’re serious about AI, you’re probably dealing with sensitive data. Medical images. Financial records. Proprietary datasets. These aren’t things you want floating around in a public cloud. While generative AI models require significant compute, storage, and networking resources, they also raise new legal challenges, including:
- Data privacy and sovereignty
- Copyright and licensing risks
- AI model reproducibility and control
For many sectors, the only viable solution is on-premises infrastructure. Running AI workloads in your own datacenter ensures:
- Continuous compliance with frameworks like GDPR, HIPAA, and industry-specific mandates
- Full control over sensitive datasets and intellectual property
- Predictable cost structures without cloud overages or egress fees
What makes a successful AI environment?
Let's break it down:
- Compute: Multi-core CPUs and, above all, powerful GPUs.
- Storage: Fast, high-throughput drives for massive datasets.
- Networking: At least 10GbE to keep up with data-hungry models.
- Automation: Snapshots, backups, and monitoring to keep your experiments reproducible and secure.
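As a quick sanity check, the compute, GPU, and networking items above can be probed from a shell on any Linux host. This is only a rough sketch: the availability of `nvidia-smi` and `ethtool`, and the interface name `eth0`, are assumptions you should adapt to your environment.

```shell
#!/bin/sh
# Rough environment check for the checklist above (Linux host assumed).

# Compute: core count visible to the OS.
echo "CPU cores: $(nproc)"

# GPUs: list them if the NVIDIA driver stack is present.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L
else
  echo "nvidia-smi not found (no NVIDIA driver stack on this host)"
fi

# Networking: NIC link speed (eth0 is a placeholder; substitute your interface).
if command -v ethtool >/dev/null 2>&1; then
  ethtool eth0 2>/dev/null | grep -i speed || echo "Could not read eth0 link speed"
else
  echo "ethtool not found"
fi
```

None of this replaces proper benchmarking, but it catches the obvious gaps (no GPU visible, a NIC negotiating at 1GbE) before you start moving datasets around.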
Vates VMS: built for the AI era
A high-performance AI environment is about more than just hardware: it’s about an ecosystem that balances performance, flexibility, and control while keeping costs predictable.
- Scalability and flexibility: With per-host licensing, Vates VMS doesn’t limit you by core or socket count, so you can run dense, high-performance servers without surprise licensing costs. This is critical for AI workloads that need to scale fast and flexibly.
- High-performance storage: Vates VMS already supports a range of storage options, from local NVMe to shared storage repositories (SRs). For those looking to consolidate compute and storage, our XOSTOR layer, an alternative to vSAN, lets you pool storage across your cluster, simplifying management and maximizing performance.
- Automation and monitoring: Beyond built-in snapshots and backups, Vates VMS stands out with a robust toolset for real-world automation and DevOps workflows. Our DevOps tools integrate seamlessly with modern stacks, whether you’re building Kubernetes clusters, leveraging Pulumi or Terraform, or automating with your own playbooks.
For programmatic control, Vates VMS offers a fully documented (Swagger) REST API that allows DevOps teams to automate VM operations at scale. It is already relied upon by large organizations for end-to-end infrastructure lifecycle management.
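As a sketch of what that looks like in practice (the host name, token, and endpoint paths below are placeholders and assumptions; check the Swagger documentation exposed by your own Xen Orchestra instance for the exact endpoints):

```shell
#!/bin/sh
# Placeholder values: substitute your Xen Orchestra host and an API token.
XO_HOST="xo.example.com"
XO_TOKEN="your-token-here"

# Helper that builds a REST endpoint URL for this XO host.
build_url() {
  echo "https://${XO_HOST}/rest/v0/$1"
}

# List VMs; authentication is passed as a token cookie.
# (Add -k only if your instance uses a self-signed certificate.)
echo "Would call: curl -b \"authenticationToken=${XO_TOKEN}\" $(build_url vms)"

# Other operations (snapshots, backups, migrations) follow the same pattern;
# verify the exact action paths in your instance's Swagger docs, e.g.:
# curl -X POST -b "authenticationToken=${XO_TOKEN}" "$(build_url 'vms/<vm-uuid>/actions/snapshot')"
```

From there, the same calls slot naturally into CI pipelines or playbooks, since everything is plain HTTPS plus a token.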
- Security and compliance: Running AI workloads on-premises with Vates VMS means you keep full control over sensitive data and meet strict compliance requirements.
- Predictable, cost-effective licensing: Unlike solutions that charge per core or per CPU socket, Vates VMS’ licensing is simple – per host, no hidden fees. You’re free to invest in the hardware your AI workloads demand, from multi-GPU setups to dense memory configurations, without worrying about licensing costs skyrocketing. And let’s be realistic: given Broadcom’s track record since acquiring VMware, it’s only a matter of time before GPU-based monetization strategies become a thing. Vates VMS is built to keep your costs predictable, no matter how powerful your AI stack grows.
- Raw compute power: Vates VMS natively supports GPU passthrough, giving your virtual machines direct, high-performance access to powerful GPUs, with no compromises and no added complexity.
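To make the passthrough step concrete, here is a minimal sketch of assigning a GPU to a VM on an XCP-ng host using the `other-config:pci` convention. The PCI address and VM UUID are hypothetical placeholders; run this on a host where `xe` is available, with the target VM halted, and note that the device must also be hidden from dom0 (see the XCP-ng passthrough documentation) before the VM can claim it.

```shell
#!/bin/sh
# Sketch of XCP-ng GPU passthrough assignment. All values are placeholders.

PCI_ADDR="0000:01:00.0"   # hypothetical GPU address; find yours with: lspci | grep -i nvidia
VM_UUID="0e645f0e-aaaa-bbbb-cccc-000000000000"   # hypothetical VM UUID from: xe vm-list

# XCP-ng expects the passthrough spec as "<index>/<pci-address>".
PCI_SPEC="0/${PCI_ADDR}"

if command -v xe >/dev/null 2>&1; then
  # Attach the device to the halted VM; takes effect at the next boot.
  xe vm-param-set uuid="$VM_UUID" other-config:pci="$PCI_SPEC"
else
  echo "xe not found: run this on an XCP-ng host. Would set other-config:pci=$PCI_SPEC"
fi
```

Once the VM boots, the guest sees the GPU as a regular PCI device and the standard vendor drivers (and CUDA, for NVIDIA cards) install as they would on bare metal.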
Build your own private AI with our reference architecture
If you're ready to run large language models directly on your own infrastructure, we’ve got you covered. We just released a dedicated tutorial to walk you through the full process of building a GPU-accelerated LLM setup with XCP-ng: from hardware preparation and VM creation to enabling GPU passthrough and deploying your own Ollama + Open WebUI stack.