On-Demand Cloud FAQ

General Questions

What is Hyperbolic On-Demand Cloud?

On-Demand Cloud is enterprise-grade GPU infrastructure that gives you instant access to premium H100 clusters across multiple providers through a single Hyperbolic account. No sales calls, no procurement delays—just click deploy and start training.

Do I need to manage multiple provider accounts?

Nope! That's the whole point. We handle relationships with multiple GPU providers so you get access to their inventory through one Hyperbolic account. No more juggling different platforms, billing systems, or support channels.

Pricing & Billing

How much does it cost?

Single H100 instances: $1.49/hour
InfiniBand clusters: $1.99/hour per GPU
Volume discounts: Available starting at 16 GPUs
No hidden fees: What you see is what you pay
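
To make the rates above concrete, here is a minimal sketch of a cost estimate. The `estimate_cost` helper is hypothetical (it is not part of any Hyperbolic tooling), and it ignores volume discounts and commitment pricing, so treat the output as a rough upper bound.

```shell
#!/bin/sh
# Hypothetical helper: estimate on-demand cost from the hourly rates listed above.
# Usage: estimate_cost GPUS HOURS RATE_PER_GPU_HOUR
# Volume discounts and commitment pricing are not modeled here.
estimate_cost() {
  # awk does the floating-point arithmetic that plain sh lacks.
  awk -v g="$1" -v h="$2" -v r="$3" 'BEGIN { printf "%.2f\n", g * h * r }'
}

# One H100 for 24 hours at $1.49/hour.
estimate_cost 1 24 1.49   # prints 35.76

# An 8-GPU InfiniBand cluster for 24 hours at $1.99/hour per GPU.
estimate_cost 8 24 1.99   # prints 382.08
```

Billing granularity is not modeled either; the sketch simply multiplies GPUs, hours, and rate.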

How does this compare to other providers?

You'll save up to 70% compared to AWS, Azure, and GCP. For example, AWS charges $9.01/hour for an H100 instance—we charge $1.49/hour for the same hardware.

What payment methods do you accept?

We support traditional payment methods (credit cards, bank transfers) and cryptocurrency payments, just like our other services.

Are there minimum commitments?

Not for true on-demand usage! You can spin up instances hourly. We also offer flexible commitments (monthly, annual) for additional savings if you prefer predictable costs.

Technical Details

What GPU types are available?

We currently focus on H100 SXM 80GB instances, both bare metal and virtualized. We're expanding to other GPU types based on demand, so let us know what you need!

What's included with InfiniBand clusters?

InfiniBand clusters come with high-speed interconnect networking optimized for distributed training. Perfect for multi-GPU workloads that need fast GPU-to-GPU communication across nodes.

What's included with my instance?

Every GPU you rent includes 2TB of high-performance NVMe attached storage at no additional cost. This gives you plenty of space for datasets, model checkpoints, and outputs without worrying about storage costs.

Clusters start at 8 connected GPUs and scale up to 128+ GPUs. Need something larger? Our team can help with custom configurations.

Where is all my storage that comes with my on-demand rental?

The storage is available but not yet mounted. To use it, you need to create a volume and mount it manually. The commands below create a volume and mount it at /scratch. Adjust the device names and mount point to your needs:

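# Install the LVM tools
sudo apt -y install lvm2
# Pool the unused NVMe drives into one volume group (confirm device names with lsblk first)
sudo vgcreate vg0 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1
# Create one logical volume spanning all free space and format it as ext4
sudo lvcreate -n lv_scratch -l 100%FREE vg0
sudo mkfs.ext4 /dev/mapper/vg0-lv_scratch
# Register the mount in /etc/fstab so it persists across reboots, then mount it
echo '/dev/mapper/vg0-lv_scratch /scratch ext4 defaults 0 0' | sudo tee -a /etc/fstab
sudo mkdir -p /scratch
sudo mount -v /scratch
# Confirm the filesystem type and size
df -hPT /scratch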
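
After running the commands above, it can be useful to confirm that the mount took effect, for example from a job startup script. The sketch below assumes a Linux instance with /proc/mounts available; `is_mounted` is a hypothetical helper name.

```shell
#!/bin/sh
# is_mounted PATH: succeed if PATH is listed as a mount point in /proc/mounts (Linux only).
is_mounted() {
  awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' /proc/mounts
}

if is_mounted /scratch; then
  # Show filesystem type, size, and free space for the scratch volume.
  df -hPT /scratch
else
  echo "/scratch is not mounted; check the lsblk output and the /etc/fstab entry" >&2
fi
```

Checking /proc/mounts rather than only testing that the directory exists avoids silently writing datasets to the root disk when the volume is missing.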

How fast is deployment?

Most instances deploy in under 5 minutes. Even large clusters with InfiniBand networking typically take less than 10 minutes—compared to weeks with traditional providers.

What if I need to download a large Docker image?

Our instances have fast network download speeds, so a multi-gigabyte image can be downloaded during instance startup, typically within 30 minutes.

Can I use custom machine images?

This is currently on our roadmap and will be available soon. You'll be able to select or upload your own images that get installed during instance instantiation.

How much system RAM do instances include?

Each VM gets 116GB of RAM allocated per GPU, while bare metal nodes include 1TB of RAM per node.

Can I use my own Docker images?

Yes! You can bring custom Docker images or use our pre-configured environments with popular AI/ML frameworks like PyTorch, TensorFlow, and JAX.

What is the difference between bare metal and a VM?

Bare metal rentals give you access to the underlying system, which means you can do things like change the CUDA version. Unless you need that level of control, a VM will work for most use cases. Note that bare metal startup times are also slightly longer.

Why can I sometimes rent 8 or more GPUs, but not 1?

Suppliers can configure their GPUs to be rented only in fixed sets, for example 8 GPUs at a time. If no supplier is currently offering the increment you want, you may need to wait for supply to be added or freed up by other renters.

Getting Started

How do I get access?

On-Demand Cloud is live now! Get your GPUs at app.hyperbolic.ai or contact our team for reserved deals.

What if I need help getting started?

We're here to help! Check out our documentation, or contact support. For large deployments, we offer technical consultations to optimize your setup.

Reliability & Support

What's your uptime guarantee?

We provide a 99.5% uptime SLA with service credits for any downtime. Our infrastructure is monitored 24/7 with proactive issue detection.

What kind of support do you offer?

Technical Support: Direct access for On-Demand Cloud customers
Enterprise Support: Dedicated support for large deployments

What happens if my instance goes down?

Our team monitors infrastructure 24/7 and will work to restore service immediately. You'll receive service credits for any SLA violations, and we can help migrate workloads to backup instances if needed.

Can I get dedicated support for my team?

Yes! For larger deployments or enterprise customers, we offer dedicated support channels with faster response times and direct access to our engineering team.

Are IPs on the rentals public?

The IPs of on-demand GPU machines are public, but access is secured by SSH key authentication.
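
As an illustration of key-based access, the sketch below builds an SSH command that offers only one specific key and never falls back to a password prompt. The host address, login name, and key path are placeholders (assumptions, not values from this FAQ); use the ones shown for your instance. By default the script only prints the command; set DRY_RUN=0 to actually connect.

```shell
#!/bin/sh
# Placeholder values: replace with the details shown for your own instance.
HOST="${HOST:-203.0.113.10}"            # documentation-range example IP, not a real instance
SSH_USER="${SSH_USER:-ubuntu}"          # assumed login name
KEY="${KEY:-$HOME/.ssh/id_ed25519}"     # assumed private key path
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print the command, 0 = run it

connect() {
  # -i selects the private key; IdentitiesOnly stops the client from offering other keys;
  # PasswordAuthentication=no makes the client fail rather than prompt for a password.
  set -- ssh -i "$KEY" -o IdentitiesOnly=yes -o PasswordAuthentication=no "$SSH_USER@$HOST"
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

connect
```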

Use Cases

Is this good for training large models?

Absolutely! On-Demand Cloud is perfect for distributed training with multi-node clusters. The InfiniBand networking ensures your GPUs can communicate efficiently across nodes.
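
As a rough sketch of what a multi-node launch can look like, the script below assembles a PyTorch `torchrun` command for one node of a cluster. It assumes PyTorch is installed on every node; the head-node address, port, and `train.py` script are placeholders rather than values provided by the platform. It prints the command by default; set DRY_RUN=0 to execute it.

```shell
#!/bin/sh
# Placeholder settings for a 2-node, 16-GPU job; adjust to your cluster.
NNODES="${NNODES:-2}"                   # number of nodes in the job
GPUS_PER_NODE="${GPUS_PER_NODE:-8}"     # one worker process per GPU
MASTER_ADDR="${MASTER_ADDR:-10.0.0.1}"  # assumed address of the rank-0 node
MASTER_PORT="${MASTER_PORT:-29500}"     # any free port on the rank-0 node
TRAIN_SCRIPT="${TRAIN_SCRIPT:-train.py}" # hypothetical training script
DRY_RUN="${DRY_RUN:-1}"                 # 1 = print the command, 0 = run it

# launch_node RANK: run this on each node with its own rank (0 .. NNODES-1).
launch_node() {
  set -- torchrun \
    --nnodes="$NNODES" \
    --nproc_per_node="$GPUS_PER_NODE" \
    --node_rank="$1" \
    --master_addr="$MASTER_ADDR" \
    --master_port="$MASTER_PORT" \
    "$TRAIN_SCRIPT"
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

launch_node 0
```

Every node runs the same command with a different rank. Setting the environment variable NCCL_DEBUG=INFO before launching makes NCCL log which network transport it selected, which is one way to check that traffic is going over InfiniBand.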

What about inference workloads?

While On-Demand Cloud can handle inference, our Serverless Inference platform might be more cost-effective for API-based inference workloads. On-Demand is ideal when you need dedicated hardware or custom environments.

Can I use this for research projects?

Definitely! Many research teams use On-Demand Cloud because there are no minimum commitments, you get transparent pricing, and you can access enterprise-grade hardware without enterprise sales processes.

What's the largest amount of GPU VRAM available?

We currently offer H100s with 80GB of VRAM each for On-Demand. You can scale to multiple GPUs for larger memory pools. We'll be expanding to more GPU models soon.
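
For quick sizing, the sketch below totals the per-GPU figures quoted in this FAQ (80GB of VRAM, 116GB of VM system RAM, and 2TB of NVMe storage per GPU). The `footprint` helper is hypothetical, and the RAM line applies to VMs only, since bare metal nodes are specified per node.

```shell
#!/bin/sh
# footprint GPUS: print aggregate resources for a VM rental with that many H100s.
# Per-GPU figures are taken from this FAQ.
footprint() {
  gpus="$1"
  echo "GPUs: $gpus"
  # Total VRAM is a sum across devices, not a single address space;
  # a model larger than 80 GB must be sharded across GPUs.
  echo "VRAM: $((gpus * 80)) GB"
  echo "System RAM (VM): $((gpus * 116)) GB"
  echo "NVMe storage: $((gpus * 2)) TB"
}

footprint 8
```

For 8 GPUs this prints 640 GB of VRAM, 928 GB of system RAM, and 16 TB of NVMe storage.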

Can I always get the instance size I need?

Availability depends on partner inventory and current demand, so we can't guarantee specific configurations are always available. We're working on queuing and reservation systems to improve this experience.

Can I reserve instances in advance?

Instance reservations are currently on our roadmap and will be released soon. This will include options for reserved pools and reduced-cost standby instances.

Migration & Integration

How hard is it to migrate from other providers?

Most customers migrate their first workload within a few hours. Our instances support standard frameworks and tools, so your existing code typically works without modification.

Do you support Kubernetes?

Yes! You can deploy Kubernetes clusters or use our instances with your existing K8s infrastructure. We also support popular orchestration tools like Ray and Slurm.

Can I connect this to my existing infrastructure?

Absolutely. Our instances support VPN connections, custom networking configurations, and integration with your existing MLOps tools and monitoring systems.

Still have questions? Contact our team directly through your On-Demand Cloud dashboard or email us. We're here to help make your transition to simpler, more cost-effective GPU infrastructure as smooth as possible.