Software Details
- Premium subscription (1 year)
- 1 physical/virtual node, 1 AI accelerator
Know your gear
Red Hat AI Inference Server optimizes model inference across the hybrid cloud, enabling faster, more cost-effective model deployments. The platform builds on the vLLM project and incorporates Neural Magic technologies to deliver responsive AI inference. It supports any generative AI model, on a variety of AI accelerators, in any cloud environment, giving organizations the flexibility and efficiency their AI workloads need.

By improving model performance while containing cost, Red Hat AI Inference Server addresses the growing enterprise demand to deploy and scale generative AI in production. Features such as intelligent model compression and a repository of validated models help deliver fast, accurate responses in AI applications, so customers can harness the power of AI without compromise.
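Because the server is built on the vLLM project, it exposes vLLM's OpenAI-compatible HTTP API. As a rough sketch of what serving and querying a model looks like with upstream vLLM (the model name and port below are illustrative assumptions, not details from this page):

```shell
# Start vLLM's OpenAI-compatible server with a small example model
# (model name and port are illustrative; requires a supported accelerator)
vllm serve Qwen/Qwen2.5-0.5B-Instruct --port 8000

# Once the server is up, send a chat completion request
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen2.5-0.5B-Instruct",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```

Any client that speaks the OpenAI API (for example the official OpenAI SDKs pointed at the local base URL) can talk to the same endpoint.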