Case Study

Matching GPU Price Performance Using Amazon Instances With Intel® Xeon® Processors

Storm Reply, an IT consulting firm, needed a cost-effective and reliable hosting environment to deploy large language model (LLM) solutions for a major energy-sector client. After evaluating its options, the firm chose Amazon EC2 C7i instances powered by 4th Gen Intel® Xeon® Scalable processors, combined with Intel optimization libraries and the open GenAI framework. With these optimizations, LLM inference on the Intel-based instances matched GPU price performance: Storm Reply cut the Llama 2-13B model's response time from 485 seconds to 92 seconds, demonstrating significant efficiency gains and cost savings for generative AI workloads.