Case Study
Meituan Accelerates Vision AI Inference Services and Optimizes Costs
Meituan leverages vision AI to drive business innovation and deliver personalized services, but it faced challenges in balancing performance and cost. To optimize AI inference, especially for low-traffic, long-tail models, Meituan turned to 4th Gen Intel® Xeon® Scalable processors with Intel® Advanced Matrix Extensions (Intel® AMX). This CPU-based solution, combined with dynamic scaling and service optimization, enabled high-throughput inference without sacrificing accuracy. The result was a 4.13x improvement in model inference performance with BF16 precision and up to a 3x increase in online resource efficiency, while reducing service costs by 70%.
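The optimization path described here rests on running BF16 inference on the CPU so that Intel AMX can accelerate the underlying matrix math. The snippet below is a minimal sketch of that pattern, assuming a PyTorch vision model and the Intel® Extension for PyTorch (IPEX); the case study does not specify Meituan's actual models or serving stack, and the ResNet-50 model and batch size used here are illustrative only.

```python
# Minimal sketch (not Meituan's production code): BF16 inference on a CPU
# with Intel AMX, using PyTorch and Intel Extension for PyTorch (IPEX).
# The vision model (torchvision ResNet-50) and input shape are assumptions
# for illustration; the case study does not name the models actually served.
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

# Load a pretrained vision model and switch to inference mode.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.eval()

# Let IPEX apply CPU-specific optimizations and cast weights to BF16,
# so matrix multiplications and convolutions can be dispatched to the
# AMX tile units on 4th Gen Intel Xeon processors.
model = ipex.optimize(model, dtype=torch.bfloat16)

# Dummy batch standing in for a real image-preprocessing pipeline.
batch = torch.randn(8, 3, 224, 224)

# Run inference under BF16 autocast; activations stay in BF16 where safe.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    logits = model(batch)

print(logits.shape)  # torch.Size([8, 1000])
```

BF16 keeps the same dynamic range as FP32 while halving the data size, which is why this style of conversion typically preserves accuracy while letting AMX's BF16 tile instructions deliver the throughput gains reported above.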