Bill Yao Delves into AI Inference Cost Reduction Strategies at GenAI Conference

San Francisco, May 29 – The 2024 San Francisco GenAI Summit, a gathering of the global AI community, welcomed top enterprises and industry leaders from around the world. At this conference, themed around innovation and future technologies, Bill Yao, co-founder of novita.ai, not only delivered a keynote speech but also participated in a panel discussion titled "Innovating Infrastructure for LLM," exploring the innovation and development of AI infrastructure with other industry experts.

Bill Yao, a serial entrepreneur, previously founded China's largest online TV platform, built on P2P streaming technology, and was responsible for early-stage technology investments at BlueRun Ventures. At the GenAI Conference, he shared insights on how technological innovation can reduce AI inference costs, particularly for applications of large language models (LLMs).

San Francisco Mayor London Breed, host of the conference, delivered a speech highlighting San Francisco's history and status as a city of innovation. She noted that San Francisco has not only given birth to inventions that changed daily life, such as the cable car, which solved the problem of moving goods up the city's hills, and the television, which transformed how people are entertained, but also hosts 21 of the Forbes AI 50 companies, which have together attracted as much as 22.4 billion dollars in investment.

During the panel discussion "Innovating Infrastructure for LLM" (GPT Stage, 16:00-16:45), Yao and other experts engaged in an in-depth discussion of how to build infrastructure that supports LLMs. He emphasized the importance of reducing GPU data center costs and accelerating inference through model compression, sharing novita.ai's practical experience and technical achievements in this area.

Yao also described novita.ai's innovations in model optimization. He noted that by operating an intelligent international acceleration network with low latency, high bandwidth, and low jitter, together with software-level dynamic optimal path selection, reliable transmission over UDP, and protocol optimization, novita.ai can significantly improve the efficiency of AI inference tasks (a simplified sketch of the path-selection idea appears below).

Furthermore, Yao emphasized the role of serverless technology in cost reduction. Serverless lets developers build and deploy applications quickly while reducing operational overhead and costs. Through its serverless platform, novita.ai has helped customers cut model cold-start times, lower request error rates, save costs, and shorten project launch times.

Yao's remarks gave attendees viable paths to reducing AI inference costs and offered valuable ideas and practical cases for the further development and application of AI technology.
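To make the dynamic path-selection idea concrete, here is a minimal Python sketch: it probes several candidate relay endpoints and routes traffic through the one with the best latency-and-jitter score. The endpoint URLs, the probing method, and the scoring weights are illustrative assumptions for this sketch, not novita.ai's actual implementation.

```python
# Minimal sketch of dynamic optimal path selection.
# Assumes hypothetical relay health endpoints and a simple
# latency-plus-jitter scoring rule; not novita.ai's real system.
import statistics
import time
from urllib import request

CANDIDATE_RELAYS = [
    "https://relay-us.example.com/health",
    "https://relay-eu.example.com/health",
    "https://relay-ap.example.com/health",
]


def probe(url: str, samples: int = 5) -> tuple[float, float]:
    """Return (mean latency, jitter) in seconds for a relay health endpoint."""
    latencies = []
    for _ in range(samples):
        start = time.monotonic()
        try:
            with request.urlopen(url, timeout=2) as resp:
                resp.read()
        except OSError:
            # Unreachable or timed-out relay: score it as infinitely slow.
            return float("inf"), float("inf")
        latencies.append(time.monotonic() - start)
    return statistics.mean(latencies), statistics.pstdev(latencies)


def best_relay(urls: list[str]) -> str:
    """Pick the relay with the lowest combined latency/jitter score."""
    def score(url: str) -> float:
        mean, jitter = probe(url)
        # Weight jitter more heavily, since unstable paths hurt streamed
        # token-by-token inference responses the most.
        return mean + 2 * jitter

    return min(urls, key=score)


if __name__ == "__main__":
    print("Selected relay:", best_relay(CANDIDATE_RELAYS))
```

In practice, probing like this would run continuously in the background so that each inference request is steered onto the currently best path without adding measurement latency to the request itself.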

Novita AI is the all-in-one cloud platform that empowers your AI ambitions. With seamlessly integrated APIs, serverless computing, and GPU acceleration, we provide the cost-effective tools you need to rapidly build and scale your AI-driven business. Eliminate infrastructure headaches and get started for free — Novita AI makes your AI dreams a reality.
Recommended Reading:
  1. Docker for Beginners: Say Goodbye to Deployment Nightmares!
  2. NVIDIA A100 vs V100: Which is Better?