InCabin USA | 10-12 June, 2025 | Huntington Place, Detroit | #incabinusa

Session Track: HMI Design for Enhanced UX – Chair: Alex Polonsky

Why, How, and Where to Deploy GenAI and Other SOTA AI to Edge Devices


Presentation

Generative AI, particularly Large Language Models (LLMs) and Transformers, has
become a cornerstone of modern applications, powering advancements in natural language
processing, autonomous systems, and real-time decision-making. However, the traditional
cloud-based inference approach poses significant challenges, including privacy concerns,
connectivity constraints, and high operational costs. These issues drive the need to deploy
AI models directly on edge devices, such as vehicles and embedded systems, where
real-time, on-device processing is essential.
This talk explores the latest strategies for deploying LLMs and Transformers in
resource-constrained edge environments while maintaining performance, efficiency, and
reliability. We examine the current generation of systems-on-chip (SoCs) from industry
leaders such as Texas Instruments, Nvidia, Qualcomm, and Ambarella, focusing on their
capabilities for executing large AI models efficiently. We will also discuss the trade-offs
involved in model compression techniques, including quantization, pruning, and knowledge
distillation. While these techniques enable deployment on limited hardware, they may
compromise model fidelity, leading to performance degradation in real-world applications
despite favourable benchmark results.
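Of the compression techniques above, quantization is the most commonly applied first step for edge deployment. As a minimal sketch of the fidelity trade-off it introduces, the snippet below applies PyTorch's post-training dynamic quantization to a toy two-layer network (the model itself is an illustrative assumption, not any production architecture) and measures how far the int8 outputs drift from the fp32 baseline:

```python
import torch
import torch.nn as nn

# Toy stand-in for a Transformer feed-forward block, used only to
# illustrate post-training dynamic quantization. Real edge deployments
# would target an actual model via a vendor toolchain (e.g. for the
# SoCs discussed above).
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
)
model.eval()

# Dynamic quantization: Linear weights are stored as int8 and
# activations are quantized on the fly at inference time. This shrinks
# the model and speeds up CPU inference, at some cost in fidelity.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
with torch.no_grad():
    out_fp32 = model(x)
    out_int8 = quantized(x)

# The outputs agree only approximately; this gap is the per-layer
# fidelity cost that can compound in deep models and show up as the
# real-world degradation the abstract warns about.
mean_abs_err = torch.mean(torch.abs(out_fp32 - out_int8)).item()
```

A small per-layer error like this can look negligible on benchmarks yet accumulate across dozens of layers, which is why benchmark scores alone can overstate a compressed model's real-world reliability.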

Hear from:
