InCabin USA | 10-12 June, 2025 | Huntington Place, Detroit | #incabinusa

Session Track: HMI Design for Enhanced UX – Chair: Alex Polonsky

Why, How, and Where to Deploy GenAI and Other SOTA AI to Edge Devices


Presentation

Generative AI, particularly Large Language Models (LLMs) and Transformers, has
become a cornerstone of modern applications, powering advancements in natural language
processing, autonomous systems, and real-time decision-making. However, the traditional
cloud-based inference approach poses significant challenges, including privacy concerns,
connectivity constraints, and high operational costs. These issues drive the need to deploy
AI models directly on edge devices, such as vehicles and embedded systems, where
real-time, on-device processing is essential.
This talk explores the latest strategies for deploying LLMs and Transformers in
resource-constrained edge environments while maintaining performance, efficiency, and
reliability. We examine the current generation of systems-on-chip (SoCs) from industry
leaders such as Texas Instruments, Nvidia, Qualcomm, and Ambarella, focusing on their
capabilities for executing large AI models efficiently. We will also discuss the trade-offs
involved in model compression techniques, including quantization, pruning, and knowledge
distillation. While these techniques enable deployment on limited hardware, they may
compromise model fidelity, leading to performance degradation in real-world applications
despite favourable benchmark results.
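Of the compression techniques above, quantization is the most commonly applied first step for edge deployment. As a minimal sketch of the fidelity trade-off it introduces, the snippet below applies PyTorch's post-training dynamic quantization to a toy two-layer network (the model itself is an illustrative assumption, not any production architecture) and measures how far the int8 outputs drift from the fp32 baseline:

```python
import torch
import torch.nn as nn

# Toy stand-in for a Transformer feed-forward block, used only to
# illustrate post-training dynamic quantization. Real edge deployments
# would target an actual model via a vendor toolchain (e.g. for the
# SoCs discussed above).
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
)
model.eval()

# Dynamic quantization: Linear weights are stored as int8 and
# activations are quantized on the fly at inference time. This shrinks
# the model and speeds up CPU inference, at some cost in fidelity.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
with torch.no_grad():
    out_fp32 = model(x)
    out_int8 = quantized(x)

# The outputs agree only approximately; this gap is the per-layer
# fidelity cost that can compound in deep models and show up as the
# real-world degradation the abstract warns about.
mean_abs_err = torch.mean(torch.abs(out_fp32 - out_int8)).item()
```

A small per-layer error like this can look negligible on benchmarks yet accumulate across dozens of layers, which is why benchmark scores alone can overstate a compressed model's real-world reliability.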

Hear from:
