W07.5 The transition from Tiny ML to Edge GenAI

Speaker
Danilo Pau, STMicroelectronics, Italy

Generative AI (GenAI) models are designed to produce realistic, natural-looking data such as images, audio, or written text. Because of their high computational and memory demands, these models traditionally run on powerful remote servers. However, there is growing interest in deploying GenAI models at the edge, on resource-constrained embedded devices. Since 2018, the TinyML community has shown that running fixed-topology AI models on edge devices offers several benefits, including independence from Internet connectivity, low-latency processing, and enhanced privacy. Nevertheless, deploying resource-hungry GenAI models on embedded devices is challenging because these devices have limited computational, memory, and energy resources. This talk reviews the progress made to date in Edge GenAI, an emerging area of research within the broader domain of Edge AI that focuses on bringing GenAI to edge devices. Papers released between 2022 and 2024 that address the design and deployment of GenAI models on embedded devices are identified and described, and their approaches and results are compared. These papers help clarify the ongoing transition from TinyML to Edge GenAI and give the AI research community valuable insights into this emerging, impactful, and still under-explored field. Further examples will show that some Edge GenAI workloads can already run on existing ST MCUs and MPUs, indicating that the Edge GenAI research field is in active development.