RTCSA 2025 Tutorial
Deep Software Stack Optimization for AI-Enabled Embedded Systems
- Seongsoo Hong (Seoul National University)
- Namcheol Lee (Seoul National University)
- Geonha Park (Seoul National University)
- Taehyun Kim (Seoul National University)
▶ Overview
Objective
This tutorial provides lectures and hands-on exercises on optimizing and deploying LiteRT (formerly TFLite) models on the RUBIK Pi platform, focusing on pipeline parallelism for efficient on-device inference.
Key Topics
- Inference driver and inference runtime
- Model slicing and conversion
- Throughput enhancement via pipelining on heterogeneous accelerators
Target Audience
This tutorial is designed for students, engineers, and researchers interested in on-device AI. It is particularly well suited to beginner and intermediate-level participants who want to gain practical experience in on-device DNN inference and its optimization.
▶ Notice
- The number of participants is limited to 20 on a first-come, first-served basis.
- RUBIK Pi boards will be provided for use during the tutorial.
▶ Prerequisites
Please bring a personal laptop with the following software installed in advance:
- Visual Studio Code (VS Code) with the Remote - SSH extension installed
- ADB (Android Debug Bridge)
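Participants who want to confirm their setup before the session could run a quick check along the following lines. This is a minimal sketch, not an official setup script: it assumes the tools are exposed on the PATH under the executable names `adb` and `code`, which depends on how they were installed, and it does not verify that the Remote - SSH extension is present. The helper name `find_missing_tools` is hypothetical.

```python
import shutil

# Assumed executable names: "adb" from the Android platform-tools package and
# "code" from the VS Code command-line launcher. Adjust if your install differs.
REQUIRED_TOOLS = {"ADB": "adb", "VS Code": "code"}


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the display names of tools whose executable is not on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ADB and VS Code launchers found on PATH.")
```

An empty result only means the launchers were found; the Remote - SSH extension still needs to be checked inside VS Code.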
▶ Tentative Schedule
| Lecture 1: From Inference Driver to Inference Runtime (50 min) | |
| Lecturer | Seongsoo Hong |
| Topics | Step-by-Step Inference Driver Walkthrough |
| | Internals of Lite Runtime (LiteRT) |
| Brief Break (10 min) | |
| Lecture 2: Model Slicer (50 min) | |
| Lecturer | Seongsoo Hong |
| Topics | Model Slicer: Slicing and Conversion Tool for LiteRT |
| Brief Break (10 min) | |
| Lecture 3: Throughput Enhancement on Heterogeneous Accelerators (1 h 30 min) | |
| Lecturer | Namcheol Lee |
| Topics | Implementing a Pipelined Inference Driver for Heterogeneous Processors |
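For readers new to the idea, the pipelining covered in Lecture 3 can be illustrated with a small, generic sketch. It is not the tutorial's inference driver and uses no LiteRT calls: each stage is a plain Python function standing in for one model slice, and each stage runs in its own thread connected by bounded queues, so stage 1 can start on the next input while stage 2 is still processing the previous one. The names `run_pipeline`, `_stage_worker`, and the `depth` parameter are hypothetical, and the sketch has no error handling.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the input stream


def _stage_worker(fn, inbox, outbox):
    """Apply fn to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(stages, inputs, depth=2):
    """Run inputs through the stages, one thread per stage, preserving order."""
    # One queue in front of each stage, plus one for the final outputs.
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=_stage_worker, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]

    def feed():
        # Feeding from a separate thread lets the caller drain results
        # concurrently, so the bounded queues cannot fill up and stall.
        for x in inputs:
            queues[0].put(x)
        queues[0].put(_STOP)

    feeder = threading.Thread(target=feed)
    for t in workers:
        t.start()
    feeder.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)

    feeder.join()
    for t in workers:
        t.join()
    return results


if __name__ == "__main__":
    # Two stand-in "model slices"; in practice each would run on its own processor.
    print(run_pipeline([lambda x: x + 1, lambda x: x * 2], range(5)))
```

In a real driver each stage would invoke a sliced model on a different processor. With Python threads, the stages overlap only while the stage function releases the interpreter lock, which is why the example shows the structure of the pipeline rather than a measured speedup.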