RTCSA 2025 Tutorial
Deep Software Stack Optimization for AI-Enabled Embedded Systems

  • Seongsoo Hong (Seoul National University)
  • Namcheol Lee (Seoul National University)
  • Geonha Park (Seoul National University)
  • Taehyun Kim (Seoul National University)

▶ Overview

Objective
This tutorial provides lectures and hands-on exercises on optimizing and deploying LiteRT (formerly TFLite) models on the RUBIK Pi platform, focusing on pipeline parallelism for efficient on-device inference.

Key Topics

  1. Inference driver and inference runtime
  2. Model slicing and conversion
  3. Throughput enhancement via pipelining on heterogeneous accelerators

Target Audience

This tutorial is designed for students, engineers, and researchers interested in on-device AI.
It is particularly well-suited for beginners and intermediate-level participants who want to gain practical experience in on-device DNN inference and its optimization.

▶ Notice
  • The number of participants is limited to 20 on a first-come, first-served basis.
  • RUBIK Pi boards will be provided for use during the tutorial.

▶ Prerequisites

Please bring a personal laptop with the following software installed in advance

▶ Tentative Schedule
Lecture 1: From Inference Driver to Inference Runtime (50 min)
  Lecturer: Seongsoo Hong
  Topics:
  • Step-by-Step Inference Driver Walkthrough
  • Internals of Lite Runtime (LiteRT)

Brief Break (10 min)

Lecture 2: Model Slicer (50 min)
  Lecturer: Seongsoo Hong
  Topics:
  • Model Slicer: Slicing and Conversion Tool for LiteRT

Brief Break (10 min)

Lecture 3: Throughput Enhancement on Heterogeneous Accelerators (1 h 30 min)
  Lecturer: Namcheol Lee
  Topics:
  • Implementing a Pipelined Inference Driver for Heterogeneous Processors
  Materials: Slide, GitHub