ESWEEK 2025 Tutorial
Deep Software Stack Optimization for AI-Enabled Embedded Systems

  • Seongsoo Hong (Seoul National University)
  • Namcheol Lee (Seoul National University)
  • Geonha Park (Seoul National University)
  • Taehyun Kim (Seoul National University)

▶ Overview

Objective
This tutorial provides lectures and hands-on exercises on optimizing and deploying LiteRT (formerly TFLite) models on the RUBIK Pi platform, focusing on pipeline parallelism for efficient on-device inference.

Key Topics

  1. Inference driver and inference runtime
  2. Model slicing and conversion
  3. Throughput enhancement via pipelining on heterogeneous accelerators
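To preview topic 3: pipeline parallelism splits a sliced model into stages that run concurrently on different processors, so a new input can enter the first stage while the previous input is still being processed by the second. A minimal sketch of this idea in plain Python, where threads and a queue stand in for the heterogeneous accelerators and the two stage functions are hypothetical placeholders for model slices:

```python
import queue
import threading

def stage1(x):
    # First model slice (e.g., the backbone) -- hypothetical placeholder work.
    return x * 2

def stage2(x):
    # Second model slice (e.g., the head), running on another "accelerator".
    return x + 1

def run_pipeline(inputs):
    q = queue.Queue()          # hand-off buffer between the two stages
    results = []

    def worker1():
        for x in inputs:
            q.put(stage1(x))   # stage 1 can start the next input immediately
        q.put(None)            # sentinel: no more inputs

    def worker2():
        while (y := q.get()) is not None:
            results.append(stage2(y))

    t1 = threading.Thread(target=worker1)
    t2 = threading.Thread(target=worker2)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results

print(run_pipeline([1, 2, 3]))  # -> [3, 5, 7]
```

With real model slices, each stage would invoke a LiteRT interpreter bound to a different delegate; the throughput gain comes from keeping both accelerators busy at once, at the cost of the queue hand-off latency.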

Target Audience

This tutorial is designed for students, engineers, and researchers interested in on-device AI.
It is particularly well-suited for beginners and intermediate-level participants who want to gain practical experience in on-device DNN inference and its optimization.

▶ Notice
  • Due to the limited number of RUBIK Pi boards, only 20 participants can use a board, on a first-come, first-served basis.
    • Other attendees are welcome to follow along by observing the exercises.
  • RUBIK Pi boards will be provided for use during the tutorial.

▶ Prerequisites

Please bring a personal laptop with the following software installed in advance:

▶ Tentative Schedule
Lecture 1: Exercise Overview and Setup (1 h 30 min)
  Lecturer: Seongsoo Hong
  Topics: Motivating Example; Development Environment Setup

Coffee Break (30 min)

Lecture 2: From Inference Driver to Inference Runtime (1 h 30 min)
  Lecturer: Seongsoo Hong and Namcheol Lee
  Topics: Step-by-Step Inference Driver Walkthrough; Internals of LiteRT

Lunch Break (1 h)

Lecture 3: Model Slicer (1 h 30 min)
  Lecturer: Seongsoo Hong
  Topics: Model Slicer, a Slicing and Conversion Tool for LiteRT

Coffee Break (30 min)

Lecture 4: Throughput Enhancement on Heterogeneous Accelerators (1 h 30 min)
  Lecturer: Namcheol Lee
  Topics: Implementing a Pipelined Inference Driver for Heterogeneous Processors
