Embedded Systems

Transferring Fine-grained DNN Accelerator Architecture Performance Models to Coarse-grained Models

Bachelor’s Thesis / Master’s Thesis / Student Research Project


Abstract modeling of HW/SW systems is a relatively new research topic. This technique aims to capture only the essential parameters of software and hardware that influence their timing behavior.

Currently, performance models for deep learning accelerator architectures are either fine-grained, operating at the scalar operation level (multiplication, addition, etc.), or coarse-grained, operating at the tensor operation level (vector operations, matrix-matrix multiplication, etc.). Fine-grained models are highly accurate, but they are often inflexible and expensive to compute. Coarse-grained models are fast to compute and highly flexible. The goal of this student project is to develop a methodology for transferring fine-grained, scalar-level models to coarse-grained, tensor-level models and to evaluate the resulting trade-off between performance and accuracy.
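To make the idea concrete, the following minimal Python sketch contrasts the two modeling levels for a single matrix-matrix multiplication. It is an illustration under stated assumptions, not the methodology to be developed in the project: the per-operation latencies, the number of MAC units, the wave-based scheduling, and all function names are hypothetical. The fine-grained model counts cycles wave by wave at the scalar level, while the coarse-grained model is a single cost factor per multiply-accumulate that is fitted to a few fine-grained results.

```python
# Hypothetical accelerator parameters (assumed for illustration, not measured).
MUL_CYCLES = 3      # assumed latency of one scalar multiplication
ADD_CYCLES = 1      # assumed latency of one scalar addition
NUM_MAC_UNITS = 8   # assumed number of parallel multiply-accumulate units


def fine_grained_cycles(m, k, n):
    """Scalar-level model of an (m x k) by (k x n) matrix multiplication.

    Each output element needs k multiply-accumulate steps. They are issued
    in waves of up to NUM_MAC_UNITS parallel scalar operations, and every
    wave costs one multiplication plus one addition.
    """
    cycles = 0
    for _row in range(m):
        for _col in range(n):
            for _wave_start in range(0, k, NUM_MAC_UNITS):
                cycles += MUL_CYCLES + ADD_CYCLES
    return cycles


def fit_cycles_per_mac(calibration_shapes):
    """Fit cycles ~ a * (m * k * n) by least squares through the origin."""
    xs = [m * k * n for m, k, n in calibration_shapes]
    ys = [fine_grained_cycles(m, k, n) for m, k, n in calibration_shapes]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def coarse_grained_cycles(m, k, n, cycles_per_mac):
    """Tensor-level model: one closed-form expression per matrix multiplication."""
    return cycles_per_mac * m * k * n


# Calibrate the coarse-grained model on two shapes, then compare on a third.
cycles_per_mac = fit_cycles_per_mac([(4, 8, 4), (8, 16, 8)])
fine = fine_grained_cycles(4, 10, 4)
coarse = coarse_grained_cycles(4, 10, 4, cycles_per_mac)
relative_error = (coarse - fine) / fine
print(f"fine={fine} coarse={coarse:.1f} relative error={relative_error:+.1%}")
```

In this sketch the coarse-grained estimate is computed in constant time, but it deviates from the fine-grained result for the third shape because an inner dimension of 10 does not fill the assumed 8 MAC units evenly. This is the kind of performance versus accuracy trade-off the project is meant to evaluate.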


  • Python
  • Successfully attended the lecture “Grundlagen der Rechnerarchitektur” and/or “Parallele Rechnerarchitekturen” (optional)
  • Linux (optional)


Lübeck, Konstantin

Jung, Alexander

Bringmann, Oliver