Scaling On-Device GPU Inference for Large Generative Models

Jiuqiang Tang ⋅ Raman Sarokin ⋅ Ekaterina Ignasheva ⋅ Grant Jensen ⋅ Lin Chen ⋅ Juhyun Lee ⋅ Andrei Kulik ⋅ Matthias Grundmann

Abstract
