We give a thorough introduction to the intuition and equations behind self-attention and transformers. No prior experience is needed. The aim is to get everyone on the same page for further discussions on large language models (hardware implementations, new algorithm design, few-shot learning, is it AGI?).
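As a small preview of the equations we will cover: the core of a transformer is scaled dot-product attention, $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(QK^\top/\sqrt{d_k})\,V$. A minimal NumPy sketch (array shapes and names here are illustrative, not from any particular library):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each query's weights sum to 1
    return weights @ V                   # weighted average of the value vectors

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 query vectors, dimension d_k = 8
K = rng.standard_normal((6, 8))  # 6 key vectors
V = rng.standard_normal((6, 8))  # 6 value vectors
out = attention(Q, K, V)         # one output per query: shape (4, 8)
```

Each output row is a convex combination of the value vectors, weighted by how strongly the corresponding query attends to each key; the lecture builds the full transformer from this building block.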
|Thu, 04.05.2023||17:00 - 18:00||Lecture room|
|Fri, 05.05.2023||15:00 - 17:00||Lecture room|