Transformers, Attention, Positional Encoding, BERT, BART, GPT, Pre-Training & Fine-Tuning, Decoding Strategies, Tokenization, Data, Fast Attention Mechanisms, LoRA, Fast Inference Mechanisms.