Interested in the recent landscape of language model research? This is a well-organized, concise overview. It does a nice job of covering how model architectures, training methodologies, and dataset characteristics are combined.
https://magazine.sebastianraschka.com/p/understanding-large-language-models