I understand that you're looking for a shortcut: you can read about an architecture and produce an implementation that scores well on performance benchmarks. A good benchmark score feels rewarding, but on its own it is not an indicator that you are doing well or delivering a high-quality model. So what's the use of all the extra struggle?
The algorithms and architectures that are in vogue change. Anyone can memorize a list of an architecture's strengths and limitations. The struggle to understand the underlying technologies teaches you how to evaluate an architecture - and its use in a specific context - critically and semi-independently*. From that perspective, the architectures you're currently looking at are just a teaching tool.
Knowing that an architecture has a limitation does not mean that you can:
- Recognize whether that limitation applies to your use case.
- Understand how big the impact of the limitation is.
- Test for the limitation, or measure its extent.
- Mitigate or compensate for it.
All of the above requires a solid understanding of the architecture's components. An excellent example of a situation where the developers may well have known about their algorithm's limitations is ProPublica's coverage of algorithmic bias (via Wayback Machine). If you train a pattern-matching algorithm on biased data, you get a model with the same biases. That's well known, so how did this happen?
Did the developers of these models simply not know about that risk? Did they not care? Or did they not realize what this limitation meant for their use case and how to mitigate it?
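To make the failure mode concrete, here is a minimal, entirely synthetic sketch: the training labels are generated with a built-in bias against one group, a plain logistic regression is fit on them, and the resulting model reproduces the bias as a higher false positive rate for that group. The data, feature names, and numbers are all invented for illustration; this is not ProPublica's data or analysis.

```python
# Synthetic illustration only: biased labels in, biased model out.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical features: one legitimate risk score and one group attribute (0 = A, 1 = B).
risk = rng.normal(size=n)
group = rng.integers(0, 2, size=n)

# The "true" outcome depends only on the legitimate risk score...
true_outcome = (risk + rng.normal(scale=0.5, size=n)) > 0.5

# ...but the historical labels we train on were recorded with a bias:
# members of group B were flagged more often regardless of actual risk.
biased_labels = true_outcome | ((group == 1) & (rng.random(n) < 0.15))

X = np.column_stack([risk, group])
model = LogisticRegression().fit(X, biased_labels)
pred = model.predict(X)

# False positive rate per group, measured against the *true* outcome:
for g, name in [(0, "group A"), (1, "group B")]:
    mask = (group == g) & ~true_outcome
    print(f"{name}: false positive rate = {pred[mask].mean():.2f}")
# The gap between the two printed rates is the bias the model inherited
# from its training labels.
```

Note that the measurement only works because it compares predictions against the true outcomes rather than the labels the model was trained on - exactly the kind of test you only think to build if you understand where the bias enters the pipeline.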
*I do implementations, not original research. But I need to be able to read publications and to understand how their contents apply to my use case - even if the technology has never been applied in a similar context.