The rationale goes as follows:
On a "big" system like a PC or a smartphone, ChaCha20+Poly1305 and AES/GCM are both very efficient; the latter is fast because the hardware provides dedicated opcodes that implement both AES itself (aesenc and aesenclast on x86 CPUs) and the GHASH part of GCM, which is used for the integrity check (the pclmulqdq opcode on x86 CPUs).
On much smaller systems (embedded microcontrollers), things are not that easy. Pure software implementations of AES/GCM are quite slow; ChaCha20+Poly1305 may fare somewhat better, but is not necessarily as fast as one might hope. Note that "speed" can mean "maximum bandwidth", but also battery life: when processing small messages, the cost of encrypting can be expressed as the duration during which the CPU must be powered at its high frequency.
On that kind of system, you may have a hardware AES implementation, but usually not a hardware helper for GHASH (because of size constraints on the CPU die, mostly). Thus, CCM works better than GCM on such microcontrollers. This explains the request for supporting some CCM-based cipher suites.
Constrained systems are usually constrained for everything; computing stuff drains the battery, but so does sending data. Reducing the number of bytes to send can help, and every byte counts. The normal authentication tag of CCM is 16 bytes. The "CCM_8" cipher suites use tags reduced to 8 bytes, thus saving 8 bytes of overhead per record, at the expense of a higher probability of forgery.
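To get a feel for why 8 bytes matter, here is a rough sketch that computes the fraction of the transmitted bytes taken up by the tag. The payload sizes are made up for illustration, and the calculation deliberately ignores every other per-record overhead (headers and so on), so take the numbers as an indication only.

```python
# Rough sketch: share of the (payload + tag) bytes that is spent on the tag.
# Payload sizes below are hypothetical; other per-record overhead is ignored.

FULL_TAG_LEN = 16   # bytes, normal CCM tag
SHORT_TAG_LEN = 8   # bytes, CCM_8 tag


def tag_overhead(payload_len: int, tag_len: int) -> float:
    """Return the fraction of (payload + tag) bytes occupied by the tag."""
    return tag_len / (payload_len + tag_len)


if __name__ == "__main__":
    for payload_len in (4, 16, 64, 1024):
        full = tag_overhead(payload_len, FULL_TAG_LEN)
        short = tag_overhead(payload_len, SHORT_TAG_LEN)
        print(f"{payload_len:5d}-byte payload: "
              f"tag is {full:.0%} with CCM, {short:.0%} with CCM_8")
```

For sensor-style messages of a few bytes, the tag dominates what goes over the air, which is where the shorter tag pays off; for kilobyte-sized records the difference is negligible.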
This is how TLS 1.3 ends up with CCM_8 cipher suites. It's all about performance issues on small embedded systems. Note that MACs can tolerate a relatively high forgery probability because of their "online" nature: when an attacker tries to forge a MAC value, he still has to try it out on an honest receiver for every forgery attempt; if the attacker is talking with small embedded systems, he won't be able to do that many times per second (and each failed attempt implies an error on the receiving system). This contrasts with encryption, where a brute force attack is "offline" (the attacker tries potential decryption keys on his own computers, and can thus do that at a high rate, and quietly).
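The "online" argument can be put into numbers with a small sketch. It assumes an idealized MAC where each forgery attempt succeeds independently with probability $2^{-n}$ for an $n$-bit tag; the rate of 1000 attempts per second is a hypothetical figure chosen for illustration, not a measurement of any real device.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def forgery_success_probability(tag_bits: int, attempts: int) -> float:
    """Probability that at least one of `attempts` online forgeries succeeds.

    Assumes an idealized MAC: each attempt succeeds independently with
    probability 2**-tag_bits.
    """
    p = 2.0 ** -tag_bits
    # 1 - (1 - p)**attempts, computed through log1p/expm1 because
    # 1 - p would round to exactly 1.0 in floating point for small p.
    return -math.expm1(attempts * math.log1p(-p))


if __name__ == "__main__":
    rate_per_second = 1000  # hypothetical rate the receiver can sustain
    attempts_in_a_year = int(rate_per_second * SECONDS_PER_YEAR)
    for tag_bits in (64, 128):
        prob = forgery_success_probability(tag_bits, attempts_in_a_year)
        print(f"{tag_bits}-bit tag, one year of attempts: "
              f"success probability about {prob:.3e}")
```

Under these assumptions, even a full year of uninterrupted attempts against an 8-byte tag leaves the attacker with a tiny chance of success, and the receiver would have logged every one of the failures along the way.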
It is worth pointing out that AES/GCM has a bit of trouble with truncated tags. An authentication tag of $n$ bits cannot have a probability of forgery less than $2^{-n}$, but with GCM, when $n < 128$, it tends to be higher (specifically, successful forgeries leak extra information, making subsequent forgeries easier). As such, tag truncation on GCM is not recommended. CCM, for all its design issues, does not suffer in the same way.