0xff8 is 4088 in decimal. That value is outside the range of the 12-bit signed immediate (-2048 to 2047) used by lw and all other I-Type instructions.
You could try writing the offset as -8(t0) instead. That does assemble to an lw whose immediate field is 0xff8, but the field is sign extended to -8 at run time, so the load goes to the wrong address (0x200aff8).
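A minimal sketch of that failure mode, assuming t0 was loaded with 0x200b000 (inferred from the wrong address above; the destination register t1 is also just for illustration):

```asm
lui   t0, 0x200b        # t0 = 0x0200b000 (assumed base)
lw    t1, -8(t0)        # immediate field is 0xff8, sign extended to -8
                        # address = 0x0200b000 - 8 = 0x0200aff8, not 0x0200bff8
```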
However, your li solution is much easier to use. If you look closely at the 2-instruction expansion, you'll see that the lui uses 0x200c rather than 0x200b, to compensate for the sign extension applied to the 12-bit immediate of the I-Type instruction that follows it.
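For reference, the expansion typically looks like the sketch below. This assumes an RV32 target and an assembler that emits lui/addi for li; the exact second instruction and the register names may differ in your setup.

```asm
# li t0, 0x200bff8 expands to roughly:
lui   t0, 0x200c        # t0 = 0x0200c000 (upper 20 bits, bumped by one)
addi  t0, t0, -8        # 0xff8 sign extends to -8: 0x0200c000 - 8 = 0x0200bff8
lw    t1, 0(t0)         # load from 0x0200bff8 (t1 is a hypothetical destination)
```

The upper part is rounded up by one whenever bit 11 of the address is set, because the low 12 bits will then be treated as a negative number.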
Also, note that the address form you used with the li pseudo-instruction will probably work with lw directly (depending on the assembler):
lw t0, 0x200bff8
However, if that is in a tight loop, it expands to 2 instructions on every iteration, whereas the li form can be used to build the address in a register outside the loop, so that only a simple lw is needed inside.
For the best of both, optimizing compilers will place just an lui outside the loop (instead of the lui/addi pair) and then use the one-instruction form of lw, with the rest of the address as its offset.
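A rough sketch of that hoisted form, with a hypothetical loop counter in t2 and an arbitrary iteration count (neither comes from your code):

```asm
    lui   t1, 0x200c        # hoisted: t1 = 0x0200c000, computed once
    li    t2, 1000          # hypothetical iteration count
loop:
    lw    t0, -8(t1)        # 0x0200c000 - 8 = 0x0200bff8, a single instruction
    addi  t2, t2, -1        # count down
    bnez  t2, loop          # repeat until t2 reaches zero
```

Here the sign extension works in your favor: the lui holds the rounded-up upper part, and the -8 offset in the lw supplies the rest of the address.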