Let's divide data into three categories: data meant to be read by humans (usually text, ranging from books to programs), data meant to be read by computers, and other data (such as images or sound).
For the first category, we need to process the data into something a computer can use. As the languages used by humans can generally be captured relatively well by parsers, we usually use parsers for this.
An example of data in the third category would be a scanned image of a page from a book, which you want to convert into text. For this category, you almost always need very specific knowledge about your input, and therefore you need a specialized program to process it. Standard parsing technology won't get you very far here.
Your question is about the second category: binary data is almost always the product of one computer program, intended for another computer program. This also means that the format of the data is chosen by the program that created it.
Computer programs almost always produce data in a format that has a clear structure. When we parse some input, we are essentially trying to figure out the structure of that input. With binary data, this structure is generally very simple and easy for computers to read.
In other words, it's normally a bit of a waste to figure out the structure of an input whose structure you already know. As parsing isn't free (it takes time and adds complexity to your program), this is why using lexers/parsers on binary data is 'so wrong'.
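To illustrate what "already knowing the structure" can look like in practice, here is a minimal Python sketch. The record layout (a `DEMO` magic value, a version, and a length-prefixed payload) and the names `HEADER`, `write_record` and `read_record` are made up for this example; they are assumptions, not any real file format. Only the standard `struct` module is used.

```python
import struct

# Hypothetical record layout, invented for illustration:
#   4 bytes  magic value b"DEMO"
#   2 bytes  version, unsigned, little-endian
#   4 bytes  payload length, unsigned, little-endian
#   N bytes  payload
HEADER = struct.Struct("<4sHI")


def write_record(version: int, payload: bytes) -> bytes:
    """The producing program decides the layout when it writes the data."""
    return HEADER.pack(b"DEMO", version, len(payload)) + payload


def read_record(data: bytes) -> tuple[int, bytes]:
    """The consuming program reads fields at known offsets; nothing is searched for."""
    magic, version, length = HEADER.unpack_from(data, 0)
    if magic != b"DEMO":
        raise ValueError("not a DEMO record")
    payload = data[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return version, payload


if __name__ == "__main__":
    blob = write_record(3, b"hello")
    print(read_record(blob))
```

There is no tokenizing and no grammar here: the reader jumps straight to fixed offsets because the writer put the fields there. That is roughly the sense in which a general-purpose lexer/parser would be doing work the format has already made unnecessary.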