TL;DR: the separate X and Y coordinates appear to date back to a very early draft of the JOSE specification, where they mirror the way the modulus and exponent are specified separately for RSA.
First of all, let's take a look at the EC2 key format. It does not just include the public key point.
| Field | Required for Public Key? | Required for Private Key? |
|-------|--------------------------|---------------------------|
| kty   | Yes                      | Yes                       |
| crv   | Yes                      | Yes                       |
| x     | Yes                      | Optional                  |
| y     | Yes                      | Optional                  |
| d     | No                       | Yes (if private key)      |
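To make the table concrete, here is a minimal sketch in Python that checks which fields are missing from a key. It assumes the key is a plain dict using the field names from the table; the function name `missing_ec2_fields` is made up for illustration and is not part of any COSE library.

```python
# Required fields per the table above (names as in the table, not COSE integer labels).
PUBLIC_REQUIRED = {"kty", "crv", "x", "y"}
PRIVATE_REQUIRED = {"kty", "crv", "d"}  # x and y are optional for a private key


def missing_ec2_fields(key: dict, private: bool = False) -> set:
    """Return the required field names that are absent from the key (hypothetical helper)."""
    required = PRIVATE_REQUIRED if private else PUBLIC_REQUIRED
    return required - set(key)


# Placeholder byte strings, not real key material.
public_key = {"kty": "EC2", "crv": "P-256", "x": bytes(32), "y": bytes(32)}
private_key = {"kty": "EC2", "crv": "P-256", "d": bytes(32)}

print(missing_ec2_fields(public_key))                # nothing missing
print(missing_ec2_fields(private_key, private=True)) # nothing missing
print(missing_ec2_fields(private_key))               # x and y missing for public use
```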
The idea behind this is to have an easily parsable way of getting at all the key components so that they can be used directly within the algorithm. Having the coordinates available may also be used to validate the public key, something that is generally not required for the newer curves like Ed25519.
The EC2 structure was copied from the EC key type structure found in the related JOSE specification, where the values are base64 encoded. Of course it doesn't make sense to use base64 within the binary COSE specification. The first official mention of the EC key type is in the second draft version of the JOSE specification. There it is specified as a public key only, with the curve name as text and X and Y as base64 encoded values. It sits next to the RSA key type specification, where the modulus and exponent are defined separately. So I think the split into X and Y is mainly there for historical reasons.
The newer curves don't have these two elements X and Y; instead they have a single, well-defined binary structure. Keys are always represented the same way and are used as such by the libraries, so no confusion can arise over how to format the key parameter. The structure is also generalized to allow different types of public key.
So the newer keys use the later defined OKP structure, RFC 8152 Section 13.2:
> A new key type is defined for Octet Key Pairs (OKPs). Do not assume
> that keys using this type are elliptic curves. This key type could
> be used for other curve types (for example, mathematics based on
> hyper-elliptic surfaces).
Basically the OKP removes y and uses the x parameter for the entire encoded public key.
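As a rough illustration of the difference in shape, the sketch below builds both key types as Python dicts using the integer labels from RFC 8152 (kty = 1, crv = -1, x = -2, y = -3). The byte strings are zero-filled placeholders rather than real keys, and the helper `has_y` is a made-up name.

```python
# COSE labels from RFC 8152: common parameter kty = 1; key type parameters crv = -1, x = -2, y = -3.
KTY, CRV, X, Y = 1, -1, -2, -3

# Key type values: OKP = 1, EC2 = 2. Curve values: P-256 = 1, Ed25519 = 6.
ec2_public = {KTY: 2, CRV: 1, X: bytes(32), Y: bytes(32)}  # placeholder coordinates
okp_public = {KTY: 1, CRV: 6, X: bytes(32)}                # x holds the whole encoded key


def has_y(key: dict) -> bool:
    """Hypothetical helper: does this key map carry a separate y parameter?"""
    return Y in key


print(has_y(ec2_public))  # EC2 carries x and y separately
print(has_y(okp_public))  # OKP only has x
```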
I would not have been surprised if the authors had kept to a single binary encoded structure for the public keys, had they worked the other way around. As it is, the split between x and y saves the developer from having to parse out x and y for the few libraries that require them separately.
Then again, x and y have to be recombined for libraries that use the normal compressed / uncompressed point format, which starts with one of these prefix bytes:
02 - compressed: statically sized, unsigned big endian X coordinate (Y is even)
03 - compressed: statically sized, unsigned big endian X coordinate (Y is odd)
04 - uncompressed: statically sized, unsigned big endian X and then Y coordinate
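Recombining is not much work, though. A minimal sketch in Python, assuming x and y are already fixed-size big endian byte strings as they appear in the EC2 structure; `encode_point` is a made-up name, and the function only does the encoding, it does not check that the point lies on the curve.

```python
def encode_point(x: bytes, y: bytes, compressed: bool = False) -> bytes:
    """Combine separate x and y coordinates into the prefixed point format (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same, curve-dependent size")
    if not compressed:
        return b"\x04" + x + y
    # The compressed form keeps only x; the prefix records whether y is even (02) or odd (03).
    prefix = b"\x03" if y[-1] & 1 else b"\x02"
    return prefix + x


# Placeholder coordinates for a 32-byte curve size, not a real curve point.
x = bytes(31) + b"\x01"
y = bytes(31) + b"\x02"

uncompressed = encode_point(x, y)
compressed = encode_point(x, y, compressed=True)
print(len(uncompressed), uncompressed[0])  # 65 bytes, prefix 0x04
print(len(compressed), compressed[0])      # 33 bytes, prefix 0x02 because y is even
```

Going the other way for the uncompressed form is just slicing; only decompressing 02 / 03 requires actual curve arithmetic.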
So I agree that there is little gain in separating X and Y.