From FIPS 186-4, section 6.4:
"An approved hash function, as specified in FIPS 180, shall be used during the generation of digital signatures."
Thus, if the hashing step is removed, the result is no longer ECDSA. Let's call the resulting modified signature scheme MECDSA.
MECDSA has a potential weakness: it is easy to exhibit a valid MECDSA signature for any message $m\equiv0\pmod n$, namely the signature $(r,s)$ with $r=s=x_A\bmod n$, where $x_A$ is the $x$ coordinate of the signer's public key $Q_A$ (provided that value is nonzero). At verification we get $z=0$, $u_1=0$, $u_2=1$, $u_1\times G+u_2\times Q_A\,=\,Q_A$, hence $x_1=x_A$, so $r\equiv x_1\pmod n$ and the signature verifies. With true ECDSA, it is computationally infeasible to exhibit a message hashing to $0\pmod n$, making this a non-issue.
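To make this forgery concrete, here is a minimal sketch in C on a toy curve. Everything specific in it is an assumption chosen for illustration: the textbook-sized curve $y^2=x^3+2x+2$ over $\mathbb F_{17}$ with base point $G=(5,1)$ of prime order $n=19$, and the hypothetical helper names `mecdsa_verify` and `forge_and_verify`. The private key passed to `forge_and_verify` only serves to derive a demo public key; the forged signature is built from $Q_A$ alone.

```c
#include <assert.h>

/* Toy parameters, assumed for illustration only (far too small to be secure):
   curve y^2 = x^3 + 2x + 2 over GF(17), base point G = (5,1) of prime order 19. */
enum { P = 17, A = 2, N = 19 };

typedef struct { int x, y, inf; } point;   /* inf != 0 marks the point at infinity */

static int mod(int v, int m) { v %= m; return v < 0 ? v + m : v; }

/* modular inverse by exhaustive search; acceptable for toy moduli only */
static int inv_mod(int a, int m) {
    a = mod(a, m);
    assert(a != 0);
    for (int i = 1; i < m; ++i)
        if (a * i % m == 1) return i;
    return 0;                               /* not reached for prime m */
}

/* affine point addition on the toy curve */
static point add(point p, point q) {
    point r = {0, 0, 1};
    int l;
    if (p.inf) return q;
    if (q.inf) return p;
    if (p.x == q.x) {
        if (mod(p.y + q.y, P) == 0) return r;                  /* p = -q */
        l = mod(3 * p.x * p.x + A, P) * inv_mod(2 * p.y, P);   /* tangent slope */
    } else {
        l = mod(q.y - p.y, P) * inv_mod(q.x - p.x, P);         /* chord slope */
    }
    l = mod(l, P);
    r.x = mod(l * l - p.x - q.x, P);
    r.y = mod(l * (p.x - r.x) - p.y, P);
    r.inf = 0;
    return r;
}

/* double-and-add scalar multiplication; k is reduced mod the group order */
static point mul(int k, point p) {
    point r = {0, 0, 1};
    for (k = mod(k, N); k > 0; k >>= 1) {
        if (k & 1) r = add(r, p);
        p = add(p, p);
    }
    return r;
}

/* MECDSA verification: ECDSA verification with z = m instead of z = hash(m) */
static int mecdsa_verify(point g, point q, int m, int r, int s) {
    if (r < 1 || r >= N || s < 1 || s >= N) return 0;
    int w  = inv_mod(s, N);
    int u1 = mod(m, N) * w % N;
    int u2 = r * w % N;
    point x = add(mul(u1, g), mul(u2, q));
    return !x.inf && mod(x.x, N) == r;
}

/* forge (r,s) = (x_A mod n, x_A mod n) from the public key, then verify it for m */
static int forge_and_verify(int d, int m) {
    point g = {5, 1, 0};
    point q = mul(d, g);        /* d is used only to obtain a demo public key */
    int r = mod(q.x, N);
    return mecdsa_verify(g, q, m, r, r);
}
```

With the public key $Q_A=2G=(6,3)$, `forge_and_verify(2, 0)` and `forge_and_verify(2, 19)` return 1 (both messages are $\equiv0\pmod n$), while `forge_and_verify(2, 1)` returns 0.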
Update: things get much worse. We can efficiently exhibit an arbitrary number of (message, signature) pairs passing verification (but with no control over the message); see this answer.
It is not quite clear whether MECDSA signs an integer $m$ or a bitstring, nor what the maximum $m$ or the maximum size of the bitstring is:
- If $m$ were an unbounded integer, the signature for $m$ would also be valid for $m+n$, where $n$ is the order of the group ($n\approx 2^{256}-2^{224}$ for curve P-256). To solve this, we'd need to restrict to $0<m<n$.
- If $m$ were a bitstring of exactly as many bits as $n$, then we would also have a number of bitstrings with interchangeable signatures: those whose integer values differ by $n$ (about $2^{224}$ such pairs of 256-bit bitstrings for curve P-256).
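The reasoning behind both items can be written out as follows, using the same rough approximation of $n$ for P-256 (a sketch, not an exact count). Verification only uses $z=m\bmod n$, so

$$m'=m+n\;\implies\;z'=m'\bmod n=m\bmod n=z,$$

and $(r,s)$ verifies for $m'$ whenever it verifies for $m$. For 256-bit inputs, the colliding pairs $(m-n,\,m)$ are those with $n\le m<2^{256}$, hence

$$\#\{m : n\le m<2^{256}\}\;=\;2^{256}-n\;\approx\;2^{224}.$$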
In either case, we are slightly in trouble if the public key to be signed uses the same curve as the ECDSA signature: the usual ("uncompressed") form of a public key is twice as wide as $n$; and even with point compression (which is not all that usual, and is not considered by FIPS 186-4), a public key is one bit wider than $n$. To solve this, we'd need to restrict the allowable public keys (e.g. to those with the sign bit clear and the $x$ coordinate less than $n$, thus removing very slightly more than half of the valid public keys).
Another important issue is that signatures of public keys, also known as public key certificates, typically need attributes, such as who the key belongs to. We could spare a few more bits for a serial number at the cost of further restrictions on the public key, but each bit spared doubles the difficulty of generating a public/private key pair. Thus in the end we might need to sign two messages instead of one, with something to link the two signed messages. That's feasible, but hairy!
With such a restriction to signing messages $m$ with $0<m<n$, and a non-standard form of public key and/or certificate content, carefully enforced/compensated at verification, MECDSA could be used to sign a public key. But it is clumsy, and is it safe? I'm not entirely sure; that's not a well-studied problem. I would never prescribe it, and would feel very uncomfortable endorsing or doing it, for lack of a positive security argument and because of the usability issues. Hashing is cheap (it can be done in roughly 1 kByte of code), and signing the hashed message is the way to go.
Addition: here is a tiny (1.2 KiB) implementation of SHA-256:
// tinysha256 - a public domain, compact implementation of sha256
// input is limited to 0 to 0xffffffff octet(s) in memory
// gcc 5.3.0 for x86 with -Os compiles this to 1280 bytes
#include <stdint.h>
// hash size bytes of memory pointed to by data
void tinysha256(
    uint8_t hash[32],               // result
    const uint8_t * data,           // input
    uint32_t size                   // input size
    )
{
    uint32_t s[8] = {               // init state
        0x6a09e667,0xbb67ae85,0x3c6ef372,0xa54ff53a,0x510e527f,0x9b05688c,0x1f83d9ab,0x5be0cd19};
    uint32_t w[16];                 // data buffer
    uint32_t x;                     // accumulator; the warning about uninitialized x can be safely
                                    // ignored, since stale bits are shifted out before x is stored
    const uint8_t *e = data+size;   // end of data
    int n = 0;                      // byte index in the current block, or padding state (64, 65)
    for(;;) {
        while (data!=e) {           // while there is data..
            x = (x<<8)+ *data++;    // accumulate it in x, big-endian
            if ((3&~n)==0) {        // a full 32-bit chunk
                w[n>>2] = x;
                if (n==63) {
                    n = 0;
                    goto c;         // compression
                }
            }
            ++n;
        }
        switch(n) {                 // handle padding
        case 64:                    // padding overflowed: build an extra block
            n = x = 0;
            break;
        case 65:                    // final block done: output the state, big-endian
            n = 31;
            do
                hash[n] = s[n>>2]>>((3&~n)<<3);
            while(--n>=0);
            return;
        default:                    // append the 0x80 padding octet
            x = ((x<<8)|0x80)<<((3&~n)<<3);
        }
        w[n >>= 2] = x;
        while (++n!=14) {
            if (n==16) {            // extra compression needed
                n = 64;
                goto c;
            }
            w[n] = 0;
        }
        w[14] = (uint32_t)(size>>29); // length padding (size in bits, 64-bit big-endian)
        w[15] = (uint32_t)(size<< 3);
        n = 65;
    c:  {                           // compression
            static const uint32_t k[64] = {
                0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
                0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
                0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
                0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
                0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
                0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
                0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
                0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2
            };
            uint32_t r[8];          // working variables a..h
            int j = 7;
            do
                r[j] = s[j];
            while (--j>=0);
            j = 0;
            do
            {
#define RL(value, bits) (((value) << (bits)) | ((value) >> (32 - (bits))))
#define S0(x) (RL((x),30)^RL((x),19)^RL((x),10))
#define S1(x) (RL((x),26)^RL((x),21)^RL((x),7))
#define S2(x) (RL((x),25)^RL((x),14)^((x)>>3))
#define S3(x) (RL((x),15)^RL((x),13)^((x)>>10))
#define S4(x,y,z) ((z)^((x)&((y)^(z))))
#define S5(x,y,z) (((x)&(y))^((z)&((x)^(y))))
                uint32_t t = r[7] + S1(r[4]) + S4(r[4], r[5], r[6]) + k[j] + w[j&15];
                r[7] = r[6]; r[6] = r[5]; r[5] = r[4]; r[4] = r[3] + t;
                r[3] = r[2]; r[2] = r[1]; r[1] = r[0]; r[0] = t + S0(r[1]) + S5(r[1], r[2], r[3]);
                w[j&15] += S3(w[(j+14)&15]) + w[(j+9)&15] + S2(w[(j+1)&15]); // message schedule
            }
            while(++j!=64);
            j = 7;
            do
                s[j] += r[j];
            while (--j>=0);
        }
    }
}