A simple counterexample is given in a 9-page talk note [1]. It shows that for a certain $8$-dimensional Hilbert space, there is no embedding into any classical probability space.
More concretely, assume the contrary: there is a measurable space $M = (\Omega, \mathcal{F})$ and an embedding, i.e. a map $A \mapsto f_A$ that sends each quantum observable to a classical random variable on $M$, together with a map $\psi \mapsto \rho_\psi$ that sends each (pure) quantum state to a probability measure on $M$, such that
- The induced probability distribution is compatible (condition KS1 in [1]).
- The associated random variables are assigned in a compatible way that respects any measurable re-scaling (condition KS2 in [1]).
Note: The proof only uses a weaker version KS2' of KS2, which says that if $A$ and $B$ are commuting quantum observables (i.e. $AB=BA$), then $f_{AB} = f_A f_B$ (i.e. $\forall \omega \in \Omega, f_{AB}(\omega) = f_A(\omega) f_B(\omega)$).
One then cleverly constructs a state $\Psi$ [1, (18)] and an observable $O = A_1 A_2 A_3$ [1, (15)(17)]. Its expected value is $\langle O\rangle_\Psi = -1$ [1, (24)]. However, one can show that if such an embedding exists, mapping $O$ to $a_1 a_2 a_3$, then $a_1 a_2 a_3$ takes the value $1$ almost everywhere [1, (22)], a contradiction.
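The quantum side of the argument can be checked numerically. The sketch below is only an illustration under an assumed concrete realization, the standard GHZ-Mermin choice, which may differ in detail from the one in [1]: $A_i$ is the Pauli $X$ matrix on qubit $i$, $B_i$ is the Pauli $Y$ matrix on qubit $i$, and $\Psi = (|000\rangle - |111\rangle)/\sqrt{2}$. The helper names (`kron`, `chain`, `on_site`, ...) are made up for this example.

```python
# Assumed realization (standard GHZ-Mermin choice, not taken verbatim from [1]):
# A_i = Pauli X on qubit i, B_i = Pauli Y on qubit i, Psi = (|000> - |111>)/sqrt(2).

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Plain matrix product of nested-list matrices."""
    return [[sum(row[k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for row in a]

def chain(*ms):
    """Product of several matrices, multiplied left to right."""
    out = ms[0]
    for m in ms[1:]:
        out = matmul(out, m)
    return out

def expectation(m, v):
    """<v| m |v> for a vector v given as a flat list."""
    mv = [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]
    return sum(x.conjugate() * y for x, y in zip(v, mv))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]

def on_site(op, site):
    """Embed a single-qubit operator at position `site` of three qubits."""
    ops = [I2, I2, I2]
    ops[site] = op
    return kron(kron(ops[0], ops[1]), ops[2])

A = [on_site(X, i) for i in range(3)]
B = [on_site(Y, i) for i in range(3)]

Q1 = chain(A[0], B[1], B[2])
Q2 = chain(B[0], A[1], B[2])
Q3 = chain(B[0], B[1], A[2])
O = chain(A[0], A[1], A[2])

s = 2 ** -0.5
psi = [s, 0, 0, 0, 0, 0, 0, -s]  # (|000> - |111>)/sqrt(2) in the computational basis

# The Q_i commute pairwise, while A_i and B_i on the same qubit anticommute.
q_commute = matmul(Q1, Q2) == matmul(Q2, Q1) and matmul(Q2, Q3) == matmul(Q3, Q2)
ab_anticommute = matmul(A[1], B[1]) == [[-x for x in row] for row in matmul(B[1], A[1])]

# Operator identity Q1 Q2 Q3 = -A1 A2 A3.
sign_flip = chain(Q1, Q2, Q3) == [[-x for x in row] for row in O]

print(q_commute, ab_anticommute, sign_flip)
print([expectation(Q, psi) for Q in (Q1, Q2, Q3)])  # each should be 1 up to rounding
print(expectation(O, psi))                          # should be -1 up to rounding
```

In this realization $\Psi$ is an eigenvector of each $Q_i$ with eigenvalue $+1$ and of $O$ with eigenvalue $-1$, which is the pair of facts the contradiction relies on.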
Discussions
No locality assumption of any sort is required here. Locality concerns Bell's theorem, not the Kochen-Specker theorem.
Note that [1] claims to prove this for all Hilbert spaces of dimension at least $3$, but as far as I can tell, the proof only shows it for an $8$-dimensional one.
More details from the paper: if you dig into it, the contradiction still comes from noncommutativity. In the clever setup, three observables $Q_1=A_1B_2B_3, Q_2=B_1A_2B_3, Q_3=B_1B_2A_3$ are given. While they commute with one another, their components (the $A$'s and $B$'s) do not all commute. And while
$$\begin{aligned} 1 = \langle Q_1Q_2Q_3 \rangle_\Psi &= \langle (A_1 B_2B_3)(B_1A_2B_3)(B_1B_2A_3) \rangle_\Psi \\ &= \langle A_1 (B_2A_2B_2) A_3 \rangle_\Psi = \langle A_1 (-A_2) A_3 \rangle_\Psi = \langle (-1) A_1 A_2 A_3 \rangle_\Psi, \end{aligned}$$
under the (nonexistent) classical map the same product $Q_1Q_2Q_3 = A_1 (B_2A_2B_2) A_3$ is mapped to $$q_1q_2q_3 = (b_1^2)(b_2^2)(b_3^2) a_1 a_2 a_3 = (+1) a_1a_2a_3.$$ The contradiction thus arises from the sign difference.
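The classical side can be illustrated by brute force. The sketch below assumes, for illustration only, that each hypothetical classical value $a_i, b_i$ is $\pm 1$ (consistent with $b_i^2 = 1$ above) and that the $q_i$ are the products dictated by KS2'. It enumerates all $2^6$ assignments and counts those that reproduce the quantum predictions $q_1 = q_2 = q_3 = 1$ and $a_1a_2a_3 = -1$. The function name is made up for this example.

```python
from itertools import product

def count_consistent_assignments():
    """Count +/-1 assignments with q1 = q2 = q3 = 1 and a1*a2*a3 = -1."""
    count = 0
    for a1, a2, a3, b1, b2, b3 in product((1, -1), repeat=6):
        # Classical images of Q1, Q2, Q3 under the assumed product rule (KS2').
        q1, q2, q3 = a1 * b2 * b3, b1 * a2 * b3, b1 * b2 * a3
        # Since every b_i squares to 1, the product collapses to a1*a2*a3.
        assert q1 * q2 * q3 == a1 * a2 * a3
        if q1 == q2 == q3 == 1 and a1 * a2 * a3 == -1:
            count += 1
    return count

print(count_consistent_assignments())
```

No assignment survives: whenever all three $q_i$ equal $1$, the product $a_1a_2a_3$ is forced to be $+1$, which is the sign mismatch described above.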
Even though KS2' is a much weaker assumption than KS2, and even though KS2' seems natural from the mathematical point of view, to this day I keep asking myself whether KS2' is a reasonable assumption. The paper [1] provides an artificial solution to the hidden variable problem posed above, provided that KS2 is dropped. However, the artificial solution has the issue that the $f_A$ are all independent of one another, which should not be the case. So, to introduce dependency, [1] proposes KS2. However, in my opinion KS2 (or even its weaker version KS2') is probably too strong for that purpose. Instead, we should try to come up with a condition KS2B that respects the Born rule, and solve the problem under KS1 and KS2B. A proposal for KS2B may be: let $A$ be a quantum observable, $\lambda$ an eigenvalue of $A$ with a normalized eigenvector $\psi$, and $\phi$ a test function; then
$$\int_\Omega \phi \, d\rho_{\psi} = \frac{\int_{f_A^{-1}(\lambda)} \phi \, d\rho_{\psi}}{\int_{f_A^{-1}(\lambda)} d\rho_{\psi}}.$$
We may also need to require another natural condition
$$\rho_\psi(f^{-1}_A(\lambda)) = \rho_{Q\psi}(f^{-1}_{QAQ^*}(\lambda)).$$
However, note that KS2 has some history behind it, especially in the topos theory community [2].
Reference