I suggest that you tweak the distance metric slightly, and then tweak the split function in your k-d tree to match it. You can define any distance metric that you want, to capture/measure similarity between two points. And you can modify a standard k-d tree so it works well with the revised distance function.
When computing the distance between two angles $r,s$, a helpful definition is
$$d(r,s) = \min(|r-s|,|r+360-s|,|r-360-s|),$$
or equivalently, $d(r,s) = \min(|r-s| \bmod 360,\; 360 - (|r-s| \bmod 360))$. Here I am assuming $r,s$ are measured in degrees; if they are measured in radians, replace $360$ by $2\pi$.
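As a quick sketch of this angular metric in Python (the function name `angular_distance` is mine):

```python
def angular_distance(r, s):
    """Shortest separation between angles r and s in degrees, in [0, 180]."""
    diff = abs(r - s) % 360.0
    return min(diff, 360.0 - diff)  # fold back across the wraparound
```

For example, `angular_distance(350, 10)` gives `20`, not `340`.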
The total distance can be obtained by computing the distance for each coordinate/dimension, squaring those distances, summing the squares, and taking the square root of the sum. In other words,
$$D(x,y) = \bigg( \sum_i d_i(x_i,y_i)^2 \bigg)^{1/2},$$
where $d_i(\cdot,\cdot)$ is a distance metric that is appropriate for the $i$th dimension: $d_i(x_i,y_i) = |x_i-y_i|$ if the $i$th dimension contains ordinary data, or the angular distance metric listed above if the $i$th dimension contains angular data.
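As a sketch in Python (the helper name `mixed_distance` and the `angular_dims` argument, a set of indices of angular dimensions, are my own):

```python
import math

def mixed_distance(x, y, angular_dims):
    """The combined distance D(x, y): per-dimension distances, Euclidean-style.

    angular_dims is the set of indices i whose coordinates are angles in
    degrees and should wrap around at 360.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if i in angular_dims:
            diff = abs(xi - yi) % 360.0
            di = min(diff, 360.0 - diff)  # shortest arc
        else:
            di = abs(xi - yi)  # ordinary absolute difference
        total += di * di
    return math.sqrt(total)
```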
Then, I suggest you define the "nearest neighbor" using this distance measure $D(\cdot,\cdot)$.
Next, you need to adapt the k-d tree so it will be useful for finding the nearest neighbor.
My thought would be to build a k-d tree on all 12 dimensions, but change how the "split" works for the 3 angular dimensions, to accommodate the wrap-around semantics.
Normally, we split based on a threshold $\tau$. All data points $x$ with $x_i \le \tau$ go to one group, and all data points with $x_i > \tau$ go to the other group. This splits the range $(-\infty,+\infty)$ of all possible values into two subranges: $(-\infty,\tau]$ and $(\tau,+\infty)$.
With angular data, one plausible approach is to split based on a threshold $\tau$, but adapt the comparison to take wraparound into account. Specifically, all data points $x$ with $\tau-180 < x_i \le \tau$ or $x_i > \tau+180$ go to one group, and all data points with $\tau < x_i \le \tau+180$ or $x_i < \tau-180$ go to the other group. (Here I assume $x_i$ is an angle measured in degrees. If it is measured in radians, replace $180$ with $\pi$.) In other words, the threshold $\tau$ basically splits the range $[0,360)$ of all possible angles into two subranges: $(\tau-180 \bmod 360, \tau]$ and $(\tau, \tau+180 \bmod 360]$, with all values taken modulo 360 to take into account wraparound.
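The wraparound comparison is easier to implement by normalizing the signed offset from $\tau$; a Python sketch (the name `wrap_side` is mine):

```python
def wrap_side(x_i, tau):
    """Side of the wraparound split at threshold tau (angles in degrees).

    Returns 0 if x_i lies in the half-circle (tau - 180, tau] (mod 360),
    and 1 if it lies in (tau, tau + 180] (mod 360).
    """
    offset = (x_i - tau) % 360.0  # in [0, 360)
    if offset > 180.0:
        offset -= 360.0           # now in (-180, 180]
    return 0 if offset <= 0.0 else 1
```

For example, with $\tau = 10$, the angle $350$ lands on side 0 (it is within $180°$ counterclockwise of $\tau$), while $30$ lands on side 1.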
Then, for each dimension, the k-d tree would use either a normal split or a wraparound split, according to whether that dimension refers to normal data or angular data.
Finally, you can adapt standard algorithms for building a k-d tree to work with this modification, and you can adapt standard algorithms for finding the nearest neighbor by traversing the k-d tree to work with this modified distance metric and modified data structure.
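To make this concrete, here is a minimal Python sketch of both adaptations, building and searching (all names here, `Node`, `build`, `nearest`, `wrap_side`, are mine; choosing $\tau$ as the median of the raw values on an angular axis is just one plausible choice, not a claim about the best split). The pruning step is the one genuinely new idea: the boundary of a wraparound split is the pair of angles $\{\tau, \tau+180\}$, so the far half-circle can be pruned only if neither boundary angle is within the current best distance.

```python
import math

def ang_dist(a, b):
    """Shortest arc between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def dist(x, y, angular_dims):
    """The combined distance D from above (angular dims wrap at 360)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        di = ang_dist(xi, yi) if i in angular_dims else abs(xi - yi)
        total += di * di
    return math.sqrt(total)

def wrap_side(v, tau):
    """0 if v lies in the half-circle (tau-180, tau] mod 360, else 1."""
    offset = (v - tau) % 360.0
    return 0 if offset == 0.0 or offset > 180.0 else 1

class Node:
    def __init__(self, point, axis, tau, left, right):
        self.point, self.axis, self.tau = point, axis, tau
        self.left, self.right = left, right

def build(points, angular_dims, depth=0):
    """Build a k-d tree, cycling through dimensions; angular dimensions
    use the wraparound split instead of the ordinary one."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    tau = pts[mid][axis]  # median of raw values, even on angular axes
    rest = pts[:mid] + pts[mid + 1:]
    if axis in angular_dims:
        left = [p for p in rest if wrap_side(p[axis], tau) == 0]
        right = [p for p in rest if wrap_side(p[axis], tau) == 1]
    else:
        left = [p for p in rest if p[axis] <= tau]
        right = [p for p in rest if p[axis] > tau]
    return Node(pts[mid], axis, tau,
                build(left, angular_dims, depth + 1),
                build(right, angular_dims, depth + 1))

def nearest(node, query, angular_dims, best=None):
    """Standard NN descent; only the near-child choice and the
    distance-to-splitting-boundary used for pruning are changed."""
    if node is None:
        return best
    d = dist(query, node.point, angular_dims)
    if best is None or d < best[0]:
        best = (d, node.point)
    axis, tau = node.axis, node.tau
    if axis in angular_dims:
        on_left = wrap_side(query[axis], tau) == 0
        # Split boundary is the angle pair {tau, tau+180}: prune the far
        # half-circle only if neither boundary angle is reachable.
        plane = min(ang_dist(query[axis], tau),
                    ang_dist(query[axis], (tau + 180.0) % 360.0))
    else:
        on_left = query[axis] <= tau
        plane = abs(query[axis] - tau)
    near, far = (node.left, node.right) if on_left else (node.right, node.left)
    best = nearest(near, query, angular_dims, best)
    if plane < best[0]:
        best = nearest(far, query, angular_dims, best)
    return best
```

Usage would look like `tree = build(points, angular_dims={1}); nearest(tree, query, {1})`, where index 1 marks an angular dimension.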
A minor side note: If you are finding nearest neighbors for nearest-neighbor classification or a similar purpose, it is often helpful to standardize all of the coordinates (subtract off the mean for that coordinate, divide by the standard deviation), before computing the distance. If you do that, you'll need to divide $360$ by the standard deviation in the above distance metric for angles.