Here is the definition of a ($k$-tape) Turing machine from Arora and Barak; every other definition I've seen is effectively the same.
- A finite alphabet $\Gamma$ with distinguished $\square$ blank and $\triangleright$ start symbols
- A finite set of states $Q$ with distinguished $q_\text{start}$ and $q_\text{halt}$ states
- A transition function $\delta: Q \times \Gamma^k \rightarrow Q \times \Gamma^{k-1} \times \{\text{L}, \text{S}, \text{R}\}^k$
The transition function $\delta$ has no restrictions other than being a mathematical function in, say, ZFC. This seems to permit Turing machines with arbitrarily wacky transition functions. For a concrete example, let $k=2$, $\Gamma = \{\square, \triangleright,0,1\}$, $Q=\{q_\text{start},q_\text{halt}\}$, and $\delta(q,\gamma) = (q_\text{halt},\beta,\text{S},\text{S})$ for every $q \in Q$ and $\gamma \in \Gamma^2$, where $\beta$ is a constant defined as the parity (in $\{0,1\}$) of the number of cyclic tag systems definable in $10^{10^{100}}$ bits or fewer that halt given an initial word of $1$. This transition function is uncomputable; that seems strange. (It's perfectly well-defined in a mathematical sense, though: halting is a well-defined property ($\exists n . \text{word is empty at step $n$}$), just not a computable one.)
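
To make the shape of this example concrete, here is a minimal Python sketch of the $k=2$ machine with $\delta$ written out as a finite table. It is only an illustration under stated assumptions: the names `make_delta`, `GAMMA`, `Q`, `delta0` and `delta1` are mine, `_` and `>` stand in for $\square$ and $\triangleright$, each value carries one head movement per tape as in Arora and Barak's definition, and $\beta$ is passed in as a parameter because the sketch makes no attempt to evaluate its definition.

```python
from itertools import product

# ASCII stand-ins (assumption): "_" for the blank symbol, ">" for the start symbol.
GAMMA = ["_", ">", "0", "1"]
Q = ["q_start", "q_halt"]

def make_delta(beta):
    """Build the constant transition table from the example for a given bit beta.

    beta is a parameter here; the sketch does not try to compute the
    parity described in the text.
    """
    assert beta in ("0", "1")
    # Key:   (state, symbol under head 1, symbol under head 2), i.e. Q x Gamma^2.
    # Value: (new state, symbol written to the work tape, one move per head).
    return {
        (q, a, b): ("q_halt", beta, ("S", "S"))
        for q, a, b in product(Q, GAMMA, GAMMA)
    }

# The two candidate tables, one for each possible value of beta.
delta0 = make_delta("0")
delta1 = make_delta("1")
```

Each table has $|Q| \cdot |\Gamma|^2 = 32$ entries; the two candidates differ only in the symbol they write.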
At first glance, this problem doesn't seem easy to fix. Requiring that $\delta$ be computable would be circular, and even if something like that could be achieved in a non-circular manner, I could still induce weirdness by smuggling a large amount of computation into $Q$ and $\delta$. For example, $Q$ could include all binary strings up to length $10^{10^{100}}$, and $\delta$ could then accumulate the input into the state one symbol at a time and finally transition, in a single step, to a state that encodes the full solution.
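
The "smuggling" construction can be sketched at toy scale. The following is a rough illustration, not a faithful $k$-tape machine: it assumes a single read-only head that moves right on every step, a hypothetical length bound `N = 3` in place of $10^{10^{100}}$, and a stand-in predicate `answer` (parity of the 1s) in place of whatever problem is being solved. All names are made up for the example.

```python
from itertools import product

N = 3  # hypothetical bound on input length; the text uses 10**(10**100)

def answer(bits):
    """Stand-in for an arbitrary predicate on inputs; here, parity of the 1s."""
    return bits.count("1") % 2

# States: every binary string of length <= N (the input read so far),
# plus two halting states that carry the answer.
prefixes = ["".join(p) for n in range(N + 1) for p in product("01", repeat=n)]
Q = prefixes + ["halt0", "halt1"]

# delta maps (state, symbol under the head) to the next state.
delta = {}
for s in prefixes:
    if len(s) < N:
        for bit in "01":
            delta[(s, bit)] = s + bit      # accumulate the input into the state
    delta[(s, "_")] = f"halt{answer(s)}"   # blank reached: jump to the answer in one step

def run(word):
    """Feed word (at most N bits) followed by a blank through delta."""
    state = ""  # the empty prefix plays the role of q_start
    for sym in word + "_":
        state = delta[(state, sym)]
    return state
```

For example, `run("101")` ends in `halt0` and `run("111")` ends in `halt1`. The number of non-halting states is $2^{N+1}-1$, so all of the work is paid for in the size of $Q$ and $\delta$ rather than in the number of steps.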
Additionally, I chose cyclic tag systems for the previous example because they seemed immune to this problem, since their definition doesn't involve a function. Formally, however, the production rules of a cyclic tag system are specified by a set, and sets can also easily be made uncomputable (e.g. the set of finite binary strings that, when interpreted as ASCII, form a valid halting Brainfuck program). Brainfuck seems the safest computational model so far, but we can still construct an individual Brainfuck program whose instructions depend on some uncomputable value. I suspect we could do this for any model of computation, because we can "construct" an uncomputable instance of pretty much any mathematical object.
I did some cursory looking around to see if this has been talked about before, but didn't find anything. Is this knowledge so common that it's not mentioned? Is it such a technicality that it's ignored? Am I making this all up? What's going on?