Suppose I have a short-secret LWE instance $As+e=b\mod q$. If I treat this as a single matrix, it becomes an ISIS problem:
$$ \begin{pmatrix} I &A\end{pmatrix}\begin{pmatrix} e \\ s\end{pmatrix}=b\mod q$$
Any short solution to this problem solves my LWE problem. If my ISIS solver expects a uniformly random matrix, I can multiply both sides on the left by a random invertible matrix $B$. Thus LWE with parameters $(n,m)$ and secret and error bounded in norm by $\beta$ reduces to ISIS$_{m,n+m,\beta}$.
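To convince myself of the bookkeeping, here is a minimal Python sketch of this direction. The parameters `q, n, m, beta`, the helper names `mat_vec` and `lwe_to_isis`, and the use of entries in $[-\beta,\beta]$ as the notion of "short" are all my own illustrative assumptions; it only checks that the stacked vector $(e,s)$ satisfies the ISIS equation, and does not solve anything.

```python
import random

def mat_vec(M, v, q):
    """Return M @ v mod q for a list-of-rows matrix M."""
    return [sum(a * x for a, x in zip(row, v)) % q for row in M]

def lwe_to_isis(A, b):
    """Map an LWE instance (A, b), A of shape m x n, to the ISIS instance ([I | A], b)."""
    m = len(A)
    identity = [[1 if i == j else 0 for j in range(m)] for i in range(m)]
    return [identity[i] + A[i] for i in range(m)], b

rng = random.Random(0)
q, n, m, beta = 97, 4, 6, 2  # toy parameters, chosen only for illustration

# Sample a short-secret LWE instance b = A s + e mod q.
A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
s = [rng.randint(-beta, beta) for _ in range(n)]
e = [rng.randint(-beta, beta) for _ in range(m)]
b = [(v + err) % q for v, err in zip(mat_vec(A, s, q), e)]

M, target = lwe_to_isis(A, b)
x = e + s  # the stacked vector (e, s), length m + n
isis_ok = mat_vec(M, x, q) == target
```

The ISIS matrix `M` has `m` rows and `n + m` columns, matching the ISIS$_{m,n+m,\beta}$ shape above.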
Conversely, given an ISIS problem $As=t\mod q$, I can row-reduce $A$ to bring it to the same form as above: $UA = (I | A')$ for some invertible $U$ (assuming the first $m$ columns of $A$ are invertible mod $q$, or after permuting columns so that they are). Then I give $(A',Ut\mod q)$ to my LWE oracle and it returns $(s,e)$ such that $$A's+e=Ut\mod q$$ and so $(e,s)$ solves my ISIS problem: ISIS$_{m,n,\beta}$ reduces to LWE$_{m,n-m,\beta}$.
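A rough Python sketch of the row-reduction step, assuming $q$ is prime and the leftmost $m\times m$ block of $A$ is invertible (the loop simply resamples until it is). The helpers `mat_mul` and `inverse_mod` and the toy sizes are my own choices. There is no LWE oracle here: I plant a short ISIS solution $x=(e,s)$ and check that it is exactly what an oracle for $(A',Ut)$ would be asked to find.

```python
import random

def mat_mul(X, Y, q):
    """Return X @ Y mod q for list-of-rows matrices."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) % q for col in cols] for row in X]

def inverse_mod(M, q):
    """Gauss-Jordan inverse of a square matrix mod a prime q; None if singular."""
    k = len(M)
    aug = [[v % q for v in row] + [int(i == j) for j in range(k)]
           for i, row in enumerate(M)]
    for col in range(k):
        pivot = next((r for r in range(col, k) if aug[r][col]), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, q)  # modular inverse, Python 3.8+
        aug[col] = [v * inv % q for v in aug[col]]
        for r in range(k):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % q for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

rng = random.Random(1)
q, m, n, beta = 97, 3, 7, 2  # the ISIS matrix is m x n; toy sizes only

# Resample until the left m x m block is invertible, so that U exists.
while True:
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    U = inverse_mod([row[:m] for row in A], q)
    if U is not None:
        break

x = [rng.randint(-beta, beta) for _ in range(n)]              # planted short solution
t = [[sum(a * v for a, v in zip(row, x)) % q] for row in A]   # t = A x as a column

UA = mat_mul(U, A, q)             # should equal (I | A')
A_prime = [row[m:] for row in UA]
Ut = [row[0] for row in mat_mul(U, t, q)]

e, s = x[:m], x[m:]
lhs = [(sum(a * v for a, v in zip(row, s)) + err) % q for row, err in zip(A_prime, e)]
identity_ok = all(UA[i][j] == int(i == j) for i in range(m) for j in range(m))
lwe_ok = lhs == Ut                # A' s + e = U t mod q
```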
Going further, ISIS and SIS are equivalent (sort of): given an ISIS problem $(A,t,\beta)$, pick a random short nonzero $y\in\mathbb{Z}_q$, then give $[A|-y^{-1}t]$ to an SIS solver. It will return a short vector $(s,c)$ such that $As=cy^{-1}t\mod q$. Waving my hands a bit, if $y$ is chosen randomly there will probably eventually be a collision where $c=y$, and one obtains $As=t\mod q$.
Conversely, given an SIS problem $(A,\beta)$, let $A=[A'|a]$ where $a$ is the last column of $A$, then iterate through random short $y$ and give $(A',-ya)$ to the ISIS solver. It will give you some short $s$ such that $A's=-ya\mod q$, so $A's+ya=0\mod q$ and $(s,y)$ is an SIS solution.
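Spelling out the algebra behind the two directions above, as a sanity check rather than a formal reduction. Here $c$ is my name for the last coordinate of the vector the SIS solver returns, and $y$ is assumed to be invertible mod $q$:

$$ \begin{pmatrix} A & -y^{-1}t\end{pmatrix}\begin{pmatrix} s \\ c\end{pmatrix}=0\mod q \iff As=c\,y^{-1}t\mod q,$$

which becomes $As=t\mod q$ exactly when $c=y$. In the other direction, with $A=[A'|a]$,

$$ \begin{pmatrix} A' & a\end{pmatrix}\begin{pmatrix} s \\ y\end{pmatrix}=A's+ya\mod q,$$

so $(s,y)$ is an SIS solution exactly when $A's=-ya\mod q$; that is the target the ISIS solver has to be given for this choice of $y$.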
This all seems completely reasonable to me. Except, the literature treats these problems very differently! No one says Kyber is based on the hardness of Module-SIS.
The main difference seems to be parameter regimes: LWE is often thought of as "injective", where $(s,e)\mapsto As+e$ is likely an injective function. One can take $A$ to be a square matrix, so that following the above reductions, the corresponding ISIS and SIS problems will involve matrices only twice as wide as they are tall, while typical descriptions of SIS use (number of columns)=$O($(number of rows)$\log q)$. With the smaller regime, SIS and ISIS might have no solutions at all for certain $\beta$, and the above reductions might fail (though I think one can argue that if a solution exists for one, there is some value of $y$ that will make the reduction produce that solution). With the larger regime, counting arguments say that the map $s\mapsto As$ on short $s$ should be surjective.
Yet, the above reductions imply that ISIS and SIS should still be hard with (number of columns) = 2(number of rows).
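The counting heuristic behind the two regimes can be sketched in a few lines of Python. I am assuming "short" means every entry lies in $[-\beta,\beta]$ (an infinity-norm bound, which the discussion above does not fix), and the function names and numbers are purely illustrative. It just compares the number of short inputs, $(2\beta+1)^{\text{cols}}$, with the number of possible outputs, $q^{\text{rows}}$.

```python
import math

def log2_short_vectors(cols, bound):
    """log2 of the number of integer vectors of length cols with entries in [-bound, bound]."""
    return cols * math.log2(2 * bound + 1)

def looks_surjective(rows, cols, q, bound):
    """Heuristic only: more short inputs than outputs q^rows suggests s -> As mod q hits most targets."""
    return log2_short_vectors(cols, bound) > rows * math.log2(q)

rows, q = 256, 3329  # illustrative numbers

# Matrix twice as wide as tall, as produced by the LWE-to-ISIS step with square A.
narrow = looks_surjective(rows, 2 * rows, q, bound=2)

# Matrix with about rows * log2(q) columns, as in typical SIS descriptions.
wide = looks_surjective(rows, rows * math.ceil(math.log2(q)), q, bound=1)
```

With these numbers `narrow` comes out false and `wide` comes out true, which is the injective versus surjective split described above. The comparison says nothing about hardness, only about whether solutions should be expected to exist.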
Finally, it's easy to show (as long as LWE stays injective) that wider secret and error distributions are harder for LWE (just add your own error before passing to the oracle). The situation for surjective SIS seems to be the opposite: if you have an oracle for a smaller norm bound, just call that and return the answer (as long as $s\mapsto As$ is still surjective in the smaller regime).
The trick seems to be that LWE is actually a promise problem: given $(A,b)$, you're typically promised that a solution exists (or you need to decide this, which is just as hard). But in the surjective parameter regime for SIS/ISIS, there is no promise. The above reduction would fail because if you choose a norm bound so small that there are no solutions, a smaller-norm-SIS oracle is useless.
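For the LWE side, the "add your own error" step mentioned above can be sketched as follows. The helper `widen`, the bounds `beta` and `extra`, and the toy sizes are my own assumptions, and the oracle is only simulated by adding the known vectors. A real reduction would also have to argue that the sum of the two distributions is what the wider oracle expects, which this sketch does not address.

```python
import random

def mat_vec(M, v, q):
    """Return M @ v mod q for a list-of-rows matrix M."""
    return [sum(a * x for a, x in zip(row, v)) % q for row in M]

def widen(A, b, q, extra, rng):
    """Return b + A s' + e' mod q for fresh s', e' with entries in [-extra, extra]."""
    m, n = len(A), len(A[0])
    s2 = [rng.randint(-extra, extra) for _ in range(n)]
    e2 = [rng.randint(-extra, extra) for _ in range(m)]
    shift = mat_vec(A, s2, q)
    b2 = [(bi + si + ei) % q for bi, si, ei in zip(b, shift, e2)]
    return b2, s2, e2

rng = random.Random(2)
q, n, m, beta, extra = 97, 4, 6, 1, 2  # toy parameters

# Narrow instance: b = A s + e mod q with entries of s, e in [-beta, beta].
A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
s = [rng.randint(-beta, beta) for _ in range(n)]
e = [rng.randint(-beta, beta) for _ in range(m)]
b = [(v + err) % q for v, err in zip(mat_vec(A, s, q), e)]

b2, s2, e2 = widen(A, b, q, extra, rng)

# Stand-in for the wider-distribution oracle: it would return (s + s', e + e').
wide_s = [u + v for u, v in zip(s, s2)]
wide_e = [u + v for u, v in zip(e, e2)]
oracle_consistent = [(v + err) % q for v, err in zip(mat_vec(A, wide_s, q), wide_e)] == b2

# Subtracting the shifts we added ourselves recovers the original narrow solution.
recovered = ([u - v for u, v in zip(wide_s, s2)], [u - v for u, v in zip(wide_e, e2)])
```

The subtraction only makes sense if the oracle's answer is unique, which is where the injectivity caveat comes in.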
Anyway, this is more of a rant than a question. Overall, my question is: is this right? Do I have the right perspective and intuition on these two problems? And is there a more authoritative resource that spells out these differences, so that when I see a paper that assumes, e.g., MSIS, I can tell whether they mean the surjective regime or the injective regime?