The point is that when you're interested in the properties of a theory, you are actually studying that theory inside a meta-theory, where you define what a theory is, what its models are, and what it means for it to be consistent.
Usually this meta-theory is a set theory, and one uses the deductive system of this meta-theory to prove properties of the theory being studied.
Clearly this means that your consistency proofs rely on the fact that the formal system you're using to prove them (the meta-theory) doesn't prove contradictions, that is, that it is itself consistent.
Of course you could try to prove that your meta-theory is consistent, but in order to do that you would need a further meta-theory in which the consistency of the first one can be proved. That is just sweeping the dust under the carpet: it leads to an infinite sequence of theories in which each element proves the consistency of the preceding one.
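To make the regress concrete, here is a rough sketch in symbols. The names $T_0, T_1, \dots$ and the abbreviation $\mathrm{Con}(T)$ are my own notation for illustration: $T_0$ is the theory you started with, $T_{n+1}$ is whatever meta-theory you pick to study $T_n$, and $\mathrm{Con}(T)$ stands for the sentence saying "T is consistent".

```latex
% Notation assumed here (not fixed by the discussion above):
%   T_0       the theory under study
%   T_{n+1}   a meta-theory chosen to study T_n
%   Con(T)    the sentence expressing "T is consistent"
\[
  T_1 \vdash \mathrm{Con}(T_0), \qquad
  T_2 \vdash \mathrm{Con}(T_1), \qquad
  \dots, \qquad
  T_{n+1} \vdash \mathrm{Con}(T_n), \qquad
  \dots
\]
```

Each step only moves the question one level up: trusting the proof given by $T_{n+1}$ requires already trusting $T_{n+1}$.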
One could content oneself with proving the consistency of a meta-theory in itself, that is, studying the meta-theory using itself as its own meta-theory. Unfortunately this wouldn't be a great solution: even if a meta-theory were able to prove its own consistency, this wouldn't be enough to show that it cannot prove contradictions, because an inconsistent formal system is able to prove everything, even its own consistency.
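The last step can be written out as follows. This is only a sketch, assuming classical logic and assuming the language of $T$ can express a sentence $\mathrm{Con}(T)$ saying "T is consistent"; the symbols $\varphi$ and $\psi$ are placeholders for arbitrary sentences.

```latex
% Requires amsmath for \text.
% Assumption: T is inconsistent, witnessed by some sentence \varphi.
% Principle of explosion: from \varphi and \neg\varphi anything follows.
\[
  T \vdash \varphi
  \quad\text{and}\quad
  T \vdash \neg\varphi
  \;\Longrightarrow\;
  T \vdash \psi \ \text{ for every sentence } \psi,
  \quad\text{in particular}\quad
  T \vdash \mathrm{Con}(T).
\]
```

So a proof of $\mathrm{Con}(T)$ carried out inside $T$ is something a consistent theory and an inconsistent one could both produce, and it cannot tell the two cases apart.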
That should make clear that this problem is unavoidable: at some point we have to choose a meta-theory and make an act of faith in believing that it is consistent.
Usually mathematicians use a set theory (ZFC, NBG, or some other set theory) as the meta-theory.
The choice of set theory as meta-theory is due to many different reasons:
- the commonly used set theories are closely related in a suitable sense (for example, NBG is a conservative extension of ZFC, so the two are equiconsistent), and no one has so far been able to derive a contradiction in them
- our minds are used to thinking in terms of collections, so set theory, being a formal system for reasoning about collections, is very appealing as a meta-theory
- we have been doing set theory for many years, so we have a lot of experience with it, and that makes it quite reasonable to use it as a meta-theory.
Of course there are other reasons that justify the choice of set theory as meta-theory; I've listed only some that came to my mind.
Hope this helps.