The free group on $X$ is a group $F$ together with a set map $u\colon X\to F$ such that for every group $G$ and every set map $v\colon X\to G$, there exists a unique group homomorphism $\phi\colon F\to G$ such that $v=\phi\circ u$. (Technically we should distinguish between the group and its underlying set, but almost nobody does that; one notable exception is George Bergman in his Invitation to General Algebra and Universal Constructions.) You can construct an instantiation of $F$ in many ways; if you want to think of $F$ as the set of reduced group words on $X$ with concatenation-and-cancellation as the operation, that's fine. The existence and uniqueness of the morphism $\phi$ given any $v\colon X\to G$ is called the "universal property of the free group $F$."
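The "reduced words with concatenation-and-cancellation" model can be made concrete with a minimal sketch. The representation (a word as a tuple of `(generator, exponent)` letters with exponent $\pm 1$) and the function names `reduce_word`, `multiply`, and `inverse` are choices made for this illustration, not standard notation.

```python
# A word is a tuple of (generator, exponent) letters, with exponent +1 or -1.
# The empty tuple () is the identity of the free group.

def reduce_word(word):
    """Cancel adjacent inverse pairs (x followed by x^-1, or the reverse)."""
    stack = []
    for gen, exp in word:
        if stack and stack[-1] == (gen, -exp):
            stack.pop()              # the new letter cancels the previous one
        else:
            stack.append((gen, exp))
    return tuple(stack)

def multiply(w1, w2):
    """Product in the free group: concatenate, then cancel."""
    return reduce_word(tuple(w1) + tuple(w2))

def inverse(word):
    """Inverse word: reverse the letters and flip every exponent."""
    return tuple((gen, -exp) for gen, exp in reversed(word))

x = (("x", 1),)
y = (("y", 1),)
xy = multiply(x, y)

print(multiply(xy, inverse(xy)))          # () : a word times its inverse is empty
print(multiply(xy, inverse(y)))           # (('x', 1),)
print(multiply(x, y) == multiply(y, x))   # False: xy and yx differ in F
```

A single left-to-right pass with a stack is enough here, because a cancellation can only expose a new cancelling pair at the top of the stack.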
If $G$ is any group, and $X$ is a subset of (the underlying set of) $G$ such that $\langle X\rangle = G$, then the set embedding $X\hookrightarrow G$ gives a homomorphism $\phi\colon F\to G$, by the universal property of the free group. This is a surjective map (since the image contains $X$ and hence contains $\langle X\rangle=G$). So we know, from the isomorphism theorems, that $G\cong F/N$, where $N=\ker(\phi)$.
That means that we can describe $G$ up to isomorphism by describing both $F$ and $N$. We can describe $F$ simply by writing out $X$ and saying "$F$ is the free group on $X$". Describing $N$ can be done in many ways: you can find a subset $Y$ of $N$ that generates $N$ as a group; or you can find a subset $R$ of $N$ that generates $N$ as a normal subgroup of $F$; that is, $R\subseteq N$ and the smallest normal subgroup of $F$ that contains $R$ is the subgroup $N$. Then we can describe $G$ by saying "the group obtained by taking the free group $F$ on $X$ and modding out by the smallest normal subgroup $N$ of $F$ that contains $R$." We can denote this by writing "$G=\langle X\mid R\rangle$." We call the elements of $X$ "generators", and the elements of $R$ "relations" or "relations among the generators".
But there are many possible sets $R$ that generate $N$ as a normal subgroup; any such list is a "suitable set" of relations to determine $G$.
Now, the above is the description. What is the intuition?
The idea is to take a set of elements $X$ that generates $G$. Any reduced group word on $X$ will yield an element of $G$ when "evaluated" in $G$. But different words may yield the same element of $G$ when evaluated. For example, if $x$ and $y$ commute, then the word $xy$ will evaluate to the same element as the word $yx$, even though the two words are different as reduced words. Let us write $r_G$ to mean the element of $G$ obtained by evaluating the reduced word $r$ on $X$ as an element of $G$.
Now consider the set $\sim=\{(r,s)\in F\times F\mid r_G=s_G\}$. It is not hard to see that this is an equivalence relation on $F$ which, in addition, satisfies that if $(r,s)$ and $(r',s')$ are in the set, then so are $(rr',ss')$ and $(r^{-1},s^{-1})$. That is, this set, viewed as a subset of the group $F\times F$, is also a subgroup. This equivalence relation allows us to look at the set of equivalence classes $F/\sim$, and define a multiplication on $F/\sim$ by $[r][r'] = [rr']$ (where $[r]$ is the equivalence class of $r$). This quotient turns out to be the quotient by the normal subgroup $N=\{ r\in F\mid r\sim e\}$, whose elements are precisely the words $r$ such that $r_G=e$. That is: we can describe the equivalence relation simply by saying which words are equivalent to the identity, since $r\sim s$ if and only if $rs^{-1}\in N$; and this group $F/N$ is isomorphic precisely to $G$.
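As a small sanity check of the evaluation map $r\mapsto r_G$ and of the passage from $\sim$ to $N$, here is a sketch in which $G$ is assumed to be the group generated by two commuting transpositions in $S_4$. The choice of $G$, the encoding of permutations as tuples, and the helper names `compose`, `invert`, `evaluate`, and `inverse_word` are all assumptions made for the illustration.

```python
# Permutations of {0,...,n-1} are tuples p with p[i] the image of i.

def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[i] for i in q)

def invert(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def evaluate(word, images):
    """Evaluate a word (tuple of (generator, +1/-1) letters) in G."""
    n = len(next(iter(images.values())))
    result = tuple(range(n))                      # identity permutation
    for gen, exp in word:
        g = images[gen] if exp == 1 else invert(images[gen])
        result = compose(result, g)
    return result

def inverse_word(word):
    """Formal inverse of a word: reverse it and flip the exponents."""
    return tuple((gen, -exp) for gen, exp in reversed(word))

# Assumed example: x swaps 0 and 1, y swaps 2 and 3, so x and y commute.
images = {"x": (1, 0, 2, 3), "y": (0, 1, 3, 2)}
identity = (0, 1, 2, 3)

xy = (("x", 1), ("y", 1))
yx = (("y", 1), ("x", 1))

# Different reduced words, same element of G: (xy, yx) lies in ~.
print(xy != yx, evaluate(xy, images) == evaluate(yx, images))   # True True

# Equivalently, xy(yx)^{-1} = x y x^-1 y^-1 evaluates to e, so it lies in N.
print(evaluate(xy + inverse_word(yx), images) == identity)      # True
```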
So we can describe $G$ by giving the set $X$ and the whole equivalence relation $\sim$, which describes all the words which evaluate to the same thing in $G$; or by just giving the set $X$ and the normal subgroup $N$ that describes the words that evaluate to the identity; or by just giving the set $X$ and enough words that evaluate to the identity to generate $N$ as a subgroup; or by giving just the set $X$ and just enough words that evaluate to the identity to generate $N$ as a normal subgroup. Any such set is a "suitable set" of relations/relators on $X$ that will determine $G$.
Any group can be presented by generators and relations, since in the worst case you can let $X$ be the set $G$ itself, and then as relations take all words of the form $xyz^{-1}$ where $xy=z$; that is, the whole multiplication table of $G$. But generally you want a set $X$ that is as small as possible, and a set $R$ that is as small as possible, just enough to be able to describe every element of $G$. Neither $X$ nor $R$ is uniquely determined, so the same group will have many presentations.
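To see what the "worst case" presentation looks like, here is a sketch that writes down the multiplication-table relators for a small example. The group (the integers mod 3 under addition) and the generator names `g0`, `g1`, `g2` are assumptions made for the illustration; note that some of the resulting words are not reduced as written.

```python
from itertools import product

n = 3                      # assumed example: G = Z/3 under addition mod 3
elements = range(n)

def op(a, b):
    """Group operation of G."""
    return (a + b) % n

# One generator symbol per element of G, and one relator x y z^-1
# for every entry x*y = z of the multiplication table.
relators = [
    ((f"g{a}", 1), (f"g{b}", 1), (f"g{op(a, b)}", -1))
    for a, b in product(elements, repeat=2)
]

def evaluate(word):
    """Evaluate a word in G, sending the symbol g<a> to the element a."""
    total = 0
    for name, exp in word:
        a = int(name[1:])
        total = op(total, a if exp == 1 else (-a) % n)
    return total

print(len(relators))                            # 9, one per table entry
print(all(evaluate(r) == 0 for r in relators))  # True: each relator is trivial in G
```

Even for a group of order 3 this gives 3 generators and 9 relators, whereas $\langle x\mid x^3\rangle$ presents the same group with one of each.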
Conversely, any set $X$ and any collection $R$ of reduced words on $X$ determine a unique group $\langle X\mid R\rangle$ up to isomorphism, namely the group $F/N$ where $F$ is the free group on $X$ and $N$ is the normal subgroup of $F$ generated by the set $R$. Von Dyck's Theorem essentially describes the latter, by giving its universal property in terms of the universal property of the free group and the universal property of the quotient.
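In practice, von Dyck's Theorem says that a set map $v\colon X\to G$ extends to a homomorphism $\langle X\mid R\rangle\to G$ exactly when every word in $R$ evaluates to the identity under $v$. A minimal sketch of that check follows, assuming the presentation $\langle r,s\mid r^3, s^2, (rs)^2\rangle$ and the target group $S_3$; the helper names, in particular `satisfies_relators`, are hypothetical and chosen for this example.

```python
# Permutations of {0, 1, 2} are tuples p with p[i] the image of i.

def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[i] for i in q)

def invert(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def evaluate(word, images):
    """Evaluate a word (tuple of (generator, +1/-1) letters) in the target group."""
    n = len(next(iter(images.values())))
    result = tuple(range(n))
    for gen, exp in word:
        g = images[gen] if exp == 1 else invert(images[gen])
        result = compose(result, g)
    return result

def satisfies_relators(relators, images):
    """True when every relator evaluates to the identity under the assignment."""
    n = len(next(iter(images.values())))
    identity = tuple(range(n))
    return all(evaluate(rel, images) == identity for rel in relators)

r, s = ("r", 1), ("s", 1)
relators = [(r,) * 3, (s,) * 2, (r, s) * 2]      # r^3, s^2, (rs)^2

good = {"r": (1, 2, 0), "s": (1, 0, 2)}          # a 3-cycle and a transposition
bad = {"r": (1, 0, 2), "s": (1, 0, 2)}           # r sent to a transposition

print(satisfies_relators(relators, good))   # True: the map extends to a homomorphism
print(satisfies_relators(relators, bad))    # False: r^3 does not evaluate to e
```

The check only decides whether the homomorphism exists; it does not by itself show that the induced map is injective or surjective.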