Can you explain to me why this proof by induction is not flawed? (Domain is graph theory, but that is secondary)

Question

Background

I am following this MIT OCW course on mathematics for computer science. In one of the recitations they come to the below result:

Official solution

Task:

A planar graph is a graph that can be drawn without any edges crossing. Also, any planar graph has a node of degree at most 5. Now, prove by induction that any planar graph can be colored in at most 6 colors.

Solution:

We prove by induction. First, let n be the number of nodes in the graph. Then define P(n) = Any planar graph with n nodes is 6-colorable. Base case, P(1): Every graph with n = 1 vertex is 6-colorable. Clearly true since it’s actually 1-colorable. Inductive step: P(n) → P(n + 1): Take a planar graph G with n + 1 nodes. Then take a node v with degree at most 5 (which we know exists because we know any planar graph has a node of degree ≤ 5), and remove it. We know that the induced subgraph G’ formed in this way has n nodes, so by our inductive hypothesis, G’ is 6-colorable. But v is adjacent to at most 5 other nodes, which can have at most 5 different colors between them. We then choose v to have an unused color (from the 6 colors), and as we have constructed a 6-coloring for G, we are done with the inductive step. Because we have shown the base case and the inductive step, we have proved ∀n ∈ Z+ : P(n). (Note: Z+ refers to the set of positive integers.)

My Question

To me the inductive step is "backwards", because it proves P(n+1) -> P(n) and then gets to assume P(n) to be true by the inductive hypothesis. Why is this legal? To illustrate my problem, consider the bogus proof below, which sounds "the same" to me.

Thm. "At school, no kid has more than 6 friends." Assume that there is always one guy who has 5 friends or fewer. Proof by induction over the number of kids in school. Base case: n = 1 is obviously true: a kid cannot have any friends if he is the only child.

Inductive step: P(n) → P(n + 1): Take a school G with n + 1 kids. Then take a kid v with at most 5 friends (which we know exists by assumption), and remove him. We know that the induced sub-school G’ formed in this way has n kids, so by our inductive hypothesis, in G' no one has more than 6 friends. I can then add v back in and know that he has fewer than 6 friends. q.e.d.

There must be something I am missing, but both proofs look equally flawed to me. I would truly appreciate it if you could break it down for me. I am not questioning the theorem, just the procedure of starting with the case n+1 and then very cleverly choosing how to go backwards to n. At best, to me, this proves P(n+1) -> P(n), but not P(n) -> P(n+1).

Update

Thank you everyone! It was so wonderful to get so many answers on such a niche problem. I was precisely hung up on starting with n+1 in conjunction with the fact that I get to “choose” how to go to n from there. I think I can now verbalize a little better why I felt unhappy with the proof:

In my mind, whenever I look at induction proofs, I model it a bit like a recursive function call in code. I had never explicitly done this, but to illustrate my confusion, here is what they are doing in pseudo-code (in my mind at least):

is6Colorable(graph):
  if graph.size == 1: # base case
    return true
  else:
    specialNode = findSomeDegree5orFewerNode(graph)
    subgraph = graph.drop(specialNode)
    return is6Colorable(subgraph)

Notice how I can rename the function in a meaningful way for any statement that is true for both the base case and the special node and it will always return true for a planar graph? E.g. verifyThatNoNodeInThisGraphHasMoreThan5Degrees(graph). In fact, I could rename it most anything that holds for the base case and it would return true and still have some meaning attached (e.g. isNumberOfVerticesOdd(graph) ) So that is obviously garbage code (I could just return "true") and it felt circular to me.

The crux – to me – is that all of the heavy lifting of this proof is done by the fact that adding a degree-5-or-fewer-node to a graph, cannot “taint” the 6-colorability for all the other nodes. (This does not hold for all other statements that are true for the base case and the specialNode. E.g. it might change the number of edges that some of the nodes in the sub-graph have, so I cannot prove that all nodes have degree 5 or less like I suggested above). That means I can deconstruct every graph to the base case, only removing nodes that when added back in, do not taint the 6-color-ability of the induced sub-graph. If I can deconstruct it that way, I can re-construct it that way.

Perhaps, you can see from my code example, how I struggled with the wording of the official proof, though: The induction they do – to me at least – does not add any insight. It is valid “code” and it returns the correct result, but it is void of insight.

You seem to have a misconception about proof by induction. It's not true that in an inductive step, you have to "go from a small thing to a big thing", and in fact such an argument can be wrong. If we take it as given that $P(100)$ is true, can you explain exactly why you aren't convinced by the argument "let $G$ be a graph with $101$ nodes. It must have a node of degree $\le 5$. When you remove this node, you're left with $100$ nodes, and we already know the theorem is true for such graphs, (...), therefore the theorem holds for $G$"? Are you happy with the fact about nodes of degree $\le 5$? — Izaak van Dongen, Mar 23 '24 at 17:27
note that by adding a node of degree $\le 5$ you can bring the degree of other nodes above five - in this way all planar graphs can be reached by repeatedly adding nodes of degree $\le 5$ to the empty graph — Silver, Mar 24 '24 at 11:16
Note that it is a provable theorem, not an assumption, that every nonempty planar graph has at least one node of degree strictly less than 6. Your school example suggests that you may not be appreciating that. — John Bollinger, Mar 24 '24 at 13:52
This kind of construction -- "take the n+1 object and remove something to make the n object, conclude the resulting object has property P, use this to prove the same of the n+1 object" -- is very common when inducting on certain data structures in CS. Done properly, this approach makes it easy to show P for all possible n+1 objects, which may be difficult or impossible if "starting from" an n object. — Glenn Willen, Mar 24 '24 at 17:48
"Now, prove by induction that any graph can be colored in at most 6 colors." ─ Should that be, "that any planar graph"? Is this a mistake in the original question? — kaya3, Mar 24 '24 at 23:01
The key difference between the valid 6-coloring proof and your 6-friend proof is the step of "adding $v$ back in". A friend of $v$ might have exactly 6 friends in $G'$, but then has 7 friends in $G$. But in coloring, $G$ and $G'$ can have the same colors on all vertices other than $v$ without breaking the goal. — aschepler, Mar 25 '24 at 22:13
Re your pseudocode: The fact that you can replace the whole thing with return true is how you know that it's a valid proof. Such a substitution should always be valid, if the theorem is true. If you could instead find some case where it fails to work (e.g. because one of those function calls fails to find what it's supposed to find), then it would not be a valid proof. — Kevin, Mar 26 '24 at 16:46

user1947180 · Accepted Answer · 2024-03-25T02:14:22.097

I think what is confusing you is that you are "starting out" with a planar graph of $n+1$ nodes.

But note that you are not yet saying anything about its colorability. All you're trying to show here is that $P(n) \to P(n+1)$, or:

"Any planar graph with $n$ nodes is $6$-colorable" $\to$ "Any planar graph with $n+1$ nodes is $6$-colorable"

The inductive proof here "starts out" with some arbitrary planar graph of $n+1$ nodes, but we're not saying it's $6$-colorable yet. What we do know about this planar graph is that it has some node with degree $\leq 5$ in it somewhere, something we know is true for all planar graphs. We label this node $v$ and remove it for now.

What do we have left? A planar graph of $n$ nodes, which is $6$-colorable because we've assumed it via $P(n)$.

The nodes that $v$ was connected to can have at most $5$ colors among them. So we connect $v$ back to those nodes again, and color it something different (from those nodes), resulting in (at most) $6$ colors between those $\leq 5$ nodes and $v$. Now we've shown that $P(n) \to P(n+1)$ because we end with the same arbitrary planar graph we began with, but now we know it's $6$-colorable. The fact that the planar graph is arbitrary is important, because we are trying to prove that $P$ holds for "any" planar graph.

In other words, the pathway is more like "arbitrary $n+1$ node planar graph" -> "a reduced $n$ node planar graph that is $6$-colorable" -> "that arbitrary $n+1$ node planar graph is indeed $6$-colorable too".

This raises the question: why not just "start out" with a planar graph of $n$ nodes and then go straight to the graph of $n+1$ nodes? There are some potential issues with such an approach.

For example:

I want to prove that all planar graphs are $3$-colorable. $P(1), P(2), P(3)$, trivially true. Now for $P(n) \to P(n+1)$, I start out with a graph of $n$ nodes that is $3$-colorable. I add a node to it such that it is connected to $2$ nodes of $2$ different colors, and color my new node a third color. Boom! Now we've made a graph of $n+1$ nodes with at most $3$ colors and we haven't violated the fact that all planar graphs have at least one node of degree $\leq 5$...

...wait, not so fast! What have I actually shown here? I've just shown that if you have a $3$-colored graph, you can make another $3$-colored graph with one more node as long as you attach it in a very particular way.

I need to show it holds for any $3$-colorable planar graph. What if we're talking about a graph with a triangle in it ($3$ nodes, $3$ colors) and I add the extra node right in the middle, and connect it to all three vertices? That's a valid $n+1$ graph too, and yet... I need a fourth color. So this proof fails.
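A brute-force check makes this counterexample concrete. The following is only an illustrative sketch: the helper name `is_k_colorable` and the edge-list representation are assumptions introduced here, not part of the argument above.

```python
from itertools import product

def is_k_colorable(num_nodes, edges, k):
    """Return True if some assignment of k colors gives every edge two different colors."""
    for colors in product(range(k), repeat=num_nodes):
        if all(colors[a] != colors[b] for a, b in edges):
            return True
    return False

# A triangle on nodes 0, 1, 2.
triangle = [(0, 1), (1, 2), (0, 2)]

# Add node 3 in the middle and connect it to all three corners (still planar).
triangle_plus_center = triangle + [(3, 0), (3, 1), (3, 2)]

print(is_k_colorable(3, triangle, 3))              # True
print(is_k_colorable(4, triangle_plus_center, 3))  # False
print(is_k_colorable(4, triangle_plus_center, 4))  # True
```

The second graph is a perfectly valid planar graph on $n+1$ nodes, yet no $3$-coloring exists for it, so the "build up from $n$" argument never covered it.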

In other words, it becomes harder to go from the $n$ case directly to the $n+1$ case because now you have to consider all the possible ways you can hook up that extra node, and all the various ways you might get contradicted.

This is why it's way easier to start from the arbitrary $n+1$ graph, because you are immediately granted the existence of a special node, and then you can say something about the colorability of the reduced $n$ graph, and use that to say something about the colorability of the original $n+1$ graph. And since this works for any $n+1$ planar graph you started with, the resulting conclusion holds for all planar graphs.

The first half of this answer is spot on, but the second half is a bit misleading — the part starting “why not start with an arbitrary $n$-vertex graph instead?”, concluding with “…it’s way easier to start from an arbitrary $n+1$ graph, because you are immediately granted the existence of a special node.” The reason you start with an arbitrary $(n+1)$-vertex graph is because what you’re trying to prove is $P(n+1)$, “any $(n+1)$-vertex graph is 6-colorable”. It’s a statement about all $(n+1)$-vertex graphs, so a direct proof must start by taking an arbitrary such graph. — Peter LeFanu Lumsdaine, Mar 26 '24 at 16:58
@PeterLeFanuLumsdaine I do believe if you read the answer thoroughly you'll see that this is addressed at least twice, so I disagree that it's misleading - by starting with an arbitrary $n+1$ node planar graph, the conclusion that it is $6$-colorable lets you claim it true for all such planar graphs. But it's not necessarily a requirement that it "must" start with an $n+1$ graph - it's just way easier here. If you painstakingly iterate over all possible configurations (as is done in the Four Color theorem) you can also prove such colorability. — user1947180, Mar 26 '24 at 20:31

Dan Doel · Answer 2 · 2024-03-25T19:24:43.723

Based upon your update, here is an answer that may help you.

Proving things by induction is, in fact, just like defining a procedure by recursion. This is exactly the way it is realized in proof assistants based on systems like intuitionistic type theory, like Agda. There, a proof of the theorem you're pondering would be a function with a type like:

$$(G : \mathsf{Graph}) → \mathsf{Planar}(G) → \mathsf{6Colorable}(G)$$

and that function would be definable by a sort of recursion.

The problem with your pseudocode is two-fold. First, your result is a boolean that just sort of reports a 'fact' of whether a graph is 6-colorable. But there is no weight behind it. There is no evidence backing a report of true. The idea behind the more precise type above is that types like $\mathsf{Planar}(G)$ and $\mathsf{6Colorable}(G)$ contain verified structure pertaining to the graph $G$. So, for instance, the latter might be something like a function $\mathsf{coloring} : \mathsf{Nodes}(G) → \mathsf{Color}$, together with evidence that no two adjacent nodes have the same color.
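In a language without dependent types, the closest analogue of such evidence is a coloring that can be checked after the fact. Here is a minimal Python sketch of that check, assuming a graph is given as a dict from nodes to neighbor sets; the name `is_valid_coloring` and the example data are hypothetical.

```python
def is_valid_coloring(adjacency, coloring, max_colors=6):
    """Check the 'evidence': every node is colored, adjacent nodes differ,
    and at most max_colors distinct colors are used."""
    if set(coloring) != set(adjacency):
        return False
    if len(set(coloring.values())) > max_colors:
        return False
    return all(coloring[u] != coloring[v]
               for u in adjacency for v in adjacency[u])

# A path a - b - c, as an adjacency dict (hypothetical example data).
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}

print(is_valid_coloring(path, {"a": 0, "b": 1, "c": 0}))  # True
print(is_valid_coloring(path, {"a": 0, "b": 0, "c": 1}))  # False: a and b clash
```

A bare `true` carries none of this; a coloring that passes the check does.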

Second, a procedure like your pseudocode will not meet the more stringent specification. The relevant portion is:

lemma(G, P):
  ...
  SG = G.drop(N)
  SP = P.subgraphsArePlanar(N)
  return lemma(SG, SP)

The type of lemma(SG, SP) is $\mathsf{6Colorable}(SG)$, which is evidence about the subgraph. This is given to you by induction/recursion as a starting point for proving something about $G$, but you have omitted that part.

So, a type checker would just reject this proof, because $G$ and $SG$ are two different things. The missing ingredient is a proof of $\mathsf{6Colorable}(SG) → \mathsf{6Colorable}(G)$, which can be obtained using the fact that $N$ has at most $5$ connections in the graph. This is the inductive step, from a smaller to larger structure.

Thank you. But it doesn't remove just "a" node. It removes a very special node. Can you construct a convincing proof that assumes a graph with n nodes, applies the inductive hypothesis and ends up at a proof for n+1? No, because you are only allowed to add nodes that fit my desired characteristics of having degree 5 or less. — SLLegendre, Mar 23 '24 at 16:36
The proof essentially says: if I take a graph and only do operations on it so that my thesis is never wrong, my thesis is never wrong. But I would like to see proof that I cannot simply add a degree 6 node. Anyways, thank you for taking the time... For me the logic of the proof just seems circular. — SLLegendre, Mar 23 '24 at 16:41
What does adding a degree 6 node have to do with anything? You aren't adding nodes arbitrarily to a graph. You are given a planar graph with $n+1$ nodes and are removing one of the nodes to get a smaller graph. A node with the right property always exists by a separate theorem you mentioned. — Dan Doel, Mar 23 '24 at 17:19

score 7 · Answer 3 · answered Mar 23 '24 at 16:14

They are not assuming $P(n+1)$ and using that to deduce $P(n)$, as you seem to believe.

They are assuming that $P(n)$ is true (this is the induction hypothesis) and using this to deduce $P(n+1)$, by considering an arbitrary planar graph $G$ with $n+1$ nodes and reasoning about $G$ using what they know (from the induction hypothesis) about planar graphs with $n$ nodes.

In other words, they are proving the implication $P(n) \implies P(n+1)$, as they should.

score 6 · Answer 4 · answered Mar 24 '24 at 14:47

I think your confusion is ultimately about how you prove a universal sentence (each object of type X has property P) by induction.

To prove a universal sentence, you consider an arbitrary object O of type X. Then you do some arguing, aiming to end up with the conclusion that object O has property P.

To use an assumption which has the form of a universal sentence in your proof, you simply feed it an object of type X of your choice and it allows you to jump to the conclusion that the object has property P. The difference is that previously, you were forced to consider an arbitrary object, whereas here you get to choose which object you apply the assumption to.

Back to the proof. You are trying to prove that each object of type X with cardinality $n+1$ has property P, under the assumption that this holds for all objects of type X of cardinality $n$. Very well, consider an arbitrary object O of type X with cardinality $n+1$. This is how the proof must start (unless you are going for a proof by contradiction), since you are proving a universal sentence. Fiddle with the object O a bit to obtain an object O' of type X with cardinality $n$. Now you can use the inductive assumption to conclude that O' has property P. Finally, you show that if O' has property P, then so does O. And you are done.

You say:

But I would like to see proof that I cannot simply add a degree 6 node.

You certainly, in many instances, can take a planar graph of cardinality $n$ and add a degree 6 node to it to obtain a planar graph of cardinality $n+1$. But this has nothing to do with the proof. If you start with such a graph $G_n$, and add a degree 6 node to it to obtain a graph $G_{n+1}$, the proof that $G_{n+1}$ satisfies your coloring condition will not consist in applying the inductive hypothesis to the graph $G_n$, but rather to a different graph $G'_{n}$ of cardinality $n$, obtained from $G_{n+1}$ by removing some node of degree 5 (in particular, by removing a different node than the one you added). This is okay, because you are not using as your inductive hypothesis the assumption that $G_n$ satisfies the coloring condition, but rather the assumption that all planar graphs of cardinality $n$ satisfy this condition.
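A small example may help here. The sketch below (plain Python, with graphs as dicts of neighbor sets, an assumed representation) builds a hexagon, adds a degree-6 node in the middle, and then lists the nodes the inductive step is allowed to remove.

```python
def degrees(adjacency):
    """Map each node to its degree."""
    return {node: len(nbrs) for node, nbrs in adjacency.items()}

# G_n: a hexagon (cycle on nodes 0..5), which is planar.
hexagon = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}

# G_{n+1}: add a "hub" node of degree 6 in the middle (a wheel, still planar).
wheel = {node: set(nbrs) | {"hub"} for node, nbrs in hexagon.items()}
wheel["hub"] = set(range(6))

deg = degrees(wheel)
print(deg["hub"])  # 6

# The inductive step never removes the hub; it picks some node of degree <= 5.
removable = [node for node, d in deg.items() if d <= 5]
print(removable)  # [0, 1, 2, 3, 4, 5]
```

Removing any rim node leaves a planar graph with one node fewer, and that graph, not the hexagon, is what the inductive hypothesis is applied to.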

score 5 · Answer 5 · answered Mar 25 '24 at 02:10

In the pseudocode you provided, a proof checker would complain about the second return statement:

    specialNode = findSomeDegree5orFewerNode(graph)
    subgraph = graph.drop(specialNode)
    return is6Colorable(subgraph)
           ^^^^^^^^^^^^^^^^^^^^^^
error: cannot deduce why is6Colorable(subgraph) implies is6Colorable(graph)

Instead, you have to supply an explicit proof in order to convince the proof checker:

    specialNode = findSomeDegree5orFewerNode(graph)
    subgraph = graph.drop(specialNode)
    subgraphColoring = is6Colorable(subgraph) # induction hypothesis: assume is6Colorable for all graphs of size (graph.size - 1)
    unusedColor = findUnusedColor(subgraphColoring, specialNode)
    return appendColoring(subgraphColoring, unusedColor)

Note that a proof that a graph is colorable is equivalent to having an algorithm to construct a valid coloring of that graph. Hence, appendColoring can be interpreted both as (a) an algorithm to append a node with a (locally) unused color to an existing coloring and (b) a proof that a graph with an extra node is colorable if one provides a (locally) unused color.

In general, there are many limitations with this sort of pseudocode. Several hidden assumptions are not explicitly shown: for example, the fact that specialNode has a degree of at most 5 originates from findSomeDegree5orFewerNode and is later needed for findUnusedColor to work.

By the way, if you want to learn more about writing proofs as programs, consider learning a proving language like Agda or Lean!

Definitely learn a proving language! ...And be prepared to be driven crazy because it will all the time complain that something seemingly obvious can't be deduced. (In this case it would for example take a lot of convincing that removing a node from an $n+1$-node planar graph leaves you with an $n$-node planar graph.) But when it finally typechecks, it's extremely satisfying. — leftaroundabout, Mar 25 '24 at 11:16

score 4 · Answer 6 · answered Mar 24 '24 at 00:23

I think what's confusing you is that the theorem is indeed wrong, and the proof is missing a step!

The theorem that you're asked to prove is "any graph can be colored in at most 6 colors." But that theorem is false; not all graphs are 6-colorable.

What they meant to write is that any planar graph can be colored in at most 6 colors. So you're not expected to use a property of planar graphs to prove a general property of graphs, you're expected to use one property of planar graphs to prove another property of planar graphs.

Likewise, the proof relies on the unstated fact that the induced graph G′ is planar. I think this fact is clear enough that it's OK to state it without proof (all edges of G′ are edges of G, so if you draw G with no edges crossing, then the corresponding drawing of G′ also has no edges crossing), but I think the proof does need to actually state this fact.

Once those are fixed, the theorem and proof are both good.

score 3 · Answer 7 · answered Mar 24 '24 at 08:34

The specific flaw in your mock-proof (which I love, by the way) is that the property "there exists someone with at most 2 friends" is not stable under taking subsets. (Put differently, if we define a school to be a group in which "in every school there is someone with at most 2 friends," $G'$ doesn't need to be a subschool of $G$, e.g. if you had a school with one loner and three mutual friends, then removed the loner, you would no longer have a "school".)

On the other hand, the property of being a planar graph is stable under taking subsets, which is a key point in the inductive step of the proof you are questioning (which really ought to contain the observation that $G'$ is planar when it goes to apply the inductive hypothesis).

score 3 · Answer 8 · answered Mar 25 '24 at 12:58

As rufflewind said, your intuition of inductive proofs as recursive programs is actually a good one and is made precise by proof checkers like Coq, Lean or Agda. I won't go quite there, but would like to highlight some concrete things that would be different.

One thing that is important when representing proofs by functions is that you have the right type signature. In your attempt, that would have been

is6Colorable: Graph -> Bool

or (just different syntax)

is6Colorable: ∀ (g: Graph): ∃ (b: Bool)

But this is not the right type for such a proof! This type has trivial instantiations like

is6Colorable(g) = True

which obviously doesn't actually prove anything about 6-colourability.

Instead, the proof (specifically, a constructive proof) should give exactly what the theorem asserts; in this case,

is6Colourable: ∀ (g: Graph): ∃ (c: Colouring): isColouring(g,c) ∧ |c| ≤ 6

Ok, but what would the implementation look like? Well, to start with, actually quite like yours:

is6Colourable(g):
  if g.size == 1: # base case
    ...
  else:
    ...

But the cases still need to do more than just give back a truth value. For the base case, it would provide the single-colour solution plus a proof that it's small enough:

is6Colourable(g):
  if g.size == 1: # base case
    c = uniformColouring
    return ( c
           , proofThatColoursSingleNodeGraph(c)
           , proofThat1<6(proofThatHasOnlyOneColour(c)) )

For the recursive case, this becomes important because you do not just return the proof that the reduced graph is 6-colourable. That wouldn't be sufficient as that graph contains one too few nodes!

Instead, we have to use that proof to prove that the property holds also for the $n+1$-node graph g. This would look something like this:

is6Colourable(g):
  ...
  else:
    specialNode = findSomeDegree5orFewerNode(g)
    subgraph = g.drop(specialNode)
    c, c_colours_subgraph, c_is_6colouring = is6Colourable(subgraph)
    ngbs = g.neighbours(specialNode)
    ζ = unused_colour(c, ngbs)
    c' = c.append_coloured_node(specialNode, ζ)
    return ( c'
           , proofThatColoursAppendedNodeGraph(g, c, c')
           , proofThatHasAtMost6Colours(c') )

score 1 · Answer 9 · answered Mar 24 '24 at 01:07

The missing step that I see in this proof is reasoning for why the subgraph, which is formed specifically by removing a degree <= 5 node, is guaranteed to still have a degree <= 5 node.

Assuming that you have not omitted anything relevant from the official problem and solution, and that the given solution is in fact correct, I can think of only one explanation for this:

The stipulation that a planar graph has a node of degree at most 5 is a proven property of planar graphs, not part of the definition of planar graphs, and can therefore be assumed true for any planar graph. In other words, the constraint of being able to draw the graph with no crossing edges is sufficient to prove the existence of a node of degree at most 5. For the purpose of the task, this property is stated as a given, with its proof being outside the task's scope.

It seems obvious to me that, with the definition of planar graphs being solely about the lack of crossing edges, any subgraph of a planar graph must also be a planar graph, so that stipulation being a derived rather than definitional property provides the needed guarantee.

Some quick googling easily verified this, finding many sites with proofs that all planar graphs have a node of degree at most 5.
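For reference, one standard route to that fact, sketched here under the assumption that the graph is simple and has $n \ge 3$ nodes, uses the edge bound $|E| \le 3n - 6$ that follows from Euler's formula:

$$\sum_{v} \deg(v) = 2|E| \le 2(3n - 6) = 6n - 12 < 6n.$$

If every node had degree at least $6$, the sum on the left would be at least $6n$, a contradiction, so some node has degree at most $5$. (For $n \le 2$ every node has degree at most $1$.) Since the argument only uses planarity, it applies equally to the subgraph left after removing a node.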

score 1 · Answer 10 · answered Mar 26 '24 at 19:13

There are other good answers here; I wanted to offer a perspective that I think might appeal to a computer scientist: in a proof by induction, the inductive step is often presented as a way to build up $P(n+1)$ from $P(n)$, but it is often better to think of the inductive step as breaking $P(n+1)$ down into smaller, more manageable subparts.

Concretely, in this situation, imagine you're trying to program a function that takes in a planar graph $G$, and returns a 6-coloring of $G$. If I were naively to try to create a recursive algorithm to do this, it might look something like this:

def six_color(graph):
    # Base case: if there's only one vertex, color that vertex
    if len(graph.vertices) == 1:
        return {graph.vertices[0]: 0}
    coloring = {}

    # Remove an arbitrary vertex to get a smaller graph, then recurse
    v = some_vertex(graph)
    smaller_graph = graph.remove(v)
    smaller_coloring = six_color(smaller_graph)

    # For vertices other than v, we can lift the coloring from the subgraph
    for u in smaller_graph.vertices:
        coloring[u] = smaller_coloring[u]

    # Must make sure v doesn't have the same color as any of its neighbors
    colors = [0, 1, 2, 3, 4, 5]
    for u in neighbors(v):
        # Two neighbors may share a color, so only remove a color once
        if smaller_coloring[u] in colors:
            colors.remove(smaller_coloring[u])

    # Assign v one of the remaining, legal colors (fails if none is left)
    coloring[v] = colors[0]
    return coloring

This code sometimes works, but it sometimes fails because the last for loop might leave colors empty. This actually makes sense, because the code above does not use the fact that the input graph was planar, so if it did work it would mean that every graph was 6-colorable, which is patently false! However, when we know the input graph is planar, there is a very easy fix: instead of choosing v = some_vertex(graph), we can instead choose v = vertex_degree_5_or_less(graph). Because the graph is planar, we are guaranteed (by an auxiliary theorem) that there is such a vertex, so vertex_degree_5_or_less will always successfully find such a vertex. Now, the last for loop is okay, because we will never remove more than 5 of the colors from the pool, guaranteeing there is a color left to give to v.

Writing the proof by induction is exactly the same thing as proving that the above code (with the necessary change) will always compute the thing we want it to compute.
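For readers who want to run something, here is a self-contained sketch of the corrected algorithm. It is one possible rendering, not the only one: the graph is assumed to be a dict from nodes to neighbor sets, the minimum-degree node stands in for `vertex_degree_5_or_less`, and the empty graph is used as the base case for brevity.

```python
def six_color(adjacency):
    """Return a coloring (node -> color in 0..5) of a graph given as
    {node: set of neighbors}.

    Assumes every subgraph has a node of degree <= 5, as planar graphs do.
    """
    if not adjacency:
        return {}
    # Pick a node of minimum degree; for a planar graph its degree is at most 5.
    v = min(adjacency, key=lambda node: len(adjacency[node]))
    if len(adjacency[v]) > 5:
        raise ValueError("no node of degree <= 5; input cannot be planar")
    # Remove v to get the smaller graph, then recurse (the inductive hypothesis).
    smaller = {u: nbrs - {v} for u, nbrs in adjacency.items() if u != v}
    coloring = six_color(smaller)
    # Lift: v's neighbors use at most 5 colors, so one of the 6 is free.
    used = {coloring[u] for u in adjacency[v]}
    coloring[v] = next(c for c in range(6) if c not in used)
    return coloring

# Example: the octahedron, a planar graph in which every node has degree 4.
octahedron = {i: {j for j in range(6) if j != i and j != (i + 3) % 6}
              for i in range(6)}
coloring = six_color(octahedron)
print(all(coloring[a] != coloring[b] for a in octahedron for b in octahedron[a]))  # True
```

The `ValueError` branch is exactly the place where the auxiliary theorem is used: for planar input it can never be reached.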

Moral of the example

The perspective I want to impart is that in some sense every proof by induction essentially works this way (including proofs using so-called "strong induction" and so-called "structural induction"). They always go:

start with an instance of the problem of size $n+1$
somehow construct an instance (or instances) of size $n$ (or possibly smaller)
"lift" some information from the smaller instance(s) to the bigger one
check that nothing has gone catastrophically wrong

This pattern is often obscure when you first learn proof by induction, because when you prove statements about things like factorials or Fibonacci numbers, there is only ever one way to go from the big instance to the small instance, and there's rarely any bad thing that can happen in the lifting process. (One example of where things could go wrong would be a putative "proof" that $n! \leq 10^n$. Here, the base case works out, and the lifting step works for the first 10 steps, but after that, catastrophe!) However, in more sophisticated proofs by induction, usually most of the hard work is checking that disaster does not strike. And, when disaster does strike (your code doesn't compile, your "lifting" step has a bad input), it's often because you weren't careful enough with how you chose to create the smaller instance(s) from the big instance. This is worth emphasizing: the goal is not simply to reduce the problem to a smaller problem, it's to reduce it to a smaller problem and then use that smaller problem to solve the bigger one!
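The $n! \leq 10^n$ example can be checked numerically. The short sketch below (variable names are just illustrative) finds where the lifting step stops being justified and where the claim itself first becomes false.

```python
from math import factorial

# The attempted inductive step:
#   (n+1)! = (n+1) * n! <= (n+1) * 10**n <= 10**(n+1),
# whose last inequality needs n + 1 <= 10.
first_bad_step = next(n for n in range(1, 100) if n + 1 > 10)

# The claim itself survives a while longer before it actually becomes false.
first_false = next(n for n in range(1, 100) if factorial(n) > 10**n)

print(first_bad_step)  # 10
print(first_false)     # 25
```

So the argument loses its justification at $n = 10$, well before the statement fails at $n = 25$; a broken lifting step tells you the proof is wrong, not where the theorem is.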

That's exactly what goes wrong in your argument about friends at a school. In that case, you can certainly reduce to a smaller problem, but there is no way to do so that doesn't run into problems in the lifting step, on account of the fact that the "theorem" itself isn't true. (e.g. through extraordinary luck, even I had more than 6 friends in high school.)
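To see the failure in the lifting step concretely, here is a small sketch of a school where the smaller instance satisfies the claim but the larger one does not. The kid names and the dict-of-sets representation are made up for illustration.

```python
def max_friends(friends):
    """Largest number of friends any kid has."""
    return max(len(f) for f in friends.values())

# Smaller school G': "hub" already has exactly 6 friends, everyone else has 1.
smaller_school = {"hub": {f"kid{i}" for i in range(6)}}
for i in range(6):
    smaller_school[f"kid{i}"] = {"hub"}

# Add v back: v has a single friend (so at most 5), but that friend is "hub".
school = {kid: set(f) for kid, f in smaller_school.items()}
school["v"] = {"hub"}
school["hub"].add("v")

print(max_friends(smaller_school))  # 6: the property holds in G'
print(max_friends(school))          # 7: lifting breaks it in G
```

Adding $v$ back changes a property of the other kids, whereas adding a node back to a colored graph leaves the colors of all other nodes untouched.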

Does this provide "no insight"?

Finally, about your charge that the inductive argument seems devoid of insight: you would not be the first person to think so. There are lots of proofs by induction that basically everyone feels successfully prove that a claim is true, but don't really do a good job telling you anything about the underlying phenomenon. For example, you could probably prove by induction that $\left(\sum_{i=1}^n i\right)^2 = \sum_{i=1}^n i^3$, but the proof would not tell you why you would ever think that the two quantities could be the same in the first place.
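For what it is worth, the inductive step of that identity is a short calculation. Writing $T_n = \sum_{i=1}^n i$ and assuming the closed form $T_n = n(n+1)/2$ (itself provable by induction):

$$T_{n+1}^2 = (T_n + (n+1))^2 = T_n^2 + 2T_n(n+1) + (n+1)^2 = T_n^2 + n(n+1)^2 + (n+1)^2 = T_n^2 + (n+1)^3.$$

Combined with the hypothesis $T_n^2 = \sum_{i=1}^n i^3$ this gives the claim for $n+1$, and it does so without offering much of a hint as to why the identity should hold.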

I think, relatively speaking, this proof by induction actually does have a kernel of insight: a crucial input to the proof is the fact that being a planar graph forces a certain amount of "sparsity" in the graph, which in this case takes the shape of knowing there is a vertex of low degree. There are other theorems about how the condition of being planar precludes a graph having certain "dense" substructures, the most salient being Wagner's theorem, and Hadwiger's conjecture is an open conjecture which, if true, would link these "dense" substructures to the existence of certain colorings.