
I am designing a simple compiler for my university project. In my programming language, the break keyword is allowed.

I want to know whether a break keyword that occurs outside a loop should be a syntax error or a semantic error. Which is the better approach?

If it is a syntax error, I can reject it in the grammar file. If it is a semantic error, I can detect it in a separate pass. I don't know which of the two is preferable.

2 Answers


Both work, so how you do it is up to you. But there are a couple of reasons to consider doing it during a post-parse analysis:

  1. While it is certainly possible to define two different types of block, one in which break is legal and the other one in which it isn't, the result is a lot of duplication in the grammar, and a certain complication because you have to allow break inside blocks which are themselves contained inside a looping construct. If you allow compound statements without explicit blocks, like if (condition) break;, then you also need to define different if statements, etc. This rapidly gets out of hand unless you have a tagged formalism like that used in ECMAScript.

  2. Detecting the error in a post-parse scan will allow you to produce better error messages. Rather than just presenting the user with some kind of syntax error message (which might be "unexpected token 'break'"), you can actually provide a meaningful error message like "'break' statement cannot be used outside of a loop."
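As a rough illustration of the post-parse check, here is a minimal Python sketch. The AST classes (Break, Block, If, While) and the check_breaks function are hypothetical names invented for this example, not part of any particular parser generator; your own tree will look different. The idea is only that the walk carries a loop depth and reports a readable message when it meets a break at depth zero.

```python
from dataclasses import dataclass, field
from typing import List, Union


# Hypothetical AST node types, just enough to show the idea.
@dataclass
class Break:
    line: int


@dataclass
class Block:
    body: List["Stmt"] = field(default_factory=list)


@dataclass
class If:
    then: "Stmt"


@dataclass
class While:
    body: "Stmt"


Stmt = Union[Break, Block, If, While]


def check_breaks(stmt: Stmt, loop_depth: int = 0) -> List[str]:
    """Return one diagnostic per 'break' that is not inside any loop."""
    if isinstance(stmt, Break):
        if loop_depth == 0:
            return [f"line {stmt.line}: 'break' statement cannot be used outside of a loop"]
        return []
    if isinstance(stmt, Block):
        errors: List[str] = []
        for child in stmt.body:
            errors.extend(check_breaks(child, loop_depth))
        return errors
    if isinstance(stmt, If):
        # An 'if' does not change whether 'break' is legal.
        return check_breaks(stmt.then, loop_depth)
    if isinstance(stmt, While):
        # Everything under a loop body may use 'break'.
        return check_breaks(stmt.body, loop_depth + 1)
    raise TypeError(f"unknown node: {stmt!r}")


program = Block([
    If(Break(line=1)),                   # not inside any loop
    While(Block([If(Break(line=3))])),   # nested inside a loop
])
errors = check_breaks(program)
```

Note that the grammar needs only one kind of block and one kind of if statement here; the legality of break is decided entirely by the depth counter.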

rici

You could do this purely in syntax - you’d have a grammar with two families of non-terminals, one for “statement that can contain a break” and one for “statement that may not contain a break”. That roughly doubles the size of the statement grammar, making it a right pain.

When you parse a break statement you need semantics anyway to determine what exactly it breaks from, since you may have nested loops and switch statements. Detecting that it has nothing to break from is a tiny part of that, so semantically it’s quite simple. You will save yourself some headache by doing this check in your semantics.
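To make that concrete, here is a small Python sketch under assumed names (Node, BREAKABLE and resolve_breaks are made up for illustration). It keeps a stack of the enclosing loops and switch statements while walking the tree, attaches the innermost one to each break, and reports an error when the stack is empty.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# eq=False keeps identity comparison; break nodes point back up the tree.
@dataclass(eq=False)
class Node:
    kind: str                                   # "while", "switch", "block" or "break"
    children: List["Node"] = field(default_factory=list)
    target: Optional["Node"] = None             # filled in for "break" nodes


BREAKABLE = {"while", "switch"}  # constructs a 'break' can exit (assumed set)


def resolve_breaks(node, enclosing=None, errors=None):
    """Attach each 'break' to its innermost enclosing breakable construct."""
    enclosing = [] if enclosing is None else enclosing
    errors = [] if errors is None else errors
    if node.kind == "break":
        if enclosing:
            node.target = enclosing[-1]
        else:
            errors.append("'break' has nothing to break from")
        return errors
    if node.kind in BREAKABLE:
        enclosing.append(node)
    for child in node.children:
        resolve_breaks(child, enclosing, errors)
    if node.kind in BREAKABLE:
        enclosing.pop()
    return errors


inner_break = Node("break")
stray_break = Node("break")
switch = Node("switch", [inner_break])
loop = Node("while", [switch])
program = Node("block", [loop, stray_break])
errors = resolve_breaks(program)
```

After the call, inner_break.target refers to the switch node rather than the outer loop, and the stray break is the only one reported. The error check is a single branch inside work the compiler has to do anyway.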

Of course you will use the syntax to prevent things like “x = break + 1;”.

gnasher729