I've written a lexer in Alex and I'm trying to hook it up to a parser written in Happy. I'll try my best to summarize my problem without pasting huge chunks of code.
I know from my unit tests of my lexer that the string "\x7" is lexed to:
[TokenNonPrint '\x7', TokenEOF]
My token type (spit out by the lexer), is Token. I've defined lexWrap and alexEOF as described here, which gives me the following header and token declarations:
%name parseTokens
%tokentype { Token }
%lexer { lexWrap } { alexEOF }
%monad { Alex }
%error { parseError }
%token
NONPRINT {TokenNonPrint $$}
PLAIN { TokenPlain $$ }
I invoke the parser+lexer combo with the following:
parseExpr :: String -> Either String [Expr]
parseExpr s = runAlex s parseTokens
And here are my first few productions:
exprs :: { [Expr] }
exprs
: {- empty -} { trace "exprs 30" [] }
| exprs expr { trace "exprs 31" $ $2 : $1 }
nonprint :: { Cmd }
: NONPRINT { NonPrint $ parseNonPrint $1}
expr :: { Expr }
expr
: nonprint {trace "expr 44" $ Cmd $ $1}
| PLAIN { trace "expr 37" $ Plain $1 }
I'll leave out the datatype declarations of Expr and NonPrint since they're long and only the constructors Cmd and NonPrint matter here. The function parseNonPrint is defined at the bottom of Parse.y as:
parseNonPrint :: Char -> NonPrint
parseNonPrint '\x7' = Bell
Also, my error handling function looks like:
parseError :: Token -> Alex a
parseError tokens = error ("Error processing token: " ++ show tokens)
Written like this, I expect the following hspec test to pass:
parseExpr "\x7" `shouldBe` Right [Cmd (NonPrint Bell)]
But instead, I see "exprs 30" print once (even though I'm running 5 different unit tests) and all of my tests of parseExpr return Right []. I don't understand why that would be the case, but I changed the exprs production to prevent it:
exprs :: { [Expr] }
exprs
: expr { trace "exprs 30" [$1] }
| exprs expr { trace "exprs 31" $ $2 : $1 }
Now all of my tests fail on the first token they hit --- parseExpr "\x7" fails with:
uncaught exception: ErrorCall (Error processing token: TokenNonPrint '\a')
And I'm thoroughly confused, since I would expect the parser to take the path exprs -> expr -> nonprint -> NONPRINT and succeed. I don't see why this input would put the parser in an error state. None of the trace statements are hit (optimized away?).
What am I doing wrong?