Questions tagged [lexical-analysis]

41 questions
8 votes · 4 answers

Are there real lexers that use NFAs directly instead of first transforming them to DFAs?

I am taking the Coursera class on compilers, and in the lesson about lexers it is hinted that there is a time-space tradeoff between using a non-deterministic finite automaton (NFA) and a deterministic finite automaton (DFA) to match regular expressions.…
5 votes · 2 answers

What is the point of delimiters and whitespace handling

I see that languages specify reserved words, delimiters, and whitespace in the lexer section. Are delimiters just reserved identifiers in the punctuation space, or do they have an additional function regarding skipping whitespace? Namely, spec…
5 votes · 2 answers

Finding the number of distinct strings in regular expression

Given the regular expression $(1 + \varepsilon + 0 )(1 + \varepsilon + 0 )(1 + \varepsilon + 0 )(1 + \varepsilon + 0 )$, how many distinct strings are in the language? How do you determine this from looking at the regular expression? Do I have to…
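The count asked for here can be checked by brute force: each of the four factors $(1 + \varepsilon + 0)$ independently contributes "1", "0", or nothing, so the language is every binary string of length 0 through 4. A short enumeration (a sketch, not part of the original question) confirms the total:

```python
from itertools import product

# Each factor (1 + ε + 0) contributes "1", "0", or the empty string ε.
# Enumerate all 3^4 = 81 choice tuples and keep the distinct concatenations.
choices = ["1", "", "0"]
strings = {"".join(pick) for pick in product(choices, repeat=4)}

# The language is all binary strings of length 0..4:
# 2^0 + 2^1 + 2^2 + 2^3 + 2^4 = 31 distinct strings.
print(len(strings))  # 31
```

The 81 raw choice tuples collapse to 31 distinct strings because ε contributions are invisible in the concatenation.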
4 votes · 3 answers

Why do we not use CFGs to describe the structure of lexical tokens?

This was an exam question for my course and I am struggling to actually answer it in a way that is not fluff. Here is my current answer: CFGs describe how non-terminal symbols are converted into terminal symbols via a parser. However, a scanner…
3 votes · 3 answers

Thompson's construction, transforming a regular expression into an equivalent NFA

There is this case of concatenation that says (from wikipedia): The initial state of N(s) is the initial state of the whole NFA. The final state of N(s) becomes the initial state of N(t). The final state of N(t) is the final state of the whole…
Bite Bytes · 259 · 1 · 7
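The concatenation rule quoted in this question can be sketched in a few lines. The toy NFA representation below (state numbers, a transitions dict, a single accepting state) is my own illustration, not taken from Wikipedia or any particular textbook:

```python
# A minimal sketch of Thompson's concatenation step. The class and
# field names here are illustrative, not from any library.
class NFA:
    def __init__(self, start, accept, transitions):
        self.start = start              # initial state
        self.accept = accept            # single final state
        self.transitions = transitions  # dict: (state, symbol) -> set of states

def concatenate(ns, nt):
    """N(st): the final state of N(s) becomes the initial state of N(t)."""
    merged = dict(ns.transitions)
    for (state, sym), targets in nt.transitions.items():
        # Redirect references to N(t)'s initial state onto N(s)'s final state.
        src = ns.accept if state == nt.start else state
        dst = {ns.accept if t == nt.start else t for t in targets}
        merged.setdefault((src, sym), set()).update(dst)
    accept = nt.accept if nt.accept != nt.start else ns.accept
    return NFA(ns.start, accept, merged)

# N(a) recognizes "a", N(b) recognizes "b"; their concatenation recognizes "ab".
na = NFA(0, 1, {(0, "a"): {1}})
nb = NFA(2, 3, {(2, "b"): {3}})
nab = concatenate(na, nb)
print(nab.start, nab.accept)  # 0 3
```

After merging, the machine reads "a" from state 0 to state 1 (the old final state of N(a), now also the entry to N(b)) and "b" from state 1 to state 3.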
3 votes · 1 answer

Why use finite automata when implementing lexical analyzers?

I studied the subject of building a lexical analyzer using finite automata, through the classical way: Regular Expression -> NFA -> DFA. I found it an elegant and solid approach for building a lexer. Now my question is, what are other…
Bite Bytes · 259 · 1 · 7
3 votes · 1 answer

Whitespace and comment handling by the lexical analyzer

The lexical analyzer usually discards comments and whitespace. One example where I think the lexical analyzer might not discard whitespace is the Python language, as indentation has an important role in Python. But I can't think of a practical example…
Mayank Jain · 135 · 5
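The Python case mentioned in this question is easy to observe directly: the standard-library tokenize module turns significant indentation into explicit INDENT and DEDENT tokens rather than discarding it. A small demonstration:

```python
import io
import token
import tokenize

# A two-line snippet whose second line is indented.
src = "if x:\n    y = 1\n"

# tokenize.generate_tokens reads source a line at a time and yields tokens;
# indentation changes appear as INDENT/DEDENT tokens in the stream.
names = [token.tok_name[t.type]
         for t in tokenize.generate_tokens(io.StringIO(src).readline)]

print(names)  # the list includes 'INDENT' and 'DEDENT' entries
```

So in Python the whitespace at the start of a logical line is not skipped; it is converted into tokens that the parser consumes like any other.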
2 votes · 1 answer

How do hand-written lexers work?

I'm studying a book called "Engineering a Compiler" by Keith Cooper and Linda Torczon. In the second chapter, which discusses lexical analysis, they mention three approaches to writing a lexical analyser: 1. table-driven, 2. direct-coded, and 3. hand…
2 votes · 1 answer

Prefix computation used for lexical analysis?

I'm preparing a presentation on prefix computation (aka scan, the generalization of prefix summation to any associative operator) for a class I'm taking on parallel algorithms. Several lists of applications of prefix computation include lexical…
2 votes · 2 answers

What defines how much lookahead a lexer has?

If a lexical grammar has multiple tokens that start with the same character, like > >> >>=, and their longest length is 3, does it have 2-character lookahead? Or is it implementation-defined? Does the number of characters required to produce a fixed…
noamin · 21 · 1
2 votes · 0 answers

CYK algorithm - how to handle unknown terminals given in a sentence to parse?

There is a given treebank from which we derive a probabilistic context-free grammar. I wonder how one handles a given sentence that includes terminals that don't exist in the derived rules? Is there a kind of smoothing for this…
2 votes · 1 answer

Lexicographically smallest down-right path in matrix

Here is a problem that I thought was simple dynamic programming, which however is not the case. Given an $N \times M$ matrix of numbers from 1 to $NM$ (each number occurs exactly once), find a path from the top left to the bottom right while moving right…
user99043
2 votes · 1 answer

Why are regular expressions used in lexical analysis instead of grammars?

Even though grammars can do what regular expressions do, why are regular expressions employed instead of grammars in lexical analysis?
ataush · 23 · 3
1 vote · 1 answer

How to implement a maximal munch lexical analyzer by simulating NFA or running DFA?

I'm planning to implement a lexical analyzer by either simulating an NFA or running a DFA over the input text. The trouble is, the input may arrive in small chunks, and there may not be enough memory to hold one very long token. Let's assume…
juhist · 293 · 2 · 7
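The core of a maximal-munch ("longest match") lexer comes up in both this question and the lookahead question above: run the DFA as far as it will go, remember the last accepting position, and back up to it when the machine gets stuck. Below is a minimal sketch over a hand-coded DFA; the transition table and token names are hypothetical, chosen only to illustrate the technique with the > / >> / >>= example:

```python
# Toy DFA for the tokens > (GT), >> (SHR), >>= (SHR_ASSIGN).
# States: 0 start, 1 after '>', 2 after '>>', 3 after '>>='.
DELTA = {(0, ">"): 1, (1, ">"): 2, (2, "="): 3}
ACCEPT = {1: "GT", 2: "SHR", 3: "SHR_ASSIGN"}

def next_token(text, pos):
    """Return (kind, lexeme, end) for the longest token starting at pos."""
    state, last_accept = 0, None
    i = pos
    while i < len(text) and (state, text[i]) in DELTA:
        state = DELTA[(state, text[i])]
        i += 1
        if state in ACCEPT:
            last_accept = (i, ACCEPT[state])  # longest match seen so far
    if last_accept is None:
        raise ValueError(f"no token at position {pos}")
    end, kind = last_accept
    return kind, text[pos:end], end  # back up to the last accepting position

print(next_token(">>=", 0))  # ('SHR_ASSIGN', '>>=', 3)
print(next_token(">>x", 0))  # ('SHR', '>>', 2)
```

On ">>x" the DFA reaches state 2, fails on 'x', and backs up to the last accepting position, yielding SHR and leaving 'x' for the next call; this remembered-position trick is also what bounds the lookahead the lexer needs.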
1 vote · 1 answer

Lexical analysis and reserved words

My book says lexical analysis detects reserved words. How can lexical analysis detect whether we are using reserved words in incorrect places? For example: while(a>0) and int while = 5. How can lexical analysis differentiate the usage of while in…
Sagar P · 339 · 3 · 13