
So, I am implementing a context-sensitive syntactic analyzer. It's something of an experiment, and one of the things I need is a set of usable syntactic constructs to test it on.

For example, the following cannot be parsed with a standard CFG (context-free grammar). It declares multiple variables of unrelated data types and initializes them all at once.

int bool string number flag str = 1 true "Hello";

If I omit a few details, the language used can be formally described like this:

L = {a^n b^n c^n | n >= 1}
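To make the correspondence concrete, here is a minimal Python sketch that maps a declaration to a string over {a, b, c} (types, names, values) and then tests membership in L. The `TYPE_KEYWORDS` set, the token regex, and both function names are assumptions made for illustration; they are not part of any existing analyzer.

```python
import re

# Hypothetical set of type keywords; the real language may have more.
TYPE_KEYWORDS = {"int", "bool", "string"}


def is_anbncn(s):
    """Return True if s is in {a^n b^n c^n | n >= 1}."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n


def declaration_shape(src):
    """Map a declaration to a string of 'a' (type), 'b' (name), 'c' (value)."""
    # Tokens: double-quoted strings, the '=' sign, or runs of other characters.
    tokens = re.findall(r'"[^"]*"|[^\s=;"]+|=', src)
    if "=" not in tokens:
        return ""
    eq = tokens.index("=")
    left = "".join("a" if t in TYPE_KEYWORDS else "b" for t in tokens[:eq])
    return left + "c" * len(tokens[eq + 1:])


decl = 'int bool string number flag str = 1 true "Hello";'
print(declaration_shape(decl), is_anbncn(declaration_shape(decl)))
```

This only checks that the three groups have equal length and appear in order; it says nothing about whether each value suits its type, which is the type checking excluded below.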

So, I would appreciate as many similar examples as you can think of, ideally from the area of programming languages.

Also, I am aware that current programming languages and their compilers are context-sensitive, mainly thanks to semantic analysis, so I would like to state that I am not looking for things like:

  • type checking
  • is variable declared?

I would prefer the examples to be actual syntactic constructs, like the declaration example shown above.

Thanks in advance ;).

Raphael
tedd

1 Answer


Here are three context-sensitive syntaxes actually found in programming languages. I don't believe I've ever seen a language which has types, names and values distributed as per your example, but it could certainly exist, and I'm sure there are even less readable syntaxes which are possible. The following are at least somewhat readable:

  1. Syntactic whitespace, as per Python or Haskell. This is usually handled with a context-sensitive lexical scanner rather than a context-sensitive grammar, but it is certainly context-sensitive, and it could be handled with a context-sensitive grammar if you had the machinery available. (In fact, it could be cleaner to handle it in the parser, especially for languages like Haskell in which layout-sensitive parsing is optional. [Note 1])

  2. Multi-dimensional array literals. Here, I'm not talking about languages which implement heterogeneous one-dimensional arrays as first-class types, so that an array can be an element of another array; in that case, there is no requirement that a multi-dimensional array be regular. Rather, I'm talking about languages in which multi-dimensional arrays must be regular, and so an irregular literal is a syntax error:

    julia> [2 3 4;5 6 8]
    2x3 Array{Int64,2}:
     2  3  4
     5  6  8
    
    julia> [2 3 4;5 6]
    ERROR: hvcat: row 2 has mismatched number of columns
     in hvcat at abstractarray.jl:993
    

    That's a bit of a cheat, because the Julia syntax above is really syntactic sugar for a call to hvcat, as indicated in the error message. But it could have been syntactic. Fortress -- the syntactic collector's favourite vapourware language -- proposed syntactic array literals:

    The parts of higher-dimensional matrices are separated by repeated-semicolons, where the dimensionality of the result is equal to one plus the number of repeated semicolons. Here is a 3 x 3 x 3 x 2 matrix:

    [ 1 0 0
      0 1 0
      0 0 1   ;;   0 1 0
                   1 0 1
                   0 1 0   ;;   1 0 1
                                0 1 0
                                1 0 1
      ;;;
       1 0 0
       0 1 0
       0 0 1   ;;   0 1 0
                    1 0 1
                    0 1 0   ;;   1 0 1
                                 0 1 0
                                 1 0 1 ]
    

    The elements in a matrix expression may be either scalars or matrices themselves. If they are matrices, then the elements along a row (or column) must have the same number of columns (or rows), though two elements in different rows (columns) need not have the same number of columns (rows). A scalar is treated as a one by one matrix. (Quoted from The Fortress Language Specification by Guy L. Steele et al, §2.3.19, p. 21)

  3. Agreement between parameter count in function prototypes and number of arguments in function calls. Perhaps this fits in your concept of "type checking", and I don't think it adds anything interesting to your problem set. But it is certainly context-sensitive.
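As a rough illustration of item 2, the following Python sketch treats an irregular two-dimensional literal as a syntax error at parse time. It handles only a simplified form of the Julia-style notation shown above (integers, spaces between columns, `;` between rows); the function name `parse_matrix_literal` is made up for this example and is not how Julia or Fortress implement it.

```python
def parse_matrix_literal(src):
    """Parse '[1 2 3;4 5 6]' into a list of rows, rejecting ragged input."""
    body = src.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise SyntaxError("expected a literal of the form [ ... ]")
    # Rows are separated by ';', columns by whitespace.
    rows = [[int(tok) for tok in row.split()] for row in body[1:-1].split(";")]
    width = len(rows[0])
    for number, row in enumerate(rows[1:], start=2):
        if len(row) != width:
            # The context-sensitive part: every row must match the first one.
            raise SyntaxError(
                f"row {number} has {len(row)} columns, expected {width}"
            )
    return rows


print(parse_matrix_literal("[2 3 4;5 6 8]"))
```

Calling it with `"[2 3 4;5 6]"` raises `SyntaxError`, mirroring the mismatched-columns error in the Julia session but reporting it as a parse failure rather than a failed function call.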

Notes

  1. I stumbled upon Layout-sensitive Generalized Parsing by Sebastian Erdweg, Tillmann Rendel, Christian Kästner, and Klaus Ostermann while writing this answer, but I haven't read it. It seems to propose a usable formalism for layout-aware parsing.
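For comparison, the scanner-based handling of syntactic whitespace mentioned in item 1 can be sketched in a few lines of Python. This is a simplified version of the usual indentation-stack technique, not the formalism from the paper above; it assumes indentation uses spaces only, and the token names and the function `layout_tokens` are illustrative.

```python
def layout_tokens(source):
    """Turn indentation into explicit INDENT/DEDENT tokens around LINE tokens."""
    stack = [0]  # indentation widths of the currently open blocks
    tokens = []
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines do not affect layout
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            tokens.append(("INDENT", ""))
        while width < stack[-1]:
            stack.pop()
            tokens.append(("DEDENT", ""))
        if width != stack[-1]:
            raise IndentationError("unindent does not match any outer level")
        tokens.append(("LINE", line.strip()))
    # Close every block still open at end of input.
    tokens.extend(("DEDENT", "") for _ in stack[1:])
    return tokens


for token in layout_tokens("if x:\n    y\n    if z:\n        w\nv\n"):
    print(token)
```

Once the scanner has emitted these tokens, a context-free grammar can treat INDENT and DEDENT like ordinary brackets; the context sensitivity lives entirely in the stack.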
rici