What piece am I missing to turn this idea into a programming language?

Question

I've been doing some reading (I'll name drop along the way) and have selected a few scattered ideas that I think could be cobbled together into a nifty esoteric programming language. But I'm having some difficulty assembling the parts.

Kleene's Theorem states: Any Regular Set can be recognized by some Finite-State Machine (Minsky 4.3).

Minsky's Theorem 3.5: Every Finite-State machine is equivalent to, and can be "simulated by", some neural net.

"There is a natural way to represent any forest as a binary tree." (Knuth, v1, 333).

And according to Bentley (Programming Pearls, p.126) a binary tree can be encoded as a flat array.
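The last two citations compose: a forest can be turned into a binary tree, and the binary tree can be laid out in a flat array. Here is a minimal Python sketch of both steps. It assumes the usual left-child/right-sibling reading of Knuth's correspondence and a heap-style layout with the root at index 1. The tuple representation, the function names `forest_to_binary` and `to_array`, and the `_` blank marker are made up for illustration.

```python
# A forest is a list of trees; a tree is (label, [child trees]).
# A binary tree is None or (label, left, right).

def forest_to_binary(forest):
    """Left-child/right-sibling encoding of an ordered forest."""
    if not forest:
        return None
    (label, children), *siblings = forest
    # left link: the node's first child; right link: its next sibling
    return (label, forest_to_binary(children), forest_to_binary(siblings))

def to_array(tree, blank="_"):
    """Lay out a binary tree in a flat list with the root at index 1."""
    cells = {}

    def place(node, i):
        if node is not None:
            label, left, right = node
            cells[i] = label
            place(left, 2 * i)        # left child of i lives at 2i
            place(right, 2 * i + 1)   # right child of i lives at 2i + 1

    place(tree, 1)
    size = max(cells, default=0) + 1
    return [cells.get(i, blank) for i in range(size)]

# Two trees: A with children B and C, and a lone D.
forest = [("A", [("B", []), ("C", [])]), ("D", [])]
binary = forest_to_binary(forest)
print(" ".join(to_array(binary)))   # _ A B D _ C
```

Index 0 is left unused, and any absent child shows up as a blank cell, so sparse trees cost space in this layout.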

So I'm imagining an array of bit-fields (say 4 bits so it can easily be worked with in hexadecimal). Each field indicates a type of automaton, and the positions of the array encode (via an intermediary binary tree representation) a forest which approximates (? missing piece ?) the power of a graph.

I'm somewhat bewildered by the possibilities of automaton sets to try, and of course the fun Universal Automata require three inputs (I worked up an algorithm inspired by Bentley to encode a ternary tree implicitly in a flat array, but it feels like the wrong direction). So I'd appreciate any side-bar guidance on that. Current best idea: the normal set: and or xor not nand nor, with remaining bits used for threshold weights on the inputs.
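To make the "threshold weights" idea concrete, here is a small Python sketch of a two-input threshold unit in the McCulloch-Pitts style. The specific weights, thresholds, and the name `threshold_unit` are assumptions chosen for illustration, not something taken from Minsky's text.

```python
def threshold_unit(w1, w2, theta):
    """Return a two-input gate that fires when w1*x + w2*y >= theta."""
    return lambda x, y: int(w1 * x + w2 * y >= theta)

AND  = threshold_unit( 1,  1,  2)
OR   = threshold_unit( 1,  1,  1)
NAND = threshold_unit(-1, -1, -1)
NOR  = threshold_unit(-1, -1,  0)
NOT  = threshold_unit(-1,  0,  0)   # negates x, ignores y

def XOR(x, y):
    # XOR is not linearly separable, so it takes two layers of units.
    return AND(OR(x, y), NAND(x, y))

for x in (0, 1):
    for y in (0, 1):
        print(x, y, AND(x, y), OR(x, y), NAND(x, y), NOR(x, y), XOR(x, y))
```

With small signed weights, every gate in the list except xor fits in a single unit, which may affect how many bits are worth spending on weights.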

So the big piece I'm missing is a formalism for applying one of these nibble-strings to a datum. Any ideas or related research I should look into?

Edit: My theoretical support suggests that the computations will probably be limited to regular-language (RL) acceptors (and maybe generators, but I haven't thought that through).

So, I tried to find an example to flesh this out. The C function int isdigit(int c) performs a logical computation on (in effect) a bit-string. Assuming ASCII, the valid digits are 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39. So bit 7 must be off, bit 6 must be off, bit 5 must be on, and bit 4 must be on, which gives us the 0x30 prefix. Then either bit 3 is off (0-7) or, if bit 3 is on, bit 2 and bit 1 must both be off (excluding 0x3A-0x3F). Bit 0 is a don't-care (allowing both 8 and 9). If you represent the input c as a bit-array (c[0]..c[7]), this becomes

~c[7] & (~c[6] & (c[5] & (c[4] & (~c[3] | (~c[2] & ~c[1])))))
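The expression can be checked by brute force over all 256 byte values. This is a quick Python sketch with the illustrative names `not1` and `isdigit_bits`. Python's `~` complements the whole integer, so the sketch uses `x ^ 1` as a one-bit NOT instead.

```python
def not1(x):
    """One-bit NOT for values 0 and 1."""
    return x ^ 1

def isdigit_bits(c):
    b = [(c >> k) & 1 for k in range(8)]   # b[k] is bit k of c
    return (not1(b[7]) & (not1(b[6]) & (b[5] & (b[4] &
            (not1(b[3]) | (not1(b[2]) & not1(b[1])))))))

accepted = "".join(chr(c) for c in range(256) if isdigit_bits(c))
print(accepted)   # 0123456789
```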

Arranging the operators into a tree (colon (:) represents a wire since pipe (|) is logical or),

c[7]  6   5   4   3   2   1   0
 ~    ~   :   :   ~   ~   ~   :
    &     :   :   :     &
       &      :      |
           &        :  
                &

My thought based on this is to insert "input lead" tokens into the tree, which receive the values of the input bits, assigned left to right. I also need a ground or sink to explicitly ignore certain inputs (like c[0] above).

This leads me to make NOT (~) a binary operator which negates the left input and simply absorbs the right input. In the course of trying this, I also realized the need for a ZERO token to build masks (and to provide dummy input for NOTs).

So the new set is: &(and) |(or) ^(xor) ~(not x, sink y) 0(zero) I(input)

So the tree becomes (flipping up for down)

                 ^
           &           &
       &       |      I 0
     &   I  ~     &
   &   I   I 0  ~   ~
 ~   ~         I 0 I 0
I 0 I 0
=   =  = = =   =   =  =
7   6  5 4 3   2   1  0

Which encodes into the array (skipping the "forest<=>tree" part, "_" represents a blank)

_ ^ & & & | I 0 & I ~ & _ _ _ _ & I _ _ I 0 ~ ~ _
  _ _ _ _ _ _ _ ~ ~ _ _ _ _ _ _ _ _ _ _ I 0 I 0 _
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ I 0 I 0

The tree->array encoding always puts the root in array(1), so with a zero-indexed array there's a convenient blank at the beginning that could be used for linkage, I think.
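One possible formalism for applying such an array to a datum is a depth-first walk from index 1 that hands the next input bit to each I token it meets. The sketch below is in Python. The names `run` and `accepts` are illustrative, and it assumes that the children of cell i sit at 2i and 2i+1 and that bits are fed most-significant first.

```python
# The array from above, root at index 1; "_" marks an unused cell.
ARRAY = (
    "_ ^ & & & | I 0 & I ~ & _ _ _ _ & I _ _ I 0 ~ ~ _ "
    "_ _ _ _ _ _ _ ~ ~ _ _ _ _ _ _ _ _ _ _ I 0 I 0 _ "
    "_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ I 0 I 0"
).split()

def run(array, bits):
    """Evaluate the tree, handing input bits to I tokens left to right."""
    feed = iter(bits)

    def ev(i):
        tok = array[i]
        if tok == "I":
            return next(feed)             # consume the next input bit
        if tok == "0":
            return 0
        x, y = ev(2 * i), ev(2 * i + 1)   # left subtree is evaluated first
        if tok == "&":
            return x & y
        if tok == "|":
            return x | y
        if tok == "^":
            return x ^ y
        if tok == "~":
            return x ^ 1                  # negate x, absorb y
        raise ValueError(f"unexpected token {tok!r} at index {i}")

    return ev(1)

def accepts(c):
    bits = [(c >> k) & 1 for k in range(7, -1, -1)]   # c7 first, c0 last
    return run(ARRAY, bits)

print("".join(chr(c) for c in range(256) if accepts(c)))   # 0123456789
```

Because both children are always evaluated, an absorbed right input still consumes its bit, which keeps the left-to-right assignment of inputs predictable.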

With only 6 operators, I suppose it could be encoded in octal.

By packing a forest of trees, we could represent a chain of acceptors each applied on the next input depending on the result of the previous.

Dave Clarke · Accepted Answer · 2012-09-20T09:28:54.403

This sounds more like the makings of a computational model than of a programming language as such, perhaps in the same way that quantum computation can form the basis of a programming language such as the quantum lambda calculus.

Ask yourself:

What kinds of computation are you trying to perform?
How can these computations be composed?
How can the results of these computations be stored? How can they be used in subsequent computations?
How can I represent this syntactically?
Does my chosen syntax have a clear denotational meaning? Or clear operational meaning?
Do my syntactic constructs compose nicely? Are they sufficiently orthogonal and free from arbitrary constraints?
Can I modularise computations in my language?
What forms of abstraction does my language offer?

There are a number of places you could use as a starting point:

The imperative language WHILE --- a simple language with variables, assignment, sequencing, if statements, and while statements. Maybe later add procedures.
The lambda calculus.
Combinatory logic. Programs consist of combinators which are plugged together. No variables.
Something like Prolog or Datalog.

Or you could be completely wild and ignore these and see how far you get (which could result in something interesting).

luser droog · Answer 2 · 2013-11-29T03:20:30.750

From the point of view of filling out the machine model, the three control structures of the structured program theorem (Böhm-Jacopini), often used as criteria for Turing completeness, appear useful:

1. Sequence
2. Selection
3. Iteration or Recursion

It's clearly #3 that's missing at present, the fanciful "linkage" mentioned above.

Edit: This doesn't really help dig me out of the hole with this question, but ...

I was reading an old book on digital computer fundamentals and found this very nice representation of all possible functions of two binary inputs:

x y  F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 F16
0 0   0  1  0  1  0  1  0  1  0   1   0   1   0   1   0   1
0 1   0  0  1  1  0  0  1  1  0   0   1   1   0   0   1   1
1 0   0  0  0  0  1  1  1  1  0   0   0   0   1   1   1   1
1 1   0  0  0  0  0  0  0  0  1   1   1   1   1   1   1   1

 C =  0  1  2  3  4  5  6  7  8   9   A   B   C   D   E   F
(function code)

So for the function Fn, store the code C = n - 1 (the column of Fn read bottom-to-top as a 4-bit number), and the Universal Binary Function is: C(x, y) = (C >> (2*x + y)) & 1. And it fits in one hex digit!
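A quick way to convince yourself is to regenerate the table columns from the codes. This is a minimal Python sketch, with `apply_code` and `truth_table` as illustrative names:

```python
def apply_code(code, x, y):
    """Apply the two-input boolean function whose 4-bit code is `code`."""
    return (code >> (2 * x + y)) & 1

def truth_table(code):
    """Outputs for (x, y) = (0,0), (0,1), (1,0), (1,1), in that order."""
    return [apply_code(code, x, y) for x in (0, 1) for y in (0, 1)]

for code in range(16):
    print(f"{code:X}", truth_table(code))
# e.g. code 8 gives [0, 0, 0, 1] (AND) and code 6 gives [0, 1, 1, 0] (XOR)
```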

Edit: There's another piece missing from the above: representing a neural-net as a forest.

Combining the above Universal Binary Function table with the isdigit(c) example from the question gives

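[0] used for "size"
 8
                      [1] tree root
                       8                       8 = X & Y (AND)

          [2]                     [3]          7 = ~(X & Y) (NAND)
           8                       7
                                               1 = ~(X | Y) (NOR)
     [4]        [5]         [6]         [7]
      1          8           C           E     C = X  (ignore Y)
                                               E = X | Y (OR)
   [8]  [9]  [10]  [11]  [12]  [13]  [14]  [15]
    c7   c6   c5    c4    c3    c0    c2    c1

Where each function is applied to its two children, and the 8 input values propagate upward toward the root, collapsing into the yielded value.

The leaves are deliberately not in bit order: c3 is paired with the ignored c0 so that c2 and c1 can meet at an OR before the NAND at [3], which yields ~c3 | (~c2 & ~c1). With the inputs paired strictly in bit order, the pair (c3, c2) would have to pass three distinct cases through a single bit, so no choice of codes at [3], [6], and [7] works.

So, the (1 bit) isdigit(8bit c) function encodes into this array (with the argument row omitted).

8:8 8 7 1 8 C E [. . . . . . . .]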
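Here is a minimal Python sketch of that upward propagation. It assumes the heap-style layout (children of node i at 2i and 2i+1) and one hex digit of function code per internal node. The names `apply_code`, `eval_tree`, and `is_digit` are illustrative. To reproduce the expression from the question, this sketch uses the codes 8 8 7 1 8 C E and feeds the leaves in the order c7 c6 c5 c4 c3 c0 c2 c1, so that c2 and c1 meet at an OR (code E) before the NAND.

```python
def apply_code(code, x, y):
    """Universal two-input function: bit (2x + y) of the 4-bit code."""
    return (code >> (2 * x + y)) & 1

def eval_tree(codes, leaves):
    """codes[i - 1] is the function at heap node i; leaves fill the bottom row."""
    n = len(leaves)                    # assumed to be a power of two
    node = [0] * n + list(leaves)      # leaves occupy indices n .. 2n-1
    for i in range(n - 1, 0, -1):      # bottom-up, finishing at the root
        node[i] = apply_code(codes[i - 1], node[2 * i], node[2 * i + 1])
    return node[1]

CODES = [0x8, 0x8, 0x7, 0x1, 0x8, 0xC, 0xE]   # nodes [1] .. [7]
LEAF_BITS = [7, 6, 5, 4, 3, 0, 2, 1]          # bit fed to leaves [8] .. [15]

def is_digit(c):
    leaves = [(c >> k) & 1 for k in LEAF_BITS]
    return eval_tree(CODES, leaves)

print("".join(chr(c) for c in range(256) if is_digit(c)))   # 0123456789
```

The evaluator itself knows nothing about isdigit. Swapping in a different code array (and leaf order) gives a different 8-input acceptor.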

For building data-types, I'm inspired by this brief description of the types in Konrad Zuse's Plankalkül:

Binary n-bit numbers in the Plankalkül were represented by type S1.n. Another special type was used for floating-binary numbers, namely,

SΔ1 = (S1.3, S1.7, S1.22).

The first three-bit component here was for signs and special markers -- indicating, for example, whether the number was real or imaginary or zero; the second was for a seven-bit exponent in two's complement notation; and the final 22 bits represented the 23-bit fraction part of a normalized number, with the redundant leading-'1' bit suppressed. -- Knuth, "The Early Development of Programming Languages" in Selected Papers on Computer Languages, p. 10.

score 5 · Answer 3 · answered Sep 22 '12 at 00:50

Ahh, the ever-persistent question of how to design a language. For years I never had a good, solid answer to that, until I found Cognitive Dimensions of Notations.

The author lists various dimensions (think orthogonal properties) which should be considered when designing a language. I wish I had found this decades ago.

Enjoy.
