7

I've been doing some reading (I'll name drop along the way) and have selected a few scattered ideas that I think could be cobbled together into a nifty esoteric programming language. But I'm having some difficulty assembling the parts.

Kleene's Theorem states: Any Regular Set can be recognized by some Finite-State Machine (Minsky 4.3).

Minsky's Theorem 3.5: Every Finite-State machine is equivalent to, and can be "simulated by", some neural net.

"There is a natural way to represent any forest as a binary tree." (Knuth, v1, 333).

And according to Bentley (Programming Pearls, p.126) a binary tree can be encoded as a flat array.

So I'm imagining an array of bit-fields (say 4 bits so it can easily be worked with in hexadecimal). Each field indicates a type of automaton, and the positions of the array encode (via an intermediary binary tree representation) a forest which approximates (? missing piece ?) the power of a graph.

I'm somewhat bewildered by the possibilities of automaton sets to try, and of course the fun Universal Automata require three inputs (I worked up an algorithm inspired by Bentley to encode a ternary tree implicitly in a flat array, but it feels like the wrong direction). So I'd appreciate any side-bar guidance on that. Current best idea: the normal set: and or xor not nand nor, with remaining bits used for threshold weights on the inputs.

So the big piece I'm missing is a formalism for applying one of these nibble-strings to a datum. Any ideas or related research I should look into?


Edit: My theoretical support suggests that the type of computations will probably be limited to RL acceptors (and maybe generators, but I haven't thought that through).

So, I tried to find an example to flesh this out. The C int isdigit(int c) function performs a logical computation on (in effect) a bit-string. Assuming ASCII, where the valid digits are 0x30 0x31 0x32 0x33 0x34 0x35 0x36 0x37 0x38 0x39, so bit 7 must be off, bit 6 must be off, bit 5 must be on, and bit 4 must be on: these giving us the 0x30 prefix; then bit 3 must be off (0-7) or if bit 3 is on, bit 2 must be off and bit 1 must be off (suppressing A-F), and don't care about bit 0 (allowing 8 and 9). If you represent the input c as a bit-array (c[0]..c[7]), this becomes

~c[7] & (~c[6] & (c[5] & (c[4] & (~c[3] | (~c[2] & ~c[1])))))

Arranging the operators into a tree (colon (:) represents a wire since pipe (|) is logical or),

c[7]  6   5   4   3   2   1   0
 ~    ~   :   :   ~   ~   ~   :
    &     :   :   :     &
       &      :      |
           &        :  
                &

My thought based on this is to insert "input lead" tokens into the tree which receive the values of the input bit assigned in a left-to-right manner. And I also need a ground or sink to explicitly ignore certain inputs (like c[0] above).

This leads me to make NOT (~) a binary operator which negates the left input and simply absorbs right input. And in the course of trying this, I also realized the necessity for a ZERO token to build masks (and to provide dummy input for NOTs).

So the new set is: &(and) |(or) ^(xor) ~(not x, sink y) 0(zero) I(input)

So the tree becomes (flipping up for down)

                 ^
           &           &
       &       |      I 0
     &   I  ~     &
   &   I   I 0  ~   ~
 ~   ~         I 0 I 0
I 0 I 0
=   =  = = =   =   =  =
7   6  5 4 3   2   1  0 

Which encodes into the array (skipping the "forest<=>tree" part, "_" represents a blank)

_ ^ & & & | I 0 & I ~ & _ _ _ _ & I _ _ I 0 ~ ~ _
  _ _ _ _ _ _ _ ~ ~ _ _ _ _ _ _ _ _ _ _ I 0 I 0 _
  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ I 0 I 0 

The tree->array encoding always put the root in array(1) so with zero-indexed array, there's a convenient blank at the beginning that could be used for linkage, I think.

With only 6 operators, I suppose it could be encoded in octal.

By packing a forest of trees, we could represent a chain of acceptors each applied on the next input depending on the result of the previous.

luser droog
  • 277
  • 2
  • 10

3 Answers3

12

This sounds more like the makings of a computational model rather than a programming language, as such, perhaps in the same way that the quantum computation can form the basis of a programming language such as the quantum lambda calculus.

Ask yourself:

  • What kinds of computation are you trying to perform?
  • How can these computations be composed?
  • How can the results of these computations be stored? How can they be used in subsequent computations?
  • How can I represent this syntactically?
  • Does my chosen syntax have a clear denotational meaning? Or clear operational meaning?
  • Do my syntactic constructs compose nicely? Are they sufficiently orthogonal and free from arbitrary constraints?
  • Can I modularise computations in my language?
  • What forms of abstraction does my language offer?

There are a number of places you could use as a starting point:

  • The imperative language WHILE --- a simple language with variables, assignment, sequencing, if statements, and while statements. Maybe later add procedures.
  • The lambda calculus.
  • Combinatorial logic. Programs consist of combinators which are plugged together. No variables.
  • Something like Prolog or Datalog.

Or you could be completely wild and ignore these and see how far you get (which could result in something interesting).

Dave Clarke
  • 20,345
  • 4
  • 70
  • 114
6

From the point of view of filling-out the machine model, the three criteria of Turing Completeness (Böhm-Jacopini theorem) appear useful.

  • Sequence
  • Selection
  • Iteration or Recursion

It's clearly #3 that's missing at present, the fanciful "linkage" mentioned above.


Edit: This doesn't really help dig me out of the hole with this question, but ...

I was reading an old book on digital computer fundamentals and found this very nice representation of all possible functions of two binary inputs:

x y  F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 F13 F14 F15 F16
0 0   0  1  0  1  0  1  0  1  0   1   0   1   0   1   0   1
0 1   0  0  1  1  0  0  1  1  0   0   1   1   0   0   1   1
1 0   0  0  0  0  1  1  1  1  0   0   0   0   1   1   1   1
1 1   0  0  0  0  0  0  0  0  1   1   1   1   1   1   1   1

 C =  0  1  2  3  4  5  6  7  8   9   A   B   C   D   E   F
(function code)

So take the function F n, store n as C = n - 1, and the Universal Binary Function is: C x,y = ( C >> ( 2 * x + y ) & 1 ). And it fits in one hex digit!


Edit: There's another piece missing from the above: representing a neural-net as a forest.


Combining the above Universal Binary Function table with the isdigit(c) example from the question gives

[0] used for "size"
 8             
                      [1] tree root
                       8                       8 = X & Y (AND)

          [2]                     [3]          7 = ~(X & Y) (NAND)
           8                       7              
                                               1 = ~(X | Y) (NOR)
     [4]        [5]         [6]         [7]
      1          8           7           C     C = X  (ignore Y)

   [8]  [9]  [10]  [11]  [12]  [13]  [14]  [15]
    c7   c6   c5    c4    c3    c2    c1    c0

Where each function is applied to its two children, and the 8 input values propagate upward toward the root, collapsing into the yielded value.

So, the (1 bit) isdigit(8bit c) function encodes into this array (with the argument row omitted).

8:8 8 7 1 8 7 C [. . . . . . . .]

For building data-types, I'm inspired by this brief description of the types in Konrad Zuse's Plankalkül,

Binary n-bit numbers in the Plankalkül were represented by type S1.n. Another special type was used for floating-binary numbers, namely,

SΔ1 = (S1.3, S1.7, S1.22).

The first three-bit component here was for signs and special markers&dash;&dash; indicating, for example, whether the number was real or imaginary or zero; the second was for a seven-bit exponent in two's complement notation; and the final 22 bits represented the 23-bit fraction part of a normalized number, with the redundant leading-'1' bit suppressed. -- Knuth, "The Early Development of Programming Languages" in Selected Papers on Computer Languages, p. 10.

luser droog
  • 277
  • 2
  • 10
5

Ahh, the ever persistent question of how to design a language. For years I never had a good solid answer to that until I fond Cognitive dimensions of notations.

The author list out various dimensions, think orthogonal properties, which should be considered when designing a language. I wish I had found this decades ago.

Enjoy.

Guy Coder
  • 5,181
  • 2
  • 30
  • 65