APLA

-- under construction --

APLA stands for APL Assembler. It is the internal representation of APL source code. This is an idea I had back in 2011.

The concept is that APL source code is parsed and converted to an APLA stream once, prior to the initial execution of the user programs. Subsequent execution operates only on this tokenized (bytecode) stream. There is no ongoing interpretation of APL code, which eliminates the large overhead associated with traditional on-the-fly interpreters. Tokenizing source code is common practice for languages in general. In the case of APL in particular, however, I believe this approach is novel and a bit tricky to achieve. It can be thought of as a form of precompilation.

Here I show how APL functionality is preserved across the conversion to bytecode. The resulting bytecode stream (or assembler code) image is mostly static at run time, but it does incorporate a self-modifying feature that is used when necessary.

Although APLA is shown as keywords and parameters it resides as a bytecode or token stream. The braces and mnemonics are an externalized human-readable form.

APLA code is generated internally by the scanner/parser. It can also be hand-coded and entered as source code for diagnostics, experimental development, or access to additional system resources by advanced users.

It is planned that the event loop, parser and interpreter itself will eventually be written in APL or an APL/APLA hybrid.

The APL statement is shown in purple. Resultant APLA production is shown in brown enclosed in braces. These productions normally occur internally. Here they are shown mnemonically to explain the executable bytestream that is generated.

A complete list of keywords, i.e. all 'assembler' mnemonics, is included later in this document.

Here I give a crash course through a series of examples.

Example 1 - number

12
{ CONST 12 }

APLA code pushes the value 12 onto the stack. The token CONST is used to stack numeric or string constants - scalars or vectors.

Example 2 - object name

DATA
{ EVAL DATA [0] }

Token EVAL causes the object name to be evaluated (i.e. executed in the case of a user function). Note the supplementary parameter '[0]' following the object name. This is a valence enforcement mask. During runtime when the object is identified its valence must be compatible with this parameter or else an error is emitted. 0 means 'niladic'.

Example 3 - object name with preceding monadic primitive

xDATA
{   EVAL        DATA [0v]
    P_SIGNUM
    }

Note the valence enforcement parameter now includes 'v' meaning the object is compelled to return a value. The APL primitive 'signum' is invoked with token P_SIGNUM. Most APL primitives have corresponding tokens.
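The valence check described above can be sketched in Python. This is only an illustration: the function names `parse_mask` and `mask_allows` are hypothetical, and reading a comma-separated mask such as '[0v,1v]' as a list of acceptable alternatives is an assumption based on how such masks appear in the later examples.

```python
def parse_mask(mask):
    """Parse a mask such as '[0v,1]' into (valence, must_return) pairs."""
    alternatives = []
    for item in mask.strip("[]").split(","):
        item = item.strip()
        must_return = item.endswith("v")   # 'v' = object must return a value
        valence = int(item.rstrip("v"))    # 0 niladic, 1 monadic, 2 dyadic
        alternatives.append((valence, must_return))
    return alternatives


def mask_allows(mask, valence, returns_value):
    """True if an object with this valence and result behaviour fits the mask."""
    # Assumption: the object passes if it matches any one alternative.
    return any(
        valence == wanted and (returns_value or not must_return)
        for wanted, must_return in parse_mask(mask)
    )
```

Under these assumptions a niladic object that returns nothing passes '[0]' but fails '[0v]', which is the point at which the runtime would emit an error.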

Example 4 - an expression containing multiplication

3xDATA
{   EVAL        DATA [0v]
    CONST       3
    P_MULTIPLY
    }

Note that the correct operation for the primitive is generated. In Example 3 the primitive was used monadically; here it is used dyadically. Whether it means the signum function or multiplication was determined from context.

Tokens designated with the "P_" prefix are APL primitives. Examples are P_SIGNUM, P_CEILING, P_ADD, P_LOGARITHM.

APL's dual meanings of primitives might be called 'overloading' in object-oriented parlance. Here the specific function was determinable from syntax alone.
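Examples 1 to 4 can be mimicked with a toy stack machine. The Python sketch below is not the actual APLA executor: representing tokens as tuples, using a dictionary as the workspace, and treating DATA as a plain numeric variable are all assumptions made for illustration, and the valence masks are left out. The operand order (left argument on top of the stack) is inferred from Example 4.

```python
def signum(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def run(stream, workspace):
    """Execute a list of (token, *params) tuples and return the final value."""
    stack = []
    for token, *params in stream:
        if token == "CONST":
            stack.append(params[0])
        elif token == "EVAL":
            # Assumption: every name is a variable holding a number.
            stack.append(workspace[params[0]])
        elif token == "P_SIGNUM":
            stack.append(signum(stack.pop()))
        elif token == "P_MULTIPLY":
            left = stack.pop()    # pushed last (the left argument)
            right = stack.pop()   # pushed first (the right argument)
            stack.append(left * right)
        else:
            raise ValueError(f"unknown token {token}")
    return stack.pop()


example_3 = [("EVAL", "DATA"), ("P_SIGNUM",)]
example_4 = [("EVAL", "DATA"), ("CONST", 3), ("P_MULTIPLY",)]
```

With `{"DATA": -5}` as the workspace, `run(example_3, ...)` gives -1 and `run(example_4, ...)` gives -15. The same 'x' in the APL source has already been resolved to two different tokens before execution starts.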

There are cases where syntax isn't enough.

Example 5 - another expression containing multiplication. Or is it?

DATAx3

Even though this example appears mathematically equivalent to the previous one (multiplication is commutative after all), APL does not know how to interpret this from this code snippet alone. It requires more context.

The object 'DATA' may be a variable or, alternatively, a function that expects a value to its right. If the latter, whatever is to the right must be evaluated first, and 'x' is then the signum function as in Example 3. If the object is a variable or a function that does NOT require a right value, then 'x' is the multiplication function as in Example 4. In the above scenario the syntax alone does not reveal which it is. The 'x' primitive is said to have ambivalent valence - that is, it can accept one or two arguments. Welcome to the joys, power, and possibly mystifying aspects of the APL language.

The interpreter must defer evaluating the expression until runtime when all objects are presumed to be identifiable. The execution path is flexible and sensitive to the object's type during execution.

{   CONST       3
    SKIP4NIL    DATA        //  valence?
    P_SIGNUM
    EVAL        DATA [1]    //  DATA is monadic
    SKIP2
    EVAL        DATA [0v]   //  DATA is niladic
    P_MULTIPLY
    }

Token 'SKIP4NIL' examines the object's valence. If it is 'niladic' the next 4 instructions are skipped. If it is NOT 'niladic' then execution drops through to the next instruction.

Token 'SKIP2' skips the following 2 instructions, which leaves execution at the end of the stream. Token 'EXIT' could be used here as well.

Object 'DATA' is expected to return a value (the 'v' parameter) because the APL multiplication primitive requires one. In the signum case there is no such requirement: the object (a user function) is free to return a value or not.
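The two run-time paths for 'DATAx3' can be illustrated in Python. This sketch does not model the skip tokens themselves; it only shows the decision they encode. The name `data_x_3` is hypothetical, a Python callable stands in for a monadic user function, and any non-callable value stands in for a variable (niladic functions are not modelled).

```python
def signum(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def data_x_3(data):
    """Evaluate 'DATAx3' once the nature of DATA is known at run time."""
    if callable(data):
        # DATA is a monadic function: 'x' is signum, applied to 3 first.
        return data(signum(3))
    # DATA is niladic (here: a variable): 'x' is multiplication.
    return data * 3
```

For instance, `data_x_3(7)` multiplies and gives 21, while `data_x_3(lambda r: r + 10)` applies the function to signum 3 and gives 11.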

Finally, a more complex example that exercises ambivalence to a greater degree. Multiple user functions (or variables, for all we know) appear together. These objects are niladic, monadic, or dyadic, meaning they accept zero, one, or two arguments, depending of course on how they were defined. In this extreme example the arrangement precludes any syntactic hints as to their valences. This is the worst nightmare for APL compilers, and likely also for programmers who are tasked with reading someone else's code.

D C B A 123
{   CONST       123
    SKIP1NIL    B         //  B valence?
    EVAL        A [1v]
    SKIP0
    SKIP1NIL    C         //  C valence?
    EVAL        B [0v,1v]
    EVAL        A [2v]    //  this could be NOP
    SKIP1NIL    D         //  D valence?
    EVAL        C [0v,1v]
    EVAL        B [2v]    //  this could be NOP
    SKIP0
    EVAL        D [0v,1]
    EVAL        C [2]     //  this could be NOP
    }

Token 'SKIP1NIL' examines the object's valence. If it is 'niladic' the next instruction is skipped. If it is NOT 'niladic' then execution drops through to the next instruction. SKIP1NIL has one additional effect. It invalidates the instruction at (* + 5) meaning 5 statements beyond its current relative location. This turns that future instruction into a NOP temporarily.
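The stream for 'D C B A 123' can be traced with a small Python simulation that records only the order in which objects are evaluated. It is a sketch under stated assumptions, not the real executor: the tuple encoding and the name `evaluation_order` are hypothetical, valence masks and actual values are ignored, SKIP0 is treated as a no-op, and the (* + 5) invalidation is assumed to happen only when SKIP1NIL drops through (the object is not niladic).

```python
PROGRAM = [
    ("CONST", 123),
    ("SKIP1NIL", "B"),
    ("EVAL", "A"),      # A [1v]
    ("SKIP0", None),
    ("SKIP1NIL", "C"),
    ("EVAL", "B"),      # B [0v,1v]
    ("EVAL", "A"),      # A [2v]
    ("SKIP1NIL", "D"),
    ("EVAL", "C"),      # C [0v,1v]
    ("EVAL", "B"),      # B [2v]
    ("SKIP0", None),
    ("EVAL", "D"),      # D [0v,1]
    ("EVAL", "C"),      # C [2]
]


def evaluation_order(niladic):
    """Return the order in which objects are evaluated, e.g. 'ABCD'."""
    nop = set()   # positions temporarily turned into NOPs
    order = []
    pc = 0
    while pc < len(PROGRAM):
        op, arg = PROGRAM[pc]
        if pc in nop:
            pc += 1
            continue
        if op == "SKIP1NIL":
            if arg in niladic:
                pc += 2          # skip the next instruction
                continue
            nop.add(pc + 5)      # assumed: invalidate only on drop-through
        elif op == "EVAL":
            order.append(arg)
        pc += 1
    return "".join(order)
```

With no niladic objects the trace is ABCD; with only C niladic it is ACBD; with B and D niladic it is BADC.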

Interestingly, the number of unique paths through the stream is given by the Fibonacci number Fib(n + 1), where n is the number of ambivalent objects. In the above example there are 4 objects: A, B, C, and D. Fib(5) yields 5. That is, there are five distinct orderings for evaluating 4 objects: ABCD, ABDC, ACBD, BACD, and BADC.

There are in fact 2^(n-1), or 8, combinations of objects B, C, and D being niladic if you include illegal combinations. Only 5 are legitimate. Illegal ones map into existing paths by forcing evaluation of objects as dyadic before evaluating their left arguments. These situations would be caught as errors.
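Both counts can be checked with a short Python enumeration. The helper names are hypothetical. The rule used for legality is an assumption drawn from the discussion above: a niladic object forces the object to its right to be dyadic, so two neighbouring objects cannot both be niladic.

```python
from itertools import product


def fib(k):
    """Fibonacci number with fib(1) == fib(2) == 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def orderings(names):
    """Evaluation orders for objects listed from rightmost to leftmost."""
    if not names:
        return [""]
    # The next object is evaluated on its own ...
    result = [names[0] + rest for rest in orderings(names[1:])]
    if len(names) >= 2:
        # ... or its left neighbour is niladic and is evaluated first.
        result += [names[1] + names[0] + rest for rest in orderings(names[2:])]
    return result


def legal_niladic_choices(count):
    """Niladic/non-niladic choices for `count` objects, no two adjacent niladic."""
    return [
        flags
        for flags in product([False, True], repeat=count)
        if not any(a and b for a, b in zip(flags, flags[1:]))
    ]
```

Here `orderings("ABCD")` yields the five orderings ABCD, ABDC, ACBD, BACD, BADC, matching `fib(5)`, and `legal_niladic_choices(3)` keeps 5 of the 8 possible combinations for B, C, and D.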

Why would ambivalent cases remain so during continued execution? Wouldn't the first encounter establish valences and enable simpler bytestreams to be generated? Not necessarily. Even if a programmer were consistent and deliberately careful to code their objects with fixed valences (a stated limitation of many APL compilers), there are cases where dynamic valence cannot be avoided. They may occur naturally or accidentally. For example, a global function is masked by a function's localized variable. A called function uses a symbol name, expecting it to be global, but the name is now masked. What was previously a monadic function has suddenly turned into a niladic variable.

There is a workaround. If valences are known and are never meant to be used dynamically or ambivalently, one can add parentheses to the APL statement, specifically parentheses around single objects, which removes their ambivalence. While this may appear redundant, it isolates objects from erroneously assumed parameters.

→ Page 2 - List of Bytecode Tokens - APL Primitives

Revised 9 May 2020