Syntactically a Tridash program consists of a sequence of node
declarations, each consisting of a single node expression. Node
expressions are divided into three types: atom nodes, functor
nodes and literal nodes. Declarations are separated by a semicolon
;
character or a line break.
Anything occurring between a #
character and the end of the line is
treated as a comment and is discarded.
Basic Grammar.
<declaration> = <node>(';' | <line break> | <end of file>) <node> = <functor> | <identifier> | <literal> <node> = '(' <node> ')' <functor> = <prefix-functor> | <infix-functor> <prefix-functor> = <node> '(' [<node> {',' <node>}*] ')' <infix-functor> = <node>' '<identifier>' '<node> <literal> = <number> | <string> <number> = <integer> | <real>
The grammar notation used in this manual adheres to the following conventions:
<...>
indicate named syntactical entities.
=
designates that the syntactical entity on the left is
defined by the rules on the right. Multiple definitions of the
same syntactical entity indicate disjunction of the rules in
each definition.
|
indicates disjunction in a single definition.
'...'
.
..
indicates ranges, e.g. '0'..'9'
indicates the digit
characters in the range 0
to 9
: 0
, 1
, 2
, 3
, 4
,
5
, 6
, 7
, 8
, 9
.
[...]
are optional.
{...}
are grouped into a single rule.
*
indicates zero or more repetitions of the preceding rule.
+
indicates one or more repetitions of the preceding rule.
An atom node consists of a single node identifier, referred to as a
symbol. Node identifiers may consist of any sequence of characters
excluding whitespace, line breaks, and the following special
characters: (
, )
, {
, }
, "
, ,
, .
, ;
, #
. Node identifiers
must consist of at least one non-digit character, otherwise they are
interpreted as numbers.
The following are all examples of valid node identifiers:
name
full-name
node1
1node
The following are not valid node identifiers:
123
a.b
j#2
— Only the j
is part of an identifier, the #2
is a comment
1e7
— As the e
indicates a real-number in scientific
notation. See Numbers.
A functor node is an expression consisting of an operator node applied
to zero or more argument nodes. Syntactically the operator node is
written first followed by the comma-separated list of argument nodes
in parenthesis (...)
.
Examples.
op(arg) func(arg1, arg2) m.fn()
A line break occurring before the closing parenthesis |
In the special case of an operator applied to two arguments, the operator may be placed in infix position, that is between the two argument nodes. The operator may only be an atom node and must be registered as an infix operator.
Spaces between the infix operator and its operands are required, in order to distinguish the operator from the operands. This is due to there being few restrictions on the characters allowed in node identifiers. |
A line break occurring between the infix operator and its right argument is not treated as a declaration separator. However a line break between the left argument and infix operator is treated as terminating the declaration consisting of the left argument. The following line is then treated as a new separate declaration. |
Functor expressions written in infix and prefix form are equivalent, thus the following infix functor expression:
a + b
is equivalent to the following prefix functor expression (either expression may be written in source code):
+(a, b)
Each node registered as an infix operator has, associated with it, a precedence and associativity. The precedence is a number that controls the priority with which the operator consumes its arguments. Operators with a higher precedence consume arguments before operators with a lower precedence. The associativity controls whether the operands are grouped starting from the left or right, in an expression containing multiple instances of the same infix operator.
a + b * c
The *
operator has a greater precedence than the +
operator, thus
it consumes its arguments first, consuming the b
and c
arguments.
The +
operator has a lower precedence thus it consumes its arguments
after *
. The arguments available to it are a
and *(b, c)
.
As a result the infix functor expression is parsed to the following:
+(a, *(b, c))
Use parenthesis to control which arguments are grouped with which
operators. Thus for an infix expression to be parsed to the following,
, assuming the *
operator has a greater precedence than +
:
*(+(a, b), c)
a + b
must be surrounded in parenthesis:
(a + b) * c
Multiple declarations can be syntactically grouped into a single node
expression using the {
and }
delimiters. All declarations between
the delimiters are processed and accumulated into a special node
list syntactic entity. In most contexts the node list is treated as
being equivalent to the last node expression before the closing brace
}
, however special operators and macros may process node lists in a
different manner.
Node Lists.
<node> = <node list> <node list> = '{' <declaration>* '}'
Each node list must be terminated by a closing brace |
Literal nodes include numbers and strings.
There are two types of numbers: integers and real-valued numbers, referred to as reals, which are represented as floating-point numbers.
Integers consist of a sequence of digits in the range 0—9
,
optionally preceded by the integer’s sign. A preceding -
indicates a
negative number. A preceding +
indicates a positive integer, which
is the default if the sign is omitted.
Integer Syntax.
<integer> = ['+'|'-']('0'..'9')+
There are numerous syntaxes for real-valued numbers. The most basic is
the decimal syntax which comprises an integer followed by the
decimal dot .
character and a sequence of digits in the range
0—9
.
Decimal Real Syntax.
<real> = <decimal> | <magnitude-exponent> <decimal> = <integer>'.'('0'..'9')+
The decimal |
The exponent syntax allows a real-number to be specified in scientific
notation as a magnitude and exponent pair
. The exponent syntax comprises a real in
decimal syntax or an integer, followed by the character e
, f
, d
,
or l
which indicates the precision of the real-number, followed by
the exponent as an integer.
Exponent Syntax.
<magnitude-exponent> = (<decimal>|<integer>)['e'|'f'|'d'|'l']<integer>
e
and f
indicate a single precision floating point number, d
indicates double precision and l
indicates long precision.
Literal strings consist of a sequence of characters enclosed in double
quotes "..."
.
String Syntax.
<string> = '"'<unicode char>*'"'
where <unicode char>
can be any Unicode character.
A literal "
character can appear inside a string if it is preceded
by the backslash escape character \
.
Example.
"John said \"Hello\""
Certain escape sequence, consisting of a \
followed by a character,
are shorthands for special characters, allowing the character to
appear in the parsed string without having to write the actual
character in the string literal.
Table 1. Escape Sequences
Sequence | Character | ASCII Character Code (Hex) |
---|---|---|
| Line Feed (LF) / New Line |
|
| Carriage Return (CR) |
|
| Tab |
|
| Unicode Character |
|
The \u{<code>}
escape sequence is replaced with the Unicode
character with code (in hexadecimal) <code>
. There must be an
opening brace {
following \u
otherwise the escape sequence is
treated as an ordinary literal character escape, in which \u
is
replaced with u
. Currently the closing brace is optional }
, as
only the characters up to the first character that is not a
hexadecimal digit are considered part of the character code. However,
it is good practice to insert the closing brace as it clearly delimits
which characters are to be interpreted as the character code and which
characters are literal characters.
The |
In a future release, omitting either the opening or closing brace, in a Unicode escape sequence, may result in a parse error. |