Compare commits: feature-fu...master (83 commits)
| SHA1 | Author | Date | |
|---|---|---|---|
| 80d9b36901 | |||
| 70c9d073f9 | |||
| 5f720ad7c3 | |||
| 748bd10dd9 | |||
| 1ade6cae50 | |||
| 3e4ed82dc4 | |||
| e5edc6b2ba | |||
| f4286db21d | |||
| 3892ea46e0 | |||
| 8b7ed96e15 | |||
| 67b07dfd72 | |||
| 6c0867143b | |||
| abefe32300 | |||
| 742d6706b0 | |||
| 3806a61756 | |||
| 2880ba81ab | |||
| 4e92a416ed | |||
| c1bee69fa6 | |||
| f2331d7de9 | |||
| c4d2f89d35 | |||
| ab059ce18c | |||
| aeedfb4ef2 | |||
| f0c2bd8dde | |||
| 421fbbc873 | |||
| 383da4ae05 | |||
| 7ea5f67f9c | |||
| 235eb460dc | |||
| 2312deec5b | |||
| 948d41fb45 | |||
| fdef796440 | |||
| 926bdeb2dc | |||
| 726dd62794 | |||
| c723b1c2cb | |||
| e7b67d85a9 | |||
| cf2e5348bb | |||
| 8b67c4d59c | |||
| cbf31fa513 | |||
| 56665af233 | |||
| 22634af554 | |||
| d4c6f3d5dc | |||
| 4dbc3adfd5 | |||
| cbea567d65 | |||
| e4977da546 | |||
| 588b3b5b2c | |||
| f6152670aa | |||
| c2b9ee71b8 | |||
| f8e5bd7423 | |||
| d7001a5c52 | |||
| bc68d9fa49 | |||
| 264d8f92f4 | |||
| d8f5b876ac | |||
| 8cf6177cbc | |||
| 39bd4400b4 | |||
| 75b99869d4 | |||
| de0bbb8171 | |||
| 92f59cbf9a | |||
| dd9ca660cc | |||
| 7e2ef49481 | |||
| 86130984e2 | |||
| c4b146c325 | |||
| 7b6fc89fb7 | |||
| 8c9756b6d2 | |||
| 02993142df | |||
| 3348b7cf6d | |||
| 3098dc7e0a | |||
| e0c00019ff | |||
| 35fbae8ab9 | |||
| 23d336d63e | |||
| 39351e1131 | |||
| b7872da3ea | |||
| 5cc89b855a | |||
| 32e4f1ea4f | |||
| b664297c73 | |||
| ea60f17647 | |||
| 5ffa0ea2ec | |||
| 2a59fe8c84 | |||
| 8f79440219 | |||
| 128b05b8a8 | |||
| a9ee8eb66c | |||
| 5c7b6a7b41 | |||
| a569781691 | |||
| 0b75c30784 | |||
| ed2ae144dd |
@ -4,3 +4,4 @@ version = "0.1.0"
 edition = "2021"

 [dependencies]
+thiserror = "1.0.30"
439
README.md
@ -1,7 +1,440 @@
|
|||||||
# NEK-Lang
|
# NEK-Lang
|
||||||
|
## Table of contents
|
||||||
|
- [NEK-Lang](#nek-lang)
|
||||||
|
- [Table of contents](#table-of-contents)
|
||||||
|
- [Variables](#variables)
|
||||||
|
- [Declaration](#declaration)
|
||||||
|
- [Assignment](#assignment)
|
||||||
|
- [Datatypes](#datatypes)
|
||||||
|
- [I64](#i64)
|
||||||
|
- [String](#string)
|
||||||
|
- [Array](#array)
|
||||||
|
- [Expressions](#expressions)
|
||||||
|
- [General](#general)
|
||||||
|
- [Mathematical Operators](#mathematical-operators)
|
||||||
|
- [Bitwise Operators](#bitwise-operators)
|
||||||
|
- [Logical Operators](#logical-operators)
|
||||||
|
- [Equality & Relational Operators](#equality--relational-operators)
|
||||||
|
- [Control-Flow](#control-flow)
|
||||||
|
- [Loop](#loop)
|
||||||
|
- [If / Else](#if--else)
|
||||||
|
- [Block Scopes](#block-scopes)
|
||||||
|
- [Functions](#functions)
|
||||||
|
- [Function definition](#function-definition)
|
||||||
|
- [Function calls](#function-calls)
|
||||||
|
- [IO](#io)
|
||||||
|
- [Print](#print)
|
||||||
|
- [Comments](#comments)
|
||||||
|
- [Line comments](#line-comments)
|
||||||
|
- [Feature Tracker](#feature-tracker)
|
||||||
|
- [High level Components](#high-level-components)
|
||||||
|
- [Language features](#language-features)
|
||||||
|
- [Parsing Grammar](#parsing-grammar)
|
||||||
|
- [Expressions](#expressions-1)
|
||||||
|
- [Statements](#statements)
|
||||||
|
- [Examples](#examples)
|
||||||
|
- [Extras](#extras)
|
||||||
|
- [Visual Studio Code Language Support](#visual-studio-code-language-support)
|
||||||
|
|
||||||
|
## Variables
|
||||||
|
The variables are all contained in scopes. Variables defined in an outer scope can be accessed in
|
||||||
|
inner scopes. All variables defined in a scope that has ended no longer exist and can't be
|
||||||
|
accessed.
|
||||||
|
|
||||||
|
### Declaration
|
||||||
|
- Declare and initialize a new variable
|
||||||
|
- Declaring a previously declared variable again will shadow the previous variable
|
||||||
|
- Declaration is needed before assignment or other usage
|
||||||
|
- The variable name is on the left side of the `<-` operator
|
||||||
|
- The assigned value is on the right side and can be any expression
|
||||||
|
```
|
||||||
|
a <- 123;
|
||||||
|
```
|
||||||
|
Create a new variable named `a` and assign the value `123` to it.
|
||||||
|
|
||||||
|
### Assignment
|
||||||
|
- Assigning a value to a previously declared variable
|
||||||
|
- The variable name is on the left side of the `=` operator
|
||||||
|
- The assigned value is on the right side and can be any expression
|
||||||
|
```
|
||||||
|
a = 123;
|
||||||
|
```
|
||||||
|
The value `123` is assigned to the variable named `a`. `a` needs to be declared before this.
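
The difference between `<-` (declaration) and `=` (assignment) can be sketched in Rust with a stack of scopes. This is only an illustration under assumptions: the `Scopes` type and its methods are hypothetical names that do not come from the interpreter source, and values are limited to `i64` to keep the sketch short.

```rust
use std::collections::HashMap;

/// Hypothetical scope stack: every block pushes a new map of variables.
pub struct Scopes {
    stack: Vec<HashMap<String, i64>>,
}

impl Scopes {
    pub fn new() -> Self {
        // Start with a single outermost scope.
        Scopes { stack: vec![HashMap::new()] }
    }

    /// Entering a block `{ ... }` opens a new innermost scope.
    pub fn push(&mut self) {
        self.stack.push(HashMap::new());
    }

    /// Leaving a block drops every variable declared inside it.
    pub fn pop(&mut self) {
        if self.stack.len() > 1 {
            self.stack.pop();
        }
    }

    /// `name <- value;` creates the variable in the innermost scope and
    /// shadows any outer variable with the same name.
    pub fn declare(&mut self, name: &str, value: i64) {
        self.stack
            .last_mut()
            .expect("at least one scope")
            .insert(name.to_string(), value);
    }

    /// `name = value;` updates the nearest existing variable and fails
    /// if the name was never declared.
    pub fn assign(&mut self, name: &str, value: i64) -> Result<(), String> {
        for scope in self.stack.iter_mut().rev() {
            if let Some(slot) = scope.get_mut(name) {
                *slot = value;
                return Ok(());
            }
        }
        Err(format!("variable `{}` is not declared", name))
    }

    /// Look a variable up, innermost scope first.
    pub fn get(&self, name: &str) -> Option<i64> {
        self.stack.iter().rev().find_map(|scope| scope.get(name).copied())
    }
}
```

With this sketch, `assign("a", 1)` on a fresh `Scopes` returns an error, while `declare("a", 123)` followed by `assign("a", 7)` succeeds, which mirrors the rule that declaration is needed before assignment.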
|
||||||
|
|
||||||
|
## Datatypes
|
||||||
|
The available variable datatypes are `i64` (64-bit signed integer), `string` (`"this is a string"`) and `array` (`[10]`)
|
||||||
|
|
||||||
|
### I64
|
||||||
|
- The normal default datatype is `i64` which is a 64-bit signed integer
|
||||||
|
- Can be created by just writing an integer literal like `546`
|
||||||
|
- Inside a number literal, `_` can be inserted for visual separation, e.g. `100_000`
|
||||||
|
- The i64 values can be used as expected in calculations, conditions and so on
|
||||||
|
```
|
||||||
|
my_i64 <- 123_456;
|
||||||
|
```
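
As a rough illustration of how a literal with `_` separators could be turned into an `i64`, here is a minimal Rust sketch. The function name `parse_i64_literal` is hypothetical, and how the real lexer treats edge cases (for example a literal consisting only of `_`) is an assumption made here.

```rust
/// Parse an integer literal such as `123_456` into an `i64`.
/// Returns `None` for empty input, non-digit characters or overflow.
pub fn parse_i64_literal(text: &str) -> Option<i64> {
    // Drop the visual separators, then let the standard library parse the digits.
    let digits: String = text.chars().filter(|c| *c != '_').collect();
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    digits.parse::<i64>().ok()
}
```

For example, `parse_i64_literal("123_456")` yields `Some(123456)`, while a literal that does not fit into 64 bits yields `None`.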
|
||||||
|
|
||||||
|
### String
|
||||||
|
- Strings mainly exist for formatting the text output of a program
|
||||||
|
- Strings can be created by using double quotes, like in other languages: `"Hello world"`
|
||||||
|
- There is no way to access or change the characters of the string
|
||||||
|
- Unicode characters are supported `"Hello 🌎"`
|
||||||
|
- Escape characters `\n`, `\r`, `\t`, `\"`, `\\` are supported
|
||||||
|
- Strings can be assigned to variables, just like i64
|
||||||
|
```
|
||||||
|
world <- "🌎";
|
||||||
|
|
||||||
|
print "Hello ";
|
||||||
|
print world;
|
||||||
|
print "\n";
|
||||||
|
```
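
The escape sequences listed above could be resolved along these lines. This is a sketch, not the interpreter's code: the function name `unescape` is hypothetical, and rejecting unknown escapes with an error is an assumption.

```rust
/// Resolve `\n`, `\r`, `\t`, `\"` and `\\` inside the body of a string
/// literal (the text between the double quotes).
pub fn unescape(raw: &str) -> Result<String, String> {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            // Ordinary characters, including Unicode, are copied unchanged.
            out.push(c);
            continue;
        }
        // A backslash must be followed by one of the supported characters.
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('r') => out.push('\r'),
            Some('t') => out.push('\t'),
            Some('"') => out.push('"'),
            Some('\\') => out.push('\\'),
            Some(other) => return Err(format!("unknown escape `\\{}`", other)),
            None => return Err("dangling backslash at end of string".to_string()),
        }
    }
    Ok(out)
}
```

Because the function iterates over `char`s rather than bytes, a literal such as `"Hello 🌎\n"` keeps its Unicode content intact.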
|
||||||
|
|
||||||
|
### Array
|
||||||
|
- Arrays can contain any other datatypes and don't need to have the same type in all cells
|
||||||
|
- Arrays can be created by using brackets with the size in between `[size]`
|
||||||
|
- Arrays must be assigned to a variable in order to be used
|
||||||
|
- All cells will be initialized with i64 0 values
|
||||||
|
- The size can be any expression that results in a positive i64 value
|
||||||
|
- The array size can't be changed after creation
|
||||||
|
- The array's data is always allocated on the heap
|
||||||
|
- The array cells can be accessed by using the variable name and specifying the index in brackets
|
||||||
|
`my_arr[index]`
|
||||||
|
- The index can be any expression that results in a non-negative i64 value in the range of the array's
|
||||||
|
indices
|
||||||
|
- The indices start with 0
|
||||||
|
- When an array is passed to a function, it is passed by reference
|
||||||
|
```
|
||||||
|
width <- 5;
|
||||||
|
height <- 5;
|
||||||
|
|
||||||
|
// Create an array of size 25; all 25 cells are initialized with 0
|
||||||
|
my_array <- [width * height];
|
||||||
|
|
||||||
|
// Modify first value
|
||||||
|
my_array[0] = 5;
|
||||||
|
|
||||||
|
// Print first value
|
||||||
|
// Outputs `5`
|
||||||
|
print my_array[0];
|
||||||
|
```
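
The array rules above (fixed size, zero-initialized cells, mixed cell types, pass by reference) can be modelled in Rust with a shared, heap-allocated vector. The sketch below is an assumption about one possible representation: the `Value` enum, its variants and the helper functions are hypothetical names, and rejecting only negative sizes is a choice made for this example.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical runtime value; the variant names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    I64(i64),
    Str(Rc<str>),
    /// Shared cells: cloning the value only clones the handle,
    /// so callers and callees see the same data.
    Array(Rc<RefCell<Vec<Value>>>),
}

/// Create an array of `size` cells, each initialized with `I64(0)`.
pub fn new_array(size: i64) -> Result<Value, String> {
    if size < 0 {
        return Err(format!("invalid array size {}", size));
    }
    let cells = vec![Value::I64(0); size as usize];
    Ok(Value::Array(Rc::new(RefCell::new(cells))))
}

/// Read cell `index`, checking the bounds first.
pub fn array_get(array: &Value, index: i64) -> Result<Value, String> {
    match array {
        Value::Array(cells) => {
            let cells = cells.borrow();
            if index < 0 || index as usize >= cells.len() {
                return Err(format!("index {} is out of range", index));
            }
            Ok(cells[index as usize].clone())
        }
        _ => Err("value is not an array".to_string()),
    }
}

/// Write `value` into cell `index`, checking the bounds first.
pub fn array_set(array: &Value, index: i64, value: Value) -> Result<(), String> {
    match array {
        Value::Array(cells) => {
            let mut cells = cells.borrow_mut();
            if index < 0 || index as usize >= cells.len() {
                return Err(format!("index {} is out of range", index));
            }
            cells[index as usize] = value;
            Ok(())
        }
        _ => Err("value is not an array".to_string()),
    }
}
```

Cloning a `Value::Array` and writing through the clone changes the original as well, which is the behaviour described as "passed by reference" for function arguments.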
|
||||||
|
|
||||||
|
## Expressions
|
||||||
|
Operator precedence follows `C` for all implemented operators.
|
||||||
|
Refer to the
|
||||||
|
[C Operator Precedence Table](https://en.cppreference.com/w/c/language/operator_precedence)
|
||||||
|
to see the different precedences.
|
||||||
|
|
||||||
|
### General
|
||||||
|
- Parentheses `(` and `)` can be used to modify evaluation order just like in any other
|
||||||
|
programming language.
|
||||||
|
- For example `(a + b) * c` will evaluate the addition before the multiplication, despite the multiplication having higher binding power
|
||||||
|
|
||||||
|
### Mathematical Operators
|
||||||
|
Supported mathematical operations:
|
||||||
|
- Addition `a + b`
|
||||||
|
- Subtraction `a - b`
|
||||||
|
- Multiplication `a * b`
|
||||||
|
- Division `a / b`
|
||||||
|
- Modulo `a % b`
|
||||||
|
- Negation `-a`
|
||||||
|
|
||||||
|
### Bitwise Operators
|
||||||
|
- And `a & b`
|
||||||
|
- Or `a | b`
|
||||||
|
- Xor `a ^ b`
|
||||||
|
- Bitshift left (by `b` bits) `a << b`
|
||||||
|
- Bitshift right (by `b` bits) `a >> b`
|
||||||
|
- "Bit flip" (One's complement) `~a`
|
||||||
|
|
||||||
|
### Logical Operators
|
||||||
|
The logical operators evaluate the operands as `false` if they are equal to `0` and `true` if they are not equal to `0`.
|
||||||
|
Note that logical operators like AND / OR do not support short-circuit evaluation, so both sides of
|
||||||
|
the logical operation will be evaluated, even if it might not be necessary.
|
||||||
|
- And `a && b`
|
||||||
|
- Or `a || b`
|
||||||
|
- Not `!a` (if `a` is equal to `0`, the result is `1`, otherwise the result is `0`)
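
The missing short-circuit behaviour can be made visible with a small Rust sketch in which each operand is a closure. The function names are hypothetical, and normalising the result of `&&` and `||` to `1` or `0` is an assumption based on how `!` and the comparison operators are described.

```rust
/// `a && b` without short-circuiting: both operand closures always run.
pub fn eval_and(lhs: impl FnOnce() -> i64, rhs: impl FnOnce() -> i64) -> i64 {
    let l = lhs();
    let r = rhs(); // evaluated even if `l` is already 0
    (l != 0 && r != 0) as i64
}

/// `a || b` without short-circuiting: both operand closures always run.
pub fn eval_or(lhs: impl FnOnce() -> i64, rhs: impl FnOnce() -> i64) -> i64 {
    let l = lhs();
    let r = rhs(); // evaluated even if `l` is already non-zero
    (l != 0 || r != 0) as i64
}

/// `!a`: `1` if `a` is `0`, otherwise `0`.
pub fn eval_not(a: i64) -> i64 {
    (a == 0) as i64
}
```

In practice this means that an operand with side effects, such as a function call that prints, takes effect on both sides of `&&` and `||` regardless of the other operand's value.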
|
||||||
|
|
||||||
|
### Equality & Relational Operators
|
||||||
|
The equality and relational operations result in `1` if the condition is evaluated as `true` and in `0` if the condition is evaluated as `false`.
|
||||||
|
- Equality `a == b`
|
||||||
|
- Inequality `a != b`
|
||||||
|
- Greater than `a > b`
|
||||||
|
- Greater than or equal `a >= b`
|
||||||
|
- Less than `a < b`
|
||||||
|
- Less than or equal `a <= b`
|
||||||
|
|
||||||
|
## Control-Flow
|
||||||
|
For conditions, such as in `if` or `loop`, every non-zero value is treated as `true` and `0` is treated as `false`.
|
||||||
|
|
||||||
|
### Loop
|
||||||
|
- The `loop` keyword can be used as an infinite loop, as a while loop or as a while loop with
|
||||||
|
advancement (an expression that is executed after each loop)
|
||||||
|
- If only `loop` is used, directly followed by the body, it is an infinite loop that needs to be
|
||||||
|
terminated by using the `break` keyword
|
||||||
|
- The `loop` keyword can be followed by the condition (an expression) without needing parentheses
|
||||||
|
- *Optional:* If there is a `;` after the condition, there must be another expression which is used as the advancement
|
||||||
|
- The loop's body is wrapped in braces (`{ }`) just like in C/C++
|
||||||
|
- The `continue` keyword can be used to end the current loop iteration early
|
||||||
|
- The `break` keyword can be used to fully break out of the current loop
|
||||||
|
|
||||||
|
```
|
||||||
|
// Print the numbers from 0 to 9
|
||||||
|
|
||||||
|
// With endless loop
|
||||||
|
i <- 0;
|
||||||
|
loop {
|
||||||
|
if i >= 10 {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
print i;
|
||||||
|
i = i + 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Without advancement
|
||||||
|
i <- 0;
|
||||||
|
loop i < 10 {
|
||||||
|
print i;
|
||||||
|
i = i + 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
// With advancement
|
||||||
|
k <- 0;
|
||||||
|
loop k < 10; k = k + 1 {
|
||||||
|
print k;
|
||||||
|
}
|
||||||
|
```
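
For readers who know Rust, the three `loop` forms map roughly onto `loop` and `while`. The sketch below collects the numbers instead of printing them. The function names are hypothetical, and the mapping is an analogy rather than a description of how the interpreter executes loops.

```rust
/// `loop { ... }`: runs until an explicit `break`.
pub fn infinite_form() -> Vec<i64> {
    let mut printed = Vec::new();
    let mut i = 0;
    loop {
        if i >= 10 {
            break;
        }
        printed.push(i);
        i += 1;
    }
    printed
}

/// `loop i < 10 { ... }`: the condition is checked before every pass.
pub fn condition_form() -> Vec<i64> {
    let mut printed = Vec::new();
    let mut i = 0;
    while i < 10 {
        printed.push(i);
        i += 1;
    }
    printed
}

/// `loop k < 10; k = k + 1 { ... }`: the advancement runs after each pass.
pub fn advancement_form() -> Vec<i64> {
    let mut printed = Vec::new();
    let mut k = 0;
    while k < 10 {
        printed.push(k);
        k += 1; // plays the role of the advancement expression
    }
    printed
}
```

All three functions produce the numbers 0 to 9, matching the NEK example above.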
|
||||||
|
|
||||||
|
### If / Else
|
||||||
|
- The language supports `if` and an optional `else`
|
||||||
|
- After the `if` keyword must be the deciding condition, parentheses are not needed
|
||||||
|
- The blocks are wrapped in braces (`{ }`)
|
||||||
|
- *Optional:* If there is an `else` after the *if-block*, there must be a following *if-false*, aka. else block
|
||||||
|
- NOTE: Logical operators like AND / OR do not support short-circuit evaluation, so both sides of
|
||||||
|
the logical operations will be evaluated, even if it might not be necessary
|
||||||
|
```
|
||||||
|
a <- 1;
|
||||||
|
b <- 2;
|
||||||
|
if a == b {
|
||||||
|
// a is equal to b
|
||||||
|
print 1;
|
||||||
|
} else {
|
||||||
|
// a is not equal to b
|
||||||
|
print 0;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Block Scopes
|
||||||
|
|
||||||
|
- It is possible to create a limited scope for local variables that will no longer exist once the
|
||||||
|
scope ends
|
||||||
|
- Shadowing variables by redefining a variable in an inner scope is supported
|
||||||
|
```
|
||||||
|
var_in_outer_scope <- 5;
|
||||||
|
{
|
||||||
|
var_in_inner_scope <- 3;
|
||||||
|
|
||||||
|
// Inner scope can access both vars
|
||||||
|
print var_in_outer_scope;
|
||||||
|
print var_in_inner_scope;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Outer scope is still valid
|
||||||
|
print var_in_outer_scope;
|
||||||
|
|
||||||
|
// !!! THIS DOES NOT WORK !!!
|
||||||
|
// The inner scope has ended
|
||||||
|
print var_in_inner_scope;
|
||||||
|
```
|
||||||
|
|
||||||
|
## Functions
|
||||||
|
|
||||||
|
### Function definition
|
||||||
|
- Functions can be defined by using the `fun` keyword, followed by the function name and the
|
||||||
|
parameters in parentheses. After the parentheses, the body is specified inside a braces block
|
||||||
|
- The function parameters are specified by only their names
|
||||||
|
- The function body has its own scope
|
||||||
|
- Parameters are only accessible inside the body
|
||||||
|
- Variables from the outer scope can be accessed and modified if they are defined before the function
|
||||||
|
- Variables from the outer scope are shadowed by parameters or local variables with the same name
|
||||||
|
- The `return` keyword can be used to return a value from the function and exit it immediately
|
||||||
|
- If no return is specified, a special `void` value is returned. That value can't be used in
|
||||||
|
calculations or comparisons, but can be stored in a variable (even though that is rarely useful)
|
||||||
|
- Functions can only be defined at the top-level. So defining a function inside of any other scoped
|
||||||
|
block (like inside another function, if, loop, ...) is invalid
|
||||||
|
- Functions can only be used after definition and there is no forward declaration right now
|
||||||
|
- However, a function can call itself recursively
|
||||||
|
- Functions can't be redefined, so defining a function with an existing name is invalid
|
||||||
|
```
|
||||||
|
fun add_maybe(a, b) {
|
||||||
|
if a < 100 {
|
||||||
|
return a;
|
||||||
|
} else {
|
||||||
|
return a + b;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fun println(val) {
|
||||||
|
print val;
|
||||||
|
print "\n";
|
||||||
|
}
|
||||||
|
```
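
The special `void` value can be pictured as an extra variant of the value type that arithmetic refuses to accept. The following Rust sketch restates the two functions above under that assumption. `ReturnValue`, `println_nek` and `add` are hypothetical names, and reporting integer overflow as an error is a choice made for this example only.

```rust
/// Hypothetical result of a function call; the names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ReturnValue {
    I64(i64),
    /// Produced when the body finishes without `return`.
    Void,
}

/// Rust rendering of `add_maybe`: every path returns a value.
pub fn add_maybe(a: i64, b: i64) -> ReturnValue {
    if a < 100 {
        ReturnValue::I64(a)
    } else {
        ReturnValue::I64(a + b)
    }
}

/// Rust rendering of `println`: it prints and falls off the end of the body.
pub fn println_nek(val: i64) -> ReturnValue {
    println!("{}", val);
    ReturnValue::Void
}

/// Addition accepts only integers, so a `Void` operand is rejected.
pub fn add(lhs: ReturnValue, rhs: ReturnValue) -> Result<ReturnValue, String> {
    match (lhs, rhs) {
        (ReturnValue::I64(a), ReturnValue::I64(b)) => a
            .checked_add(b)
            .map(ReturnValue::I64)
            .ok_or_else(|| "integer overflow".to_string()),
        _ => Err("`void` can't be used in calculations".to_string()),
    }
}
```

Storing the result of `println_nek` in a variable is fine, but passing it to `add` yields an error, which corresponds to the restriction described in the list above.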
|
||||||
|
|
||||||
|
### Function calls
|
||||||
|
- Function calls are primary expressions, so they can be directly used in calculations (if they
|
||||||
|
return appropriate values)
|
||||||
|
- Function calls are performed by writing the function name, followed by the arguments in parentheses
|
||||||
|
- The arguments can be any expressions, separated by commas
|
||||||
|
```
|
||||||
|
b <- 100;
|
||||||
|
result <- add_maybe(250, b);
|
||||||
|
|
||||||
|
// Prints 350 + new-line
|
||||||
|
println(result);
|
||||||
|
```
|
||||||
|
|
||||||
|
## IO
|
||||||
|
|
||||||
|
### Print
|
||||||
|
Printing is implemented via the `print` keyword
|
||||||
|
- The `print` keyword is followed by an expression, the value of which will be printed to the terminal
|
||||||
|
- To add a line break a string print can be used `print "\n";`
|
||||||
|
```
|
||||||
|
a <- 1;
|
||||||
|
// Outputs `1` to the terminal
|
||||||
|
print a;
|
||||||
|
|
||||||
|
// Outputs a new-line to the terminal
|
||||||
|
print "\n";
|
||||||
|
```
|
||||||
|
|
||||||
|
## Comments
|
||||||
|
|
||||||
|
### Line comments
|
||||||
|
Line comments can be initiated by using `//`
|
||||||
|
- Everything after `//` up to the end of the current line is ignored and not parsed
|
||||||
|
```
|
||||||
|
// This is a comment
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
# Feature Tracker
|
||||||
|
|
||||||
## High level Components
|
## High level Components
|
||||||
|
|
||||||
- [ ] Lexer: Transforms text into Tokens
|
- [x] Lexer: Transforms text into Tokens
|
||||||
- [ ] Parser: Transforms Tokens into Abstract Syntax Tree
|
- [x] Parser: Transforms Tokens into Abstract Syntax Tree
|
||||||
- [ ] Interpreter (tree-walk-interpreter): Walks the tree and evaluates the expressions / statements
|
- [x] Interpreter (tree-walk-interpreter): Walks the tree and evaluates the expressions / statements
|
||||||
|
- [x] Simple optimizer: Apply trivial optimizations to the Ast
|
||||||
|
- [x] Precalculate binary ops / unary ops that have only literal operands
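
Precalculating operations whose operands are all literals is commonly called constant folding. The Rust sketch below shows the idea on a deliberately tiny expression type. `Expr` and `fold` are hypothetical and much smaller than the project's real Ast, and the use of wrapping arithmetic is an assumption.

```rust
/// Hypothetical miniature expression tree, not the project's Ast.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Lit(i64),
    Var(String),
    Neg(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Fold bottom-up: operands are simplified first, then the node itself
/// is replaced by a literal if all of its operands are literals.
pub fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Neg(inner) => match fold(*inner) {
            Expr::Lit(v) => Expr::Lit(v.wrapping_neg()),
            other => Expr::Neg(Box::new(other)),
        },
        Expr::Add(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Lit(a), Expr::Lit(b)) => Expr::Lit(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(lhs, rhs) => match (fold(*lhs), fold(*rhs)) {
            (Expr::Lit(a), Expr::Lit(b)) => Expr::Lit(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        // Literals and variables are already as simple as they can be.
        other => other,
    }
}
```

Folding `(2 + 3) * x` leaves a multiplication of the literal `5` with the variable `x`, so the addition no longer has to be evaluated at run time.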
|
||||||
|
|
||||||
|
## Language features
|
||||||
|
|
||||||
|
- [x] General expressions
|
||||||
|
- [x] Arithmetic operations
|
||||||
|
- [x] Addition `a + b`
|
||||||
|
- [x] Subtraction `a - b`
|
||||||
|
- [x] Multiplication `a * b`
|
||||||
|
- [x] Division `a / b`
|
||||||
|
- [x] Modulo `a % b`
|
||||||
|
- [x] Negate `-a`
|
||||||
|
- [x] Parentheses `(a + b) * c`
|
||||||
|
- [x] Logical boolean operators
|
||||||
|
- [x] Equal `a == b`
|
||||||
|
- [x] Not equal `a != b`
|
||||||
|
- [x] Greater than `a > b`
|
||||||
|
- [x] Less than `a < b`
|
||||||
|
- [x] Greater than or equal `a >= b`
|
||||||
|
- [x] Less than or equal `a <= b`
|
||||||
|
- [x] Logical operators
|
||||||
|
- [x] And `a && b`
|
||||||
|
- [x] Or `a || b`
|
||||||
|
- [x] Not `!a`
|
||||||
|
- [x] Bitwise operators
|
||||||
|
- [x] Bitwise AND `a & b`
|
||||||
|
- [x] Bitwise OR `a | b`
|
||||||
|
- [x] Bitwise XOR `a ^ b`
|
||||||
|
- [x] Bitwise NOT `~a`
|
||||||
|
- [x] Bitwise left shift `a << b`
|
||||||
|
- [x] Bitwise right shift `a >> b`
|
||||||
|
- [x] Variables
|
||||||
|
- [x] Declaration
|
||||||
|
- [x] Assignment
|
||||||
|
- [x] Local variables (for example inside loop, if, else, functions)
|
||||||
|
- [x] Scoped block for specific local vars `{ ... }`
|
||||||
|
- [x] Statements with semicolon & Multiline programs
|
||||||
|
- [x] Control flow
|
||||||
|
- [x] Loops
|
||||||
|
- [x] While-style loop `loop X { ... }`
|
||||||
|
- [x] For-style loop with `X` as condition and `Y` as advancement `loop X; Y { ... }`
|
||||||
|
- [x] Infinite loop `loop { ... }`
|
||||||
|
- [x] Break `break`
|
||||||
|
- [x] Continue `continue`
|
||||||
|
- [x] If else statement `if X { ... } else { ... }`
|
||||||
|
- [x] If Statement
|
||||||
|
- [x] Else statement
|
||||||
|
- [x] Line comments `//`
|
||||||
|
- [x] Strings
|
||||||
|
- [x] Arrays
|
||||||
|
- [x] Creating array with size `X` as a variable `arr <- [X]`
|
||||||
|
- [x] Accessing arrays by index `arr[X]`
|
||||||
|
- [x] IO Intrinsics
|
||||||
|
- [x] Print
|
||||||
|
- [x] Functions
|
||||||
|
- [x] Function declaration `fun f(X, Y, Z) { ... }`
|
||||||
|
- [x] Function calls `f(1, 2, 3)`
|
||||||
|
- [x] Function returns `return X`
|
||||||
|
- [x] Local variables
|
||||||
|
- [x] Pass arrays by-reference, i64 by-value, string is a const ref
|
||||||
|
|
||||||
|
# Parsing Grammar
|
||||||
|
|
||||||
|
## Expressions
|
||||||
|
```
|
||||||
|
ARRAY_LITERAL = "[" expr "]"
|
||||||
|
ARRAY_ACCESS = IDENT "[" expr "]"
|
||||||
|
FUN_CALL = IDENT "(" (expr ",")* expr? ")"
|
||||||
|
LITERAL = I64_LITERAL | STR_LITERAL | ARRAY_LITERAL
|
||||||
|
expr_primary = LITERAL | IDENT | FUN_CALL | ARRAY_ACCESS | "(" expr ")" | "-" expr_primary
|
||||||
|
| "~" expr_primary | "!" expr_primary
|
||||||
|
expr_mul = expr_primary (("*" | "/" | "%") expr_primary)*
|
||||||
|
expr_add = expr_mul (("+" | "-") expr_mul)*
|
||||||
|
expr_shift = expr_add ((">>" | "<<") expr_add)*
|
||||||
|
expr_rel = expr_shift ((">" | ">=" | "<" | "<=") expr_shift)*
|
||||||
|
expr_equ = expr_rel (("==" | "!=") expr_rel)*
|
||||||
|
expr_band = expr_equ ("&" expr_equ)*
|
||||||
|
expr_bxor = expr_band ("^" expr_band)*
|
||||||
|
expr_bor = expr_bxor ("|" expr_bxor)*
|
||||||
|
expr_land = expr_bor ("&&" expr_bor)*
|
||||||
|
expr_lor = expr_land ("||" expr_land)*
|
||||||
|
expr = expr_lor
|
||||||
|
```
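
Each grammar rule above maps naturally to one function of a recursive-descent parser, with lower-precedence rules calling higher-precedence ones. The Rust sketch below covers only `expr_add`, `expr_mul` and a reduced `expr_primary` (integer literals, parentheses, unary minus) and evaluates directly instead of building an Ast. The `Parser` type and `eval` function are hypothetical, and treating division by zero as an error is an assumption.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical evaluator for a subset of the expression grammar.
pub struct Parser<'a> {
    chars: Peekable<Chars<'a>>,
}

impl<'a> Parser<'a> {
    pub fn new(src: &'a str) -> Self {
        Parser { chars: src.chars().peekable() }
    }

    fn skip_ws(&mut self) {
        while matches!(self.chars.peek(), Some(c) if c.is_whitespace()) {
            self.chars.next();
        }
    }

    /// Consume `op` if it is the next non-whitespace character.
    fn eat(&mut self, op: char) -> bool {
        self.skip_ws();
        if self.chars.peek() == Some(&op) {
            self.chars.next();
            true
        } else {
            false
        }
    }

    /// expr_primary = I64_LITERAL | "(" expr ")" | "-" expr_primary
    fn expr_primary(&mut self) -> Result<i64, String> {
        if self.eat('-') {
            return Ok(self.expr_primary()?.wrapping_neg());
        }
        if self.eat('(') {
            let value = self.expr_add()?;
            return if self.eat(')') { Ok(value) } else { Err("expected `)`".into()) };
        }
        // Collect digits and skip `_` separators.
        let mut digits = String::new();
        while let Some(&c) = self.chars.peek() {
            if c.is_ascii_digit() {
                digits.push(c);
            } else if c != '_' {
                break;
            }
            self.chars.next();
        }
        digits.parse().map_err(|_| "expected an integer literal".to_string())
    }

    /// expr_mul = expr_primary (("*" | "/" | "%") expr_primary)*
    fn expr_mul(&mut self) -> Result<i64, String> {
        let mut lhs = self.expr_primary()?;
        loop {
            if self.eat('*') {
                lhs = lhs.wrapping_mul(self.expr_primary()?);
            } else if self.eat('/') {
                let rhs = self.expr_primary()?;
                lhs = lhs.checked_div(rhs).ok_or("division by zero or overflow")?;
            } else if self.eat('%') {
                let rhs = self.expr_primary()?;
                lhs = lhs.checked_rem(rhs).ok_or("division by zero or overflow")?;
            } else {
                return Ok(lhs);
            }
        }
    }

    /// expr_add = expr_mul (("+" | "-") expr_mul)*
    fn expr_add(&mut self) -> Result<i64, String> {
        let mut lhs = self.expr_mul()?;
        loop {
            if self.eat('+') {
                lhs = lhs.wrapping_add(self.expr_mul()?);
            } else if self.eat('-') {
                lhs = lhs.wrapping_sub(self.expr_mul()?);
            } else {
                return Ok(lhs);
            }
        }
    }
}

/// Evaluate a complete expression and reject trailing input.
pub fn eval(src: &str) -> Result<i64, String> {
    let mut parser = Parser::new(src);
    let value = parser.expr_add()?;
    parser.skip_ws();
    match parser.chars.peek() {
        None => Ok(value),
        Some(c) => Err(format!("unexpected `{}`", c)),
    }
}
```

Because `expr_add` only ever sees complete `expr_mul` results, `1 + 2 * 3` evaluates to `7`, while `(1 + 2) * 3` evaluates to `9`. The remaining rules (`expr_shift` up to `expr_lor`) would follow the same pattern, one function per precedence level.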
|
||||||
|
|
||||||
|
## Statements
|
||||||
|
```
|
||||||
|
stmt_return = "return" expr ";"
|
||||||
|
stmt_break = "break" ";"
|
||||||
|
stmt_continue = "continue" ";"
|
||||||
|
stmt_var_decl = IDENT "<-" expr ";"
|
||||||
|
stmt_fun_decl = "fun" IDENT "(" (IDENT ",")* IDENT? ")" "{" stmt* "}"
|
||||||
|
stmt_expr = expr ";"
|
||||||
|
stmt_block = "{" stmt* "}"
|
||||||
|
stmt_loop = "loop" (expr (";" expr)?)? "{" stmt* "}"
|
||||||
|
stmt_if = "if" expr "{" stmt* "}" ("else" "{" stmt* "}")?
|
||||||
|
stmt_print = "print" expr ";"
|
||||||
|
stmt = stmt_return | stmt_break | stmt_continue | stmt_var_decl | stmt_fun_decl
|
||||||
|
| stmt_expr | stmt_block | stmt_loop | stmt_if | stmt_print
|
||||||
|
```
|
||||||
|
|
||||||
|
# Examples
|
||||||
|
There are a bunch of examples in the [examples](examples/) directory. Those include (non-optimal) solutions to the first five Project Euler problems, as well as a [simple Game of Life implementation](examples/game_of_life.nek).
|
||||||
|
|
||||||
|
To run an example via `cargo-run`, use:
|
||||||
|
```
|
||||||
|
cargo run --release -- examples/[NAME]
|
||||||
|
```
|
||||||
|
|
||||||
|
# Extras
|
||||||
|
## Visual Studio Code Language Support
|
||||||
|
A VSCode extension that provides simple syntax highlighting for nek is also available on
|
||||||
|
[GitLab](https://code.fbi.h-da.de/advanced-systems-programming-ws21/x4/nek-lang-vscode). Since this
|
||||||
|
is a very small scale project, the extension was not published and instructions on how to install it
|
||||||
|
can be found in the mentioned repository.
|
||||||
15
examples/euler1.nek
Normal file
@ -0,0 +1,15 @@
|
|||||||
|
// If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9.
|
||||||
|
// The sum of these multiples is 23.
|
||||||
|
// Find the sum of all the multiples of 3 or 5 below 1000.
|
||||||
|
//
|
||||||
|
// Correct Answer: 233168
|
||||||
|
|
||||||
|
sum <- 0;
|
||||||
|
i <- 0;
|
||||||
|
loop i < 1_000; i = i + 1 {
|
||||||
|
if i % 3 == 0 || i % 5 == 0 {
|
||||||
|
sum = sum + i;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
print sum;
|
||||||
24
examples/euler2.nek
Normal file
@ -0,0 +1,24 @@
|
|||||||
|
// Each new term in the Fibonacci sequence is generated by adding the previous two terms.
|
||||||
|
// By starting with 1 and 2, the first 10 terms will be:
|
||||||
|
// 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
|
||||||
|
// By considering the terms in the Fibonacci sequence whose values do not exceed four million,
|
||||||
|
// find the sum of the even-valued terms.
|
||||||
|
//
|
||||||
|
// Correct Answer: 4613732
|
||||||
|
|
||||||
|
sum <- 0;
|
||||||
|
|
||||||
|
a <- 0;
|
||||||
|
b <- 1;
|
||||||
|
loop a < 4_000_000 {
|
||||||
|
if a % 2 == 0 {
|
||||||
|
sum = sum + a;
|
||||||
|
}
|
||||||
|
|
||||||
|
tmp <- a;
|
||||||
|
a = b;
|
||||||
|
b = b + tmp;
|
||||||
|
}
|
||||||
|
|
||||||
|
print sum;
|
||||||
|
|
||||||
29
examples/euler3.nek
Normal file
@ -0,0 +1,29 @@
|
|||||||
|
// The prime factors of 13195 are 5, 7, 13 and 29.
|
||||||
|
// What is the largest prime factor of the number 600851475143 ?
|
||||||
|
//
|
||||||
|
// Correct Answer: 6857
|
||||||
|
|
||||||
|
number <- 600_851_475_143;
|
||||||
|
result <- 0;
|
||||||
|
|
||||||
|
div <- 2;
|
||||||
|
|
||||||
|
loop number > 1 {
|
||||||
|
loop number % div == 0 {
|
||||||
|
if div > result {
|
||||||
|
result = div;
|
||||||
|
}
|
||||||
|
number = number / div;
|
||||||
|
}
|
||||||
|
|
||||||
|
div = div + 1;
|
||||||
|
if div * div > number {
|
||||||
|
if number > 1 && number > result {
|
||||||
|
result = number;
|
||||||
|
}
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
print result;
|
||||||
31
examples/euler4.nek
Normal file
@ -0,0 +1,31 @@
|
|||||||
|
// A palindromic number reads the same both ways. The largest palindrome made from the product of
|
||||||
|
// two 2-digit numbers is 9009 = 91 × 99.
|
||||||
|
// Find the largest palindrome made from the product of two 3-digit numbers.
|
||||||
|
//
|
||||||
|
// Correct Answer: 906609
|
||||||
|
|
||||||
|
fun reverse(n) {
|
||||||
|
rev <- 0;
|
||||||
|
loop n {
|
||||||
|
rev = rev * 10 + n % 10;
|
||||||
|
n = n / 10;
|
||||||
|
}
|
||||||
|
return rev;
|
||||||
|
}
|
||||||
|
|
||||||
|
res <- 0;
|
||||||
|
|
||||||
|
i <- 100;
|
||||||
|
loop i < 1_000; i = i + 1 {
|
||||||
|
k <- i;
|
||||||
|
loop k < 1_000; k = k + 1 {
|
||||||
|
num <- i * k;
|
||||||
|
num_rev <- reverse(num);
|
||||||
|
|
||||||
|
if num == num_rev && num > res {
|
||||||
|
res = num;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
print res;
|
||||||
24
examples/euler4.py
Normal file
@ -0,0 +1,24 @@
# A palindromic number reads the same both ways. The largest palindrome made from the product of
# two 2-digit numbers is 9009 = 91 × 99.
# Find the largest palindrome made from the product of two 3-digit numbers.
#
# Correct Answer: 906609

def reverse(n):
    rev = 0
    while n:
        rev = rev * 10 + n % 10
        n //= 10
    return rev

res = 0

for i in range(100, 1_000):
    for k in range(i, 1_000):
        num = i * k
        num_rev = reverse(num)

        if num == num_rev and num > res:
            res = num

print(res)
23
examples/euler5.nek
Normal file
@ -0,0 +1,23 @@
|
|||||||
|
// 2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.
|
||||||
|
// What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?
|
||||||
|
//
|
||||||
|
// Correct Answer: 232_792_560
|
||||||
|
|
||||||
|
fun gcd(x, y) {
|
||||||
|
loop y {
|
||||||
|
tmp <- x;
|
||||||
|
x = y;
|
||||||
|
y = tmp % y;
|
||||||
|
}
|
||||||
|
|
||||||
|
return x;
|
||||||
|
}
|
||||||
|
|
||||||
|
result <- 1;
|
||||||
|
|
||||||
|
i <- 1;
|
||||||
|
loop i <= 20; i = i + 1 {
|
||||||
|
result = result * (i / gcd(i, result));
|
||||||
|
}
|
||||||
|
|
||||||
|
print result;
|
||||||
15
examples/euler5.py
Normal file
@ -0,0 +1,15 @@
# 2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.
# What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?
#
# Correct Answer: 232_792_560

def gcd(x, y):
    while y:
        x, y = y, x % y
    return x

result = 1
for i in range(1, 21):
    result *= i // gcd(i, result)

print(result)
134
examples/game_of_life.nek
Normal file
@ -0,0 +1,134 @@
|
|||||||
|
fun print_field(field, width, height) {
|
||||||
|
y <- 0;
|
||||||
|
loop y < height; y = y+1 {
|
||||||
|
x <- 0;
|
||||||
|
loop x < width; x = x+1 {
|
||||||
|
if field[y*width + x] {
|
||||||
|
print "# ";
|
||||||
|
} else {
|
||||||
|
print ". ";
|
||||||
|
}
|
||||||
|
}
|
||||||
|
print "\n";
|
||||||
|
}
|
||||||
|
print "\n";
|
||||||
|
}
|
||||||
|
|
||||||
|
fun count_neighbours(field, x, y, width, height) {
|
||||||
|
neighbours <- 0;
|
||||||
|
if y > 0 {
|
||||||
|
if x > 0 {
|
||||||
|
if field[(y-1)*width + (x-1)] {
|
||||||
|
// Top left
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if field[(y-1)*width + x] {
|
||||||
|
// Top
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
if x < width-1 {
|
||||||
|
if field[(y-1)*width + (x+1)] {
|
||||||
|
// Top right
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if x > 0 {
|
||||||
|
if field[y*width + (x-1)] {
|
||||||
|
// Left
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if x < width-1 {
|
||||||
|
if field[y*width + (x+1)] {
|
||||||
|
// Right
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
if y < height-1 {
|
||||||
|
if x > 0 {
|
||||||
|
if field[(y+1)*width + (x-1)] {
|
||||||
|
// Bottom left
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if field[(y+1)*width + x] {
|
||||||
|
// Bottom
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
if x < width-1 {
|
||||||
|
if field[(y+1)*width + (x+1)] {
|
||||||
|
// Bottom right
|
||||||
|
neighbours = neighbours + 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return neighbours;
|
||||||
|
}
|
||||||
|
|
||||||
|
fun copy(from, to, len) {
|
||||||
|
i <- 0;
|
||||||
|
loop i < len; i = i + 1 {
|
||||||
|
to[i] = from[i];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Set the width and height of the field
|
||||||
|
width <- 10;
|
||||||
|
height <- 10;
|
||||||
|
|
||||||
|
// Create the main and temporary field
|
||||||
|
field <- [width*height];
|
||||||
|
field2 <- [width*height];
|
||||||
|
|
||||||
|
// Preset the main field with a glider
|
||||||
|
field[1] = 1;
|
||||||
|
field[12] = 1;
|
||||||
|
field[20] = 1;
|
||||||
|
field[21] = 1;
|
||||||
|
field[22] = 1;
|
||||||
|
|
||||||
|
fun run_gol(num_rounds) {
|
||||||
|
runs <- 0;
|
||||||
|
loop runs < num_rounds; runs = runs + 1 {
|
||||||
|
// Print the field
|
||||||
|
print_field(field, width, height);
|
||||||
|
|
||||||
|
// Calculate next stage from field and store into field2
|
||||||
|
y <- 0;
|
||||||
|
loop y < height; y = y+1 {
|
||||||
|
x <- 0;
|
||||||
|
loop x < width; x = x+1 {
|
||||||
|
|
||||||
|
// Get the neighbours of the current cell
|
||||||
|
neighbours <- count_neighbours(field, x, y, width, height);
|
||||||
|
|
||||||
|
// Set the new cell according to the neighbour count
|
||||||
|
if neighbours < 2 || neighbours > 3 {
|
||||||
|
field2[y*width + x] = 0;
|
||||||
|
} else {
|
||||||
|
if neighbours == 3 {
|
||||||
|
field2[y*width + x] = 1;
|
||||||
|
} else {
|
||||||
|
field2[y*width + x] = field[y*width + x];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Transfer from field2 to field
|
||||||
|
copy(field2, field, width*height);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
run_gol(32);
|
||||||
9
examples/recursive_fib.nek
Normal file
@ -0,0 +1,9 @@
|
|||||||
|
fun fib(n) {
|
||||||
|
if n <= 1 {
|
||||||
|
return n;
|
||||||
|
} else {
|
||||||
|
return fib(n-1) + fib(n-2);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
print fib(30);
|
||||||
6
examples/recursive_fib.py
Normal file
@ -0,0 +1,6 @@
def fib(n):
    if n <= 1:
        return n
    return fib(n-1) + fib(n-2)

print(fib(30))
31
examples/test_functions.nek
Normal file
@ -0,0 +1,31 @@
|
|||||||
|
fun square(a) {
|
||||||
|
return a * a;
|
||||||
|
}
|
||||||
|
|
||||||
|
fun add(a, b) {
|
||||||
|
return a + b;
|
||||||
|
}
|
||||||
|
|
||||||
|
fun mul(a, b) {
|
||||||
|
return a * b;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Function with multiple args & nested calls to different functions
|
||||||
|
fun addmul(a, b, c) {
|
||||||
|
return mul(add(a, b), c);
|
||||||
|
}
|
||||||
|
|
||||||
|
a <- 10;
|
||||||
|
b <- 20;
|
||||||
|
c <- 3;
|
||||||
|
|
||||||
|
result <- addmul(a, b, c) + square(c);
|
||||||
|
|
||||||
|
// Access and modify an outer variable. The parameter `a` shadows the outer variable `a`
|
||||||
|
fun sub_from_result(a) {
|
||||||
|
result = result - a;
|
||||||
|
}
|
||||||
|
|
||||||
|
sub_from_result(30);
|
||||||
|
|
||||||
|
print result;
|
||||||
211
src/ast.rs
Normal file
@ -0,0 +1,211 @@
|
|||||||
|
use std::rc::Rc;
|
||||||
|
|
||||||
|
use crate::stringstore::{Sid, StringStore};
|
||||||
|
|
||||||
|
/// Types for binary operations
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub enum BinOpType {
|
||||||
|
/// Addition ("+")
|
||||||
|
Add,
|
||||||
|
|
||||||
|
/// Subtraction ("-")
|
||||||
|
Sub,
|
||||||
|
|
||||||
|
/// Multiplication ("*")
|
||||||
|
Mul,
|
||||||
|
|
||||||
|
/// Division ("/")
|
||||||
|
Div,
|
||||||
|
|
||||||
|
/// Modulo / Remainder ("%")
|
||||||
|
Mod,
|
||||||
|
|
||||||
|
/// Compare Equal ("==")
|
||||||
|
EquEqu,
|
||||||
|
|
||||||
|
/// Compare Not Equal ("!=")
|
||||||
|
NotEqu,
|
||||||
|
|
||||||
|
/// Compare Less than ("<")
|
||||||
|
Less,
|
||||||
|
|
||||||
|
/// Compare Less than or Equal ("<=")
|
||||||
|
LessEqu,
|
||||||
|
|
||||||
|
/// Compare Greater than (">")
|
||||||
|
Greater,
|
||||||
|
|
||||||
|
/// Compare Greater than or Equal (">=")
|
||||||
|
GreaterEqu,
|
||||||
|
|
||||||
|
/// Bitwise Or ("|")
|
||||||
|
BOr,
|
||||||
|
|
||||||
|
/// Bitwise And ("&")
|
||||||
|
BAnd,
|
||||||
|
|
||||||
|
/// Bitwise Xor / Exclusive Or ("^")
|
||||||
|
BXor,
|
||||||
|
|
||||||
|
/// Logical And ("&&")
|
||||||
|
LAnd,
|
||||||
|
|
||||||
|
/// Logical Or ("||")
|
||||||
|
LOr,
|
||||||
|
|
||||||
|
/// Bitwise Shift Left ("<<")
|
||||||
|
Shl,
|
||||||
|
|
||||||
|
/// Bitwise Shift Right (">>")
|
||||||
|
Shr,
|
||||||
|
|
||||||
|
/// Assign value to variable ("=")
|
||||||
|
Assign,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Types for unary operations
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub enum UnOpType {
|
||||||
|
/// Unary Negation ("-")
|
||||||
|
Negate,
|
||||||
|
|
||||||
|
/// Bitwise Not / Bitflip ("~")
|
||||||
|
BNot,
|
||||||
|
|
||||||
|
/// Logical Not ("!")
|
||||||
|
LNot,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Ast Node for possible Expression variants
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub enum Expression {
|
||||||
|
/// Integer literal (64-bit)
|
||||||
|
I64(i64),
|
||||||
|
/// String literal
|
||||||
|
String(Sid),
|
||||||
|
|
||||||
|
/// Array with size as an expression
|
||||||
|
ArrayLiteral(Box<Expression>),
|
||||||
|
/// Array access with name, stackpos and position as expression
|
||||||
|
ArrayAccess(Sid, usize, Box<Expression>),
|
||||||
|
|
||||||
|
/// Function call with name, stackpos and the arguments as a vec of expressions
|
||||||
|
FunCall(Sid, usize, Vec<Expression>),
|
||||||
|
|
||||||
|
/// Variable with name and the stackpos from behind. This means that stackpos 0 refers to the
|
||||||
|
/// last variable on the stack and not the first
|
||||||
|
Var(Sid, usize),
|
||||||
|
/// Binary operation. Consists of type, left hand side and right hand side
|
||||||
|
BinOp(BinOpType, Box<Expression>, Box<Expression>),
|
||||||
|
/// Unary operation. Consists of type and operand
|
||||||
|
UnOp(UnOpType, Box<Expression>),
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Ast Node for a loop
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub struct Loop {
|
||||||
|
/// The condition that determines if the loop should continue
|
||||||
|
pub condition: Option<Expression>,
|
||||||
|
/// This is executed after each loop to advance the condition variables
|
||||||
|
pub advancement: Option<Expression>,
|
||||||
|
/// The loop body that is executed each loop
|
||||||
|
pub body: BlockScope,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Ast Node for an if
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub struct If {
|
||||||
|
/// The condition
|
||||||
|
pub condition: Expression,
|
||||||
|
/// The body that is executed when condition is true
|
||||||
|
pub body_true: BlockScope,
|
||||||
|
/// The if body that is executed when the condition is false
|
||||||
|
pub body_false: BlockScope,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Ast Node for a function declaration
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub struct FunDecl {
|
||||||
|
/// The function name as StringID, stored in the stringstore
|
||||||
|
pub name: Sid,
|
||||||
|
/// The absolute position on the function stack where the function is stored
|
||||||
|
pub fun_stackpos: usize,
|
||||||
|
/// The argument names as StringIDs
|
||||||
|
pub argnames: Vec<Sid>,
|
||||||
|
/// The function body
|
||||||
|
pub body: Rc<BlockScope>,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Ast Node for a variable declaration
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub struct VarDecl {
|
||||||
|
/// The variable name as StringID, stored in the stringstore
|
||||||
|
pub name: Sid,
|
||||||
|
/// The absolute position on the variable stack where the variable is stored
|
||||||
|
pub var_stackpos: usize,
|
||||||
|
/// The right hand side that generates the initial value for the variable
|
||||||
|
pub rhs: Expression,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Ast Node for the possible Statement variants
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub enum Statement {
|
||||||
|
/// Return from a function with the given result value as an expression
|
||||||
|
Return(Expression),
|
||||||
|
/// Break out of the current loop
|
||||||
|
Break,
|
||||||
|
/// End the current loop iteration early and continue with the next loop iteration
|
||||||
|
Continue,
|
||||||
|
/// A variable declaration
|
||||||
|
Declaration(VarDecl),
|
||||||
|
/// A function declaration
|
||||||
|
FunDeclare(FunDecl),
|
||||||
|
/// A simple expression. This could be a function call or an assignment for example
|
||||||
|
Expr(Expression),
|
||||||
|
/// A freestanding block scope
|
||||||
|
Block(BlockScope),
|
||||||
|
/// A loop
|
||||||
|
Loop(Loop),
|
||||||
|
/// An if
|
||||||
|
If(If),
|
||||||
|
/// A print statement that will output the value of the given expression to the terminal
|
||||||
|
Print(Expression),
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A number of statements that form a block of code together
|
||||||
|
pub type BlockScope = Vec<Statement>;
|
||||||
|
|
||||||
|
/// A full abstract syntax tree
|
||||||
|
#[derive(Clone, Default)]
|
||||||
|
pub struct Ast {
|
||||||
|
/// The stringstore contains the actual string values which are replaced with StringIDs in the
|
||||||
|
/// Ast. So this is needed to get the actual strings later
|
||||||
|
pub stringstore: StringStore,
|
||||||
|
/// The main (top-level) code given as a number of statements
|
||||||
|
pub main: BlockScope,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl BinOpType {
|
||||||
|
/// Get the precedence for a binary operator. Higher value means the OP is stronger binding.
|
||||||
|
/// For example Multiplication is stronger than addition, so Mul has higher precedence than Add.
|
||||||
|
///
|
||||||
|
/// The operator precedences are derived from the C language operator precedences. While not all
|
||||||
|
/// C operators are included or the exact same, the precedence order is the same.
|
||||||
|
/// See: https://en.cppreference.com/w/c/language/operator_precedence
|
||||||
|
|
||||||
|
pub fn precedence(&self) -> u8 {
|
||||||
|
match self {
|
||||||
|
BinOpType::Assign => 1,
|
||||||
|
BinOpType::LOr => 2,
|
||||||
|
BinOpType::LAnd => 3,
|
||||||
|
BinOpType::BOr => 4,
|
||||||
|
BinOpType::BXor => 5,
|
||||||
|
BinOpType::BAnd => 6,
|
||||||
|
BinOpType::EquEqu | BinOpType::NotEqu => 7,
|
||||||
|
BinOpType::Less | BinOpType::LessEqu | BinOpType::Greater | BinOpType::GreaterEqu => 8,
|
||||||
|
BinOpType::Shl | BinOpType::Shr => 9,
|
||||||
|
BinOpType::Add | BinOpType::Sub => 10,
|
||||||
|
BinOpType::Mul | BinOpType::Div | BinOpType::Mod => 11,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
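A precedence table like `BinOpType::precedence` is typically consumed by a precedence-climbing expression parser. The sketch below is only an illustration of that technique, not the project's parser: `Op`, `Tok`, `Expr`, `parse_expr`, `eval` and `eval_tokens` are hypothetical names, and only a handful of operators are modelled, reusing the numeric levels 9, 10 and 11 from the table above.

```rust
// Hypothetical miniature, assuming a pre-lexed token slice. None of these
// types are taken from src/ast.rs; they only mirror its precedence levels.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Add,
    Sub,
    Mul,
    Shl,
}

impl Op {
    /// Higher value means the operator binds more strongly.
    fn precedence(self) -> u8 {
        match self {
            Op::Shl => 9,
            Op::Add | Op::Sub => 10,
            Op::Mul => 11,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tok {
    Num(i64),
    Op(Op),
}

#[derive(Debug, PartialEq, Eq)]
enum Expr {
    Num(i64),
    Bin(Op, Box<Expr>, Box<Expr>),
}

/// Precedence climbing: parse one operand, then keep absorbing operators
/// whose precedence is at least `min_prec`. The sketch panics on malformed
/// input (for example a trailing operator) instead of reporting an error.
fn parse_expr(toks: &[Tok], pos: &mut usize, min_prec: u8) -> Expr {
    let mut lhs = match toks[*pos] {
        Tok::Num(n) => Expr::Num(n),
        Tok::Op(_) => panic!("expected a number"),
    };
    *pos += 1;

    while *pos < toks.len() {
        let op = match toks[*pos] {
            Tok::Op(op) if op.precedence() >= min_prec => op,
            _ => break,
        };
        *pos += 1;
        // `+ 1` makes every operator left-associative: an operator of the
        // same precedence ends the right-hand side and is handled by this loop.
        let rhs = parse_expr(toks, pos, op.precedence() + 1);
        lhs = Expr::Bin(op, Box::new(lhs), Box::new(rhs));
    }
    lhs
}

/// Evaluate the tree, only to make the grouping observable.
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Bin(op, l, r) => {
            let (l, r) = (eval(l), eval(r));
            match op {
                Op::Add => l + r,
                Op::Sub => l - r,
                Op::Mul => l * r,
                Op::Shl => l << r,
            }
        }
    }
}

fn eval_tokens(toks: &[Tok]) -> i64 {
    let mut pos = 0;
    eval(&parse_expr(toks, &mut pos, 0))
}
```

With these levels, the tokens for `1 + 2 * 3` group as `1 + (2 * 3)`, and `1 << 2 + 1` groups as `1 << (2 + 1)`, matching the C ordering the doc comment refers to. A right-associative operator such as assignment would recurse with `op.precedence()` instead of `op.precedence() + 1`.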
116
src/astoptimizer.rs
Normal file
@ -0,0 +1,116 @@
|
|||||||
|
use crate::ast::{Ast, BlockScope, Expression, If, Loop, Statement, BinOpType, UnOpType, VarDecl};
|
||||||
|
|
||||||
|
/// A trait for optimizing an abstract syntax tree
|
||||||
|
pub trait AstOptimizer {
|
||||||
|
/// Consume an abstract syntax tree and return an ast that has the same functionality but with
|
||||||
|
/// optional optimizations.
|
||||||
|
fn optimize(ast: Ast) -> Ast;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A very simple optimizer that applies trivial optimizations like precalculating expressions that
|
||||||
|
/// have only literals as operands
|
||||||
|
pub struct SimpleAstOptimizer;
|
||||||
|
|
||||||
|
impl AstOptimizer for SimpleAstOptimizer {
|
||||||
|
fn optimize(mut ast: Ast) -> Ast {
|
||||||
|
Self::optimize_block(&mut ast.main);
|
||||||
|
ast
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl SimpleAstOptimizer {
|
||||||
|
fn optimize_block(block: &mut BlockScope) {
|
||||||
|
for stmt in block {
|
||||||
|
match stmt {
|
||||||
|
Statement::Expr(expr) => Self::optimize_expr(expr),
|
||||||
|
Statement::Block(block) => Self::optimize_block(block),
|
||||||
|
Statement::Loop(Loop {
|
||||||
|
condition,
|
||||||
|
advancement,
|
||||||
|
body,
|
||||||
|
}) => {
|
||||||
|
if let Some(condition) = condition {
|
||||||
|
Self::optimize_expr(condition);
|
||||||
|
}
|
||||||
|
if let Some(advancement) = advancement {
|
||||||
|
Self::optimize_expr(advancement)
|
||||||
|
}
|
||||||
|
Self::optimize_block(body);
|
||||||
|
}
|
||||||
|
Statement::If(If {
|
||||||
|
condition,
|
||||||
|
body_true,
|
||||||
|
body_false,
|
||||||
|
}) => {
|
||||||
|
Self::optimize_expr(condition);
|
||||||
|
Self::optimize_block(body_true);
|
||||||
|
Self::optimize_block(body_false);
|
||||||
|
}
|
||||||
|
Statement::Print(expr) => Self::optimize_expr(expr),
|
||||||
|
Statement::Declaration(VarDecl { name: _, var_stackpos: _, rhs}) => Self::optimize_expr(rhs),
|
||||||
|
Statement::FunDeclare(_) => (),
|
||||||
|
Statement::Return(expr) => Self::optimize_expr(expr),
|
||||||
|
Statement::Break | Statement::Continue => (),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn optimize_expr(expr: &mut Expression) {
|
||||||
|
match expr {
|
||||||
|
Expression::BinOp(bo, lhs, rhs) => {
|
||||||
|
Self::optimize_expr(lhs);
|
||||||
|
Self::optimize_expr(rhs);
|
||||||
|
|
||||||
|
// Precalculate binary operations that consist of 2 literals. No need to do this at
|
||||||
|
// runtime, as all parts of the calculation are known at *compiletime* / parsetime.
|
||||||
|
match (lhs.as_mut(), rhs.as_mut()) {
|
||||||
|
(Expression::I64(lhs), Expression::I64(rhs)) => {
|
||||||
|
let new_expr = match bo {
|
||||||
|
BinOpType::Add => Expression::I64(*lhs + *rhs),
|
||||||
|
BinOpType::Mul => Expression::I64(*lhs * *rhs),
|
||||||
|
BinOpType::Sub => Expression::I64(*lhs - *rhs),
|
||||||
|
BinOpType::Div => Expression::I64(*lhs / *rhs),
|
||||||
|
BinOpType::Mod => Expression::I64(*lhs % *rhs),
|
||||||
|
BinOpType::BOr => Expression::I64(*lhs | *rhs),
|
||||||
|
BinOpType::BAnd => Expression::I64(*lhs & *rhs),
|
||||||
|
BinOpType::BXor => Expression::I64(*lhs ^ *rhs),
|
||||||
|
BinOpType::LAnd => Expression::I64(if (*lhs != 0) && (*rhs != 0) { 1 } else { 0 }),
|
||||||
|
BinOpType::LOr => Expression::I64(if (*lhs != 0) || (*rhs != 0) { 1 } else { 0 }),
|
||||||
|
BinOpType::Shr => Expression::I64(*lhs >> *rhs),
|
||||||
|
BinOpType::Shl => Expression::I64(*lhs << *rhs),
|
||||||
|
BinOpType::EquEqu => Expression::I64(if lhs == rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::NotEqu => Expression::I64(if lhs != rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::Less => Expression::I64(if lhs < rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::LessEqu => Expression::I64(if lhs <= rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::Greater => Expression::I64(if lhs > rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::GreaterEqu => Expression::I64(if lhs >= rhs { 1 } else { 0 }),
|
||||||
|
|
||||||
|
BinOpType::Assign => unreachable!(),
|
||||||
|
};
|
||||||
|
*expr = new_expr;
|
||||||
|
},
|
||||||
|
_ => ()
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
Expression::UnOp(uo, operand) => {
|
||||||
|
Self::optimize_expr(operand);
|
||||||
|
|
||||||
|
// Precalculate unary operations just like binary ones
|
||||||
|
match operand.as_mut() {
|
||||||
|
Expression::I64(val) => {
|
||||||
|
let new_expr = match uo {
|
||||||
|
UnOpType::Negate => Expression::I64(-*val),
|
||||||
|
UnOpType::BNot => Expression::I64(!*val),
|
||||||
|
UnOpType::LNot => Expression::I64(if *val == 0 { 1 } else { 0 }),
|
||||||
|
};
|
||||||
|
*expr = new_expr;
|
||||||
|
}
|
||||||
|
_ => (),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
_ => (),
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
}
|
||||||
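The folding above applies Rust's plain integer operators to the literals, so an input such as `1 / 0` would panic while optimizing rather than surface as a runtime error of the interpreted program. One possible variant, sketched here under assumptions, folds only when the checked operation succeeds and otherwise leaves the node untouched for the interpreter to handle. `Expr`, `fold` and `fold_pair` are hypothetical names for a reduced expression type, not the types from `src/ast.rs`.

```rust
// Hypothetical reduced expression type, for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Expr {
    Lit(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
}

/// Fold constant subexpressions in place. A node is replaced only if both
/// operands are literals and the checked operation yields a value.
fn fold(expr: &mut Expr) {
    let folded = match expr {
        Expr::Lit(_) | Expr::Var(_) => None,
        Expr::Add(l, r) => fold_pair(l, r, i64::checked_add),
        Expr::Mul(l, r) => fold_pair(l, r, i64::checked_mul),
        Expr::Div(l, r) => fold_pair(l, r, i64::checked_div),
    };

    if let Some(value) = folded {
        *expr = Expr::Lit(value);
    }
}

/// Fold both operands first (bottom-up), then try to combine them.
/// Returns `None` when an operand is not a literal or when `op` fails,
/// for example on division by zero or on overflow.
fn fold_pair(
    l: &mut Expr,
    r: &mut Expr,
    op: fn(i64, i64) -> Option<i64>,
) -> Option<i64> {
    fold(l);
    fold(r);
    match (&*l, &*r) {
        (Expr::Lit(a), Expr::Lit(b)) => op(*a, *b),
        _ => None,
    }
}
```

In this sketch `(2 + 3) * x` becomes `5 * x`, while `1 / 0` stays a `Div` node, so the division-by-zero case can still be reported at runtime.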
@ -1,70 +1,588 @@
|
|||||||
use crate::parser::{Ast, BinOpType};
|
use std::{cell::RefCell, rc::Rc};
|
||||||
|
use thiserror::Error;
|
||||||
|
|
||||||
#[derive(Debug, PartialEq, Eq, Clone)]
|
use crate::{
|
||||||
pub enum Value {
|
ast::{Ast, BinOpType, BlockScope, Expression, FunDecl, If, Statement, UnOpType},
|
||||||
I64(i64),
|
astoptimizer::{AstOptimizer, SimpleAstOptimizer},
|
||||||
|
lexer::lex,
|
||||||
|
nice_panic,
|
||||||
|
parser::parse,
|
||||||
|
stringstore::{Sid, StringStore},
|
||||||
|
};
|
||||||
|
|
||||||
|
/// Runtime errors that can occur during execution
|
||||||
|
#[derive(Debug, Error)]
|
||||||
|
pub enum RuntimeError {
|
||||||
|
#[error("Invalid array Index: {0:?}")]
|
||||||
|
InvalidArrayIndex(Value),
|
||||||
|
|
||||||
|
#[error("Variable used but not declared: {0}")]
|
||||||
|
VarUsedNotDeclared(String),
|
||||||
|
|
||||||
|
#[error("Can't index into non-array variable: {0}")]
|
||||||
|
TryingToIndexNonArray(String),
|
||||||
|
|
||||||
|
#[error("Invalid value type for unary operation: {0:?}")]
|
||||||
|
UnOpInvalidType(Value),
|
||||||
|
|
||||||
|
#[error("Incompatible binary operations. Operands don't match: {0:?} and {1:?}")]
|
||||||
|
BinOpIncompatibleTypes(Value, Value),
|
||||||
|
|
||||||
|
#[error("Array access out of bounds: Accessed {0}, size is {1}")]
|
||||||
|
ArrayOutOfBounds(usize, usize),
|
||||||
|
|
||||||
|
#[error("Division by zero")]
|
||||||
|
DivideByZero,
|
||||||
|
|
||||||
|
#[error("Invalid number of arguments for function {0}. Expected {1}, got {2}")]
|
||||||
|
InvalidNumberOfArgs(String, usize, usize),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Possible variants for the values
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub enum Value {
|
||||||
|
/// 64-bit integer value
|
||||||
|
I64(i64),
|
||||||
|
/// String value
|
||||||
|
String(Sid),
|
||||||
|
/// Array value
|
||||||
|
Array(Rc<RefCell<Vec<Value>>>),
|
||||||
|
/// Void value
|
||||||
|
Void,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// The exit type of a block. When a block ends, the exit type specifies why the block ended.
|
||||||
|
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||||
|
pub enum BlockExit {
|
||||||
|
/// Normal exit when the block just ends normally (no returns / breaks / continues / etc.)
|
||||||
|
Normal,
|
||||||
|
/// The block ended through a break statement. This will be propagated up to the next loop
|
||||||
|
/// and cause it to fully terminate
|
||||||
|
Break,
|
||||||
|
/// The block ended through a continue statement. This will be propagated up to the next loop
|
||||||
|
/// and cause it to start the next iteration
|
||||||
|
Continue,
|
||||||
|
/// The block ended through a return statement. This will propagate up to the next function
|
||||||
|
/// body end
|
||||||
|
Return(Value),
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Default)]
|
||||||
pub struct Interpreter {
|
pub struct Interpreter {
|
||||||
// Runtime storage, for example variables ...
|
/// Run the SimpleAstOptimizer over the Ast before executing
|
||||||
|
pub optimize_ast: bool,
|
||||||
|
|
||||||
|
/// Print the tokens after lexing
|
||||||
|
pub print_tokens: bool,
|
||||||
|
/// Print the ast after parsing
|
||||||
|
pub print_ast: bool,
|
||||||
|
|
||||||
|
/// Capture the output values of print statements instead of printing them to the terminal
|
||||||
|
pub capture_output: bool,
|
||||||
|
/// The stored values that were captured
|
||||||
|
output: Vec<Value>,
|
||||||
|
|
||||||
|
/// Variable table stores the runtime values of variables as a stack
|
||||||
|
vartable: Vec<Value>,
|
||||||
|
|
||||||
|
/// Function table stores the functions during runtime as a stack
|
||||||
|
funtable: Vec<FunDecl>,
|
||||||
|
|
||||||
|
/// The stringstore contains all strings used throughout the program
|
||||||
|
stringstore: StringStore,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl Interpreter {
|
impl Interpreter {
|
||||||
|
/// Create a new Interpreter
|
||||||
pub fn new() -> Self {
|
pub fn new() -> Self {
|
||||||
Self {}
|
Self {
|
||||||
}
|
optimize_ast: true,
|
||||||
|
..Self::default()
|
||||||
pub fn run(&mut self, prog: Ast) {
|
|
||||||
let result = self.resolve_expr(prog);
|
|
||||||
|
|
||||||
println!("Result = {:?}", result);
|
|
||||||
}
|
|
||||||
|
|
||||||
fn resolve_expr(&mut self, expr: Ast) -> Value {
|
|
||||||
match expr {
|
|
||||||
Ast::I64(val) => Value::I64(val),
|
|
||||||
Ast::BinOp(bo, lhs, rhs) => self.resolve_binop(bo, *lhs, *rhs),
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
fn resolve_binop(&mut self, bo: BinOpType, lhs: Ast, rhs: Ast) -> Value {
|
/// Get the captured output
|
||||||
let lhs = self.resolve_expr(lhs);
|
pub fn output(&self) -> &[Value] {
|
||||||
let rhs = self.resolve_expr(rhs);
|
&self.output
|
||||||
|
}
|
||||||
|
|
||||||
match (lhs, rhs) {
|
/// Try to retrieve a variable value from the varstack. The idx is the index from the back of
|
||||||
|
/// the stack. So 0 is the last value, not the first
|
||||||
|
fn get_var(&self, idx: usize) -> Option<Value> {
|
||||||
|
self.vartable.len().checked_sub(idx + 1).and_then(|pos| self.vartable.get(pos)).cloned()
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Try to retrieve a mutable reference to a variable value from the varstack. The idx is the
|
||||||
|
/// index from the back of the stack. So 0 is the last value, not the first
|
||||||
|
fn get_var_mut(&mut self, idx: usize) -> Option<&mut Value> {
|
||||||
|
let idx = self.vartable.len().checked_sub(idx + 1)?;
|
||||||
|
self.vartable.get_mut(idx)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Lex, parse and then run the given source code. This will terminate the program when an error
|
||||||
|
/// occurs and print an appropriate error message.
|
||||||
|
pub fn run_str(&mut self, code: &str) {
|
||||||
|
// Lex the tokens
|
||||||
|
let tokens = match lex(code) {
|
||||||
|
Ok(tokens) => tokens,
|
||||||
|
Err(e) => nice_panic!("Lexing error: {}", e),
|
||||||
|
};
|
||||||
|
|
||||||
|
if self.print_tokens {
|
||||||
|
println!("Tokens: {:?}", tokens);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Parse the ast
|
||||||
|
let ast = match parse(tokens) {
|
||||||
|
Ok(ast) => ast,
|
||||||
|
Err(e) => nice_panic!("Parsing error: {}", e),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Run the ast
|
||||||
|
match self.run_ast(ast) {
|
||||||
|
Ok(_) => (),
|
||||||
|
Err(e) => nice_panic!("Runtime error: {}", e),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Execute the given Ast within the interpreter
|
||||||
|
pub fn run_ast(&mut self, mut ast: Ast) -> Result<(), RuntimeError> {
|
||||||
|
// Optimize the ast
|
||||||
|
if self.optimize_ast {
|
||||||
|
ast = SimpleAstOptimizer::optimize(ast);
|
||||||
|
}
|
||||||
|
|
||||||
|
if self.print_ast {
|
||||||
|
println!("{:#?}", ast.main);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Take over the stringstore of the given ast
|
||||||
|
self.stringstore = ast.stringstore;
|
||||||
|
|
||||||
|
// Run the top level block (the main)
|
||||||
|
self.run_block(&ast.main)?;
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Run all statements in the given block
|
||||||
|
pub fn run_block(&mut self, prog: &BlockScope) -> Result<BlockExit, RuntimeError> {
|
||||||
|
self.run_block_fp_offset(prog, 0)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Same as run_block, but with an additional framepointer offset. This allows freeing more
|
||||||
|
/// values from the stack than normally and can be used when passing arguments inside a
|
||||||
|
/// function body scope from the outside
|
||||||
|
pub fn run_block_fp_offset(
|
||||||
|
&mut self,
|
||||||
|
prog: &BlockScope,
|
||||||
|
framepointer_offset: usize,
|
||||||
|
) -> Result<BlockExit, RuntimeError> {
|
||||||
|
let framepointer = self.vartable.len() - framepointer_offset;
|
||||||
|
|
||||||
|
let mut block_exit = BlockExit::Normal;
|
||||||
|
|
||||||
|
'blockloop: for stmt in prog {
|
||||||
|
match stmt {
|
||||||
|
Statement::Break => return Ok(BlockExit::Break),
|
||||||
|
Statement::Continue => return Ok(BlockExit::Continue),
|
||||||
|
|
||||||
|
Statement::Return(expr) => {
|
||||||
|
let val = self.resolve_expr(expr)?;
|
||||||
|
|
||||||
|
block_exit = BlockExit::Return(val);
|
||||||
|
break 'blockloop;
|
||||||
|
}
|
||||||
|
|
||||||
|
Statement::Expr(expr) => {
|
||||||
|
self.resolve_expr(expr)?;
|
||||||
|
}
|
||||||
|
|
||||||
|
Statement::Declaration(decl) => {
|
||||||
|
let rhs = self.resolve_expr(&decl.rhs)?;
|
||||||
|
self.vartable.push(rhs);
|
||||||
|
}
|
||||||
|
|
||||||
|
Statement::Block(block) => match self.run_block(block)? {
|
||||||
|
// Propagate return, continue and break
|
||||||
|
be @ (BlockExit::Return(_) | BlockExit::Continue | BlockExit::Break) => {
|
||||||
|
block_exit = be;
|
||||||
|
break 'blockloop;
|
||||||
|
}
|
||||||
|
_ => (),
|
||||||
|
},
|
||||||
|
|
||||||
|
Statement::Loop(looop) => {
|
||||||
|
// The loop runs as long as condition != 0
|
||||||
|
loop {
|
||||||
|
// Check the loop condition
|
||||||
|
if let Some(condition) = &looop.condition {
|
||||||
|
if matches!(self.resolve_expr(condition)?, Value::I64(0)) {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Run the body
|
||||||
|
let be = self.run_block(&looop.body)?;
|
||||||
|
match be {
|
||||||
|
// Propagate return
|
||||||
|
be @ BlockExit::Return(_) => {
|
||||||
|
block_exit = be;
|
||||||
|
break 'blockloop;
|
||||||
|
}
|
||||||
|
BlockExit::Break => break,
|
||||||
|
BlockExit::Continue | BlockExit::Normal => (),
|
||||||
|
}
|
||||||
|
|
||||||
|
// Run the advancement
|
||||||
|
if let Some(adv) = &looop.advancement {
|
||||||
|
self.resolve_expr(adv)?;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Statement::Print(expr) => {
|
||||||
|
let result = self.resolve_expr(expr)?;
|
||||||
|
|
||||||
|
if self.capture_output {
|
||||||
|
self.output.push(result)
|
||||||
|
} else {
|
||||||
|
print!("{}", self.value_to_string(&result));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Statement::If(If {
|
||||||
|
condition,
|
||||||
|
body_true,
|
||||||
|
body_false,
|
||||||
|
}) => {
|
||||||
|
// Run the right block depending on the condition's result being 0 or not
|
||||||
|
let exit = if matches!(self.resolve_expr(condition)?, Value::I64(0)) {
|
||||||
|
self.run_block(body_false)?
|
||||||
|
} else {
|
||||||
|
self.run_block(body_true)?
|
||||||
|
};
|
||||||
|
|
||||||
|
match exit {
|
||||||
|
// Propagate return, continue and break
|
||||||
|
be @ (BlockExit::Return(_) | BlockExit::Continue | BlockExit::Break) => {
|
||||||
|
block_exit = be;
|
||||||
|
break 'blockloop;
|
||||||
|
}
|
||||||
|
_ => (),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Statement::FunDeclare(fundec) => {
|
||||||
|
self.funtable.push(fundec.clone());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
self.vartable.truncate(framepointer);
|
||||||
|
|
||||||
|
Ok(block_exit)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Execute the given expression to retrieve the resulting value
|
||||||
|
fn resolve_expr(&mut self, expr: &Expression) -> Result<Value, RuntimeError> {
|
||||||
|
let val = match expr {
|
||||||
|
Expression::I64(val) => Value::I64(*val),
|
||||||
|
Expression::ArrayLiteral(size) => {
|
||||||
|
let size = match self.resolve_expr(size)? {
|
||||||
|
Value::I64(size) if !size.is_negative() => size,
|
||||||
|
val => return Err(RuntimeError::InvalidArrayIndex(val)),
|
||||||
|
};
|
||||||
|
Value::Array(Rc::new(RefCell::new(vec![Value::I64(0); size as usize])))
|
||||||
|
}
|
||||||
|
Expression::String(text) => Value::String(text.clone()),
|
||||||
|
Expression::BinOp(bo, lhs, rhs) => self.resolve_binop(bo, lhs, rhs)?,
|
||||||
|
Expression::UnOp(uo, operand) => self.resolve_unop(uo, operand)?,
|
||||||
|
Expression::Var(name, idx) => self.resolve_var(*name, *idx)?,
|
||||||
|
Expression::ArrayAccess(name, idx, arr_idx) => {
|
||||||
|
self.resolve_array_access(*name, *idx, arr_idx)?
|
||||||
|
}
|
||||||
|
|
||||||
|
Expression::FunCall(fun_name, fun_stackpos, args) => {
|
||||||
|
let args_len = args.len();
|
||||||
|
|
||||||
|
// All of the arg expressions must be resolved before pushing the vars on the stack,
|
||||||
|
// otherwise the stack positions are incorrect while resolving
|
||||||
|
let args = args
|
||||||
|
.iter()
|
||||||
|
.map(|arg| self.resolve_expr(arg))
|
||||||
|
.collect::<Vec<_>>();
|
||||||
|
for arg in args {
|
||||||
|
self.vartable.push(arg?);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Function existence has been verified in the parser, so unwrap here shouldn't fail
|
||||||
|
let expected_num_args = self.funtable.get(*fun_stackpos).unwrap().argnames.len();
|
||||||
|
|
||||||
|
// Check if the number of provided arguments matches the number of expected arguments
|
||||||
|
if expected_num_args != args_len {
|
||||||
|
let fun_name = self
|
||||||
|
.stringstore
|
||||||
|
.lookup(*fun_name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or_else(|| "<unknown>".to_string());
|
||||||
|
return Err(RuntimeError::InvalidNumberOfArgs(
|
||||||
|
fun_name,
|
||||||
|
expected_num_args,
|
||||||
|
args_len,
|
||||||
|
));
|
||||||
|
}
|
||||||
|
|
||||||
|
// Run the function body and return the BlockExit type
|
||||||
|
match self.run_block_fp_offset(
|
||||||
|
&Rc::clone(&self.funtable.get(*fun_stackpos).unwrap().body),
|
||||||
|
expected_num_args,
|
||||||
|
)? {
|
||||||
|
BlockExit::Normal | BlockExit::Continue | BlockExit::Break => Value::Void,
|
||||||
|
BlockExit::Return(val) => val,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
Ok(val)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Retrieve the value of a given array at the specified index from the varstack. The name is
|
||||||
|
/// given as a StringID and is used to reference the variable name in case of an error. The
|
||||||
|
/// idx is the stackpos where the array variable should be located and the arr_idx is the
|
||||||
|
/// actual array access index, given as an expression.
|
||||||
|
fn resolve_array_access(
|
||||||
|
&mut self,
|
||||||
|
name: Sid,
|
||||||
|
idx: usize,
|
||||||
|
arr_idx: &Expression,
|
||||||
|
) -> Result<Value, RuntimeError> {
|
||||||
|
// Resolve the array index into a value and check if it is a valid array index
|
||||||
|
let arr_idx = match self.resolve_expr(arr_idx)? {
|
||||||
|
Value::I64(size) if !size.is_negative() => size,
|
||||||
|
val => return Err(RuntimeError::InvalidArrayIndex(val)),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Get the array value
|
||||||
|
let val = match self.get_var(idx) {
|
||||||
|
Some(val) => val,
|
||||||
|
None => {
|
||||||
|
return Err(RuntimeError::VarUsedNotDeclared(
|
||||||
|
self.stringstore
|
||||||
|
.lookup(name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or_else(|| "<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Make sure it is an array
|
||||||
|
let arr = match val {
|
||||||
|
Value::Array(arr) => arr,
|
||||||
|
_ => {
|
||||||
|
return Err(RuntimeError::TryingToIndexNonArray(
|
||||||
|
self.stringstore
|
||||||
|
.lookup(name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or_else(|| "<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Get the value of the requested cell inside the array
|
||||||
|
let arr = arr.borrow();
|
||||||
|
arr.get(arr_idx as usize)
|
||||||
|
.cloned()
|
||||||
|
.ok_or(RuntimeError::ArrayOutOfBounds(arr_idx as usize, arr.len()))
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Retrieve the value of a given variable from the varstack. The name is given as a StringID
|
||||||
|
/// and is used to reference the variable name in case of an error. The idx is the stackpos
|
||||||
|
/// where the variable should be located
|
||||||
|
fn resolve_var(&mut self, name: Sid, idx: usize) -> Result<Value, RuntimeError> {
|
||||||
|
match self.get_var(idx) {
|
||||||
|
Some(val) => Ok(val),
|
||||||
|
None => {
|
||||||
|
return Err(RuntimeError::VarUsedNotDeclared(
|
||||||
|
self.stringstore
|
||||||
|
.lookup(name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or_else(|| "<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Execute a unary operation and get the resulting value
|
||||||
|
fn resolve_unop(&mut self, uo: &UnOpType, operand: &Expression) -> Result<Value, RuntimeError> {
|
||||||
|
// Recursively resolve the operand's expression into an actual value
|
||||||
|
let operand = self.resolve_expr(operand)?;
|
||||||
|
|
||||||
|
// Perform the correct operation, considering the operation and value type
|
||||||
|
Ok(match (operand, uo) {
|
||||||
|
(Value::I64(val), UnOpType::Negate) => Value::I64(-val),
|
||||||
|
(Value::I64(val), UnOpType::BNot) => Value::I64(!val),
|
||||||
|
(Value::I64(val), UnOpType::LNot) => Value::I64(if val == 0 { 1 } else { 0 }),
|
||||||
|
(val, _) => return Err(RuntimeError::UnOpInvalidType(val)),
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Execute a binary operation and get the resulting value
|
||||||
|
fn resolve_binop(
|
||||||
|
&mut self,
|
||||||
|
bo: &BinOpType,
|
||||||
|
lhs: &Expression,
|
||||||
|
rhs: &Expression,
|
||||||
|
) -> Result<Value, RuntimeError> {
|
||||||
|
let rhs = self.resolve_expr(rhs)?;
|
||||||
|
|
||||||
|
// Handle assignments separate from the other binary operations
|
||||||
|
match (&bo, &lhs) {
|
||||||
|
// Normal variable assignment
|
||||||
|
(BinOpType::Assign, Expression::Var(name, idx)) => {
|
||||||
|
// Get the variable mutably and assign the right hand side value
|
||||||
|
match self.get_var_mut(*idx) {
|
||||||
|
Some(val) => *val = rhs.clone(),
|
||||||
|
None => {
|
||||||
|
return Err(RuntimeError::VarUsedNotDeclared(
|
||||||
|
self.stringstore
|
||||||
|
.lookup(*name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or_else(|| "<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return Ok(rhs);
|
||||||
|
}
|
||||||
|
// Array index assignment
|
||||||
|
(BinOpType::Assign, Expression::ArrayAccess(name, idx, arr_idx)) => {
|
||||||
|
// Calculate the array index
|
||||||
|
let arr_idx = match self.resolve_expr(arr_idx)? {
|
||||||
|
Value::I64(size) if !size.is_negative() => size,
|
||||||
|
val => return Err(RuntimeError::InvalidArrayIndex(val)),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Get the mutable ref to the array variable
|
||||||
|
let val = match self.get_var_mut(*idx) {
|
||||||
|
Some(val) => val,
|
||||||
|
None => {
|
||||||
|
return Err(RuntimeError::VarUsedNotDeclared(
|
||||||
|
self.stringstore
|
||||||
|
.lookup(*name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or_else(|| "<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Verify that it actually is an array
|
||||||
|
match val {
|
||||||
|
// Assign the right hand side value to the array at the given index
|
||||||
|
Value::Array(arr) => arr.borrow_mut()[arr_idx as usize] = rhs.clone(),
|
||||||
|
_ => {
|
||||||
|
return Err(RuntimeError::TryingToIndexNonArray(
|
||||||
|
self.stringstore
|
||||||
|
.lookup(*name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or_else(|| "<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return Ok(rhs);
|
||||||
|
}
|
||||||
|
_ => (),
|
||||||
|
}
|
||||||
|
|
||||||
|
// This code is only executed if the binop is not an assignment as the assignments return
|
||||||
|
// early
|
||||||
|
|
||||||
|
// Resolve the left hand side to the value
|
||||||
|
let lhs = self.resolve_expr(lhs)?;
|
||||||
|
|
||||||
|
// Perform the appropriate calculations considering the operation type and datatypes of the
|
||||||
|
// two values
|
||||||
|
let result = match (lhs, rhs) {
|
||||||
(Value::I64(lhs), Value::I64(rhs)) => match bo {
|
(Value::I64(lhs), Value::I64(rhs)) => match bo {
|
||||||
BinOpType::Add => Value::I64(lhs + rhs),
|
BinOpType::Add => Value::I64(lhs + rhs),
|
||||||
BinOpType::Mul => Value::I64(lhs * rhs),
|
BinOpType::Mul => Value::I64(lhs * rhs),
|
||||||
|
BinOpType::Sub => Value::I64(lhs - rhs),
|
||||||
|
BinOpType::Div => {
|
||||||
|
Value::I64(lhs.checked_div(rhs).ok_or(RuntimeError::DivideByZero)?)
|
||||||
|
}
|
||||||
|
BinOpType::Mod => {
|
||||||
|
Value::I64(lhs.checked_rem(rhs).ok_or(RuntimeError::DivideByZero)?)
|
||||||
|
}
|
||||||
|
BinOpType::BOr => Value::I64(lhs | rhs),
|
||||||
|
BinOpType::BAnd => Value::I64(lhs & rhs),
|
||||||
|
BinOpType::BXor => Value::I64(lhs ^ rhs),
|
||||||
|
BinOpType::LAnd => Value::I64(if (lhs != 0) && (rhs != 0) { 1 } else { 0 }),
|
||||||
|
BinOpType::LOr => Value::I64(if (lhs != 0) || (rhs != 0) { 1 } else { 0 }),
|
||||||
|
BinOpType::Shr => Value::I64(lhs >> rhs),
|
||||||
|
BinOpType::Shl => Value::I64(lhs << rhs),
|
||||||
|
BinOpType::EquEqu => Value::I64(if lhs == rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::NotEqu => Value::I64(if lhs != rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::Less => Value::I64(if lhs < rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::LessEqu => Value::I64(if lhs <= rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::Greater => Value::I64(if lhs > rhs { 1 } else { 0 }),
|
||||||
|
BinOpType::GreaterEqu => Value::I64(if lhs >= rhs { 1 } else { 0 }),
|
||||||
|
|
||||||
|
BinOpType::Assign => unreachable!(),
|
||||||
},
|
},
|
||||||
// _ => panic!("Value types are not compatible"),
|
(lhs, rhs) => return Err(RuntimeError::BinOpIncompatibleTypes(lhs, rhs)),
|
||||||
|
};
|
||||||
|
|
||||||
|
Ok(result)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Get a string representation of the given value. This uses the interpreter's StringStore to
|
||||||
|
/// retrieve the text values of Strings
|
||||||
|
fn value_to_string(&self, val: &Value) -> String {
|
||||||
|
match val {
|
||||||
|
Value::I64(val) => format!("{}", val),
|
||||||
|
Value::Array(val) => format!("{:?}", val.borrow()),
|
||||||
|
Value::String(text) => format!(
|
||||||
|
"{}",
|
||||||
|
self.stringstore
|
||||||
|
.lookup(*text)
|
||||||
|
.unwrap_or(&"<invalid string>".to_string())
|
||||||
|
),
|
||||||
|
Value::Void => format!("void"),
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
#[cfg(test)]
|
#[cfg(test)]
|
||||||
mod test {
|
mod test {
|
||||||
use crate::parser::{Ast, BinOpType};
|
|
||||||
use super::{Interpreter, Value};
|
use super::{Interpreter, Value};
|
||||||
|
use crate::ast::{BinOpType, Expression};
|
||||||
|
|
||||||
|
/// Simple test to check if a simple expression is executed properly.
|
||||||
|
/// Full system tests from lexing to execution can be found in `lib.rs`
|
||||||
#[test]
|
#[test]
|
||||||
fn test_interpreter_expr() {
|
fn test_interpreter_expr() {
|
||||||
// Expression: 1 + 2 * 3 + 4
|
// Expression: 1 + 2 * 3 + 4
|
||||||
// With precedence: (1 + (2 * 3)) + 4
|
// With precedence: (1 + (2 * 3)) + 4
|
||||||
let ast = Ast::BinOp(
|
let ast = Expression::BinOp(
|
||||||
BinOpType::Add,
|
BinOpType::Add,
|
||||||
Ast::BinOp(
|
Expression::BinOp(
|
||||||
BinOpType::Add,
|
BinOpType::Add,
|
||||||
Ast::I64(1).into(),
|
Expression::I64(1).into(),
|
||||||
Ast::BinOp(BinOpType::Mul, Ast::I64(2).into(), Ast::I64(3).into()).into(),
|
Expression::BinOp(
|
||||||
|
BinOpType::Mul,
|
||||||
|
Expression::I64(2).into(),
|
||||||
|
Expression::I64(3).into(),
|
||||||
)
|
)
|
||||||
.into(),
|
.into(),
|
||||||
Ast::I64(4).into(),
|
)
|
||||||
|
.into(),
|
||||||
|
Expression::I64(4).into(),
|
||||||
);
|
);
|
||||||
|
|
||||||
let expected = Value::I64(11);
|
let expected = Value::I64(11);
|
||||||
|
|
||||||
let mut interpreter = Interpreter::new();
|
let mut interpreter = Interpreter::new();
|
||||||
let actual = interpreter.resolve_expr(ast);
|
let actual = interpreter.resolve_expr(&ast).unwrap();
|
||||||
|
|
||||||
assert_eq!(expected, actual);
|
assert_eq!(expected, actual);
|
||||||
}
|
}
|
||||||
|
|||||||
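The `I64` arm of the binary-operation match above guards `Div` and `Mod` with `checked_div` / `checked_rem`, while `Add`, `Sub`, `Mul`, `Shl` and `Shr` use the plain operators. With the plain operators an overflow (or a shift amount outside `0..64`) panics in debug builds. Note also that `checked_div` returns `None` for `i64::MIN / -1` as well as for a zero divisor, so that case is reported as `DivideByZero`. The following is only a minimal sketch of how the cases could be told apart. `ArithErr` and the three helper functions are hypothetical names for illustration and are not part of the repository.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch (the repository uses `RuntimeError`)
#[derive(Debug, PartialEq)]
enum ArithErr {
    DivideByZero,
    Overflow,
}

/// Addition that reports overflow instead of panicking or wrapping
fn checked_add_i64(lhs: i64, rhs: i64) -> Result<i64, ArithErr> {
    lhs.checked_add(rhs).ok_or(ArithErr::Overflow)
}

/// Division that separates a zero divisor from the `i64::MIN / -1` overflow
fn checked_div_i64(lhs: i64, rhs: i64) -> Result<i64, ArithErr> {
    if rhs == 0 {
        return Err(ArithErr::DivideByZero);
    }
    // After the zero check, `None` can only mean overflow
    lhs.checked_div(rhs).ok_or(ArithErr::Overflow)
}

/// Left shift that rejects negative shift amounts and amounts >= 64
fn checked_shl_i64(lhs: i64, rhs: i64) -> Result<i64, ArithErr> {
    u32::try_from(rhs)
        .ok()
        .and_then(|shift| lhs.checked_shl(shift))
        .ok_or(ArithErr::Overflow)
}
```

Whether an interpreter should report overflow as an error or wrap silently (for example with `wrapping_add`) is a language design decision. The sketch only shows the error-reporting variant.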
377
src/lexer.rs
@ -1,116 +1,313 @@
|
|||||||
use std::{iter::Peekable, str::Chars};
|
use std::{iter::Peekable, str::Chars};
|
||||||
|
use thiserror::Error;
|
||||||
|
|
||||||
use crate::parser::BinOpType;
|
use crate::{token::Token, T};
|
||||||
|
|
||||||
#[derive(Debug, PartialEq, Eq)]
|
/// Errors that can occur while lexing a given string
|
||||||
pub enum Token {
|
#[derive(Debug, Error)]
|
||||||
/// Integer literal (64-bit)
|
pub enum LexErr {
|
||||||
I64(i64),
|
#[error("Failed to parse '{0}' as i64")]
|
||||||
|
NumericParse(String),
|
||||||
|
|
||||||
/// Plus (+)
|
#[error("Invalid escape character '\\{0}'")]
|
||||||
Add,
|
InvalidStrEscape(char),
|
||||||
|
|
||||||
/// Asterisk (*)
|
#[error("Lexer encountered unexpected char: '{0}'")]
|
||||||
Mul,
|
UnexpectedChar(char),
|
||||||
|
|
||||||
/// End of file
|
#[error("Missing closing string quote '\"'")]
|
||||||
EoF,
|
MissingClosingString,
|
||||||
}
|
|
||||||
|
|
||||||
struct Lexer<'a> {
|
|
||||||
code: Peekable<Chars<'a>>,
|
|
||||||
}
|
|
||||||
|
|
||||||
impl<'a> Lexer<'a> {
|
|
||||||
fn new(code: &'a str) -> Self {
|
|
||||||
let code = code.chars().peekable();
|
|
||||||
Self { code }
|
|
||||||
}
|
|
||||||
|
|
||||||
fn lex(&mut self) -> Vec<Token> {
|
|
||||||
let mut tokens = Vec::new();
|
|
||||||
|
|
||||||
while let Some(ch) = self.next() {
|
|
||||||
match ch {
|
|
||||||
// Skip whitespace
|
|
||||||
' ' => (),
|
|
||||||
|
|
||||||
// Lex numbers
|
|
||||||
'0'..='9' => {
|
|
||||||
let mut sval = String::from(ch);
|
|
||||||
|
|
||||||
// Do as long as a next char exists and it is a numeric char
|
|
||||||
while let Some('0'..='9') = self.peek() {
|
|
||||||
// The next char is verified to be Some, so unwrap is safe
|
|
||||||
sval.push(self.next().unwrap());
|
|
||||||
}
|
|
||||||
|
|
||||||
// TODO: We only added numeric chars to the string, but the conversion could still fail
|
|
||||||
tokens.push(Token::I64(sval.parse().unwrap()));
|
|
||||||
}
|
|
||||||
|
|
||||||
'+' => tokens.push(Token::Add),
|
|
||||||
'*' => tokens.push(Token::Mul),
|
|
||||||
|
|
||||||
//TODO: Don't panic, keep calm
|
|
||||||
_ => panic!("Lexer encountered unexpected char: '{}'", ch),
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
tokens
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Advance to next character and return the removed char
|
|
||||||
fn next(&mut self) -> Option<char> {
|
|
||||||
self.code.next()
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Get the next character without removing it
|
|
||||||
fn peek(&mut self) -> Option<char> {
|
|
||||||
self.code.peek().copied()
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Lex the provided code into a Token Buffer
|
/// Lex the provided code into a Token Buffer
|
||||||
///
|
pub fn lex(code: &str) -> Result<Vec<Token>, LexErr> {
|
||||||
/// TODO: Don't panic and implement error handling using Result
|
let lexer = Lexer::new(code);
|
||||||
pub fn lex(code: &str) -> Vec<Token> {
|
|
||||||
let mut lexer = Lexer::new(code);
|
|
||||||
lexer.lex()
|
lexer.lex()
|
||||||
}
|
}
|
||||||
|
|
||||||
impl Token {
|
/// The lexer is created from a reference to a sourcecode string and is consumed to create a token
|
||||||
pub fn try_to_binop(&self) -> Option<BinOpType> {
|
/// buffer from that sourcecode.
|
||||||
Some(match self {
|
struct Lexer<'a> {
|
||||||
Token::Add => BinOpType::Add,
|
/// The sourcecode text as a peekable iterator over the chars. Peekable allows for look-ahead
|
||||||
Token::Mul => BinOpType::Mul,
|
/// and the use of the Chars iterator allows to support unicode characters
|
||||||
_ => return None,
|
code: Peekable<Chars<'a>>,
|
||||||
})
|
/// The lexed tokens
|
||||||
|
tokens: Vec<Token>,
|
||||||
|
/// The sourcecode character that is currently being lexed
|
||||||
|
current_char: char,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl<'a> Lexer<'a> {
|
||||||
|
|
||||||
|
/// Create a new lexer from the given sourcecode
|
||||||
|
fn new(code: &'a str) -> Self {
|
||||||
|
let code = code.chars().peekable();
|
||||||
|
let tokens = Vec::new();
|
||||||
|
let current_char = '\0';
|
||||||
|
Self {
|
||||||
|
code,
|
||||||
|
tokens,
|
||||||
|
current_char,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Consume the lexer and try to lex the contained sourcecode into a token buffer
|
||||||
|
fn lex(mut self) -> Result<Vec<Token>, LexErr> {
|
||||||
|
|
||||||
|
loop {
|
||||||
|
self.current_char = self.next();
|
||||||
|
// Match on the current and next character. This gives a 1-char look-ahead and
|
||||||
|
// can be used to directly match 2-char tokens
|
||||||
|
match (self.current_char, self.peek()) {
|
||||||
|
// Stop lexing at EOF
|
||||||
|
('\0', _) => break,
|
||||||
|
|
||||||
|
// Skip / ignore whitespace
|
||||||
|
(' ' | '\t' | '\n' | '\r', _) => (),
|
||||||
|
|
||||||
|
// Line comment. Consume every char until linefeed (next line)
|
||||||
|
('/', '/') => while !matches!(self.next(), '\n' | '\0') {},
|
||||||
|
|
||||||
|
// Double character tokens
|
||||||
|
('>', '>') => self.push_tok_consume(T![>>]),
|
||||||
|
('<', '<') => self.push_tok_consume(T![<<]),
|
||||||
|
('=', '=') => self.push_tok_consume(T![==]),
|
||||||
|
('!', '=') => self.push_tok_consume(T![!=]),
|
||||||
|
('<', '=') => self.push_tok_consume(T![<=]),
|
||||||
|
('>', '=') => self.push_tok_consume(T![>=]),
|
||||||
|
('<', '-') => self.push_tok_consume(T![<-]),
|
||||||
|
('&', '&') => self.push_tok_consume(T![&&]),
|
||||||
|
('|', '|') => self.push_tok_consume(T![||]),
|
||||||
|
|
||||||
|
// Single character tokens
|
||||||
|
(',', _) => self.push_tok(T![,]),
|
||||||
|
(';', _) => self.push_tok(T![;]),
|
||||||
|
('+', _) => self.push_tok(T![+]),
|
||||||
|
('-', _) => self.push_tok(T![-]),
|
||||||
|
('*', _) => self.push_tok(T![*]),
|
||||||
|
('/', _) => self.push_tok(T![/]),
|
||||||
|
('%', _) => self.push_tok(T![%]),
|
||||||
|
('|', _) => self.push_tok(T![|]),
|
||||||
|
('&', _) => self.push_tok(T![&]),
|
||||||
|
('^', _) => self.push_tok(T![^]),
|
||||||
|
('(', _) => self.push_tok(T!['(']),
|
||||||
|
(')', _) => self.push_tok(T![')']),
|
||||||
|
('~', _) => self.push_tok(T![~]),
|
||||||
|
('<', _) => self.push_tok(T![<]),
|
||||||
|
('>', _) => self.push_tok(T![>]),
|
||||||
|
('=', _) => self.push_tok(T![=]),
|
||||||
|
('{', _) => self.push_tok(T!['{']),
|
||||||
|
('}', _) => self.push_tok(T!['}']),
|
||||||
|
('!', _) => self.push_tok(T![!]),
|
||||||
|
('[', _) => self.push_tok(T!['[']),
|
||||||
|
(']', _) => self.push_tok(T![']']),
|
||||||
|
|
||||||
|
// Special tokens with variable length
|
||||||
|
|
||||||
|
// Lex multiple characters together as numbers
|
||||||
|
('0'..='9', _) => self.lex_number()?,
|
||||||
|
|
||||||
|
// Lex multiple characters together as a string
|
||||||
|
('"', _) => self.lex_str()?,
|
||||||
|
|
||||||
|
// Lex multiple characters together as identifier or keyword
|
||||||
|
('a'..='z' | 'A'..='Z' | '_', _) => self.lex_identifier()?,
|
||||||
|
|
||||||
|
// Any character that was not handled otherwise is invalid
|
||||||
|
(ch, _) => Err(LexErr::UnexpectedChar(ch))?,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(self.tokens)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Lex multiple characters as a number until encountering a non numeric digit. The
|
||||||
|
/// successfully lexed i64 literal token is appended to the stored tokens.
|
||||||
|
fn lex_number(&mut self) -> Result<(), LexErr> {
|
||||||
|
// String representation of the integer value
|
||||||
|
let mut sval = String::from(self.current_char);
|
||||||
|
|
||||||
|
// Do as long as a next char exists and it is a numeric char
|
||||||
|
loop {
|
||||||
|
// Peek at the next char. At the end of the code `peek` returns '\0', which ends the number
|
||||||
|
match self.peek() {
|
||||||
|
// Underscore is a separator, so remove it but don't add to number
|
||||||
|
'_' => {
|
||||||
|
self.next();
|
||||||
|
}
|
||||||
|
'0'..='9' => {
|
||||||
|
sval.push(self.next());
|
||||||
|
}
|
||||||
|
// Next char is not a number, so stop and finish the number token
|
||||||
|
_ => break,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Try to convert the string representation of the value to i64. The error is mapped to
|
||||||
|
// the appropriate LexErr
|
||||||
|
let i64val = sval.parse().map_err(|_| LexErr::NumericParse(sval))?;
|
||||||
|
|
||||||
|
self.push_tok(T![i64(i64val)]);
|
||||||
|
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Lex characters as a string until encountering an unescaped closing double quote char '"'.
|
||||||
|
/// The successfully lexed string literal token is appended to the stored tokens.
|
||||||
|
fn lex_str(&mut self) -> Result<(), LexErr> {
|
||||||
|
// The opening " was consumed in match, so a fresh string can be used
|
||||||
|
let mut text = String::new();
|
||||||
|
|
||||||
|
// Read all chars until encountering the closing "
|
||||||
|
loop {
|
||||||
|
match self.peek() {
|
||||||
|
// An unescaped double quote ends the current string
|
||||||
|
'"' => break,
|
||||||
|
|
||||||
|
// If the end of file is reached while still waiting for '"', error out
|
||||||
|
'\0' => Err(LexErr::MissingClosingString)?,
|
||||||
|
|
||||||
|
_ => match self.next() {
|
||||||
|
// Backslash indicates an escaped character, so consume one more char and
|
||||||
|
// treat it as the escaped char
|
||||||
|
'\\' => match self.next() {
|
||||||
|
'n' => text.push('\n'),
|
||||||
|
'r' => text.push('\r'),
|
||||||
|
't' => text.push('\t'),
|
||||||
|
'\\' => text.push('\\'),
|
||||||
|
'"' => text.push('"'),
|
||||||
|
// If the escaped char is not handled, it is unsupported and an error
|
||||||
|
ch => Err(LexErr::InvalidStrEscape(ch))?,
|
||||||
|
},
|
||||||
|
// All other characters are simply appended to the string
|
||||||
|
ch => text.push(ch),
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Consume closing "
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
self.push_tok(T![str(text)]);
|
||||||
|
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Lex characters from the text as an identifier. The successfully lexed ident or keyword
|
||||||
|
/// token is appended to the stored tokens.
|
||||||
|
fn lex_identifier(&mut self) -> Result<(), LexErr> {
|
||||||
|
let mut ident = String::from(self.current_char);
|
||||||
|
|
||||||
|
// Do as long as a next char exists and it is a valid char for an identifier
|
||||||
|
loop {
|
||||||
|
match self.peek() {
|
||||||
|
// In the middle of an identifier numbers are also allowed
|
||||||
|
'a'..='z' | 'A'..='Z' | '0'..='9' | '_' => {
|
||||||
|
ident.push(self.next());
|
||||||
|
}
|
||||||
|
// Next char is not valid, so stop and finish the ident token
|
||||||
|
_ => break,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Check for pre-defined keywords
|
||||||
|
let token = match ident.as_str() {
|
||||||
|
"loop" => T![loop],
|
||||||
|
"print" => T![print],
|
||||||
|
"if" => T![if],
|
||||||
|
"else" => T![else],
|
||||||
|
"fun" => T![fun],
|
||||||
|
"return" => T![return],
|
||||||
|
"break" => T![break],
|
||||||
|
"continue" => T![continue],
|
||||||
|
|
||||||
|
// If it doesn't match a keyword, it is a normal identifier
|
||||||
|
_ => T![ident(ident)],
|
||||||
|
};
|
||||||
|
|
||||||
|
self.push_tok(token);
|
||||||
|
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Push the given token into the stored tokens
|
||||||
|
fn push_tok(&mut self, token: Token) {
|
||||||
|
self.tokens.push(token);
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Same as `push_tok` but also consumes the next char, removing it from the code iter. This
|
||||||
|
/// is useful when lexing double char tokens where the second char has only been peeked.
|
||||||
|
fn push_tok_consume(&mut self, token: Token) {
|
||||||
|
self.next();
|
||||||
|
self.tokens.push(token);
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Advance to next character and return the removed char. When the end of the code is reached,
|
||||||
|
/// `'\0'` is returned. This is used instead of an Option::None since it allows for much
|
||||||
|
/// shorter and cleaner code in the main loop. The `'\0'` character would not be valid anyways
|
||||||
|
fn next(&mut self) -> char {
|
||||||
|
self.code.next().unwrap_or('\0')
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Get the next character without removing it. When the end of the code is reached,
|
||||||
|
/// `'\0'` is returned. This is used instead of an Option::None since it allows for much
|
||||||
|
/// shorter and cleaner code in the main loop. The `'\0'` character would not be valid anyways
|
||||||
|
fn peek(&mut self) -> char {
|
||||||
|
self.code.peek().copied().unwrap_or('\0')
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
#[cfg(test)]
|
#[cfg(test)]
|
||||||
mod tests {
|
mod tests {
|
||||||
use super::{lex, Token};
|
use crate::{lexer::lex, T};
|
||||||
|
|
||||||
|
/// A general test to check if the lexer actually lexes tokens correctly
|
||||||
#[test]
|
#[test]
|
||||||
fn test_lexer() {
|
fn test_lexer() {
|
||||||
let code = "33 +5*2 + 4456467*2334+3";
|
let code = r#"53+1-567_000 * / % | ~ ! < > & ^ ({[]});= <- >= <=
|
||||||
|
== != && || << >> loop if else print my_123var "hello \t world\r\n\"\\""#;
|
||||||
let expected = vec![
|
let expected = vec![
|
||||||
Token::I64(33),
|
T![i64(53)],
|
||||||
Token::Add,
|
T![+],
|
||||||
Token::I64(5),
|
T![i64(1)],
|
||||||
Token::Mul,
|
T![-],
|
||||||
Token::I64(2),
|
T![i64(567_000)],
|
||||||
Token::Add,
|
T![*],
|
||||||
Token::I64(4456467),
|
T![/],
|
||||||
Token::Mul,
|
T![%],
|
||||||
Token::I64(2334),
|
T![|],
|
||||||
Token::Add,
|
T![~],
|
||||||
Token::I64(3),
|
T![!],
|
||||||
|
T![<],
|
||||||
|
T![>],
|
||||||
|
T![&],
|
||||||
|
T![^],
|
||||||
|
T!['('],
|
||||||
|
T!['{'],
|
||||||
|
T!['['],
|
||||||
|
T![']'],
|
||||||
|
T!['}'],
|
||||||
|
T![')'],
|
||||||
|
T![;],
|
||||||
|
T![=],
|
||||||
|
T![<-],
|
||||||
|
T![>=],
|
||||||
|
T![<=],
|
||||||
|
T![==],
|
||||||
|
T![!=],
|
||||||
|
T![&&],
|
||||||
|
T![||],
|
||||||
|
T![<<],
|
||||||
|
T![>>],
|
||||||
|
T![loop],
|
||||||
|
T![if],
|
||||||
|
T![else],
|
||||||
|
T![print],
|
||||||
|
T![ident("my_123var".to_string())],
|
||||||
|
T![str("hello \t world\r\n\"\\".to_string())],
|
||||||
];
|
];
|
||||||
|
|
||||||
let actual = lex(code);
|
let actual = lex(code).unwrap();
|
||||||
assert_eq!(expected, actual);
|
assert_eq!(expected, actual);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
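The rewritten lexer above relies on two ideas: `next` and `peek` return `'\0'` at the end of the input instead of `None`, and the main loop matches on the tuple `(current, peek)` so that two-character tokens are tried before their one-character prefixes. The following reduced sketch shows the same technique in isolation. `Tok`, `MiniLexer` and `lex_mini` are hypothetical names for illustration; the real lexer uses `Token`, the `T!` macro and `LexErr`.

```rust
use std::{iter::Peekable, str::Chars};

/// Hypothetical, reduced token set used only for this sketch
#[derive(Debug, PartialEq)]
enum Tok {
    Num(i64),
    Less,
    LessEqu,
    Shl,
    Assign,
    EquEqu,
}

struct MiniLexer<'a> {
    code: Peekable<Chars<'a>>,
}

impl<'a> MiniLexer<'a> {
    fn new(code: &'a str) -> Self {
        Self { code: code.chars().peekable() }
    }

    /// Advance by one char; '\0' signals the end of the input
    fn next(&mut self) -> char {
        self.code.next().unwrap_or('\0')
    }

    /// Look at the next char without consuming it; '\0' at the end
    fn peek(&mut self) -> char {
        self.code.peek().copied().unwrap_or('\0')
    }

    fn lex(mut self) -> Result<Vec<Tok>, String> {
        let mut tokens = Vec::new();
        loop {
            let current = self.next();
            match (current, self.peek()) {
                ('\0', _) => break,
                (' ' | '\t' | '\n' | '\r', _) => (),
                // Two-char tokens must come before their one-char prefixes
                ('<', '<') => { self.next(); tokens.push(Tok::Shl); }
                ('<', '=') => { self.next(); tokens.push(Tok::LessEqu); }
                ('=', '=') => { self.next(); tokens.push(Tok::EquEqu); }
                ('<', _) => tokens.push(Tok::Less),
                ('=', _) => tokens.push(Tok::Assign),
                ('0'..='9', _) => {
                    let mut sval = String::from(current);
                    while self.peek().is_ascii_digit() {
                        let digit = self.next();
                        sval.push(digit);
                    }
                    let val = sval
                        .parse::<i64>()
                        .map_err(|_| format!("Failed to parse '{}' as i64", sval))?;
                    tokens.push(Tok::Num(val));
                }
                (ch, _) => return Err(format!("unexpected char: '{}'", ch)),
            }
        }
        Ok(tokens)
    }
}

/// Convenience wrapper mirroring the shape of the public `lex` function
fn lex_mini(code: &str) -> Result<Vec<Tok>, String> {
    MiniLexer::new(code).lex()
}
```

The order of the match arms matters: if `('<', _)` came before `('<', '<')`, the input `<<` would be lexed as two `Less` tokens. One consequence of the sentinel approach is that a literal `'\0'` in the source text also ends lexing.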
67
src/lib.rs
@ -1,3 +1,68 @@
|
|||||||
|
pub mod ast;
|
||||||
|
pub mod interpreter;
|
||||||
pub mod lexer;
|
pub mod lexer;
|
||||||
pub mod parser;
|
pub mod parser;
|
||||||
pub mod interpreter;
|
pub mod token;
|
||||||
|
pub mod stringstore;
|
||||||
|
pub mod astoptimizer;
|
||||||
|
pub mod util;
|
||||||
|
|
||||||
|
/// A bunch of full program tests using the example code programs as test subjects.
|
||||||
|
#[cfg(test)]
|
||||||
|
mod tests {
|
||||||
|
use crate::interpreter::{Interpreter, Value};
|
||||||
|
use std::fs::read_to_string;
|
||||||
|
|
||||||
|
/// Run a nek program with the given filename from the examples directory and assert the
|
||||||
|
/// captured output with the expected result. This only works if the program just outputs one
|
||||||
|
/// value as the result
|
||||||
|
fn run_example_check_single_i64_output(filename: &str, correct_result: i64) {
|
||||||
|
let mut interpreter = Interpreter::new();
|
||||||
|
// Enable output capturing. This captures all calls to `print`
|
||||||
|
interpreter.capture_output = true;
|
||||||
|
|
||||||
|
// Load and run the given program
|
||||||
|
let code = read_to_string(format!("examples/{filename}")).unwrap();
|
||||||
|
interpreter.run_str(&code);
|
||||||
|
|
||||||
|
// Compare the captured output with the expected value
|
||||||
|
let expected_output = [Value::I64(correct_result)];
|
||||||
|
assert_eq!(interpreter.output(), &expected_output);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_euler1() {
|
||||||
|
run_example_check_single_i64_output("euler1.nek", 233168);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_euler2() {
|
||||||
|
run_example_check_single_i64_output("euler2.nek", 4613732);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_euler3() {
|
||||||
|
run_example_check_single_i64_output("euler3.nek", 6857);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_euler4() {
|
||||||
|
run_example_check_single_i64_output("euler4.nek", 906609);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_euler5() {
|
||||||
|
run_example_check_single_i64_output("euler5.nek", 232792560);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_recursive_fib() {
|
||||||
|
run_example_check_single_i64_output("recursive_fib.nek", 832040);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_functions() {
|
||||||
|
run_example_check_single_i64_output("test_functions.nek", 69);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|||||||
64
src/main.rs
@ -1,23 +1,59 @@
|
|||||||
use nek_lang::{lexer::lex, parser::parse, interpreter::Interpreter};
|
use std::{env::args, fs, process::exit};
|
||||||
|
|
||||||
|
use nek_lang::{interpreter::Interpreter, nice_panic};
|
||||||
|
|
||||||
|
/// CLI configuration flags and arguments. This could be done with `clap`, but since so few
|
||||||
|
/// arguments are supported this seems kind of overkill.
|
||||||
|
#[derive(Debug, Default)]
|
||||||
|
struct CliConfig {
|
||||||
|
print_tokens: bool,
|
||||||
|
print_ast: bool,
|
||||||
|
no_optimizations: bool,
|
||||||
|
file: Option<String>,
|
||||||
|
}
|
||||||
|
|
||||||
fn main() {
|
fn main() {
|
||||||
|
let mut conf = CliConfig::default();
|
||||||
|
|
||||||
let mut code = String::new();
|
// Go through all commandline arguments except the first (filename)
|
||||||
|
for arg in args().skip(1) {
|
||||||
std::io::stdin().read_line(&mut code).unwrap();
|
match arg.as_str() {
|
||||||
let code = code.trim();
|
"--token" | "-t" => conf.print_tokens = true,
|
||||||
|
"--ast" | "-a" => conf.print_ast = true,
|
||||||
let tokens = lex(&code);
|
"--no-opt" | "-n" => conf.no_optimizations = true,
|
||||||
|
"--help" | "-h" => print_help(),
|
||||||
println!("Tokens: {:?}\n", tokens);
|
file if !arg.starts_with("-") && conf.file.is_none() => {
|
||||||
|
conf.file = Some(file.to_string())
|
||||||
let ast = parse(tokens);
|
}
|
||||||
|
_ => nice_panic!("Error: Invalid argument '{}'", arg),
|
||||||
println!("Ast: {:#?}\n", ast);
|
}
|
||||||
|
}
|
||||||
|
|
||||||
let mut interpreter = Interpreter::new();
|
let mut interpreter = Interpreter::new();
|
||||||
|
|
||||||
interpreter.run(ast);
|
interpreter.print_tokens = conf.print_tokens;
|
||||||
|
interpreter.print_ast = conf.print_ast;
|
||||||
|
interpreter.optimize_ast = !conf.no_optimizations;
|
||||||
|
|
||||||
|
if let Some(file) = &conf.file {
|
||||||
|
let code = match fs::read_to_string(file) {
|
||||||
|
Ok(code) => code,
|
||||||
|
Err(_) => nice_panic!("Error: Could not read file '{}'", file),
|
||||||
|
};
|
||||||
|
// Lex, parse and run the program
|
||||||
|
interpreter.run_str(&code);
|
||||||
|
} else {
|
||||||
|
println!("Error: No file given\n");
|
||||||
|
print_help();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn print_help() {
|
||||||
|
println!("Usage nek-lang [FLAGS] [FILE]");
|
||||||
|
println!("FLAGS: ");
|
||||||
|
println!("-t, --token Print the lexed tokens");
|
||||||
|
println!("-a, --ast Print the abstract syntax tree");
|
||||||
|
println!("-n, --no-opt Disable the AST optimizations");
|
||||||
|
println!("-h, --help Show this help screen");
|
||||||
|
exit(0);
|
||||||
}
|
}
|
||||||
|
|||||||
600
src/parser.rs
@ -1,48 +1,378 @@
|
|||||||
use std::iter::Peekable;
|
use thiserror::Error;
|
||||||
|
|
||||||
use crate::lexer::Token;
|
use crate::{
|
||||||
|
ast::{Ast, BlockScope, Expression, FunDecl, If, Loop, Statement, VarDecl},
|
||||||
|
stringstore::{Sid, StringStore},
|
||||||
|
token::Token,
|
||||||
|
util::{PutBackIter, PutBackableExt},
|
||||||
|
T,
|
||||||
|
};
|
||||||
|
|
||||||
/// Types for binary operators
|
/// Errors that can occur while parsing
|
||||||
#[derive(Debug, PartialEq, Eq, Clone)]
|
#[derive(Debug, Error)]
|
||||||
pub enum BinOpType {
|
pub enum ParseErr {
|
||||||
/// Addition
|
#[error("Unexpected Token \"{0:?}\", expected \"{1}\"")]
|
||||||
Add,
|
UnexpectedToken(Token, String),
|
||||||
|
|
||||||
/// Multiplication
|
#[error("Left hand side of declaration is not a variable")]
|
||||||
Mul,
|
DeclarationOfNonVar,
|
||||||
|
|
||||||
|
#[error("Use of undefined variable \"{0}\"")]
|
||||||
|
UseOfUndeclaredVar(String),
|
||||||
|
|
||||||
|
#[error("Use of undefined function \"{0}\"")]
|
||||||
|
UseOfUndeclaredFun(String),
|
||||||
|
|
||||||
|
#[error("Redeclaration of function \"{0}\"")]
|
||||||
|
RedeclarationFun(String),
|
||||||
|
|
||||||
|
#[error("Function not declared at top level \"{0}\"")]
|
||||||
|
FunctionOnNonTopLevel(String),
|
||||||
}
|
}
|
||||||
|
|
||||||
#[derive(Debug, PartialEq, Eq, Clone)]
|
/// A result that can either be Ok, or a ParseErr
|
||||||
pub enum Ast {
|
type ResPE<T> = Result<T, ParseErr>;
|
||||||
/// Integer literal (64-bit)
|
|
||||||
I64(i64),
|
/// This macro can be used to quickly and easily assert if the next token is matching the expected
|
||||||
/// Binary operation. Consists of type, left hand side and right hand side
|
/// token and return an appropriate error if not. Since this is intended to be used inside the
|
||||||
BinOp(BinOpType, Box<Ast>, Box<Ast>),
|
/// parser, the first argument should always be `self`.
|
||||||
|
macro_rules! validate_next {
|
||||||
|
($self:ident, $expected_tok:pat, $expected_str:expr) => {
|
||||||
|
match $self.next() {
|
||||||
|
$expected_tok => (),
|
||||||
|
tok => return Err(ParseErr::UnexpectedToken(tok, format!("{}", $expected_str))),
|
||||||
|
}
|
||||||
|
};
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Parse the given tokens into an abstract syntax tree
|
||||||
|
pub fn parse<T: Iterator<Item = Token>, A: IntoIterator<IntoIter = T>>(tokens: A) -> ResPE<Ast> {
|
||||||
|
let parser = Parser::new(tokens);
|
||||||
|
parser.parse()
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A parser that takes in a Token Stream and can create a full abstract syntax tree from it.
|
||||||
struct Parser<T: Iterator<Item = Token>> {
|
struct Parser<T: Iterator<Item = Token>> {
|
||||||
tokens: Peekable<T>,
|
tokens: PutBackIter<T>,
|
||||||
|
string_store: StringStore,
|
||||||
|
var_stack: Vec<Sid>,
|
||||||
|
fun_stack: Vec<Sid>,
|
||||||
|
nesting_level: usize,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl<T: Iterator<Item = Token>> Parser<T> {
|
impl<T: Iterator<Item = Token>> Parser<T> {
|
||||||
/// Create a new parser to parse the given Token Stream
|
/// Create a new parser to parse the given Token Stream
|
||||||
fn new<A: IntoIterator<IntoIter = T>>(tokens: A) -> Self {
|
pub fn new<A: IntoIterator<IntoIter = T>>(tokens: A) -> Self {
|
||||||
let tokens = tokens.into_iter().peekable();
|
let tokens = tokens.into_iter().putbackable();
|
||||||
Self { tokens }
|
let string_store = StringStore::new();
|
||||||
|
let var_stack = Vec::new();
|
||||||
|
let fun_stack = Vec::new();
|
||||||
|
Self {
|
||||||
|
tokens,
|
||||||
|
string_store,
|
||||||
|
var_stack,
|
||||||
|
fun_stack,
|
||||||
|
nesting_level: 0,
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
fn parse(&mut self) -> Ast {
|
/// Consume the parser and try to create the abstract syntax tree from the token stream
|
||||||
self.parse_expr()
|
pub fn parse(mut self) -> ResPE<Ast> {
|
||||||
|
let main = self.parse_scoped_block()?;
|
||||||
|
Ok(Ast {
|
||||||
|
main,
|
||||||
|
stringstore: self.string_store,
|
||||||
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
fn parse_expr(&mut self) -> Ast {
|
/// Parse a series of statements together as a BlockScope. This will continuously parse
|
||||||
let lhs = self.parse_primary();
|
/// statements until encountering end-of-file or a block end '}' .
|
||||||
|
fn parse_scoped_block(&mut self) -> ResPE<BlockScope> {
|
||||||
|
self.parse_scoped_block_fp_offset(0)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Same as parse_scoped_block, but an offset to the framepointer can be specified to allow
|
||||||
|
/// for easily passing variables into scopes from the outside. This is used when parsing
|
||||||
|
/// function calls
|
||||||
|
fn parse_scoped_block_fp_offset(&mut self, framepointer_offset: usize) -> ResPE<BlockScope> {
|
||||||
|
self.nesting_level += 1;
|
||||||
|
let framepointer = self.var_stack.len() - framepointer_offset;
|
||||||
|
let mut prog = Vec::new();
|
||||||
|
|
||||||
|
loop {
|
||||||
|
match self.peek() {
|
||||||
|
// Just a semicolon is an empty statement. So just consume it
|
||||||
|
T![;] => {
|
||||||
|
self.next();
|
||||||
|
}
|
||||||
|
|
||||||
|
// '}' ends the current block and EoF ends everything, as the end of the tokenstream
|
||||||
|
// is reached
|
||||||
|
T![EoF] | T!['}'] => break,
|
||||||
|
|
||||||
|
// Create a new scoped block
|
||||||
|
T!['{'] => {
|
||||||
|
self.next();
|
||||||
|
prog.push(Statement::Block(self.parse_scoped_block()?));
|
||||||
|
|
||||||
|
validate_next!(self, T!['}'], "}");
|
||||||
|
}
|
||||||
|
|
||||||
|
// By default try to parse statements
|
||||||
|
_ => prog.push(self.parse_stmt()?),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Reset the stack to where it was before entering the scope
|
||||||
|
self.var_stack.truncate(framepointer);
|
||||||
|
self.nesting_level -= 1;
|
||||||
|
|
||||||
|
Ok(prog)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse a single statement from the tokens
|
||||||
|
fn parse_stmt(&mut self) -> ResPE<Statement> {
|
||||||
|
let stmt = match self.peek() {
|
||||||
|
// Break statement
|
||||||
|
T![break] => {
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
// After the statement, there must be a semicolon
|
||||||
|
validate_next!(self, T![;], ";");
|
||||||
|
|
||||||
|
Statement::Break
|
||||||
|
}
|
||||||
|
|
||||||
|
// Continue statement
|
||||||
|
T![continue] => {
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
// After the statement, there must be a semicolon
|
||||||
|
validate_next!(self, T![;], ";");
|
||||||
|
|
||||||
|
Statement::Continue
|
||||||
|
}
|
||||||
|
|
||||||
|
// Loop statement
|
||||||
|
T![loop] => Statement::Loop(self.parse_loop()?),
|
||||||
|
|
||||||
|
// Print statement
|
||||||
|
T![print] => {
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
let expr = self.parse_expr()?;
|
||||||
|
|
||||||
|
// After the statement, there must be a semicolon
|
||||||
|
validate_next!(self, T![;], ";");
|
||||||
|
|
||||||
|
Statement::Print(expr)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Return statement
|
||||||
|
T![return] => {
|
||||||
|
self.next();
|
||||||
|
let stmt = Statement::Return(self.parse_expr()?);
|
||||||
|
|
||||||
|
// After a statement, there must be a semicolon
|
||||||
|
validate_next!(self, T![;], ";");
|
||||||
|
|
||||||
|
stmt
|
||||||
|
}
|
||||||
|
|
||||||
|
// If statement
|
||||||
|
T![if] => Statement::If(self.parse_if()?),
|
||||||
|
|
||||||
|
// Function definition statement
|
||||||
|
T![fun] => {
|
||||||
|
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
// Expect an identifier as the function name
|
||||||
|
let fun_name = match self.next() {
|
||||||
|
T![ident(fun_name)] => fun_name,
|
||||||
|
tok => return Err(ParseErr::UnexpectedToken(tok, "<ident>".to_string())),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Only allow function definitions on the top level
|
||||||
|
if self.nesting_level > 1 {
|
||||||
|
return Err(ParseErr::FunctionOnNonTopLevel(fun_name));
|
||||||
|
}
|
||||||
|
|
||||||
|
// Intern the function name
|
||||||
|
let fun_name = self.string_store.intern_or_lookup(&fun_name);
|
||||||
|
|
||||||
|
// Check if the function name already exists
|
||||||
|
if self.fun_stack.contains(&fun_name) {
|
||||||
|
return Err(ParseErr::RedeclarationFun(
|
||||||
|
self.string_store
|
||||||
|
.lookup(fun_name)
|
||||||
|
.cloned()
|
||||||
|
.unwrap_or("<unknown>".to_string()),
|
||||||
|
));
|
||||||
|
}
|
||||||
|
|
||||||
|
// Put the function name on the function stack for precalculating the stack
|
||||||
|
// positions
|
||||||
|
let fun_stackpos = self.fun_stack.len();
|
||||||
|
self.fun_stack.push(fun_name);
|
||||||
|
|
||||||
|
|
||||||
|
let mut arg_names = Vec::new();
|
||||||
|
|
||||||
|
validate_next!(self, T!['('], "(");
|
||||||
|
|
||||||
|
// Parse the optional arguments inside the parentheses
|
||||||
|
while matches!(self.peek(), T![ident(_)]) {
|
||||||
|
let var_name = match self.next() {
|
||||||
|
T![ident(var_name)] => var_name,
|
||||||
|
_ => unreachable!(),
|
||||||
|
};
|
||||||
|
|
||||||
|
// Intern argument names
|
||||||
|
let var_name = self.string_store.intern_or_lookup(&var_name);
|
||||||
|
arg_names.push(var_name);
|
||||||
|
|
||||||
|
// Push the variable onto the varstack
|
||||||
|
self.var_stack.push(var_name);
|
||||||
|
|
||||||
|
// If there are more args skip the comma so that the loop will read the argname
|
||||||
|
if self.peek() == &T![,] {
|
||||||
|
self.next();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
validate_next!(self, T![')'], ")");
|
||||||
|
|
||||||
|
validate_next!(self, T!['{'], "{");
|
||||||
|
|
||||||
|
// Create the scoped block with a stack offset. This will pop the args that are
|
||||||
|
// added to the stack while parsing args
|
||||||
|
let body = self.parse_scoped_block_fp_offset(arg_names.len())?;
|
||||||
|
|
||||||
|
validate_next!(self, T!['}'], "}");
|
||||||
|
|
||||||
|
Statement::FunDeclare(FunDecl {
|
||||||
|
name: fun_name,
|
||||||
|
fun_stackpos,
|
||||||
|
argnames: arg_names,
|
||||||
|
body: body.into(),
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
// Either a variable declaration statement or an expression statement
|
||||||
|
_ => {
|
||||||
|
// To decide if it is a declaration or an expression, a lookahead is needed
|
||||||
|
let first = self.next();
|
||||||
|
|
||||||
|
let stmt = match (first, self.peek()) {
|
||||||
|
// Identifier and "<-" is a declaration
|
||||||
|
(T![ident(name)], T![<-]) => {
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
let rhs = self.parse_expr()?;
|
||||||
|
|
||||||
|
let sid = self.string_store.intern_or_lookup(&name);
|
||||||
|
let sp = self.var_stack.len();
|
||||||
|
self.var_stack.push(sid);
|
||||||
|
|
||||||
|
Statement::Declaration(VarDecl {
|
||||||
|
name: sid,
|
||||||
|
var_stackpos: sp,
|
||||||
|
rhs,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
// Anything else must be an expression
|
||||||
|
(first, _) => {
|
||||||
|
// Put the first token back in order for the parse_expr to see it
|
||||||
|
self.putback(first);
|
||||||
|
Statement::Expr(self.parse_expr()?)
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// After a statement, there must be a semicolon
|
||||||
|
validate_next!(self, T![;], ";");
|
||||||
|
|
||||||
|
stmt
|
||||||
|
}
|
||||||
|
};
|
||||||
|
Ok(stmt)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse an if statement from the tokens
|
||||||
|
fn parse_if(&mut self) -> ResPE<If> {
|
||||||
|
validate_next!(self, T![if], "if");
|
||||||
|
|
||||||
|
let condition = self.parse_expr()?;
|
||||||
|
|
||||||
|
validate_next!(self, T!['{'], "{");
|
||||||
|
|
||||||
|
let body_true = self.parse_scoped_block()?;
|
||||||
|
|
||||||
|
validate_next!(self, T!['}'], "}");
|
||||||
|
|
||||||
|
let mut body_false = BlockScope::default();
|
||||||
|
|
||||||
|
// Optionally parse the else part
|
||||||
|
if self.peek() == &T![else] {
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
validate_next!(self, T!['{'], "{");
|
||||||
|
|
||||||
|
body_false = self.parse_scoped_block()?;
|
||||||
|
|
||||||
|
validate_next!(self, T!['}'], "}");
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(If {
|
||||||
|
condition,
|
||||||
|
body_true,
|
||||||
|
body_false,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse a loop statement from the tokens
|
||||||
|
fn parse_loop(&mut self) -> ResPE<Loop> {
|
||||||
|
validate_next!(self, T![loop], "loop");
|
||||||
|
|
||||||
|
let mut condition = None;
|
||||||
|
let mut advancement = None;
|
||||||
|
|
||||||
|
// Check if the optional condition is present
|
||||||
|
if !matches!(self.peek(), T!['{']) {
|
||||||
|
condition = Some(self.parse_expr()?);
|
||||||
|
|
||||||
|
// Check if the optional advancement is present
|
||||||
|
if matches!(self.peek(), T![;]) {
|
||||||
|
self.next();
|
||||||
|
advancement = Some(self.parse_expr()?);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
validate_next!(self, T!['{'], "{");
|
||||||
|
|
||||||
|
let body = self.parse_scoped_block()?;
|
||||||
|
|
||||||
|
validate_next!(self, T!['}'], "}");
|
||||||
|
|
||||||
|
Ok(Loop {
|
||||||
|
condition,
|
||||||
|
advancement,
|
||||||
|
body,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse a single expression from the tokens
|
||||||
|
fn parse_expr(&mut self) -> ResPE<Expression> {
|
||||||
|
let lhs = self.parse_primary()?;
|
||||||
self.parse_expr_precedence(lhs, 0)
|
self.parse_expr_precedence(lhs, 0)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Parse binary expressions with a precedence equal to or higher than min_prec
|
/// Parse binary expressions with a precedence equal to or higher than min_prec.
|
||||||
fn parse_expr_precedence(&mut self, mut lhs: Ast, min_prec: u8) -> Ast {
|
/// This uses the precedence climbing method for dealing with the operator precedences:
|
||||||
|
/// https://en.wikipedia.org/wiki/Operator-precedence_parser#Precedence_climbing_method
|
||||||
|
fn parse_expr_precedence(&mut self, mut lhs: Expression, min_prec: u8) -> ResPE<Expression> {
|
||||||
while let Some(binop) = &self.peek().try_to_binop() {
|
while let Some(binop) = &self.peek().try_to_binop() {
|
||||||
|
// Stop if the next operator has a lower binding power
|
||||||
if !(binop.precedence() >= min_prec) {
|
if !(binop.precedence() >= min_prec) {
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
@ -51,89 +381,211 @@ impl<T: Iterator<Item = Token>> Parser<T> {
|
|||||||
// valid
|
// valid
|
||||||
let binop = self.next().try_to_binop().unwrap();
|
let binop = self.next().try_to_binop().unwrap();
|
||||||
|
|
||||||
let mut rhs = self.parse_primary();
|
let mut rhs = self.parse_primary()?;
|
||||||
|
|
||||||
while let Some(binop2) = &self.peek().try_to_binop() {
|
while let Some(binop2) = &self.peek().try_to_binop() {
|
||||||
if !(binop2.precedence() > binop.precedence()) {
|
if !(binop2.precedence() > binop.precedence()) {
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
|
||||||
rhs = self.parse_expr_precedence(rhs, binop.precedence() + 1);
|
rhs = self.parse_expr_precedence(rhs, binop.precedence() + 1)?;
|
||||||
}
|
}
|
||||||
|
|
||||||
lhs = Ast::BinOp(binop, lhs.into(), rhs.into());
|
lhs = Expression::BinOp(binop, lhs.into(), rhs.into());
|
||||||
}
|
}
|
||||||
|
|
||||||
lhs
|
Ok(lhs)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Parse a primary expression (for now only number)
|
/// Parse a primary expression. A primary can be a literal value, variable, function call,
|
||||||
fn parse_primary(&mut self) -> Ast {
|
/// array indexing, parentheses grouping or a unary operation
|
||||||
match self.next() {
|
fn parse_primary(&mut self) -> ResPE<Expression> {
|
||||||
Token::I64(val) => Ast::I64(val),
|
let primary = match self.next() {
|
||||||
|
// Literal i64
|
||||||
|
T![i64(val)] => Expression::I64(val),
|
||||||
|
|
||||||
tok => panic!("Error parsing primary expr: Unexpected Token '{:?}'", tok),
|
// Literal String
|
||||||
|
T![str(text)] => Expression::String(self.string_store.intern_or_lookup(&text)),
|
||||||
|
|
||||||
|
// Array literal. Square brackets containing the array size as expression
|
||||||
|
T!['['] => {
|
||||||
|
let size = self.parse_expr()?;
|
||||||
|
|
||||||
|
validate_next!(self, T![']'], "]");
|
||||||
|
|
||||||
|
Expression::ArrayLiteral(size.into())
|
||||||
|
}
|
||||||
|
|
||||||
|
// Array access, aka indexing. An ident followed by square brackets containing the
|
||||||
|
// index as an expression
|
||||||
|
T![ident(name)] if self.peek() == &T!['['] => {
|
||||||
|
// Get the stack position of the array variable
|
||||||
|
let sid = self.string_store.intern_or_lookup(&name);
|
||||||
|
let stackpos = self.get_stackpos(sid)?;
|
||||||
|
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
let index = self.parse_expr()?;
|
||||||
|
|
||||||
|
validate_next!(self, T![']'], "]");
|
||||||
|
|
||||||
|
Expression::ArrayAccess(sid, stackpos, index.into())
|
||||||
|
}
|
||||||
|
|
||||||
|
// Identifier followed by parenthesis is a function call
|
||||||
|
T![ident(name)] if self.peek() == &T!['('] => {
|
||||||
|
// Skip the opening parenthesis
|
||||||
|
self.next();
|
||||||
|
|
||||||
|
let sid = self.string_store.intern_or_lookup(&name);
|
||||||
|
|
||||||
|
let mut args = Vec::new();
|
||||||
|
|
||||||
|
// Parse the arguments as expressions
|
||||||
|
while !matches!(self.peek(), T![')']) {
|
||||||
|
let arg = self.parse_expr()?;
|
||||||
|
args.push(arg);
|
||||||
|
|
||||||
|
// If there are more args skip the comma so that the loop will read the next argument
|
||||||
|
if self.peek() == &T![,] {
|
||||||
|
self.next();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Get the next Token without removing it
|
validate_next!(self, T![')'], ")");
|
||||||
|
|
||||||
|
// Find the function stack position
|
||||||
|
let fun_stackpos = self.get_fun_stackpos(sid)?;
|
||||||
|
|
||||||
|
Expression::FunCall(sid, fun_stackpos, args)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Just an identifier is a variable
|
||||||
|
T![ident(name)] => {
|
||||||
|
// Find the variable stack position
|
||||||
|
let sid = self.string_store.intern_or_lookup(&name);
|
||||||
|
let stackpos = self.get_stackpos(sid)?;
|
||||||
|
|
||||||
|
Expression::Var(sid, stackpos)
|
||||||
|
}
|
||||||
|
|
||||||
|
// Parentheses grouping
|
||||||
|
T!['('] => {
|
||||||
|
// Any other expression can be contained between the parentheses
|
||||||
|
let inner_expr = self.parse_expr()?;
|
||||||
|
|
||||||
|
// Verify that there is a closing parenthesis
|
||||||
|
validate_next!(self, T![')'], ")");
|
||||||
|
|
||||||
|
inner_expr
|
||||||
|
}
|
||||||
|
|
||||||
|
// Unary operations or invalid token
|
||||||
|
tok => match tok.try_to_unop() {
|
||||||
|
// If the token is a valid unary operation, parse it as such
|
||||||
|
Some(uot) => Expression::UnOp(uot, self.parse_primary()?.into()),
|
||||||
|
|
||||||
|
// Otherwise it's an unexpected token
|
||||||
|
None => return Err(ParseErr::UnexpectedToken(tok, "primary".to_string())),
|
||||||
|
},
|
||||||
|
};
|
||||||
|
|
||||||
|
Ok(primary)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Try to get the position of a variable on the variable stack. This is needed to precalculate
|
||||||
|
/// the stackpositions in order to save time when executing
|
||||||
|
fn get_stackpos(&self, varid: Sid) -> ResPE<usize> {
|
||||||
|
self.var_stack
|
||||||
|
.iter()
|
||||||
|
.rev()
|
||||||
|
.position(|it| *it == varid)
|
||||||
|
.map(|it| it)
|
||||||
|
.ok_or(ParseErr::UseOfUndeclaredVar(
|
||||||
|
self.string_store
|
||||||
|
.lookup(varid)
|
||||||
|
.map(String::from)
|
||||||
|
.unwrap_or("<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Try to get the position of a function on the function stack. This is needed to precalculate
|
||||||
|
/// the stackpositions in order to save time when executing
|
||||||
|
fn get_fun_stackpos(&self, varid: Sid) -> ResPE<usize> {
|
||||||
|
self.fun_stack
|
||||||
|
.iter()
|
||||||
|
.rev()
|
||||||
|
.position(|it| *it == varid)
|
||||||
|
.map(|it| self.fun_stack.len() - it - 1)
|
||||||
|
.ok_or(ParseErr::UseOfUndeclaredFun(
|
||||||
|
self.string_store
|
||||||
|
.lookup(varid)
|
||||||
|
.map(String::from)
|
||||||
|
.unwrap_or("<unknown>".to_string()),
|
||||||
|
))
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Get the next Token without removing it. If there are no more tokens left, the EoF token is
|
||||||
|
/// returned. This follows the same reasoning as in the Lexer
|
||||||
fn peek(&mut self) -> &Token {
|
fn peek(&mut self) -> &Token {
|
||||||
self.tokens.peek().unwrap_or(&Token::EoF)
|
self.tokens.peek().unwrap_or(&T![EoF])
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Advance to next Token and return the removed Token
|
/// Put a single token back into the token stream
|
||||||
|
fn putback(&mut self, tok: Token) {
|
||||||
|
self.tokens.putback(tok);
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Advance to next Token and return the removed Token. If there are no more tokens left, the
|
||||||
|
/// EoF token is returned. This follows the same reasoning as in the Lexer
|
||||||
fn next(&mut self) -> Token {
|
fn next(&mut self) -> Token {
|
||||||
self.tokens.next().unwrap_or(Token::EoF)
|
self.tokens.next().unwrap_or(T![EoF])
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
pub fn parse<T: Iterator<Item = Token>, A: IntoIterator<IntoIter = T>>(tokens: A) -> Ast {
|
|
||||||
let mut parser = Parser::new(tokens);
|
|
||||||
parser.parse()
|
|
||||||
}
|
|
||||||
|
|
||||||
impl BinOpType {
|
|
||||||
/// Get the precedence for a binary operator. Higher value means the OP is stronger binding.
|
|
||||||
/// For example Multiplication is stronger than addition, so Mul has higher precedence than Add.
|
|
||||||
fn precedence(&self) -> u8 {
|
|
||||||
match self {
|
|
||||||
BinOpType::Add => 0,
|
|
||||||
BinOpType::Mul => 1,
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
#[cfg(test)]
|
#[cfg(test)]
|
||||||
mod tests {
|
mod tests {
|
||||||
use super::{parse, Ast, BinOpType};
|
use crate::{
|
||||||
use crate::lexer::Token;
|
ast::{BinOpType, Expression, Statement},
|
||||||
|
parser::parse,
|
||||||
|
T,
|
||||||
|
};
|
||||||
|
|
||||||
|
/// A very simple test to check if the parser correctly parses a simple expression
|
||||||
#[test]
|
#[test]
|
||||||
fn test_parser() {
|
fn test_parser() {
|
||||||
// Expression: 1 + 2 * 3 + 4
|
// Expression: 1 + 2 * 3 - 4
|
||||||
// With precedence: (1 + (2 * 3)) + 4
|
// With precedence: (1 + (2 * 3)) - 4
|
||||||
let tokens = [
|
let tokens = [
|
||||||
Token::I64(1),
|
T![i64(1)],
|
||||||
Token::Add,
|
T![+],
|
||||||
Token::I64(2),
|
T![i64(2)],
|
||||||
Token::Mul,
|
T![*],
|
||||||
Token::I64(3),
|
T![i64(3)],
|
||||||
Token::Add,
|
T![-],
|
||||||
Token::I64(4),
|
T![i64(4)],
|
||||||
|
T![;],
|
||||||
];
|
];
|
||||||
|
|
||||||
let expected = Ast::BinOp(
|
let expected = Statement::Expr(Expression::BinOp(
|
||||||
|
BinOpType::Sub,
|
||||||
|
Expression::BinOp(
|
||||||
BinOpType::Add,
|
BinOpType::Add,
|
||||||
Ast::BinOp(
|
Expression::I64(1).into(),
|
||||||
BinOpType::Add,
|
Expression::BinOp(
|
||||||
Ast::I64(1).into(),
|
BinOpType::Mul,
|
||||||
Ast::BinOp(BinOpType::Mul, Ast::I64(2).into(), Ast::I64(3).into()).into(),
|
Expression::I64(2).into(),
|
||||||
|
Expression::I64(3).into(),
|
||||||
)
|
)
|
||||||
.into(),
|
.into(),
|
||||||
Ast::I64(4).into(),
|
)
|
||||||
);
|
.into(),
|
||||||
|
Expression::I64(4).into(),
|
||||||
|
));
|
||||||
|
|
||||||
let actual = parse(tokens);
|
let expected = vec![expected];
|
||||||
assert_eq!(expected, actual);
|
|
||||||
|
let actual = parse(tokens).unwrap();
|
||||||
|
assert_eq!(expected, actual.main);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
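The precedence climbing loop used by `parse_expr_precedence` can be hard to follow inside a diff, so here is a minimal standalone sketch of the same technique. It is not the project's parser: the `Tok` and `Expr` types, the `prec` table and the `eval` helper are assumptions made up for illustration, and only numbers with `+`, `-` and `*` are supported.

```rust
use std::iter::Peekable;

// Hypothetical token type, much smaller than the project's `Token`
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Num(i64),
    Add,
    Sub,
    Mul,
}

// Hypothetical expression tree
#[derive(Debug, PartialEq)]
enum Expr {
    Num(i64),
    Bin(Tok, Box<Expr>, Box<Expr>),
}

/// Binding power of an operator token, `None` for non-operators
fn prec(tok: &Tok) -> Option<u8> {
    match tok {
        Tok::Add | Tok::Sub => Some(0),
        Tok::Mul => Some(1),
        Tok::Num(_) => None,
    }
}

fn parse_primary<I: Iterator<Item = Tok>>(toks: &mut Peekable<I>) -> Result<Expr, String> {
    match toks.next() {
        Some(Tok::Num(n)) => Ok(Expr::Num(n)),
        other => Err(format!("expected a number, got {:?}", other)),
    }
}

fn parse_prec<I: Iterator<Item = Tok>>(
    toks: &mut Peekable<I>,
    mut lhs: Expr,
    min_prec: u8,
) -> Result<Expr, String> {
    // Keep consuming operators that bind at least as tightly as min_prec
    while let Some(op) = toks
        .peek()
        .copied()
        .filter(|t| prec(t).map_or(false, |p| p >= min_prec))
    {
        let op_prec = prec(&op).unwrap();
        toks.next();
        let mut rhs = parse_primary(toks)?;

        // A following operator that binds tighter takes rhs as its left operand
        while toks.peek().and_then(|t| prec(t)).map_or(false, |p| p > op_prec) {
            rhs = parse_prec(toks, rhs, op_prec + 1)?;
        }

        lhs = Expr::Bin(op, Box::new(lhs), Box::new(rhs));
    }
    Ok(lhs)
}

fn parse_expr(toks: Vec<Tok>) -> Result<Expr, String> {
    let mut toks = toks.into_iter().peekable();
    let lhs = parse_primary(&mut toks)?;
    parse_prec(&mut toks, lhs, 0)
}

/// Evaluate the tree, which makes the chosen grouping observable
fn eval(expr: &Expr) -> i64 {
    match expr {
        Expr::Num(n) => *n,
        Expr::Bin(op, l, r) => match op {
            Tok::Add => eval(l) + eval(r),
            Tok::Sub => eval(l) - eval(r),
            Tok::Mul => eval(l) * eval(r),
            Tok::Num(_) => unreachable!("numbers are never stored as operators"),
        },
    }
}
```

Feeding the tokens for `1 + 2 * 3 - 4` to `parse_expr` should group them as `(1 + (2 * 3)) - 4`, the same shape the parser test in the diff expects. Operators of equal precedence are handled by the outer loop, so they come out left-associative.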
104
src/stringstore.rs
Normal file
@ -0,0 +1,104 @@
|
|||||||
|
use std::collections::HashMap;
|
||||||
|
|
||||||
|
/// A StringID that identifies a String inside the stringstore. This is only valid for the
|
||||||
|
/// StringStore that created the ID. These StringIDs can be trivially and cheaply copied
|
||||||
|
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
|
||||||
|
pub struct Sid(usize);
|
||||||
|
|
||||||
|
/// A Datastructure that stores strings, handing out StringIDs that can be used to retrieve the
|
||||||
|
/// real strings at a later point. This is called interning.
|
||||||
|
#[derive(Clone, Default)]
|
||||||
|
pub struct StringStore {
|
||||||
|
/// The actual strings that are stored in the StringStore. The StringIDs match the index of the
|
||||||
|
/// string inside of this strings vector
|
||||||
|
strings: Vec<String>,
|
||||||
|
/// A Hashmap that allows to match already interned Strings to their StringID. This allows for
|
||||||
|
/// deduplication since the same string won't be stored twice
|
||||||
|
sids: HashMap<String, Sid>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl StringStore {
|
||||||
|
|
||||||
|
/// Create a new empty StringStore
|
||||||
|
pub fn new() -> Self {
|
||||||
|
Self { strings: Vec::new(), sids: HashMap::new() }
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Put the given string into the StringStore and get a StringID in return. If the string is
|
||||||
|
/// not yet stored, it will be after this.
|
||||||
|
///
|
||||||
|
/// Note: The generated StringIDs are only valid for the StringStore that created them. Using
|
||||||
|
/// the IDs with another StringStore is a logic error. It might return wrong Strings or
|
||||||
|
/// None.
|
||||||
|
pub fn intern_or_lookup(&mut self, text: &str) -> Sid {
|
||||||
|
self.sids.get(text).copied().unwrap_or_else(|| {
|
||||||
|
let sid = Sid(self.strings.len());
|
||||||
|
self.strings.push(text.to_string());
|
||||||
|
self.sids.insert(text.to_string(), sid);
|
||||||
|
sid
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Lookup and retrieve a string by the StringID. If the String is not found, None is returned.
|
||||||
|
///
|
||||||
|
/// Note: The generated StringIDs are only valid for the StringStore that created them. Using
|
||||||
|
/// the IDs with another StringStore is a logic error. It might return wrong Strings or
|
||||||
|
/// None.
|
||||||
|
pub fn lookup(&self, sid: Sid) -> Option<&String> {
|
||||||
|
self.strings.get(sid.0)
|
||||||
|
}
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(test)]
|
||||||
|
mod tests {
|
||||||
|
use super::StringStore;
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_stringstore_intern_lookup() {
|
||||||
|
let mut ss = StringStore::new();
|
||||||
|
let s1 = "Hello";
|
||||||
|
let s2 = "World";
|
||||||
|
|
||||||
|
let id1 = ss.intern_or_lookup(s1);
|
||||||
|
assert_eq!(ss.lookup(id1).unwrap().as_str(), s1);
|
||||||
|
|
||||||
|
let id2 = ss.intern_or_lookup(s2);
|
||||||
|
assert_eq!(ss.lookup(id2).unwrap().as_str(), s2);
|
||||||
|
assert_eq!(ss.lookup(id1).unwrap().as_str(), s1);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn test_stringstore_no_duplicates() {
|
||||||
|
let mut ss = StringStore::new();
|
||||||
|
let s1 = "Hello";
|
||||||
|
let s2 = "World";
|
||||||
|
|
||||||
|
let id1_1 = ss.intern_or_lookup(s1);
|
||||||
|
assert_eq!(ss.lookup(id1_1).unwrap().as_str(), s1);
|
||||||
|
|
||||||
|
let id1_2 = ss.intern_or_lookup(s1);
|
||||||
|
assert_eq!(ss.lookup(id1_2).unwrap().as_str(), s1);
|
||||||
|
|
||||||
|
// Check that the StringID is the same
|
||||||
|
assert_eq!(id1_1, id1_2);
|
||||||
|
|
||||||
|
// Check that only one string is actually stored
|
||||||
|
assert_eq!(ss.strings.len(), 1);
|
||||||
|
assert_eq!(ss.sids.len(), 1);
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
let id2_1 = ss.intern_or_lookup(s2);
|
||||||
|
assert_eq!(ss.lookup(id2_1).unwrap().as_str(), s2);
|
||||||
|
|
||||||
|
let id2_2 = ss.intern_or_lookup(s2);
|
||||||
|
assert_eq!(ss.lookup(id2_2).unwrap().as_str(), s2);
|
||||||
|
|
||||||
|
// Check that the StringID is the same
|
||||||
|
assert_eq!(id2_1, id2_2);
|
||||||
|
|
||||||
|
assert_eq!(ss.strings.len(), 2);
|
||||||
|
assert_eq!(ss.sids.len(), 2);
|
||||||
|
}
|
||||||
|
}
|
||||||
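The doc comments in `src/stringstore.rs` warn that a StringID is only meaningful for the store that created it. The sketch below tries to make that concrete. It uses a reduced stand-in for the store (the `Interner` type and the `cross_store_lookup` function are hypothetical names, not part of the project) and shows both failure modes the comment mentions: a wrong string and `None`.

```rust
use std::collections::HashMap;

// Stand-in for the project's `Sid`: an index into the owning store
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Sid(usize);

// Stand-in for `StringStore`, reduced to the two operations needed here
#[derive(Default)]
struct Interner {
    strings: Vec<String>,
    sids: HashMap<String, Sid>,
}

impl Interner {
    /// Return the existing ID for `text`, or store it and hand out a new ID
    fn intern(&mut self, text: &str) -> Sid {
        if let Some(&sid) = self.sids.get(text) {
            return sid;
        }
        let sid = Sid(self.strings.len());
        self.strings.push(text.to_string());
        self.sids.insert(text.to_string(), sid);
        sid
    }

    /// Resolve an ID back to its string, if the index exists in this store
    fn lookup(&self, sid: Sid) -> Option<&str> {
        self.strings.get(sid.0).map(String::as_str)
    }
}

/// Look up IDs created by store `a` in an unrelated store `b`
fn cross_store_lookup() -> (Option<String>, Option<String>) {
    let mut a = Interner::default();
    let mut b = Interner::default();

    let foo_in_a = a.intern("foo"); // index 0 in `a`
    b.intern("bar"); // index 0 in `b`
    let baz_in_a = a.intern("baz"); // index 1 in `a`, `b` has no index 1

    (
        b.lookup(foo_in_a).map(str::to_owned),
        b.lookup(baz_in_a).map(str::to_owned),
    )
}
```

Under these assumptions the first lookup silently resolves to `"bar"` and the second yields `None`. Neither case panics, which is why mixing stores is easy to miss.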
379
src/token.rs
Normal file
@ -0,0 +1,379 @@
|
|||||||
|
use crate::{
|
||||||
|
ast::{BinOpType, UnOpType},
|
||||||
|
T,
|
||||||
|
};
|
||||||
|
|
||||||
|
/// Language keywords
|
||||||
|
#[derive(Debug, PartialEq, Eq)]
|
||||||
|
pub enum Keyword {
|
||||||
|
/// Loop keyword ("loop")
|
||||||
|
Loop,
|
||||||
|
/// Print keyword ("print")
|
||||||
|
Print,
|
||||||
|
/// If keyword ("if")
|
||||||
|
If,
|
||||||
|
/// Else keyword ("else")
|
||||||
|
Else,
|
||||||
|
/// Function declaration keyword ("fun")
|
||||||
|
Fun,
|
||||||
|
/// Return keyword ("return")
|
||||||
|
Return,
|
||||||
|
/// Break keyword ("break")
|
||||||
|
Break,
|
||||||
|
/// Continue keyword ("continue")
|
||||||
|
Continue,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Literal values
|
||||||
|
#[derive(Debug, PartialEq, Eq)]
|
||||||
|
pub enum Literal {
|
||||||
|
/// Integer literal (64-bit)
|
||||||
|
I64(i64),
|
||||||
|
/// String literal
|
||||||
|
String(String),
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Combined tokens that consist of a combination of characters
|
||||||
|
#[derive(Debug, PartialEq, Eq)]
|
||||||
|
pub enum Combo {
|
||||||
|
/// Equal Equal ("==")
|
||||||
|
Equal2,
|
||||||
|
|
||||||
|
/// Exclamation mark Equal ("!=")
|
||||||
|
ExclamationMarkEqual,
|
||||||
|
|
||||||
|
/// Ampersand Ampersand ("&&")
|
||||||
|
Ampersand2,
|
||||||
|
|
||||||
|
/// Pipe Pipe ("||")
|
||||||
|
Pipe2,
|
||||||
|
|
||||||
|
/// LessThan LessThan ("<<")
|
||||||
|
LessThan2,
|
||||||
|
|
||||||
|
/// GreaterThan GreaterThan (">>")
|
||||||
|
GreaterThan2,
|
||||||
|
|
||||||
|
/// LessThan Equal ("<=")
|
||||||
|
LessThanEqual,
|
||||||
|
|
||||||
|
/// GreaterThan Equal (">=")
|
||||||
|
GreaterThanEqual,
|
||||||
|
|
||||||
|
/// LessThan Minus ("<-")
|
||||||
|
LessThanMinus,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Tokens are a group of one or more sourcecode characters that have a meaning together
|
||||||
|
#[derive(Debug, PartialEq, Eq)]
|
||||||
|
pub enum Token {
|
||||||
|
/// Literal value token
|
||||||
|
Literal(Literal),
|
||||||
|
|
||||||
|
/// Keyword token
|
||||||
|
Keyword(Keyword),
|
||||||
|
|
||||||
|
/// Identifier token (names for variables, functions, ...)
|
||||||
|
Ident(String),
|
||||||
|
|
||||||
|
/// Combined tokens consisting of multiple characters
|
||||||
|
Combo(Combo),
|
||||||
|
|
||||||
|
/// Comma (",")
|
||||||
|
Comma,
|
||||||
|
|
||||||
|
/// Equal Sign ("=")
|
||||||
|
Equal,
|
||||||
|
|
||||||
|
/// Semicolon (";")
|
||||||
|
Semicolon,
|
||||||
|
|
||||||
|
/// End of file (This is not generated by the lexer, but the parser uses this to find the
|
||||||
|
/// end of the token stream)
|
||||||
|
EoF,
|
||||||
|
|
||||||
|
/// Left Bracket ("[")
|
||||||
|
LBracket,
|
||||||
|
|
||||||
|
/// Right Bracket ("]")
|
||||||
|
RBracket,
|
||||||
|
|
||||||
|
/// Left Parenthesis ("(")
|
||||||
|
LParen,
|
||||||
|
|
||||||
|
/// Right Parenthesis (")")
|
||||||
|
RParen,
|
||||||
|
|
||||||
|
/// Left curly braces ("{")
|
||||||
|
LBraces,
|
||||||
|
|
||||||
|
/// Right curly braces ("}")
|
||||||
|
RBraces,
|
||||||
|
|
||||||
|
/// Plus ("+")
|
||||||
|
Plus,
|
||||||
|
|
||||||
|
/// Minus ("-")
|
||||||
|
Minus,
|
||||||
|
|
||||||
|
/// Asterisk ("*")
|
||||||
|
Asterisk,
|
||||||
|
|
||||||
|
/// Slash ("/")
|
||||||
|
Slash,
|
||||||
|
|
||||||
|
/// Percent ("%")
|
||||||
|
Percent,
|
||||||
|
|
||||||
|
/// Pipe ("|")
|
||||||
|
Pipe,
|
||||||
|
|
||||||
|
/// Tilde ("~")
|
||||||
|
Tilde,
|
||||||
|
|
||||||
|
/// Logical not ("!")
|
||||||
|
Exclamationmark,
|
||||||
|
|
||||||
|
/// Left angle bracket ("<")
|
||||||
|
LessThan,
|
||||||
|
|
||||||
|
/// Right angle bracket (">")
|
||||||
|
GreaterThan,
|
||||||
|
|
||||||
|
/// Ampersand ("&")
|
||||||
|
Ampersand,
|
||||||
|
|
||||||
|
/// Circumflex ("^")
|
||||||
|
Circumflex,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Token {
|
||||||
|
/// If the Token can be used as a binary operation type, get the matching BinOpType. Otherwise
|
||||||
|
/// return None.
|
||||||
|
pub fn try_to_binop(&self) -> Option<BinOpType> {
|
||||||
|
Some(match self {
|
||||||
|
T![+] => BinOpType::Add,
|
||||||
|
T![-] => BinOpType::Sub,
|
||||||
|
|
||||||
|
T![*] => BinOpType::Mul,
|
||||||
|
T![/] => BinOpType::Div,
|
||||||
|
T![%] => BinOpType::Mod,
|
||||||
|
|
||||||
|
T![&] => BinOpType::BAnd,
|
||||||
|
T![|] => BinOpType::BOr,
|
||||||
|
T![^] => BinOpType::BXor,
|
||||||
|
|
||||||
|
T![&&] => BinOpType::LAnd,
|
||||||
|
T![||] => BinOpType::LOr,
|
||||||
|
|
||||||
|
T![<<] => BinOpType::Shl,
|
||||||
|
T![>>] => BinOpType::Shr,
|
||||||
|
|
||||||
|
T![==] => BinOpType::EquEqu,
|
||||||
|
T![!=] => BinOpType::NotEqu,
|
||||||
|
|
||||||
|
T![<] => BinOpType::Less,
|
||||||
|
T![<=] => BinOpType::LessEqu,
|
||||||
|
|
||||||
|
T![>] => BinOpType::Greater,
|
||||||
|
T![>=] => BinOpType::GreaterEqu,
|
||||||
|
|
||||||
|
T![=] => BinOpType::Assign,
|
||||||
|
|
||||||
|
_ => return None,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
/// If the token can be used as a unary operation type, get the matching UnOpType. Otherwise
|
||||||
|
/// return None
|
||||||
|
pub fn try_to_unop(&self) -> Option<UnOpType> {
|
||||||
|
Some(match self {
|
||||||
|
T![-] => UnOpType::Negate,
|
||||||
|
T![!] => UnOpType::LNot,
|
||||||
|
T![~] => UnOpType::BNot,
|
||||||
|
|
||||||
|
_ => return None,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Macro to quickly create a token of the specified kind. As this is implemented as a macro, it
|
||||||
|
/// can be used anywhere including in patterns.
|
||||||
|
///
|
||||||
|
/// An implementation should exist for each token, so that there is no need to ever write out the
|
||||||
|
/// long token definitions.
|
||||||
|
#[macro_export]
|
||||||
|
macro_rules! T {
|
||||||
|
// Keywords
|
||||||
|
[loop] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::Loop)
|
||||||
|
};
|
||||||
|
|
||||||
|
[print] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::Print)
|
||||||
|
};
|
||||||
|
|
||||||
|
[if] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::If)
|
||||||
|
};
|
||||||
|
|
||||||
|
[else] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::Else)
|
||||||
|
};
|
||||||
|
|
||||||
|
[fun] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::Fun)
|
||||||
|
};
|
||||||
|
|
||||||
|
[return] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::Return)
|
||||||
|
};
|
||||||
|
|
||||||
|
[break] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::Break)
|
||||||
|
};
|
||||||
|
|
||||||
|
[continue] => {
|
||||||
|
crate::token::Token::Keyword(crate::token::Keyword::Continue)
|
||||||
|
};
|
||||||
|
|
||||||
|
// Literals
|
||||||
|
[i64($($val:tt)*)] => {
|
||||||
|
crate::token::Token::Literal(crate::token::Literal::I64($($val)*))
|
||||||
|
};
|
||||||
|
|
||||||
|
[str($($val:tt)*)] => {
|
||||||
|
crate::token::Token::Literal(crate::token::Literal::String($($val)*))
|
||||||
|
};
|
||||||
|
|
||||||
|
// Ident
|
||||||
|
[ident($($val:tt)*)] => {
|
||||||
|
crate::token::Token::Ident($($val)*)
|
||||||
|
};
|
||||||
|
|
||||||
|
// Combo Tokens
|
||||||
|
[==] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::Equal2)
|
||||||
|
};
|
||||||
|
|
||||||
|
[!=] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::ExclamationMarkEqual)
|
||||||
|
};
|
||||||
|
|
||||||
|
[&&] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::Ampersand2)
|
||||||
|
};
|
||||||
|
|
||||||
|
[||] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::Pipe2)
|
||||||
|
};
|
||||||
|
|
||||||
|
[<<] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::LessThan2)
|
||||||
|
};
|
||||||
|
|
||||||
|
[>>] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::GreaterThan2)
|
||||||
|
};
|
||||||
|
|
||||||
|
[<=] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::LessThanEqual)
|
||||||
|
};
|
||||||
|
|
||||||
|
[>=] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::GreaterThanEqual)
|
||||||
|
};
|
||||||
|
|
||||||
|
[<-] => {
|
||||||
|
crate::token::Token::Combo(crate::token::Combo::LessThanMinus)
|
||||||
|
};
|
||||||
|
|
||||||
|
// Normal Tokens
|
||||||
|
[,] => {
|
||||||
|
crate::token::Token::Comma
|
||||||
|
};
|
||||||
|
|
||||||
|
[=] => {
|
||||||
|
crate::token::Token::Equal
|
||||||
|
};
|
||||||
|
|
||||||
|
[;] => {
|
||||||
|
crate::token::Token::Semicolon
|
||||||
|
};
|
||||||
|
|
||||||
|
[EoF] => {
|
||||||
|
crate::token::Token::EoF
|
||||||
|
};
|
||||||
|
|
||||||
|
['['] => {
|
||||||
|
crate::token::Token::LBracket
|
||||||
|
};
|
||||||
|
|
||||||
|
[']'] => {
|
||||||
|
crate::token::Token::RBracket
|
||||||
|
};
|
||||||
|
|
||||||
|
['('] => {
|
||||||
|
crate::token::Token::LParen
|
||||||
|
};
|
||||||
|
|
||||||
|
[')'] => {
|
||||||
|
crate::token::Token::RParen
|
||||||
|
};
|
||||||
|
|
||||||
|
['{'] => {
|
||||||
|
crate::token::Token::LBraces
|
||||||
|
};
|
||||||
|
|
||||||
|
['}'] => {
|
||||||
|
crate::token::Token::RBraces
|
||||||
|
};
|
||||||
|
|
||||||
|
[+] => {
|
||||||
|
crate::token::Token::Plus
|
||||||
|
};
|
||||||
|
|
||||||
|
[-] => {
|
||||||
|
crate::token::Token::Minus
|
||||||
|
};
|
||||||
|
|
||||||
|
[*] => {
|
||||||
|
crate::token::Token::Asterisk
|
||||||
|
};
|
||||||
|
|
||||||
|
[/] => {
|
||||||
|
crate::token::Token::Slash
|
||||||
|
};
|
||||||
|
|
||||||
|
[%] => {
|
||||||
|
crate::token::Token::Percent
|
||||||
|
};
|
||||||
|
|
||||||
|
[|] => {
|
||||||
|
crate::token::Token::Pipe
|
||||||
|
};
|
||||||
|
|
||||||
|
[~] => {
|
||||||
|
crate::token::Token::Tilde
|
||||||
|
};
|
||||||
|
|
||||||
|
[!] => {
|
||||||
|
crate::token::Token::Exclamationmark
|
||||||
|
};
|
||||||
|
|
||||||
|
[<] => {
|
||||||
|
crate::token::Token::LessThan
|
||||||
|
};
|
||||||
|
|
||||||
|
[>] => {
|
||||||
|
crate::token::Token::GreaterThan
|
||||||
|
};
|
||||||
|
|
||||||
|
[&] => {
|
||||||
|
crate::token::Token::Ampersand
|
||||||
|
};
|
||||||
|
|
||||||
|
[^] => {
|
||||||
|
crate::token::Token::Circumflex
|
||||||
|
};
|
||||||
|
}
|
||||||
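The doc comment on `T!` notes that the macro works in patterns as well as in expressions. A reduced sketch of that idea follows. The four-variant `Token` enum, the `Arrow` variant name and the `describe` function are assumptions for illustration only; the real macro expands to full `crate::token::...` paths and covers every token.

```rust
// Hypothetical, much smaller token type than the one in src/token.rs
#[derive(Debug, PartialEq)]
enum Token {
    I64(i64),
    Plus,
    Arrow, // "<-"
    Semicolon,
}

// Each arm maps a short spelling to the long enum path. The same expansion
// is valid as an expression and as a pattern.
macro_rules! T {
    [i64($($val:tt)*)] => { Token::I64($($val)*) };
    [+] => { Token::Plus };
    [<-] => { Token::Arrow };
    [;] => { Token::Semicolon };
}

/// Uses `T!` in pattern position, including a binding inside `i64(..)`
fn describe(tok: &Token) -> String {
    match tok {
        T![i64(n)] => format!("integer {}", n),
        T![+] => "plus".to_string(),
        T![<-] => "declaration arrow".to_string(),
        T![;] => "semicolon".to_string(),
    }
}
```

In expression position `T![i64(42)]` builds a token, while in the `match` arms the same spelling destructures one, so the parser code in the diff can use one notation for both.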
167
src/util.rs
Normal file
@ -0,0 +1,167 @@
|
|||||||
|
/// Exit the program with error code 1 and format-print the given text on stderr. This pretty much
|
||||||
|
/// works like panic, but doesn't show the additional information that panic adds. Those can be
|
||||||
|
/// interesting for debugging, but don't look that great when building a release executable for an
|
||||||
|
/// end user.
|
||||||
|
/// When running tests or running in debug mode, panic is used so that the tests work
|
||||||
|
/// correctly.
|
||||||
|
#[macro_export]
|
||||||
|
macro_rules! nice_panic {
|
||||||
|
($fmt:expr) => {
|
||||||
|
{
|
||||||
|
if cfg!(test) || cfg!(debug_assertions) {
|
||||||
|
panic!($fmt);
|
||||||
|
} else {
|
||||||
|
eprintln!($fmt);
|
||||||
|
std::process::exit(1);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
};
|
||||||
|
($fmt:expr, $($arg:tt)*) => {
|
||||||
|
{
|
||||||
|
if cfg!(test) || cfg!(debug_assertions) {
|
||||||
|
panic!($fmt, $($arg)*);
|
||||||
|
} else {
|
||||||
|
eprintln!($fmt, $($arg)*);
|
||||||
|
std::process::exit(1);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
/// The PutBackIter allows for items to be put back and to be peeked. Putting an item back
|
||||||
|
/// will cause it to be the next item returned by `next`. Peeking an item will get a reference to
|
||||||
|
/// the next item in the iterator without removing it.
|
||||||
|
///
|
||||||
|
/// The whole PutBackIter behaves analogous to `std::iter::Peekable` with the addition of the
|
||||||
|
/// `putback` function. This is slightly slower than `Peekable`, but allows for an unlimited number
|
||||||
|
/// of putbacks and therefore an unlimited look-ahead range.
|
||||||
|
pub struct PutBackIter<T: Iterator> {
|
||||||
|
iter: T,
|
||||||
|
putback_stack: Vec<T::Item>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl<T> PutBackIter<T>
where
    T: Iterator,
{
    /// Make the given iterator putbackable, wrapping it in the PutBackIter type. This effectively
    /// adds the `peek` and `putback` functions.
    pub fn new(iter: T) -> Self {
        Self {
            iter,
            putback_stack: Vec::new(),
        }
    }

    /// Put the given item back into the iterator. Items that were put back are returned by
    /// `next` in last-in-first-out order (a.k.a. stack order). Only after all previously put-back
    /// items have been returned is the actual underlying iterator used to get items.
    /// The number of items that can be put back is unlimited.
    pub fn putback(&mut self, it: T::Item) {
        self.putback_stack.push(it);
    }

    /// Peek the next item, getting a reference to it without removing it from the iterator. This
    /// also includes items that were previously put back and not yet removed.
    pub fn peek(&mut self) -> Option<&T::Item> {
        if self.putback_stack.is_empty() {
            let it = self.next()?;
            self.putback(it);
        }

        self.putback_stack.last()
    }
}

impl<T> Iterator for PutBackIter<T>
where
    T: Iterator,
{
    type Item = T::Item;

    fn next(&mut self) -> Option<Self::Item> {
        match self.putback_stack.pop() {
            Some(it) => Some(it),
            None => self.iter.next(),
        }
    }
}

/// Extension trait that adds the `putbackable` function to every iterator.
pub trait PutBackableExt {
    /// Make the iterator putbackable, wrapping it in the PutBackIter type. This effectively
    /// adds the `peek` and `putback` functions.
    fn putbackable(self) -> PutBackIter<Self>
    where
        Self: Iterator + Sized,
    {
        PutBackIter::new(self)
    }
}

impl<T: Iterator> PutBackableExt for T {}

#[cfg(test)]
mod tests {
    use super::PutBackableExt;

    #[test]
    fn putback_iter_next() {
        let mut iter = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].into_iter();
        let mut pb_iter = iter.clone().putbackable();

        // Check if next works
        for _ in 0..iter.len() {
            assert_eq!(pb_iter.next(), iter.next());
        }
    }

    #[test]
    fn putback_iter_peek() {
        let mut iter_orig = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].into_iter();
        let mut iter = iter_orig.clone();
        let mut pb_iter = iter.clone().putbackable();

        for _ in 0..iter.len() {
            // Check if peek gives a preview of the actual next element
            assert_eq!(pb_iter.peek(), iter.next().as_ref());
            // Check if next still returns the next (just peeked) element and not the one after
            assert_eq!(pb_iter.next(), iter_orig.next());
        }
    }

    #[test]
    fn putback_iter_putback() {
        let mut iter_orig = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10].into_iter();
        let mut iter = iter_orig.clone();
        let mut pb_iter = iter.clone().putbackable();

        // Get the first 5 items with next and check if they match
        let it0 = pb_iter.next();
        assert_eq!(it0, iter.next());
        let it1 = pb_iter.next();
        assert_eq!(it1, iter.next());
        let it2 = pb_iter.next();
        assert_eq!(it2, iter.next());
        let it3 = pb_iter.next();
        assert_eq!(it3, iter.next());
        let it4 = pb_iter.next();
        assert_eq!(it4, iter.next());

        // Put one value back and check if `next` works as expected, returning the just put back
        // item
        pb_iter.putback(it0.unwrap());
        assert_eq!(pb_iter.next(), it0);

        // Put all values back
        pb_iter.putback(it4.unwrap());
        pb_iter.putback(it3.unwrap());
        pb_iter.putback(it2.unwrap());
        pb_iter.putback(it1.unwrap());
        pb_iter.putback(it0.unwrap());

        // After all values have been put back, the iter should match the original again
        for _ in 0..iter.len() {
            assert_eq!(pb_iter.next(), iter_orig.next());
        }
    }
}
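
A minimal sketch of how the put-back mechanism might be used for look-ahead when splitting source text into operators. The iterator is repeated in reduced form only so that the sketch compiles on its own. `split_operators` is a hypothetical helper introduced for illustration and is not part of the original file.

```rust
// Reduced stand-in for the PutBackIter shown above (assumption: same semantics,
// `peek` omitted because this sketch does not need it).
struct PutBackIter<T: Iterator> {
    iter: T,
    putback_stack: Vec<T::Item>,
}

impl<T: Iterator> PutBackIter<T> {
    fn new(iter: T) -> Self {
        Self {
            iter,
            putback_stack: Vec::new(),
        }
    }

    fn putback(&mut self, it: T::Item) {
        self.putback_stack.push(it);
    }
}

impl<T: Iterator> Iterator for PutBackIter<T> {
    type Item = T::Item;

    fn next(&mut self) -> Option<Self::Item> {
        // Items that were put back take priority over the underlying iterator.
        self.putback_stack.pop().or_else(|| self.iter.next())
    }
}

/// Hypothetical helper: split `src` into single characters and the two-character
/// operators `==`, `!=`, `<=` and `>=`, skipping whitespace.
fn split_operators(src: &str) -> Vec<String> {
    let mut chars = PutBackIter::new(src.chars());
    let mut tokens = Vec::new();

    while let Some(first) = chars.next() {
        if first.is_whitespace() {
            continue;
        }

        // Look one character ahead by consuming it.
        let second = chars.next();
        match (first, second) {
            (c, Some('=')) if matches!(c, '=' | '!' | '<' | '>') => {
                tokens.push(format!("{}=", c));
            }
            (c, second) => {
                // The look-ahead character belongs to the next token, so put it back.
                if let Some(s) = second {
                    chars.putback(s);
                }
                tokens.push(c.to_string());
            }
        }
    }

    tokens
}
```

Calling `split_operators("x<=1")` consumes `<` while inspecting `x`, puts it back, and then reads it again as the start of `<=`. With a look-ahead of more than one character, several items can be put back in reverse order so that they come out again in their original order.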