Compare commits
39 Commits
master
...
feature-fu
| Author | SHA1 | Date | |
|---|---|---|---|
| e0c72003f9 | |||
| 638610d310 | |||
| 85339db25e | |||
| 88c5be6439 | |||
| 75e326e343 | |||
| 7490494bbf | |||
| 2a04a28f97 | |||
| b6d615b507 | |||
| 2946e67349 | |||
| 3c0e5f6b4d | |||
| dd6129bb00 | |||
| 308bc5b34e | |||
| 771a858da3 | |||
| eebe4a4c1c | |||
| 28d7f3ec03 | |||
| 64bd2341b8 | |||
| 9a7de0a1c6 | |||
| 99e462f4b5 | |||
| 4357a1eb55 | |||
| c4f5b89456 | |||
| c49a5ec0e2 | |||
| 49ada446f8 | |||
| 2ea2aa5203 | |||
| 14e8a0b507 | |||
| 07636d420c | |||
| 4ad16a71f4 | |||
| 8488e48364 | |||
| e80cae11c9 | |||
| 88ceacd500 | |||
| 1079eb1671 | |||
| 8a1debabe9 | |||
| fabe3ef2ad | |||
| 3535fec208 | |||
| 26c36ed0ae | |||
| 807482583a | |||
| 7b86fecc6f | |||
| 6b91264f84 | |||
| d9246c7ea1 | |||
| 1c4943828f |
@ -4,3 +4,5 @@ version = "0.1.0"
edition = "2021"

[dependencies]
anyhow = "1.0.53"
thiserror = "1.0.30"

211
README.md
@ -1,7 +1,212 @@
# NEK-Lang

## Variables
Currently all variables are global and completely unscoped. That means no matter where a variable is declared, it stays available for the whole remaining runtime of the program.

All variables are currently of type `i64` (64-bit signed integer).

### Declaration
- Declare and initialize a new variable
- Declaring a previously declared variable again is currently equivalent to an assignment
- Declaration is needed before assignment or other usage
- The variable name is on the left side of the `<-` operator
- The assigned value is on the right side and can be any expression
```
a <- 123;
```
Create a new variable named `a` and assign the value `123` to it.

### Assignment
- Assigning a value to a previously declared variable
- The variable name is on the left side of the `=` operator
- The assigned value is on the right side and can be any expression
```
a = 123;
```
The value `123` is assigned to the variable named `a`. `a` needs to be declared before this.
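
The difference between `<-` and `=` can be sketched in Rust as two operations on one global table. This is a minimal sketch, not the interpreter's actual code: the `VarTable` type and its method names are made up for illustration, and values are assumed to be plain `i64`.

```rust
use std::collections::HashMap;

/// Hypothetical variable table: one global map, no scopes.
#[derive(Default)]
pub struct VarTable {
    vars: HashMap<String, i64>,
}

impl VarTable {
    /// `name <- value;` creates the variable, or overwrites it if it already exists.
    pub fn declare(&mut self, name: &str, value: i64) -> i64 {
        self.vars.insert(name.to_string(), value);
        value
    }

    /// `name = value;` only succeeds for a previously declared variable.
    pub fn assign(&mut self, name: &str, value: i64) -> Result<i64, String> {
        match self.vars.get_mut(name) {
            Some(slot) => {
                *slot = value;
                Ok(value)
            }
            None => Err(format!("variable '{}' assigned but not declared", name)),
        }
    }

    /// Looks up a variable; `None` means it was never declared.
    pub fn get(&self, name: &str) -> Option<i64> {
        self.vars.get(name).copied()
    }
}
```

Under these assumptions, declaring `a` twice simply replaces the stored value, while assigning to an unknown name is reported as an error instead of silently creating the variable.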

## Expressions
The operator precedence is the same order as in `C` for all implemented operators.
Refer to the [C Operator Precedence Table](https://en.cppreference.com/w/c/language/operator_precedence) to see the different precedences.

### General
- Parentheses `(` and `)` can be used to modify evaluation order just like in any other programming language.
  - For example `(a + b) * c` will evaluate the addition before the multiplication, despite the multiplication having higher binding power

### Mathematical Operators
Supported mathematical operations:
- Addition `a + b`
- Subtraction `a - b`
- Multiplication `a * b`
- Division `a / b`
- Modulo `a % b`
- Negation `-a`

### Bitwise Operators
- And `a & b`
- Or `a | b`
- Xor `a ^ b`
- Bitshift left (by `b` bits) `a << b`
- Bitshift right (by `b` bits) `a >> b`
- "Bit flip" (One's complement) `~a`
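
As a rough illustration of how the bitwise operators listed above map onto `i64` values, here is a small Rust sketch. The helper names `eval_bitwise` and `bit_flip` are invented for this example, and the use of checked shifts (returning `None` for negative or too-large shift amounts) is an assumption of the sketch, not documented NEK-Lang behaviour.

```rust
use std::convert::TryFrom;

/// Illustrative helper: evaluates one binary bitwise operator on `i64` operands.
/// Returns `None` for unknown operators and for shift amounts outside `0..64`.
pub fn eval_bitwise(op: &str, a: i64, b: i64) -> Option<i64> {
    match op {
        "&" => Some(a & b),
        "|" => Some(a | b),
        "^" => Some(a ^ b),
        // Checked shifts avoid a panic on negative or oversized shift amounts
        "<<" => u32::try_from(b).ok().and_then(|s| a.checked_shl(s)),
        ">>" => u32::try_from(b).ok().and_then(|s| a.checked_shr(s)),
        _ => None,
    }
}

/// One's complement: `~a` corresponds to `!a` on a Rust integer.
pub fn bit_flip(a: i64) -> i64 {
    !a
}
```

Because the operands are signed, the right shift in this sketch is an arithmetic shift, so the sign bit is preserved.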

### Logical Operators
The logical operators evaluate the operands as `false` if they are equal to `0` and `true` if they are not equal to `0`.
- And `a && b`
- Or `a || b`
- Not `!a` (if `a` is equal to `0`, the result is `1`, otherwise the result is `0`)

### Equality & Relational Operators
The equality and relational operations result in `1` if the condition is evaluated as `true` and in `0` if the condition is evaluated as `false`.
- Equality `a == b`
- Inequality `a != b`
- Greater than `a > b`
- Greater than or equal `a >= b`
- Less than `a < b`
- Less than or equal `a <= b`
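
The `0` / non-zero convention used by the logical and relational operators can be sketched in Rust as follows. All function names here are hypothetical helpers chosen for the example; only the `1` / `0` results come from the description above.

```rust
/// Any non-zero value counts as `true`.
pub fn truthy(v: i64) -> bool {
    v != 0
}

/// Converts a Rust `bool` back into the `1` / `0` convention.
pub fn from_bool(b: bool) -> i64 {
    if b { 1 } else { 0 }
}

/// `a && b`
pub fn logical_and(a: i64, b: i64) -> i64 {
    from_bool(truthy(a) && truthy(b))
}

/// `a || b`
pub fn logical_or(a: i64, b: i64) -> i64 {
    from_bool(truthy(a) || truthy(b))
}

/// `!a`
pub fn logical_not(a: i64) -> i64 {
    from_bool(!truthy(a))
}

/// `a < b`, shown as one representative relational operator.
pub fn less_than(a: i64, b: i64) -> i64 {
    from_bool(a < b)
}

/// `a == b`
pub fn equal(a: i64, b: i64) -> i64 {
    from_bool(a == b)
}
```

Since every result is again an `i64`, the output of a comparison can be fed directly into another logical or arithmetic expression.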

## Control-Flow
For conditions, such as in `if` or loops, every non-zero value is treated as `true`, and `0` is `false`.

### Loop
- There is currently only the `loop` keyword that can act like a `while` with optional advancement (an expression that is executed after the loop body)
- The `loop` keyword is followed by the condition (an expression) without needing parentheses
- *Optional:* If there is a `;` after the condition, there must be another expression which is used as the advancement
- The loop's body is wrapped in braces (`{ }`) just like in C/C++

```
// Print the numbers from 0 to 9

// Without advancement
i <- 0;
loop i < 10 {
    print i;
    i = i + 1;
}

// With advancement
k <- 0;
loop k < 10; k = k + 1 {
    print k;
}
```
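
The order in which condition, body and advancement run can be sketched in Rust like this. It is a simplified model under stated assumptions: `run_loop` is a made-up helper, and the whole program state is reduced to a single `i64` counter.

```rust
/// Models `loop cond; advance { body }` over a single `i64` state value.
/// Returns the state after the loop has finished.
pub fn run_loop(
    mut state: i64,
    cond: impl Fn(i64) -> i64,
    advance: Option<&dyn Fn(i64) -> i64>,
    mut body: impl FnMut(i64),
) -> i64 {
    // The loop continues while the condition evaluates to a non-zero value
    while cond(state) != 0 {
        body(state);
        // The optional advancement runs after every pass through the body
        if let Some(adv) = advance {
            state = adv(state);
        }
    }
    state
}
```

With a condition of `k < 10` and an advancement of `k + 1`, starting from `0`, the body sees the values `0` through `9` and the loop ends with the state at `10`. Without an advancement, the body itself has to change whatever the condition depends on, otherwise the loop never terminates.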

### If / Else

- The language supports `if` and an optional `else`
- The `if` keyword must be followed by the deciding condition, parentheses are not needed
- The *if-true* block is wrapped in braces (`{ }`)
- *Optional:* If there is an `else` after the *if-block*, it must be followed by the *if-false* block (the else block)
```
a <- 1;
b <- 2;
if a == b {
    // a is equal to b
    print 1;
} else {
    // a is not equal to b
    print 0;
}
```

## IO

### Print
Printing is implemented via the `print` keyword
- The `print` keyword is followed by an expression, the value of which will be printed to the terminal.
- Print currently automatically adds a linebreak
```
a <- 1;
print a; // Outputs `"1\n"` to the terminal
```

## Comments

### Line comments
Line comments can be initiated by using `//`
- Everything after `//` up to the end of the current line is ignored and not parsed
```
// This is a comment
```

# Feature Tracker
|
||||
|
||||
## High level Components
|
||||
|
||||
- [ ] Lexer: Transforms text into Tokens
|
||||
- [ ] Parser: Transforms Tokens into Abstract Syntax Tree
|
||||
- [ ] Interpreter (tree-walk-interpreter): Walks the tree and evaluates the expressions / statements
|
||||
- [x] Lexer: Transforms text into Tokens
|
||||
- [x] Parser: Transforms Tokens into Abstract Syntax Tree
|
||||
- [x] Interpreter (tree-walk-interpreter): Walks the tree and evaluates the expressions / statements
|
||||
|
||||
## Language features
|
||||
|
||||
- [x] General expressions
|
||||
- [x] Arithmetic operations
|
||||
- [x] Addition `a + b`
|
||||
- [x] Subtraction `a - b`
|
||||
- [x] Multiplication `a * b`
|
||||
- [x] Division `a / b`
|
||||
- [x] Modulo `a % b`
|
||||
- [x] Negate `-a`
|
||||
- [x] Parentheses `(a + b) * c`
|
||||
- [x] Logical boolean operators
|
||||
- [x] Equal `a == b`
|
||||
- [x] Not equal `a != b`
|
||||
- [x] Greater than `a > b`
|
||||
- [x] Less than `a < b`
|
||||
- [x] Greater than or equal `a >= b`
|
||||
- [x] Less than or equal `a <= b`
|
||||
- [x] Logical operators
|
||||
- [x] And `a && b`
|
||||
- [x] Or `a || b`
|
||||
- [x] Not `!a`
|
||||
- [x] Bitwise operators
|
||||
- [x] Bitwise AND `a & b`
|
||||
- [x] Bitwise OR `a | b`
|
||||
- [x] Bitwise XOR `a ^ b`
|
||||
- [x] Bitwise NOT `~a`
|
||||
- [x] Bitwise left shift `a << b`
|
||||
- [x] Bitwise right shift `a >> b`
|
||||
- [x] Variables
|
||||
- [x] Declaration
|
||||
- [x] Assignment
|
||||
- [x] Statements with semicolon & Multiline programs
|
||||
- [x] Control flow
|
||||
- [x] While loop `loop X { ... }`
|
||||
- [x] If else statement `if X { ... } else { ... }`
|
||||
- [x] If Statement
|
||||
- [x] Else statement
|
||||
- [x] Line comments `//`
|
||||
- [x] Strings
|
||||
- [x] IO Intrinsics
|
||||
- [x] Print
|
||||
|
||||
## Grammar

### Expressions
```
LITERAL = I64_LITERAL | STR_LITERAL
expr_primary = LITERAL | IDENT | "(" expr ")" | "-" expr_primary | "~" expr_primary
expr_mul = expr_primary (("*" | "/" | "%") expr_primary)*
expr_add = expr_mul (("+" | "-") expr_mul)*
expr_shift = expr_add ((">>" | "<<") expr_add)*
expr_rel = expr_shift ((">" | ">=" | "<" | "<=") expr_shift)*
expr_equ = expr_rel (("==" | "!=") expr_rel)*
expr_band = expr_equ ("&" expr_equ)*
expr_bxor = expr_band ("^" expr_band)*
expr_bor = expr_bxor ("|" expr_bxor)*
expr_land = expr_bor ("&&" expr_bor)*
expr_lor = expr_land ("||" expr_land)*
expr = expr_lor
```
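
The layering of the grammar rules above corresponds to a precedence table for the binary operators. The following Rust sketch encodes it directly; the functions `precedence` and `binds_tighter` and the concrete numbers are assumptions made for illustration, only the relative order is taken from the grammar.

```rust
/// Precedence level of a binary operator; a higher number binds tighter.
/// The order mirrors the grammar layers from `expr_mul` down to `expr_lor`.
pub fn precedence(op: &str) -> Option<u8> {
    Some(match op {
        "*" | "/" | "%" => 10,
        "+" | "-" => 9,
        "<<" | ">>" => 8,
        "<" | "<=" | ">" | ">=" => 7,
        "==" | "!=" => 6,
        "&" => 5,
        "^" => 4,
        "|" => 3,
        "&&" => 2,
        "||" => 1,
        // Not a binary operator covered by the expression grammar
        _ => return None,
    })
}

/// True when `inner` groups before `outer`, e.g. in `a outer b inner c`.
pub fn binds_tighter(inner: &str, outer: &str) -> bool {
    matches!((precedence(inner), precedence(outer)), (Some(i), Some(o)) if i > o)
}
```

This ordering is what lets a condition like `i % 3 == 0 | i % 5 == 0` (as in `examples/euler1.nek`) group as `(i % 3 == 0) | (i % 5 == 0)`, because `==` binds tighter than `|`, as it does in C.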

### Statements
```
stmt_if = "if" expr "{" stmt* "}" ("else" "{" stmt* "}")?
stmt_loop = "loop" expr (";" expr)? "{" stmt* "}"
stmt_expr = expr ";"
stmt = stmt_expr | stmt_loop | stmt_if
```
15
examples/euler1.nek
Normal file
@ -0,0 +1,15 @@
// If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9.
// The sum of these multiples is 23.
// Find the sum of all the multiples of 3 or 5 below 1000.
//
// Correct Answer: 233168

sum <- 0;
i <- 0;
loop i < 1_000; i = i + 1 {
    if i % 3 == 0 | i % 5 == 0 {
        sum = sum + i;
    }
}

print sum;
26
examples/euler2.nek
Normal file
@ -0,0 +1,26 @@
// Each new term in the Fibonacci sequence is generated by adding the previous two terms.
// By starting with 1 and 2, the first 10 terms will be:
// 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
// By considering the terms in the Fibonacci sequence whose values do not exceed four million,
// find the sum of the even-valued terms.
//
// Correct Answer: 4613732

sum <- 0;

a <- 0;
b <- 1;
tmp <- 0;

loop a < 4_000_000 {
    if a % 2 == 0 {
        sum = sum + a;
    }

    tmp = a;
    a = b;
    b = b + tmp;
}

print sum;

29
examples/euler3.nek
Normal file
@ -0,0 +1,29 @@
// The prime factors of 13195 are 5, 7, 13 and 29.
// What is the largest prime factor of the number 600851475143 ?
//
// Correct Answer: 6857

number <- 600_851_475_143;
result <- 0;

div <- 2;

loop number > 1 {
    loop number % div == 0 {
        if div > result {
            result = div;
        }
        number = number / div;
    }

    div = div + 1;
    if div * div > number {
        if number > 1 & number > result {
            result = number;
        }
        number = 0;
    }

}

print result;
36
examples/euler4.nek
Normal file
@ -0,0 +1,36 @@
// A palindromic number reads the same both ways. The largest palindrome made from the product of
// two 2-digit numbers is 9009 = 91 × 99.
// Find the largest palindrome made from the product of two 3-digit numbers.
//
// Correct Answer: 906609


res <- 0;

tmp <- 0;
num <- 0;
num_rev <- 0;

i <- 100;
k <- 100;
loop i < 1_000; i = i + 1 {
    k = 100;
    loop k < 1_000; k = k + 1 {
        num_rev = 0;

        num = i * k;

        tmp = num;

        loop tmp {
            num_rev = num_rev*10 + tmp % 10;
            tmp = tmp / 10;
        }

        if num == num_rev & num > res {
            res = num;
        }
    }
}

print res;
24
examples/euler4.py
Normal file
@ -0,0 +1,24 @@
# A palindromic number reads the same both ways. The largest palindrome made from the product of
# two 2-digit numbers is 9009 = 91 × 99.
# Find the largest palindrome made from the product of two 3-digit numbers.
#
# Correct Answer: 906609


res = 0

# range() excludes its upper bound, so 1000 is needed to include 999
for i in range(100, 1000):
    for k in range(100, 1000):

        num = i * k
        tmp = num

        num_rev = 0
        while tmp != 0:
            num_rev = num_rev*10 + tmp % 10
            tmp = tmp // 10

        if num == num_rev and num > res:
            res = num

print(res)
128
src/ast.rs
Normal file
@ -0,0 +1,128 @@
use std::rc::Rc;

/// Types for binary operators
#[derive(Debug, PartialEq, Eq, Clone)]
pub enum BinOpType {
    /// Addition
    Add,

    /// Subtraction
    Sub,

    /// Multiplication
    Mul,

    /// Divide
    Div,

    /// Modulo
    Mod,

    /// Compare Equal
    EquEqu,

    /// Compare Not Equal
    NotEqu,

    /// Less than
    Less,

    /// Less than or Equal
    LessEqu,

    /// Greater than
    Greater,

    /// Greater than or Equal
    GreaterEqu,

    /// Bitwise OR (inclusive or)
    BOr,

    /// Bitwise And
    BAnd,

    /// Bitwise Xor (exclusive or)
    BXor,

    /// Logical And
    LAnd,

    /// Logical Or
    LOr,

    /// Shift Left
    Shl,

    /// Shift Right
    Shr,

    /// Assign value to variable
    Assign,

    /// Declare new variable with value
    Declare,
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub enum UnOpType {
    /// Unary Negate
    Negate,

    /// Bitwise Not
    BNot,

    /// Logical Not
    LNot,
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub enum Expression {
    /// Integer literal (64-bit)
    I64(i64),
    /// String literal
    String(Rc<String>),

    FunCall(String, Vec<Expression>),
    /// Variable
    Var(String),
    /// Binary operation. Consists of type, left hand side and right hand side
    BinOp(BinOpType, Box<Expression>, Box<Expression>),
    /// Unary operation. Consists of type and operand
    UnOp(UnOpType, Box<Expression>),
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub struct Loop {
    /// The condition that determines if the loop should continue
    pub condition: Expression,
    /// This is executed after each loop to advance the condition variables
    pub advancement: Option<Expression>,
    /// The loop body that is executed each loop
    pub body: Ast,
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub struct If {
    /// The condition
    pub condition: Expression,
    /// The body that is executed when condition is true
    pub body_true: Ast,
    /// The if body that is executed when the condition is false
    pub body_false: Ast,
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub enum Statement {
    Expr(Expression),
    Loop(Loop),
    If(If),
    Print(Expression),
    FunDecl(String, Vec<String>, Ast),
    Return(Expression),
}

#[derive(Debug, PartialEq, Eq, Clone, Default)]
pub struct Ast {
    pub prog: Vec<Statement>,
}
@ -1,70 +1,227 @@
|
||||
use crate::parser::{Ast, BinOpType};
|
||||
use std::{collections::HashMap, fmt::Display, rc::Rc, cell::RefCell};
|
||||
|
||||
use crate::{ast::{Expression, BinOpType, UnOpType, Ast, Statement, If}, parser::parse, lexer::lex};
|
||||
|
||||
#[derive(Debug, PartialEq, Eq, Clone)]
|
||||
pub enum Value {
|
||||
I64(i64),
|
||||
String(Rc<String>),
|
||||
}
|
||||
|
||||
pub enum RunEnd {
|
||||
Return(Value),
|
||||
End,
|
||||
}
|
||||
|
||||
pub struct Interpreter {
|
||||
// Runtime storage, for example variables ...
|
||||
// Variable table stores the runtime values of variables
|
||||
vartable: HashMap<String, Value>,
|
||||
funtable: HashMap<String, RefCell<(Vec<String>, Ast)>>,
|
||||
}
|
||||
|
||||
impl Interpreter {
|
||||
pub fn new() -> Self {
|
||||
Self {}
|
||||
Self {
|
||||
vartable: HashMap::new(),
|
||||
funtable: HashMap::new(),
|
||||
}
|
||||
}
|
||||
|
||||
pub fn run(&mut self, prog: Ast) {
|
||||
let result = self.resolve_expr(prog);
|
||||
|
||||
println!("Result = {:?}", result);
|
||||
pub fn run_str(&mut self, code: &str, print_tokens: bool, print_ast: bool) {
|
||||
let tokens = lex(code).unwrap();
|
||||
if print_tokens {
|
||||
println!("Tokens: {:?}", tokens);
|
||||
}
|
||||
|
||||
fn resolve_expr(&mut self, expr: Ast) -> Value {
|
||||
let ast = parse(tokens);
|
||||
if print_ast {
|
||||
println!("{:#?}", ast);
|
||||
}
|
||||
|
||||
self.run(&ast);
|
||||
}
|
||||
|
||||
pub fn run(&mut self, prog: &Ast) -> RunEnd {
|
||||
for stmt in &prog.prog {
|
||||
match stmt {
|
||||
Statement::Expr(expr) => {
|
||||
self.resolve_expr(expr);
|
||||
}
|
||||
|
||||
Statement::Return(expr) => {
|
||||
return RunEnd::Return(self.resolve_expr(expr));
|
||||
}
|
||||
|
||||
Statement::Loop(looop) => {
|
||||
// loop runs as long as condition != 0
|
||||
loop {
|
||||
if matches!(self.resolve_expr(&looop.condition), Value::I64(0)) {
|
||||
break;
|
||||
}
|
||||
|
||||
match self.run(&looop.body) {
|
||||
RunEnd::Return(val) => return RunEnd::Return(val),
|
||||
RunEnd::End => (),
|
||||
}
|
||||
|
||||
if let Some(adv) = &looop.advancement {
|
||||
self.resolve_expr(&adv);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
Statement::Print(expr) => {
|
||||
let result = self.resolve_expr(expr);
|
||||
print!("{}", result);
|
||||
}
|
||||
|
||||
Statement::If(If {condition, body_true, body_false}) => {
|
||||
let end = if matches!(self.resolve_expr(condition), Value::I64(0)) {
|
||||
self.run(body_false)
|
||||
} else {
|
||||
self.run(body_true)
|
||||
};
|
||||
match end {
|
||||
RunEnd::Return(val) => return RunEnd::Return(val),
|
||||
RunEnd::End => (),
|
||||
}
|
||||
}
|
||||
Statement::FunDecl(name, args, body) => {
|
||||
self.funtable.insert(name.clone(), (args.clone(), body.clone()).into());
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
RunEnd::End
|
||||
}
|
||||
|
||||
fn resolve_expr(&mut self, expr: &Expression) -> Value {
|
||||
match expr {
|
||||
Ast::I64(val) => Value::I64(val),
|
||||
Ast::BinOp(bo, lhs, rhs) => self.resolve_binop(bo, *lhs, *rhs),
|
||||
Expression::I64(val) => Value::I64(*val),
|
||||
Expression::String(text) => Value::String(text.clone()),
|
||||
Expression::BinOp(bo, lhs, rhs) => self.resolve_binop(bo, lhs, rhs),
|
||||
Expression::UnOp(uo, operand) => self.resolve_unop(uo, operand),
|
||||
Expression::Var(name) => self.resolve_var(name),
|
||||
Expression::FunCall(name, args) => {
|
||||
let fun = self.funtable.get(name).expect("Function not declared").clone();
|
||||
for i in 0 .. args.len() {
|
||||
let val = self.resolve_expr(&args[i]);
|
||||
self.vartable.insert(fun.borrow().0[i].clone(), val);
|
||||
}
|
||||
|
||||
if fun.borrow().0.len() != args.len() {
|
||||
panic!("Invalid number of arguments for function");
|
||||
}
|
||||
|
||||
let end = self.run(&fun.borrow().1);
|
||||
match end {
|
||||
RunEnd::Return(val) => val,
|
||||
RunEnd::End => Value::I64(0),
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
fn resolve_binop(&mut self, bo: BinOpType, lhs: Ast, rhs: Ast) -> Value {
|
||||
let lhs = self.resolve_expr(lhs);
|
||||
fn resolve_var(&mut self, name: &str) -> Value {
|
||||
match self.vartable.get(name) {
|
||||
Some(val) => val.clone(),
|
||||
None => panic!("Variable '{}' used but not declared", name),
|
||||
}
|
||||
}
|
||||
|
||||
fn resolve_unop(&mut self, uo: &UnOpType, operand: &Expression) -> Value {
|
||||
let operand = self.resolve_expr(operand);
|
||||
|
||||
match (operand, uo) {
|
||||
(Value::I64(val), UnOpType::Negate) => Value::I64(-val),
|
||||
(Value::I64(val), UnOpType::BNot) => Value::I64(!val),
|
||||
(Value::I64(val), UnOpType::LNot) => Value::I64(if val == 0 { 1 } else { 0 }),
|
||||
_ => panic!("Value type is not compatible with unary operation"),
|
||||
}
|
||||
}
|
||||
|
||||
fn resolve_binop(&mut self, bo: &BinOpType, lhs: &Expression, rhs: &Expression) -> Value {
|
||||
let rhs = self.resolve_expr(rhs);
|
||||
|
||||
match (&bo, &lhs) {
|
||||
(BinOpType::Declare, Expression::Var(name)) => {
|
||||
self.vartable.insert(name.clone(), rhs.clone());
|
||||
return rhs;
|
||||
}
|
||||
(BinOpType::Assign, Expression::Var(name)) => {
|
||||
match self.vartable.get_mut(name) {
|
||||
Some(val) => *val = rhs.clone(),
|
||||
None => panic!("Runtime Error: Trying to assign value to undeclared variable"),
|
||||
}
|
||||
return rhs;
|
||||
}
|
||||
_ => ()
|
||||
}
|
||||
|
||||
let lhs = self.resolve_expr(lhs);
|
||||
|
||||
match (lhs, rhs) {
|
||||
(Value::I64(lhs), Value::I64(rhs)) => match bo {
|
||||
BinOpType::Add => Value::I64(lhs + rhs),
|
||||
BinOpType::Mul => Value::I64(lhs * rhs),
|
||||
BinOpType::Sub => Value::I64(lhs - rhs),
|
||||
BinOpType::Div => Value::I64(lhs / rhs),
|
||||
BinOpType::Mod => Value::I64(lhs % rhs),
|
||||
BinOpType::BOr => Value::I64(lhs | rhs),
|
||||
BinOpType::BAnd => Value::I64(lhs & rhs),
|
||||
BinOpType::BXor => Value::I64(lhs ^ rhs),
|
||||
BinOpType::LAnd => Value::I64(if (lhs != 0) && (rhs != 0) { 1 } else { 0 }),
|
||||
BinOpType::LOr => Value::I64(if (lhs != 0) || (rhs != 0) { 1 } else { 0 }),
|
||||
BinOpType::Shr => Value::I64(lhs >> rhs),
|
||||
BinOpType::Shl => Value::I64(lhs << rhs),
|
||||
BinOpType::EquEqu => Value::I64(if lhs == rhs { 1 } else { 0 }),
|
||||
BinOpType::NotEqu => Value::I64(if lhs != rhs { 1 } else { 0 }),
|
||||
BinOpType::Less => Value::I64(if lhs < rhs { 1 } else { 0 }),
|
||||
BinOpType::LessEqu => Value::I64(if lhs <= rhs { 1 } else { 0 }),
|
||||
BinOpType::Greater => Value::I64(if lhs > rhs { 1 } else { 0 }),
|
||||
BinOpType::GreaterEqu => Value::I64(if lhs >= rhs { 1 } else { 0 }),
|
||||
|
||||
BinOpType::Declare | BinOpType::Assign => unreachable!(),
|
||||
},
|
||||
// _ => panic!("Value types are not compatible"),
|
||||
_ => panic!("Value types are not compatible"),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
impl Display for Value {
|
||||
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
||||
match self {
|
||||
Value::I64(val) => write!(f, "{}", val),
|
||||
Value::String(text) => write!(f, "{}", text),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
#[cfg(test)]
|
||||
mod test {
|
||||
use crate::parser::{Ast, BinOpType};
|
||||
use super::{Interpreter, Value};
|
||||
use crate::ast::{Expression, BinOpType};
|
||||
|
||||
#[test]
|
||||
fn test_interpreter_expr() {
|
||||
// Expression: 1 + 2 * 3 + 4
|
||||
// With precedence: (1 + (2 * 3)) + 4
|
||||
let ast = Ast::BinOp(
|
||||
let ast = Expression::BinOp(
|
||||
BinOpType::Add,
|
||||
Ast::BinOp(
|
||||
Expression::BinOp(
|
||||
BinOpType::Add,
|
||||
Ast::I64(1).into(),
|
||||
Ast::BinOp(BinOpType::Mul, Ast::I64(2).into(), Ast::I64(3).into()).into(),
|
||||
Expression::I64(1).into(),
|
||||
Expression::BinOp(BinOpType::Mul, Expression::I64(2).into(), Expression::I64(3).into()).into(),
|
||||
)
|
||||
.into(),
|
||||
Ast::I64(4).into(),
|
||||
Expression::I64(4).into(),
|
||||
);
|
||||
|
||||
let expected = Value::I64(11);
|
||||
|
||||
let mut interpreter = Interpreter::new();
|
||||
let actual = interpreter.resolve_expr(ast);
|
||||
let actual = interpreter.resolve_expr(&ast);
|
||||
|
||||
assert_eq!(expected, actual);
|
||||
}
|
||||
|
||||
241
src/lexer.rs
@ -1,23 +1,31 @@
|
||||
use crate::token::Token;
|
||||
use anyhow::Result;
|
||||
use std::{iter::Peekable, str::Chars};
|
||||
use thiserror::Error;
|
||||
|
||||
use crate::parser::BinOpType;
|
||||
#[derive(Debug, Error)]
|
||||
pub enum LexErr {
|
||||
#[error("Failed to parse '{0}' as i64")]
|
||||
NumericParse(String),
|
||||
|
||||
#[derive(Debug, PartialEq, Eq)]
|
||||
pub enum Token {
|
||||
/// Integer literal (64-bit)
|
||||
I64(i64),
|
||||
#[error("Invalid escape character '\\{0}'")]
|
||||
InvalidStrEscape(char),
|
||||
|
||||
/// Plus (+)
|
||||
Add,
|
||||
#[error("Lexer encountered unexpected char: '{0}'")]
|
||||
UnexpectedChar(char),
|
||||
|
||||
/// Asterisk (*)
|
||||
Mul,
|
||||
#[error("Missing closing string quote '\"'")]
|
||||
MissingClosingString,
|
||||
}
|
||||
|
||||
/// End of file
|
||||
EoF,
|
||||
/// Lex the provided code into a Token Buffer
|
||||
pub fn lex(code: &str) -> Result<Vec<Token>, LexErr> {
|
||||
let mut lexer = Lexer::new(code);
|
||||
lexer.lex()
|
||||
}
|
||||
|
||||
struct Lexer<'a> {
|
||||
/// The sourcecode text as an iterator over the chars
|
||||
code: Peekable<Chars<'a>>,
|
||||
}
|
||||
|
||||
@ -27,65 +35,186 @@ impl<'a> Lexer<'a> {
|
||||
Self { code }
|
||||
}
|
||||
|
||||
fn lex(&mut self) -> Vec<Token> {
|
||||
fn lex(&mut self) -> Result<Vec<Token>, LexErr> {
|
||||
let mut tokens = Vec::new();
|
||||
|
||||
while let Some(ch) = self.next() {
|
||||
match ch {
|
||||
loop {
|
||||
match self.next() {
|
||||
// Stop lexing at EOF
|
||||
'\0' => break,
|
||||
|
||||
// Skip whitespace
|
||||
' ' => (),
|
||||
' ' | '\t' | '\n' | '\r' => (),
|
||||
|
||||
// Line comment. Consume every char until linefeed (next line)
|
||||
'/' if matches!(self.peek(), '/') => while !matches!(self.next(), '\n' | '\0') {},
|
||||
|
||||
// Double character tokens
|
||||
'>' if matches!(self.peek(), '>') => {
|
||||
self.next();
|
||||
tokens.push(Token::Shr);
|
||||
}
|
||||
'<' if matches!(self.peek(), '<') => {
|
||||
self.next();
|
||||
tokens.push(Token::Shl);
|
||||
}
|
||||
'=' if matches!(self.peek(), '=') => {
|
||||
self.next();
|
||||
tokens.push(Token::EquEqu);
|
||||
}
|
||||
'!' if matches!(self.peek(), '=') => {
|
||||
self.next();
|
||||
tokens.push(Token::NotEqu);
|
||||
}
|
||||
'<' if matches!(self.peek(), '=') => {
|
||||
self.next();
|
||||
tokens.push(Token::LAngleEqu);
|
||||
}
|
||||
'>' if matches!(self.peek(), '=') => {
|
||||
self.next();
|
||||
tokens.push(Token::RAngleEqu);
|
||||
}
|
||||
'<' if matches!(self.peek(), '-') => {
|
||||
self.next();
|
||||
tokens.push(Token::LArrow);
|
||||
}
|
||||
'&' if matches!(self.peek(), '&') => {
|
||||
self.next();
|
||||
tokens.push(Token::LAnd);
|
||||
}
|
||||
'|' if matches!(self.peek(), '|') => {
|
||||
self.next();
|
||||
tokens.push(Token::LOr);
|
||||
}
|
||||
|
||||
// Single character tokens
|
||||
';' => tokens.push(Token::Semicolon),
|
||||
'+' => tokens.push(Token::Add),
|
||||
'-' => tokens.push(Token::Sub),
|
||||
'*' => tokens.push(Token::Mul),
|
||||
'/' => tokens.push(Token::Div),
|
||||
'%' => tokens.push(Token::Mod),
|
||||
'|' => tokens.push(Token::BOr),
|
||||
'&' => tokens.push(Token::BAnd),
|
||||
'^' => tokens.push(Token::BXor),
|
||||
'(' => tokens.push(Token::LParen),
|
||||
')' => tokens.push(Token::RParen),
|
||||
'~' => tokens.push(Token::Tilde),
|
||||
'<' => tokens.push(Token::LAngle),
|
||||
'>' => tokens.push(Token::RAngle),
|
||||
'=' => tokens.push(Token::Equ),
|
||||
'{' => tokens.push(Token::LBraces),
|
||||
'}' => tokens.push(Token::RBraces),
|
||||
'!' => tokens.push(Token::LNot),
|
||||
',' => tokens.push(Token::Comma),
|
||||
|
||||
// Lex numbers
|
||||
'0'..='9' => {
|
||||
ch @ '0'..='9' => {
|
||||
// String representation of the integer value
|
||||
let mut sval = String::from(ch);
|
||||
|
||||
// Do as long as a next char exists and it is a numeric char
|
||||
while let Some('0'..='9') = self.peek() {
|
||||
loop {
|
||||
// The next char is verified to be Some, so unwrap is safe
|
||||
sval.push(self.next().unwrap());
|
||||
match self.peek() {
|
||||
// Underscore is a separator, so remove it but don't add to number
|
||||
'_' => {
|
||||
self.next();
|
||||
}
|
||||
|
||||
// TODO: We only added numeric chars to the string, but the conversion could still fail
|
||||
tokens.push(Token::I64(sval.parse().unwrap()));
|
||||
'0'..='9' => {
|
||||
sval.push(self.next());
|
||||
}
|
||||
|
||||
'+' => tokens.push(Token::Add),
|
||||
'*' => tokens.push(Token::Mul),
|
||||
|
||||
//TODO: Don't panic, keep calm
|
||||
_ => panic!("Lexer encountered unexpected char: '{}'", ch),
|
||||
// Next char is not a number, so stop and finish the number token
|
||||
_ => break,
|
||||
}
|
||||
}
|
||||
|
||||
tokens
|
||||
// Try to convert the string representation of the value to i64
|
||||
let i64val = sval.parse().map_err(|_| LexErr::NumericParse(sval))?;
|
||||
tokens.push(Token::I64(i64val));
|
||||
}
|
||||
|
||||
// Lex a string
|
||||
'"' => {
|
||||
// Opening " was consumed in match
|
||||
|
||||
let mut text = String::new();
|
||||
|
||||
// Read all chars until encountering the closing "
|
||||
loop {
|
||||
match self.peek() {
|
||||
'"' => break,
|
||||
// If the end of file is reached while still waiting for '"', error out
|
||||
'\0' => Err(LexErr::MissingClosingString)?,
|
||||
_ => match self.next() {
|
||||
// Backslash indicates an escaped character
|
||||
'\\' => match self.next() {
|
||||
'n' => text.push('\n'),
|
||||
'r' => text.push('\r'),
|
||||
't' => text.push('\t'),
|
||||
'\\' => text.push('\\'),
|
||||
'"' => text.push('"'),
|
||||
ch => Err(LexErr::InvalidStrEscape(ch))?,
|
||||
},
|
||||
// All other characters are simply appended to the string
|
||||
ch => text.push(ch),
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
// Consume closing "
|
||||
self.next();
|
||||
|
||||
tokens.push(Token::String(text))
|
||||
}
|
||||
|
||||
// Lex characters as identifier
|
||||
ch @ ('a'..='z' | 'A'..='Z' | '_') => {
|
||||
let mut ident = String::from(ch);
|
||||
|
||||
// Do as long as a next char exists and it is a valid char for an identifier
|
||||
loop {
|
||||
match self.peek() {
|
||||
// In the middle of an identifier numbers are also allowed
|
||||
'a'..='z' | 'A'..='Z' | '0'..='9' | '_' => {
|
||||
ident.push(self.next());
|
||||
}
|
||||
// Next char is not valid, so stop and finish the ident token
|
||||
_ => break,
|
||||
}
|
||||
}
|
||||
|
||||
// Check for pre-defined keywords
|
||||
let token = match ident.as_str() {
|
||||
"loop" => Token::Loop,
|
||||
"print" => Token::Print,
|
||||
"if" => Token::If,
|
||||
"else" => Token::Else,
|
||||
"fun" => Token::Fun,
|
||||
"return" => Token::Return,
|
||||
|
||||
// If it doesn't match a keyword, it is a normal identifier
|
||||
_ => Token::Ident(ident),
|
||||
};
|
||||
|
||||
tokens.push(token);
|
||||
}
|
||||
|
||||
ch => Err(LexErr::UnexpectedChar(ch))?,
|
||||
}
|
||||
}
|
||||
|
||||
Ok(tokens)
|
||||
}
|
||||
|
||||
/// Advance to next character and return the removed char
|
||||
fn next(&mut self) -> Option<char> {
|
||||
self.code.next()
|
||||
fn next(&mut self) -> char {
|
||||
self.code.next().unwrap_or('\0')
|
||||
}
|
||||
|
||||
/// Get the next character without removing it
|
||||
fn peek(&mut self) -> Option<char> {
|
||||
self.code.peek().copied()
|
||||
}
|
||||
}
|
||||
|
||||
/// Lex the provided code into a Token Buffer
|
||||
///
|
||||
/// TODO: Don't panic and implement error handling using Result
|
||||
pub fn lex(code: &str) -> Vec<Token> {
|
||||
let mut lexer = Lexer::new(code);
|
||||
lexer.lex()
|
||||
}
|
||||
|
||||
impl Token {
|
||||
pub fn try_to_binop(&self) -> Option<BinOpType> {
|
||||
Some(match self {
|
||||
Token::Add => BinOpType::Add,
|
||||
Token::Mul => BinOpType::Mul,
|
||||
_ => return None,
|
||||
})
|
||||
fn peek(&mut self) -> char {
|
||||
self.code.peek().copied().unwrap_or('\0')
|
||||
}
|
||||
}
|
||||
|
||||
@ -95,7 +224,7 @@ mod tests {

    #[test]
    fn test_lexer() {
        let code = "33 +5*2 + 4456467*2334+3";
        let code = "33 +5*2 + 4456467*2334+3 % - / << ^ | & >>";
        let expected = vec![
            Token::I64(33),
            Token::Add,
@ -108,9 +237,17 @@ mod tests {
            Token::I64(2334),
            Token::Add,
            Token::I64(3),
            Token::Mod,
            Token::Sub,
            Token::Div,
            Token::Shl,
            Token::BXor,
            Token::BOr,
            Token::BAnd,
            Token::Shr,
        ];

        let actual = lex(code);
        let actual = lex(code).unwrap();
        assert_eq!(expected, actual);
    }
}

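Note on the lexer changes above: `next` and `peek` now return `'\0'` at end of input instead of `None`, and identifiers are checked against the keyword list before falling back to `Token::Ident`. The following is a minimal sketch of that pattern in isolation. The `Tok` enum, `peek` helper and `lex_words` function are hypothetical names for illustration and are not part of the diff; the sketch handles only whitespace-separated words and reports any other character as an error.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical, reduced token set used only in this sketch.
#[derive(Debug, PartialEq, Eq)]
enum Tok {
    Loop,
    Print,
    If,
    Else,
    Fun,
    Return,
    Ident(String),
}

/// Peek at the next char without consuming it; '\0' stands for end of input.
fn peek(code: &mut Peekable<Chars>) -> char {
    code.peek().copied().unwrap_or('\0')
}

/// Lex whitespace-separated keywords and identifiers.
/// Any other character is returned as the error value.
fn lex_words(src: &str) -> Result<Vec<Tok>, char> {
    let mut code = src.chars().peekable();
    let mut toks = Vec::new();

    loop {
        match code.next().unwrap_or('\0') {
            // Sentinel: a literal NUL in the input would also stop lexing here
            '\0' => break,
            ch if ch.is_whitespace() => continue,
            ch if ch.is_ascii_alphabetic() || ch == '_' => {
                let mut ident = String::from(ch);
                // Keep consuming while the next char can be part of an identifier
                while peek(&mut code).is_ascii_alphanumeric() || peek(&mut code) == '_' {
                    ident.push(code.next().unwrap_or('\0'));
                }
                // Keywords win over plain identifiers
                toks.push(match ident.as_str() {
                    "loop" => Tok::Loop,
                    "print" => Tok::Print,
                    "if" => Tok::If,
                    "else" => Tok::Else,
                    "fun" => Tok::Fun,
                    "return" => Tok::Return,
                    _ => Tok::Ident(ident),
                });
            }
            ch => return Err(ch),
        }
    }

    Ok(toks)
}
```

One consequence of the sentinel approach is that the match on the current character needs no `Option` handling, at the cost of treating an embedded NUL like end of input.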
@ -1,3 +1,5 @@
pub mod lexer;
pub mod token;
pub mod parser;
pub mod ast;
pub mod interpreter;
58
src/main.rs
@ -1,23 +1,55 @@
use nek_lang::{lexer::lex, parser::parse, interpreter::Interpreter};
use std::{env::args, fs, io::{stdout, Write, stdin}};

use nek_lang::interpreter::Interpreter;


#[derive(Debug, Default)]
struct CliConfig {
    print_tokens: bool,
    print_ast: bool,
    interactive: bool,
    file: Option<String>,
}

fn main() {

    let mut code = String::new();
    let mut conf = CliConfig::default();

    std::io::stdin().read_line(&mut code).unwrap();
    let code = code.trim();

    let tokens = lex(&code);

    println!("Tokens: {:?}\n", tokens);

    let ast = parse(tokens);

    println!("Ast: {:#?}\n", ast);
    // Go through all command-line arguments except the first (the program name)
    for arg in args().skip(1) {
        match arg.as_str() {
            "--token" | "-t" => conf.print_tokens = true,
            "--ast" | "-a" => conf.print_ast = true,
            "--interactive" | "-i" => conf.interactive = true,
            file if conf.file.is_none() => conf.file = Some(file.to_string()),
            _ => panic!("Invalid argument: '{}'", arg),
        }
    }

    let mut interpreter = Interpreter::new();

    interpreter.run(ast);
    if let Some(file) = &conf.file {
        let code = fs::read_to_string(file).expect(&format!("File not found: '{}'", file));
        interpreter.run_str(&code, conf.print_tokens, conf.print_ast);
    }

    if conf.interactive || conf.file.is_none() {
        let mut code = String::new();

        loop {
            print!(">> ");
            stdout().flush().unwrap();

            code.clear();
            stdin().read_line(&mut code).unwrap();

            if code.trim() == "exit" {
                break;
            }

            interpreter.run_str(&code, conf.print_tokens, conf.print_ast);
        }

    }

}

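Note on the argument handling in `main` above: the flag matching can be pulled into a function that takes any iterator of strings, which makes it testable without touching the process arguments. This is a sketch only; `parse_args` is a hypothetical helper that does not exist in the diff, and it returns an error message where the diff panics. The `CliConfig` fields mirror the struct in the diff.

```rust
/// Same fields as `CliConfig` in the diff; `PartialEq` is added here for comparisons.
#[derive(Debug, Default, PartialEq, Eq)]
struct CliConfig {
    print_tokens: bool,
    print_ast: bool,
    interactive: bool,
    file: Option<String>,
}

/// Hypothetical helper: build the config from an argument list
/// (program name already skipped) without panicking.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<CliConfig, String> {
    let mut conf = CliConfig::default();

    for arg in args {
        match arg.as_str() {
            "--token" | "-t" => conf.print_tokens = true,
            "--ast" | "-a" => conf.print_ast = true,
            "--interactive" | "-i" => conf.interactive = true,
            // The first argument that is not a known flag is taken as the source file
            file if conf.file.is_none() => conf.file = Some(file.to_string()),
            // Anything after that is rejected
            _ => return Err(format!("Invalid argument: '{}'", arg)),
        }
    }

    Ok(conf)
}
```

As in the diff, an unknown flag such as `-x` is accepted as the file name if no file has been given yet, because the file arm matches any string. The file name `prog.nek` used when trying this out is made up.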
327
src/parser.rs
@ -1,24 +1,7 @@
use std::iter::Peekable;

use crate::lexer::Token;

/// Types for binary operators
#[derive(Debug, PartialEq, Eq, Clone)]
pub enum BinOpType {
    /// Addition
    Add,

    /// Multiplication
    Mul,
}

#[derive(Debug, PartialEq, Eq, Clone)]
pub enum Ast {
    /// Integer literal (64-bit)
    I64(i64),
    /// Binary operation. Consists of type, left hand side and right hand side
    BinOp(BinOpType, Box<Ast>, Box<Ast>),
}
use crate::ast::*;
use crate::token::Token;

struct Parser<T: Iterator<Item = Token>> {
    tokens: Peekable<T>,
@ -32,17 +15,203 @@ impl<T: Iterator<Item = Token>> Parser<T> {
    }

    fn parse(&mut self) -> Ast {
        self.parse_expr()
        let mut prog = Vec::new();

        loop {
            match self.peek() {
                Token::Semicolon => {
                    self.next();
                }
                Token::EoF => break,
                Token::RBraces => {
                    break;
                }

    fn parse_expr(&mut self) -> Ast {
                // By default try to parse a statement
                _ => prog.push(self.parse_stmt()),
            }
        }

        Ast { prog }
    }

    fn parse_stmt(&mut self) -> Statement {
        match self.peek() {
            Token::Loop => Statement::Loop(self.parse_loop()),

            Token::Print => {
                self.next();

                let expr = self.parse_expr();

                // After a statement, there must be a semicolon
                if !matches!(self.next(), Token::Semicolon) {
                    panic!("Expected semicolon after statement");
                }

                Statement::Print(expr)
            }

            Token::Return => {
                self.next();

                let expr = self.parse_expr();

                // After a statement, there must be a semicolon
                if !matches!(self.next(), Token::Semicolon) {
                    panic!("Expected semicolon after statement");
                }

                Statement::Return(expr)
            }

            Token::If => Statement::If(self.parse_if()),

            Token::Fun => {
                self.next();

                let name = match self.next() {
                    Token::Ident(name) => name,
                    _ => panic!("Error lexing function: Expected ident token"),
                };

                let mut args = Vec::new();

                if !matches!(self.next(), Token::LParen) {
                    panic!("Expected opening parenthesis");
                }

                while self.peek() != &Token::RParen {
                    let argname = match self.next() {
                        Token::Ident(argname) => argname,
                        _ => panic!("Error lexing function: Expected ident token for argname"),
                    };

                    args.push(argname);

                    if self.peek() == &Token::Comma {
                        self.next();
                    }

                }

                self.next();

                if !matches!(self.next(), Token::LBraces) {
                    panic!("Expected opening braces");
                }

                let body = self.parse();

                if !matches!(self.next(), Token::RBraces) {
                    panic!("Expected closing braces");
                }

                Statement::FunDecl(name, args, body)
            }

            // If it is none of the statement kinds above, try to parse it as an expression
            _ => {
                let stmt = Statement::Expr(self.parse_expr());

                // After a statement, there must be a semicolon
                if !matches!(self.next(), Token::Semicolon) {
                    panic!("Expected semicolon after statement");
                }

                stmt
            }
        }
    }

    fn parse_if(&mut self) -> If {
        if !matches!(self.next(), Token::If) {
            panic!("Error lexing if: Expected if token");
        }

        let condition = self.parse_expr();

        if !matches!(self.next(), Token::LBraces) {
            panic!("Error lexing if: Expected '{{'")
        }

        let body_true = self.parse();

        if !matches!(self.next(), Token::RBraces) {
            panic!("Error lexing if: Expected '}}'")
        }

        let mut body_false = Ast::default();

        if matches!(self.peek(), Token::Else) {
            self.next();

            if !matches!(self.next(), Token::LBraces) {
                panic!("Error lexing if: Expected '{{'")
            }

            body_false = self.parse();

            if !matches!(self.next(), Token::RBraces) {
                panic!("Error lexing if: Expected '}}'")
            }
        }

        If {
            condition,
            body_true,
            body_false,
        }
    }

    fn parse_loop(&mut self) -> Loop {
        if !matches!(self.next(), Token::Loop) {
            panic!("Error lexing loop: Expected loop token");
        }

        let condition = self.parse_expr();
        let mut advancement = None;

        let body;

        match self.next() {
            Token::LBraces => {
                body = self.parse();
            }

            Token::Semicolon => {
                advancement = Some(self.parse_expr());

                if !matches!(self.next(), Token::LBraces) {
                    panic!("Error lexing loop: Expected '{{'")
                }

                body = self.parse();
            }

            _ => panic!("Error lexing loop: Expected ';' or '{{'"),
        }

        if !matches!(self.next(), Token::RBraces) {
            panic!("Error lexing loop: Expected '}}'")
        }

        Loop {
            condition,
            advancement,
            body,
        }
    }

    fn parse_expr(&mut self) -> Expression {
        let lhs = self.parse_primary();
        self.parse_expr_precedence(lhs, 0)
    }

    /// Parse binary expressions with a precedence equal to or higher than min_prec
    fn parse_expr_precedence(&mut self, mut lhs: Ast, min_prec: u8) -> Ast {
    fn parse_expr_precedence(&mut self, mut lhs: Expression, min_prec: u8) -> Expression {
        while let Some(binop) = &self.peek().try_to_binop() {
            // Stop if the next operator has a lower binding power
            if !(binop.precedence() >= min_prec) {
                break;
            }
@ -61,21 +230,77 @@ impl<T: Iterator<Item = Token>> Parser<T> {
                rhs = self.parse_expr_precedence(rhs, binop.precedence() + 1);
            }

            lhs = Ast::BinOp(binop, lhs.into(), rhs.into());
            lhs = Expression::BinOp(binop, lhs.into(), rhs.into());
        }

        lhs
    }

    /// Parse a primary expression (literal, variable, function call, parenthesized or unary expression)
    fn parse_primary(&mut self) -> Ast {
    fn parse_primary(&mut self) -> Expression {
        match self.next() {
            Token::I64(val) => Ast::I64(val),
            // Literal i64
            Token::I64(val) => Expression::I64(val),

            // Literal String
            Token::String(text) => Expression::String(text.into()),

            Token::Ident(name) if matches!(self.peek(), Token::LParen) => self.parse_funcall(name),

            Token::Ident(name) => Expression::Var(name),

            // Parentheses grouping
            Token::LParen => {
                let inner_expr = self.parse_expr();

                // Verify that there is a closing parenthesis
                if !matches!(self.next(), Token::RParen) {
                    panic!("Error parsing primary expr: Expected closing parenthesis ')'");
                }

                inner_expr
            }

            // Unary negation
            Token::Sub => {
                let operand = self.parse_primary();
                Expression::UnOp(UnOpType::Negate, operand.into())
            }

            // Unary bitwise not (bitflip)
            Token::Tilde => {
                let operand = self.parse_primary();
                Expression::UnOp(UnOpType::BNot, operand.into())
            }

            // Unary logical not
            Token::LNot => {
                let operand = self.parse_primary();
                Expression::UnOp(UnOpType::LNot, operand.into())
            }

            tok => panic!("Error parsing primary expr: Unexpected Token '{:?}'", tok),
        }
    }

    fn parse_funcall(&mut self, name: String) -> Expression {
        let mut args = Vec::new();

        // Consume (
        self.next();

        while self.peek() != &Token::RParen {
            args.push(self.parse_expr());

            if self.peek() == &Token::Comma {
                self.next();
            }
        }
        self.next();

        Expression::FunCall(name, args)
    }

    /// Get the next Token without removing it
    fn peek(&mut self) -> &Token {
        self.tokens.peek().unwrap_or(&Token::EoF)
@ -95,18 +320,36 @@ pub fn parse<T: Iterator<Item = Token>, A: IntoIterator<IntoIter = T>>(tokens: A
impl BinOpType {
    /// Get the precedence for a binary operator. Higher value means the OP is stronger binding.
    /// For example Multiplication is stronger than addition, so Mul has higher precedence than Add.
    ///
    /// The operator precedences are derived from the C language operator precedences. While not all
    /// C operators are included or the exact same, the precedence order is the same.
    /// See: https://en.cppreference.com/w/c/language/operator_precedence

    fn precedence(&self) -> u8 {
        match self {
            BinOpType::Add => 0,
            BinOpType::Mul => 1,
            BinOpType::Declare => 0,
            BinOpType::Assign => 1,
            BinOpType::LOr => 2,
            BinOpType::LAnd => 3,
            BinOpType::BOr => 4,
            BinOpType::BXor => 5,
            BinOpType::BAnd => 6,
            BinOpType::EquEqu | BinOpType::NotEqu => 7,
            BinOpType::Less | BinOpType::LessEqu | BinOpType::Greater | BinOpType::GreaterEqu => 8,
            BinOpType::Shl | BinOpType::Shr => 9,
            BinOpType::Add | BinOpType::Sub => 10,
            BinOpType::Mul | BinOpType::Div | BinOpType::Mod => 11,
        }
    }
}

#[cfg(test)]
mod tests {
    use super::{parse, Ast, BinOpType};
    use crate::lexer::Token;
    use super::{parse, BinOpType, Expression};
    use crate::{
        parser::{Ast, Statement},
        token::Token,
    };

    #[test]
    fn test_parser() {
@ -118,20 +361,30 @@ mod tests {
            Token::I64(2),
            Token::Mul,
            Token::I64(3),
            Token::Add,
            Token::Sub,
            Token::I64(4),
            Token::Semicolon,
        ];

        let expected = Ast::BinOp(
        let expected = Statement::Expr(Expression::BinOp(
            BinOpType::Sub,
            Expression::BinOp(
            BinOpType::Add,
            Ast::BinOp(
                BinOpType::Add,
                Ast::I64(1).into(),
                Ast::BinOp(BinOpType::Mul, Ast::I64(2).into(), Ast::I64(3).into()).into(),
                Expression::I64(1).into(),
                Expression::BinOp(
                    BinOpType::Mul,
                    Expression::I64(2).into(),
                    Expression::I64(3).into(),
                )
                .into(),
            Ast::I64(4).into(),
        );
            )
            .into(),
            Expression::I64(4).into(),
        ));

        let expected = Ast {
            prog: vec![expected],
        };

        let actual = parse(tokens);
        assert_eq!(expected, actual);

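Note on `parse_expr_precedence` and `BinOpType::precedence` above: together they form a precedence-climbing parser. The sketch below reproduces the technique on a reduced grammar so the control flow can be followed in one place. All names in it (`Tok`, `Op`, `Expr`, `MiniParser`, `to_op`, `parse_prec`) are hypothetical and simplified; only the precedence values 10 and 11 are taken from the diff, and the part of the loop body that the diff hides between hunks is filled in with the standard form of the algorithm, which is an assumption.

```rust
/// Reduced, hypothetical token type (the real `Token` enum has many more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tok {
    Num(i64),
    Add,
    Sub,
    Mul,
    Eof,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Add,
    Sub,
    Mul,
}

#[derive(Debug, PartialEq, Eq)]
enum Expr {
    Num(i64),
    Bin(Op, Box<Expr>, Box<Expr>),
}

impl Op {
    /// Higher value binds stronger: Mul (11) over Add/Sub (10).
    fn precedence(self) -> u8 {
        match self {
            Op::Add | Op::Sub => 10,
            Op::Mul => 11,
        }
    }
}

/// Counterpart of `Token::try_to_binop` for the reduced token set.
fn to_op(tok: Tok) -> Option<Op> {
    match tok {
        Tok::Add => Some(Op::Add),
        Tok::Sub => Some(Op::Sub),
        Tok::Mul => Some(Op::Mul),
        _ => None,
    }
}

struct MiniParser {
    toks: Vec<Tok>,
    pos: usize,
}

impl MiniParser {
    /// Look at the current token; past the end this yields `Tok::Eof`.
    fn peek(&self) -> Tok {
        self.toks.get(self.pos).copied().unwrap_or(Tok::Eof)
    }

    /// Return the current token and advance.
    fn next(&mut self) -> Tok {
        let tok = self.peek();
        self.pos += 1;
        tok
    }

    /// Only number literals in this sketch; panics on anything else.
    fn parse_primary(&mut self) -> Expr {
        match self.next() {
            Tok::Num(n) => Expr::Num(n),
            tok => panic!("unexpected token {:?}", tok),
        }
    }

    fn parse_expr(&mut self) -> Expr {
        let lhs = self.parse_primary();
        self.parse_prec(lhs, 0)
    }

    /// Fold operators with precedence >= `min_prec` into `lhs`.
    fn parse_prec(&mut self, mut lhs: Expr, min_prec: u8) -> Expr {
        while let Some(op) = to_op(self.peek()) {
            if op.precedence() < min_prec {
                break;
            }
            self.next();
            let mut rhs = self.parse_primary();

            // Stronger-binding operators to the right claim `rhs` first
            while let Some(next_op) = to_op(self.peek()) {
                if next_op.precedence() <= op.precedence() {
                    break;
                }
                rhs = self.parse_prec(rhs, op.precedence() + 1);
            }

            lhs = Expr::Bin(op, Box::new(lhs), Box::new(rhs));
        }
        lhs
    }
}
```

Feeding the tokens for `1 + 2 * 3 - 4` through `MiniParser::parse_expr` gives the same shape as the expected tree in `test_parser`: a subtraction whose left side is `1 + (2 * 3)`. Because equal precedence stops the inner loop, operators of the same level associate to the left.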
152
src/token.rs
Normal file
@ -0,0 +1,152 @@
use crate::ast::BinOpType;

#[derive(Debug, PartialEq, Eq)]
pub enum Token {
    /// Integer literal (64-bit)
    I64(i64),

    /// String literal
    String(String),

    /// Identifier (name for variables, functions, ...)
    Ident(String),

    /// Loop keyword (loop)
    Loop,

    /// Print keyword (print)
    Print,

    /// If keyword (if)
    If,

    /// Else keyword (else)
    Else,

    /// Function keyword (fun)
    Fun,

    /// Comma (,)
    Comma,

    /// Return keyword (return)
    Return,

    /// Left Parenthesis ('(')
    LParen,

    /// Right Parenthesis (')')
    RParen,

    /// Left curly braces ({)
    LBraces,

    /// Right curly braces (})
    RBraces,

    /// Plus (+)
    Add,

    /// Minus (-)
    Sub,

    /// Asterisk (*)
    Mul,

    /// Slash (/)
    Div,

    /// Percent (%)
    Mod,

    /// Equal Equal (==)
    EquEqu,

    /// Exclamation mark Equal (!=)
    NotEqu,

    /// Pipe (|)
    BOr,

    /// Ampersand (&)
    BAnd,

    /// Circumflex (^)
    BXor,

    /// Logical AND (&&)
    LAnd,

    /// Logical OR (||)
    LOr,

    /// Shift Left (<<)
    Shl,

    /// Shift Right (>>)
    Shr,

    /// Tilde (~)
    Tilde,

    /// Logical not (!)
    LNot,

    /// Left angle bracket (<)
    LAngle,

    /// Right angle bracket (>)
    RAngle,

    /// Left angle bracket Equal (<=)
    LAngleEqu,

    /// Right angle bracket Equal (>=)
    RAngleEqu,

    /// Left arrow (<-)
    LArrow,

    /// Equal Sign (=)
    Equ,

    /// Semicolon (;)
    Semicolon,

    /// End of file
    EoF,
}

impl Token {
    pub fn try_to_binop(&self) -> Option<BinOpType> {
        Some(match self {
            Token::Add => BinOpType::Add,
            Token::Sub => BinOpType::Sub,

            Token::Mul => BinOpType::Mul,
            Token::Div => BinOpType::Div,
            Token::Mod => BinOpType::Mod,

            Token::BAnd => BinOpType::BAnd,
            Token::BOr => BinOpType::BOr,
            Token::BXor => BinOpType::BXor,

            Token::LAnd => BinOpType::LAnd,
            Token::LOr => BinOpType::LOr,

            Token::Shl => BinOpType::Shl,
            Token::Shr => BinOpType::Shr,

            Token::EquEqu => BinOpType::EquEqu,
            Token::NotEqu => BinOpType::NotEqu,

            Token::LAngle => BinOpType::Less,
            Token::LAngleEqu => BinOpType::LessEqu,

            Token::RAngle => BinOpType::Greater,
            Token::RAngleEqu => BinOpType::GreaterEqu,

            Token::LArrow => BinOpType::Declare,
            Token::Equ => BinOpType::Assign,

            _ => return None,
        })
    }
}
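Note on the token set above: four tokens start with `<` (`LAngle`, `LAngleEqu`, `Shl`, `LArrow`), so a lexer has to look one character ahead before deciding which one it has. The sketch below shows one common way to do that (longest match wins). It is an assumption about how such tokens can be lexed, not a copy of the lexer in this diff; `Tok` and `lex_after_langle` are hypothetical names.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical subset: only the tokens that start with '<'.
#[derive(Debug, PartialEq, Eq)]
enum Tok {
    LAngle,
    LAngleEqu,
    Shl,
    LArrow,
}

/// Call this right after a '<' has been consumed.
/// Looks one char ahead and consumes it only if it extends the token.
fn lex_after_langle(code: &mut Peekable<Chars>) -> Tok {
    match code.peek().copied() {
        Some('=') => {
            code.next();
            Tok::LAngleEqu
        }
        Some('<') => {
            code.next();
            Tok::Shl
        }
        Some('-') => {
            code.next();
            Tok::LArrow
        }
        // Anything else: plain '<', and the next char is left untouched
        _ => Tok::LAngle,
    }
}
```

With longest-match lexing, the input `a<-1` reads as a declaration (`a <- 1`) rather than the comparison `a < -1`; writing the comparison would need a space between `<` and `-`.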