NEK-Lang

Variables

All variables live in scopes. Variables defined in an outer scope can be accessed in inner scopes. Once a scope ends, the variables defined in it no longer exist and can't be accessed.

Declaration

  • Declare and initialize a new variable
  • Declaring a previously declared variable again will shadow the previous variable
  • Declaration is needed before assignment or other usage
  • The variable name is on the left side of the <- operator
  • The assigned value is on the right side and can be any expression
a <- 123;

Create a new variable named a and assign the value 123 to it.

Assignment

  • Assigning a value to a previously declared variable
  • The variable name is on the left side of the = operator
  • The assigned value is on the right side and can be any expression
a = 123;

The value 123 is assigned to the variable named a. a needs to be declared before this.

Datatypes

The available datatypes are i64 (64-bit signed integer), string (for example "this is a string") and array (for example [10]).

I64

  • The default datatype is i64, a 64-bit signed integer
  • An i64 is created by writing an integer literal like 546
  • Underscores can be inserted inside a number literal for visual separation, for example 100_000
  • The i64 values can be used as expected in calculations, conditions and so on
my_i64 <- 123_456;
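
The handling of `_` in number literals can be pictured with a small Rust sketch. The helper name `parse_i64_literal` is hypothetical and not taken from the interpreter; the sketch assumes that underscores are purely visual and are dropped before the digits are converted.

```rust
/// Hypothetical helper: parses an integer literal such as `100_000`.
/// Assumption: `_` carries no meaning and is removed before conversion.
fn parse_i64_literal(text: &str) -> Option<i64> {
    // Drop the visual separators.
    let digits: String = text.chars().filter(|c| *c != '_').collect();

    // Reject empty input and anything that is not a plain decimal digit.
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }

    // `parse` returns an error on overflow, which becomes `None` here.
    digits.parse::<i64>().ok()
}
```

With this sketch, `parse_i64_literal("123_456")` yields `Some(123456)`, the same value as the literal without the separator.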

String

  • Strings mainly exist for formatting the text output of a program
  • Strings are created with double quotes, like in other languages: "Hello world"
  • There is no way to access or change the characters of the string
  • Unicode characters are supported "Hello 🌎"
  • Escape characters \n, \r, \t, \", \\ are supported
  • Strings can be assigned to variables, just like i64 values
world <- "🌎";

print "Hello ";
print world;
print "\n";
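
As an illustration of the supported escape characters, here is a minimal Rust sketch that resolves them in the raw text between the quotes. The function `unescape` is hypothetical, and rejecting any other character after a backslash is an assumption; the README does not say how unknown escapes are treated.

```rust
/// Hypothetical helper: resolves the escape sequences \n, \r, \t, \" and \\.
/// Assumption: any other character after a backslash is an error.
fn unescape(raw: &str) -> Option<String> {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();

    while let Some(c) = chars.next() {
        if c != '\\' {
            // Ordinary characters (including Unicode) are copied unchanged.
            out.push(c);
            continue;
        }
        // A backslash must be followed by one of the known escape characters.
        match chars.next()? {
            'n' => out.push('\n'),
            'r' => out.push('\r'),
            't' => out.push('\t'),
            '"' => out.push('"'),
            '\\' => out.push('\\'),
            _ => return None,
        }
    }
    Some(out)
}
```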

Array

  • Arrays can contain any other datatypes and don't need to have the same type in all cells
  • Arrays can be created by using brackets with the size in between [size]
  • Arrays must be assigned to a variable in order to be used
  • All cells will be initialized with i64 0 values
  • The size can be any expression that results in a positive i64 value
  • The array size can't be changed after creation
  • The array's data is always allocated on the heap
  • The array cells can be accessed by using the variable name and specifying the index in brackets my_arr[index]
  • The index can be any expression that results in a non-negative i64 value within the range of the array's indices
  • The indices start with 0
  • When an array is passed to a function, it is passed by reference
width <- 5;
height <- 5;

// Create an array of size 25; every cell starts as 0
my_array <- [width * height];

// Modify first value
my_array[0] = 5;

// Print first value
// Outputs `5`
print my_array[0];
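
The by-reference behaviour of arrays can be modelled in Rust with a shared, mutable handle. This is only a sketch: the `Value` enum and the helpers `new_array`, `set_first` and `get` are hypothetical names, strings are left out for brevity, and the interpreter may represent values differently.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical value type for this sketch (strings omitted for brevity).
#[derive(Clone, Debug, PartialEq)]
enum Value {
    I64(i64),
    // Cloning an `Rc` copies the handle, not the cells behind it.
    Array(Rc<RefCell<Vec<Value>>>),
}

/// Mirrors `[size]`: a fixed-size array whose cells all start as i64 0.
fn new_array(size: usize) -> Value {
    Value::Array(Rc::new(RefCell::new(vec![Value::I64(0); size])))
}

/// Stands in for a NEK function that writes to the first cell of its
/// array argument. Assumes the array has at least one cell.
fn set_first(arg: Value, new_value: i64) {
    if let Value::Array(cells) = arg {
        cells.borrow_mut()[0] = Value::I64(new_value);
    }
}

/// Reads one cell; `None` if the value is not an array or the index is
/// out of range.
fn get(arr: &Value, index: usize) -> Option<Value> {
    match arr {
        Value::Array(cells) => cells.borrow().get(index).cloned(),
        Value::I64(_) => None,
    }
}
```

Calling `set_first(arr.clone(), 5)` changes the cell that the caller sees through `arr`, because both handles point to the same heap allocation.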

Expressions

Operator precedence follows C for all implemented operators. Refer to the C Operator Precedence Table to see the different precedences.

General

  • Parentheses ( and ) can be used to modify the evaluation order, just like in any other programming language.
  • For example (a + b) * c will evaluate the addition before the multiplication, despite the multiplication having higher binding power

Mathematical Operators

Supported mathematical operations:

  • Addition a + b
  • Subtraction a - b
  • Multiplication a * b
  • Division a / b
  • Modulo a % b
  • Negation -a

Bitwise Operators

  • And a & b
  • Or a | b
  • Xor a ^ b
  • Bitshift left (by b bits) a << b
  • Bitshift right (by b bits) a >> b
  • "Bit flip" (One's complement) ~a

Logical Operators

The logical operators treat an operand as false if it is equal to 0 and as true otherwise. Note that the logical operators AND / OR do not support short-circuit evaluation, so both sides of a logical operation are always evaluated, even when the result is already determined by the left side.

  • And a && b
  • Or a || b
  • Not !a (if a is equal to 0, the result is 1, otherwise the result is 0)
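
The missing short-circuit evaluation can be made visible with a Rust sketch in which each operand is a closure. The function names are hypothetical, and the sketch assumes that `&&` and `||` yield 1 or 0 like `!` does, which the text above only states explicitly for `!`.

```rust
/// Truthiness as described above: 0 is false, anything else is true.
fn truthy(v: i64) -> bool {
    v != 0
}

/// Logical AND without short-circuiting: both operand closures always run.
fn logical_and(lhs: impl FnOnce() -> i64, rhs: impl FnOnce() -> i64) -> i64 {
    let l = lhs();
    let r = rhs(); // evaluated even when `l` is 0
    (truthy(l) && truthy(r)) as i64
}

/// Logical OR without short-circuiting: both operand closures always run.
fn logical_or(lhs: impl FnOnce() -> i64, rhs: impl FnOnce() -> i64) -> i64 {
    let l = lhs();
    let r = rhs(); // evaluated even when `l` is non-zero
    (truthy(l) || truthy(r)) as i64
}

/// Logical NOT: 1 if the operand is 0, otherwise 0.
fn logical_not(v: i64) -> i64 {
    (!truthy(v)) as i64
}
```

In practice this means that a function call on the right side of `&&` or `||` always runs, including any output it prints or any array it modifies.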

Equality & Relational Operators

The equality and relational operations result in 1 if the condition is evaluated as true and in 0 if the condition is evaluated as false.

  • Equality a == b
  • Inequality a != b
  • Greater than a > b
  • Greater than or equal a >= b
  • Less than a < b
  • Less than or equal a <= b

Control-Flow

In conditions, such as those of if or loop, every non-zero value is treated as true and 0 is treated as false.

Loop

  • The loop keyword can be used as an infinite loop, as a while loop or as a while loop with advancement (an expression that is executed after each iteration)
  • If only loop is used, directly followed by the body, it is an infinite loop that needs to be terminated by using the break keyword
  • The loop keyword can be followed by the condition (an expression) without needing parentheses
  • Optional: If there is a ; after the condition, there must be another expression which is used as the advancement
  • The loop's body is wrapped in braces ({ }) just like in C/C++
  • The continue keyword can be used to end the current loop iteration early
  • The break keyword can be used to fully break out of the current loop
// Print the numbers from 0 to 9

// With endless loop
i <- 0;
loop {
  if i >= 10 {
    break;
  }
  print i;
  i = i + 1;
}

// Without advancement
i <- 0;
loop i < 10 {
  print i;
  i = i + 1;
}

// With advancement
k <- 0;
loop k < 10; k = k + 1 {
  print k;
}

If / Else

  • The language supports if and an optional else
  • After the if keyword must be the deciding condition, parentheses are not needed
  • The blocks are wrapped in braces ({ })
  • Optional: An else keyword after the if-block must be followed by the else block, which runs when the condition is false
  • NOTE: The logical operators AND / OR do not support short-circuit evaluation, so both sides of a logical operation are always evaluated, even when it is not necessary
a <- 1;
b <- 2;
if a == b {
  // a is equal to b
  print 1;
} else {
  // a is not equal to b
  print 0;
}

Block Scopes

  • It is possible to create a limited scope for local variables that will no longer exist once the scope ends
  • Shadowing variables by redefining a variable in an inner scope is supported
var_in_outer_scope <- 5;
{
  var_in_inner_scope <- 3;
  
  // Inner scope can access both vars
  print var_in_outer_scope;
  print var_in_inner_scope;
}

// Outer scope is still valid
print var_in_outer_scope;

// !!! THIS DOES NOT WORK !!!
// The inner scope has ended
print var_in_inner_scope;
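
One common way to implement such scoping rules is a stack of name tables. The Rust sketch below is an assumption about how this could look, not a description of the actual interpreter; `Scopes` and its methods are hypothetical names, and only i64 values are stored to keep it short.

```rust
use std::collections::HashMap;

/// Hypothetical environment: a stack of scopes, innermost last.
struct Scopes {
    stack: Vec<HashMap<String, i64>>,
}

impl Scopes {
    /// Starts with a single top-level scope.
    fn new() -> Self {
        Scopes { stack: vec![HashMap::new()] }
    }

    /// Called when a `{` opens a new block.
    fn enter(&mut self) {
        self.stack.push(HashMap::new());
    }

    /// Called when a `}` closes a block; its variables cease to exist.
    fn leave(&mut self) {
        // The top-level scope is never removed.
        if self.stack.len() > 1 {
            self.stack.pop();
        }
    }

    /// `name <- value;` always defines in the innermost scope and thereby
    /// shadows any outer variable with the same name.
    fn declare(&mut self, name: &str, value: i64) {
        self.stack
            .last_mut()
            .expect("the top-level scope always exists")
            .insert(name.to_string(), value);
    }

    /// `name = value;` updates the nearest declaration; false if undeclared.
    fn assign(&mut self, name: &str, value: i64) -> bool {
        for scope in self.stack.iter_mut().rev() {
            if let Some(slot) = scope.get_mut(name) {
                *slot = value;
                return true;
            }
        }
        false
    }

    /// Searches from the innermost scope outwards.
    fn lookup(&self, name: &str) -> Option<i64> {
        self.stack.iter().rev().find_map(|scope| scope.get(name).copied())
    }
}
```

After `leave`, a lookup of a variable declared in the inner scope returns `None`, which corresponds to the failing `print var_in_inner_scope;` above.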

Functions

Function definition

  • Functions can be defined by using the fun keyword, followed by the function name and the parameters in parentheses. After the parentheses, the body is specified inside a braces block
  • The function parameters are specified by only their names
  • The function body has its own scope
  • Parameters are only accessible inside the body
  • Variables from the outer scope can be accessed and modified if they are defined before the function
  • Variables from the outer scope are shadowed by parameters or local variables with the same name
  • The return keyword can be used to return a value from the function and exit it immediately
  • If no return is specified, a special void value is returned. That value can't be used in calculations or comparisons, but it can be stored in a variable (even though that is rarely useful)
  • Functions can only be defined at the top-level. So defining a function inside of any other scoped block (like inside another function, if, loop, ...) is invalid
  • Functions can only be used after definition and there is no forward declaration right now
  • However a function can be called recursively inside of itself
  • Functions can't be redefined, so defining a function with an existing name is invalid
fun add_maybe(a, b) {
  if a < 100 {
    return a;
  } else {
    return a + b;
  }
}

fun println(val) {
  print val;
  print "\n";
}

Function calls

  • Function calls are primary expressions, so they can be directly used in calculations (if they return appropriate values)
  • Function calls are performed by writing the function name, followed by the arguments in parentheses
  • The arguments can be any expressions, separated by commas
b <- 100;
result <- add_maybe(250, b);

// Prints 350 + new-line
println(result);

IO

Print

Printing is implemented via the print keyword

  • The print keyword is followed by an expression, the value of which will be printed to the terminal
  • To add a line break, print a string containing a new-line: print "\n";
a <- 1;
// Outputs `1` to the terminal
print a;

// Outputs a new-line to the terminal
print "\n";

Comments

Line comments

Line comments can be initiated by using //

  • Everything after // up to the end of the current line is ignored and not parsed
// This is a comment

Feature Tracker

High level Components

  • Lexer: Transforms text into Tokens
  • Parser: Transforms Tokens into Abstract Syntax Tree
  • Interpreter (tree-walk-interpreter): Walks the tree and evaluates the expressions / statements
  • Simple optimizer: Apply trivial optimizations to the Ast
    • Precalculate binary ops / unary ops that have only literal operands
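
The precalculation of operations with literal operands is usually called constant folding. The Rust sketch below shows the idea on a tiny, hypothetical `Expr` type with only a few operators; the real AST will look different, and the use of wrapping arithmetic is an assumption, since the overflow behaviour of the language is not documented here.

```rust
/// Hypothetical, heavily reduced AST for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Var(String),
    Neg(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Folds unary and binary operations whose operands are all literals.
/// Children are folded first, so nested literal expressions collapse too.
/// Assumption: arithmetic wraps on overflow.
fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Neg(inner) => match fold(*inner) {
            Expr::Lit(v) => Expr::Lit(v.wrapping_neg()),
            other => Expr::Neg(Box::new(other)),
        },
        Expr::Add(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Lit(a), Expr::Lit(b)) => Expr::Lit(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Lit(a), Expr::Lit(b)) => Expr::Lit(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        // Literals and variables are left unchanged.
        other => other,
    }
}
```

For example, the tree for `(2 + 3) * 4` folds to the single literal 20, while `a + 2 * 3` becomes `a + 6` because the variable is only known at run time.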

Language features

  • General expressions
    • Arithmetic operations
      • Addition a + b
      • Subtraction a - b
      • Multiplication a * b
      • Division a / b
      • Modulo a % b
      • Negate -a
    • Parentheses (a + b) * c
    • Equality and relational operators
      • Equal a == b
      • Not equal a != b
      • Greater than a > b
      • Less than a < b
      • Greater than or equal a >= b
      • Less than or equal a <= b
    • Logical operators
      • And a && b
      • Or a || b
      • Not !a
    • Bitwise operators
      • Bitwise AND a & b
      • Bitwise OR a | b
      • Bitwise XOR a ^ b
      • Bitwise NOT ~a
      • Bitwise left shift a << b
      • Bitwise right shift a >> b
  • Variables
    • Declaration
    • Assignment
    • Local variables (for example inside loop, if, else, functions)
    • Scoped block for specific local vars { ... }
  • Statements with semicolon & Multiline programs
  • Control flow
    • Loops
      • While-style loop loop X { ... }
      • For-style loop with X as condition and Y as advancement loop X; Y { ... }
      • Infinite loop loop { ... }
      • Break break
      • Continue continue
    • If else statement if X { ... } else { ... }
      • If Statement
      • Else statement
  • Line comments //
  • Strings
  • Arrays
    • Creating array with size X as a variable arr <- [X]
    • Accessing arrays by index arr[X]
  • IO Intrinsics
    • Print
  • Functions
    • Function declaration fun f(X, Y, Z) { ... }
    • Function calls f(1, 2, 3)
    • Function returns return X
    • Local variables
    • Pass arrays by-reference, i64 by-value, string is a const ref

Parsing Grammar

Expressions

ARRAY_LITERAL = "[" expr "]"
ARRAY_ACCESS = IDENT "[" expr "]"
FUN_CALL = IDENT "(" (expr ",")* expr? ")"
LITERAL = I64_LITERAL | STR_LITERAL | ARRAY_LITERAL
expr_primary = LITERAL | IDENT | FUN_CALL | ARRAY_ACCESS | "(" expr ")" | "-" expr_primary 
             | "~" expr_primary
expr_mul = expr_primary (("*" | "/" | "%") expr_primary)*
expr_add = expr_mul (("+" | "-") expr_mul)*
expr_shift = expr_add ((">>" | "<<") expr_add)*
expr_rel = expr_shift ((">" | ">=" | "<" | "<=") expr_shift)*
expr_equ = expr_rel (("==" | "!=") expr_rel)*
expr_band = expr_equ ("&" expr_equ)*
expr_bxor = expr_band ("^" expr_band)*
expr_bor = expr_bxor ("|" expr_bxor)*
expr_land = expr_bor ("&&" expr_bor)*
expr_lor = expr_land ("||" expr_land)*
expr = expr_lor
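
Each grammar rule above maps naturally to one function of a recursive-descent parser, and the nesting of the rules is what produces the operator precedence. The following Rust sketch covers only a subset (integer literals, parentheses, unary minus, expr_mul and expr_add) and evaluates directly instead of building an AST. `Parser` and `eval` are hypothetical names, and overflow or division by zero is reported as `None` purely as a choice for this sketch.

```rust
/// Hypothetical evaluator for a subset of the expression grammar.
struct Parser<'a> {
    chars: std::iter::Peekable<std::str::Chars<'a>>,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { chars: src.chars().peekable() }
    }

    fn skip_ws(&mut self) {
        while matches!(self.chars.peek(), Some(c) if c.is_whitespace()) {
            self.chars.next();
        }
    }

    /// Consumes `op` if it is the next non-whitespace character.
    fn eat(&mut self, op: char) -> bool {
        self.skip_ws();
        if self.chars.peek() == Some(&op) {
            self.chars.next();
            true
        } else {
            false
        }
    }

    // expr_primary = I64_LITERAL | "(" expr ")" | "-" expr_primary
    fn primary(&mut self) -> Option<i64> {
        if self.eat('-') {
            return self.primary().and_then(|v| v.checked_neg());
        }
        if self.eat('(') {
            let v = self.add()?;
            return if self.eat(')') { Some(v) } else { None };
        }
        let mut digits = String::new();
        while let Some(c) = self.chars.peek().copied() {
            if !c.is_ascii_digit() {
                break;
            }
            digits.push(c);
            self.chars.next();
        }
        digits.parse().ok() // an empty string fails to parse
    }

    // expr_mul = expr_primary (("*" | "/" | "%") expr_primary)*
    fn mul(&mut self) -> Option<i64> {
        let mut lhs = self.primary()?;
        loop {
            if self.eat('*') {
                lhs = lhs.checked_mul(self.primary()?)?;
            } else if self.eat('/') {
                lhs = lhs.checked_div(self.primary()?)?;
            } else if self.eat('%') {
                lhs = lhs.checked_rem(self.primary()?)?;
            } else {
                return Some(lhs);
            }
        }
    }

    // expr_add = expr_mul (("+" | "-") expr_mul)*
    fn add(&mut self) -> Option<i64> {
        let mut lhs = self.mul()?;
        loop {
            if self.eat('+') {
                lhs = lhs.checked_add(self.mul()?)?;
            } else if self.eat('-') {
                lhs = lhs.checked_sub(self.mul()?)?;
            } else {
                return Some(lhs);
            }
        }
    }
}

/// Evaluates a whole input; trailing garbage makes the result `None`.
fn eval(src: &str) -> Option<i64> {
    let mut parser = Parser::new(src);
    let value = parser.add()?;
    parser.skip_ws();
    if parser.chars.peek().is_none() { Some(value) } else { None }
}
```

Because `add` calls `mul` for its operands, `eval("1 + 2 * 3")` gives `Some(7)`, while `eval("(1 + 2) * 3")` gives `Some(9)`. The remaining levels (shift, relational, bitwise, logical) would follow the same pattern, one function per rule.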

Statements

stmt_return = "return" expr ";"
stmt_break = "break" ";"
stmt_continue = "continue" ";"
stmt_var_decl = IDENT "<-" expr ";"
stmt_fun_decl = "fun" IDENT "(" (IDENT ",")* IDENT? ")" "{" stmt* "}"
stmt_expr = expr ";"
stmt_block = "{" stmt* "}"
stmt_loop = "loop" (expr (";" expr)?)? "{" stmt* "}"
stmt_if = "if" expr "{" stmt* "}" ("else" "{" stmt* "}")?
stmt_print = "print" expr ";"
stmt = stmt_return | stmt_break | stmt_continue | stmt_var_decl | stmt_fun_decl 
     | stmt_expr | stmt_block | stmt_loop | stmt_if | stmt_print

Examples

There are a number of examples in the examples directory. They include (non-optimal) solutions to the first five Project Euler problems, as well as a simple Game of Life implementation.

To run an example via cargo-run, use:

cargo run --release -- examples/[NAME]

Extras

Visual Studio Code Language Support

A VSCode extension that provides simple syntax highlighting for nek is also available on GitLab. Since this is a very small-scale project, the extension was not published; instructions on how to install it can be found in the mentioned repository.
