Steps moving forward
Lay out the groundwork for what needs to be done:
An open issue for this topic is also found here
And a project board for the syntax can be found here
And the entire list of syntax can be found here
We have Python and .py, C and .c, JavaScript and .js.
What does this language have? Maybe Verboscript with the .vrbo extension?
The entire concept for this language is plain English representation. Anyone, irrespective of programming experience, should be able to read and understand exactly what is happening. For an example of how this can be useful, consider the following:
This example in Assembly:
extern _printf
global _main
section .text
_main:
mov iter, 0
mov maxit, 5
loop1 nop
push iter
push format
call _printf
pop iter
inc iter
cmp iter, maxit
jl loop1
format: db '%d', 10, 0
Is roughly equivalent to this example in C:
#include <stdio.h>
int main(void)
{
    for (int x = 0; x < 5; x++)
    {
        printf("%d\n", x); /* print the counter followed by a newline */
    }
    return 0;
}
Which is equivalent to this example in Python:
for x in range(0, 5):
    print(x)
Which could potentially be this, in Verboscript:
start a counter at zero, then repeat the following five times:
show the counter
Each layer of abstraction is easier to read as plain English, at the cost of needing a more complicated program to execute it (assembly maps almost one-to-one onto the machine code the CPU executes, while Python needs an interpreter, which is itself typically written in C and compiled down to machine code, before it can be executed).
How does this language represent variables, loops, functions?
Does this language try to understand spelling mistakes, like in English?
Come up with a whole host of useful examples, and possibly their equivalent in Python.
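On the spelling-mistake question, one possible approach is to fuzzy-match each unrecognised word against the list of known keywords. The following is only a minimal sketch using Python's standard `difflib` module; the keyword list and the `correct_word` helper are hypothetical, since the real syntax has not been decided.

```python
import difflib

# Hypothetical keyword list; the real language syntax is not decided yet.
KEYWORDS = ["start", "counter", "repeat", "show", "times"]


def correct_word(word, cutoff=0.75):
    """Return the closest known keyword, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word.lower(), KEYWORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(correct_word("repeet"))  # close enough to "repeat" to be corrected
print(correct_word("banana"))  # nothing similar, so it is left alone
```

A higher `cutoff` makes the correction stricter, which matters because silently "fixing" a word the author really meant could change what the program does.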
Is this language static or dynamic?
- Static: Variables in this language can contain only a single type of data (integer, string, list, etc.) that is set when the variable is defined.
- Dynamic: Variables can contain any type of data, and can be chopped and changed throughout.
- Pros/Cons: Dynamic is more intuitive and potentially simpler to program, but static typing catches type errors before the program runs, rather than partway through execution.
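To make the trade-off concrete, here is a small Python sketch. Python itself is dynamic, so the static rule is only imitated with a hypothetical `StaticVariable` class that fixes a variable's type when it is defined. It is an illustration of the rule, not a proposal for how Verboscript should implement it.

```python
class StaticVariable:
    """Imitates a statically typed variable: the type is fixed at definition."""

    def __init__(self, value):
        self._type = type(value)
        self._value = value

    def set(self, value):
        # Reject any value whose type differs from the original one.
        if type(value) is not self._type:
            raise TypeError(
                f"expected {self._type.__name__}, got {type(value).__name__}"
            )
        self._value = value

    def get(self):
        return self._value


# Dynamic: the same name can hold an integer, then a string.
counter = 5
counter = "five"

# Static (imitated): the second assignment is rejected.
fixed = StaticVariable(5)
fixed.set(6)
try:
    fixed.set("six")
except TypeError as error:
    print("type error:", error)
```

A genuinely static language would perform this check before the program runs; the class above only shows the rule being enforced.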
Interpreted or compiled?
- Interpreted: The language is read and executed line by line by a program written in another language.
- Compiled: The language is translated into machine code before execution, which can then be run directly.
- Transpiled: Another option, where the language is translated into another language, and the translated file is executed.
- Pros/Cons: Compiled is faster, but interpreted is much simpler to build.
- Decision: The language will be interpreted
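The difference between interpreting and transpiling can be sketched in a few lines of Python. This assumes a hypothetical `show <text>` statement that prints its text; both function names are made up for the example.

```python
def transpile_line(line):
    """Translate one hypothetical Verboscript line into Python source."""
    words = line.strip().split(maxsplit=1)
    if len(words) == 2 and words[0] == "show":
        # repr() quotes the text safely as a Python string literal.
        return f"print({words[1]!r})"
    raise SyntaxError(f"cannot translate: {line!r}")


def interpret_line(line):
    """Execute the same line directly, without producing Python source."""
    words = line.strip().split(maxsplit=1)
    if len(words) == 2 and words[0] == "show":
        print(words[1])
    else:
        raise SyntaxError(f"cannot execute: {line!r}")


python_source = transpile_line("show hello world")
print(python_source)                # the translated Python line
exec(python_source)                 # transpiled: run the translation
interpret_line("show hello world")  # interpreted: run the line itself
```

Both routes print the same text; the transpiled route simply leaves a Python file behind that could be inspected or run later.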
Bytecode or tree-walk?
- Bytecode: The language is read, and converted into a linear series of small instructions that can be executed very efficiently.
- Tree-walk: The language is parsed into a syntax tree, where each node is an operation and its branches are the sub-expressions it works on; the program runs by walking this tree.
- Pros/Cons: Tree-walk is easier to program, but bytecode is usually faster and uses less memory.
- Decision: The language will utilise bytecode
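As a rough illustration of the two approaches, the sketch below evaluates the same expression, `x + 4`, once by walking a tree and once by running a flat instruction list on a stack. The node names and opcodes are assumptions chosen for the example, not a fixed design.

```python
# Tree form of (x + 4): nested tuples, evaluated by recursion.
TREE = ("add", ("var", "x"), ("num", 4))


def walk(node, variables):
    """Tree-walk evaluation: recurse into each branch of the tree."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "var":
        return variables[node[1]]
    if kind == "add":
        return walk(node[1], variables) + walk(node[2], variables)
    raise ValueError(f"unknown node: {kind}")


# Bytecode form of the same expression: a flat list run by a stack machine.
BYTECODE = [("LOAD", "x"), ("PUSH", 4), ("ADD", None)]


def run(code, variables):
    """Bytecode evaluation: one loop over a linear instruction list."""
    stack = []
    for opcode, argument in code:
        if opcode == "PUSH":
            stack.append(argument)
        elif opcode == "LOAD":
            stack.append(variables[argument])
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


variables = {"x": 5}
print(walk(TREE, variables))     # 9
print(run(BYTECODE, variables))  # 9
```

The tree-walker needs no extra translation step, while the bytecode version replaces recursion with a single loop, which is where most of its speed advantage tends to come from.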
Object oriented, procedural, or functional? See this page for a whole slew of paradigms
- Object oriented: The language primarily uses and modifies objects (like classes), e.g. Python
- Procedural: The language primarily uses procedure calls, written in exactly the order that the computer should execute them, e.g. BASIC
- Functional: The language relies on function calls to transform data, e.g. Clojure
- Pros/Cons: Honestly, they're all much of a muchness. I personally prefer a blend of OOP and functional (object oriented programming and functional programming).
How extensive are the data types?
- Do we distinguish between strings and numbers, or are they all just 'values'?
- What about between integers and decimals, or are they all just numbers?
- Perhaps another distinction? (Maybe all numbers could be complex, and we distinguish by real/imaginary. Who knows?)
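One way to picture these choices is a function that classifies a literal when the script is read. The sketch below assumes the finest-grained option (integers, decimals and strings kept apart); `parse_literal` is a hypothetical helper, and it leans on Python's own `int` and `float` parsing rules rather than any rules defined for this language.

```python
def parse_literal(text):
    """Classify a literal as an integer, a decimal, or a string (hypothetical rules)."""
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        # Anything that is not a number is kept as plain text.
        return text


for literal in ["5", "2.5", "hello"]:
    value = parse_literal(literal)
    print(literal, "->", type(value).__name__)
```

Treating all numbers alike would mean dropping the `int` branch, and treating everything as a 'value' would mean dropping both branches and deferring the decision to whichever operation uses the value.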
And any other technical considerations I've missed.
Build the Python scripts that will make this language work
This depends on the technical details above, but should loosely follow this order:
- A Lexer (or tokeniser) that converts a plaintext file into tokens that represent language syntax.
  - e.g. x = 5; print(x + 4) becomes [IDENTIFIER:x, EQUAL, NUMBER:5, SEMICOLON, IDENTIFIER:print, LEFTPAREN, IDENTIFIER:x, PLUS, NUMBER:4, RIGHTPAREN]
  - A tokenised script is much easier to work with than a plaintext file.
- A Parser that takes the tokens, and converts them into an intermediate representation (an abstract syntax tree, or bytecode) that is easier to execute.
  - e.g. [IDENTIFIER:x, EQUAL, NUMBER:5, SEMICOLON, IDENTIFIER:print, LEFTPAREN, IDENTIFIER:x, PLUS, NUMBER:4, RIGHTPAREN] becomes:
    - [variable_declaration("x", 5), function_call("print", operation("add", variable_fetch("x"), 4))] in a tree-walk environment.
    - [Declare, x, 5, Function, Print, Add, x, 4] in a bytecode environment.
- A compiler that takes the intermediate representation, and checks all variables and the like, to ensure the code can run.
- An interpreter that executes the intermediate representation.
- Note: If the language is interpreted, steps 3 and 4 are done at the same time, by the same script.
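The first two steps above can be sketched for the `x = 5; print(x + 4)` example as follows. This is a minimal sketch under stated assumptions: the regular expressions, the `tokenise` function, the `Parser` class and the tuple-based tree nodes are all invented for illustration, and the grammar covers only assignment, addition and single-argument calls.

```python
import re

# Token names follow the example in the list above; the patterns are assumptions.
TOKEN_PATTERNS = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("EQUAL", r"="),
    ("PLUS", r"\+"),
    ("LEFTPAREN", r"\("),
    ("RIGHTPAREN", r"\)"),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_PATTERNS))


def tokenise(source):
    """Convert source text into a list of (kind, text) tokens."""
    tokens = []
    position = 0
    while position < len(source):
        match = MASTER.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character {source[position]!r}")
        if match.lastgroup != "SKIP":  # whitespace is dropped
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens


class Parser:
    """Recursive-descent parser that builds the tree out of nested tuples."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.position = 0

    def peek(self, offset=0):
        index = self.position + offset
        return self.tokens[index][0] if index < len(self.tokens) else None

    def take(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        text = self.tokens[self.position][1]
        self.position += 1
        return text

    def program(self):
        statements = [self.statement()]
        while self.peek() == "SEMICOLON":
            self.take("SEMICOLON")
            statements.append(self.statement())
        if self.peek() is not None:
            raise SyntaxError(f"unexpected {self.peek()}")
        return statements

    def statement(self):
        if self.peek() == "IDENTIFIER" and self.peek(1) == "EQUAL":
            name = self.take("IDENTIFIER")
            self.take("EQUAL")
            return ("variable_declaration", name, self.expression())
        return self.expression()

    def expression(self):
        left = self.term()
        while self.peek() == "PLUS":
            self.take("PLUS")
            left = ("operation", "add", left, self.term())
        return left

    def term(self):
        if self.peek() == "NUMBER":
            return int(self.take("NUMBER"))
        name = self.take("IDENTIFIER")
        if self.peek() == "LEFTPAREN":  # a name followed by "(" is a call
            self.take("LEFTPAREN")
            argument = self.expression()
            self.take("RIGHTPAREN")
            return ("function_call", name, argument)
        return ("variable_fetch", name)


tokens = tokenise("x = 5; print(x + 4)")
print(tokens)
print(Parser(tokens).program())
```

The parser's output has the same shape as the tree-walk representation in the list, with tuples standing in for the node constructors. A bytecode back end would flatten that tree into a linear instruction list instead of executing it directly.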
And a testing suite that compares equivalent Python scripts to this language, so we can ensure behaviour is correct.
e.g. if this language uses show hello world to print to the screen, then it should give the same result as Python's print("hello world")
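A comparison test of this kind could look roughly like the sketch below. `run_verboscript` is a hypothetical stand-in for the real interpreter (it understands only `show <text>`), and `captured_output` is a made-up helper built on the standard `contextlib.redirect_stdout`.

```python
import contextlib
import io


def run_verboscript(source):
    """Hypothetical stand-in for the interpreter: supports only `show <text>`."""
    for line in source.splitlines():
        command, _, text = line.strip().partition(" ")
        if command == "show":
            print(text)
        elif command:
            raise SyntaxError(f"unknown command: {command}")


def captured_output(function, *arguments):
    """Run a function and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        function(*arguments)
    return buffer.getvalue()


def test_show_matches_print():
    # The Python script and the Verboscript script should print the same text.
    expected = captured_output(exec, 'print("hello world")', {})
    actual = captured_output(run_verboscript, "show hello world")
    assert actual == expected


test_show_matches_print()
```

Once the real interpreter exists, each new syntax feature could get a pair of scripts like this, so a behaviour change shows up as a failing comparison.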