AEC specification

I've started a discussion about AEC on many Internet forums, including Reddit and Discord.


  1. Introduction
  2. What platforms can be targeted now
  3. How to use the compilers
  4. Comments
  5. Constants
  6. Variable declarations
  7. Arrays and pointers
  8. Assignments
  9. Operators
  10. Branching
  11. Loops
  12. Functions
  13. Structures
  14. Inline assembly
  15. Built-in functions
  16. String manipulation
  17. Advanced array manipulation
  18. Conclusion

Introduction
Compilers these days, even C compilers, have lots of features and are often smarter than the programmer at the things they are made to do. While this is usually very useful, sometimes it is counter-productive. Suppose that you are writing a program in Assembly and want to do something high-level (because correctness matters far more than speed). You can't tell your C compiler to simply output the code that assigns sqrt(a*a+b*b) to c. If you try, it will complain that those variables aren't declared and that you aren't inside a C function. C compilers have their own ideas about how to declare functions and variables in assembly. While these ideas usually work, sometimes you are writing something where the assembler will reject the code the C compiler produced for those declarations, and there is no way to change the code a C compiler outputs for declaring a function. So, sometimes, compilers that could come in handy for some task try to do too much and end up counter-productive. Can we have a language with a compiler that does only what you told it to, in a very predictable way? Well, that is what inspired me to create the Arithmetic Expression Compiler (AEC) a few years ago.

What platforms can be targeted now
Right now, I've written two compilers for the AEC language. First, I wrote one targeting x86 processors (AMD and Intel). That one is written in JavaScript, and its core can run in any browser with basic JavaScript support (even in Internet Explorer 6). To use all the features, you need to run it in NodeJS or Duktape, so that it can access the file system. When I started studying at the university, many professors were impressed by it. My Algorithms and Data Structures professor, Alfonso Baumgartner, urged me to write a paper about it, which got published in Osječki Matematički List. The compiler targeting x86 is around 2'000 lines of code (excluding the example programs).
So, I decided to extend my language so that it can target not only x86, but also the JavaScript Virtual Machine, using WebAssembly (the JavaScript bytecode, which Mozilla has been pushing to get standardized, so that people can run programming languages better than JavaScript in a browser). Targeting WebAssembly is easier than targeting x86 (or probably any physical processor), because WebAssembly was designed to be an easy target for compilers, rather than to be easy to implement in hardware or to write by hand, so I was able to add many new features. However, I think the result is still not nearly as intrusive as C compilers are. Emscripten (the primary C and C++ compiler for WebAssembly, a modified version of the Clang compiler) always assumes the standard C library is present on the JavaScript Virtual Machine, whatever program it compiles, so it is overkill in most of the cases where it could come in handy. The AEC-to-WebAssembly compiler has around 5'000 lines of code, and it's written in C++ (a language much more suitable for writing compilers than JavaScript).
WebAssembly is one of the reasons I am a libertarian, because it shows that, when a private company makes a mistake, no matter how hopeless the situation seems, a solution will come... from capitalism itself. Back when it had a near-monopoly on Internet browsers, Netscape made JavaScript, which is widely agreed to be a very poorly designed programming language, the standard language of the Internet. For a long time, that seemed like a way to hold back the development of the Internet forever. Fortunately, once the Internet got used more, somebody came up with the brilliant idea of WebAssembly, which seems to solve basically all the problems Netscape created with that wrong decision. And, incidentally, that solution also significantly lowers the barrier to making a new programming language, so that many more people can experiment with those things. When governments make a mistake, quite often, there is no solution. When the UN decided back in 1948 that the solution to the Holocaust is to make Palestinians pay for Hitler's crimes with their land... it led to wars which continue to this day, and will likely continue until a nuclear holocaust destroys most life on Earth (as has almost happened a few times by now). A private company most likely cannot make a mistake with such horrible consequences.
UPDATE on 11/10/2020: Of course, my compiler is not very high-quality software. The LGTM static analyzer grades it in its "Language grade: C/C++" category and reports, as its "Total alerts", the potential bugs it has found in those 5'000 lines of code. Most of them are unnecessary deep copies of C++ objects (which waste time and memory), and quite a few are uses of potentially uninitialized variables in JavaScript. If you want to collaborate with me, perhaps one of the first things to do is to fix the bugs found by static analysis.

How to use the compilers
Probably the simplest way to use the AEC-to-x86 compiler on a Linux machine is to type the following code into a terminal emulator:
mkdir ArithmeticExpressionCompiler
cd ArithmeticExpressionCompiler
gcc -o aec aec.c duktape.c -lm
./aec analogClock.aec
gcc -o analogClock analogClock.s -m32
If everything is fine, the Analog Clock program should now print the current time in the terminal. I think this will work on the vast majority of Linux machines, as well as on many non-Linux (FreeBSD, Solaris...) machines. A potential problem is that the 32-bit libraries may not be installed on a 64-bit Linux machine (so that -m32 fails), but this is rarely the case, as Linux machines today usually have WINE (a Windows compatibility layer requiring 32-bit libraries), or at least some 32-bit programs. Needless to say, this will not work on Linux running on ARM processors, such as Android or Raspberry Pi. Also, the Analog Clock program probably cannot be run on Windows (I haven't managed to try it using Cygwin, but I am quite sure it wouldn't run on Windows even if I managed to install Cygwin), but some other programs can. However, those cannot be assembled by GNU Assembler (invoked by gcc); you need to use FlatAssembler instead. See the ReadMe.html inside for more details. The executables of Duktape for various x86 OS-es and example x86 AEC programs are available in a ZIP archive on my GitHub profile. You can also use SimpleCalculator, a version of the AEC-to-x86 compiler running in Rhino (a JavaScript engine written in Java, by Mozilla) and using a Swing GUI. It can be used as a simple calculator, but it also supports converting AEC programs to x86 assembly.
UPDATE on 08/05/2021: I have found a relatively simple way to modify analogClock.aec so that it can be assembled by the version of GNU Assembler that comes with 32-bit MinGW-w64, and thus run on both 32-bit and 64-bit Windows, so I saved it in analogClockForWindows.aec. Unfortunately, it still shows a lot of errors if you attempt to assemble it using the version of GNU Assembler that comes with TDM-GCC or Cygwin. Apparently, the versions of GNU Assembler that come with the various ports of GCC to Windows differ from each other more than the version of GNU Assembler that runs on Linux differs from the one that comes with MinGW-w64. I find that both surprising and somewhat unfortunate. What is especially unfortunate is that some preprocessor directives in GNU Assembler have the same syntax, but different meanings, depending on whether it is targeting Linux or Windows. With a few more modifications, which I have also made, analogClockForWindows.aec can also be assembled by the version of LLVM Assembler that comes with Clang on Windows. However, the GNU Assembler that comes with MinGW-w64 and the LLVM Assembler that comes with Clang on Windows do not output the same machine code. The executable produced by LLVM Assembler appears to run significantly faster on Windows 10, but it refuses to run at all on Windows XP. I have not studied it enough to explain what is going on there, and it is a bit creepy.
UPDATE on 13/05/2021: Like I have said, the analogClockForWindows.exe file, assembled by Clang on Windows, works on both 32-bit and 64-bit Windows 10 but, for some reason that escapes me, refuses to run on Windows XP. You can take the assembly code my compiler produces for analogClockForWindows.aec, which will be called analogClockForWindows.s, and assemble it using MinGW-w64. Then it will run on Windows XP, but it will be slower. Again, the explanation for that escapes me. An obvious explanation would be that MinGW-w64 includes some kind of polyfill for functions missing on Windows XP, which makes the executable able to run there, but also makes it slower. The problem with that explanation is that the executable produced by Clang is actually bigger than the executable produced by MinGW-w64. If the executable made by MinGW-w64 were polyfilled, we would expect it to be bigger, rather than smaller.
UPDATE on 16/05/2021: I have started a Reddit thread about it.
The AEC-to-x86 compiler, both when run in Rhino and when run in Duktape, can output assembly in two formats, one compatible with FlatAssembler, and one compatible with GNU Assembler. To switch between them, use the directives syntax fasm and syntax gas. By default, it targets FlatAssembler. When targeting GNU Assembler, keep in mind that the directive syntax gas needs to be the very first directive in your program, even before any comments. Namely, the AEC-to-x86 compiler passes the comments down to the assembler, but FlatAssembler begins comments with ;, whereas GNU Assembler begins comments with #. To GNU Assembler, a semicolon ; separates multiple assembly-language directives on a single line (useful when it is invoked from a debugger, where you need to inline a few directives).
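Why the position of syntax gas matters can be shown with a toy model. The following Python sketch is not the compiler's code: the function names emit_comment and pass_comments_through are hypothetical, and the model handles nothing except comments and the two syntax directives.

```python
# Comment prefixes of the two assemblers, as described in the paragraph above.
COMMENT_PREFIX = {"fasm": ";", "gas": "#"}


def emit_comment(text, syntax):
    """Return one assembly comment line for the chosen assembler dialect."""
    return COMMENT_PREFIX[syntax] + text


def pass_comments_through(aec_lines):
    """Toy pass: copy AEC comments to the output in the currently active dialect.

    Assumption: the active dialect starts as FlatAssembler and changes only
    when a 'syntax fasm' or 'syntax gas' line is seen.
    """
    syntax = "fasm"
    output = []
    for line in aec_lines:
        stripped = line.strip()
        if stripped in ("syntax fasm", "syntax gas"):
            syntax = stripped.split()[1]
        elif stripped.startswith(";"):
            output.append(emit_comment(stripped[1:], syntax))
    return output


# A comment placed before 'syntax gas' is still emitted with ';', which
# GNU Assembler reads as a statement separator rather than as a comment.
example = pass_comments_through(["; before", "syntax gas", "; after"])
```

In this model, example holds one line starting with ; and one starting with #, and only the second is a comment to GNU Assembler.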
Using the AEC-to-WebAssembly compiler is, on most Linux machines, a little trickier. The following code might work:
git clone
cd AECforWebAssembly
g++ -std=c++11 -o aec AECforWebAssembly.cpp
cd analogClock
../aec analogClock.aec
npx -p wabt wat2wasm analogClock.wat
node analogClock
Again, if everything is fine, the Analog Clock program should print the current time in the terminal. However, for this to work, you need to have NodeJS installed, which is often not the case. You also need a version of NodeJS newer than the one usually shipped with Linux: Debian-like Linux distributions today usually ship with NodeJS 10, and CentOS-like distributions with NodeJS 6, while the code my compiler generates relies on WebAssembly.Global being present, which is only true in NodeJS 11 and newer. For exactly the same reason, the code my compiler produces will not run in Firefox 52, which is the last version of Firefox to run on Windows XP, and the first one to support WebAssembly. I think I made the right decision not to waste time supporting the earliest implementations of WebAssembly, as almost nobody would notice my effort, and I'd need to put in a lot of it. Where to draw the line there? ❔🙄 Make my compiler output asm.js as well? 👎 Of course, in order for npx -p wabt wat2wasm to work, you either need to have WABT already installed, or you need to be connected to the Internet so that npx can download it (which you probably are, as you recently cloned from GitHub, but the firewall also needs to allow access to the server npx downloads from, which fewer firewalls do). Also, fewer machines have git installed than have wget installed, but, if you are a programmer, you probably have git installed as well (if you do not have wget, you may try using curl instead). If you want to use it on an OS that's not compatible with Linux, well, good luck. I have provided some executable files of my compiler for different OS-es, including FreeDOS (compiled using the DJGPP C++ compiler), in case it helps, as the assets to the releases of AECforWebAssembly. Especially good luck using WABT and NodeJS there (I believe WABT can be made to run on FreeDOS, but with a lot of difficulty, and that NodeJS cannot be made to run on FreeDOS even with a lot of effort).
Sorry, but dealing with binary files is complicated, and me building an assembler for WebAssembly into my compiler (instead of simply outputting assembly) would give me a lot of hassle and very little benefit. I hope you understand.
A friend I met on Discord called zero9178 helped me write the CMake script for building and testing the AEC-to-WebAssembly compiler, so that you can easily use any IDE that works well with CMake (Visual Studio, QtCreator, NetBeans and CLion can import CMake projects automatically, and CMake can be made to output the configuration files necessary for Eclipse). For now, however, Visual Studio 2019 falsely claims that the tests which invoke WABT fail (it says that the WABT executables are not proper Windows executables, although they can be run from the command line as well as from CMake run from the command line). I do not know why. The automated tests are integrated with GitHub Actions and GitLab CI, and they seem to work properly there. The structureDeclarationTestCompiles test seems to run around an order of magnitude faster (that is, 10 times faster) on GitHub Actions than on the laptop I am working on, and around five times faster than on GitLab CI (that is, GitLab CI seems to be around 2 times faster than my laptop). I am not sure why, as I expected my compiler to run very poorly in computer clouds, because computer clouds are, as far as I understand, made of countless low-powered computers which can be used well only by programs that support parallel execution, which my compiler does not. I asked a question on Quora about that. Nevertheless, I think the hypothesis that the compiler does not actually run there (and that this is why it seems to run so quickly) can be eliminated, as subsequent tests would fail if that were the case (WABT invoked in structureDeclarationTestAssembles would exit with an error message, and so would NodeJS invoked by structureDeclarationTestRuns). The tokenizer of my compiler runs very slowly, and I do not know how to make it faster. I have made a forum thread about it.
UPDATE on 28/05/2021: As a friend I met on Discord called elucent (the author of the Basil programming language) suggested, the tokenizer can be made a lot faster by using std::remove_if to erase all whitespace tokens at once, rather than by calling std::vector<T>::erase for each all-whitespace token (as zero9178 had suggested I do). I implemented that, and now the test structureDeclarationTestCompiles takes only 2 seconds to run on GitLab CI, whereas it previously took 6 seconds (so it is around 3 times faster). He also suggested some other ways to make both the tokenizer and the parser faster, but those are harder to implement.
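The difference between the two approaches can be sketched in Python. This is an analogy, not the compiler's C++ code: the function names are hypothetical, and a list comprehension plays the role of the std::remove_if idiom by visiting every token exactly once.

```python
def strip_whitespace_one_by_one(tokens):
    """Erase each all-whitespace token separately.

    Every deletion shifts all later tokens one place to the left, so the
    total work grows quadratically with the number of tokens.
    """
    tokens = list(tokens)
    i = 0
    while i < len(tokens):
        if tokens[i].strip() == "":
            del tokens[i]
        else:
            i += 1
    return tokens


def strip_whitespace_in_one_pass(tokens):
    """Keep the non-whitespace tokens in a single linear pass."""
    return [token for token in tokens if token.strip() != ""]


sample = ["Integer32", " ", "a", "\n", ":=", "\t", "0", ";"]
```

Both functions return the same list for sample, but only the second one avoids moving the tail of the list again and again.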
UPDATE on 06/06/2021: The AEC-to-WebAssembly compiler can now target the WebAssembly System Interface (WASI), as the example Hello World from WASI shows. Basically, you need to put #target WASI before any declarations. Unlike with the syntax gas directive of the AEC-to-x86 compiler, comments can go before it. I believe this is a significant step forward on the way to running the WebAssembly dialect of AEC on rarely used operating systems such as FreeDOS, using portable WASI environments such as Wasm3. Of course, that is assuming we also manage to compile wat2wasm from the WebAssembly Binary Toolkit to run there (which will not be easy, because CMake does not run there). Also, some blockchains support WebAssembly Binary Interface, so perhaps my compiler can now be useful there.

Comments
In the version of AEC targeting x86, comments start with ; and end with a newline character, as in the FlatAssembler dialect of Assembly (which ArithmeticExpressionCompiler primarily targets), and there are no multi-line comments. In the version of AEC for WebAssembly, comments work as in C, C++ and JavaScript: single-line comments start with //, and multi-line comments start with /* and end with */. Multi-line comments do not nest (as they do in, for example, Swift). Many people say multi-line comments are a bad thing because bad programmers use them for versioning code (which is a very bad practice). I don't think it is the job of the compiler to enforce some particular programming style and refuse to compile programs with code smells (though warnings are often useful).

Constants
In AEC for x86, a token that consists of digits and at most one decimal point is a number, and all numbers are treated as 32-bit decimal numbers. In both dialects of AEC, a string is a token which starts and ends with ", and strings are passed unchanged to the assembler. String tokens next to each other are concatenated by the tokenizer into one string (as in C and C++, in contrast with JavaScript). In both dialects of AEC, a token consisting of three characters, of which both the first one and the last one are ', is a number, and the tokenizer replaces it with a number equal to the ASCII code of the second character in that token (like in most dialects of x86 assembly). In AEC for WebAssembly, a token which matches the regular expression "(^\\d+$)|(^0x(\\d|[a-f]|[A-F])+$)" is of type Integer64 and is passed unchanged to the assembler (notice that this includes hexadecimal numbers starting with 0x, as in C, C++ and JavaScript). A token which matches the regular expression "^\\d+\\.\\d*$" is of type Decimal64 and is also passed unchanged to the assembler. Notice that in AEC for WebAssembly, 3/2=1 (as in C and C++), while, in AEC for x86, 3/2=1.5 (as in JavaScript). It's hard to tell which approach is better; both can produce hard-to-find bugs.
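The constant rules of the WebAssembly dialect can be tried out with the two regular expressions quoted above. The following is a Python sketch rather than the compiler's tokenizer: the function names are hypothetical, and treating a replaced character literal as an ordinary integer token is an assumption.

```python
import re

# The two patterns from the text, written as Python raw strings.
INTEGER64_PATTERN = re.compile(r"(^\d+$)|(^0x(\d|[a-f]|[A-F])+$)", re.ASCII)
DECIMAL64_PATTERN = re.compile(r"^\d+\.\d*$", re.ASCII)


def replace_character_literal(token):
    """Turn a three-character token such as 'A' into its ASCII code."""
    if len(token) == 3 and token[0] == "'" and token[2] == "'":
        return str(ord(token[1]))
    return token


def classify_constant(token):
    """Return 'Integer64', 'Decimal64' or None for a single token."""
    token = replace_character_literal(token)
    if INTEGER64_PATTERN.search(token):
        return "Integer64"
    if DECIMAL64_PATTERN.search(token):
        return "Decimal64"
    return None


# The two dialects also divide differently. For positive operands, Python's
# floor division matches the truncating division of AEC for WebAssembly.
wasm_style_quotient = 3 // 2  # 1
x86_style_quotient = 3 / 2    # 1.5
```

Note that, under these patterns, 3. counts as a Decimal64 constant, while .5 and 1e5 match neither pattern.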

Variable declarations
In the version of AEC targeting x86, there are no variable declarations in the language itself, the compiler simply assumes any token that matches the regular expression "^(_|[A-Z]|[a-z])\\w*\\[?$" and is not a keyword is a name of a variable of type 32-bit decimal number or 32-bit decimal number array (if it ends with [) that's been previously declared in assembly. In the version of AEC targeting WebAssembly, variables are declared with:
DataType name_of_the_variable;
Where DataType is Character, CharacterPointer, Integer16, Integer16Pointer, Integer32, Integer32Pointer, Decimal32, Decimal32Pointer, Decimal64 or Decimal64Pointer. The compiler assumes the pointers are 32-bit and characters are 8-bit, which is true in the vast majority of cases. There is also a way to initialize a variable:
DataType name_of_the_variable := initial_value;
Without that, global variables are zero by default, and local variables contain whatever happens to be on the top of the system stack at the time of their declarations (as in C or C++). In Ada and old versions of C, variables must be declared on the top of a scope, before any other statement. There is no such restriction in AEC. However, the initial values to global variables must be compile-time constants. That means, you can't refer to other global variables or to AEC functions. However, when writing an initial value for a decimal variable, you can use C library (available to the compiler) functions and constants, such as sin(x), asin(x), atan2(y,x), pi or e. The same doesn't work when assigning initial values to local variables. It's possible to declare multiple variables of the same type in the same statement by separating them with a comma , (as in C, C++ and JavaScript). For example (excerpt from the Dragon Curve):
Integer32 directionX[4]    := { 0, 1, 0, -1},
          directionY[4]    := {-1, 0, 1,  0},
          currentX         := 10,
          currentY         := 250 + 490 - 410, //When set on 250, the turtle
                                               //reaches 410 and then turns
                                               //back (I know this by
          currentDirection := 0,
          lineLength       := 5,
          lineWidth        := 2,
          currentStep      := 0;
I like the C-like approach to declaring multiple variables in the same statement way more than the Ada-and-VHDL-like approach, let alone the Rust-like approach.
TODO: Decide what to do about aligning the variables in memory (making sure that, for example, an Integer32 is at a memory location divisible by 4). Aligning variables wastes memory; sometimes around half of the allocated memory ends up unused because of the alignment. On the other hand, for interoperability with other languages, it is probably desirable for variables and arrays to be aligned. JavaScript throws an exception when you try to create a typed-array view at an unaligned offset, while in C and C++, unaligned access works with some compilers and optimization levels but not with others (it's undefined behavior). Right now, the AEC compiler doesn't make sure the variables are aligned, and I am not sure that is the best approach. Also, while the JavaScript Virtual Machine does allow unaligned memory access from WebAssembly, it's not guaranteed to be nearly as fast as aligned access (on x86, it usually is; on ARM, it's many times slower).
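How much memory alignment can cost is easy to estimate with a small model. The following Python sketch is not part of the compiler: the names align_up and layout are hypothetical, and it assumes that every type would be aligned to its own size.

```python
# Sizes in bytes of some AEC data types (pointers are 32-bit in AEC).
SIZES = {
    "Character": 1, "Integer16": 2, "Integer32": 4,
    "Integer64": 8, "Decimal32": 4, "Decimal64": 8,
}


def align_up(offset, alignment):
    """Round offset up to the nearest multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def layout(types, aligned):
    """Return (offsets, total_size) for variables placed one after another."""
    offsets = []
    offset = 0
    for type_name in types:
        size = SIZES[type_name]
        if aligned:
            offset = align_up(offset, size)
        offsets.append(offset)
        offset += size
    return offsets, offset


variables = ["Character", "Decimal64", "Character", "Decimal64"]
packed_offsets, packed_size = layout(variables, aligned=False)
aligned_offsets, aligned_size = layout(variables, aligned=True)
```

For these four variables, the packed layout takes 18 bytes and the aligned one takes 32, so 14 of the 32 bytes are padding, which is close to the "around half" mentioned above.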

Arrays and Pointers
Arrays are declared as follows:
DataType name_of_the_array[size];
They can be initialized as follows:
DataType name_of_the_array[size] := {first_element, second_element...};
Note that, unlike in C, if you put only one element in the initializer list (between { and }), only the first element is initialized with that value, while others are left uninitialized (or are, in the case of global variables, set to zero). Makes a lot more sense to me than those complicated rules C and C++ have for initializing arrays. In C and C++, you can use array-style syntax with pointers or pointer-style syntax with arrays most of the time, except in some confusing scenarios. In AEC, you can never do those things. To the AEC compiler, the array is named name_of_the_array[ rather than name_of_the_array, and attempting to use name_of_the_array to refer to the pointer to the first element of the array (as you can usually do in C or C++) leads to "undeclared variable" error. This can be confusing to those who come from C or C++, but it makes a lot more sense to me. Or, you can make it behave in the C-like manner (if it makes your code shorter) like this (excerpt from the Analog Clock in AEC):
    Character signature[100] := {0};
    CharacterPointer signature := AddressOf(signature[0]);
    //AEC, unlike C, always makes a clear distinction between
    //arrays and pointers.
    logString("Empty signature has length of: ");
    strcat(signature, " Analog Clock for WebAssembly\n");
To get or assign the value to the thing a pointer points to, you use the ValueAt( operator, like this (excerpt from HybridSort in AEC ):
      ValueAt(originalni_niz + donja_granica) :=
                ValueAt(originalni_niz + donja_granica + 1);
Makes a lot more sense to me than to use the same operator as for the multiplication, like C or C++ do (they use * for both of those things). To get a pointer to something, you use the AddressOf( operator. Makes a lot more sense to me than the way C and C++ do that, using the same operator as for the bitwise and operation (they use &). AEC for x86 doesn't support pointers at all. Note that, like in C and C++ (but unlike in JavaScript or Assembly), name_of_some_integer32pointer := name_of_some_integer32pointer + 1 increases the value stored in it by 4 (the size of Integer32), rather than by 1.
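The scaling rule for pointer arithmetic can be modeled outside AEC. In the following Python sketch, the names pointer_add, value_at_integer32 and ELEMENT_SIZE are hypothetical, and a bytearray stands in for the little-endian linear memory of WebAssembly.

```python
import struct

# Size in bytes of the element each pointer type points to.
ELEMENT_SIZE = {
    "CharacterPointer": 1, "Integer16Pointer": 2, "Integer32Pointer": 4,
    "Decimal32Pointer": 4, "Decimal64Pointer": 8,
}


def pointer_add(address, count, pointer_type):
    """Model of 'pointer + count': the address moves by whole elements."""
    return address + count * ELEMENT_SIZE[pointer_type]


def value_at_integer32(memory, address):
    """Model of ValueAt( applied to an Integer32Pointer."""
    return struct.unpack_from("<i", memory, address)[0]


# Four Integer32 values stored one after another, starting at address 0.
memory = bytearray(16)
struct.pack_into("<4i", memory, 0, 10, 20, 30, 40)

pointer = 0
pointer = pointer_add(pointer, 1, "Integer32Pointer")  # the address is now 4
second_element = value_at_integer32(memory, pointer)   # reads the value 20
```

Adding 1 to the modeled Integer32Pointer moves the address by 4 bytes, so the read lands exactly on the second element.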
TODO: Implement the multi-dimensional arrays. But let's not use the JavaScript-like approach to them, JavaScript really sucks in that regard.

Assignments
For assignments, you use the := operator, in both dialects of AEC. In AEC for WebAssembly, you can nest assignment expressions, like this (excerpt from HybridSort):
    broj_obrnuto_poredanih_podniza  :=
        broj_vec_poredanih_podniza  :=
        broj_pokretanja_QuickSorta  :=
        broj_pokretanja_MergeSorta  :=
        broj_pokretanja_SelectSorta := 0;
After that, all 5 variables will be 0. You can't do that in AEC for x86. In AEC for WebAssembly, assignment statements end with a semicolon ; and can run across multiple lines. In AEC for x86, they end with a newline character. In AEC for x86, there is also the string-assignment operator <= (similar to the difference between signal and variable assignments in VHDL). In AEC for WebAssembly, you use := for string assignments.
TODO: Implement the assignment operators +=, -= and related ones as they are in C, C++ and JavaScript. They can make code significantly shorter. (UPDATE on 14/09/2020: They have been implemented.)

Operators
AEC has the following operators:
Priority  Associativity  Operators
1         left           . ->
2         left           * /
3         left           - +
4         left           < > =
5         left           and (in the x86 dialect: &)
6         left           or (in the x86 dialect: |)
7         right          ?: (ternary conditional operator)
8         right          := (assignment operator)
(UPDATE on 07/10/2020: The operators . and -> have the same meaning as in C, dealing with structures and structure pointers.) The first argument of the ternary conditional operator is converted to Integer32 (that's how WebAssembly represents Booleans), and the last two are converted to the stronger of their two types. The strongest types are pointers, next comes Decimal64, then Decimal32, then Integer64, then Integer32, then Integer16, while the weakest type is Character. Most binary operators convert both of their operands to the stronger type, except for the assignment operator and the and and or operators. The assignment operator converts the right-hand-side operand to the type of the left-hand-side operand. In case one of them is a pointer and the other one isn't, the compiler issues a warning. The bitwise and and or convert both operands to Integer32. There are no logical and and or operators in AEC (such as && and ||, also spelled and and or, in C and C++). This can lead to some confusing behavior. For instance, 1 && 2 is 1 in C and C++, but 1 and 2 is 0 in AEC. To convert a number to a Boolean, you can use the built-in function not(x), like this (from HybridSort):
        velicina_niza / (64 * 1024 / 4) +
        not(not(mod(velicina_niza, 64 * 1024 / 4))) >
        prijasnja_velicina_niza / (64 * 1024 / 4) +
        not(not(mod(prijasnja_velicina_niza, 64 * 1024 / 4))) or
        prijasnja_velicina_niza = 0
Namely, not(not(x)) returns 1 if x is not 0, and 0 if x=0. You can also use not(x=0). There is no built-in exclusive or function or operator in AEC, but you can easily build one like this (excerpt from Arithmetic Operators Test):
Function xor(Integer32 first,
             Integer32 second)
     Which Returns Integer32 Does
        //I hope people will like the way I named the bit-operators.
        Return (first and invertBits(second)) or (invertBits(first) and second);
EndFunction
As you can see, there is a built-in invertBits(Integer32 x) function which inverts the bits in an integer. Internally, it xor-s x with -1.
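The behavior of the bitwise operators can be checked with a small model. This Python sketch is not AEC code: to_signed32, invert_bits, aec_not and xor are hypothetical helper names, and Python's & and | stand in for the AEC operators and and or on 32-bit integers.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Wrap a Python integer into the range of a signed 32-bit integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def invert_bits(x):
    """Model of invertBits(x): xor the argument with -1."""
    return to_signed32(x ^ -1)


def aec_not(x):
    """Model of not(x): 1 if x is 0, otherwise 0."""
    return 1 if x == 0 else 0


def xor(first, second):
    """Exclusive or built from 'and', 'or' and invertBits, as in the excerpt."""
    return to_signed32(
        (first & invert_bits(second)) | (invert_bits(first) & second)
    )


bitwise_result = 1 & 2                                       # 0, as in AEC
normalized_result = aec_not(aec_not(1)) & aec_not(aec_not(2))  # 1
```

Normalizing both operands with not(not(x)) first gives the result a logical and would give, at the cost of evaluating both sides.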
TODO: Implement a less-than-or-equal-to operator <= and a greater-than-or-equal-to operator >=. For that, we will need to modify the tokenizer, the parser and the compiler.
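The conversion rules described in this section can be summarized in a few lines. The following Python sketch only restates those rules: the names stronger_type and conversion_type are hypothetical, and it assumes that any type whose name ends in Pointer is a pointer and that, between two pointers, the left one wins.

```python
# Non-pointer types from weakest to strongest, as listed in the text.
STRENGTH = ["Character", "Integer16", "Integer32",
            "Integer64", "Decimal32", "Decimal64"]


def rank(type_name):
    """Pointers are stronger than every non-pointer type."""
    if type_name.endswith("Pointer"):
        return len(STRENGTH)
    return STRENGTH.index(type_name)


def stronger_type(left, right):
    """Return the stronger of two types (the left one on a tie)."""
    return left if rank(left) >= rank(right) else right


def conversion_type(operator, left, right):
    """Type the operands of a binary operator are converted to."""
    if operator == ":=":
        return left  # the right-hand side is converted to the left-hand side
    if operator in ("and", "or"):
        return "Integer32"
    return stronger_type(left, right)
```

For example, under these rules Integer64 + Decimal32 is computed in Decimal32, because Decimal32 ranks above Integer64 even though it has fewer bits.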

Branching
Branching is supported only via If, If-Else, and If-ElseIf-Else. There is no equivalent of the switch-case of C, C++ and JavaScript. After the If token, the compiler expects a condition. In AEC for WebAssembly, the condition ends with a Then token, while, in AEC for x86, the condition ends with a newline character. The branching ends with EndIf. That's to make the program easier to parse, and to prevent the dangling-else problem. Between Then and EndIf, there can be an Else token. Before the Else token (or, if there is none, before the EndIf) there can be an ElseIf token. An ElseIf token, much like the If token, is followed by an expression representing a condition, which is ended by the Then token. After the Then token and after the Else token, you can put a { (curly brace), which the compiler will ignore. Similarly, you can put a } before EndIf, ElseIf and Else. That makes it easier to write AEC in text editors made primarily for C-like languages (to jump to the end of the code block, for example). An example from the Analog Clock in AEC:
        If signature[j] = '\n' Then
            i := (i / windowWidth + 1) * windowWidth;
        ElseIf not(signature[j] = 0) Then
            output[i] := signature[j];
            colors[i] :=   modraColor;
            i         :=        i + 1;
            output[i] := ' ';
        EndIf
If there is an ElseIf token inside the If-statement, there, of course, doesn't need to be an Else token:
            If j < 2 and (output[i - windowWidth] = 'x' and
                (output[i + 1] = 'x' or output[i - 1] = 'x')) Then
                output[i] :=            'x';
                colors[i] := darkGreenColor;
            ElseIf j=2 and (output[i + 1]=' ' and
                output[i - windowWidth] = 'x') Then
                output[i] := ' ';
            EndIf
Note that, unlike in Ada or VHDL, there is no semicolon after the EndIf token.
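The rule about optional curly braces can be modeled as a filter over the token stream. This Python sketch is not the compiler's parser: the function name drop_ignorable_braces is hypothetical, and it assumes that the tokens arrive as a flat list of strings.

```python
def drop_ignorable_braces(tokens):
    """Remove '{' right after Then or Else, and '}' right before
    EndIf, ElseIf or Else. All other braces are kept."""
    result = []
    for i, token in enumerate(tokens):
        previous = tokens[i - 1] if i > 0 else None
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        if token == "{" and previous in ("Then", "Else"):
            continue
        if token == "}" and following in ("EndIf", "ElseIf", "Else"):
            continue
        result.append(token)
    return result


with_braces = ["If", "a", "Then", "{", "b", ";", "}",
               "Else", "{", "c", ";", "}", "EndIf"]
without_braces = drop_ignorable_braces(with_braces)
```

Braces in other positions, such as those of an array initializer, pass through this filter untouched.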
Simple branching can also be done using the ?: operator. As it is right-associative, it can be used to concisely write ElseIf statements. An example of that from Arithmetic Operators Test (one of the first programs I wrote in AEC for WebAssembly):
Function signum(Integer32 number) Which Returns Integer32 Does
    /*
     * The ternary conditional operator "?:" is right-associative,
     * as it is in C, C++ and JavaScript (unlike in PHP), which
     * makes it easy to abbreviate else-if statements using it.
     * And, as of time of writing this, I haven't yet implemented
     * the "If" statement into the AEC-to-WebAssembly compiler.
     */
    Return (number<0)? //If the number is less than 0...
                    -1 //signum of that number is -1...
                     : //else...
                     (number=0)? //if the number is 0...
                               0 //signum of that number is 0.
                               : //else...
                               1; //The signum of that number is 1.
EndFunction
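Why the associativity matters can be seen by grouping the same expression both ways. The following is a Python sketch, not AEC: Python's conditional expression is also right-associative, and signum_left_grouped is a hypothetical function that spells out what a left-associative reading of the chain would compute.

```python
def signum(number):
    """Chained conditional, parsed from the right as in AEC."""
    return -1 if number < 0 else 0 if number == 0 else 1


def signum_right_grouped(number):
    """The same expression with the implicit grouping made explicit."""
    return -1 if number < 0 else (0 if number == 0 else 1)


def signum_left_grouped(number):
    """What a left-associative parse would mean:
    ((number<0) ? -1 : (number=0)) ? 0 : 1
    """
    return 0 if (-1 if number < 0 else number == 0) else 1
```

With the left grouping, a negative number first yields -1, which then counts as a true condition, so the function returns 0 instead of -1.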
In AEC for x86, the condition after ElseIf ends with a newline, rather than with a Then token.
TODO: Implement something like switch-case, to make it easier to write long ElseIf branchings. Most of the time, they are a sign of bad code design, but sometimes they are not, and it's not acceptable for a language to discourage them.

Loops
For now, there is only one type of loop in AEC, the While-loop. A While token is followed by an expression representing a condition. That expression ends with a newline character in AEC for x86, and with a Loop token in AEC for WebAssembly. In both of them, the While-statement ends with an EndWhile token. For example, here is what the Euclidean algorithm looks like in AEC for WebAssembly (an excerpt from Euclid Test):
    While not(b=0) Loop
        If a>b Then
            If a=0 Then
                Return b;
It uses the built-in function mod(Integer64 a, Integer64 b) to get the remainder of the division. AEC for x86 supports it both for integers and decimal numbers, whereas AEC for WebAssembly supports it only for integers (for the simple reason that WebAssembly doesn't support it for decimal numbers either). Nevertheless, there is a simple way to write it yourself in AEC for WebAssembly (excerpt from the Analog Clock):
Function fmod(Decimal32 a, Decimal32 b) Which Returns Decimal32 Does
    Return a - b * Integer32(a / b);
EndFunction
It might actually be a good idea to use Integer64( instead of Integer32(, because there is no guarantee that a / b will be in the range of Integer32. So, that's how you do casting in AEC, with the built-in functions named DataType(.
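The truncation trick behind fmod, and the range problem just mentioned, can be tried out in Python. This is only a model: Python floats are 64-bit rather than Decimal32, int() is assumed to round toward zero the way the Integer32( cast does, and fits_in_integer32 is a hypothetical helper.

```python
import math


def fmod(a, b):
    """Remainder computed as a minus b times the truncated quotient."""
    return a - b * int(a / b)


def fits_in_integer32(x):
    """True if the truncated value of x is representable as an Integer32."""
    return -2**31 <= math.trunc(x) <= 2**31 - 1


positive_case = fmod(7.5, 2.0)   # 7.5 - 2.0 * 3
negative_case = fmod(-7.5, 2.0)  # -7.5 - 2.0 * (-3)
```

For these operands the results agree with math.fmod from the Python standard library, while a quotient such as 1e10 / 1.0 no longer fits in an Integer32, which is the case where the wider cast would be needed.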
TODO: Think of some nice syntax for the for-loop, every remotely modern language has a for-loop. Also, implement something like C, C++ and JavaScript break and continue. Most of the time, they are a sign of bad code design, but sometimes they aren't, and it's not acceptable for a language to discourage them.

Functions
AEC for x86 does not support user-defined functions. Furthermore, the parser implemented there doesn't support functions with more than 2 arguments. AEC for WebAssembly supports functions, because WebAssembly makes it easy to implement them (x86 assembly doesn't have a standardized way to call functions; the way it's done varies by operating system). Importing functions from JavaScript is done like this (excerpt from Analog Clock):
//Let's import some functions useful for debugging from JavaScript...
Function logString(CharacterPointer str) Which Returns Nothing Is External;
Function logInteger(Integer32 int) Which Returns Nothing Is External;
So, a function declaration starts with a Function token. After it comes the function name (ending with an open parenthesis (). After that comes the list of arguments (it may be empty, and it often is). Each element of that list consists of an argument type and an argument name. Unlike in C, the argument name is obligatory, and the parser is going to complain if you omit it; there doesn't appear to be a simple way to change that. Arguments are separated by a comma token ,. After the argument list comes a Which token, followed by a Returns token and then the return type (which can be a data type or Nothing). If the function is being imported from JavaScript, then come the Is token, the External token and a semicolon. If the function is implemented in AEC, then comes the Does token, followed by the function body, which ends with the EndFunction token.
A function exits when the control flow reaches the EndFunction token or a Return-statement. If the function returns Nothing, a Return-statement consists only of a Return token and a semicolon. If a function returns something, there needs to be an expression between the Return token and the semicolon, the result of which the function will return to its caller. If the control flow of a function that returns something reaches the EndFunction token, the function returns 0 (in sharp contrast with both C-like languages, where such a function returns an undefined value, and Rust, where such a function returns the value of the last expression in it).
There have been a few examples of functions in this specification. Arguments to functions may have default values, specified as follows (excerpt from Empty Function Test):
Function empty_function(Character charArgument:='A',
                        Integer16 shortArgument:=4096,
                        Integer32 intArgument:=32768,
                        Integer64 longArgument:=8*exp(9*log(10)),
                        Decimal32 floatArgument:=22/7,
                        Decimal64 doubleArgument:=pi)
         Which Returns Nothing Does
    //It does nothing, but the compiler should still generate valid code.
EndFunction
Now, if you call such a function with fewer than 6 arguments, the compiler will not complain, but will supply default values for the missing ones. Note that this will not work when calling an AEC function from JavaScript. AEC supports neither function hoisting nor circularly dependent functions, for the simple reason that there is no obvious way to implement them in WebAssembly. (UPDATE: C++-style forward function declarations have been implemented, as there is a relatively simple way to do it using WebAssembly Binary Toolkit: the assembler takes care of that for you. The syntax is the same as for declaring external functions, except that you replace External with Declared.)
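The default-argument behaviour described above can be pictured as a small step performed at each call site: keep the arguments the caller wrote and append the declared defaults for the rest. The following Python sketch only illustrates that idea; the function name and the list-based representation are hypothetical and are not taken from the AEC compiler.

```python
def fill_default_arguments(supplied, defaults):
    """Return the full argument list for one function call.

    `supplied` holds the arguments written at the call site, and
    `defaults` holds one default value per declared parameter, in
    declaration order (both are hypothetical representations).
    """
    if len(supplied) > len(defaults):
        raise ValueError("too many arguments in function call")
    # Keep what the caller wrote, then append defaults for the rest.
    return list(supplied) + list(defaults[len(supplied):])


if __name__ == "__main__":
    # The first three defaults of empty_function from the excerpt above.
    declared_defaults = ["A", 4096, 32768]
    print(fill_default_arguments(["B"], declared_defaults))
```

Since the missing values are filled in by the compiler at the call site, a caller that does not go through the compiler (such as JavaScript code) gets no defaults, which matches the note above.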
TODO: Find a way to implement circularly dependent functions, and, if possible, function hoisting. Also, implement function pointers. Named function arguments (using, for example, the := operator) might also come in useful (in languages that have them, there is basically no need for abstract builders and directors).

Structures
The parser can parse structure declarations of the form:
Structure Point Consists Of
    Decimal32 x,y,z;
    Integer16 number_of_dimensions;
However, the compiler crashes if you insert them in a real program (rather than just in a parser test). Structures are supposed to be instantiated using a directive called InstantiateStructure, that is, an InstantiateStructure token followed by the structure name followed by the variable name (and the variable will be of the type represented by the structure name).
TODO: Implement structures in the compiler, not just in the parser. (UPDATE on 30/09/2020: Some progress on that has been made, namely, local and global structures are now supported as long as they are not nested. You can see an example program using them here. Structures as arguments or return types aren't supported for now, and are unlikely to be so in the near future. I've started a Reddit thread asking for help with that.
UPDATE on 16/02/2021: My implementation of the N-Queens Puzzle in AEC uses structures.
UPDATE on 28/04/2021: I have started a forum thread to help me with implementing structures as arguments to the functions.)

UPDATE on 03/06/2021: It is important to note that the assignment operator when used between structures works very differently in AEC and C++. I made it so because I found the C++ behaviour very confusing while making this compiler, as it appeared to corrupt the Abstract Syntax Tree while compiling. Excerpt from Structure Declaration Test that illustrates the AEC-specific behaviour (the spaghetti function returns 2):
Structure ListNode Consists Of {
  ListNodePointer next;
  Integer32 value;

Function spaghetti() Which Returns Integer32 Does {
  // See the link above, about how C++ behaviour appeared to corrupt the AST.
  // By common sense, this should return 2. By C++ semantics, this should
  // return 3.
  InstantiateStructure ListNode list[3];
  list[0].value : = 1;
  list[1].value : = 2;
  list[2].value : = 3;
  list[0].next : = AddressOf(list[1]);
  list[1].next : = AddressOf(list[2]);
  list[0] : = ValueAt(list[0].next);
  CharacterPointer pointer : = AddressOf(list[0]);
  Return ValueAt(ListNodePointer(pointer)).value;
The assembly code that the AEC compiler produces for structure assignments is slower than the code produced by C++ compilers, but, in my opinion, the assembly code produced by AEC behaves far more intuitively in cases like this.

Inline assembly
AEC for x86 passes everything between the AsmStart and AsmEnd tokens unchanged to the assembler. AEC for WebAssembly has the keywords asm(, asm_i32(, asm_i64(, asm_f32( and asm_f64(, which expect a compile-time constant string as an argument; the compiler processes that string (for example, it replaces \n with a literal new-line character and \\ with \...) and passes it to the assembler. asm( assumes that nothing will be left on the system stack by the inline assembly code, while the others assume that a value of a certain WebAssembly type (corresponding to some AEC type) will be left on the system stack by it, which they will then fetch. Since the WebAssembly type i32 corresponds to AEC Integer32 and to pointers, we can write something like this (excerpt from HybridSort):
//Let's now make a wrapper around the WebAssembly instruction "memory.grow"...
Function zauzmi_memorijske_stranice(Integer32 broj_stranica) Which Returns
    CharacterPointer Does
    Integer32 nova_adresa_u_stranicama :=
        asm_i32 //"asm_i32" tells the compiler to insert assembly code and to
                //assume that, after it, a value of type "i32" will be on
                //the system stack. That is obviously not true if somebody
                //switches the JavaScript Virtual Machine to 64-bit mode,
                //but I hope nobody will do that. The probability that the
                //JavaScript Virtual Machine will need more than 4GB of RAM
                //is negligible, while the probability that some useful
                //programs will crash because of the switch to 64-bit mode
                //is not exactly negligible.
                "\t(local.get 0)\n" //The first (zeroth) function argument,
    If nova_adresa_u_stranicama = -1 Then //If there is no more
                                          //free memory...
        Return -1;
    Return nova_adresa_u_stranicama * 64 * 1024; //On the JavaScript Virtual
                                                 //Machine, one page is
                                                 //64 KB.
Remember that string tokens put next to each other are concatenated by the tokenizer into one string token.
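As an illustration of those two tokenizer-level steps (joining adjacent string tokens and processing escape sequences), here is a minimal Python sketch. It is not the AEC tokenizer: the function names are hypothetical, tokens are assumed to be plain Python strings that keep their surrounding double quotes, escaped quotes inside strings are ignored for simplicity, and only the escapes for new-line, tab and backslash are handled.

```python
def is_string_token(token: str) -> bool:
    """A token counts as a string when it is wrapped in double quotes."""
    return len(token) >= 2 and token[0] == '"' and token[-1] == '"'


def merge_adjacent_strings(tokens):
    """Concatenate runs of adjacent string tokens into one string token."""
    merged = []
    for token in tokens:
        if is_string_token(token) and merged and is_string_token(merged[-1]):
            # Drop the closing quote of the previous token and the
            # opening quote of this one.
            merged[-1] = merged[-1][:-1] + token[1:]
        else:
            merged.append(token)
    return merged


def unescape(body: str) -> str:
    """Replace the backslash escapes for new-line, tab and backslash."""
    replacements = {"n": "\n", "t": "\t", "\\": "\\"}
    result = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body) and body[i + 1] in replacements:
            result.append(replacements[body[i + 1]])
            i += 2
        else:
            result.append(body[i])
            i += 1
    return "".join(result)


if __name__ == "__main__":
    tokens = ["asm_i32(", '"\\t(local.get 0)\\n"', '"\\t(nop)\\n"', ")"]
    print(merge_adjacent_strings(tokens))
```

This is why a long piece of inline assembly can be split over several lines, one string per line, with a comment after each of them.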
TODO: Implement a GCC-and-CLANG-style inline assembly, where you can access variables from assembly by their names (rather than having to manually calculate their memory addresses).

Built-in functions
AEC for x86 has many built-in functions, wrappers around x86 assembly instructions. These are: sin(, cos(, tan(, atan2(, arctan(, ctg(, arcctg(, sqrt(, arcsin(, arccos(, ln( (natural logarithm), log( (base-10 logarithm), exp( (returns the Euler number to the power of the argument), pow(, abs( and mod( (floating-point modulo). Unlike in most programming languages, trigonometric functions expect arguments in degrees (not radians, as in C, C++ or JavaScript). Similarly, the cyclometric functions such as atan2( return the result in degrees, rather than in radians. If you want to use them in AEC for WebAssembly, you can write them yourself, like this (excerpt from the Analog Clock):
Function sin(Decimal32 degrees) Which Returns Decimal32 Does
    If degrees<0 Then
        Return -sin(-degrees);
    EndIf
    If degrees>90 Then
        Decimal32 sinOfDegreesMinus90 := sin(degrees - 90);
        If fmod(degrees, 360) < 180 Then
            Return sqrt(1 - sinOfDegreesMinus90 * sinOfDegreesMinus90);
        Else
            Return -sqrt(1 - sinOfDegreesMinus90 * sinOfDegreesMinus90);
        EndIf
    EndIf
    /*
     * Sine and cosine are defined in Mathematics 2 using the system of
     * equations (Cauchy system):
     * sin(0)=0
     * cos(0)=1
     * sin'(x)=cos(x)
     * cos'(x)=-sin(x)
     * ---------------
     * Let's translate that as literally as possible to the programming
     * language.
     */
    Decimal32 radians := degrees / oneRadianInDegrees,
              tmpsin  := 0,
              tmpcos  := 1,
              epsilon := radians / PRECISION,
              i       := 0;
    While (epsilon>0 and i<radians) or (epsilon<0 and i>radians) Loop
        tmpsin := tmpsin + epsilon * tmpcos;
        tmpcos := tmpcos - epsilon * tmpsin;
        i      :=               i + epsilon;
    EndWhile
    Return tmpsin;
EndFunction
Or, you may use the Taylor Series. But I think this obeys the KISS (keep it simple, stupid) principle better. Note that you can't call the JavaScript Math.sin and similar functions, because they are methods of the Math singleton, and there is no standardized way to call methods of JavaScript objects from WebAssembly (for a good reason).
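For readers who want to experiment with the numerical idea outside AEC, here is a minimal Python sketch of the same step-by-step integration of sin'(x)=cos(x), cos'(x)=-sin(x), starting from sin(0)=0 and cos(0)=1. It is a sketch under assumptions: the parameter steps stands in for the PRECISION constant (its default of 1000 is a guess, since the real value is not shown here), the loop runs a fixed number of times instead of comparing floating-point numbers, and the range reductions for negative angles and angles above 90 degrees are left out.

```python
import math


def sin_degrees(degrees: float, steps: int = 1000) -> float:
    """Approximate the sine of an angle given in degrees."""
    radians = math.radians(degrees)
    epsilon = radians / steps  # Size of one integration step.
    tmpsin, tmpcos = 0.0, 1.0  # sin(0) = 0, cos(0) = 1
    for _ in range(steps):
        # As in the AEC excerpt, the cosine update uses the sine
        # value that has just been updated.
        tmpsin += epsilon * tmpcos
        tmpcos -= epsilon * tmpsin
    return tmpsin


if __name__ == "__main__":
    for angle in (0, 30, 45, 90):
        print(angle, sin_degrees(angle), math.sin(math.radians(angle)))
```

Updating the cosine with the already-updated sine keeps the pair (tmpsin, tmpcos) close to the unit circle, which is why a fairly small number of steps gives a usable result.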
TODO: Build some mathematical functions into AEC for WebAssembly.

String manipulation
AEC for x86 doesn't support string manipulation at all, as it doesn't have a character type. AEC for WebAssembly is about as good at string manipulation as C is without a C library (like when doing operating system development). There are no built-in string-manipulation functions, but one can easily write them oneself, like this (excerpt from Dragon Curve):
//Again, we need to implement string manipulation functions. Like I've said,
//even though this program will be running on JavaScript Virtual Machine, it
//can't call the methods of the JavaScript "String" class.
Function strlen(CharacterPointer str) Which Returns Integer32 Does
    //We can't implement this recursively, like we did in earlier AEC
    //programs, because we will be dealing with large strings which will
    //cause stack overflow.
    Integer32 length := 0;
    While ValueAt(str + length) Loop
        length := length + 1;
    EndWhile
    Return length;
EndFunction

Function strcpy(CharacterPointer dest,
                CharacterPointer src) Which Returns Nothing Does
    While ValueAt(src) Loop
        ValueAt(dest) := ValueAt(src);
        dest          :=     dest + 1;
        src           :=      src + 1;
    EndWhile
    ValueAt(dest) := 0;
EndFunction

Function reverseString(CharacterPointer string) Which Returns Nothing Does
    CharacterPointer pointerToLastCharacter := string + strlen(string) - 1;
    While pointerToLastCharacter - string > 0 Loop
        Character tmp                   := ValueAt(string);
        ValueAt(string)                 := ValueAt(pointerToLastCharacter);
        ValueAt(pointerToLastCharacter) := tmp;
        string                          := string + 1;
        pointerToLastCharacter          := pointerToLastCharacter - 1;
    EndWhile
EndFunction

Function strcat(CharacterPointer dest,
                CharacterPointer src) Which Returns Nothing Does
    strcpy(dest + strlen(dest), src);
EndFunction

Function convertIntegerToString(CharacterPointer string,
                                Integer32 number)
    Which Returns Integer32 Does //Returns the length of the string.
    Integer32 isNumberNegative := 0;
    If number < 0 Then
        number           := -number;
        isNumberNegative :=       1;
    EndIf
    Integer32 i := 0;
    While number > 9 Loop
        ValueAt(string + i) := '0' + mod(number, 10);
        number              :=           number / 10;
        i                   :=                 i + 1;
    EndWhile
    ValueAt(string + i) := '0' + number;
    i                   :=        i + 1;
    If isNumberNegative Then
        ValueAt(string + i) :=   '-';
        i                   := i + 1;
    EndIf
    ValueAt(string + i) := 0;
    //The digits were written least-significant first, so reverse them.
    reverseString(string);
    Return i;
EndFunction
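The integer-to-string conversion above produces the digits from the least significant one to the most significant one, so the characters have to be reversed before the string is usable (which is what a helper such as reverseString is for). Here is a minimal Python sketch of that technique, using a bytearray as a stand-in for the character buffer; the function name and the choice to return the buffer instead of its length are assumptions made for illustration.

```python
def convert_integer_to_string(number: int) -> bytes:
    """Return the decimal digits of `number` as a zero-terminated string."""
    buffer = bytearray()
    is_negative = number < 0
    if is_negative:
        number = -number
    # Emit the digits, least significant first.
    while number > 9:
        buffer.append(ord("0") + number % 10)
        number //= 10
    buffer.append(ord("0") + number)
    if is_negative:
        buffer.append(ord("-"))
    # The characters are in reverse order at this point.
    buffer.reverse()
    buffer.append(0)  # Terminating zero, as in C-style strings.
    return bytes(buffer)


if __name__ == "__main__":
    for value in (0, 7, 120, -120):
        print(value, convert_integer_to_string(value))
```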
To delete a character from a string, you can use the following function from the Havlik's Law program:
Function izbrisi_znak_iz_stringa(CharacterPointer mjesto_u_stringu)
         Which Returns Nothing Does
    //Shift every following character one place to the left,
    //including the terminating zero.
    While ValueAt(mjesto_u_stringu) Loop
        ValueAt(mjesto_u_stringu) := ValueAt(mjesto_u_stringu + 1);
        mjesto_u_stringu += 1;
    EndWhile
EndFunction
TODO: Build some useful string manipulation into AEC, at least ones for converting numbers to strings and vice versa.

Advanced array manipulation
When it comes to sorting arrays, I've tried to efficiently implement HybridSort (a sorting algorithm I came up with, a mixture of MergeSort, QuickSort and SelectionSort) in AEC. HybridSort is based on the fact that the number of comparisons done by MergeSort depends only on the size of the array, and is always equal to
, where n is the length of the array, while QuickSort is faster the more shuffled the array is, and is slowest for already-sorted and inverse-sorted arrays. So, sometimes QuickSort is faster, and sometimes MergeSort is faster. Using a simple genetic algorithm, I came up with a formula for approximating how many comparisons QuickSort will do on a given array. That formula is:
$$e^{(\ln n + \ln\ln n) \cdot 1.05 + (\ln n - \ln\ln n - \ln 2) \cdot 0.9163 \cdot \left|2.38854 s^7 - 0.284258 s^6 - 1.87104 s^5 + 0.372637 s^4 + 0.167242 s^3 - 0.0884977 s^2 + 0.315119 s\right|}$$
, where s is the sortedness of the array (-1 when the array is inverse-sorted, 1 when it is sorted, around 0 when it is randomly shuffled), and ln is the natural logarithm, base e = 2.718.... I am not sure how to test how correct that approximation is. My algorithm is recursive, and at every level of the recursion, it estimates whether it should behave like QuickSort or like MergeSort based on those formulas. In case the array is very small, so that SelectionSort is faster than both QuickSort and MergeSort, or in case it runs out of stack memory, it runs SelectionSort. However, for some reason, my algorithm is significantly slower than the JavaScript Array.sort method:

[Figure: Results of the measurements]
I am not sure what causes those stairs in the measurement results. Professor Alfonzo Baumgartner thinks it has to do with cache misses; here is what he wrote when I asked him about that in an e-mail:
"The explanation I offer for those 'stairs' lies in the fact that all processors use cache memory, in which they store a certain number of memory pages.
If we take one memory page to be, for example, 64K, then, as soon as our array grows by just 1 element beyond 64K, two memory pages are needed. As soon as our array spans two memory pages, there is immediately a possibility of so-called 'cache misses', when the processor does not find that page in its L1 cache, so pages have to be swapped within the cache, which slows down the execution of the program.
As you enlarge the array, more and more memory pages are used to store it, and the caching system itself then has more work to do swapping them.
That is why almost the same result is obtained for some array lengths, and then, as soon as we add just one more datum (which goes into a new memory page), we get a 'drastically' increased measurement result.
I don't know whether you understood me, and I am not 100% sure that this is the cause of those 'stairs' in your case, but at first glance it seems to me to be something like that."
In any case, I think this is an interesting result worthy of further exploration. Which sorting algorithms have those stairs when measured? Why do some do, and some (like JavaScript Array.sort) do not, on the same array?
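To make the QuickSort comparison estimate from earlier in this section easier to play with, here is a minimal Python sketch that evaluates it. It assumes the formula is read as e raised to the whole expression that follows it, with the numbers after s being powers of s. The function name is hypothetical, and the sortedness s is taken as an input, because how it is measured is not described here.

```python
import math


def estimated_quicksort_comparisons(n: int, s: float) -> float:
    """Estimate QuickSort comparisons for length n and sortedness s."""
    if n < 2:
        raise ValueError("n must be at least 2, so that ln(ln(n)) exists")
    polynomial = (2.38854 * s**7 - 0.284258 * s**6 - 1.87104 * s**5
                  + 0.372637 * s**4 + 0.167242 * s**3
                  - 0.0884977 * s**2 + 0.315119 * s)
    ln_n = math.log(n)
    ln_ln_n = math.log(ln_n)
    exponent = ((ln_n + ln_ln_n) * 1.05
                + (ln_n - ln_ln_n - math.log(2)) * 0.9163 * abs(polynomial))
    return math.exp(exponent)


if __name__ == "__main__":
    for sortedness in (-1.0, 0.0, 1.0):
        print(sortedness, estimated_quicksort_comparisons(1000, sortedness))
```

Under that reading, a randomly shuffled array (s = 0) gives (n · ln(n)) raised to the power of 1.05, and the estimate grows as s approaches 1 or -1, in line with QuickSort being slowest on sorted and inverse-sorted arrays.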

Conclusion
I think AEC is a promising project, but a lot of work is still needed to make it successful, and I don't think I can do everything that's needed by myself. (UPDATE on 06/06/2021: It would, for example, be useful to make a web-based IDE for the AEC-to-WebAssembly compiler, so that somebody can try my programming language directly in the browser. I have opened a Quora question asking for advice about how to do that. We would need to get the AEC-to-WebAssembly compiler, which can already run in NodeJS if compiled with EMSCRIPTEN, to run in a browser, and, to be honest, I do not know enough WebAssembly to do that by myself. Actually, I think almost no web developer these days has the knowledge needed to make that. We would also need to get wat2wasm from WebAssembly Binary Toolkit to run in that web-app, as my compiler relies on it to convert the WebAssembly assembly language it outputs to the bytecode that browsers understand. Somebody has already made wat2wasm run in modern browsers, but they apparently left no instructions on how they managed to do that.)

UPDATE on 16/10/2020: I've published a YouTube video about programming in your programming languages for the client-side web. If you have trouble playing it, you can download the minified MP4 and try opening it in VLC or a similar program. If nothing else works, try opening the ZIP file with a PDF, an ODP and a PPT file.