Diffstat (limited to 'doc')
-rw-r--r-- doc/Documentation.md 94
-rw-r--r-- doc/IMPLEMENTATION.md 21
2 files changed, 63 insertions, 52 deletions
diff --git a/doc/Documentation.md b/doc/Documentation.md
index adb0bef..ca579a2 100644
--- a/doc/Documentation.md
+++ b/doc/Documentation.md
@@ -16,11 +16,14 @@ Instructions can be in 7 different formats:
 6.2 Opcode(u8) + register(i8) + intermediate(u16)\
 6.3 Opcode(u8) + register(i8) + data(u16)\
 6.4 Opcode(u8) + flags(u8) + num_local_var_reg(u16)
-7. 5 bytes: Opcode(u8) + index(u16) + num_args(u8) + register(i8)
+7. 5 bytes: Opcode(u8) + index(u8) + index(u16) + num_args(u8)
 
 ## Registers
 Registers have a range of 128. Local variables start from register 0 and increment while parameters start from -1 and decrement. Registers have the scope of functions and reset after instructions reach a new function (AMAL_OP_FUNC_START).
 
+If import index for call and calle is 0, then that means the function resides in the same file the function call
+is being called from. Which means that import index 1 is actually import index 0 into the import list.
+
 # Compiler flow
 (Tokenize&parse -> Resolve AST -> Generate SSA -> Generate bytecode) -> Generate program\
 Each step except the last is done using multiple threads in parallel and the output of each step is used
@@ -29,7 +32,9 @@ and writing it to a file, which is an IO bottlenecked operation and it won't ben
 and may even lose performance because of it.
 
 # Bytecode
-The layout of the full bytecode is: Header (Intermediates Strings Functions External_Functions Exported_Functions Instructions)*
+The layout of the full bytecode is: Header (X Intermediates X Strings X Functions X External_Functions X Exported_Functions X Imports X Instructions)*\
+Where the X is a magic number to make it easier to find errors while decoding the bytecode.\
+The value of the magic number is @AMAL_BYTECODE_SECTION_MAGIC_NUMBER
 
 # Bytecode header
 ## Header layout
@@ -44,10 +49,10 @@ The versions in the header only changes for every release, not every change.
 
 # Bytecode intermediates
 ## Intermediates layout
-|Type |Field |Description |
-|------------|------------------|-------------------------------------------------------------------------------|
-|u32 |Intermediates size|The size of the intermediates section, in bytes. |
-|Intermediate|Intermediate data |Multiple intermediates, where the total size is defined by @Intermediates size.|
+|Type |Field |Description |
+|--------------|------------------|-------------------------------------------------------------------------------|
+|u32 |Intermediates size|The size of all intermediates, in bytes. |
+|Intermediate[]|Intermediate data |Multiple intermediates, where the total size is defined by @Intermediates size.|
 
 ## Intermediate
 |Type|Field|Description |
@@ -57,11 +62,11 @@ The versions in the header only changes for every release, not every change.
 
 # Bytecode strings
 ## Strings layout
-|Type |Field |Description |
-|------|-----------------|------------------------------------------------------------------|
-|u16 |Number of strings|The number of strings. |
-|u32 |Strings size |The size of the strings section, in bytes. |
-|String|Strings data |Multiple strings, where the total size is defined by @Strings size|
+|Type |Field |Description |
+|--------|-----------------|------------------------------------------------------------------|
+|u16 |Number of strings|The number of strings. |
+|u32 |Strings size |The size of all strings, in bytes. |
+|String[]|Strings data |Multiple strings, where the total size is defined by @Strings size|
 
 ## String
 |Type|Field|Description |
@@ -70,33 +75,46 @@ The versions in the header only changes for every release, not every change.
 |u8* |Data|The data of the string, where the size is defined by @Size. Strings are null-terminated.|
 
 # Bytecode functions
-## Internal functions layout
-|Type|Field |Description |
-|----|-------------------|---------------------------------|
-|u16 |Number of functions|The number of internal functions.|
+## Functions layout
+|Type |Field |Description |
+|----------|----------|--------------------------------------------------------------------------------------|
+|u16 |num_funcs |The number of non-extern functions. |
+|u32 |funcs_size|The size of all functions, in bytes. |
+|Function[]|Functions |Multiple non-extern functions, where the number of functions is defined by @num_funcs.|
+
+## Function
+|Type|Field |Description |
+|----|-------------------------|------------------------------------------------------------------------------------------------------------------------|
+|u32 |func_offset |The offset in the program code (machine code) where the function starts. Is always 0 until the program has been started.|
+|u8 |num_params |The number of parameters. |
+|u32 |params_num_pointers |The number of pointers in the parameters. |
+|u32 |params_fixed_size |The size of all non-pointer type parameters, in bytes. |
+|u8 |num_return_types |The number of return values. |
+|u32 |return_types_num_pointers|The number of pointers in the return types. |
+|u32 |return_types_fixed_size |The size of all non-pointer type return types, in bytes. |
 
 # Bytecode external functions
 ## External functions layout
-|Type |Field |Description |
-|-----------------|------------------|-----------------------------------------------------------------------------------------|
-|u16 |num_extern_func |The number of external functions. |
-|u32 |extern_funcs_size |The size of the external functions section, in bytes. |
-|External function|External functions|Multiple external functions, where the number of functions is defined by @num_extern_func|
+|Type |Field |Description |
+|-------------------|------------------|-----------------------------------------------------------------------------------------|
+|u16 |num_extern_func |The number of external functions. |
+|u32 |extern_funcs_size |The size of all external functions, in bytes. |
+|External function[]|External functions|Multiple external functions, where the number of functions is defined by @num_extern_func|
 
 ## External function
-|Type|Field |Description |
-|----|--------|-----------------------------------------------------------------------------------------------------|
-|u8 |num_args|The number of arguments the functions has. |
-|u8 |name_len|The length of the external function name, in bytes. Excluding the null-terminate character. |
-|u8* |name |The name of the external function, where the size is defined by @name_len. Names are null-terminated.|
+|Type|Field |Description |
+|----|----------|-----------------------------------------------------------------------------------------------------|
+|u8 |num_params|The number of parameters the functions has. |
+|u8 |name_len |The length of the external function name, in bytes. Excluding the null-terminate character. |
+|u8[]|name |The name of the external function, where the size is defined by @name_len. Names are null-terminated.|
 
 # Bytecode exported functions
 ## Exported functions layout
-|Type |Field |Description |
-|-----------------|------------------|-----------------------------------------------------------------------------------------|
-|u16 |num_export_func |The number of exported functions. |
-|u32 |export_funcs_size |The size of the exported functions section, in bytes. |
-|Exported function|Exported functions|Multiple exported functions, where the number of functions is defined by @num_export_func|
+|Type |Field |Description |
+|-------------------|------------------|-----------------------------------------------------------------------------------------|
+|u16 |num_export_func |The number of exported functions. |
+|u32 |export_funcs_size |The size of all exported functions, in bytes. |
+|Exported function[]|Exported functions|Multiple exported functions, where the number of functions is defined by @num_export_func|
 
 ## Exported function
 |Type|Field |Description |
@@ -104,7 +122,21 @@ The versions in the header only changes for every release, not every change.
 |u32 |instruction_offset|The offset in the instruction data where the exported function is defined. Is always 0 until the program has been started.|
 |u8 |num_args |The number of arguments the functions has. |
 |u8 |name_len |The length of the exported function name, in bytes. Excluding the null-terminate character. |
-|u8* |name |The name of the exported function, where the size is defined by @name_len. Names are null-terminated. |
+|u8[]|name |The name of the exported function, where the size is defined by @name_len. Names are null-terminated. |
+
+# Bytecode imports
+## Imports layout
+|Type |Field |Description |
+|--------|------------|-------------------------------------------------------------------------|
+|u8 |num_imports |The number of imports. |
+|u32 |imports_size|The size of all imports, in bytes. |
+|Import[]|Import |Multiple imports, where the number of imports is defined by @num_imports.|
+
+## Import
+|Type|Field |Description |
+|----|---------------------|----------------------------------------------------------------------------------------|
+|u32 |function_index |The index in the bytecode where function header begins for the imported file. |
+|u32 |extern_function_index|The index in the bytecode where the extern function header begins for the imported file.|
 
 # Bytecode instructions
 ## Instructions layout
diff --git a/doc/IMPLEMENTATION.md b/doc/IMPLEMENTATION.md
deleted file mode 100644
index 387c6eb..0000000
--- a/doc/IMPLEMENTATION.md
+++ /dev/null
@@ -1,21 +0,0 @@
-# Goal
-1. In the first stage the parser parses multiple files at the same time using multiple threads.
-The tokenization should be done without storing the tokens in a list (streaming) but AST needs to be stored in a list
-because the compiler needs to support out of order declarations.
-2. In the second stage the ast is handled using multiple threads. In this stage, variables, parameters
-and types are defined and resolved and if a type is defined after there is a reference to it,
-then the compiler first resolves that type. There are flags set to make sure there aren't recursive dependencies.
-3. In the third stage the resolved ast is used to create SSA form (static single assignment form). If optimization is
-enabled then then some inlining for ast is done by copying ast from functions to the places they are called from
-before the SSA is created.
-4. In the fourth stage the SSA form is used to create the bytecode. If optimization is enabled then the SSA form
-is optimized before creating the bytecode.
-5. If optimization is enabled then the bytecode is optimized.
-
-# Progress
-1. Parsing using multiple threads is done, but the parser is not finished.
-2. Resolving ast using multiple threads is done, but the ast resolver is not finished.
-3. Generating ssa using multiple threads is done, but the ssa generator is not finished.
-4. Generating bytecode using multiple threads is done, but the bytecode generator is not finished.
-Currently it generates C code.
-5. Not started.
\ No newline at end of file
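
Note on the new "Functions layout" and "Function" tables: the following is a minimal Python sketch of how a decoder might read one Functions section as the updated Documentation.md describes it (a section magic number, then num_funcs, funcs_size and the Function records). Several details are assumptions made only for illustration, because the diff does not state them: the magic number is treated as a u32 with a placeholder value (the real value is whatever AMAL_BYTECODE_SECTION_MAGIC_NUMBER is defined as), all fields are taken to be little-endian and packed without padding, and the names `decode_functions_section` and `encode_functions_section` are hypothetical helpers, not part of the project.

```python
import struct

# Placeholder only: the diff names AMAL_BYTECODE_SECTION_MAGIC_NUMBER but
# states neither its width nor its value. A u32 is assumed here.
SECTION_MAGIC = 0xC0DEC0DE

# Assumed little-endian and packed with no padding ("<").
SECTION_HEADER = struct.Struct("<IHI")   # magic, num_funcs (u16), funcs_size (u32)
FUNCTION = struct.Struct("<IBIIBII")     # field order follows the "Function" table

FIELDS = (
    "func_offset",
    "num_params",
    "params_num_pointers",
    "params_fixed_size",
    "num_return_types",
    "return_types_num_pointers",
    "return_types_fixed_size",
)


def decode_functions_section(buf, offset=0):
    """Decode one Functions section; return (functions, offset after the section)."""
    magic, num_funcs, funcs_size = SECTION_HEADER.unpack_from(buf, offset)
    if magic != SECTION_MAGIC:
        raise ValueError(f"bad section magic at offset {offset}: {magic:#x}")
    if funcs_size != num_funcs * FUNCTION.size:
        raise ValueError("funcs_size does not match num_funcs")
    offset += SECTION_HEADER.size
    functions = []
    for _ in range(num_funcs):
        values = FUNCTION.unpack_from(buf, offset)
        functions.append(dict(zip(FIELDS, values)))
        offset += FUNCTION.size
    return functions, offset


def encode_functions_section(functions):
    """Inverse of decode_functions_section, used here to build test input."""
    body = b"".join(FUNCTION.pack(*(f[name] for name in FIELDS)) for f in functions)
    return SECTION_HEADER.pack(SECTION_MAGIC, len(functions), len(body)) + body
```

Checking the magic number before reading the counts is what lets a decoder report a misaligned or truncated section early, which is the purpose the new "# Bytecode" text gives for the magic number.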