# Instructions
Variable length instructions. Instruction size ranges from 1 to 4 bytes.

## Instruction formats
Instructions can be in the following formats:
1. 1 byte: Opcode(u8)
2. 2 bytes: Opcode(u8) + register(AmalReg)
3. 3 bytes: Opcode(u8) + register(AmalReg) + register(AmalReg)
4. 3 bytes:\
4.1 Opcode(u8) + intermediate(u16)\
4.2 Opcode(u8) + data(u16)\
4.3 Opcode(u8) + label(i16)\
4.4 Opcode(u8) + register(AmalReg) + num_args(u8)
5. 4 bytes: Opcode(u8) + register(AmalReg) + register(AmalReg) + register(AmalReg)
6. 4 bytes:\
6.1 Opcode(u8) + register(AmalReg) + label(i16)\
6.2 Opcode(u8) + register(AmalReg) + intermediate(u16)\
6.3 Opcode(u8) + register(AmalReg) + data(u16)\
6.4 Opcode(u8) + register(AmalReg) + index(u16)\
6.5 Opcode(u8) + flags(u8) + num_local_var_reg(u16)\
6.6 Opcode(u8) + index(u8) + index(u16)

## Registers
Registers have a range of 128. Parameters have the most significant bit set, while local variables don't.
Registers are scoped to functions and reset when the instructions reach a new function (AMAL_OP_FUNC_START).

If the import index for call and callee is 0, the function resides in the same file as the function call.
This means that import index 1 refers to index 0 in the import list.

@AmalReg is an alias for u8.

# Compiler flow
(Tokenize&parse -> Resolve AST -> Generate IR -> Generate bytecode) -> Generate program\
Every step except the last runs on multiple threads in parallel, and the output of each step is the input of the next.
The last step is not parallelized because it combines all bytecode and writes it to a file. That operation is bottlenecked by IO,
so it would not benefit from multithreading and might even lose performance because of it.
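
To make the register encoding from the Registers section concrete, here is a minimal sketch in C. It assumes that the low 7 bits of an `AmalReg` hold the register index (which matches the stated range of 128) and that the most significant bit marks a parameter. The constant and helper names (`REG_PARAM_BIT`, `REG_INDEX_MASK`, `reg_make_param`, and so on) are hypothetical and are not taken from the amalgam source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* AmalReg is an alias for u8, as stated in the Registers section. */
typedef uint8_t AmalReg;

/* Hypothetical constants, for illustration only. */
#define REG_PARAM_BIT  0x80u /* most significant bit: set for parameters */
#define REG_INDEX_MASK 0x7Fu /* low 7 bits: assumed to hold the index (0..127) */

/* Build a register byte that refers to a local variable. */
static AmalReg reg_make_local(uint8_t index) {
    assert(index <= REG_INDEX_MASK);
    return (AmalReg)index;
}

/* Build a register byte that refers to a parameter. */
static AmalReg reg_make_param(uint8_t index) {
    assert(index <= REG_INDEX_MASK);
    return (AmalReg)(REG_PARAM_BIT | index);
}

/* True if the register byte has the parameter bit set. */
static bool reg_is_param(AmalReg reg) {
    return (reg & REG_PARAM_BIT) != 0;
}

/* Extract the index, ignoring the parameter bit. */
static uint8_t reg_index(AmalReg reg) {
    return (uint8_t)(reg & REG_INDEX_MASK);
}
```

Under these assumptions, a decoder can tell parameters from local variables with a single bit test and does not need a separate operand-kind byte.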
# Bytecode
The layout of the full bytecode is: Header (X Intermediates X Strings X Functions X External_Functions X Exported_Functions X Imports X Instructions)*\
Where the X is a magic number to make it easier to find errors while decoding the bytecode.\
The value of the magic number is @AMAL_BYTECODE_SECTION_MAGIC_NUMBER

# Bytecode header
## Header layout
|Type|Field|Description|
|----|-----|-----------|
|u32|Magic number|The magic number used to identify an amalgam bytecode file.|
|u8|Major version|The major version of the bytecode. Updates to this are breaking changes.|
|u8|Minor version|The minor version of the bytecode. Updates to this are backwards compatible.|
|u8|Patch version|The patch version of the bytecode. Updates to this are only minor bug fixes.|
|u8|Endian|Endianness of the program. 0 = little endian, 1 = big endian.|

The versions in the header only change with each release, not with every change.

# Bytecode intermediates
## Intermediates layout
|Type|Field|Description|
|----|-----|-----------|
|u32|Intermediates size|The size of all intermediates, in bytes.|
|Intermediate[]|Intermediate data|Multiple intermediates, where the total size is defined by @Intermediates size.|

## Intermediate
|Type|Field|Description|
|----|-----|-----------|
|u8|Type|The type of the number. 0=integer, 1=float.|
|u64|Value|The type of the value depends on the value of @Type.|

# Bytecode strings
## Strings layout
|Type|Field|Description|
|----|-----|-----------|
|u16|Number of strings|The number of strings.|
|u32|Strings size|The size of all strings, in bytes.|
|String[]|Strings data|Multiple strings, where the total size is defined by @Strings size.|

## String
|Type|Field|Description|
|----|-----|-----------|
|u16|Size|The size of the string, in bytes, excluding the null terminator.|
|u8*|Data|The data of the string, where the size is defined by @Size. Strings are null-terminated.|

# Bytecode functions
## Functions layout
|Type|Field|Description|
|----|-----|-----------|
|u16|num_funcs|The number of non-extern functions.|
|u32|funcs_size|The size of all functions, in bytes.|
|Function[]|Functions|Multiple non-extern functions, where the number of functions is defined by @num_funcs.|

## Function
|Type|Field|Description|
|----|-----|-----------|
|u32|func_offset|The offset in the program code (machine code) where the function starts. Is always ~0 until the program has been started.|
|u8|num_params|The number of parameters.|
|u32|params_num_pointers|The number of pointers in the parameters.|
|u32|params_fixed_size|The size of all non-pointer type parameters, in bytes.|
|u8|num_return_types|The number of return values.|
|u32|return_types_num_pointers|The number of pointers in the return types.|
|u32|return_types_fixed_size|The size of all non-pointer type return types, in bytes.|

# Bytecode external functions
## External functions layout
|Type|Field|Description|
|----|-----|-----------|
|u16|num_extern_func|The number of external functions.|
|u32|extern_funcs_size|The size of all external functions, in bytes.|
|External function[]|External functions|Multiple external functions, where the number of functions is defined by @num_extern_func.|

## External function
|Type|Field|Description|
|----|-----|-----------|
|u8|num_params|The number of parameters.|
|u8|num_return_types|The number of return values.|
|u8|name_len|The length of the external function name, in bytes, excluding the null terminator.|
|u8|flags|The flags for the external function. The values are defined in @amal_func_flag.|
|u32|params_num_pointers|The number of pointers in the parameters.|
|u32|params_fixed_size|The size of all non-pointer type parameters, in bytes.|
|u32|return_types_num_pointers|The number of pointers in the return types.|
|u32|return_types_fixed_size|The size of all non-pointer type return types, in bytes.|
|u8[]|name|The name of the external function, where the size is defined by @name_len. Names are null-terminated.|

# Bytecode exported functions
## Exported functions layout
|Type|Field|Description|
|----|-----|-----------|
|u16|num_export_func|The number of exported functions.|
|u32|export_funcs_size|The size of all exported functions, in bytes.|
|Exported function[]|Exported functions|Multiple exported functions, where the number of functions is defined by @num_export_func.|

## Exported function
|Type|Field|Description|
|----|-----|-----------|
|u32|instruction_offset|The offset in the instruction data where the exported function is defined. Is always ~0 until the program has been started.|
|u8|num_args|The number of arguments the function has.|
|u8|name_len|The length of the exported function name, in bytes, excluding the null terminator.|
|u8[]|name|The name of the exported function, where the size is defined by @name_len. Names are null-terminated.|

# Bytecode imports
## Imports layout
|Type|Field|Description|
|----|-----|-----------|
|u8|num_imports|The number of imports.|
|u32|imports_size|The size of all imports, in bytes.|
|Import[]|Import|Multiple imports, where the number of imports is defined by @num_imports.|

## Import
|Type|Field|Description|
|----|-----|-----------|
|u32|function_index|The index in the bytecode where the function header begins for the imported file.|
|u32|extern_function_index|The index in the bytecode where the extern function header begins for the imported file.|

# Bytecode instructions
## Instructions layout
|Type|Field|Description|
|----|-----|-----------|
|u32|Instructions size|The size of the instructions section, in bytes.|
|Instruction[]|Instructions data|The instructions data. Each instruction begins with an opcode, see #Opcode|

# Execution backend
Amalgam supports multiple execution backends, and they can be implemented with minimal effort.
The only requirement is implementing all the functions in executor/executor.h
and adding the source file with the implementation to the build script. See executor/interpreter/executor.c for an example.\
These functions are then called by amalgam as it parses the amalgam bytecode when `amal_program_run` is called.
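
As a worked illustration of the header layout described under "Bytecode header", the following C sketch decodes the 8 header bytes from a buffer. It assumes the fields are packed in table order with no padding, and that the magic number is stored in the byte order named by the endian field. The document states neither of these explicitly. The `BytecodeHeader` struct, `parse_header`, and the error codes are hypothetical names and are not part of amalgam's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed packed size: u32 magic + 4 * u8 (major, minor, patch, endian). */
#define HEADER_SIZE 8u

/* Hypothetical in-memory form of the header table. */
typedef struct {
    uint32_t magic;
    uint8_t major;
    uint8_t minor;
    uint8_t patch;
    uint8_t endian; /* 0 = little endian, 1 = big endian */
} BytecodeHeader;

/* Read a u32 stored least significant byte first. */
static uint32_t read_u32_le(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a u32 stored most significant byte first. */
static uint32_t read_u32_be(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Decode the header from data.
 * Returns 0 on success, -1 if the buffer is too small,
 * -2 if the endian field is neither 0 nor 1.
 */
static int parse_header(const uint8_t *data, size_t size, BytecodeHeader *out) {
    assert(data != NULL && out != NULL);
    if (size < HEADER_SIZE)
        return -1;

    /* The endian byte is read first so the magic can be decoded correctly. */
    uint8_t endian = data[7];
    if (endian > 1)
        return -2;

    out->magic = (endian == 0) ? read_u32_le(data) : read_u32_be(data);
    out->major = data[4];
    out->minor = data[5];
    out->patch = data[6];
    out->endian = endian;
    return 0;
}
```

A caller would then compare `magic` against the real magic number and reject files whose major version differs from the one it supports, since major updates are breaking changes.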