# Opcode
Instructions are variable length. Each one starts with a 1-byte opcode, and the full instruction is 1 to 5 bytes long.
## Instruction formats
Instructions can be in 7 different formats:
1. 1 byte: Opcode(u8)
2. 2 bytes: Opcode(u8) + register(i8)
3. 3 bytes: Opcode(u8) + register(i8) + register(i8)
4. 3 bytes:\
4.1 Opcode(u8) + intermediate(u16)\
4.2 Opcode(u8) + data(u16)\
4.3 Opcode(u8) + offset(i16)\
4.4 Opcode(u8) + register(i8) + num_args(u8)
5. 4 bytes: Opcode(u8) + register(i8) + register(i8) + register(i8)
6. 4 bytes:\
6.1 Opcode(u8) + register(i8) + offset(i16)\
6.2 Opcode(u8) + register(i8) + intermediate(u16)\
6.3 Opcode(u8) + register(i8) + data(u16)\
6.4 Opcode(u8) + num_param_reg(u8) + num_local_var_reg(u16)
7. 5 bytes: Opcode(u8) + index(u16) + num_args(u8) + register(i8)
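
As an illustration of how the formats above map to bytes, the following is a minimal sketch that decodes format 6.1 (Opcode(u8) + register(i8) + offset(i16)). The byte order of multi-byte operands is not stated here, so little-endian is an assumption, and the names `InsRegOffset`, `read_u16_le`, `read_i16_le` and `decode_reg_offset` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read an unsigned 16-bit operand. Little-endian byte order is an assumption. */
static uint16_t read_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a signed 16-bit operand, for example a jump offset. */
static int16_t read_i16_le(const uint8_t *p) {
    return (int16_t)read_u16_le(p);
}

/* Hypothetical decoded form of format 6.1. */
typedef struct {
    uint8_t opcode;
    int8_t  reg;
    int16_t offset;
} InsRegOffset;

/*
 * Decode one format 6.1 instruction from @code.
 * Returns the number of bytes consumed (4), or 0 if the buffer is too short.
 */
static size_t decode_reg_offset(const uint8_t *code, size_t size, InsRegOffset *out) {
    if (size < 4)
        return 0;
    out->opcode = code[0];
    out->reg    = (int8_t)code[1];
    out->offset = read_i16_le(code + 2);
    return 4;
}
```

The other formats would follow the same pattern, differing only in which operands are read after the opcode byte.
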
## Registers
Registers are encoded as a signed byte (i8) and have a range of 128. Local variables start at register 0 and
increment, while parameters start at register -1 and decrement.
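
The numbering scheme can be sketched as a pair of helpers that classify a register and turn it into a zero-based index. The zero-based indices and the function names are assumptions made for illustration, not part of the bytecode format.

```c
#include <stdbool.h>
#include <stdint.h>

/* Negative register numbers refer to parameters, non-negative ones to local variables. */
static bool reg_is_parameter(int8_t reg) {
    return reg < 0;
}

/* Register -1 maps to parameter index 0, register -2 to index 1, and so on. */
static int reg_to_param_index(int8_t reg) {
    return -(int)reg - 1;
}

/* Register 0 maps to local variable index 0, register 1 to index 1, and so on. */
static int reg_to_local_index(int8_t reg) {
    return (int)reg;
}
```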

# Compiler flow
(Tokenize&parse -> Resolve AST -> Generate SSA -> Generate bytecode) -> Generate program\
Each step except the last runs on multiple threads in parallel, and the output of each step is the input
to the next. The last step is not parallelized because it combines all the bytecode and writes it to a
file. That operation is bottlenecked by IO, so it would not benefit from multithreading and could even
become slower.

# Bytecode
The layout of the full bytecode is: Header (Intermediates Strings Functions External_Functions Instructions)*

# Bytecode header
## Header layout
|Type|Field        |Description                                                                 |
|----|-------------|----------------------------------------------------------------------------|
|u32 |Magic number |The magic number used to identify an amalgam bytecode file.                 |
|u8  |Major version|The major version of the bytecode. Updates to this are breaking changes.    |
|u8  |Minor version|The minor version of the bytecode. Updates to this are backwards compatible.|
|u8  |Patch version|The patch version of the bytecode. Updates to this are only minor bug fixes.|

The versions in the header only change with each release, not with every change.
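
A reader for this header might look like the sketch below. It assumes the fields are packed in table order with no padding (7 bytes) and that the magic number is little-endian; neither is stated here. The actual magic value is not given in this document, so the caller passes it in. The compatibility rule (same major version, file minor version not newer than the reader's) is one reasonable reading of the table, not a confirmed policy, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 7 /* u32 + u8 + u8 + u8, assuming no padding */

typedef struct {
    uint32_t magic;
    uint8_t  major;
    uint8_t  minor;
    uint8_t  patch;
} BytecodeHeader;

/* Read the header from @data. Returns 0 on success, -1 if the buffer is too small. */
static int read_header(const uint8_t *data, size_t size, BytecodeHeader *out) {
    if (size < HEADER_SIZE)
        return -1;
    /* Little-endian magic number: an assumption. */
    out->magic = (uint32_t)data[0]
               | ((uint32_t)data[1] << 8)
               | ((uint32_t)data[2] << 16)
               | ((uint32_t)data[3] << 24);
    out->major = data[4];
    out->minor = data[5];
    out->patch = data[6];
    return 0;
}

/*
 * Assumed compatibility rule: the magic number and major version must match,
 * and the file must not use a newer minor version than the reader supports.
 * The patch version is ignored.
 */
static int header_is_compatible(const BytecodeHeader *h, uint32_t expected_magic,
                                uint8_t reader_major, uint8_t reader_minor) {
    return h->magic == expected_magic
        && h->major == reader_major
        && h->minor <= reader_minor;
}
```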

# Bytecode intermediates
## Intermediates layout
|Type        |Field             |Description                                                                    |
|------------|------------------|-------------------------------------------------------------------------------|
|u32         |Intermediates size|The size of the intermediates section, in bytes.                               |
|Intermediate|Intermediate data |Multiple intermediates, where the total size is defined by @Intermediates size.|

## Intermediate
|Type|Field|Description                                         |
|----|-----|----------------------------------------------------|
|u8  |Type |The type of the number. 0=integer, 1=float.         |
|u64 |Value|The type of the value depends on the value of @Type.|
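
The sketch below shows one way to interpret a single intermediate based on its type tag. It assumes each entry is 9 bytes with no padding, that the value is stored in host byte order, that integers are signed 64-bit, and that floats are 64-bit IEEE 754 doubles. None of these details are confirmed here, and the names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INTERMEDIATE_SIZE 9 /* u8 type + u64 value, assuming no padding */

enum { INTERMEDIATE_INTEGER = 0, INTERMEDIATE_FLOAT = 1 };

typedef struct {
    uint8_t type;
    union {
        int64_t integer;  /* signedness is an assumption */
        double  floating; /* 64-bit double is an assumption */
    } value;
} Intermediate;

/* Read one intermediate. Returns 0 on success, -1 on a short buffer or unknown type. */
static int read_intermediate(const uint8_t *data, size_t size, Intermediate *out) {
    uint64_t raw;
    if (size < INTERMEDIATE_SIZE)
        return -1;
    out->type = data[0];
    /* Host byte order is an assumption; memcpy avoids unaligned access. */
    memcpy(&raw, data + 1, sizeof(raw));
    switch (out->type) {
        case INTERMEDIATE_INTEGER:
            memcpy(&out->value.integer, &raw, sizeof(raw));
            return 0;
        case INTERMEDIATE_FLOAT:
            memcpy(&out->value.floating, &raw, sizeof(raw));
            return 0;
        default:
            return -1;
    }
}
```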

# Bytecode strings
## Strings layout
|Type  |Field            |Description                                                       |
|------|-----------------|------------------------------------------------------------------|
|u16   |Number of strings|The number of strings.                                            |
|u32   |Strings size     |The size of the strings section, in bytes.                        |
|String|Strings data     |Multiple strings, where the total size is defined by @Strings size|

## String
|Type|Field|Description                                                                            |
|----|----|----------------------------------------------------------------------------------------|
|u16 |Size|The size of the string in bytes, excluding the null terminator.                         |
|u8* |Data|The data of the string, where the size is defined by @Size. Strings are null-terminated.|
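
Walking the strings data could look like the following sketch. It assumes the null terminator is stored in the file directly after the string bytes, so each entry occupies 2 + Size + 1 bytes, and that @Size is little-endian. `StringView`, `read_string` and `count_strings` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Non-owning view of one string inside the bytecode buffer. */
typedef struct {
    uint16_t    size; /* length in bytes, excluding the null terminator */
    const char *data;
} StringView;

/*
 * Read one string entry. Returns the number of bytes consumed,
 * or 0 if the entry is truncated or not null-terminated.
 */
static size_t read_string(const uint8_t *data, size_t remaining, StringView *out) {
    size_t total;
    if (remaining < 2)
        return 0;
    /* Little-endian size: an assumption. */
    out->size = (uint16_t)(data[0] | (data[1] << 8));
    total = 2 + (size_t)out->size + 1;
    if (remaining < total || data[total - 1] != '\0')
        return 0;
    out->data = (const char *)(data + 2);
    return total;
}

/* Count the entries in a strings data block of @strings_size bytes. Returns -1 if malformed. */
static int count_strings(const uint8_t *data, size_t strings_size) {
    size_t offset = 0;
    int count = 0;
    StringView s;
    while (offset < strings_size) {
        size_t consumed = read_string(data + offset, strings_size - offset, &s);
        if (consumed == 0)
            return -1;
        offset += consumed;
        ++count;
    }
    return count;
}
```

A loader could compare the result of `count_strings` with the "Number of strings" field as a consistency check.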

# Bytecode functions
## Internal functions layout
|Type|Field              |Description                      |
|----|-------------------|---------------------------------|
|u16 |Number of functions|The number of internal functions.|

# Bytecode external functions
## External functions layout
|Type             |Field             |Description                                                                              |
|-----------------|------------------|-----------------------------------------------------------------------------------------|
|u16              |num_extern_func   |The number of external functions.                                                        |
|u32              |extern_funcs_size |The size of the external functions section, in bytes.                                    |
|External function|External functions|Multiple external functions, where the number of functions is defined by @num_extern_func|

## External function
|Type|Field   |Description                                                                                          |
|----|--------|-----------------------------------------------------------------------------------------------------|
|u8  |num_args|The number of arguments the function has.                                                            |
|u8  |name_len|The length of the external function name in bytes, excluding the null terminator.                    |
|u8* |name    |The name of the external function, where the size is defined by @name_len. Names are null-terminated.|
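
A single external function entry can be read with a sketch like this one. As with strings, it assumes the null terminator is stored after the name, so an entry occupies 2 + name_len + 1 bytes. `ExternFuncView` and `read_extern_func` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Non-owning view of one external function entry inside the bytecode buffer. */
typedef struct {
    uint8_t     num_args;
    uint8_t     name_len; /* excludes the null terminator */
    const char *name;
} ExternFuncView;

/*
 * Read one external function entry. Returns the number of bytes consumed,
 * or 0 if the entry is truncated or the name is not null-terminated.
 */
static size_t read_extern_func(const uint8_t *data, size_t remaining, ExternFuncView *out) {
    size_t total;
    if (remaining < 2)
        return 0;
    out->num_args = data[0];
    out->name_len = data[1];
    total = 2 + (size_t)out->name_len + 1;
    if (remaining < total || data[total - 1] != '\0')
        return 0;
    out->name = (const char *)(data + 2);
    return total;
}
```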

# Bytecode instructions
## Instructions layout
|Type       |Field            |Description                                                                |
|-----------|-----------------|---------------------------------------------------------------------------|
|u32        |Instructions size|The size of the instructions section, in bytes.                            |
|Instruction|Instructions data|The instructions data. Each instruction begins with an opcode, see #Opcode |

# Execution backend
Amalgam supports multiple execution backends, and a new one can be implemented with minimal
effort. The only requirements are to implement all the functions in executor/executor.h
and to add the source file with the implementation to the build script. See executor/interpreter/executor.c
for an example.\
Amalgam calls these functions as it parses the amalgam bytecode when `amal_program_run` is called.