[0050] VM Syscalls 3 #418

Merged
188 changes: 188 additions & 0 deletions rfcs/0147-vm-syscalls-3/0147-vm-syscalls-3.md
@@ -0,0 +1,188 @@
---
Number: "0147"
Category: Standards Track
Status: Draft
Author: Xu Jiandong <lynndon@gmail.com>, Dingwei Zhang <zhangsoledad@gmail.com>
Created: 2023-04-17
---

# VM Syscalls 3

## Abstract

This document describes the syscalls added in the CKB2023 update. These additions significantly enhance the flexibility of CKB Script.

The following four syscalls are added:

- [Spawn]
- [Get Memory Limit]
- [Set Content]
- [Load Extension]

### Spawn
[Spawn]: #spawn

The *Spawn* syscall is the core of this update. Together with *Get Memory Limit* and *Set Content*, it implements a way for one CKB Script to call another CKB Script. Unlike the [*Exec*](../0034-vm-syscalls-2/0034-vm-syscalls-2.md) syscall, *Spawn* saves the execution context of the current script. As with [posix_spawn](https://man7.org/linux/man-pages/man3/posix_spawn.3.html), a new program is started from the current one; here the parent script blocks until the child script ends.

```c
int ckb_spawn(uint64_t memory_limit, size_t index, size_t source,
              size_t bounds, int argc, char* argv[], int8_t* exit_code,
              uint8_t* content, uint64_t* content_length);
```

The arguments used here are:

- `memory_limit`: an integer value denoting the memory size to use (not including descendant child scripts). Possible values include:
    - 1 (0.5 MB)
    - 2 (1 MB)
    - 3 (1.5 MB)
    - 4 (2 MB)
    - 5 (2.5 MB)
    - 6 (3 MB)
    - 7 (3.5 MB)
    - 8 (4 MB)
- `index`: an index value denoting the index of entries to read.
- `source`: a flag denoting the source of cells to locate. Possible values include:
    - 1: input cells.
    - `0x0100000000000001`: input cells with the same running script as the current script.
    - 2: output cells.
    - `0x0100000000000002`: output cells with the same running script as the current script.
    - 3: dep cells.
- `bounds`: the high 32 bits hold the `offset` and the low 32 bits hold the `length`. If `length` is zero, the syscall reads to the end of the data instead of reading 0 bytes.
- `argc`: the number of arguments passed to the program.
- `argv`: a one-dimensional array of strings holding the arguments.
- `exit_code`: a pointer to an `int8_t` where the exit code of the child script is saved.
- `content`: a pointer to a buffer in VM memory space where the sub-script data is loaded. The child script writes data into this buffer via `set_content`.
- `content_length`: a pointer to a 64-bit unsigned integer in VM memory space. When calling the syscall, this memory location should store the length of the buffer specified by `content`. When returning from the syscall, CKB VM fills in `content_length` with the actual length of the data in the buffer. `content_length` can be at most 256 KB.
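
As a rough illustration of the `memory_limit` encoding listed above, the following host-side sketch converts a `memory_limit` value to bytes. The helper names are hypothetical and are not part of any CKB library, and it assumes each unit is 0.5 MB = 512 * 1024 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one memory_limit unit is 0.5 MB, taken as 512 * 1024 bytes. */
#define SPAWN_MEMORY_UNIT_BYTES (512u * 1024u)
#define SPAWN_MEMORY_LIMIT_MIN 1u
#define SPAWN_MEMORY_LIMIT_MAX 8u

/* Hypothetical helper: returns 1 if memory_limit is one of the listed values (1..8). */
int spawn_memory_limit_is_valid(uint64_t memory_limit) {
    return memory_limit >= SPAWN_MEMORY_LIMIT_MIN &&
           memory_limit <= SPAWN_MEMORY_LIMIT_MAX;
}

/* Hypothetical helper: converts a valid memory_limit value to a size in bytes. */
uint64_t spawn_memory_limit_to_bytes(uint64_t memory_limit) {
    assert(spawn_memory_limit_is_valid(memory_limit));
    return memory_limit * SPAWN_MEMORY_UNIT_BYTES;
}
```

For example, `spawn_memory_limit_to_bytes(8)` yields 4194304 bytes under this assumption, matching the 4 MB entry in the list.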

The arguments `index`, `source`, `bounds`, `argc` and `argv` follow the usage described in [EXEC].
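
The `bounds` layout (offset in the high 32 bits, length in the low 32 bits) can be sketched with a few host-side helpers. These function names are hypothetical, and `uint64_t` stands in for the 64-bit `size_t` of the VM target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs offset (high 32 bits) and length (low 32 bits). */
uint64_t spawn_pack_bounds(uint32_t offset, uint32_t length) {
    return ((uint64_t)offset << 32) | (uint64_t)length;
}

/* Extracts the offset from a packed bounds value. */
uint32_t spawn_bounds_offset(uint64_t bounds) {
    return (uint32_t)(bounds >> 32);
}

/* Extracts the length from a packed bounds value. */
uint32_t spawn_bounds_length(uint64_t bounds) {
    return (uint32_t)(bounds & 0xFFFFFFFFu);
}

/* Number of bytes selected from data of data_size bytes.
 * A length of zero means "read to the end", as described above.
 * Precondition (an assumption of this sketch): offset <= data_size. */
uint64_t spawn_bounds_effective_length(uint64_t bounds, uint64_t data_size) {
    uint64_t offset = spawn_bounds_offset(bounds);
    uint64_t length = spawn_bounds_length(bounds);
    assert(offset <= data_size);
    return length == 0 ? data_size - offset : length;
}
```

Passing `0` as `bounds`, as the example later in this document does, therefore selects the whole cell data.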

This syscall might return the following results:

- 0: Success.
- 1-3: Reserved. These values are already assigned to other syscalls.
- 4: ELF format error.
- 5: Exceeded max content length.
- 6: Wrong memory limit.
- 7: Exceeded max peak memory.
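
A caller may want to map these results to readable messages. The sketch below uses hypothetical enum and function names (they are not taken from any CKB header); only the numeric values come from the list above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for the result codes listed above. */
enum spawn_result {
    SPAWN_SUCCESS = 0,
    /* 1-3 are reserved: already assigned to other syscalls. */
    SPAWN_ELF_FORMAT_ERROR = 4,
    SPAWN_EXCEEDED_MAX_CONTENT_LENGTH = 5,
    SPAWN_WRONG_MEMORY_LIMIT = 6,
    SPAWN_EXCEEDED_MAX_PEAK_MEMORY = 7
};

/* Returns a short description for a Spawn result code. */
const char *spawn_result_describe(int code) {
    switch (code) {
    case SPAWN_SUCCESS: return "success";
    case 1: case 2: case 3: return "reserved";
    case SPAWN_ELF_FORMAT_ERROR: return "ELF format error";
    case SPAWN_EXCEEDED_MAX_CONTENT_LENGTH: return "exceeded max content length";
    case SPAWN_WRONG_MEMORY_LIMIT: return "wrong memory limit";
    case SPAWN_EXCEEDED_MAX_PEAK_MEMORY: return "exceeded max peak memory";
    default: return "unknown";
    }
}

/* Quick self-check that the table matches the numbering in the list. */
void spawn_result_self_check(void) {
    assert(strcmp(spawn_result_describe(0), "success") == 0);
    assert(strcmp(spawn_result_describe(4), "ELF format error") == 0);
    assert(strcmp(spawn_result_describe(7), "exceeded max peak memory") == 0);
}
```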

Note that there is now a new limit called *Peak Memory Usage*. The combined memory usage of the parent script and its descendant child scripts cannot exceed this value. Currently this limit is set at 32 MB.

Unlike cycles, which only ever increase, current memory usage can go up or down. When a child script exits, the memory it occupied is freed, which lowers current memory usage.
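
The accounting described above can be modeled on the host with a small tracker. This is only a sketch under stated assumptions: it assumes usage is the sum of the memory limits of all live scripts in the spawn chain, that the prime script occupies 8 units (4 MB), and that one unit is 0.5 MB, so 32 MB is 64 units. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PEAK_MEMORY_UNITS 64u  /* 32 MB expressed in 0.5 MB units (assumption) */
#define PRIME_MEMORY_UNITS 8u  /* the prime script's 4 MB */

struct memory_tracker {
    uint64_t current_units; /* memory held by all live scripts */
};

void memory_tracker_init(struct memory_tracker *t) {
    t->current_units = PRIME_MEMORY_UNITS;
}

/* Models a Spawn call: returns 0, 6 (wrong memory limit) or 7 (peak exceeded). */
int memory_tracker_spawn(struct memory_tracker *t, uint64_t memory_limit) {
    if (memory_limit < 1 || memory_limit > 8) return 6;
    if (t->current_units + memory_limit > PEAK_MEMORY_UNITS) return 7;
    t->current_units += memory_limit;
    return 0;
}

/* Models a child exiting: its memory is freed, so current usage drops. */
void memory_tracker_child_exit(struct memory_tracker *t, uint64_t memory_limit) {
    assert(t->current_units >= PRIME_MEMORY_UNITS + memory_limit);
    t->current_units -= memory_limit;
}

/* How many nested children with the same limit fit under the peak. */
int max_nested_children(uint64_t memory_limit) {
    struct memory_tracker t;
    int depth = 0;
    memory_tracker_init(&t);
    while (memory_tracker_spawn(&t, memory_limit) == 0) depth++;
    return depth;
}
```

Under these assumptions a chain of children that each request 4 MB can be 7 levels deep before Spawn would report result 7.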


### Get Memory Limit
[Get Memory Limit]: #get-memory-limit

Get the maximum available memory for the current script.

```c
int ckb_get_memory_limit();
```

For the prime script, it always returns 8 (4 MB). For a child script, the value depends on the `memory_limit` parameter passed to *Spawn*.

### Set Content
[Set Content]: #set-content

The child script can return byte data to the parent script through *Set Content*.

```c
int ckb_set_content(uint8_t* content, uint64_t* length);
```

- The length can be at most 256 KB.
- If the length to write is greater than the limit given by *Spawn*, only the smaller of the two lengths is actually written.
- This function is optional. Not every child script needs to call it.
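
The truncation rule can be illustrated with a host-side mock. This is not the real syscall: `mock_set_content` is a hypothetical function that takes the parent's buffer and capacity explicitly and returns the number of bytes written, whereas the real `ckb_set_content` reports the length through its `length` pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_CONTENT_LENGTH (256u * 1024u) /* 256 KB */

/* Hypothetical mock of the Set Content rule: copies at most parent_capacity
 * bytes into the parent's buffer and returns how many bytes were written.
 * Precondition: parent_capacity is within the 256 KB limit, since this
 * sketch assumes Spawn rejects larger buffers with result 5. */
uint64_t mock_set_content(uint8_t *parent_buf, uint64_t parent_capacity,
                          const uint8_t *content, uint64_t length) {
    assert(parent_capacity <= MAX_CONTENT_LENGTH);
    /* The written length is the minimum of the two lengths. */
    uint64_t written = length < parent_capacity ? length : parent_capacity;
    if (written > 0) {
        memcpy(parent_buf, content, (size_t)written);
    }
    return written;
}
```

With a 4-byte parent buffer, a child writing the 10 bytes of "helloworld" ends up with only "hell" delivered, and the returned length is 4.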

### Spawn example

Suppose we write a dependency library with a very simple function: it receives arguments, concatenates them, and returns the result to the caller.

**lib_strcat.c**

```c
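#include <stdint.h>
#include <string.h>

#include "ckb_syscalls.h"

int main(int argc, char *argv[]) {
  // Zero-initialize so that strcat starts from an empty string.
  char content[80] = {0};
  for (int i = 0; i < argc; i++) {
    // Fail rather than overflow the fixed-size buffer.
    if (strlen(content) + strlen(argv[i]) >= sizeof(content)) {
      return 1;
    }
    strcat(content, argv[i]);
  }
  uint64_t content_size = (uint64_t)strlen(content);
  ckb_set_content((uint8_t *)content, &content_size);
  // content_size now holds the written length; fail if it was truncated.
  if (content_size != (uint64_t)strlen(content)) {
    return 1;
  }
  return 0;
}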
```

We can call this dependency library from the prime script. The prime script passes in two arguments, "hello" and "world", and checks that the returned content equals "helloworld":

**prime.c**

```c
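#include <stdint.h>
#include <string.h>

#include "ckb_syscalls.h"

int main() {
  const char *argv[] = {"hello", "world"};
  // Sentinel value; overwritten with the child's exit code.
  int8_t exit_code = -1;
  // Zero-filled, so the 10 returned bytes stay NUL-terminated.
  uint8_t content[80] = {0};
  uint64_t content_length = 80;

  // memory_limit = 8 (4 MB), index = 1, source = 3 (dep cells), bounds = 0.
  int ret = ckb_spawn(8, 1, 3, 0, 2, (char **)argv, &exit_code, &content[0],
                      &content_length);
  if (ret != 0) {
    return 1;
  }
  if (strlen((const char *)content) != 10) {
    return 1;
  }
  if (strcmp((const char *)content, "helloworld") != 0) {
    return 1;
  }
  if (exit_code != 0) {
    return 1;
  }
  return 0;
}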
```

### Load Extension
[Load Extension]: #load-extension

The *Load Extension* syscall has the following signature:

```c
int ckb_load_extension(void* addr, uint64_t* len, size_t offset, size_t index, size_t source)
{
  return syscall(2104, addr, len, offset, index, source, 0);
}
```

The arguments used here are:

* `addr`, `len` and `offset` follow the usage described in the [Partial Loading] section.
* `index`: an index value denoting the index of entries to read.
* `source`: a flag denoting the source of cells to locate. Possible values include:
    + 1: input cells.
    + `0x0100000000000001`: input cells with the same running script as the current script.
    + 3: dep cells.
    + 4: header deps.

This syscall locates the `extension` field associated with an input cell, a dep cell, or a header dep, based on the `source` and `index` values, then uses the same steps documented in the [Partial Loading] section to feed the serialized value into the VM.
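
For readers unfamiliar with partial loading, the following host-side mock sketches the behavior as it is understood from the [Partial Loading] section: up to `*len` bytes are copied starting at `offset`, and `*len` is then set to the full number of bytes available from `offset`. The function name and the explicit `data` and `data_size` parameters are hypothetical, and the clamping of an oversized `offset` is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mock of partial loading over an in-memory byte array.
 * On entry *len is the capacity of addr; on return it is the number of
 * bytes available from offset, which may exceed what was copied. */
int mock_partial_load(void *addr, uint64_t *len, uint64_t offset,
                      const uint8_t *data, uint64_t data_size) {
    assert(len != NULL);
    /* Assumption: an offset past the end leaves zero bytes available. */
    uint64_t full_size = offset < data_size ? data_size - offset : 0;
    uint64_t real_size = *len < full_size ? *len : full_size;
    if (real_size > 0) {
        memcpy(addr, data + offset, (size_t)real_size);
    }
    *len = full_size;
    return 0;
}
```

A common pattern under this model is to call once with `*len` set to 0 to learn the size, then call again with a buffer of that size.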

Note that when loading the `extension` associated with an input cell or a dep cell, the header hash of the corresponding block must still be included in the `header deps` section of the current transaction.

This syscall might return the following errors:

* An invalid source value immediately triggers a VM error and halts execution.
* The syscall returns `1` if the index value is out of bound.
* The syscall returns `2` if the request is for an input cell but the `header deps` section is missing the header hash for that input cell.

In case of errors, `addr` and `len` will not contain meaningful data to use.
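
A typical way to consume these return values is to iterate over indices until the out-of-bound result appears. The sketch below shows that loop against a mock loader so it can run on a host; the constant names, `load_fn`, `mock_load` and `count_loadable` are all hypothetical, and only the numeric values 0, 1 and 2 come from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the return values described above. */
#define LOAD_SUCCESS 0
#define LOAD_INDEX_OUT_OF_BOUND 1
#define LOAD_ITEM_MISSING 2

/* Stand-in for a call such as ckb_load_extension at a given index. */
typedef int (*load_fn)(size_t index, void *ctx);

/* Mock loader: ctx is an int array of return codes that ends with
 * LOAD_INDEX_OUT_OF_BOUND. */
int mock_load(size_t index, void *ctx) {
    return ((const int *)ctx)[index];
}

/* Walks indices 0, 1, 2, ... until the loader reports out of bound.
 * Returns the number of successful loads, or the negated code on an
 * unexpected result. Entries reported as missing are counted in *missing. */
int count_loadable(load_fn load, void *ctx, size_t *missing) {
    assert(load != NULL && missing != NULL);
    int loaded = 0;
    *missing = 0;
    for (size_t index = 0;; index++) {
        int ret = load(index, ctx);
        if (ret == LOAD_INDEX_OUT_OF_BOUND) break;
        if (ret == LOAD_ITEM_MISSING) {
            (*missing)++;
            continue;
        }
        if (ret != LOAD_SUCCESS) return -ret;
        loaded++;
    }
    return loaded;
}
```

Whether a missing header hash should be skipped or treated as fatal depends on the script; skipping is just the choice made in this sketch.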

[EXEC]: ../0034-vm-syscalls-2/0034-vm-syscalls-2.md#exec
[Partial Loading]: ../0009-vm-syscalls/0009-vm-syscalls.md#partial-loading