-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New atom table implementation #892
Conversation
d343d16
to
322e3a8
Compare
5ba4b80
to
25a65ac
Compare
fa0c866
to
8a70c75
Compare
cecc731
to
b78ae45
Compare
If I understand this right, this is a hash table with separate chaining and grouping of entries. I am a little bit confused about grouping of entries that I understand is some kind of optimization. Indeed, I believe malloc is tlsf_malloc on esp32 and probably has an overhead of 4 bytes plus 32 bits alignment. I would rather not optimizing too early here if the plan is to have VM-based atoms that would be gperf and module/dynamic atoms that we would strdup. Indeed, with strdup, we would already lose the grouping benefit and should rather allocate both the string and the pointer to the next entry (if we stick to hash table with separate chaining). We could also consider other strategies such as realloc. Is it indeed an optimization trying to minimize the overhead of malloc? If so, could it be supported by a benchmark? Also, as previously mentioned, we could imagine other optimizations based on the fact that Erlang/OTP by default has a maximum of 2^20 atoms and on 32 bits platforms we have an index on 2^26 and on the fact that we already process modules with Packbeam and could embed them with some mmap'd structure that could help reduce RAM footprint. |
yes, it is quite efficient, but still for 100 atoms they are 400 bytes that we can save.
I don't want to strdup atoms coming from modules, they are quite a lot, and that takes memory.
Yes, we can benchmark it. The main purpose is having cleaner implementation without the double table issue etc...
We can discuss this, but this approach introduces some additional complexity and additional assumptions, so a fallback implementation would be required anyway. |
After I fixed the two bugs I found related to rehash (see my comments above), I ran a quick test against a much simpler version that simply mallocs HNode (including the string on copy). Running test-structs, I get the following values: alloc_total = 3640 - alloc_count = 224 If a malloc has 4 bytes of overhead, we only saved 4 bytes in exchange of a much more complex implementation... The code used for the benchmark can be found here: The benchmark counter was wrongly updated on realloc (group implementation) and wrongly not decremented on free (simpler implementation), but the difference is the same. With the fix, the results are: We have 16 malloc'd blocks with the group implementation and 97 additional ones with the simpler implementation. If it's 4 bytes per allocation, it means an extra 388 bytes. |
b78ae45
to
3c44640
Compare
New atom table addresses a number of issues, such as memory overhead, race conditions, etc... This new atom table implements both AtomString to index lookup and viceversa, so locking is simplified. Old atomshashtable is still required for modules table. Signed-off-by: Davide Bettio <davide@uninstall.it>
Signed-off-by: Davide Bettio <davide@uninstall.it>
Further reduce memory consumption and improve performances by adding a function that inserts multiple atoms at once, that can be used during module loading. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add also option flags for `AtomTableAlreadyExisting` and `AtomTableCopyAtom`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Use improved `atom_table_ensure_atom` which supports `AtomTableCopyAtom` and `AtomTableAlreadyExisting` options that fixes that possible race condition. Signed-off-by: Davide Bettio <davide@uninstall.it>
Instead of abort propagate error. Signed-off-by: Davide Bettio <davide@uninstall.it>
avail count can be kept as part of the table main structure, rather having it for each node group. Signed-off-by: Davide Bettio <davide@uninstall.it>
Rehash table when count is above a certain threshold, which is computed using capacity. Signed-off-by: Davide Bettio <davide@uninstall.it>
3c44640
to
a4b28da
Compare
Remove forward declaration by reorganizing declaration order. Signed-off-by: Davide Bettio <davide@uninstall.it>
Do not access anything other than local variables after unlocking. Signed-off-by: Davide Bettio <davide@uninstall.it>
Instead of returning an AtomString outside the atom table, implement a compare primitive which avoids the need for that. As a further step it will be possible to move atom strings to other locations without worrying about dangling pointers or complex locking logic. Signed-off-by: Davide Bettio <davide@uninstall.it>
Account for `\0`, so use `<=` instead of just `<`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Instead of returning a pointer to the atom string actual data, copy the string to a caller owned buffer. This will allow later implementing atom strings replacement when loading a newer module version, without worrying about reference counting or additional locks. Additional changes are required so `atom_table_get_atom_string` will be around for some more time. Signed-off-by: Davide Bettio <davide@uninstall.it>
globalcontext_insert_atom just calls globalcontext_insert_atom_maybe_copy that calls `atom_table_ensure_atom`. Signed-off-by: Davide Bettio <davide@uninstall.it>
- Handle out of memory in *_to_atom - Avoid unnecessary malloc - Fix memory leak Signed-off-by: Davide Bettio <davide@uninstall.it>
All functions should be lock and unlock, in order to keep everything maintainable. Signed-off-by: Davide Bettio <davide@uninstall.it>
Signed-off-by: Davide Bettio <davide@uninstall.it>
a4b28da
to
9e76b00
Compare
Just an additional comment here: in order to have a fair comparison we should take account of index to string complexity. I'll provide more information soon about this improvement. |
Implement new atom table that overcomes previous implementation shortcomings.
Some further improvements are not part of this PR and they will be addressed with further PRs, such as:
atom_table_get_atom_string
strdup
)Memory benchmark:
With atom table:
without:
These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).
SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later