-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added interop_iolist_fold #687
Conversation
083676e
to
94ed2f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to play a little bit with inline
and some other attributes to make sure performance are not reduced by function calls.
#define would work as well |
I did a first investigation: static inline InteropFunctionResult size_fold_fun(term t, void *accum) and inline InteropFunctionResult interop_iolist_fold(term t, interop_iolist_fold_fun fold_fun, void *accum) on x86_64 allow gcc to perform a fair level of optimization, on xtensa doesn't look yet nearly the same, anyway I would love to do some additional investigation, but definitely using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would slightly defer merging this PR after some investigation, minor changes and tricks to allow compiler optimizing it back to something comparable to previous version.
Right now without some changes it would have a significant penalty due to function calls / branching.
Can you quantify the difference in performance with a test? And for the majority of cases, is it even measurable? Or are these a priori analyses based on generated code? It might be hard to make assessments about performance without actual measurements, e.g., due to processor pipelining and other tricks. |
The problem with in-lining is code bloat. We are already pushing the boundaries on some platforms, especially smaller microcontrollers. |
If we do inline it doesn't get worse than now since those functions are already duplicated, however by using |
For what is worth, emscripten builds require |
True, but bear in mind that in the crypto branch, on ESP32 (due to the mbedtls API) we need to call this function for each crypto algorithm (for hashing, but maybe later for encryption as well). There are currently 6 hashing algorithms supported in that branch (md5, sha1, sha224, sha256, sha384, and sha512) |
94ed2f2
to
1444c3a
Compare
I used this version for testing purposes. |
40b4dd8
to
6df6561
Compare
Signed-off-by: Fred Dushin <fred@dushin.net>
6df6561
to
a15c922
Compare
This PR adds a function that supports traversal of values (integers and binaries) in an IO list.
These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).
SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later