Add ARM assembly optimized memcpy for RP2350 #2552

Merged
merged 3 commits into from
Oct 23, 2024

earlephilhower
Owner

33% faster for a 4K memcpy, measured with the DMAMemcpy example

With this assembly:
CPU: 4835 clock cycles for 4K
DMA: 2169 clock cycles for 4K

Using stock Newlib memcpy:
CPU: 7314 clock cycles for 4K
DMA: 2175 clock cycles for 4K

(What's interesting is that if we place this routine in RAM it's actually slower in this test, because the CPU instruction fetches fight with the data reads and writes and cause stalls... neat!)
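
For readers who want to check the "33% faster" figure, the arithmetic on the CPU cycle counts quoted above can be sketched in a few lines of C. This is only an illustration: the helper names are hypothetical and not part of this PR, and "4K" is assumed here to mean 4096 bytes.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the PR. */

/* Percentage of cycles saved when going from `baseline` to `optimized`. */
double percent_cycles_saved(uint32_t baseline, uint32_t optimized) {
    return 100.0 * ((double)baseline - (double)optimized) / (double)baseline;
}

/* Throughput ratio: how many times more data per cycle the optimized copy moves. */
double speedup_ratio(uint32_t baseline, uint32_t optimized) {
    return (double)baseline / (double)optimized;
}

/* Average cycles spent per byte copied (assumes "4K" means 4096 bytes). */
double cycles_per_byte(uint32_t cycles, uint32_t bytes) {
    return (double)cycles / (double)bytes;
}
```

With the CPU numbers above, `percent_cycles_saved(7314, 4835)` comes out a little under 34%, `speedup_ratio(7314, 4835)` is roughly 1.5x, and `cycles_per_byte(4835, 4096)` is about 1.2 cycles per byte versus about 1.8 for the stock Newlib routine. So "33% faster" here reads as "about a third fewer cycles".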

@earlephilhower earlephilhower merged commit e7419fb into master Oct 23, 2024
26 checks passed
@earlephilhower earlephilhower deleted the memasm branch October 23, 2024 22:11