Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dsp: p_firsym: API issue and implementation #158

Open
lfochamon opened this issue Jun 17, 2015 · 6 comments
Open

dsp: p_firsym: API issue and implementation #158

lfochamon opened this issue Jun 17, 2015 · 6 comments

Comments

@lfochamon
Copy link
Contributor

Although I think having a separate implementation specific for filters with symmetric coefficients is a great idea, there is the issue of dealing with the different types of symmetric FIR filters. For reference:

  • Type I: symmetric coefficients, odd length;
  • Type II: symmetric coefficients, even length (cannot be high pass, zero at -1);
  • Type III: anti-symmetric coefficients, odd length (cannot be high pass, low pass, or band stop, zeros at +1 and -1);
  • Type IV: anti-symmetric coefficients, odd length (cannot be low pass, zero at +1).

The way the function description is written today, I guess it would imply a type II filter. The implementation of all these FIR are very similar, but it should be defined whether it would make more sense to have (a) a single function that support all types of filters or (b) different functions.

For (a), the function interface needs to be updated to either pass symmetric/anti-symmetric and odd/even length parameters or the filter type. Option (b) would mean less branching and more straightforward code, although it could increase code footprint if someone used all 4 kinds of filters in a project (something really unlikely in practice).

I've written the code, but I guess I'll wait for a consensus before submitting a pull request.

@aolofsson
Copy link
Member

Take a look at the "firs" function from TI and let me know what you think. Some of the usage restrictions in that function seem reasonable.

http://www.ti.com/lit/ug/spru422j/spru422j.pdf

@lfochamon
Copy link
Contributor Author

OK, so firs constraints the filter to be type II. It seems a little restrictive, given the forced zero at -1. Personally, if I were to choose only one symmetric FIR to implement, it would be the type I. It does not constrain the transfer function and the implementation cost is virtually the same (there is 1 additional MAC). But I guess this is a designer judgement call.

Another possible compromise solution would be to make nh the full length of the filter to let the function know which type (I or II) the filter actually is. It alleviates the API at the cost of a little more testing inside the function. However, I would lean towards eliminating all checks inside these functions since filters usually sit inside the main loop.

TL;DR: make 4 functions (fir1, fir2, fir3, and fir4) or say p_firsym is for type I filters and nh = (L+1)/2+1, for a filter of length L odd. [my 2 cents]

BTW, now I get the idea behind passing *dbuf (or for the current p_fir, *dl). Although I guess all the arrays should be in local memory, otherwise it's gonna lead to more RAM accesses.

@aolofsson
Copy link
Member

Is there even a point? I know symmetrical filters are good for ASICs and FPGAs, but in most microprocessors, if you have one op per clock cycle and (MUL, ADD, or MAC), then there might be a big gain in doing the symmetrical fir versus the standard fir?

Is the function essential?

@lfochamon
Copy link
Contributor Author

Hmmm... That's true. I get stuck on an FPGA thinking, didn't even thought of that. But I've written function the p_firsym way for DSPs and I've seen them on libraries so that got me wondering.

Since it's the only way to really, I just coded p_firsym (with type II like TI's library, that doesn't matter for now) and tested it againt p_fir for differen nx and nh on my PC and the BeagleBone Black. I only ran tests with nh <= nx. I timed 1e7 runs on the PC and 1e6 runs on the BBB. The results are presented in the plots below.

On the PC, both function are roughly the same for small nh (no idea what happened with that oulier in the middle). For large nh, p_firsym does get run around 200 ns faster than p_fir.

On the ARM core, p_firsym is much faster than p_fir. Note that the functions run around 10x slower on the BBB, so the relative gains are more or less the same. I stopped the test in the middle because it was taking forever and the trend was already clear.

I cannot explain why, but aparrently p_firsym is worth implementing (as soon as the type issue is decided upon).

i7

bench_fir_firsym_i7
bench_fir_firsym_hist_i7

AM3558

bench_fir_firsym_bbb

bench_fir_firsym_hist_bbb

@aolofsson
Copy link
Member

Great data. Not sure I completely understand the diff between type 1 and type 2 and why the TI on eis type2?

Are you ok with the restrictions on the number coefficients being 3?

@lfochamon
Copy link
Contributor Author

I should've given an example from the start!

  • Type 1 filter: 1 2 3 4 5 4 3 2 1 (9 taps, odd length).
  • Type 2 filter: 1 2 3 4 5 5 4 3 2 1 (10 taps, even length).

I think TI implemented type 2 because it is easier code. I lean towards type 1 because it has no restriction. For the example above, the function parameters would be: h = {1,2,3,4,5} and nh = 5. Then p_firsym would do its magic.

At this moment, the the code is written so that there are no restrictions on nh. For small nh, however, you might be better off using p_fir directly. Depends on the platform, though...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants