Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Incomplete tile warning #245

Closed
wants to merge 6 commits into from
Closed

[ENH] Incomplete tile warning #245

wants to merge 6 commits into from

Conversation

ryanhammonds
Copy link
Member

@ryanhammonds ryanhammonds commented Jan 29, 2021

This adds a tile warning addressing #223, so noobs like myself aren't confused when a slope in the spectrum results from sim_oscillation (i.e. fs/freq is not an int). The slope below use to be smooth but now it produces a decreasing slope of harmonics.

from neurodsp.sim import sim_oscillation
from neurodsp.spectral import compute_spectrum
from neurodsp.plts import plot_power_spectra


n_seconds = 100
fs = 1000
freq = 9

sig = sim_oscillation(n_seconds, fs, freq)

freqs, powers = compute_spectrum(sig, fs)

plot_power_spectra(freqs, powers)

neurodsp/sim/periodic.py Outdated Show resolved Hide resolved
neurodsp/sim/periodic.py Outdated Show resolved Hide resolved
@TomDonoghue
Copy link
Member

Very interesting idea!

I make some comments on the current warning to potential update the description.

But then I was playing with it for a moment, and I realized it's actually more than fs & freq - it depends on the total simulated time as well.
So far example, this would pass the test, but still lead to the 'weird slope':

n_seconds, fs, freq = 0.95, 1000, 10
sig = sim_oscillation(n_seconds, fs, freq)
freqs, powers = compute_spectrum(sig, fs)
plot_power_spectra(freqs, powers)

Also: I'm a little wary that this warning might trigger a lot, and be annoying.
Something like this is common:

for freq in [8, 9, 10, 11, 12]:
      sig = sim_oscillation(10, 1000, freq)

and it would be annoying to trigger the warning a ton.
In this situation, the recommendation would typically not be use different sampling frequencies. Having non-integer number of cycles is something that happens sometimes - and sorta fine (it's not something to always avoid, and is not necessarily a big issue).

So: maybe we need to update this warning to make it consider n_seconds. But even so, I'm a bit hesitant to have it trigger too often. Maybe this is something that should rather be discussed more in the tutorials? And/or, I think we should perhaps add the text proposed for the warning into the docstring, in a Notes section?

@ryanhammonds
Copy link
Member Author

Thanks! I thought this was worth consideration since it seemed to spark a long discussion in #223.

I didn't realize this was dependent on n_seconds too. It looks your example is caused by n_seconds being too short to contain a single full cycle but I'll have to dig in more.

I can also see this warning being annoying - but I think this is important for spectral analyses? Ideally, it would be better if this was only triggered if compute_spectrum was ran on the simulated timeseries - but I don't think there's a way to do this without making things complicated.

I can drop the warning here and move it to notes.

@ryanhammonds
Copy link
Member Author

For your first example, scipy raises warning about defaulting nperseg. Your example can be fixed by passing in different values (nperseg=900). This only seems to be an issue when when n_seconds < 1. I played with your second example and it only raised the warning once per notebook.

The latest commit moves the warning to a note in the sim_oscillation docstring.

@TomDonoghue
Copy link
Member

TomDonoghue commented Feb 9, 2021

Hey @ryanhammonds :

Yeh, I also don't know that there is any way to trigger the warning from compute_spectrum. I don't think it's that important - the amount of off-CF power is minimal, and note that this is something relevant to all PSDs (real data, for example, has no guarantee of an integer number of cycles, and a non-integer number leads to the same thing in the frequency domain). The scipy warning is if the default nperseg of 1 second is longer than the signal length, I think - which isn't that important.

The n_seconds dependency is that if the duration of the signal doesn't cleanly include an integer number of cycles, then the signal gets truncated to end with a non-integer number of cycles, causing the same issue. (This technically happens when the sim'd signal is truncated at the end of sim_oscillation).

Because of the n_seconds dependency, I've pushed a commit that suggests a updated phrasing of the note in the function.

Also, it feels like even if this is not a continuously checked warning, then we should have functionality available to check this. To add this, I've added a sim util function to get an oscillation definition (in terms of whether the definition will include an integer number of cycles), which uses your original check for fs, and also check the signal duration. What do you think about this extra utility? If you like the idea, can you test it a little, to make sure if actually seems to track definitions that do / don't end up with integer number of cycles. Perhaps we then show / mention this function in the doc-site tutorial, so that anyone who wants to be really careful with this can see how.

Copy link
Member Author

@ryanhammonds ryanhammonds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of a separate util. I left one comment about truncation. This doesn't seem like it is always an issue and I'll have to run more sims to figure out when it is - so far though it seems like n_seconds is only an issue when < 1.

srate_check = (fs/freq).is_integer()

# Time check: check if signal length matches an integer number of cycles
time_check = (n_seconds * freq).is_integer()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is truncation always an issue? This example seems to produce an appropriate spectrum:

n_seconds, fs, freq = 1.99, 1000, 10

print((n_seconds * freq).is_integer())

sig = sim_oscillation(n_seconds, fs, freq)
freqs, powers = compute_spectrum(sig, fs)
plot_power_spectra(freqs, powers)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, great catch!

So yeh, this time check isn't super specific. Whether the number of cycles leads to funky PSD is a bit more complex than this check - it depends on the actual spectral estimation that is done (for example, with Welchs, I think it depends on the segment size (nperseg), and how this relates to the signal, more than the details of the total signal. Because of this, I think we should probably just entirely drop this time check, since I otherwise don't really know how to make it more specific (we can't predict the spectral estimation people will do - and this emphasizes that the details of the resulting PSD are somewhat contextual - not fully determined by the sim'd time series).

@TomDonoghue
Copy link
Member

So, I feel like I should have realized this earlier - but an important thing to keep in mind here is that the power on/off frequency is not defined solely by the simulation, but also the manner of computing the power spectrum - in particular when we are talking about an integer number of cycles.

Consider this sim:

from neurodsp.sim import sim_oscillation, sim_cycle
from neurodsp.spectral import compute_spectrum
from neurodsp.plts import plot_power_spectra
from neurodsp.utils import create_times

n_seconds, fs, freq = 10.0, 1000, 7
times = create_times(n_seconds, fs)
sig = sim_oscillation(n_seconds, fs, freq)

freqs, powers = compute_spectrum(sig, fs)
plot_power_spectra(freqs, powers)

Gives the seemingly wonky power spectrum:
Screen Shot 2021-02-23 at 11 33 31 AM

However, this is all down to the relationship between the simulated frequency, and the nperseg value in the Welch's spectral estimation (and nothing inherent in the simulation).

We can check this by updating nperseg:

cyclen = len(sim_cycle(1/freq, 1000, 'sine'))
freqs, powers = compute_spectrum(sig, fs, nperseg=cyclen*5)
plot_power_spectra(freqs, powers)

Which gives the a more exact spectrum:
Screen Shot 2021-02-23 at 11 33 37 AM

By default, fs and nperseg are often multiples of 10, and so frequencies like 10 Hz divide evenly, and frequencies like 7 do not - but this isn't inherent to the simulation, and not predictable from the simulation definition (n_seconds, fs, freq), without knowing something about the spectral estimation.

Which ultimately means - I don't think this 'helper' function or the note are actually well posed - since we can't say anything strictly true about the expected spectrum of the simulated signal based on the signal - what matters is the spectral estimation. So - in the end, I think we should drop this PR.

@ryanhammonds - thoughts? Does this make sense? I don't currently see a clear way to integrate this into the codebase without it being heavily caveated. As an alternative, this is all some information that I think would be useful to work into a tutorial / example somewhere - and if you feel you have a handle on it, a quick example that discusses and shows the relationship between spectral estimation and expected power spectra would be quite useful I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants