Label mismatch in WaveformView and FeatureView vs. Correlogram View and n_spikes #1261
Comments
It's best to run phy template-gui params.py --debug and post the output if a view is failing. Could you explain your other problem a bit more? What do you mean by being unable to tell which is the "real" one? And by spike, do you mean unit/cluster?
Hi @zm711, this is a known issue with Kilosort 3. I have seen many people report it in various places. For instance, #1257 refers to the same problem.
I think it would be easier if I could open this myself and poke around. Does anyone have a small dataset that reproduces this problem (maybe <2GB?)? We would need to know whether this is a phy/phylib issue or a KS issue, and having data that reproduces it would be the easiest way to find out!
Hi @zm711, thank you for your prompt reply and your interest in this issue! Please look into and try the dataset. Let me know if you have any trouble browsing/downloading. I prepared a real but modified data file so that the total size would be a little less than 2GB. I could upload a smaller dataset, but the caveat is that the above-mentioned issue somehow depends on data size (probably the number of spikes). I ran "GM-tryKS3-F27\rec1\run_KS3.m" and moved all the outputs to GM-tryKS3-F27\temp_kilo3. If you run Phy, circle any cluster with a large number of spikes (it doesn't matter whether it is mua or good) so that most, but not all, of its spikes are included, and then split, you should be able to reproduce what I and others are talking about.
Thanks @gminami. I'm booked up for the next couple of days, so my hope is to check this out in more depth on Friday!
@zm711 looking forward to your update!! FYI, the following is what happens with the dataset I sent. If you circle all the points but one in FeatureView and split (1st screenshot), you'll get a number of blue waveforms that are similar to the template and one somewhat deviated red waveform, indicating that the blue ones were originally inside the circle and the red one was outside. On the other hand, ClusterView reports the majority of spikes in the red cluster, not the blue one.
I have a hypothesis! Some of the views only load a random sampling of the spikes in order to not be too intensive. For example if you look at your waveforms in the This is a limitation of these types of GUIs. It would be way too slow to display all the data. This is true in SpikeInterface and the SpikeInterface GUI as well. There we let you choose the spikes you want (although not as freely as we would like yet). But Phy (at the GUI level) chooses for you. This is why splitting is so much more difficult than merging at the GUI level. The Does all this make sense?
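To make the hypothesis concrete, here is a minimal sketch in plain Python. It is not Phy's actual code: the helper names (displayed_subset, split_by_lasso), the display budget of 1,000 spikes, and the rule that every spike not lassoed ends up in the other cluster are all assumptions made for illustration.

```python
import random


def displayed_subset(spike_ids, max_shown, seed=0):
    """Pick a random subset of spikes, as a view with a display budget might.

    Hypothetical helper: the budget and the sampling rule are assumptions.
    """
    rng = random.Random(seed)
    if len(spike_ids) <= max_shown:
        return sorted(spike_ids)
    return sorted(rng.sample(spike_ids, max_shown))


def split_by_lasso(spike_ids, lassoed):
    """Split a cluster: lassoed spikes form one cluster, the rest the other.

    Assumption: spikes that were never displayed cannot be lassoed, so they
    all fall into the "outside" cluster.
    """
    lassoed = set(lassoed)
    inside = [s for s in spike_ids if s in lassoed]
    outside = [s for s in spike_ids if s not in lassoed]
    return inside, outside


# A large cluster of 100,000 spikes, of which only 1,000 are drawn.
cluster_spikes = list(range(100_000))
shown = displayed_subset(cluster_spikes, max_shown=1_000)

# Circle every displayed point except one, then split.
lassoed = shown[:-1]
inside, outside = split_by_lasso(cluster_spikes, lassoed)

print(len(inside), len(outside))  # 999 99001
```

Under these assumptions, the cluster that looked like "almost everything" in the view holds 999 spikes, while the cluster holding the single leftover point also inherits the 99,000 spikes that were never drawn. That would give the larger n_spikes to the cluster that looks like the outlier.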
Yes, I think this feature of Phy is reasonable. Such a sparse representation per se hasn't been a problem, as a 'cloud' of points lets you estimate the distribution of a cluster. I've been able to split clusters just fine from KS1 up to KS2.5. The issue is not a sparse representation of the data, but a FALSE representation: splitting a large cluster in Phy2 after KS3 displays wrong data, that is, data that does not belong to the current cluster. In the example I sent, I suspect that either the waveforms or the points in the feature view are not real (because there is a mismatch). Since KS3 does not create the npy files that Phy needs for FeatureView and other views, I had to run phy extract-waveforms params.py in my Python environment to create them (as some KS3 users report doing). This may be the cause of the malfunction. Is there a way to fix this? For instance, if you could come up with a way to create the feature-related npy files for Phy in MATLAB (as in all other versions of Kilosort), that would be great.
Let me answer in reverse because the last point is easiest.
That needs to be done by the Kilosort team. They had plans to do it but dropped it when they started working on KS4. So you'd have to do that yourself. Sorry.
Yes, people do this to get features. But with the extraction process you lock in your random spikes, so the views can't update in the future. I think you could argue that the distribution chosen for some clusters is poor, but once you've extracted that distribution you're locked in.
I don't understand this part. Nothing struck me as false. It seemed more that different views have different amounts of data available, so each updates what it can. That doesn't make anything false; again, it's a limitation of these GUI representations. But if you explain this more, maybe I can be convinced. I do want to emphasize that overall you're right: a big difference between KS <3 and KS3 is that KS3 didn't finish the npy writing, which means that the FeatureView can only show the info you extract rather than views updated from the prewritten full files. So again, not false, just limited compared to what was possible before. You could try asking the KS team, but they said before that they wouldn't work on it, so I don't think they will.
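The "locked in" point can be sketched the same way. This is an illustration under stated assumptions, not a description of what phy extract-waveforms does internally: extract_once, coverage, and the numbers (50,000 spikes, 500 extracted) are hypothetical.

```python
import random


def extract_once(spike_ids, n_extract, seed=0):
    """Choose, once, the spikes for which features are stored.

    Hypothetical stand-in for a one-off extraction step.
    """
    rng = random.Random(seed)
    return set(rng.sample(spike_ids, min(n_extract, len(spike_ids))))


def coverage(cluster_spike_ids, extracted):
    """Return (n_spikes, n_with_features) for one cluster."""
    n_with_features = sum(1 for s in cluster_spike_ids if s in extracted)
    return len(cluster_spike_ids), n_with_features


# One cluster of 50,000 spikes; features are stored for only 500 of them.
all_spikes = list(range(50_000))
extracted = extract_once(all_spikes, n_extract=500)

# A later split: the fixed subset is simply partitioned between the children.
child_a = all_spikes[:49_000]
child_b = all_spikes[49_000:]

report = {
    "child_a": coverage(child_a, extracted),
    "child_b": coverage(child_b, extracted),
}
for name, (n_spikes, n_with_features) in report.items():
    print(name, n_spikes, n_with_features)
```

Whatever the split, the two children can only share the 500 stored spikes between them, so a view that depends on the extracted features cannot draw new samples for a child cluster, while a view that works from all spike times can still use every spike.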
I am currently running Windows 11 with data sorted by Kilosort 3. Whenever I perform a split, the Feature View and Waveform View show features that are internally consistent (in my screenshot below, the red unit in the Waveform View is the noise pulled out by splitting in FeatureView). However, the n_spikes and correlogram are reversed, making it impossible to tell which is the "real" spike. Incidentally, the amplitude view is not showing data. This is not associated with any errors in the command prompt as far as I can tell. Any help would be appreciated.