
Dataset download #4

Open
Yuan-ManX opened this issue Aug 21, 2023 · 7 comments

Comments

@Yuan-ManX

Is the dataset open source? How to download?

@liebharc

I would also be happy to see the data set. In addition, it would be nice to have the training code available. In my test runs, the results seem sensitive to how the staffs are cropped from a larger image. I think this could be improved by adding more distortions to the training set.
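One simple form of such a distortion would be to randomize the crop margins around each detected staff during training, so the model learns to tolerate imprecise cropping. A minimal sketch in plain Python; the function and parameter names are made up for illustration and are not code from this repo:

```python
import random

def jitter_staff_crop(box, img_w, img_h, max_frac=0.15, rng=random):
    """Randomly perturb a staff bounding box before cropping.

    box is (left, top, right, bottom). Each margin is shifted by up to
    max_frac of the staff height, so the training set contains crops
    that are slightly too tight or too loose.
    """
    left, top, right, bottom = box
    height = bottom - top
    jitter = lambda: rng.uniform(-max_frac, max_frac) * height
    l = min(max(0.0, left + jitter()), img_w - 1)
    t = min(max(0.0, top + jitter()), img_h - 1)
    # keep the box non-empty and inside the image
    r = max(min(float(img_w), right + jitter()), l + 1)
    b = max(min(float(img_h), bottom + jitter()), t + 1)
    return (int(l), int(t), int(r), int(b))
```

The jittered box can then be fed to whatever cropping routine the training pipeline already uses, e.g. `image.crop(jitter_staff_crop(box, w, h))` with Pillow.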

@Daniel63656

Will the dataset be made public?

@liebharc

From what I can see, the data set was never published. While I still hope that this might change in the future, I started an attempt to train this model on a mix of the PrIMuS data set and the Grandstaff data set. The results aren't yet as robust as what I get with the weights provided in this repo, but in some cases it works well. The training code so far is on my fork of this repo: https://github.com/liebharc/Polyphonic-TrOMR

@noobpeng99

You are right. I have also attempted to train TrOMR on the PrIMuS dataset, simply by scaling the images to a fixed size. My results show that TrOMR's performance does not exhibit a significant advantage, with a symbol error rate exceeding 3% on the CameraPrIMuS dataset. Can you share your test results?

@liebharc

I haven't calculated a symbol error rate yet. Right now, I run inference on a small set of example images, such as https://github.com/BreezeWhite/oemer/blob/main/figures/tabi.jpg (after splitting it into single-staff images), to get a feeling for how well it performs.

Is the code you are using to calculate the SER available somewhere? To get meaningful results, I would also need another data set to calculate the SER on. Since PrIMuS is used for training, I of course can't also use it to rate the performance of the model. At least for monophonic examples, it shouldn't be too hard for me to find another data set.

@noobpeng99


I will open-source my code once everything is ready, but currently it's still under development. You can calculate the symbol error rate by measuring the edit distance between the predicted sequence generated by the model and the ground truth. You can install a tool for calculating the edit distance with `pip install editdistance`.
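For illustration, a pure-Python sketch of such an SER computation; the `editdistance` package computes the same Levenshtein distance, just faster, and the token names below are made up rather than taken from any particular encoding:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(hyp) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                          # deletion
                        dp[j - 1] + 1,                      # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))  # substitution
            prev = cur
    return dp[-1]

def symbol_error_rate(pairs):
    """SER over (ground_truth, prediction) sequence pairs:
    total edit distance divided by total ground-truth length."""
    errors = sum(edit_distance(gt, pred) for gt, pred in pairs)
    total = sum(len(gt) for gt, _ in pairs)
    return errors / total

# one wrong token out of four ground-truth tokens -> SER 0.25
gt = ["clef-G2", "note-C4_quarter", "note-E4_quarter", "barline"]
pred = ["clef-G2", "note-C4_half", "note-E4_quarter", "barline"]
print(symbol_error_rate([(gt, pred)]))  # -> 0.25
```

With `editdistance` installed, `edit_distance` can be swapped for `editdistance.eval` without changing the rest.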

Regarding the dataset, I trained the model using approximately 60,000 images from the PrIMuS dataset and then tested it on around 10,000 images. I also experimented with training on smaller sets of images and found that TrOMR may not fully demonstrate its true capabilities when the dataset size is small.

@liebharc

FYI, homr now combines TrOMR, trained on the Grandstaff dataset, with a staff detection module based on the segmentation models of oemer.
