Breakdown of which benchmarks were solved in paper #9

Open
SamuelSchmidgall opened this issue Oct 14, 2024 · 7 comments
Comments

@SamuelSchmidgall

Hello,

I noticed that the paper does not discuss exactly which of the benchmarks your solutions were able to solve. I am also curious about the percentage breakdown for Low, Medium, and High complexity (e.g. above median / earning Bronze, Silver, Gold). I would greatly appreciate it if this data could be provided.

Thank you,
Sam

@Fardeen786-eng

Hi team, it would also help if, along with that analysis, we could get a clear classification of the competitions based on the types mentioned, and the medal achievements against those various types. I understand there were multiple runs and seeds, but results for just the best runs would do.

Regards,
Fardeen

@thesofakillers
Collaborator

thesofakillers commented Oct 17, 2024

> we get a clear classification of the competitions based on the types mentioned

Hi, what do you mean exactly by this? Are you looking for the raw data that went into making Figure 6 in the report?

If you're simply looking for which comps are low/medium/high, you can check the splits in experiments/splits/
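
For reference, loading those splits could look something like the rough sketch below. This assumes the splits are plain-text files such as low.txt, medium.txt, and high.txt with one competition ID per line, which is an assumption about the file layout rather than a guarantee:

```python
from pathlib import Path

# Assumed layout: experiments/splits/{low,medium,high}.txt, each containing
# one competition ID per line (an assumption about the file format).
splits_dir = Path("experiments/splits")

complexity_of = {}  # competition ID -> complexity tier
for tier in ("low", "medium", "high"):
    for line in (splits_dir / f"{tier}.txt").read_text().splitlines():
        comp_id = line.strip()
        if comp_id:
            complexity_of[comp_id] = tier

print(f"Loaded {len(complexity_of)} competitions across "
      f"{len(set(complexity_of.values()))} tiers")
```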

@SamuelSchmidgall
Author

Hi, I'm not sure what the other user meant.

I was hoping to get a medal breakdown based on the difficulty tiers of the challenges. In Table 2 you report the following measures: Made Submission (%), Valid Submission (%), Above Median (%), Bronze (%), Silver (%), Gold (%), Any Medal (%).

However, since you do not report which exact benchmarks the above metrics were earned on, there is no way to know, e.g., the Above Median (%) for low-complexity problems. I was hoping to be able to create a table with the following columns

Made Submission (%), Valid Submission (%), Above Median (%), Bronze (%), Silver (%), Gold (%), Any Medal (%)

but with the breakdown reported by complexity (e.g. low, medium, high).
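
Concretely, the aggregation I have in mind would look roughly like the sketch below once per-competition results are available. This is only a sketch: the field names (competition_id, made_submission, above_median, and the per-medal booleans) are my assumptions about what a grading report might contain, not the actual schema.

```python
import json
from collections import defaultdict

# Hypothetical grading-report format: a JSON list with one entry per
# competition run, e.g. {"competition_id": ..., "made_submission": bool,
# "valid_submission": bool, "above_median": bool, "bronze": bool,
# "silver": bool, "gold": bool}. These field names are assumptions.
METRICS = ["made_submission", "valid_submission", "above_median",
           "bronze", "silver", "gold", "any_medal"]

def breakdown_by_complexity(report_path, complexity_of):
    """Aggregate per-competition results into per-tier percentages."""
    with open(report_path) as f:
        runs = json.load(f)

    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for run in runs:
        tier = complexity_of[run["competition_id"]]
        totals[tier] += 1
        any_medal = run["bronze"] or run["silver"] or run["gold"]
        for metric in METRICS:
            value = any_medal if metric == "any_medal" else run[metric]
            counts[tier][metric] += bool(value)

    # Percentages per tier, matching the Table 2 columns.
    return {
        tier: {m: 100.0 * counts[tier][m] / totals[tier] for m in METRICS}
        for tier in totals
    }
```

With the difficulty tiers loaded from experiments/splits/, each tier's row would then line up with the Table 2 columns.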

@Fardeen786-eng

>> we get a clear classification of the competitions based on the types mentioned
>
> Hi, what do you mean exactly by this? Are you looking for the raw data that went into making Figure 6 in the report?
>
> If you're simply looking for which comps are low/medium/high, you can check the splits in experiments/splits/

Yes, I am looking for the raw data behind Figure 6, and some analysis of the medals earned by the runs across the various categories mentioned in Figure 6 (like the medal % for Tabular, Text Classification, etc.).

@SamuelSchmidgall
Author

Is there a results.txt file anywhere listing which benchmarks were solved, or logs from the experiments? I was just hoping for an accuracy/medal breakdown on MLE-bench based on complexity, and would be happy to calculate it myself if the logs are available.

@james-aung
Collaborator

We'll likely share the grading reports from our runs later this week or next :)

@SamuelSchmidgall
Author

Thank you very much!
