Breakdown of which benchmarks were solved in paper #9

Open
SamuelSchmidgall opened this issue Oct 14, 2024 · 7 comments
Comments

@SamuelSchmidgall

Hello,

I noticed that the paper does not discuss exactly which of the benchmarks your solutions were able to solve. I am also curious about the percentage breakdown for Low, Medium, and High complexity (e.g. above median / earning Bronze, Silver, Gold). I would greatly appreciate it if this data could be provided.

Thank you,
Sam

@Fardeen786-eng

Hi team, it would also help if, along with that analysis, we could get a clear classification of the competitions based on the types mentioned, and the medal achievements against those various types. I understand there were multiple runs and seeds, but results for just the best runs would do.

Regards,
Fardeen

@thesofakillers
Collaborator

thesofakillers commented Oct 17, 2024

> we get a clear classification of the competitions based on the types mentioned

Hi, what do you mean exactly by this? Are you looking for the raw data that went into making Figure 6 in the report?

If you're simply looking for which comps are low/medium/high, you can check the splits in experiments/splits/
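
For reference, loading those splits could look something like the rough sketch below. This assumes the splits are plain-text files such as low.txt, medium.txt, and high.txt with one competition ID per line, which is an assumption about the file layout rather than a guarantee:

```python
from pathlib import Path

# Assumed layout: experiments/splits/{low,medium,high}.txt, each containing
# one competition ID per line (an assumption about the file format).
splits_dir = Path("experiments/splits")

complexity_of = {}  # competition ID -> complexity tier
for tier in ("low", "medium", "high"):
    for line in (splits_dir / f"{tier}.txt").read_text().splitlines():
        comp_id = line.strip()
        if comp_id:
            complexity_of[comp_id] = tier

print(f"Loaded {len(complexity_of)} competitions across "
      f"{len(set(complexity_of.values()))} tiers")
```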

@SamuelSchmidgall
Author

Hi, I'm not sure what the other user meant.

I was hoping to get a medal breakdown based on the difficulty tiers of the challenges. In Table 2 you report the following measures: Made Submission (%), Valid Submission (%), Above Median (%), Bronze (%), Silver (%), Gold (%), Any Medal (%).

However, since you do not report which exact benchmarks the above metrics were earned on, there is no way to know, e.g., the Above Median (%) for low-complexity problems. I was hoping to be able to create a table with the following columns

Made Submission (%), Valid Submission (%), Above Median (%), Bronze (%), Silver (%), Gold (%), Any Medal (%)

but with the breakdown reported by complexity (e.g. low, medium, high).
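
Concretely, the aggregation I have in mind would look roughly like the sketch below once per-competition results are available. This is only a sketch: the field names (competition_id, made_submission, above_median, and the per-medal booleans) are my assumptions about what a grading report might contain, not the actual schema.

```python
import json
from collections import defaultdict

# Hypothetical grading-report format: a JSON list with one entry per
# competition run, e.g. {"competition_id": ..., "made_submission": bool,
# "valid_submission": bool, "above_median": bool, "bronze": bool,
# "silver": bool, "gold": bool}. These field names are assumptions.
METRICS = ["made_submission", "valid_submission", "above_median",
           "bronze", "silver", "gold", "any_medal"]

def breakdown_by_complexity(report_path, complexity_of):
    """Aggregate per-competition results into per-tier percentages."""
    with open(report_path) as f:
        runs = json.load(f)

    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for run in runs:
        tier = complexity_of[run["competition_id"]]
        totals[tier] += 1
        any_medal = run["bronze"] or run["silver"] or run["gold"]
        for metric in METRICS:
            value = any_medal if metric == "any_medal" else run[metric]
            counts[tier][metric] += bool(value)

    # Percentages per tier, matching the Table 2 columns.
    return {
        tier: {m: 100.0 * counts[tier][m] / totals[tier] for m in METRICS}
        for tier in totals
    }
```

With the difficulty tiers loaded from experiments/splits/, each tier's row would then line up with the Table 2 columns.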

@Fardeen786-eng

>> we get a clear classification of the competitions based on the types mentioned
>
> Hi, what do you mean exactly by this? Are you looking for the raw data that went into making Figure 6 in the report?
>
> If you're simply looking for which comps are low/medium/high, you can check the splits in experiments/splits/

Yes, I am looking for the raw data behind Figure 6, and some analysis of the medals earned by the runs across the various categories mentioned in Figure 6 (like the medal % for Tabular, Text Classification, etc.).

@SamuelSchmidgall
Author

Is there a results.txt file anywhere listing which benchmarks were solved, or logs from the experiments? I was just hoping for an accuracy/medal breakdown on MLE-bench based on complexity, and would be happy to calculate it myself if the logs are available.

@james-aung
Collaborator

We'll likely share the grading reports from our runs later this week or next :)

@SamuelSchmidgall
Author

Thank you very much!
