Add workflow example for time series forecasting #255

Merged: 12 commits, Oct 10, 2023

Contributor

@hcho3 hcho3 commented Jul 27, 2023

  • Use data from the M5 competition, a well-known competition for time series forecasting
  • Use Dask Kubernetes to run hyperparameter optimization
  • Use GPU-accelerated XGBoost and cuDF to speed up time series analysis end-to-end
  • Implement all preprocessing logic with cuDF
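
The hyperparameter optimization step listed above could be organized roughly as follows. This is a minimal sketch, not the notebook's actual code: the helper `candidate_params`, the search space `GRID`, and the base settings `GPU_BASE` are hypothetical names and values chosen for illustration. The GPU options use the XGBoost 2.0+ spelling (`device="cuda"` with `tree_method="hist"`); older releases use `tree_method="gpu_hist"` instead.

```python
import itertools


def candidate_params(grid, base):
    """Yield one XGBoost-style parameter dict per combination in `grid`.

    `grid` maps a parameter name to the list of values to try;
    `base` holds the settings shared by every trial.
    """
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(base)          # copy so trials do not share state
        params.update(zip(keys, values))
        yield params


# Assumed GPU settings (XGBoost 2.0+ names; not taken from the PR).
GPU_BASE = {
    "tree_method": "hist",
    "device": "cuda",
    "objective": "reg:squarederror",
}

# Hypothetical search space; the values are illustrative only.
GRID = {"max_depth": [4, 6, 8], "learning_rate": [0.05, 0.1]}

trials = list(candidate_params(GRID, GPU_BASE))
print(len(trials))
```

Each resulting dict is independent of the others, so the trials could be handed to separate workers on a Dask cluster (for example with `Client.map`), with every worker training and scoring one model.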

Contributor Author

hcho3 commented Aug 2, 2023

@jacobtomlinson For the time being, I put the extra notebooks under the toctree of start_here.ipynb so that they don't clutter the workflow catalog.

Member

@jacobtomlinson jacobtomlinson left a comment

Thanks @hcho3, and sorry it has taken me so long to review this one.

Overall I think the structure is great. I've added a few comments.

One high-level question I had is why there are so many separate pre-processing notebooks. Some of them seem to load the data, do a small transformation, and save it again. I think it would be more pleasant to read if there were fewer (or just one) pre-processing notebook. What do you think?

Also, can you add some tags to the start-here notebook?

Member

@jacobtomlinson jacobtomlinson left a comment

This is looking really great. Thanks for taking the time to merge this into one notebook and address the review feedback.

The only small nit I have is that the DISABLE_JUPYTER env var is no longer needed, so you can remove that from the code cell.

@jacobtomlinson jacobtomlinson merged commit 6d49336 into rapidsai:main Oct 10, 2023
3 checks passed
@hcho3 hcho3 deleted the add_time_series branch October 10, 2023 17:11