Add ability to archive to S3 bucket #2980

Open
3 tasks
KevinCounts opened this issue Oct 3, 2024 · 0 comments
Labels
feature New feature or request

Comments

@KevinCounts

What new functionality do you need?

When running reanalyses, PSL would like to be able to push and archive data outputs to a public AWS S3 bucket from NOAA HPC resources. Ideally, we'd like to push this data to both HPSS (as a tarball) and AWS (as raw files). I'm working with Phil Pegion, Jeff Whitaker, and Ding Liu on setting up this process.

I believe this is similar to #2872, but that issue focuses on running the entire workflow on AWS or other CSPs. We'd like to be able to do this when running the workflow from NOAA resources.

What are the requirements for the new functionality?

Ability to configure and push data to public AWS S3 buckets. Users should be able to choose whether to push data to AWS, HPSS, or both. The workflow should support this when run from NOAA HPC resources.

Acceptance Criteria

  • Workflow can be set up to upload data to a remote public AWS bucket
  • User can choose to archive data to HPSS (tarball) and/or an AWS bucket (raw)
  • Workflow runs on NOAA resources
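
To make the "HPSS and/or AWS" choice concrete, here is a minimal sketch of how a user setting could be parsed into a set of archive targets. The comma-separated format and the names `ArchiveTarget` and `parse_archive_targets` are assumptions for illustration only; they are not existing workflow options.

```python
from __future__ import annotations

from enum import Enum


class ArchiveTarget(Enum):
    """Hypothetical archive destinations named in this request."""
    HPSS = "hpss"  # tarball archive
    AWS = "aws"    # raw files in an S3 bucket


def parse_archive_targets(value: str) -> set[ArchiveTarget]:
    """Parse a comma-separated setting such as "hpss,aws" (assumed format)."""
    names = [item.strip().lower() for item in value.split(",") if item.strip()]
    if not names:
        raise ValueError("at least one archive target is required")
    try:
        return {ArchiveTarget(name) for name in names}
    except ValueError:
        valid = ", ".join(target.value for target in ArchiveTarget)
        raise ValueError(
            f"unknown archive target in {value!r}; valid: {valid}"
        ) from None
```

An archive task could then test `ArchiveTarget.HPSS in targets` and `ArchiveTarget.AWS in targets` independently, so that either destination or both can be enabled.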

Suggest a solution (optional)

/ush/python/pygfs/task/archive.py already handles data archiving to HPSS. We would like to be able to choose AWS as an archive destination here. I think this would align with Walter Kolczynski's suggestion in #2873.
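
As a rough sketch of what the raw (non-tarball) push might look like, the snippet below mirrors local output paths into S3 object keys and uploads each file with boto3's `upload_file`. This is not the existing `archive.py` interface: the function names, the `prefix` parameter, and the bucket name are hypothetical, and it assumes boto3 is installed and that credentials with write access to the bucket are available in the environment.

```python
from __future__ import annotations

from pathlib import Path, PurePosixPath


def build_s3_keys(root: Path, files: list[Path], prefix: str = "") -> dict[Path, str]:
    """Map each local file to an S3 object key that mirrors its path under root."""
    keys = {}
    for path in files:
        relative = path.relative_to(root)  # raises ValueError if path is outside root
        keys[path] = str(PurePosixPath(prefix, *relative.parts))
    return keys


def push_raw_files(root: Path, files: list[Path], bucket: str, prefix: str = "") -> list[str]:
    """Upload files unmodified (no tarball) and return the object keys written."""
    import boto3  # imported lazily so the key logic can be used without boto3

    client = boto3.client("s3")
    uploaded = []
    for path, key in build_s3_keys(root, files, prefix).items():
        client.upload_file(str(path), bucket, key)
        uploaded.append(key)
    return uploaded
```

Keeping the key construction separate from the upload makes the bucket layout testable without network access, and it leaves room for HPSS and AWS to share the same file list.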

@KevinCounts KevinCounts added feature New feature or request triage Issues that are triage labels Oct 3, 2024
@WalterKolczynski-NOAA WalterKolczynski-NOAA removed the triage Issues that are triage label Oct 8, 2024
@WalterKolczynski-NOAA WalterKolczynski-NOAA changed the title Add alternate data archive Add ability to archive to S3 bucket Oct 8, 2024

2 participants