Add new lancaster storage. Remove broken status from the old lancaste… #81

Open
wants to merge 1 commit into
base: master

@sk1806 sk1806 commented May 30, 2024

Added new lancaster storage.

Removed broken status from old lancaster storage since it works and we need to access it to replicate data from old to new storage.

I tested t2kdm-put and t2kdm-replicate, and both worked with the new SE.

@ast0815
Member

ast0815 commented Jun 3, 2024

Don't forget to also update the CHANGELOG file.

@ast0815
Member

ast0815 commented Jun 3, 2024

Also, are you sure the broken status prevented you from replicating from that SE? It should de-prioritize broken SE when automatically choosing where to get a file from, but it should still use them as a last resort or if you explicitly tell it to use a specific one.

@thecuriousneutrino
Collaborator

thecuriousneutrino commented Jun 10, 2024

> Also, are you sure the broken status prevented you from replicating from that SE? It should de-prioritize broken SE when automatically choosing where to get a file from, but it should still use them as a last resort or if you explicitly tell it to use a specific one.

Actually I just tested if one can get files from a broken SE, and can confirm that keeping the old Lancaster SE with the broken status makes it impossible to get any files from it:

t2kdm-get -v /beta-production/sn/sntools/v0.7.2/nakazato/IO_100kpc_2001/HyperK_20perCent/0001/wcsim/v1.9.4_A/wcsr/sn_0001_000020050_wcsr.root
Getting /beta-production/sn/sntools/v0.7.2/nakazato/IO_100kpc_2001/HyperK_20perCent/0001/wcsim/v1.9.4_A/wcsr/sn_0001_000020050_wcsr.root
2024-06-10 05:36:17 UTC None/API INFO: Replica Lookup Time: 0.21 seconds
2024-06-10 05:36:17 UTC None/API INFO: Replica Lookup Time: 0.16 seconds
Copying root://x509up_u22620@fal-pygrid-30.lancs.ac.uk:1094//dpm/lancs.ac.uk/home/hyperk.org/hyperk.org/beta-production/sn/sntools/v0.7.2/nakazato/IO_100kpc_2001/HyperK_20perCent/0001/wcsim/v1.9.4_A/wcsr/sn_0001_000020050_wcsr.root to ./sn_0001_000020050_wcsr.root
Getting /beta-production/sn/sntools/v0.7.2/nakazato/IO_100kpc_2001/HyperK_20perCent/0001/wcsim/v1.9.4_A/wcsr/sn_0001_000020050_wcsr.root failed.
b'gfal-copy error: 52 (Invalid exchange) - Could not stat the source: Failed to stat file (Invalid exchange)\n'

When one removes the broken label t2kdm-get works fine.

@ast0815
Member

ast0815 commented Jun 10, 2024

This seems like a bug. Looking at the code, it should just de-prioritize blacklisted or broken SEs:

t2kdm/t2kdm/storage.py

Lines 136 to 146 in 778b68a

def sorter(SE):
    if SE is None:
        return 1000
    distance = self.get_distance(SE)
    if SE.type == "tape":
        # Prefer disks over tape, even if the tape is closer by
        distance += 10
    if SE.is_blacklisted():
        # Try blacklisted SEs only as a last resort
        distance += 100
    return distance

Is root://x509up_u22620@fal-pygrid-30.lancs.ac.uk:1094 the new SE or the old?

If it is the old, then I do not understand why it does not work depending on the broken flag, because it should do exactly the same steps.

If it is the new, I do not understand why it does not try the broken SE after this one fails. As noted above, it should really just put them at the end of the list but still download from there if all else fails.
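For reference, the behaviour expected here can be sketched as a standalone toy. This is not the t2kdm implementation: `ToySE`, `get_with_fallback`, and the `download` callback are hypothetical names made up for illustration, and the sketch assumes `is_blacklisted()` also covers broken SEs. It reuses the penalties from the quoted `sorter` and then tries every candidate in order, so a broken SE would still be reached if everything ahead of it fails.

```python
from dataclasses import dataclass


@dataclass
class ToySE:
    """Hypothetical stand-in for a storage element; not the t2kdm class."""
    name: str
    distance: int
    type: str = "disk"
    blacklisted: bool = False
    broken: bool = False

    def is_blacklisted(self):
        # Assumption: broken SEs are treated as blacklisted as well.
        return self.blacklisted or self.broken


def sorter(se):
    """Same penalties as the quoted snippet, with a plain attribute for distance."""
    if se is None:
        return 1000
    distance = se.distance
    if se.type == "tape":
        # Prefer disks over tape, even if the tape is closer by
        distance += 10
    if se.is_blacklisted():
        # Blacklisted or broken SEs move to the back but stay in the list
        distance += 100
    return distance


def get_with_fallback(candidates, download):
    """Try candidates in sorted order; return the name of the first SE that works."""
    errors = []
    for se in sorted(candidates, key=sorter):
        try:
            download(se)
            return se.name
        except OSError as exc:
            # Remember the failure and move on to the next candidate
            errors.append((se.name, str(exc)))
    raise OSError(f"all storage elements failed: {errors}")


# Toy scenario: only the broken SE actually holds a readable copy.
ses = [
    ToySE("old-broken", 0, broken=True),
    ToySE("new", 5),
    ToySE("tape", 1, type="tape"),
]


def download(se):
    if se.name != "old-broken":
        raise OSError("simulated copy failure")


order = [se.name for se in sorted(ses, key=sorter)]
print(order)                             # ['new', 'tape', 'old-broken']
print(get_with_fallback(ses, download))  # old-broken
```

If the real code stops after the first failed candidate instead of continuing down the sorted list, that would match the failure seen in the log above.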

@ast0815
Member

ast0815 commented Jun 10, 2024

is_blacklisted also returns True for broken SEs.


3 participants