ontobio is not HTTPS-safe #688

Open
kltm opened this issue Aug 20, 2024 · 1 comment

kltm commented Aug 20, 2024

While working on Cloudflare for the GO public access points (https://github.com/geneontology/operations/issues/70), we recently discovered errors in ontobio:

[2024-08-20T06:50:13.986Z] Traceback (most recent call last):
[2024-08-20T06:50:13.986Z]   File "/usr/local/bin/validate.py", line 999, in <module>
[2024-08-20T06:50:13.986Z]     cli(obj={})
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
[2024-08-20T06:50:13.986Z]     return self.main(*args, **kwargs)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
[2024-08-20T06:50:13.986Z]     rv = self.invoke(ctx)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
[2024-08-20T06:50:13.986Z]     return _process_result(sub_ctx.command.invoke(sub_ctx))
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
[2024-08-20T06:50:13.986Z]     return ctx.invoke(self.callback, **ctx.params)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
[2024-08-20T06:50:13.986Z]     return __callback(*args, **kwargs)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 33, in new_func
[2024-08-20T06:50:13.986Z]     return f(get_current_context(), *args, **kwargs)
[2024-08-20T06:50:13.986Z]   File "/usr/local/bin/validate.py", line 722, in produce
[2024-08-20T06:50:13.986Z]     matching_gpi_path = download_a_dataset_source(group, ds, absolute_target, ds["source"],
[2024-08-20T06:50:13.986Z]   File "/usr/local/bin/validate.py", line 106, in download_a_dataset_source
[2024-08-20T06:50:13.986Z]     response = requests.get(reconstructed_url, stream=True)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/requests/api.py", line 73, in get
[2024-08-20T06:50:13.986Z]     return request("get", url, params=params, **kwargs)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/requests/api.py", line 59, in request
[2024-08-20T06:50:13.986Z]     return session.request(method=method, url=url, **kwargs)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 589, in request
[2024-08-20T06:50:13.986Z]     resp = self.send(prep, **send_kwargs)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/requests/sessions.py", line 703, in send
[2024-08-20T06:50:13.986Z]     r = adapter.send(request, **kwargs)
[2024-08-20T06:50:13.986Z]   File "/usr/local/lib/python3.10/dist-packages/requests/adapters.py", line 501, in send
[2024-08-20T06:50:13.986Z]     raise ConnectionError(err, request=request)
[2024-08-20T06:50:13.986Z] requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
[2024-08-20T06:50:15.848Z] make: *** [Makefile:86: target/groups/dictybase/dictybase.group] Error 1

Poking around, it may come from a hard-coded section like

    def create_from_remote_file(self, group, snapshot=True, **args):
        """
        Creates from remote GAF
        """
        import requests
        url = "http://snapshot.geneontology.org/annotations/{}.gaf.gz".format(group)
        r = requests.get(url, stream=True, headers={'User-Agent': get_user_agent(modules=[requests], caller_name=__name__)})
        p = GafParser()
        results = p.skim(r.raw)
        return self.create_from_tuples(results, **args)

in ./ontobio/assoc_factory.py (or somewhere else), called from validate.py.
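
To narrow down the "or somewhere else" part, a rough scan for hard-coded plain-HTTP URLs might help. This is only a sketch: `find_http_urls` and the `HTTP_URL` pattern are hypothetical names, not part of ontobio, and the scan assumes the relevant URLs appear as literals in `.py` files.

```python
import re
from pathlib import Path

# Matches plain-HTTP URL literals; https:// URLs are not matched.
HTTP_URL = re.compile(r"""http://[^\s"'<>)]+""")


def find_http_urls(root):
    """Return (path, line_number, url) tuples for each http:// URL under root.

    Hypothetical helper for illustration; it only looks at *.py files.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in HTTP_URL.finditer(line):
                hits.append((str(path), lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Example: scan the current checkout and list candidates for review.
    for path, lineno, url in find_http_urls("."):
        print(f"{path}:{lineno}: {url}")
```

Many hits are likely to be identifiers (for example ontology IRIs) rather than URLs that are actually fetched, so the output would still need a manual pass.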

We can mitigate this by allowing both HTTP and HTTPS connections to snapshot.geneontology.org, as we were doing last week.

A fix would be to check every use of the requests library and make sure our external calls can safely be upgraded from HTTP to HTTPS via a 301 redirect.
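
One possible shape for that fix, as a minimal sketch: request HTTPS up front instead of relying on the server to redirect. `upgrade_to_https` and `fetch_stream` are hypothetical helpers, not existing ontobio functions, and this assumes the remote hosts serve the same paths over HTTPS.

```python
from urllib.parse import urlsplit, urlunsplit


def upgrade_to_https(url):
    """Rewrite an http:// URL to https://; leave other schemes untouched."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


def fetch_stream(url, timeout=30, **kwargs):
    """GET a URL over HTTPS with streaming, following any redirects.

    Hypothetical wrapper; extra keyword arguments (e.g. headers) are
    passed straight through to requests.get.
    """
    # Imported lazily, mirroring the existing create_from_remote_file code.
    import requests

    response = requests.get(
        upgrade_to_https(url),
        stream=True,
        allow_redirects=True,  # already the default for GET; stated for clarity
        timeout=timeout,
        **kwargs,
    )
    # Fail loudly on 4xx/5xx instead of parsing an error page as a GAF.
    response.raise_for_status()
    return response
```

With something like this, the call in `create_from_remote_file` could become `r = fetch_stream(url, headers={...})`, keeping the existing User-Agent header, while the URL template itself could simply be switched to `https://`.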

@kltm kltm added the bug label Aug 20, 2024

kltm commented Aug 20, 2024

This does not currently have a priority.
