watch exit unexpected #325

Closed
fighterhit opened this issue Jul 31, 2024 · 4 comments

fighterhit commented Jul 31, 2024

For service discovery in my self-hosted k8s cluster, I use the following code to watch endpoint changes in a given namespace, but after a while it exits with the exception below. What is the reason? It takes about 5 minutes from startup to the abnormal exit. Is this related to the timeout_seconds setting? Judging by the cause of the exception, it doesn't seem so, and I would expect the watch to keep observing endpoint changes by default.

  • k8s version: 1.23.13
  • kubernetes-asyncio: 23.6.0

from kubernetes_asyncio import client, watch


async def watch_endpoints():
    async with client.ApiClient() as api:
        v1 = client.CoreV1Api(api)
        # Stream endpoint events for one namespace
        async with watch.Watch().stream(v1.list_namespaced_endpoints, "MY_NS") as stream:
            async for event in stream:
                evt, obj = event["type"], event["object"]
                ips = []
                if obj.subsets:
                    for ep in obj.subsets:
                        # addresses is None when a subset has no ready addresses
                        for addr in ep.addresses or []:
                            ips.append(addr.ip)
                    print(
                        "{} {}/{} endpoints {}".format(
                            evt, obj.metadata.namespace, obj.metadata.name, ips
                        )
                    )

Task exception was never retrieved
future: <Task finished name='Task-1' coro=<watch_endpoints() done, defined at /root/k8s.py:33> exception=ApiException()>
Traceback (most recent call last):
  File "/root/k8s.py", line 37, in watch_endpoints
    async for event in stream:
  File "/usr/local/lib/python3.11/site-packages/kubernetes_asyncio/watch/watch.py", line 131, in __anext__
    return await self.next()
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kubernetes_asyncio/watch/watch.py", line 174, in next
    return self.unmarshal_event(line, self.return_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kubernetes_asyncio/watch/watch.py", line 103, in unmarshal_event
    raise client.exceptions.ApiException(status=obj['code'], reason=reason)
kubernetes_asyncio.client.exceptions.ApiException: (410)
Reason: Expired: too old resource version: 3250692444 (3250783264)

When I set timeout_seconds=600, the program still exits 5 minutes after startup, but another exception is raised.

Task exception was never retrieved
future: <Task finished name='Task-1' coro=<watch_endpoints() done, defined at /root/k8s.py:33> exception=TimeoutError()>
Traceback (most recent call last):
  File "/root/k8s.py", line 37, in watch_endpoints
    async for event in stream:
  File "/usr/local/lib/python3.11/site-packages/kubernetes_asyncio/watch/watch.py", line 131, in __anext__
    return await self.next()
           ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/kubernetes_asyncio/watch/watch.py", line 152, in next
    line = await self.resp.content.readline()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/streams.py", line 311, in readline
    return await self.readuntil()
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/streams.py", line 343, in readuntil
    await self._wait("readuntil")
  File "/usr/local/lib/python3.11/site-packages/aiohttp/streams.py", line 303, in _wait
    with self._timer:
  File "/usr/local/lib/python3.11/site-packages/aiohttp/helpers.py", line 720, in __exit__
    raise asyncio.TimeoutError from None
TimeoutError

fighterhit commented Jul 31, 2024

According to #259, _request_timeout can be passed as a parameter to watch.Watch().stream, where it ends up as the timeout parameter of aiohttp (default 5 min). Setting it avoids the TimeoutError, but once _request_timeout is reached the other exception (Reason: Expired: too old resource version...) is still thrown. At least this lets us extend the watch time.
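
The relationship between the two timeouts mentioned in this thread can be sketched with a small helper. This is only an illustration under the assumptions reported above: timeout_seconds bounds the watch on the server side, and _request_timeout becomes the aiohttp client timeout. The names build_watch_kwargs and margin are hypothetical and not part of kubernetes-asyncio.

```python
def build_watch_kwargs(server_timeout: int, margin: int = 30) -> dict:
    """Return keyword arguments for a bounded watch.

    Hypothetical helper: keeps the client-side timeout (_request_timeout)
    longer than the server-side one (timeout_seconds), so the server can
    end the watch before the HTTP client gives up.
    """
    if server_timeout <= 0 or margin <= 0:
        raise ValueError("server_timeout and margin must be positive")
    return {
        "timeout_seconds": server_timeout,
        "_request_timeout": server_timeout + margin,
    }


# Intended use (not executed here, it needs a cluster):
#   kwargs = build_watch_kwargs(600)
#   async with watch.Watch().stream(
#       v1.list_namespaced_endpoints, "MY_NS", **kwargs
#   ) as stream:
#       ...
```

This only bounds a single watch. It does not address the 410 Expired error.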

tomplus (Owner) commented Jul 31, 2024

Duplicate of #136

If you want to watch forever, it should work without _request_timeout or timeout. These 410s are the real problem here.

tomplus closed this as completed Jul 31, 2024

fighterhit commented

> Duplicate of #136
>
> If you want to watch forever, it should work without _request_timeout or timeout. These 410s are the real problem here.

@tomplus Thanks, is there a solution now?

tomplus (Owner) commented Aug 2, 2024

Not yet, but I'll take a look at it next week.
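
One possible workaround while this is open, not confirmed by the maintainer in this thread, is to reopen the watch whenever it fails with a 410 or a client timeout. The sketch below is library-agnostic: watch_with_restart, is_expired, open_stream and handle are hypothetical names, and the only assumption about the library is that ApiException exposes the HTTP code as a status attribute, as the traceback above suggests.

```python
import asyncio
from typing import Any, AsyncIterator, Callable, Optional


def is_expired(exc: BaseException) -> bool:
    """True if the error carries HTTP status 410 in a .status attribute."""
    return getattr(exc, "status", None) == 410


async def watch_with_restart(
    open_stream: Callable[[], AsyncIterator[Any]],
    handle: Callable[[Any], None],
    max_restarts: Optional[int] = None,
    backoff: float = 1.0,
) -> int:
    """Feed events to `handle`, reopening the stream after a 410 or a timeout.

    Returns the number of restarts once a stream ends without error.
    Any other exception, or exceeding max_restarts, propagates to the caller.
    """
    restarts = 0
    while True:
        try:
            async for event in open_stream():
                handle(event)
            return restarts
        except Exception as exc:
            retryable = is_expired(exc) or isinstance(exc, asyncio.TimeoutError)
            if not retryable or (max_restarts is not None and restarts >= max_restarts):
                raise
        restarts += 1
        await asyncio.sleep(backoff)  # brief pause before reopening


# Hypothetical wiring with kubernetes-asyncio (not executed here):
#   async def endpoint_events(v1):
#       async with watch.Watch().stream(v1.list_namespaced_endpoints, "MY_NS") as stream:
#           async for event in stream:
#               yield event
#
#   await watch_with_restart(lambda: endpoint_events(v1), print)
```

Each restart opens a fresh watch, so no stale resource version is reused. According to the Kubernetes API documentation, a watch started without a resource version begins with synthetic ADDED events for existing objects, so handle should tolerate seeing the same endpoint again.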
