Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server timestamps in logs received from server #434

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

korowa
Copy link

@korowa korowa commented May 13, 2024

Checklist:

  • Add tests that demonstrate the correct behavior of the change. Tests should fail without the change.
  • Add or update relevant docs, in the docs folder and in code.
  • Ensure PR doesn't contain untouched code reformatting: spaces, etc.
  • Run flake8 and fix issues.
  • Run pytest no tests failed. See https://clickhouse-driver.readthedocs.io/en/latest/development.html.

Context:

For certain types of queries (e.g. backup/restore operations) ClickHouse server might trigger sending logs to the client with significant delay. In such cases, if client has send_logs_level setting enabled, it might be helpful (or less confusing) to also have server event timestamp on client side.

This PR adds server timestamp to log messages by default, and also adds an ability to disable log attributes concatenation to a single string message (which may be helpful in case of structured logging, to avoid parsing the string back to original attributes).

tests/test_blocks.py Outdated Show resolved Hide resolved
clickhouse_driver/log.py Outdated Show resolved Hide resolved
@korowa korowa marked this pull request as draft May 13, 2024 11:44
@coveralls
Copy link

coveralls commented May 14, 2024

Coverage Status

coverage: 96.26% (+0.009%) from 96.251%
when pulling 2a17ea5 on korowa:server-log-timestamp
into 59ea586 on mymarilyn:master.

@korowa korowa force-pushed the server-log-timestamp branch 3 times, most recently from 71b0109 to af09da7 Compare May 16, 2024 07:20
@korowa korowa changed the title preserve server log timestamp in clickhouse logs server timestamps in logs received from server May 16, 2024
@korowa korowa marked this pull request as ready for review May 16, 2024 07:20
@korowa korowa requested a review from xzkostyan May 20, 2024 06:37
@korowa
Copy link
Author

korowa commented May 24, 2024

@xzkostyan could you please take a look when you have time?

@xzkostyan
Copy link
Member

Replacing client datetime with server datetime breaks user experience.

Example: I'm digging into some application issue and grep-ing logs by date/datetime. If server has significant time drift from client there will be no logs in grep output in application logs.

I'd suggest to add another timestamp into log for server timestamp. Result format will be:

        logger.info(
            '[ %s ] [ %s ] [ %s ] {%s} <%s> %s: %s',
            row['server_timestamp'],
            row['host_name'],
            thread_id,
            row['query_id'],
            priority,
            row['source'],
            row['text']
            row['text']
        )

Result log will be

2018-12-14 10:24:53,875 INFO     clickhouse_driver.log: [2018-12-14 10:22:53,875] [ klebedev-ThinkPad-T460 ] [ 25 ] {b328ad33-60e8-4012-b4cc-97f44a7b28f2} <Debug> MemoryTracker: Peak memory usage (for query): 40.23 KiB.

What do you think?

@korowa korowa marked this pull request as draft June 24, 2024 15:28
@korowa korowa force-pushed the server-log-timestamp branch 2 times, most recently from a332f0f to e83c2da Compare June 27, 2024 08:36
Copy link
Contributor

@azat azat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@korowa I guess it is not a draft anymore?

clickhouse_driver/log.py Outdated Show resolved Hide resolved
clickhouse_driver/log.py Show resolved Hide resolved
row['thread_id'] = row.get('thread_id') or row['thread_number']

# put log block row into LogRecord extra
extra = {"server"+k: v for k, v in row.items()}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this will be better?

Suggested change
extra = {"server"+k: v for k, v in row.items()}
extra = {"clickhouse"+k: v for k, v in row.items()}

Also I don't see how does _ added here, can you clarify?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't -- all test failres has been caused by this.

Regarding attribute prefix -- "server" simply is shorter and doesn't conflict with any of LogRecord attributes, so it seemed to be sufficient, but I don't have any string preferences here.

docs/features.rst Outdated Show resolved Hide resolved
@korowa korowa force-pushed the server-log-timestamp branch 2 times, most recently from be19911 to bb16df5 Compare July 1, 2024 07:43
@korowa korowa marked this pull request as ready for review July 1, 2024 07:53
@korowa
Copy link
Author

korowa commented Jul 1, 2024

@xzkostyan I've added server timestamp to the log message as a default behaviour.

In addition, now there is an ability to disable this concatenation, and set log format, using any of ClickHouse log row attributes, which are passed as extra to the log call.

return True


def configure_logger(raw_log_record=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW defaults can be error prone, and in this particular case it could be avoided, the function is not very common. So, I would rather remove the default here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants