Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release/3.0.0 #372

Draft
wants to merge 27 commits into
base: master
Choose a base branch
from
Draft

Release/3.0.0 #372

wants to merge 27 commits into from

Conversation

pondzix
Copy link
Contributor

@pondzix pondzix commented Sep 30, 2024

WIP

adatzer and others added 26 commits September 11, 2024 10:53
The default was changed to a naively long value when we implemented the feature - 60s is more reasonable
Refactors the HTTP target to handle batches of data in one go, adds the ability to provide a template to get those batches of data into a request body, and adds a helper function to cover the case where there is no template provided.

This involves breaking changes - the previous behaviour can be achieved by configuring a template if the data is already JSON, or a transformation + a template if not - but we don't believe there are any use cases requiring us to do so.

- Refactors the http target to handle batches
- Adds configuration of batch sizes for http target
- Adds a templater function
- Adds a helper function which creates a JSON array from the data where no template is provided
- Adds a function to group data by dynamic headers where that is configured - to ensure that this feature still works
- Adds tests for these features

These features also introduce a new possibility for failure in the HTTP target. The http target now expects valid JSON (previously it could be anything) - where we hit an error in either templating this data - or in the case of no template in transforming the data into a JSON array request body - these messages are treated as invalid, and will be sent to the failure target.

---------

Co-authored-by: Piotr Poniedziałek <pondzix@gmail.com>
So that we can reference environment variables in HTTP templates.
By using function `models.NewTargetWriteResult` instead of directly creating structure `models.TargetWriteResult`. Function makes sure fields `SentCount` and `FailedCount` are set which are later used by observer to update metrics.
`/tmp/config.hcl` is the default value for `SNOWBRIDGE_CONFIG_FILE`
environment variable when running docker image. It means it's the default location where
Snowbridge tries to find and load configuration.

Without mounting `/tmp/config.hcl` explicitly or setting custom `SNOWBRIDGE_CONFIG_FILE` variable, Snowbridge would crash. Creating empty file and saving it as `/tmp/config.hcl` in Dockerfile ensures Snowbridge doesn't fail even when none of the above is set by user. It simply runs using defaults.
We don't want explicit null to be present on JQ output, so we have to
get rid of them somehow. As there is no option neither in `gojq` nor
`json.Marshal` allowing us to to that, it seems the best way is to simply
iterate over a map (we have a map between `gojq` output and `json.Marshal` input in transformation) and filter out fields being nulls.
Currently retrying [configuration](https://github.com/snowplow/snowbridge/blob/c8c94f8a13367260f9a1a280cdd5d86098f66521/cmd/cli/cli.go#L213) is hardcoded to 5 attempts with exponential delay (inital 1 second) for each type of an error.

This commit brings a couple of improvements to retrying:

* Make retrying settings configurable from HCL. Max attempts and delay can now be customized.
* Split retrying strategies into 2 categories: transient and setup. Target can now signal what kind of an error has happened and what kind of retrying strategy should be applied.
* HTTP target can now return setup errors by the new response rules configuration. These rules also allows the target to match invalid data. Previously it was either failure (retried) or oversized data.
* Eventually setup errors will also trigger another actions like toggling application health status and sending monitoring alerts. For now behaviour for setup errors is basically the same as for transient errors, but extending behaviour for setup errors should be relatively easy with this structure.
* Rename `epoch` to `epochMillis`

* Add second-granularity epoch function

Including a test of chaining jq commands
* Add boolean config to enable TLS

* Fix config test
AWS sdk not updated as it broke e2e tests
* Changed default delay for transient error from 1 to 2. When we start transient retrying we know we've already have at least one write attempt (first layer of retrying is for setup-like errors). To preserve previous 'aggressiveness' of retrying we need to shift initial delay from 1 to 2 seconds.
* Changed log messages for retries
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants