Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation updates for event and metric triggers #12366

Merged
merged 5 commits into from
Mar 21, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
126 changes: 105 additions & 21 deletions docs/concepts/automations.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ search:

Automations in Prefect Cloud enable you to configure [actions](#actions) that Prefect executes automatically based on [trigger](#triggers) conditions.

Potential triggers include the occurrence of events from changes in a flow run's state - or the absence of such events. You can event define your own custom trigger to fire based on an [event](/cloud/events/) created from a webhook or a custom event defined in Python code.
Potential triggers include the occurrence of events from changes in a flow run's state - or the absence of such events. You can even define your own custom trigger to fire based on an [event](/cloud/events/) created from a webhook or a custom event defined in Python code.

Potential actions include kicking off flow runs, pausing schedules, and sending custom notifications.

Expand Down Expand Up @@ -47,10 +47,10 @@ On the **Automations** page, select the **+** icon to create a new automation. Y

### Triggers

Triggers specify the conditions under which your action should be performed. Triggers can be of several types, including triggers based on:
Triggers specify the conditions under which your action should be performed. The Prefect UI includes templates for many common conditions, such as:

- Flow run state change
- Note - Flow Run Tags currently are only evaluated with `OR` criteria
- Note - Flow Run Tags currently are only evaluated with `OR` criteria
- Work pool status
- Work queue status
- Deployment status
Expand Down Expand Up @@ -104,28 +104,32 @@ Specify a name and, optionally, a description for the automation.

## Custom triggers

Custom triggers allow advanced configuration of the conditions on which an automation executes its actions. Several custom trigger fields accept values that end with trailing wildcards, like `"prefect.flow-run.*"`.
When you need a trigger that doesn't quite fit the templates in UI trigger builder, you can define a custom trigger in JSON. With custom triggers, you have access to the full capabilities of Prefect's automation system - allowing you to react to many kinds of events and metrics in your workspace.

![Viewing a custom trigger for automations for a workspace in Prefect Cloud.](/img/ui/automations-custom.png)
Each automation has a single trigger that, when fired, will cause all of its associated actions to run. That single trigger may be a reactive or proactive event trigger, a trigger monitoring the value of a metric, or a composite trigger that combines several underlying triggers.

### Event triggers

The schema that defines a trigger is as follows:
Event triggers are the most common types of trigger, and they are intended to react to the presence or absence of an event happening in your workspace. Event triggers are indicated with `{"type": "event"}`.

| Name | Type | Supports trailing wildcards | Description |
| ------------------ | ------------------ | --------------------------- | ----------- |
| **match** | object | :material-check: | Labels for resources which this Automation will match. |
| **match_related** | object | :material-check: | Labels for related resources which this Automation will match. |
| **after** | array of strings | :material-check: | Event(s), one of which must have first been seen to start this automation. |
| **expect** | array of strings | :material-check: | The event(s) this automation is expecting to see. If empty, this automation will evaluate any matched event. |
| **for_each** | array of strings | :material-close: | Evaluate the Automation separately for each distinct value of these labels on the resource. By default, labels refer to the primary resource of the triggering event. You may also refer to labels from related resources by specifying `related:<role>:<label>`. This will use the value of that label for the first related resource in that role. |
| **posture** | string enum | N/A | The posture of this Automation, either Reactive or Proactive. Reactive automations respond to the presence of the expected events, while Proactive automations respond to the absence of those expected events. |
| **threshold** | integer | N/A | The number of events required for this Automation to trigger (for Reactive automations), or the number of events expected (for Proactive automations) |
| **within** | number | N/A | The time period over which the events must occur. For Reactive triggers, this may be as low as 0 seconds, but must be at least 10 seconds for Proactive triggers |
![Viewing a custom trigger for automations for a workspace in Prefect Cloud.](/img/ui/automations-custom.png)

The schema that defines an event trigger is as follows:

| Name | Type | Supports trailing wildcards | Description |
| ----------------- | ---------------- | --------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **match** | object | :material-check: | Labels for resources which this Automation will match. |
| **match_related** | object | :material-check: | Labels for related resources which this Automation will match. |
| **posture** | string enum | N/A | The posture of this Automation, either Reactive or Proactive. Reactive automations respond to the presence of the expected events, while Proactive automations respond to the absence of those expected events. |
| **after** | array of strings | :material-check: | Event(s), one of which must have first been seen to start this automation. |
| **expect** | array of strings | :material-check: | The event(s) this automation is expecting to see. If empty, this automation will evaluate any matched event. |
| **for_each** | array of strings | :material-close: | Evaluate the Automation separately for each distinct value of these labels on the resource. By default, labels refer to the primary resource of the triggering event. You may also refer to labels from related resources by specifying `related:<role>:<label>`. This will use the value of that label for the first related resource in that role. |
| **threshold** | integer | N/A | The number of events required for this Automation to trigger (for Reactive automations), or the number of events expected (for Proactive automations) |
| **within** | number | N/A | The time period over which the events must occur. For Reactive triggers, this may be as low as 0 seconds, but must be at least 10 seconds for Proactive triggers |

### Resource matching

`match` and `match_related` control which events a trigger considers for evaluation by filtering on the contents of their `resource` and `related` fields, respectively. Each label added to a `match` filter is `AND`ed with the other labels, and can accept a single value or a list of multiple values that are `OR`ed together.
Both the `event` and `metric` triggers support matching events for specific resources in your workspace, including most Prefect objects (like flows, deployment, blocks, work pools, tags, etc) as well as resources you have defined in any events you emit yourself. The `match` and `match_related` fields control which events a trigger considers for evaluation by filtering on the contents of their `resource` and `related` fields, respectively. Each label added to a `match` filter is `AND`ed with the other labels, and can accept a single value or a list of multiple values that are `OR`ed together.

Consider the `resource` and `related` fields on the following `prefect.flow-run.Completed` event, truncated for the sake of example. Its primary resource is a flow run, and since that flow run was started via a deployment, it is related to both its flow and its deployment:

Expand Down Expand Up @@ -265,7 +269,7 @@ The following configuration uses `after` to prevent this automation from firing

### Evaluation strategy

All of the previous examples were designed around a reactive `posture` - that is, count up events toward the `threshold` until it is met, then execute actions. To respond to the absence of events, use a proactive `posture`. A proactive trigger will fire when its `threshold` has _not_ been met by the end of the window of time defined by `within`. Proactive triggers must have a `within` of at least 10 seconds.
All of the previous examples were designed around a reactive `posture` - that is, count up events toward the `threshold` until it is met, then execute actions. To respond to the absence of events, use a proactive `posture`. A proactive trigger will fire when its `threshold` has _not_ been met by the end of the window of time defined by `within`. Proactive triggers must have a `within` value of at least 10 seconds.

The following trigger will fire if a `prefect.flow-run.Completed` event is not seen within 60 seconds after a `prefect.flow-run.Running` event is seen.

Expand Down Expand Up @@ -310,6 +314,86 @@ However, without `for_each`, a `prefect.flow-run.Completed` event from a _differ
}
```


### Metric triggers

Metric triggers (`{"type": "metric"}`) fire when the value of a metric in your workspace crosses a threshold you've defined. For example, you can trigger an automation when the success rate of flows in your workspace drops below 95% over the course of an hour.

Prefect's metrics are all derived by examining your workspace's events, and if applicable, use the `occurred` times of those events as the basis for their calculations.

Prefect defines three metrics today:

* **Successes** (`{"name": "successes"}`), defined as the number of flow runs that went `Pending` and then the latest state we saw was not a failure (`Failed` or `Crashed`). This metric accounts for retries if the ultimate state was successful.
* **Duration** (`{"name": "duration"}`), defined as the _length of time_ that a flow remains in a `Running` state before transitioning to a terminal state such as `Completed`, `Failed`, or `Crashed`. Because this time is derived in terms of flow run state change events, it may be greater than the runtime of your function.
* **Lateness** (`{"name": "lateness"}`), defined as the _length of time_ that a `Scheduled` flow remains in a `Late` state before transitioning to a `Running` and/or `Crashed` state. Only flow runs that the system marks `Late` are included.

The schema of a metric trigger is as follows:

| Name | Type | Supports trailing wildcards | Description |
| ----------------- | -------------------- | --------------------------- | -------------------------------------------------------------- |
| **match** | object | :material-check: | Labels for resources which this Automation will match. |
| **match_related** | object | :material-check: | Labels for related resources which this Automation will match. |
| **metric** | `MetricTriggerQuery` | N/A | The definition of the metric query to run |

And the `MetricTriggerQuery` query is defined as:

| Name | Type | Description |
| -------------- | ------------------------------------- | ---------------------------------------------------------------------- |
| **name** | string | The name of the Prefect metric to evaluate (see above). |
| **threshold** | number | The threshold to which the current metric value is compared |
| **operator** | string (`"<"`, `"<="`, `">"`, `">="`) | The comparison operator to use to decide if the threshold value is met |
| **range** | duration in seconds | How far back to evaluate the metric |
| **firing_for** | duration in seconds | How long the value must exceed the threshold before this trigger fires |

For example, to fire when flow runs tagged `production` in your workspace are failing at a rate of 10% or worse for the last hour (in other words, your success rate is below 90%), create this trigger:

```json
{
"type": "metric",
"match": {
"prefect.resource.id": "prefect.flow-run.*"
},
"match_related": {
"prefect.resource.id": "prefect.tag.production",
"prefect.resource.role": "tag"
},
"metric": {
"name": "successes",
"threshold": 0.9,
"operator": "<",
"range": 3600,
"firing_for": 0
}
}
```

To detect when the average lateness of your Kubernetes workloads (running on a work pool named `kubernetes`) in the last day exceeds 5 minutes late, and that number hasn't gotten better for the last 10 minutes, use a trigger like this:

```json
{
"type": "metric",
"match": {
"prefect.resource.id": "prefect.flow-run.*"
},
"match_related": {
"prefect.resource.id": "prefect.work-pool.kubernetes",
"prefect.resource.role": "work-pool"
},
"metric": {
"name": "lateness",
"threshold": 300,
"operator": ">",
"range": 86400,
"firing_for": 600
}
}
```

### Composite triggers

[more information coming soon]
chrisguidry marked this conversation as resolved.
Show resolved Hide resolved


## Create an automation via deployment triggers

To enable the simple configuration of event-driven deployments, Prefect provides deployment triggers - a shorthand for creating automations that are linked to specific deployments to run them based on the presence or absence of events.
Expand Down Expand Up @@ -340,10 +424,10 @@ You can pass one or more `--trigger` arguments to `prefect deploy`, which can be
# Pass a trigger as a JSON string
prefect deploy -n test-deployment \
--trigger '{
"enabled": true,
"enabled": true,
"match": {
"prefect.resource.id": "prefect.flow-run.*"
},
},
"expect": ["prefect.flow-run.Completed"]
}'

Expand Down Expand Up @@ -413,7 +497,7 @@ The `flow_run|ui_url` token returns the URL for viewing the flow run in Prefect
Here’s an example for something that would be relevant to a flow run state-based notification:

```
Flow run {{ flow_run.name }} entered state {{ flow_run.state.name }}.
Flow run {{ flow_run.name }} entered state {{ flow_run.state.name }}.

Timestamp: {{ flow_run.state.timestamp }}
Flow ID: {{ flow_run.flow_id }}
Expand Down