Design Overview

The data processing pipeline is split into two main elements:

  • Data gathering and aggregation
  • Data consumption / reporting

Data gathering tasks:

The data gathering tasks collect data from the Develocity API and aggregate it into summaries that can be used for reporting. The data for this stage must be purely additive so that it can be accumulated over time through a reduction operation: the results of one build are added to the results of the next, and so on. The data collected in this stage is an intermediate format and is not intended for direct consumption.
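
The additive requirement can be illustrated with a minimal Kotlin sketch. The `BuildSummary` type and its fields are hypothetical and are not the plugin's actual intermediate format; the point is only that every field is a count or a total, so summaries combine by addition.

```kotlin
// Hypothetical intermediate format (not the plugin's real type): every field
// is a count or a total, so two summaries combine by plain addition.
data class BuildSummary(
    val buildCount: Long = 0,
    val failedBuildCount: Long = 0,
    val totalBuildMillis: Long = 0,
) {
    // Addition is associative and commutative, so summaries can be combined
    // in any order and in any grouping (per hour, per day, per window).
    operator fun plus(other: BuildSummary) = BuildSummary(
        buildCount = buildCount + other.buildCount,
        failedBuildCount = failedBuildCount + other.failedBuildCount,
        totalBuildMillis = totalBuildMillis + other.totalBuildMillis,
    )
}

// Reduce any number of summaries into one; an empty list yields the zero value.
fun reduceSummaries(summaries: List<BuildSummary>): BuildSummary =
    summaries.fold(BuildSummary()) { acc, next -> acc + next }

fun main() {
    val perBuild = listOf(
        BuildSummary(buildCount = 3, failedBuildCount = 1, totalBuildMillis = 90_000),
        BuildSummary(buildCount = 2, failedBuildCount = 0, totalBuildMillis = 40_000),
    )
    println(reduceSummaries(perBuild))
}
```

Note that the sketch stores a total duration rather than an average. Averages are not additive, so a derived value such as mean build time is better computed at reporting time from the accumulated totals.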

Multiple summaries can be created as a result of this stage of processing. Summarizers are registered with the plugin and have the responsibility of processing the raw build data and producing the intermediate data format.

Hourly tasks

  • Query the Develocity server for the list of builds within the hour
  • For each configured summarizer:
    • Process each build and produce a summary file
    • Reduce the summaries together, persisting the result to disk by summarizer ID

Daily tasks

  • Depend upon the outputs of the hourly tasks for the configured day
  • For each configured summarizer:
    • Load the hourly summary files
    • Reduce the summaries together, persisting the result to disk by summarizer ID

Time window tasks (e.g., last 7 days)

  • Depend upon the outputs of the hourly and/or daily tasks needed to satisfy the requested time window
  • For each configured summarizer:
    • Load the hourly and/or daily summary files
    • Reduce the summaries together, persisting the result to disk by summarizer ID
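
As a rough illustration of how a duration window might map onto daily outputs, the sketch below expands a window such as P7D into the calendar days it covers. The function name and the boundary rule (the window ends the day before `today`) are assumptions made for this example; the plugin's real rules may differ.

```kotlin
import java.time.LocalDate
import java.time.Period

// Assumption: a window such as P7D covers the whole days that end the day
// before `today`. The plugin's actual boundary handling may differ.
fun daysInWindow(window: Period, today: LocalDate): List<LocalDate> {
    val start = today.minus(window)
    return generateSequence(start) { it.plusDays(1) }
        .takeWhile { it.isBefore(today) }
        .toList()
}

fun main() {
    // Each date printed here corresponds to one daily output the window needs.
    daysInWindow(Period.parse("P7D"), LocalDate.parse("2024-10-28"))
        .forEach(::println)
}
```

With these inputs the window spans 2024-10-21 through 2024-10-27. Because the summaries are additive, the window result is simply the reduction of the summaries for those days.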

All data gathering tasks are performed only in the root project.

Consumable Configurations

To expose the root project's data to the consuming projects, the root project defines consumable Configurations that export the data produced by the data gathering tasks. These Configurations use the same naming conventions as the data gathering tasks.
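
For readers unfamiliar with consumable Configurations, the following Gradle Kotlin DSL fragment sketches the general shape of one. The plugin creates these itself, so this is illustration only: the task name, attribute keys, attribute types, and output path are all hypothetical stand-ins rather than the plugin's real values.

```kotlin
// Root project build.gradle.kts (illustrative only; the plugin does this itself).
import org.gradle.api.attributes.Attribute

// Hypothetical attribute keys and types; the plugin defines its own constants.
val summarizerAttribute = Attribute.of("example.summarizer", String::class.java)
val timeSpecAttribute = Attribute.of("example.timeSpec", String::class.java)

// Hypothetical stand-in for a data gathering task that writes a directory.
val outputDir = layout.buildDirectory.dir("example-metrics/P7D")
val gatherP7D = tasks.register("exampleGather-P7D") {
    outputs.dir(outputDir)
}

// Same name as the task, mirroring the shared naming convention.
configurations.create("exampleGather-P7D") {
    isCanBeConsumed = true   // other projects may select this variant
    isCanBeResolved = false  // the root project never resolves it
    attributes {
        attribute(summarizerAttribute, "all")
        attribute(timeSpecAttribute, "P7D")
    }
    // Publish the directory and record which task produces it, so that
    // consumers pick up the task dependency automatically.
    outgoing.artifact(outputDir) {
        builtBy(gatherP7D)
    }
}
```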

Pre-created Configurations

For sub-projects to consume the data produced by the root project's data gathering tasks, the root project must be told which Configurations to pre-create, because the desired Configuration must already exist at configuration time.

There are two mechanisms built to satisfy this requirement:

  • When the plugin is applied as a settings plugin, it inspects the requested task list for any task name suffixed with -<timeSpec>, where <timeSpec> is either a datetime specification (e.g., 2024-10-21) or a duration specification (e.g., P7D). The plugin then pre-creates the Configuration for that task automatically, with no additional effort required.
  • If task name detection does not satisfy the needs of the consuming project (e.g., if the task name does not contain a time specification), the metricsForDevelocityConfigurations Gradle property may be supplied with a comma-delimited list of time specifications for which to pre-create Configurations. This is a bit hacky, but it works in lieu of a Gradle API for dynamic Configuration creation (xref: gradle/gradle#30831).
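
The two mechanisms above can be sketched in plain Kotlin. This is not the plugin's implementation: the function names are hypothetical, and the sketch assumes a time specification is either an ISO date or an ISO period, which may be narrower than what the plugin accepts (for example, hour-level specifications are not handled here).

```kotlin
import java.time.LocalDate
import java.time.Period

// Assumption: a time spec is an ISO date (2024-10-21) or an ISO period (P7D).
fun isTimeSpec(text: String): Boolean =
    runCatching { LocalDate.parse(text) }.isSuccess ||
        runCatching { Period.parse(text) }.isSuccess

// A date contains hyphens itself, so splitting on the last hyphen is not
// enough. Try the text after each hyphen, left to right, and keep the first
// candidate that parses.
fun timeSpecOf(taskName: String): String? =
    taskName.indices
        .filter { taskName[it] == '-' }
        .map { taskName.substring(it + 1) }
        .firstOrNull(::isTimeSpec)

// Split a comma-delimited property value into individual time specs.
fun parseTimeSpecList(value: String): List<String> =
    value.split(',').map(String::trim).filter(String::isNotEmpty)

fun main() {
    println(timeSpecOf("report-2024-10-21"))
    println(timeSpecOf("my-report-P7D"))
    println(timeSpecOf("build"))
    println(parseTimeSpecList("2024-10-21, P7D"))
}
```

As with any Gradle project property, the value can be supplied on the command line, for example `-PmetricsForDevelocityConfigurations=2024-10-21,P7D`, or in gradle.properties.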

Basic task wiring

To simplify the chore of wiring up the consuming projects, the TaskProviderExtensions file provides a set of extension functions that can be used to wire up an individual summarizer's output of the data gathering task to a consuming project's TaskProvider<out MetricSummarizerTask> instance.

Advanced task wiring

For more advanced use cases where the provided extensions are inadequate, the consuming project can create a resolvable Gradle Configuration to refer to the root project's Configuration by name and attribute configuration. When this approach is used, the resolvable Configuration should specify the following attributes:

  • SUMMARIZER_ATTRIBUTE with a value of [SUMMARIZER_ALL], which will result in a directory of all summarizer outputs.
  • TIME_SPEC_ATTRIBUTE with a value of the datetime or duration specification (e.g., 2024-10-21 or P7D)

At this point, the summarizer output file would ideally be selected via an artifact transform provided by this plugin. Unfortunately, the Gradle API for this is not quite workable at this time. For now, use the output directory provided and directly resolve the file within it that is named after the summarizer's ID. See the TaskProviderExtensions helper method implementations for an example of how this is done.
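
A rough Gradle Kotlin DSL sketch of the advanced approach follows. The attribute keys, their String types, the `"all"` value, the Configuration name, and the summarizer ID are placeholders: a real build would use the plugin's SUMMARIZER_ATTRIBUTE, TIME_SPEC_ATTRIBUTE, and SUMMARIZER_ALL, whose exact types are not shown in this document. The sketch also relies on attribute matching alone to select the root project's Configuration, which is an assumption about how the reference is made.

```kotlin
// Consuming project build.gradle.kts (illustrative sketch).
import org.gradle.api.attributes.Attribute

// Placeholders: substitute the plugin's SUMMARIZER_ATTRIBUTE,
// TIME_SPEC_ATTRIBUTE, and SUMMARIZER_ALL in a real build.
val summarizerAttribute = Attribute.of("example.summarizer", String::class.java)
val timeSpecAttribute = Attribute.of("example.timeSpec", String::class.java)
val summarizerAll = "all"

val metrics = configurations.create("exampleMetricsP7D") {
    isCanBeConsumed = false
    isCanBeResolved = true
    attributes {
        attribute(summarizerAttribute, summarizerAll)
        attribute(timeSpecAttribute, "P7D")
    }
}

dependencies {
    // Depend on the root project; the attributes above select the variant.
    add(metrics.name, project(":"))
}

// Lazily locate one summarizer's file inside the resolved directory.
// The directory holds one file per summarizer, named by summarizer ID.
val summarizerId = "exampleSummarizer" // hypothetical ID
val summaryFile = metrics.elements.map { dirs ->
    dirs.single().asFile.resolve(summarizerId)
}
```

Because `summaryFile` is a lazy provider derived from the Configuration, nothing is resolved until the provider is queried. To carry the producing task dependency along, the Configuration itself can additionally be registered as a task input.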

Data consumer / reporting tasks

Data consumer / reporting tasks are configured to consume the aggregated data from the first stage of the pipeline and produce data or reports in their final form. These tasks are to be implemented as needed to satisfy the requirements of the project.