doc: add fault resolution ADR #2051

Open
wants to merge 27 commits into
base: main
Changes from 8 commits
101 changes: 101 additions & 0 deletions docs/docs/adrs/adr-018-fault-resolutions.md
@@ -0,0 +1,101 @@
---
sidebar_position: 19
title: Fault Resolutions
---
# ADR 018: Fault Resolutions

## Changelog
* 17th July 2024: Initial draft

## Status

Proposed

## Context

Partial Set Security ([PSS](./adr-015-partial-set-security.md)) allows a subset of a provider chain's validator set to secure a consumer chain.
While this shared security scheme has many advantages, it comes with a risk known as the
[subset problem](https://informal.systems/blog/replicated-vs-mesh-security#risks-of-opt-in-security-also-known-as-ics-v-2).
This problem arises when a malicious majority of validators from the provider chain collude and misbehave on a consumer chain.
This threat is particularly relevant for Opt-in chains since they might be secured by a relatively small subset of the provider's validator set.

In cases of collusion, various types of misbehaviour can be performed by the validators, such as:

* Incorrect executions to break protocol rules in order to steal funds.
* Liveness attacks to halt the chain or censor transactions.
* Oracle attacks to falsify information used by the chain logic.

Currently, these types of attacks aren't handled in PSS, leaving the malicious validators unpunished.

A potential solution is to use fraud proofs, which make it possible to prove an incorrect state transition of a chain without running a full node.
However, the technology is complex, and no fraud proof framework currently works for Cosmos chains.


To address this risk in PSS, a governance-gated slashing solution can be used until fraud proof technology matures.
Contributor:
Note that the fraud proofs will only work for incorrect execution.



This ADR proposes a fault resolution mechanism, which is a type of governance proposal that victims of faults can use to vote on the
slashing of validators that misbehave on Opt-in consumer chains.

In what follows, we describe the implementation of a fault resolution mechanism that handles incorrect executions on consumer chains,
Contributor:
Why only incorrect executions?

Contributor Author:
Because we only provide a guideline for incorrect execution fault definition in the intersubjective faults CHIPs. The handling for other types of faults will be implemented iteratively.

Contributor:
but that will not need any new implementation effort, right?

Contributor:
I would recommend making this a bit more general. From a code perspective, this handles arbitrary fault resolutions

as a first iteration.


## Decision

The proposed solution introduces a new `consumer-fault-resolution` governance proposal type to the `provider` module, which allows
validators to be penalised for committing faults on an Opt-in consumer chain.

If such a proposal passes, the proposal handler tombstones all the validators listed in the proposal and slashes them by a predefined
amount or the default value used for double-sign infractions.
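
The handler behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the real `provider` module or Cosmos SDK keepers: the `Validator` type, the `applyFaultResolution` function, the basis-point representation, the rule that a zero amount falls back to the double-sign default, and the `500` placeholder value are all assumptions made for illustration.

```go
package main

import "fmt"

// Validator is a hypothetical, simplified stand-in for a provider validator.
type Validator struct {
	Tokens     int64 // bonded tokens, in the smallest denomination
	Tombstoned bool  // true once the validator is permanently removed
}

// defaultDoubleSignSlashBps is a placeholder for the provider's double-sign
// slash fraction, in basis points (500 = 5%). The value is an assumption.
const defaultDoubleSignSlashBps int64 = 500

// applyFaultResolution tombstones and slashes every listed validator.
// slashBps is the predefined amount in basis points; as an assumption of this
// sketch, zero means "use the double-sign default".
// The caller is expected to pass a list without duplicates.
func applyFaultResolution(vals map[string]*Validator, listed []string, slashBps int64) error {
	if slashBps < 0 || slashBps > 10000 {
		return fmt.Errorf("slash fraction %d bps out of range", slashBps)
	}
	if slashBps == 0 {
		slashBps = defaultDoubleSignSlashBps
	}
	// Check every address first so that the operation is all-or-nothing.
	for _, addr := range listed {
		if _, ok := vals[addr]; !ok {
			return fmt.Errorf("unknown validator %q", addr)
		}
	}
	for _, addr := range listed {
		v := vals[addr]
		v.Tokens -= v.Tokens * slashBps / 10000
		v.Tombstoned = true
	}
	return nil
}
```

Validating every address before mutating state keeps a failed proposal from slashing only part of the list; whether the real handler should behave this way is a design choice left open here.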
Contributor:
It would be great if the predefined amount were a per-consumer-chain param.


The proposal has the following fields:

- **Description**: This field should be filled with a fault definition describing the type of misbehavior that the validators executed
on an Opt-in consumer chain. A fault definition should precisely describe how an attack was performed and why it is eligible as a slashable fault.
- **Consumer Chain**: The chain that the fault was related to.
Contributor:
This should be the consumer ID once we have permissionless ICS.

- **Validators**: The list of all the validators to be slashed.
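
As a rough illustration of the three fields listed above, the proposal content could be modelled as follows. This is a sketch only: the Go type name, the field names, and the stateless checks in `ValidateBasic` (non-empty fields, no duplicate validators) are assumptions, not part of the ADR or of the existing `provider` module.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ConsumerFaultResolutionProposal is a hypothetical container for the
// Description, Consumer Chain and Validators fields of the proposal.
type ConsumerFaultResolutionProposal struct {
	Description   string   // fault definition explaining the misbehaviour
	ConsumerChain string   // identifier of the affected Opt-in consumer chain
	Validators    []string // addresses of the validators to be slashed
}

// ValidateBasic runs stateless sanity checks. These checks are assumptions
// of this sketch; the stateful validations are defined separately.
func (p ConsumerFaultResolutionProposal) ValidateBasic() error {
	if strings.TrimSpace(p.Description) == "" {
		return errors.New("description must contain a fault definition")
	}
	if strings.TrimSpace(p.ConsumerChain) == "" {
		return errors.New("consumer chain must not be empty")
	}
	if len(p.Validators) == 0 {
		return errors.New("at least one validator must be listed")
	}
	seen := make(map[string]struct{}, len(p.Validators))
	for _, v := range p.Validators {
		if v == "" {
			return errors.New("validator address must not be empty")
		}
		if _, dup := seen[v]; dup {
			return fmt.Errorf("duplicate validator %q", v)
		}
		seen[v] = struct{}{}
	}
	return nil
}
```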

In addition, to prevent spam, users must pay a fee of `100ATOM` to submit a fault resolution to the provider.

### Validations

The submission of a fault resolution fails if any of the following conditions are not met:

- the consumer chain is an Opt-in chain
- all listed validators were opted-in to the consumer chain in the past unbonding-period
Contributor:
Suggested change
- all listed validators were opted-in to the consumer chain in the past unbonding-period
- the provided validator set matches one of the consumer validator sets from the previous unbonding period
- all the listed validators are part of the provided validator set

Contributor Author (@sainoe, Jul 17, 2024):
The provided validator set shouldn't necessarily match one of the consumer validator sets completely, right? i.e. it can be a subset.

Contributor (@jtremback, Jul 23, 2024):
@sainoe yes this is right. Perhaps only a few validators committed the fault.

- the `100ATOM` fee is provided
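
The submission checks listed above can be sketched as a single function. This is an illustration under stated assumptions: `ConsumerInfo`, `validateSubmission`, the use of Unix timestamps, and the idea of tracking the last time each validator was opted in are hypothetical and do not reflect how the provider stores opt-in history. The fee is expressed in `uatom` (1 ATOM = 1,000,000 uatom).

```go
package main

import (
	"errors"
	"fmt"
)

// requiredFeeUatom is the 100 ATOM submission fee expressed in uatom.
const requiredFeeUatom int64 = 100 * 1000000

// ConsumerInfo is a hypothetical view of the provider's state for one
// consumer chain.
type ConsumerInfo struct {
	OptIn bool // true if the chain is an Opt-in chain
	// LastOptedInUnix maps a validator address to the last Unix time at
	// which that validator was opted in (assumed bookkeeping).
	LastOptedInUnix map[string]int64
}

// validateSubmission returns an error if any submission condition is not met.
func validateSubmission(c ConsumerInfo, validators []string, feeUatom, nowUnix, unbondingSecs int64) error {
	if !c.OptIn {
		return errors.New("consumer chain is not an Opt-in chain")
	}
	cutoff := nowUnix - unbondingSecs
	for _, v := range validators {
		last, ok := c.LastOptedInUnix[v]
		if !ok || last < cutoff {
			return fmt.Errorf("validator %q was not opted in during the past unbonding period", v)
		}
	}
	if feeUatom < requiredFeeUatom {
		return fmt.Errorf("fee of %d uatom is below the required %d uatom", feeUatom, requiredFeeUatom)
	}
	return nil
}
```

Because each validator is checked individually, the listed validators may be any subset of the validators that were opted in, which matches the point raised in the review discussion that only a few validators may have committed the fault.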

### Additional considerations

Fault resolution proposals should be `expedited` to minimize the time the listed validators have
to unbond and avoid punishment (see [Expedited Proposals](https://docs.cosmos.network/v0.50/build/modules/gov#expedited-proposals)).


## Consequences

### Positive

- Provide the ability to slash and tombstone validators for committing incorrect executions on Opt-in consumer chains.

### Negative

- Assuming that malicious validators unbond immediately after misbehaving, a fault resolution has to be submitted within a maximum
of two weeks in order to slash the validators.

### Neutral

- Fault definitions need to have a clear framework in order to avoid debates about whether an attack has actually taken place.

## References

<!-- TODO: add Fault Resolution CHIPs discussion here when it's published -->

* [Enabling Opt-in and Mesh Security with Fraud Votes](https://forum.cosmos.network/t/enabling-opt-in-and-mesh-security-with-fraud-votes/10901)

* [CHIPs discussion phase: Partial Set Security](https://forum.cosmos.network/t/chips-discussion-phase-partial-set-security-updated/11775)

* [Replicated vs. Mesh Security](https://informal.systems/blog/replicated-vs-mesh-security#risks-of-opt-in-security-also-known-as-ics-v-2)



1 change: 1 addition & 0 deletions docs/docs/adrs/intro.md
@@ -47,6 +47,7 @@ To suggest an ADR, please make use of the [ADR template](./templates/adr-templat
- [ADR 011: Improving testing and increasing confidence](./adr-011-improving-test-confidence.md)
- [ADR 016: Security aggregation](./adr-016-securityaggregation.md)
- [ADR 017: ICS with Inactive Provider Validators](./adr-017-allowing-inactive-validators.md)
- [ADR 018: Fault Resolutions](./adr-018-fault-resolutions.md)

### Rejected
