Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add packet trimming API #2077

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

marian-pritsak
Copy link
Contributor

When a packet is lost, it can be recovered through fast retransmission (e.g., Go-Back-N in RoCE) or by using timeouts. Retransmission triggered by timeouts typically incurs significant latency. Packet trimming aims to facilitate rapid packet loss notification and, consequently, eliminate slow timeout-based retransmissions.

@marian-pritsak marian-pritsak changed the title Add packet trimming API (#7) Add packet trimming API Sep 19, 2024
inc/saibuffer.h Outdated Show resolved Hide resolved
* @flags CREATE_AND_SET
* @default 0
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_DSCP_VALUE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should DSCP and QUEUE_INDEX be on a more granular level like on every buffer profile or port?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be, but the initial proposal is to have a global setting, which is enough for now.

inc/saiacl.h Outdated Show resolved Hide resolved
* @flags CREATE_AND_SET
* @default 0
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is expected behavior if TRIMMING_QUEUE experiences congestion ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the packet

@rlhui rlhui added the reviewed PR is discussed in SAI Meeting label Sep 19, 2024
* @flags CREATE_AND_SET
* @default 0
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP, SAI_SWITCH_ATTR_QOS_TC_AND_COLOR_TO_DSCP_MAP, SAI_SWITCH_ATTR_PACKET_TRIMMING_DSCP_VALUE and SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX are all configured and if there are any conflicts between these configuration, what is the expected behavior ?
Different ASICs might want to resolve this in different ways

@helloanandhi
Copy link

Can you also add the spec you are referring for this trimming feature ?

@marian-pritsak
Copy link
Contributor Author

Can you also add the spec you are referring for this trimming feature ?

I'm not aware of any RFC or IEEE spec, but there are papers describing this feature, like this one.

inc/saibuffer.h Show resolved Hide resolved
* @flags CREATE_AND_SET
* @default 0
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX,
Copy link
Contributor

@JaiOCP JaiOCP Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically the domain will be configured with TRIM_DSCP value and its much better to derive to the queue from the DSCP. Current proposal dis-associates the TRIM_DSCP and queue index i.e. one can configure DSCP_TRIM that may be handled by a certain queue in hw and SAI may configure wrong queue index.
Would it better to use qos map profile configured for the switch. We can derive DSCP to Queue mapping based on it.
SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP
and
SAI_SWITCH_ATTR_QOS_DSCP_TO_TC_MAP

Instead of hardcoding the queue index.

│ │ │ │ │ │ │ │ │ │ │ │ │ ││
│ │ │ │ │ \ / │ │ │ │ │ │ │ │ ││
│ │ │ │ │ \ / │ │ │ │ │ │ │ │ ││
│ Packet │ │ Pipeilne ┼────┼───────\────────► │ │ │ │ │ │ │ ││ Queue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Pipeilne/Pipeline/

Copy link
Contributor

@JaiOCP JaiOCP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving conditionally

  • Update the workflow using the TC derived queue and dscp.
  • Add TRIM packet/byte stats to queue and to port.

* @flags CREATE_AND_SET
* @default 0
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marian-pritsak How does the NOS knows what queue index to program in this case? Capability query will not return a support queue index value. We should let HW decide what queue index it will support for TRIM and make this attribute as READ_ONLY.

Else please provide a workflow that how NOS is supposed to know the queue index

inc/saibuffer.h Outdated
* Default action. Packet has nowhere to go
* and will be dropped.
*/
SAI_BUFFER_PROFILE_PACKET_ADMISSION_FAIL_ACTION_DROP = 0x00000000,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this is just enum you don't need to explicitly assign any values, you can remove them to make code more clean

* @flags CREATE_AND_SET
* @default 128
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_SIZE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are confusing. One can interpret them as that trimming can be done lesser then configured byte upto maximum size.

We should change it keep it consistent with sample truncation attribute comments
"@brief Truncate size. Truncate sampled packets to this size to reduce traffic bandwidth"

Something like "Trim packet to this size"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we add an attribute to specify a min size threshold for packets to qualify for trimming? As an example - a packet can only be trimmed if its size exceeds a min threshold Len1 and if trimmed, it will be trimmed to size Len2, where Len1 >= Len2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we add an attribute to specify a min size threshold for packets to qualify for trimming? As an example - a packet can only be trimmed if its size exceeds a min threshold Len1 and if trimmed, it will be trimmed to size Len2, where Len1 >= Len2.

Ashutosh,

I am not sure about adding a min size. Truncation size is the min and max both. We use similar semantics for Sampling as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed description

* @flags CREATE_AND_SET
* @default 128
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_SIZE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, should we add an attribute to specify a min size threshold for packets to qualify for trimming? As an example - a packet can only be trimmed if its size exceeds a min threshold Len1 and if trimmed, it will be trimmed to size Len2, where Len1 >= Len2.

Ashutosh,

I am not sure about adding a min size. Truncation size is the min and max both. We use similar semantics for Sampling as well.

* @flags CREATE_AND_SET
* @default 128
*/
SAI_SWITCH_ATTR_PACKET_TRIMMING_SIZE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marian-pritsak
We need to also clarify that trimming a packet MUST update the IP length and IP checksum.
L4 is not touched i.e. for eg UDP length and checksum is not updated

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this comment to the action description in saibuffer.h

@tjchadaga
Copy link
Collaborator

@marian-pritsak - could you please resolve branch conflicts and meta checker issues?

@JaiOCP
Copy link
Contributor

JaiOCP commented Oct 7, 2024 via email

When a packet is lost, it can be recovered through fast retransmission (e.g., Go-Back-N in RoCE) or by using timeouts. Retransmission triggered by timeouts typically incurs significant latency. Packet trimming aims to facilitate rapid packet loss notification and, consequently, eliminate slow timeout-based retransmissions.

To help the host recover data more quickly and accurately, we introduce a packet trimming feature, that upon a failed packet admission to a shared buffer,
will trim a packet to a configured size, and try sending it on a different queue to deliver a packet drop notification to an end host.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will the trimmed packet's headers look like? Will it be the same as the original packet with the exception of the length and DSCP information?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will the trimmed packet's headers look like? Will it be the same as the original packet with the exception of the length and DSCP information?

Yes, TRIM packets are original packets, trimmed to the specified length and IP length updated

@JaiOCP
Copy link
Contributor

JaiOCP commented Oct 10, 2024

Can you also add the spec you are referring for this trimming feature ?

I'm not aware of any RFC or IEEE spec, but there are papers describing this feature, like this one.

This is part of a UEC Spec and is not yet released.

@JaiOCP
Copy link
Contributor

JaiOCP commented Oct 11, 2024

@marian-pritsak Can we change general nomenclature of the word TRIMMING.

TRIMMING is a function but once the function is achieved, packet is TRIMMED.

we can either change the attributes to
where PACKET_TRIMMING as a feature appended by the function on the packet.

SAI_SWITCH_ATTR_PACKET_TRIMMING_SIZE -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRUNCATION_SIZE

SAI_SWITCH_ATTR_PACKET_TRIMMING_DSCP_VALUE -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRIMED_PACKET_DSCP_VALUE

SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX ->
SAI_SWITCH_ATTR_PACKET_TRIMMING_TRIMED_PACKET_QUEUE_INDEX

What do you think?

@tushar-ty
Copy link
Collaborator

tushar-ty commented Oct 11, 2024

@marian-pritsak Can we change general nomenclature of the word TRIMMING.

TRIMMING is a function but once the function is achieved, packet is TRIMMED.

we can either change the attributes to where PACKET_TRIMMING as a feature appended by the function on the packet.

SAI_SWITCH_ATTR_PACKET_TRIMMING_SIZE -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRUNCATION_SIZE

SAI_SWITCH_ATTR_PACKET_TRIMMING_DSCP_VALUE -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRIMED_PACKET_DSCP_VALUE

SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRIMED_PACKET_QUEUE_INDEX

What do you think?

A more simpler option would be to remove the grammar from the names completely and simply use "TRIM".
The purpose and intent should be conveyed through documentation and comments.

@marian-pritsak
Copy link
Contributor Author

@marian-pritsak Can we change general nomenclature of the word TRIMMING.
TRIMMING is a function but once the function is achieved, packet is TRIMMED.
we can either change the attributes to where PACKET_TRIMMING as a feature appended by the function on the packet.
SAI_SWITCH_ATTR_PACKET_TRIMMING_SIZE -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRUNCATION_SIZE
SAI_SWITCH_ATTR_PACKET_TRIMMING_DSCP_VALUE -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRIMED_PACKET_DSCP_VALUE
SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX -> SAI_SWITCH_ATTR_PACKET_TRIMMING_TRIMED_PACKET_QUEUE_INDEX
What do you think?

A more simpler option would be to remove the grammar from the names completely and simply use "TRIM". The purpose and intent should be conveyed through documentation and comments.

I like Tushar's idea to remove grammar entirely. What about you, @JaiOCP ?

inc/saiswitch.h Outdated
/**
* @brief Is the new queue index for a trimmed packet mapped from DSCP
*
* @type sai_bool_t

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marian-pritsak is this a valid type?

Looks like compilation fails with:

ERROR: invalid enum specified 'sai_bool_t' on SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX_MAPPED_FROM_DSCP
ERROR: sai_bool_t == SAI_SWITCH_ATTR_PACKET_TRIMMING_QUEUE_INDEX_MAPPED_FROM_DSCP not ending on enum name BOOL

Are we missing a declaration?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type should be bool

inc/saiswitch.h Outdated
@@ -3048,7 +3048,6 @@ typedef enum _sai_switch_attr_t
*/
SAI_SWITCH_ATTR_MAX_ICMP_ECHO_SESSION,

/**

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marian-pritsak is this a typo?

When a packet is lost, it can be recovered through fast retransmission
(e.g., Go-Back-N in RoCE) or by using timeouts.
Retransmission triggered by timeouts typically incurs significant
latency. Packet trimming aims to facilitate rapid packet loss
notification and, consequently, eliminate slow timeout-based
retransmissions.

Signed-off-by: Marian Pritsak <marianp@mellanox.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
reviewed PR is discussed in SAI Meeting
Projects
None yet
Development

Successfully merging this pull request may close these issues.