Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASM Standalone Billing #3965

Merged
merged 36 commits into from
Oct 21, 2024
Merged

ASM Standalone Billing #3965

merged 36 commits into from
Oct 21, 2024

Conversation

vpellan
Copy link
Contributor

@vpellan vpellan commented Oct 1, 2024

What does this PR do?

This PR add the DD_EXPERIMENTAL_APPSEC_STANDALONE_ENABLED feature flag to enable ASM standalone billing (that means disable APM by limiting the number of traces as much as possible, including distributed tracing)

ASM uses traces to send AppSec events/data to the agent. Standalone ASM billing means that we don't want to charge clients for APM traces, but we still want AppSec data (which again, relies on tracing), so we want to send the minimum amount of traces possible (ideally only traces that contains security events and their downstream services if distributed tracing is enabled), but for features such as API Security, we need to send at least one trace per minute to keep the service alive on the backend side.

ASM Standalone billing can be resumed that way:

  • We limit the number of traces sent to the agent to 1 per minute (so the service is still considered alive on the backend side) and all the other traces are dropped
  • If an appsec event is detected in a service, or in an upstream service through distributed tracing, we force keep the trace
  • We tell the agent that we've already computed metrics, so the agent don't and it effectively disables metrics billing

How to test the change?

There are system-tests specific to APPSEC_STANDALONE for each weblog (Rack, Rails 4.2-7.1, Sinatra 1.4-4.0)

Integration specs for rack, rails and sinatra and unit specs have been also been added. They are executed by the CI.

APPSEC-54014

@pr-commenter
Copy link

pr-commenter bot commented Oct 1, 2024

Benchmarks

Benchmark execution time: 2024-10-21 14:45:02

Comparing candidate commit 8e54f7d in PR branch vpellan/standalone-asm with baseline commit 60febf8 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 24 metrics, 2 unstable metrics.

@codecov-commenter
Copy link

codecov-commenter commented Oct 1, 2024

Codecov Report

Attention: Patch coverage is 98.75776% with 6 lines in your changes missing coverage. Please review.

Project coverage is 97.87%. Comparing base (60febf8) to head (8e54f7d).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
lib/datadog/tracing/contrib/ethon/easy_patch.rb 50.00% 1 Missing ⚠️
lib/datadog/tracing/contrib/excon/middleware.rb 50.00% 1 Missing ⚠️
lib/datadog/tracing/contrib/faraday/middleware.rb 50.00% 1 Missing ⚠️
...adog/tracing/contrib/httpclient/instrumentation.rb 50.00% 1 Missing ⚠️
.../datadog/tracing/contrib/httprb/instrumentation.rb 50.00% 1 Missing ⚠️
...tadog/tracing/contrib/rest_client/request_patch.rb 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3965      +/-   ##
==========================================
- Coverage   97.87%   97.87%   -0.01%     
==========================================
  Files        1314     1317       +3     
  Lines       78635    79072     +437     
  Branches     3895     3923      +28     
==========================================
+ Hits        76963    77389     +426     
- Misses       1672     1683      +11     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Base automatically changed from vpellan/add-weblogs-system-tests to master October 2, 2024 08:55
@rate_limiter = if rate_limiter
@rate_limiter = if Datadog.configuration.appsec.standalone.enabled
# 0.0167 ~ 1 trace per minute
Core::TokenBucket.new(0.0167, 1.0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for readability I would propose to use 1/60 instead of 0.0167 here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Copy link
Contributor

@Strech Strech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, I sprinkled a bit of suggestions here and there in attempt to improve this special-case code

span.set_tag(Datadog::AppSec::Ext::TAG_APM_ENABLED, 0) if Datadog.configuration.appsec.standalone.enabled
span.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_ENABLED, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to keep the order from required to optional, like this

Suggested change
span.set_tag(Datadog::AppSec::Ext::TAG_APM_ENABLED, 0) if Datadog.configuration.appsec.standalone.enabled
span.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_ENABLED, 1)
span.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_ENABLED, 1)
span.set_tag(Datadog::AppSec::Ext::TAG_APM_ENABLED, 0) if Datadog.configuration.appsec.standalone.enabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Comment on lines 148 to 155
if scope.trace
scope.trace.keep!
scope.trace.set_tag(
Datadog::Tracing::Metadata::Ext::Distributed::TAG_DECISION_MAKER,
Datadog::Tracing::Sampling::Ext::Decision::ASM
)
scope.trace.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT, '1')
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can reduce nesting with a guard-clause here like this:

Suggested change
if scope.trace
scope.trace.keep!
scope.trace.set_tag(
Datadog::Tracing::Metadata::Ext::Distributed::TAG_DECISION_MAKER,
Datadog::Tracing::Sampling::Ext::Decision::ASM
)
scope.trace.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT, '1')
end
return unless scope.trace
scope.trace.keep!
scope.trace.set_tag(
Datadog::Tracing::Metadata::Ext::Distributed::TAG_DECISION_MAKER,
Datadog::Tracing::Sampling::Ext::Decision::ASM
)
scope.trace.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT, '1')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!


# return true if absence of upstream ASM event (_dd.p.appsec:1)
# and no local security event (which sets _dd.p.appsec:1 locally)
return true if active_trace.nil? || active_trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be possible to cut 1 namespace? (not sure)

Suggested change
return true if active_trace.nil? || active_trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1'
return true if active_trace.nil? || active_trace.get_tag(AppSec::Ext::TAG_APPSEC_EVENT) != '1'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does work but it seems that there's a convention in the codebase to still add the parent namespace (I suppose for readability). You can see another example at l.27-28 of the same file.

Comment on lines +789 to +797
[true, false].each do |value|
context "when given #{value}" do
let(:appsec_standalone_enabled) { value }

before { set_appsec_standalone_enabled }

it { expect(settings.appsec.standalone.enabled).to eq(value) }
end
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be personal opinion, but I think DRY in tests makes them harder to read. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree for more complex cases, but I think that boolean are simple enough that it does not lower the readability too much. [true, false].each is used by APM team (e.g. in spec/datadog/core/configuration/settings_spec.rb) so I think it's better to use it for cohesion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, this spec is such a massive copy-paste with these boolean checks that I kinda wish we had a behaves_like_a_binary_setting or something like that helper rather than repeating it every time.

Not to say we need to fix it in this PR, but yeah, it's been bugging me for a while and it seemed like a good time to share ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, shared behavior sounds good. Agree that maybe not right now, but it's good to share the pain 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are we actually testing here? If we want to test that the value is being assigned, it is enough to test once with one value. If we want to test that this setting could be only set to a boolean value, we also need a test case when a non-boolean value is being passed to this setting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are testing that the value is being assigned. I agree that these tests should be reworked (on another PR)

Comment on lines 173 to 292
let(:appsec_enabled) { true }
it do
expect(service_span.send(:metrics)['_dd.apm.enabled']).to eq(0)
expect(service_span.send(:metrics)['_dd.appsec.enabled']).to eq(1.0)
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let(:appsec_enabled) { true }
it do
expect(service_span.send(:metrics)['_dd.apm.enabled']).to eq(0)
expect(service_span.send(:metrics)['_dd.appsec.enabled']).to eq(1.0)
end
let(:appsec_enabled) { true }
it do
expect(service_span.send(:metrics)['_dd.apm.enabled']).to eq(0)
expect(service_span.send(:metrics)['_dd.appsec.enabled']).to eq(1.0)
end

Comment on lines 181 to 453
let(:appsec_enabled) { false }
it do
expect(service_span.send(:metrics)['_dd.apm.enabled']).to be_nil
expect(service_span.send(:metrics)['_dd.appsec.enabled']).to be_nil
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let(:appsec_enabled) { false }
it do
expect(service_span.send(:metrics)['_dd.apm.enabled']).to be_nil
expect(service_span.send(:metrics)['_dd.appsec.enabled']).to be_nil
end
let(:appsec_enabled) { false }
it do
expect(service_span.send(:metrics)['_dd.apm.enabled']).to be_nil
expect(service_span.send(:metrics)['_dd.appsec.enabled']).to be_nil
end


# return true if absence of upstream ASM event (_dd.p.appsec:1)
# and no local security event (which sets _dd.p.appsec:1 locally)
return true if active_trace.nil? || active_trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a very-special case and we put some comments, we also might simplify reading with a good naming (not saying this one is good, but would like to show an example)

Suggested change
return true if active_trace.nil? || active_trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1'
absent_trace_or_no_local_appsec_event = active_trace.nil? || active_trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1'
return true if absent_trace_or_no_local_appsec_event

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@@ -35,6 +35,11 @@ def request(req, body = nil, &block)
span.type = Tracing::Metadata::Ext::HTTP::TYPE_OUTBOUND
span.resource = req.method

if Datadog.configuration.appsec.standalone.enabled &&
(trace.nil? || trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about the same approach as ⬆️ ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

map '/requestdownstream' do
run(
proc do |_env|
uri = URI('http://localhost:3000/returnheaders')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Code Vulnerability

Do not use cleartext protocol http:// (...read more)

This rule is designed to prevent the use of the insecure HTTP protocol in your Ruby code. The HTTP protocol does not encrypt the data that is sent between the client and the server, which can lead to sensitive information being intercepted by malicious parties. This is particularly risky when dealing with sensitive data such as API keys, user credentials, or personal user information.

The importance of this rule lies in the security and integrity of your application. By using an unsecured protocol like HTTP, you expose your application and its users to potential data breaches. A breach can lead to loss of trust, legal liability, and significant remediation costs.

To avoid violating this rule, always use the HTTPS protocol when making network requests. HTTPS encrypts the data sent between the client and server, protecting it from interception. By using libraries like Faraday, HTTPX, HTTParty, RestClient, or Ruby's built-in Net::HTTP, you can specify HTTPS by simply replacing 'http://' with 'https://'. For example, instead of HTTP.get("http://example.org"), use HTTP.get("https://example.org"). Always ensure that any third-party services your application interacts with support HTTPS.

View in Datadog  Leave us feedback  Documentation

Comment on lines -15 to -16
FORCE_TESTS: -F tests/appsec/waf/test_addresses.py::Test_GraphQL -F tests/appsec/test_blocking_addresses.py::Test_BlockingGraphqlResolvers
FORCE_TESTS_SCENARIO: GRAPHQL_APPSEC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we forgot to remove this when merging GraphQL changes? If so, would it be better to create a separate PR with these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line must be cleaned after the release of the specified version in the manifest. In the case of GraphQL Appsec, it should have been cleaned after 2.3.0 release, but it does not change anything after the release, as we are forcing the execution of a test that is already executed. Charles specified in the doc that "from time to time" you should removes all the -F from the CI (https://github.com/DataDog/system-tests/blob/main/docs/edit/egg-chicken-changes.md?plain=1#L34), so this is why I did not considered it a priority.
I guess that a clean way to do it would be to add a step in fast_castle.
We can do a separate PR but the way FORCE_TESTS_SCENARIO is implemented does not enable to force tests execution on multiple scenarios, so that would mean that I cannot force execute APPSEC_STANDALONE on this PR

@@ -150,7 +150,8 @@ def add_appsec_tags(processor, scope)

return unless trace && span

span.set_tag('_dd.appsec.enabled', 1)
span.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_ENABLED, 1)
span.set_tag(Datadog::AppSec::Ext::TAG_APM_ENABLED, 0) if Datadog.configuration.appsec.standalone.enabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure, but does this need a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure either what could be added to give more info that what the code already gives

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add somethink like "we want to make sure the customer will not get billed for APM when standalone ASM is enabled". But again, it's probably fine to leave it as it is

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read PR description but did not understand that this is what the PR is trying to achieve. So probably would be helpful to include this bit somewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment, thanks!

@@ -137,6 +137,24 @@ def build_service_entry_tags(event_group)
end
# rubocop:enable Metrics/MethodLength

def add_event_tags(scope, waf_result)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the class is already named Event, should we rename this function to add_tags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Comment on lines 146 to 147
# Propagate to downstream services the information that the current distributed trace is
# containing at least one ASM security event
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this comment before scope.trace.keep!, it's more relevant there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Comment on lines 150 to 155
scope.trace.keep!
scope.trace.set_tag(
Datadog::Tracing::Metadata::Ext::Distributed::TAG_DECISION_MAKER,
Datadog::Tracing::Sampling::Ext::Decision::ASM
)
scope.trace.set_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT, '1')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to execute this code only if the standalone appsec is enabled?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I would actually move this code to another function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to execute this code only if the standalone appsec is enabled?

We want to execute this code each time there is an Appsec event, it is not specific to Appsec standalone

I wrote this method to factorise watch_xxx methods in each integration's Gateway Watcher so that it would comply to our Rubocop spec.

and I would actually move this code to another function

Would you move it to another method because it is acting on scope.trace and not scope.service_entry_span anymore ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think it does a bit more then setting tags - if I understood correctly it's more about making sure that the trace is preserved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, scope.trace.keep! also sets sampled instance variable to true. I'll put it back in the gateway watchers and make a separate method for propagated tags.

@@ -120,6 +120,10 @@ def default_headers
# Add container ID, if present.
container_id = Datadog::Core::Environment::Container.container_id
headers[Datadog::Core::Transport::Ext::HTTP::HEADER_CONTAINER_ID] = container_id unless container_id.nil?
# Pretend that stats computation are already done by the client
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much background here, but I think it would be helpful to extend this comment and add an explanation on what this header is used for, and why do we set it to yes manually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I added more context in the comment

Comment on lines 33 to 39
active_trace = Tracing.active_trace

# return true if absence of upstream ASM event (_dd.p.appsec:1)
# and no local security event (which sets _dd.p.appsec:1 locally)
no_upstream_or_local_event = active_trace.nil? ||
active_trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1'
return true if no_upstream_or_local_event
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about this?

return true unless Tracing.active_trace

Tracing.active_trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did something similar (that would also satisfy Rubocop) thanks !

Comment on lines 35 to 36
# return true if absence of upstream ASM event (_dd.p.appsec:1)
# and no local security event (which sets _dd.p.appsec:1 locally)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, I think it makes sense to explain here in which case _dd.p.appsec is being set to 1 and why we want to skip distributed tracing in such case

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added more context to this one

@@ -35,6 +35,13 @@ def request(req, body = nil, &block)
span.type = Tracing::Metadata::Ext::HTTP::TYPE_OUTBOUND
span.resource = req.method

asm_standalone_no_upstream_or_local_event = Datadog.configuration.appsec.standalone.enabled &&
(trace.nil? || trace.get_tag(Datadog::AppSec::Ext::TAG_APPSEC_EVENT) != '1')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be a method on Datadog::Trace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it as a class method to Tracing::TraceOperation, thanks!


@default_sampler = if default_sampler
@default_sampler = if Datadog.configuration.appsec.standalone.enabled
@rules = [SimpleRule.new(sample_rate: 1.0)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's quite confusing that we are setting @rules when assigning @default_sampler value. Should we move this out from here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, I separated rules and default_sampler initializers, thanks!

@@ -29,21 +29,29 @@ def initialize(
default_sample_rate: Datadog.configuration.tracing.sampling.default_rate,
default_sampler: nil
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this section of the code will benefit from a comment explaining appsec & tracing interaction. Especially for someone who is new to both appsec and tracing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I added a comment to explain it a bit more, let me know if you think that it is not comprehensive enough !

@vpellan vpellan marked this pull request as ready for review October 10, 2024 15:53
@vpellan vpellan requested review from a team as code owners October 10, 2024 15:53
Comment on lines 37 to 56
@rules = if Datadog.configuration.appsec.standalone.enabled
[SimpleRule.new(sample_rate: 1.0)]
elsif default_sample_rate && !default_sampler
# Add to the end of the rule list a rule always matches any trace
rules << SimpleRule.new(sample_rate: default_sample_rate)
else
rules
end
@rate_limiter = if Datadog.configuration.appsec.standalone.enabled
# 1 trace per minute
Core::TokenBucket.new(1.0 / 60, 1.0)
elsif rate_limiter
rate_limiter
elsif rate_limit
Core::TokenBucket.new(rate_limit)
else
Core::UnlimitedLimiter.new
end

@default_sampler = if default_sampler
default_sampler
elsif default_sample_rate
# Add to the end of the rule list a rule always matches any trace
@rules << SimpleRule.new(sample_rate: default_sample_rate)
@default_sampler = if Datadog.configuration.appsec.standalone.enabled ||
(default_sample_rate && !default_sampler)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest moving this logic to the initialization of the RuleSampler, instead of inside of it:

def build_sampler(settings)
# A custom sampler is provided
if (sampler = settings.tracing.sampler)
return sampler
end
# Sampling rules are provided
if (rules = settings.tracing.sampling.rules)
post_sampler = Tracing::Sampling::RuleSampler.parse(
rules,
settings.tracing.sampling.rate_limit,
settings.tracing.sampling.default_rate
)
end
# The default sampler.
# Used if no custom sampler is provided, or if sampling rule parsing fails.
post_sampler ||= Tracing::Sampling::RuleSampler.new(
rate_limit: settings.tracing.sampling.rate_limit,
default_sample_rate: settings.tracing.sampling.default_rate
)
Tracing::Sampling::PrioritySampler.new(
base_sampler: Tracing::Sampling::AllSampler.new,
post_sampler: post_sampler
)
end

It looks like this logic can manipulate the parameters passed to RuleSampler to accomplish the same goal, I think it would be: RuleSampler.new([], rate_limiter: Core::TokenBucket.new(1.0 / 60, 1.0), default_sample_rate: 1.0).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is also initialised in here:

def initialize(
trace_flush: Flush::Finished.new,
context_provider: DefaultContextProvider.new,
default_service: Core::Environment::Ext::FALLBACK_SERVICE_NAME,
enabled: true,
sampler: Sampling::PrioritySampler.new(
base_sampler: Sampling::AllSampler.new,
post_sampler: Sampling::RuleSampler.new
),
span_sampler: Sampling::Span::Sampler.new,
tags: {},
writer: Writer.new
)

Do you think I should also move the logic there ? Imo it would mean duplicated code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can ignore this one in dd-trace-rb/lib/datadog/tracing/tracer.rb. It's not used anymore, just a relic from when the Tracer class itself was public. You can always assume well go through the Component initialization.

Copy link
Contributor

@Strech Strech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have much to add to existing suggestions by others, great job 👏🏼

private

def self.compressed_and_base64_encoded: (untyped value) -> untyped

def self.json_parse: (untyped value) -> untyped

def self.gzip: (untyped value) -> untyped

def self.add_distributed_tags: (untyped trace) -> untyped
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask to type it? (Not sure about trace type, but I think it could be defined too.

Suggested change
def self.add_distributed_tags: (untyped trace) -> untyped
def self.add_distributed_tags: (untyped trace) -> void

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 22f18a8

@@ -13,14 +13,18 @@ module Datadog
def self.record_via_span: (Datadog::Tracing::SpanOperation, *untyped events) -> untyped

def self.build_service_entry_tags: (Array[Hash[::Symbol, untyped]] event_group) -> Hash[::String, untyped]


def self.add_tags: (Datadog::AppSec::Scope scope, Datadog::AppSec::WAF::Result waf_result) -> untyped
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def self.add_tags: (Datadog::AppSec::Scope scope, Datadog::AppSec::WAF::Result waf_result) -> untyped
def self.add_tags: (Datadog::AppSec::Scope scope, Datadog::AppSec::WAF::Result waf_result) -> void

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 22f18a8

@@ -29,6 +29,15 @@ def internal_request?(request)
end

def should_skip_distributed_tracing?(client_config)
if Datadog.configuration.appsec.standalone.enabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we add a bunch of Datadog.configuration.appsec.standalone.enabled all over the tracing component which imposes a hard dependency between Tracing and AppSec internal details. What's especially worrying, is that in this case we reverse the dependency and now Tracing knows about AppSec internals (even though AppSec is supposed to be a client of Tracing).

So my question is for the Tracing team: do you think it would make sense to introduce some tracing settings to allow client products to tweak tracing behaviour even more? Like we do between CI and Tracing modules: Tracing exposes settings, CI configures them.

@marcotc what do you think?

Copy link
Contributor Author

@vpellan vpellan Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add more context: appsec standalone is still experimental and the env var/configuration might change name in the future, and the 'appsec' part of this setting name may disappear. What it do on the tracing part is disabling distributed tracing, adding a header to the payload sent to the agent, creating a 1 trace per minute rate limiter, and setting the sampling priority to 'reject' if there is an appsec event detected. Although the first 3 points aren't directly tied to appsec, the last one is, but it is still acting only on the traces.

…t-Computed-Stats header and _sampling_priority_v1 tag

- Reproduced system-tests ASM Standalone tests
Set sampling priority to AUTO_REJECT when ASM Standalone is active and no appsec event everywhere there is HTTP.inject
Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Member

@y9v y9v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@vpellan vpellan disabled auto-merge October 21, 2024 14:10
@vpellan vpellan merged commit df97f48 into master Oct 21, 2024
266 checks passed
@vpellan vpellan deleted the vpellan/standalone-asm branch October 21, 2024 14:47
@github-actions github-actions bot added this to the 2.5.0 milestone Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants