Dynamic Sampling Context (Experimental)

Trace sampling through the tracesSampleRate or tracesSampler options in the SDKs has several drawbacks for users of Sentry SDKs:

  • Changing the sampling rate involves either redeploying applications (which is problematic for applications that are not updated automatically, e.g., mobile apps or physically distributed software) or building complex systems to dynamically fetch a sample rate.
  • Sampling only happens based on a factor of randomness.
  • Employing sampling rules, for example, based on event parameters, is very complex.
  • While writing rules for singular transactions is possible, enforcing them on entire traces is infeasible.

The solution for these problems is Dynamic Sampling. Dynamic Sampling allows users to configure sampling rules directly in the Sentry interface. Important: Sampling rules may be applied to entire traces or to a single transaction.

High-Level Problem Statement

Ingest

Implementing Dynamic Sampling comes with challenges, especially on the ingestion side of things. For Dynamic Sampling, we want to make sampling decisions for entire traces. However, to keep ingestion speedy, Relay only looks at singular transactions in isolation (as opposed to looking at whole traces). This means that we need the exact same decision basis for all transactions belonging to a trace. In other words, all transactions of a trace need to hold all of the information to make a sampling decision, and that information needs to be the same across all transactions of the trace. We call the information we base sampling decisions on "Dynamic Sampling Context" or "DSC". As a mental model: The head transaction in a trace determines the Dynamic Sampling Context for all following transactions in that trace. No information can be changed, added or deleted after the first propagation.
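To illustrate why an identical decision basis yields an identical decision, here is a minimal sketch. It is not Relay's actual implementation: the `keep_trace` function, the hashing scheme, and the rule rate of 0.5 are all assumptions made for illustration. The point is only that a function of the DSC alone returns the same verdict for every transaction of a trace, even when each transaction is evaluated in isolation.

```python
import hashlib


def keep_trace(dsc: dict, rule_sample_rate: float) -> bool:
    """Hypothetical decision function: same DSC in, same decision out."""
    # Map the trace ID to a stable number in [0, 1). Hashing is an assumption
    # of this sketch; any deterministic function of the DSC would do.
    digest = hashlib.sha256(dsc["trace_id"].encode("ascii")).digest()
    value = int.from_bytes(digest[:4], "big") / 2**32
    return value < rule_sample_rate


dsc = {
    "trace_id": "771a43a4192642f0b136d5159a501700",
    "public_key": "49d0f7386ad645858ae85020e393bef3",
    "sample_rate": "0.01337",
}

# Two transactions of the same trace carry equal copies of the DSC and are
# evaluated independently of each other.
decisions = [keep_trace(dict(dsc), 0.5) for _ in range(2)]
```

Because neither call needs to know about the other transaction, the decision can be made per transaction without buffering the whole trace.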

SDKs

SDKs are responsible for propagating Dynamic Sampling Context across all applications that are part of a trace. This involves:

  1. Either collecting the information that makes up the DSC or extracting the DSC from incoming requests (one or the other, never both).
  2. Propagating DSC to downstream SDKs.
  3. Sending the DSC to Sentry via the trace envelope header.

Because there are quite a few things to keep in mind for DSC propagation and to avoid every SDK running into the same problems, we defined a unified propagation mechanism (step-by-step instructions) that all SDK implementations should be able to follow.

Baggage

We chose baggage, as defined in the W3C Baggage specification, as the propagation mechanism for DSC. Baggage is a standard HTTP header that carries URI-encoded key-value pairs.

For the propagation of DSC, SDKs first read the DSC from the baggage header of incoming requests/messages. To propagate DSC to downstream SDKs/services, we create a baggage header (or modify an existing one) through HTTP request instrumentation.

The following is an example of what a baggage header containing Dynamic Sampling Context may look like:

baggage: other-vendor-value-1=foo;bar;baz, sentry-trace_id=771a43a4192642f0b136d5159a501700, sentry-public_key=49d0f7386ad645858ae85020e393bef3, sentry-sample_rate=0.01337, sentry-user_id=Am%C3%A9lie, other-vendor-value-2=foo;bar;

See the Payloads section for a complete list of key-value pairs that SDKs should propagate.
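As a rough sketch of the reading side, the following shows one way to pull the Sentry entries out of a baggage header value like the one above. The function name `sentry_items_from_baggage` is hypothetical, and the parsing is deliberately simplified: it splits list members on commas, drops any `;property` suffix, and percent-decodes the value.

```python
from urllib.parse import unquote


def sentry_items_from_baggage(header_value: str) -> dict:
    """Hypothetical helper: return only the "sentry-" entries, decoded."""
    items = {}
    for member in header_value.split(","):
        key, sep, value = member.strip().partition("=")
        if sep and key.startswith("sentry-"):
            # Drop optional ";property" suffixes, then percent-decode.
            items[key] = unquote(value.split(";", 1)[0].strip())
    return items


header = (
    "other-vendor-value-1=foo;bar;baz, "
    "sentry-trace_id=771a43a4192642f0b136d5159a501700, "
    "sentry-public_key=49d0f7386ad645858ae85020e393bef3, "
    "sentry-sample_rate=0.01337, "
    "sentry-user_id=Am%C3%A9lie, "
    "other-vendor-value-2=foo;bar;"
)
parsed = sentry_items_from_baggage(header)
```

Entries from other vendors are ignored here, but they still need to be preserved when the header is forwarded.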

Payloads

Dynamic Sampling Context is sent to Sentry via the trace envelope header and is propagated to downstream SDKs via a baggage header.

All of the values in the payloads below are required (non-optional) in the sense that, if an SDK knows them at the time a transaction envelope is sent to Sentry or a baggage header is propagated, it must include them in that envelope or baggage. Because trace_id, public_key, and sample_rate should always be known to an SDK, these three values are strictly required.

Envelope Header

Dynamic Sampling Context is transferred to Sentry through the trace envelope header. The value of this envelope header is a JSON object with the following fields:

  • trace_id (string) - The original trace ID as generated by the SDK: a UUID V4 encoded as a sequence of 32 hexadecimal digits with no dashes (e.g. 771a43a4192642f0b136d5159a501700). This must match the trace ID of the submitted transaction item.
  • public_key (string) - Public key from the DSN used by the SDK. It allows Sentry to sample traces spanning multiple projects, by resolving the same set of rules based on the starting project.
  • sample_rate (string) - The sample rate as defined by the user on the SDK. This string should always be a number between (and including) 0 and 1 in basic float notation (0.04242), with no exponents or other alternative notations.
  • release (string) - The release name as specified in client options.
  • environment (string) - The environment name as specified in client options.
  • user_id (string) - User ID as set by the user with scope.set_user.
  • user_segment (string) - User segment as set by the user with scope.set_user.
  • transaction (string) - The transaction name set on the scope.
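The field list above can be sketched as a small builder. This is an illustration under stated assumptions, not an SDK's actual code: `format_sample_rate` and `build_trace_header` are hypothetical names, the header is assumed to live under a `trace` key of the envelope headers, and the release value is made up. Optional values are included only when they are known.

```python
import json


def format_sample_rate(rate: float) -> str:
    """Render a rate in plain decimal notation, never with an exponent."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("sample rate must be between 0 and 1 (inclusive)")
    # Fixed-point formatting avoids output such as "1e-05"; trailing zeros
    # and a dangling decimal point are then trimmed.
    return f"{rate:.10f}".rstrip("0").rstrip(".")


def build_trace_header(trace_id, public_key, sample_rate, **optional):
    """Hypothetical builder for the trace envelope header."""
    trace = {
        "trace_id": trace_id,
        "public_key": public_key,
        "sample_rate": format_sample_rate(sample_rate),
    }
    # Optional values are sent only when the SDK actually knows them.
    trace.update({k: v for k, v in optional.items() if v is not None})
    return {"trace": trace}


envelope_header = build_trace_header(
    "771a43a4192642f0b136d5159a501700",
    "49d0f7386ad645858ae85020e393bef3",
    0.01337,
    release="myapp@1.0.0",  # made-up release name
    user_id=None,  # unknown, so it is left out
)
print(json.dumps(envelope_header))
```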

Baggage Header

SDKs may use the following keys to set entries on baggage HTTP headers:

  • sentry-trace_id
  • sentry-public_key
  • sentry-sample_rate
  • sentry-release
  • sentry-environment
  • sentry-user_id
  • sentry-user_segment
  • sentry-transaction

SDKs must set all of the keys in the form of "sentry-[name]". The prefix "sentry-" acts to identify key-value pairs set by Sentry SDKs.

All of the keys are defined in a way so their value directly corresponds to one of the fields on the trace envelope header. This allows SDKs to put all of the sentry key-value pairs from the baggage directly onto the envelope header, after stripping away the sentry- prefix.

Being able to simply copy key-value pairs from the baggage header onto the trace envelope header gives us the flexibility to provide dedicated API methods to propagate additional values using Dynamic Sampling Context. This, in return, allows users to define their own values in the Dynamic Sampling Context so they can sample by those in the Sentry interface.
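The one-to-one mapping between baggage keys and envelope header fields can be sketched as follows. Both function names are hypothetical, and the dictionaries are assumed to hold already-decoded values.

```python
SENTRY_PREFIX = "sentry-"


def baggage_to_trace_header(baggage: dict) -> dict:
    """Keep only Sentry entries and strip the "sentry-" prefix."""
    return {
        key[len(SENTRY_PREFIX):]: value
        for key, value in baggage.items()
        if key.startswith(SENTRY_PREFIX)
    }


def trace_header_to_baggage(trace: dict) -> dict:
    """Inverse direction: add the "sentry-" prefix to every field."""
    return {SENTRY_PREFIX + key: value for key, value in trace.items()}


baggage = {
    "other-vendor-value-1": "foo",
    "sentry-trace_id": "771a43a4192642f0b136d5159a501700",
    "sentry-public_key": "49d0f7386ad645858ae85020e393bef3",
    "sentry-sample_rate": "0.01337",
}
trace = baggage_to_trace_header(baggage)
```

Since the mapping does not depend on a fixed list of field names, any additional `sentry-` entry would pass through to the envelope header unchanged.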

Unified Propagation Mechanism

SDKs should follow these steps for any incoming and outgoing requests (in python pseudo-code for illustrative purposes):

def collect_dynamic_sampling_context():
  """Placeholder: collects as many values for Dynamic Sampling Context
  as possible and returns a dict."""
  ...

def has_sentry_value_in_baggage_header(request):
  """Placeholder: returns True when there is at least one key-value pair in the
  baggage header of `request` for which the key starts with "sentry-".
  Otherwise, it returns False."""
  ...

def on_incoming_request(request):
  if has_header(request, "sentry-trace") and (not has_header(request, "baggage") or not has_sentry_value_in_baggage_header(request)):
    # Request comes from an old SDK which doesn't support Dynamic Sampling Context yet
    # --> we don't propagate baggage for this trace
    transaction.baggage_locked = True
    transaction.baggage = {}
  elif has_header(request, "baggage") and has_sentry_value_in_baggage_header(request):
    transaction.baggage_locked = True
    transaction.baggage = baggage_header_to_dict(request.headers.baggage)

def on_outgoing_request(request):
  if not transaction.baggage_locked:
    transaction.baggage_locked = True
    if not transaction.baggage:
      transaction.baggage = {}
    transaction.baggage = merge_dicts(collect_dynamic_sampling_context(), transaction.baggage)

  if has_header(request, "baggage"):
    outgoing_baggage_dict = baggage_header_to_dict(request.headers.baggage)
    merged_baggage_dict = merge_dicts(outgoing_baggage_dict, transaction.baggage)
    merged_baggage_header = dict_to_baggage_header(merged_baggage_dict)
    set_header(request, "baggage", merged_baggage_header)
  else:
    baggage_header = dict_to_baggage_header(transaction.baggage)
    set_header(request, "baggage", baggage_header)

While there is no strict necessity for the transaction.baggage_locked flag yet, there is a future use case where we need it: we might want users to be able to set Dynamic Sampling Context values themselves. The flag becomes relevant after the first propagation, at which point Dynamic Sampling Context becomes immutable. When users attempt to set DSC afterwards, our SDKs should make this operation a no-op.
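The helper functions used by the pseudo-code, together with the no-op behavior of the lock, could look roughly like the sketch below. Everything here is an assumption for illustration: the `Transaction` class, the `set_dsc_value` method, and the choice that later arguments win in `merge_dicts` are not prescribed by this document. Header values are kept percent-encoded throughout, so entries from other vendors pass through unchanged.

```python
from urllib.parse import quote


def baggage_header_to_dict(header: str) -> dict:
    """Split a baggage header into {key: raw value}, keeping properties."""
    result = {}
    for member in header.split(","):
        key, sep, value = member.strip().partition("=")
        if sep:
            result[key] = value
    return result


def dict_to_baggage_header(items: dict) -> str:
    """Serialize {key: raw value} back into a baggage header value."""
    return ",".join(f"{key}={value}" for key, value in items.items())


def merge_dicts(base: dict, override: dict) -> dict:
    """Assumption of this sketch: entries in `override` win on conflicts."""
    return {**base, **override}


class Transaction:
    """Hypothetical transaction holding the DSC as baggage entries."""

    def __init__(self):
        self.baggage = {}
        self.baggage_locked = False

    def set_dsc_value(self, name: str, value: str) -> bool:
        if self.baggage_locked:
            return False  # DSC is immutable after the first propagation
        self.baggage[f"sentry-{name}"] = quote(str(value), safe="")
        return True


tx = Transaction()
first = tx.set_dsc_value("user_id", "Amélie")
tx.baggage_locked = True  # simulates the first propagation
second = tx.set_dsc_value("user_id", "someone-else")  # ignored

existing = baggage_header_to_dict("other-vendor-value-1=foo;bar;baz")
outgoing_header = dict_to_baggage_header(merge_dicts(existing, tx.baggage))
```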

Considerations

TODO - Add some sort of Q&A section on the following questions, after evaluating if they still need to be answered:

  • Why must baggage be immutable before the second transaction has been started?
  • What are the consequences and impacts of the immutability of baggage on Dynamic Sampling UX?
  • Why can't we just make the decision for the whole trace in Relay after the trace is complete?
  • What is sample rate smoothing and how does it use sample_rate from the Dynamic Sampling Context?
  • What are the differences between Dynamic Sampling on traces vs. transactions?
  • Why did we choose baggage as the propagation mechanism and not Trace Context (https://www.w3.org/TR/trace-context/)?