Instrumentation

For basic observability of web applications, SAF provides the logic for:

Logging
Metrics
Error reporting

All SAF-provided libraries support these systems, and this library provides all the tools for new systems to do the same.

Tracing is the one major area of instrumentation that SAF does not provide, mainly because the other three are simpler to set up and usually sufficient. Still, SAF is designed so that tracing can be added later: it enforces a consistent context definition (through this library), and libraries such as @saflib/express and @saflib/drizzle-sqlite use wrapping functions where spans can be added systematically at package boundaries.

Stores

Instrumentation makes heavy use of Node's AsyncLocalStorage to store context and reporters.

Context

Context is information about what is currently running and in what environment. Every subsystem is expected to provide this context, and it can be used to:

Affect behavior of the operation, mainly through the auth field
Add context to instrumentation, which is basically every other field

See SafContext for more details (storage: safContextStorage).
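
As a minimal sketch of how such a context store behaves, the following uses Node's AsyncLocalStorage directly. The DemoContext shape, demoContextStorage, and describeOperation are hypothetical stand-ins for illustration, not SafContext or safContextStorage themselves; only the auth field name comes from the text above.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical context shape for illustration; see SafContext for the real fields.
interface DemoContext {
  requestId: string;
  serviceName: string;
  operationName: string;
  auth?: { userId: number };
}

const demoContextStorage = new AsyncLocalStorage<DemoContext>();

// Code deep in the call stack reads the context without receiving it as an argument.
async function describeOperation(): Promise<string> {
  await Promise.resolve(); // the store survives async boundaries
  const ctx = demoContextStorage.getStore();
  if (!ctx) return "no context";
  const who = ctx.auth ? `user ${ctx.auth.userId}` : "anonymous";
  return `${ctx.serviceName}/${ctx.operationName} [${ctx.requestId}] as ${who}`;
}

// Run the same function once inside a context and once outside of it.
async function demoContextRun(): Promise<string[]> {
  const inside = await demoContextStorage.run(
    { requestId: "req-1", serviceName: "demo", operationName: "getUser", auth: { userId: 7 } },
    describeOperation,
  );
  const outside = await describeOperation();
  return [inside, outside];
}
```

The same function sees the context only while it runs inside the run callback, which is what lets instrumentation pick up request details without threading them through every call.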

Reporters

Reporters are functions for reporting telemetry to various services. They depend on the context and are not serializable, so they are kept in a separate AsyncLocalStorage instance.

The main functions to use are:

log - a Winston logger which applications can add transports to.
logError - a convenience function for logging Error objects. It logs to both log and any ErrorReporter callbacks (so errors appear both in logging systems like Loki and error reporting services like Sentry).

See SafReporters for more details (storage: safReportersStorage).
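
To illustrate the fan-out that logError performs, here is a small sketch that keeps reporters in their own AsyncLocalStorage instance. Everything named here (DemoReporters, makeDemoReporters, demoReportersStorage) is hypothetical, and the in-memory logger stands in for the real Winston logger.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical reporter shapes; the real `log` is a Winston logger.
type ErrorCallback = (error: Error) => void;
interface DemoReporters {
  log: { info: (msg: string) => void; error: (msg: string) => void };
  logError: (error: unknown) => void;
}

const demoReportersStorage = new AsyncLocalStorage<DemoReporters>();

// Build reporters whose logError writes to the logger and to every error callback.
function makeDemoReporters(lines: string[], errorCallbacks: ErrorCallback[]): DemoReporters {
  const log = {
    info: (msg: string) => { lines.push(`info: ${msg}`); },
    error: (msg: string) => { lines.push(`error: ${msg}`); },
  };
  const logError = (error: unknown) => {
    // Normalize non-Error values so callbacks always receive an Error.
    const err = error instanceof Error ? error : new Error(String(error));
    log.error(err.message);
    for (const callback of errorCallbacks) callback(err);
  };
  return { log, logError };
}

function demoReportersRun(): { lines: string[]; reported: string[] } {
  const lines: string[] = [];
  const reported: string[] = [];
  const reporters = makeDemoReporters(lines, [(e) => { reported.push(e.message); }]);
  demoReportersStorage.run(reporters, () => {
    const { log, logError } = demoReportersStorage.getStore()!;
    log.info("starting");
    logError(new Error("boom"));
  });
  return { lines, reported };
}
```

After demoReportersRun, the error message appears both in the log lines and in the list filled by the error callback, mirroring how one logError call reaches both a logging system and an error reporting service.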

Use in Applications

Inside HTTP handlers, gRPC handlers, cron jobs, and the like, use the getter helpers for context and reporters, such as getSafReporters.

Use these for all logging and auth purposes. They throw an error if the application has not provided context or reporters, and their return types match what you should expect. They exist mainly to avoid existence-check boilerplate; otherwise you could call the getStore methods of safContextStorage and safReportersStorage directly.
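
The fail-fast getter pattern described above can be sketched as follows. getDemoContext, DemoContext, and demoGetterStorage are hypothetical names chosen for this example; the real helpers and their exact error messages may differ.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical context shape and store for illustration only.
interface DemoContext {
  requestId: string;
  auth?: { userId: number };
}
const demoGetterStorage = new AsyncLocalStorage<DemoContext>();

// Returns a non-optional context, or throws if no subsystem provided one.
function getDemoContext(): DemoContext {
  const ctx = demoGetterStorage.getStore();
  if (!ctx) throw new Error("Context has not been provided for this operation");
  return ctx;
}

// A handler can then use the context without existence checks.
function demoHandler(): string {
  const { requestId, auth } = getDemoContext();
  return auth ? `${requestId}: user ${auth.userId}` : `${requestId}: anonymous`;
}

// Helper for demonstrating the failure case without crashing the caller.
function callDemoHandlerSafely(): string {
  try {
    return demoHandler();
  } catch (e) {
    return `failed: ${(e as Error).message}`;
  }
}
```

Calling demoHandler inside demoGetterStorage.run succeeds, while calling it outside any run callback throws, which surfaces a missing integration early instead of silently logging nothing.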

Logging and Testing

By design, these helper functions throw an error if context and reporters have not been provided by the application, but the application may not be in the mix if you're testing smaller pieces in isolation. So these functions return stubs if and only if the NODE_ENV environment variable is test. The stubs also log errors to the console so they show up in test output.
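
A rough sketch of that test-only fallback, assuming a simplified logger shape: getDemoLogger, testStubLogger, and demoLoggerStorage are hypothetical, and whether the real stubs discard non-error logs is an assumption made here.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical, simplified logger shape for illustration.
interface DemoLogger {
  info: (msg: string) => void;
  error: (msg: string) => void;
}
const demoLoggerStorage = new AsyncLocalStorage<DemoLogger>();

// Stub used only in tests: drops info logs (an assumption) but prints errors
// to the console so they show up in test output.
const testStubLogger: DemoLogger = {
  info: () => {},
  error: (msg) => console.error(msg),
};

function getDemoLogger(): DemoLogger {
  const logger = demoLoggerStorage.getStore();
  if (logger) return logger; // a provided logger always wins
  if (process.env.NODE_ENV === "test") return testStubLogger;
  throw new Error("Reporters have not been provided for this operation");
}
```

The stub is returned only when nothing was provided and NODE_ENV is test, so a missing integration still fails loudly everywhere else.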

If you want to check that certain logs are being made in tests, you can use getSafReporters to get the universal loggers and spy on them. See @saflib/cron's unit tests for an example.

Integrate Logging

When you set up a new service, you will need to integrate logging with your chosen collectors or external services and do some other setup.

setServiceName - sets the service name, which is used to identify the service in logs and metrics. The service name should match the service package name (minus any organization prefix) and the docker image/service name, for consistency.
addErrorCollector - adds a callback for when errors are reported by the application. Callbacks receive an ErrorCollectorParam object, which is based on the captureContext parameter of Sentry's captureException.
addTransport - adds a Winston transport to the log logger.
collectSystemMetrics - opts into prom-client's default metrics, which are a superset of Prometheus's recommended metrics.

Call all these before initializing any servers or other long-running processes, and start with setServiceName since any log calls will fail without one.
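
The ordering constraint can be illustrated with a self-contained sketch. The demo-prefixed functions below are hypothetical stand-ins that only mimic the roles of setServiceName, addErrorCollector, and the logger; they are not the SAF implementations, and the service name "identity" is made up.

```typescript
// Hypothetical module state standing in for the library's global setup.
let demoServiceName: string | undefined;
const demoErrorCollectors: Array<(error: Error) => void> = [];

function demoSetServiceName(name: string): void {
  demoServiceName = name;
}

function demoAddErrorCollector(callback: (error: Error) => void): void {
  demoErrorCollectors.push(callback);
}

// Logging refuses to work until a service name exists, as described above.
function demoLog(message: string): string {
  if (!demoServiceName) throw new Error("Service name must be set before logging");
  return `[${demoServiceName}] ${message}`;
}

// Startup order: name first, then collectors, then anything that logs.
function demoStartService(): string {
  demoSetServiceName("identity");
  demoAddErrorCollector((_error) => {
    // forward to an error reporting service here
  });
  return demoLog("starting server");
}
```

Doing the setup in one function at the top of the entry point keeps the order explicit and makes it harder for a server to start logging before the service name is known.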

If you have some service-specific context (which is likely, especially for shared clients to databases and other services), put it in a sibling {service-name}-common package and provide it to each of your subsystems. Some subsystem libraries, such as @saflib/grpc, provide helpers for this.

Provide Context and Reporters

It's the job of subsystem libraries such as @saflib/express and @saflib/grpc to provide context and reporters for each operation. The preferred way to do this is the run method of safContextStorage and safReportersStorage.

They should use the following functions and variables:

createLogger to create a Winston logger. Provide subsystemName and operationName. Because the operation name varies, a logger has to be created for each "request" or "run".
generateRequestId. Only needed if the caller does not provide a request ID. The caller should provide one unless the operation originates from the subsystem itself, as with cron jobs.
safContextStorage and safReportersStorage to provide a context. Use the run method ideally, or enterWith if necessary.
defaultErrorReporter for a standard error reporter.
makeSubsystemReporters when you want to log outside of an operation, such as when initializing a subsystem.
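
As a sketch of how a subsystem might combine these pieces for each operation, the following uses only Node built-ins. runDemoOperation, makeDemoLogger, and the context fields are hypothetical; makeDemoLogger stands in for createLogger, and randomUUID stands in for generateRequestId, whose actual ID format is not specified here.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Hypothetical shapes for illustration only.
interface DemoOpContext {
  requestId: string;
  subsystemName: string;
  operationName: string;
}
interface DemoOpReporters {
  log: (msg: string) => void;
}

const demoOpContextStorage = new AsyncLocalStorage<DemoOpContext>();
const demoOpReportersStorage = new AsyncLocalStorage<DemoOpReporters>();
const demoOutput: string[] = [];

// One logger per operation, labeled with the subsystem and operation names.
function makeDemoLogger(ctx: DemoOpContext): DemoOpReporters {
  return {
    log: (msg) => {
      demoOutput.push(`${ctx.subsystemName}.${ctx.operationName} ${ctx.requestId}: ${msg}`);
    },
  };
}

// Wrap one operation so both stores are populated for its whole async lifetime.
function runDemoOperation<T>(
  operationName: string,
  fn: () => Promise<T>,
  callerRequestId?: string,
): Promise<T> {
  const ctx: DemoOpContext = {
    // Generate an ID only when the caller did not supply one.
    requestId: callerRequestId ?? randomUUID(),
    subsystemName: "cron",
    operationName,
  };
  return demoOpContextStorage.run(ctx, () =>
    demoOpReportersStorage.run(makeDemoLogger(ctx), fn),
  );
}

// Application code inside the operation reads both stores.
async function demoJob(): Promise<string> {
  await Promise.resolve();
  demoOpReportersStorage.getStore()!.log("job ran");
  return demoOpContextStorage.getStore()!.requestId;
}
```

Nesting the two run calls scopes both stores to exactly one operation, which is why run is preferable to enterWith: nothing leaks into whatever executes next.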

See examples throughout @saflib.

Recording Metrics

Metrics should be recorded through subsystem libraries as well, using prom-client. @saflib/express uses express-prom-bundle to record metrics for HTTP requests, and the other SAF libraries use prom-client directly to provide a similar histogram metric. See examples.

Beyond these basic RED metrics, SAF does not currently provide other guidance or built-in metrics for back-end code, but there's certainly room for more, potentially through traces and finite state machines.
