---
title: Loggers | Dagster
description: Dagster includes a rich and extensible logging system.
---

# Loggers

As part of its rich, extensible logging system, Dagster includes loggers. Loggers can be applied to all jobs within a code location or, in advanced cases, overridden at the job level.

Logging handlers are automatically invoked whenever ops in a job log messages, meaning out-of-the-box loggers track all execution events. Loggers can also be customized to meet your specific needs.


## Relevant APIs

| Name | Description |
| ---- | ----------- |
| `@logger` | The decorator used to define loggers. The decorator returns a `LoggerDefinition`. |
| `LoggerDefinition` | The class for loggers. You almost never want to initialize this class directly; instead, use the `@logger` decorator. |
| `OpExecutionContext` | The context object available to an op compute function. |
| `InitLoggerContext` | The context object passed to a custom logger's initialization function. |
| `build_init_logger_context` | A function to construct an `InitLoggerContext` outside of execution, used primarily for testing purposes. |

## Defining loggers

By default, Dagster comes with a built-in logger that tracks all execution events. Built-in loggers are defined internally using the `LoggerDefinition` class. The `@logger` decorator exposes a simpler API for the common logging use case and is typically what you should use to define your own loggers.

The decorated function should take a single argument, the `init_context` available during logger initialization, and return a `logging.Logger`. Refer to the Customizing loggers guide for an example.
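
As a minimal sketch (the name `my_console_logger` and the format string are illustrative, not part of Dagster's API):

```python
import logging

from dagster import logger


@logger(description="A minimal custom console logger (illustrative)")
def my_console_logger(init_context) -> logging.Logger:
    # Build an ordinary stdlib logger; Dagster routes context.log calls
    # from ops through whatever logger this function returns.
    custom_logger = logging.getLogger("my_console_logger")
    custom_logger.setLevel(logging.INFO)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(name)s :: %(levelname)s :: %(message)s")
    )
    custom_logger.addHandler(handler)
    return custom_logger
```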


## Using loggers

### Logging from an op

Any op can emit log messages at any point in its computation. For example:

```python
# demo_logger.py

from dagster import job, op, OpExecutionContext


@op
def hello_logs(context: OpExecutionContext):
    context.log.info("Hello, world!")


@job
def demo_job():
    hello_logs()
```

### Using built-in loggers

When the above `demo_job` is run in the terminal:

```shell
dagster job execute -f demo_logger.py -j demo_job
```

The messages will display in the terminal, having been logged through a built-in logger:

```
2023-09-21 12:07:26 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25434 - RUN_START - Started execution of run for "demo_job".
2023-09-21 12:07:26 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25434 - ENGINE_EVENT - Executing steps using multiprocess executor: parent process (pid: 25434)
2023-09-21 12:07:26 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25434 - hello_logs - STEP_WORKER_STARTING - Launching subprocess for "hello_logs".
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - STEP_WORKER_STARTED - Executing step "hello_logs" in subprocess.
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - hello_logs - RESOURCE_INIT_STARTED - Starting initialization of resources [io_manager].
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - hello_logs - RESOURCE_INIT_SUCCESS - Finished initialization of resources [io_manager].
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - LOGS_CAPTURED - Started capturing logs in process (pid: 25438).
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - hello_logs - STEP_START - Started execution of step "hello_logs".
2023-09-21 12:07:30 -0400 - dagster - INFO - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - hello_logs - Hello, world!
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - hello_logs - STEP_OUTPUT - Yielded output "result" of type "Any". (Type check passed).
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - hello_logs - Writing file at: /Users/erincochran/Desktop/dagster-examples/project-dagster-university/tmpzis_rf84/storage/219c7b51-b62f-4e5b-8de8-0e7a616b961c/hello_logs/result using PickledObjectFilesystemIOManager...
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - hello_logs - HANDLED_OUTPUT - Handled output "result" using IO manager "io_manager"
2023-09-21 12:07:30 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25438 - hello_logs - STEP_SUCCESS - Finished execution of step "hello_logs" in 49ms.
2023-09-21 12:07:31 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25434 - ENGINE_EVENT - Multiprocess executor: parent process exiting after 4.52s (pid: 25434)
2023-09-21 12:07:31 -0400 - dagster - DEBUG - demo_job - 219c7b51-b62f-4e5b-8de8-0e7a616b961c - 25434 - RUN_SUCCESS - Finished execution of run for "demo_job".
```

The context object passed to every op execution includes the built-in log manager, `context.log`. It exposes the usual `debug`, `info`, `warning`, `error`, and `critical` methods you would expect anywhere else in Python.
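
For instance, a brief sketch exercising several of these levels (the op name `leveled_op` is illustrative):

```python
from dagster import op, OpExecutionContext


@op
def leveled_op(context: OpExecutionContext):
    # Each call is routed through every logger attached to the job.
    context.log.debug("fine-grained detail, filtered out at higher log levels")
    context.log.warning("something looks off, but execution continues")
    context.log.error("something went wrong; logged as an event, not raised")
```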


### Viewing logs in the Dagster UI

When jobs are run, the logs stream back to the UI's **Run details** page in real time. The UI surfaces two types of logs:

- **Structured event logs**, which record Dagster's structured events: step starts and completions, outputs, materializations, and so on
- **Raw compute logs**, which capture the stdout and stderr of the process executing the step

### Debugging using logs

Errors in user code are caught by Dagster machinery so that jobs gracefully halt or continue to execute, and the error message, including the original stack trace, is logged both to the console and back to the UI.

For example, if an error is introduced into an op's logic:

```python
# demo_logger_error.py

from dagster import job, op, OpExecutionContext


@op
def hello_logs_error(context: OpExecutionContext):
    raise Exception("Somebody set up us the bomb")


@job
def demo_job_error():
    hello_logs_error()
```

Messages at level `ERROR` or above are highlighted both in the UI and in the console logs, so they can be easily identified even without filtering:

*ERROR level in logs in the Dagster UI*

In many cases, especially for local development, this log viewer, coupled with op re-execution, is sufficient to enable a fast debug cycle for job implementation.


## Configuring loggers

Suppose that we've worked the kinks out of our jobs developing locally, and now we want to run in production, without all of the DEBUG-level log spew that was helpful during development.

Just like ops, loggers can be configured when you run a job. For example, to filter out all messages below `ERROR` from the colored console logger, add the following snippet to your run config YAML:

```yaml
loggers:
  console:
    config:
      log_level: ERROR
```

When a job with the above config is executed, you'll only see `ERROR`-level logs.
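
If you execute the job from Python instead, the same config can be passed through `run_config`; a sketch, assuming the `demo_job` from demo_logger.py above:

```python
from demo_logger import demo_job

# Run in process with the built-in console logger filtered to ERROR.
result = demo_job.execute_in_process(
    run_config={"loggers": {"console": {"config": {"log_level": "ERROR"}}}}
)
assert result.success
```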

### Environment-specific logging using jobs

Logging is environment-specific: for example, you don't want messages generated by data scientists' local development loops to be aggregated with production messages. On the other hand, you may find that console logging is irrelevant or even counterproductive in production.

Dagster recognizes this by attaching loggers to jobs so that you can seamlessly switch from environment to environment without changing any code. For example, let's say you want to switch from CloudWatch logging in production to console logging in development and test:

```python
from dagster import OpExecutionContext, graph, op

# colored_console_logger is Dagster's built-in colored console logger;
# cloudwatch_logger ships with the dagster-aws package.
from dagster._loggers import colored_console_logger
from dagster_aws.cloudwatch.loggers import cloudwatch_logger


@op
def log_op(context: OpExecutionContext):
    context.log.info("Hello, world!")


@graph
def hello_logs():
    log_op()


local_logs = hello_logs.to_job(
    name="local_logs", logger_defs={"console": colored_console_logger}
)
prod_logs = hello_logs.to_job(
    name="prod_logs", logger_defs={"cloudwatch": cloudwatch_logger}
)
```

In the UI, you can view and execute the `prod_logs` job and edit the config to use the new CloudWatch logger. For example:

```yaml
loggers:
  cloudwatch:
    config:
      log_level: ERROR
      log_group_name: /my/cool/cloudwatch/log/group
      log_stream_name: very_good_log_stream
```

### Specifying default code location loggers

Default loggers can be specified on a `Definitions` object by supplying a dictionary of logger definitions to the `loggers` argument.

When specified, these loggers will be added to every job in the code location and applied to all asset materializations it performs:

```python
from dagster import Definitions, asset, define_asset_job

# json_console_logger is assumed to be a custom logger built with the
# @logger decorator (see "Defining loggers"); my_loggers is a placeholder module.
from my_loggers import json_console_logger


@asset
def some_asset(): ...


the_job = define_asset_job("the_job", selection="*")


defs = Definitions(
    jobs=[the_job], assets=[some_asset], loggers={"json_logger": json_console_logger}
)
```

Note: If loggers are explicitly specified at the job level, they will override those provided to the `Definitions` object.
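
As a sketch of that precedence (the logger and names below are illustrative), a job that declares its own `logger_defs` ignores the `Definitions`-level default:

```python
import logging

from dagster import Definitions, job, logger, op


@logger
def plain_logger(init_context) -> logging.Logger:
    # A trivial custom logger, for illustration only.
    stdlib_logger = logging.getLogger("plain_logger")
    stdlib_logger.addHandler(logging.StreamHandler())
    return stdlib_logger


@op
def noisy_op(context):
    context.log.info("hello")


# Because this job declares logger_defs, loggers passed to the
# Definitions object do not apply to it.
@job(logger_defs={"plain": plain_logger})
def override_job():
    noisy_op()


# The Definitions-level logger applies only to jobs that don't set
# their own logger_defs; override_job above will not use it.
defs = Definitions(jobs=[override_job], loggers={"default": plain_logger})
```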

