Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TaggedLogging functionality when broadcasting to another logger: #44695

Merged
merged 1 commit into from Mar 16, 2022

Conversation

Edouard-chin
Copy link
Member

@Edouard-chin Edouard-chin commented Mar 15, 2022

Fix TaggedLogging functionality when broadcasting:

  • This PR fixes two issues with the Tagged Logging feature in
    conjunction with broadcasting logs.

    For the sake of clarity I'll define the "main logger" and
    the "broadcasted logger" in this snippet:

      main_logger = ActiveSupport::TaggedLogging.new(ActiveSupport::Logger.new(io))
      broadcaster_logger = ActiveSupport::TaggedLogging.new(ActiveSupport::Logger.new(io))
    
      main_logger.extend(Activesupport::Logger.broadcast(broadcasted_logger))
    1. The first issue that this PR fixes, is that the tags on the "main logger"
      don't propagate to the "broadcasted logger" when you pass in a block.

      main_logger.tagged("FOO") { |logger| logger.info("Hello") }
      
      # Outputs:
      # [Foo] Hello <- By the main logger
      # Hello       <- By the broadcasted logger

      A fix was made in 70af536
      but that only works for the non block version

    2. It's quite common for the "broadcasted logger" to have a diffent
      log formatter that the "main logger". In example you'd want to
      output JSON logs in one and raw text in the other.

      That wasn't possible before. All loggers had to have the same
      instance of the formatter. The formatter was set on all loggers
      thanks to this and it's associated test
      This requirement was needed to make the Tagged Logging feature
      work; the tags being set in a thread variable whose name
      uses the object_id

      thread_key = @thread_key ||= "activesupport_tagged_logging_tags:#{object_id}"

      (different formatter instance -> different object_id -> different variables)

      In this PR, I have removed the code that sets the same formatter
      instance on all logger. The "broadcaster logger" just need to
      have the current_tags point to the current_tags of the
      "main logger", I'm doing that by redefing the current_tags
      method each time the "main logger" uses a different formatter.

      The advantages by doing so is that custom made formatter
      can now call this current_tags method, which will return
      all the tags and process them the way they want.

      class JSONLogFormatter
        def call(_, _, _, msg)
          tags = current_tags # Can now retrieve the tags
          { message: msg, tags: tags }.to_json
        end
      end
      
       broadcasted_logger = Logger.new(io)
       broadcaster_logger.formatter = JSONLogFormatter.new
       main_logger = Logger.new(io)
       main_logger.extend(ActiveSupport::Logger.broadcast(broadcasted_logger))

      The behavior remains the same as before if a logger uses the
      Rails vanilla formatter or the Tagged Logging formatter.

@Edouard-chin Edouard-chin force-pushed the ec-tagger-logger-broadcast branch 2 times, most recently from b7708be to 90da0ce Compare March 15, 2022 09:13
@jdelStrother
Copy link
Contributor

@Edouard-chin hm, I think this further breaks the way I use Logger.broadcast (#40759 (comment)).

I use Logger.broadcast so that I can have one normal plaintext log file, in addition to a json-formatted log file that gets ingested by NewRelic's log service. Both files get written to when I call Rails.logger.info "Hello", but they'd have different content - eg

2022-03-15T11:44:04.278993 INFO Hello

vs

{"hostname":"app1-1","entity.guid":"NTIwMnxBUE18QVBQTElDQVRJT058MTIxOTA","message":"Hello","timestamp":1647344453753}

This broke when I upgraded to Rails 7 - for now, I'm patching around it by removing the formatter= method from the Logger.broadcast module.

It sounds like you're expecting broadcasted loggers to always contain identical content, is that right?

@Edouard-chin
Copy link
Member Author

Ah, indeed that's problematic.

In Rails 6 things worked the way you are expecting but it was by chance, I'm pretty sure the implementation always assumed that the formatter is the same amongst logger (this is why the formatter= method get overridden to propagate it to all loggers

define_method(:formatter=) do |formatter|
logger.formatter = formatter
super(formatter)
end

This is needed because TaggedLogging makes use of the object_id to retrieve the tags
thread_key = @thread_key ||= "activesupport_tagged_logging_tags:#{object_id}"

Anyway, your use case definitely make sense though and we should officially support it. I'll dig a bit more to see how we can refactor that

@Edouard-chin Edouard-chin force-pushed the ec-tagger-logger-broadcast branch 2 times, most recently from 77051cd to 64a818b Compare March 15, 2022 20:06
@Edouard-chin
Copy link
Member Author

@jdelStrother I have updated the PR to stop propagating formatter on all loggers. Can you check if this fixes your use case ?

- This PR fixes two issues with the Tagged Logging feature in
  conjunction with broadcasting logs.

  For the sake of clarity I'll define the "main logger" and
  the "broadcasted logger" in this snippet:

  ```ruby
    main_logger = ActiveSupport::TaggedLogging.new(ActiveSupport::Logger.new(io))
    broadcaster_logger = ActiveSupport::TaggedLogging.new(ActiveSupport::Logger.new(io))

    main_logger.extend(Activesupport::Logger.broadcast(broadcasted_logger))
  ```

  1) The first issue that this PR fixes, is that the tags on the "main logger"
     don't propagate to the "broadcasted logger" when you pass in a block.

     ```ruby
     main_logger.tagged("FOO") { |logger| logger.info("Hello") }

     # Outputs:
     # [Foo] Hello <- By the main logger
     # Hello       <- By the broadcasted logger
     ```

     A fix was made in rails@70af536
     but that only works for the non block version

  2) It's quite common for the "broadcasted logger" to have a diffent
     log formatter that the "main logger". In example you'd want to
     output JSON logs in one and raw text in the other.

     That wasn't possible before. All loggers had to have the same
     instance of the formatter. The formatter was set on all loggers
     thanks to [this](https://github.com/rails/rails/blob/3fc9d12875dd434e454f74472da02014f6fa5d72/activesupport/lib/active_support/logger.rb#L45-L48) and it's [associated test](https://github.com/rails/rails/blob/3fc9d12875dd434e454f74472da02014f6fa5d72/activesupport/test/broadcast_logger_test.rb#L58-L64)
     This requirement was needed to make the Tagged Logging feature
     work; the tags being set in a thread variable whose name
     uses the `object_id` https://github.com/rails/rails/blob/3fc9d12875dd434e454f74472da02014f6fa5d72/activesupport/lib/active_support/tagged_logging.rb#L59
     (different formatter instance -> different object_id -> different variables)

     In this PR, I have removed the code that sets the same formatter
     instance on all logger. The "broadcaster logger" just need to
     have the `current_tags` point to the `current_tags` of the
     "main logger", I'm doing that by redefing the `current_tags`
     method each time the "main logger" uses a different formatter.

     The advantages by doing so is that custom made formatter
     can now call this `current_tags` method, which will return
     all the tags and process them the way they want.

     ```ruby
       class JSONLogFormatter
         def call(_, _, _, msg)
	   tags = current_tags # Can now retrieve the tags

	   { message: msg, tags: tags }.to_json
	 end
       end

       broadcasted_logger = Logger.new(io)
       broadcaster_logger.formatter = JSONLogFormatter.new
       main_logger = Logger.new(io)
       main_logger.extend(ActiveSupport::Logger.broadcast(broadcasted_logger))
     ```

     The behavior remains the same as before if a logger uses the
     Rails vanilla formatter or the Tagged Logging formatter.
@rafaelfranca
Copy link
Member

While we are on this. Maybe in another PR, we should change the way boardcast work to instead of using inheritance to do it, use composition.

BroadcastLogger.new(main_logger, another_loader)

@jdelStrother
Copy link
Contributor

@jdelStrother I have updated the PR to stop propagating formatter on all loggers. Can you check if this fixes your use case ?

Yep, looks good!

@Edouard-chin
Copy link
Member Author

[...] to instead of using inheritance to do it, use composition.

Yeah that'd help with readability and usage.

Also, I'm just realising that this broadcast feature was private API (or at least it's written as :nodoc:), is there a reason? I'd be encline to write doc and make it public, WDYT ?

@rafaelfranca
Copy link
Member

Also, I'm just realising that this broadcast feature was private API (or at least it's written as :nodoc:), is there a reason? I'd be encline to write doc and make it public, WDYT ?

I think we should make it public when it uses composition. I'm not too comfortable with the current implementation being public.

@rafaelfranca rafaelfranca merged commit 5e770a3 into rails:main Mar 16, 2022
rafaelfranca added a commit that referenced this pull request Mar 16, 2022
Fix TaggedLogging functionality when broadcasting to another logger:
@Edouard-chin Edouard-chin deleted the ec-tagger-logger-broadcast branch March 17, 2022 13:42
@rafaelfranca
Copy link
Member

I'll have to revert this since the tagged method without a block is now leaking tags to the broadcasted logger. See #45854. @Edouard-chin if you want to work on fixing that, please submit a new PR with those fixes as well, but I could not find a way to keep this and fix that bug. I'm more and more convinced we need to change how boardcast is implemented to use composition.

rafaelfranca added a commit that referenced this pull request Sep 8, 2022
…broadcast"

This reverts commit 31925f5.

This was causing tags to leak to the broadcast logger when
`tagged` without a block is used.

Fix #45854.
rafaelfranca added a commit that referenced this pull request Sep 8, 2022
…broadcast"

This reverts commit 5e770a3, reversing
changes made to 989a04d.

This was causing tags to leak to the broadcast logger when
`tagged` without a block is used.

Fix #45854.
@Edouard-chin
Copy link
Member Author

Arf, thanks for letting me know. That's a tricky one.
Yeah fully agree the broadcast feature needs to be reimplemented, it's hard to understand it, too much metaprogramming.

Edouard-chin added a commit to Edouard-chin/rails that referenced this pull request Jul 3, 2023
- Opening this PR to get some first impression and feedback and see if
  that’s the path we want to take.

  ## Context

  While working on rails#44695, I
  realised that Broadcasting was still a private API, although it’s
  commonly used. Rafael mentioned that making it public would require
  some refactor because of the original implementation which was hard
  to understand and maintain.

  TaggedLogging is another piece of this PR, while it’s not related
  to broadcasting, both features combined were a source of issues and
  confusion (see rails#38850, rails#27792, rails#45854 and some more).
  Broadcasting and tagged logging were a bit entangled and I felt it
  would be easier to have the bigger picture in a single PR.

  TaggedLogging is public and the new implementation doesn’t introduce
  any breaking change.

  Broadcasting is in a grey zone, it’s not referenced in our docs but
  I saw it often in apps and libraries. This refactor would break them.
  Happy to revisit that and find a way to make it compatible.

  ## Implementation

  The changes would make a lot of diff chunks, so to make it easier to
  review I opted to not modify the original files and free the constant
  name (Logger, TaggedLogging) for the new implementation that are
  inside new files.
  All code in this PR is new and uses code from the previous
  implementation that don’t appear in the diff. The goal would be to
  copy/paste the required code at the end of the review process.

  ---------------

  ### Changing how broadcasting works:

  Broadcasting in a nutshell worked by “transforming” an existing
  logger into a broadcasted one.
  The logger would then be responsible to log and format its own
  messages as well as passing the message along to other logger it
  broadcasts to.

  The problem with this approach was the following:

  - Heavy use of metaprogramming.
  - Accessing the loggers in the broadcast wasn’t possible.
    Removing a logger from the broadcast either.
  - More importantly, modifying the main logger (the one that broadcasts
    logs to the others) wasn’t possible and the main source of
    misunderstanding.

    ```ruby
      logger = Logger.new(STDOUT)
      stderr_logger = Logger.new(STDER))
      logger.extend(AS::Logger.broadcast(stderr_logger))

      logger.level = DEBUG # This modifies the level on all other loggers
      logger.formatter = … # Modified the formatter on all other loggers
    ```

  -> New approach

  To keep the contract unchanged on what Rails.logger returns, the new
  implementation is still a subclass of Logger.
  The difference is that now the broadcast logger just delegate al
  methods to all the other loggers it’s broadcasting to.
  It’s simple and boring and it’s now just an array that gets
  iterated over.

  Now, users can access all loggers inside the broadcast and modify
  them on the fly. They can also remove any logger from the broadcast
  at any time.

  ```ruby
  # Before

  stdout_logger = Logger.new(STDOUT)
  stderr_logger = Logger.new(STDER)
  file_logger = Logger.new(“development.log”)
  stdout_logger.extend(AS::Logger.broadcast(stderr_logger))
  stdout_logger.extend(AS::Logger.broadcast(file_logger))

  # After

  broadcast = BroadcastLogger.new
  broadcast.broadcast_to(stdout_logger, stderr_logger, file_logger)
  ```

  I also think that now, it should be more clear for users that the
  broadcast sole job is to pass everything to the whole loggers in
  the broadcast. So there should be no surprise that all loggers in
  the broadcast get their level modified when they call
  `broadcast.level = DEBUG` .

  It’s also easier to wrap your head around more complex setup such
  as broadcasting logs to another broadcast:
  `broadcast.broadcast_to(stdout_logger, other_broadcast)`

  ### Changing TaggedLogging

  Tagged logging is painful to implement because there is basically no
  good way to hook into the vanilla logger code. The easiest is to
  hook on the formatter but IMHO this is implemented at the wrong
  level.

  Adding tags on the formatter means:

  - Monkeypatching the formatter on the logger. With the broadcasting
    feature, that meant modifying all formatters on all loggers.
  - From its name, I would assume that a formatter job is just to
    format. Not add modify the logs and add extra information.

    What I felt was missing was an object responsible to process the
    logs just before it gets formatted. So I implemented a
    “LogProcessor” which seats just after the user pass a log,
    but before it gets formatted.
    I thought it is a good addition that would allow to have
    multiple processors in the case users or libraries need to pass
    their logs into multiple processors.
Edouard-chin added a commit to Edouard-chin/rails that referenced this pull request Jul 3, 2023
- ## Context

  While working on rails#44695, I
  realised that Broadcasting was still a private API, although it’s
  commonly used. Rafael mentioned that making it public would require
  some refactor because of the original implementation which was hard
  to understand and maintain.

  ### Changing how broadcasting works:

  Broadcasting in a nutshell worked by “transforming” an existing
  logger into a broadcasted one.
  The logger would then be responsible to log and format its own
  messages as well as passing the message along to other logger it
  broadcasts to.

  The problem with this approach was the following:

  - Heavy use of metaprogramming.
  - Accessing the loggers in the broadcast wasn’t possible.
    Removing a logger from the broadcast either.
  - More importantly, modifying the main logger (the one that broadcasts
    logs to the others) wasn’t possible and the main source of
    misunderstanding.

    ```ruby
      logger = Logger.new(STDOUT)
      stderr_logger = Logger.new(STDER))
      logger.extend(AS::Logger.broadcast(stderr_logger))

      logger.level = DEBUG # This modifies the level on all other loggers
      logger.formatter = … # Modified the formatter on all other loggers
    ```

  -> New approach

  To keep the contract unchanged on what Rails.logger returns, the new
  implementation is still a subclass of Logger.
  The difference is that now the broadcast logger just delegate al
  methods to all the other loggers it’s broadcasting to.
  It’s simple and boring and it’s now just an array that gets
  iterated over.

  Now, users can access all loggers inside the broadcast and modify
  them on the fly. They can also remove any logger from the broadcast
  at any time.

  ```ruby
  # Before

  stdout_logger = Logger.new(STDOUT)
  stderr_logger = Logger.new(STDER)
  file_logger = Logger.new(“development.log”)
  stdout_logger.extend(AS::Logger.broadcast(stderr_logger))
  stdout_logger.extend(AS::Logger.broadcast(file_logger))

  # After

  broadcast = BroadcastLogger.new(stdout_logger, stderr_logger, file_logger)
  ```

  I also think that now, it should be more clear for users that the
  broadcast sole job is to pass everything to the whole loggers in
  the broadcast. So there should be no surprise that all loggers in
  the broadcast get their level modified when they call
  `broadcast.level = DEBUG` .

  It’s also easier to wrap your head around more complex setup such
  as broadcasting logs to another broadcast:
  `broadcast.broadcast_to(stdout_logger, other_broadcast)`
rafaelfranca pushed a commit to Edouard-chin/rails that referenced this pull request Sep 25, 2023
- ## Context

  While working on rails#44695, I
  realised that Broadcasting was still a private API, although it’s
  commonly used. Rafael mentioned that making it public would require
  some refactor because of the original implementation which was hard
  to understand and maintain.

  ### Changing how broadcasting works:

  Broadcasting in a nutshell worked by “transforming” an existing
  logger into a broadcasted one.
  The logger would then be responsible to log and format its own
  messages as well as passing the message along to other logger it
  broadcasts to.

  The problem with this approach was the following:

  - Heavy use of metaprogramming.
  - Accessing the loggers in the broadcast wasn’t possible.
    Removing a logger from the broadcast either.
  - More importantly, modifying the main logger (the one that broadcasts
    logs to the others) wasn’t possible and the main source of
    misunderstanding.

    ```ruby
      logger = Logger.new(STDOUT)
      stderr_logger = Logger.new(STDER))
      logger.extend(AS::Logger.broadcast(stderr_logger))

      logger.level = DEBUG # This modifies the level on all other loggers
      logger.formatter = … # Modified the formatter on all other loggers
    ```

  To keep the contract unchanged on what Rails.logger returns, the new
  BroadcastLogger class implement duck typing with all methods
  that has the vanilla Ruby Logger class.

  It's a simple and boring PORO that keeps an array of all the loggers
  that are part of the broadcast and iterate over whenever a log is
  sent.
  Now, users can access all loggers inside the broadcast and modify
  them on the fly. They can also remove any logger from the broadcast
  at any time.

  ```ruby
  # Before

  stdout_logger = Logger.new(STDOUT)
  stderr_logger = Logger.new(STDER)
  file_logger = Logger.new(“development.log”)
  stdout_logger.extend(AS::Logger.broadcast(stderr_logger))
  stdout_logger.extend(AS::Logger.broadcast(file_logger))

  # After

  broadcast = BroadcastLogger.new(stdout_logger, stderr_logger, file_logger)
  ```

  I also think that now, it should be more clear for users that the
  broadcast sole job is to pass everything to the whole loggers in
  the broadcast. So there should be no surprise that all loggers in
  the broadcast get their level modified when they call
  `broadcast.level = DEBUG` .

  It’s also easier to wrap your head around more complex setup such
  as broadcasting logs to another broadcast:
  `broadcast.broadcast_to(stdout_logger, other_broadcast)`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants