
Trigger is invoked after log, causing date based triggers to log one line in the wrong file #289

Closed
snaggen opened this issue Oct 25, 2022 · 10 comments
Labels: bug, open pr (An open PR exists)

Comments

snaggen commented Oct 25, 2022

I have a date-based trigger, so when the date changes it returns true and the logs are rotated.
However, because the rolling file appender writes to the log first and then checks the policy, the first log line of each day is written to the old log before it is rotated.

I think it would be better if the policy were checked at the top of the append function, before the log line is written.

estk (Owner) commented Oct 25, 2022

@snaggen thanks for reaching out, I'd be happy to help mentor if you're interested in taking this on.

snaggen (Author) commented Oct 26, 2022

The current code handling this is in src/append/rolling_file/mod.rs:

      fn append(&self, record: &Record) -> anyhow::Result<()> {
          // TODO(eas): Perhaps this is better as a concurrent queue?
          let mut writer = self.writer.lock();
  
          let len = {
              let writer = self.get_writer(&mut writer)?;
              self.encoder.encode(writer, record)?;
              writer.flush()?;
              writer.len
          };
  
          let mut file = LogFile {
              writer: &mut writer,
              path: &self.path,
              len,
          };
  
          // TODO(eas): Idea: make this optionally return a future, and if so, we initialize a queue for
          // data that comes in while we are processing the file rotation.
          self.policy.process(&mut file)
      }

So basically, just moving the self.policy.process(&mut file) call to the top of that function should solve the problem. However, process currently takes a LogFile parameter, which holds an open writer. So: construct the LogFile struct right after the let mut writer = self.writer.lock(); line, and add the policy call there?

Something like

    fn append(&self, record: &Record) -> anyhow::Result<()> {
        // TODO(eas): Perhaps this is better as a concurrent queue?
        let mut writer = self.writer.lock();
        
        let len = {
            let writer = self.get_writer(&mut writer)?;
            writer.len
        };
        let mut file = LogFile {
            writer: &mut writer,
            path: &self.path,
            len,
        };

        // TODO(eas): Idea: make this optionally return a future, and if so, we initialize a queue for
        // data that comes in while we are processing the file rotation.
        self.policy.process(&mut file)?;

        let writer = self.get_writer(&mut writer)?;
        self.encoder.encode(writer, record)?;
        writer.flush()?;
        Ok(())
    }

@snaggen
Copy link
Author

snaggen commented Oct 26, 2022

Or, as a diff:

diff --git a/src/append/rolling_file/mod.rs b/src/append/rolling_file/mod.rs
index d391abb..4f241da 100644
--- a/src/append/rolling_file/mod.rs
+++ b/src/append/rolling_file/mod.rs
@@ -169,11 +169,8 @@ impl Append for RollingFileAppender {
 
         let len = {
             let writer = self.get_writer(&mut writer)?;
-            self.encoder.encode(writer, record)?;
-            writer.flush()?;
             writer.len
         };
-
         let mut file = LogFile {
             writer: &mut writer,
             path: &self.path,
@@ -182,7 +179,12 @@ impl Append for RollingFileAppender {
 
         // TODO(eas): Idea: make this optionally return a future, and if so, we initialize a queue for
         // data that comes in while we are processing the file rotation.
-        self.policy.process(&mut file)
+        self.policy.process(&mut file)?;
+
+        let writer = self.get_writer(&mut writer)?;
+        self.encoder.encode(writer, record)?;
+        writer.flush()?;
+        Ok(())
     }

estk (Owner) commented Oct 26, 2022

Great @snaggen, would you open a PR?

brotskydotcom commented

Sorry to chime in uninvited, but I've also been looking at what's needed for time-based rotation (see also #235) and I'm not sure this is the way to go.

There is a subtle issue introduced by this change: it delays log rotation by size until the next log entry is going to be written. The way the code currently works is that the size of the log file handed to the trigger includes the entry that is triggering the rotation. This is as it should be for size-based rotations, because it means that the rotation is done as soon as an entry has made the file reach the cutoff size. If we were to change the size calculation not to include the current entry, then it's not until the next entry is written that the new size will be seen. And while it's a little far-fetched, this delay does allow odd race conditions to arise, for example:

  • the log rotation policy may have changed, so that a rotation that should have happened under the old policy is not executed.
  • if rotation would have compressed the file (likely), disk space may run low before the next entry is written.

If you think about it, this "delayed until the next entry happens" rotation issue is actually the same problem that this bug is complaining about but in "the other direction": the entry that triggers the rotation is happening after the (timed) trigger point, and so the rotation that should have happened at that trigger point got delayed and so includes the entry.

For both timed and sized triggers to work correctly, I believe we need trigger checks both before and after the entry is written. Even better would be to leave the sized triggers alone and introduce true "timed" triggers that actually rotate at a given time rather than waiting for an entry to be made at all. (That's the only way I think they can meet user expectations. After all, you don't expect to come in at 8am and find yesterday's log still being the current log, even if nothing has been logged since midnight.)

@estk, if you're amenable, I'd like to work on a PR for timed triggers that actually uses timers, and then based on them writes an entry "Timed log rotation triggered " in order to get the existing trigger code to run. (That will also allow combining timed and sized triggers together cleanly, which is another issue that might be complicated by this change.)

@snaggen, if you'd be interested in collaborating on such a PR, I would be glad of the help.
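
A rough sketch of that timer idea, with everything here illustrative rather than an actual log4rs API: a background thread sleeps until the rotation time and then emits a marker record through the normal logging path, so the existing trigger code runs even if the application is otherwise idle.

    use std::thread;
    use std::time::Duration;

    /// Hypothetical helper: log a marker record on a fixed interval so a
    /// time-based trigger gets a chance to fire even when the application
    /// is idle. A real implementation would sleep until the exact next
    /// rotation instant instead of using a fixed interval.
    fn spawn_rotation_heartbeat(interval: Duration) {
        thread::spawn(move || loop {
            thread::sleep(interval);
            // This record goes through the normal appender, so `append`
            // (and with it the policy check) runs like for any other entry.
            log::info!("Timed log rotation triggered");
        });
    }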

snaggen pushed a commit to snaggen/log4rs that referenced this issue Oct 28, 2022
snaggen (Author) commented Oct 28, 2022

And I do not agree that the size-trigger delay is the same kind of bug in the other direction, since the log output would be exactly the same; it is just the rotation time that is delayed until the next log event. And since that happens at an arbitrary time anyway, it is exactly the same as today. So this is really just an imaginary issue you are raising.

There is, however, a very minor issue with the size appender: a 1M size limit currently produces files larger than 1M (which is really just a matter of semantics: is the limit the trigger point or the maximum file size?). This, and the imaginary issue above, could be resolved by passing the log row that is about to be appended to the policy and trigger, so that they can include it in the size calculation and fire if the write would exceed the defined size. That way a 1M size policy would not generate files larger than 1M, as it does now, and it would still trigger on the same log event as today.
However, this is in my opinion a different issue, and should be handled separately, if at all.
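
For illustration, a minimal sketch of that pre-write size check, assuming a hypothetical pending_len parameter (the real log4rs Trigger trait only sees the LogFile, not the entry about to be written):

    /// Hypothetical size trigger that also sees the length of the entry
    /// about to be appended, so it can fire *before* the file would exceed
    /// the limit rather than after.
    struct PreemptiveSizeTrigger {
        limit: u64,
    }

    impl PreemptiveSizeTrigger {
        /// Returns true if appending `pending_len` more bytes would push
        /// the file past the configured limit.
        fn trigger(&self, current_len: u64, pending_len: u64) -> bool {
            current_len + pending_len > self.limit
        }
    }

Combined with a policy check before the write, this would keep every file at or under the configured limit while still rotating on the same log event as today.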

clickonetwo commented

@snaggen I'm trying to find places where we agree rather than places where we disagree :), so I'm sorry if my first post did not come off that way. Let me try again. Here is where I think we agree:

  1. The proposed change alters when log rotation by size happens, but does not alter the content of the log that gets rotated, assuming the log rotation policy remains constant.
  2. The proposed change will ensure that log rotation triggered by a point in time happens before a log entry after that time is written, thus guaranteeing that no "late entries" appear in the rotated log.
  3. The proposed change does not alter the current behavior of rotation by size, which is that logs are only rotated after they reach that size, not just before the next entry would have made them reach that size.

Under that analysis, I agree that this is a positive change, and I further note that it makes it easy for clients to get logs rotated at a specific time by using a time-based trigger and then (as clients) posting a log entry right at that time (which will force the rotation). So...

@snaggen Are you going to submit your proposed change as a PR? @estk seems in favor, and I am now in favor as well.

I will work on a PR that addresses #235 by writing a "client-based trigger" policy which allows clients to trigger log rotation at will (e.g., based on their own timers). This seems like a minimally-invasive way for the existing code to give clients complete control over rotation.
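
A minimal sketch of what such a client-based trigger could look like, using an AtomicBool flag; the names here are illustrative, and wiring it up for real would mean implementing the log4rs Trigger trait:

    use std::sync::atomic::{AtomicBool, Ordering};
    use std::sync::Arc;

    /// Sketch of a client-controlled trigger: rotation fires once after the
    /// client calls `request_rotation`, e.g. from its own timer.
    #[derive(Debug, Clone)]
    struct ManualTrigger {
        requested: Arc<AtomicBool>,
    }

    impl ManualTrigger {
        fn new() -> Self {
            Self { requested: Arc::new(AtomicBool::new(false)) }
        }

        /// Called by the client (e.g. from a timer) to request rotation.
        fn request_rotation(&self) {
            self.requested.store(true, Ordering::SeqCst);
        }

        /// Called from the appender's policy check; consumes the request so
        /// rotation happens exactly once per call to `request_rotation`.
        fn should_rotate(&self) -> bool {
            self.requested.swap(false, Ordering::SeqCst)
        }
    }

The client keeps one clone of the trigger for its timer and hands another to the policy; rotation then happens on the first entry appended after request_rotation is called.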

snaggen (Author) commented Oct 28, 2022

I submitted a PR this morning.

clickonetwo commented

@snaggen Whoops! Sorry, ignore my question above; I see you already submitted the PR. Thanks! I will go ahead with my client trigger based on that.

estk added the open pr label and removed the help wanted label Oct 30, 2022
bconn98 (Collaborator) commented Feb 10, 2024

Fixed as part of #296.

bconn98 closed this as completed Feb 10, 2024