Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional Buffered Events channel. #550

Closed
wants to merge 5 commits into from

Conversation

kris-watts-gravwell
Copy link
Contributor

This PR slightly overhauls the NewWatcher() API so that a list of Options functions can be provided. Currently there is only one configured option but there are a few issues asking for more configuration options so this should allow for an extensible method to add configuration options to NewWatcher.

The goals for this PR:

  1. Do not change the API in any way that would require any refactor by users of the library.
  2. Make all options optional and ensure default behavior is unchanged.
  3. Provide the option to configure a userland buffer in the event channel.

The motivation for this PR

We make extensive use of the fsnotify library but continually run into situations with older RedHat installations where the kernel fsnotify buffer is not large enough to accommodate high load scenarios like log file rotation and the end user cannot modify the kernel parameters. We have been running a fork that created a userland event channel buffer and found that we are able to deal with very large bursts of fsnotify events without being forced to catch errors and recover.

The userland channel buffer will never be as fast as the kernel buffer, and in most cases straight won't be used, but this PR gives users of the library the option to add some padding without changing kernel parameters.

@arp242
Copy link
Member

arp242 commented Jan 19, 2023

What kind of values do you want to set WithBufferedEventChannel() to? If it's not too large, I think always using a buffered channel might be a realistic solution?

@kris-watts-gravwell
Copy link
Contributor Author

kris-watts-gravwell commented Jan 19, 2023

What kind of values do you want to set WithBufferedEventChannel() to? If it's not too large, I think always using a buffered channel might be a realistic solution?

It varies, for our existing fork we leave it to zero and provide a configuration parameter in the application to crank it up.

What we found through pretty aggressive testing is that when the kernel config parameters are sane leaving the channel as unbuffered and forcing the kernel buffer to do its thing was a better solution. The kernel dedups events and is generally really fast.

However, we have a few users that have watched log directories with 10s of thousands of files that all get rotated all at once via logrotate (I know, but we can't control what they do). They are also running unpriveleged on an older RedHat system with small kernel buffers. So we have them optionally crank that channel buffer (100k+) and then the log rotation becomes entirely survivable.

Long story short, leaving it to zero covers 99.5% of uses but having a configurable parameter lets us survive the 0.5% of dumb cases. I don't think an always buffered channel should be the default.

@arp242
Copy link
Member

arp242 commented Jan 19, 2023

Right; "hundreds of thousands" is rather a lot. If it was something like 100 then I'd be okay with just allocating that.


My main concern is that we will now have two kind of With... options: one for AddWith() and one for NewWatcher(), which is not exactly great API design. We only have WithBufferSize() for AddWith() right now, but intuitively it makes sense that either AddWith() or NewWatcher() can accept that, and that probably applies to options we will add in the future too.

My idea was that all options would be per-watch rather than per-watcher, but with WithBufferedEventChannel() that obviously won't work as the channel is already created.

None of this is in any tagged release yet, so we can still change the names or behaviour of anything here. I need to think what the best solution for this is. Some options might be:

  1. Change the names to clarify which option belongs where; can't thing of anything decent that's not half a sentence though.

  2. Maybe make both NewWatcher() and AddWith() accept the same set of options, where NewWatcher() will set the default? Passing WithBufferedEventChannel() to AddWith() will then return an error as a special case. Seems a bit overly complex though.

  3. Instead of NewWatcher(opts ...newOpt) use NewWatcher(size ...int); I don't know if we ever want to add any other option that must be on the Watcher in the future? I can't really think of anything.

  4. Add NewBufferedWatcher() like you did in your fork.

I'm kind of leaning towards option 3 or 4, unless someone can think of another option we really want to add in the future.

@kris-watts-gravwell
Copy link
Contributor Author

kris-watts-gravwell commented Jan 19, 2023

I totally get it. I struggled with what the "right" thing was. The only reason I thought option style parameters might be worth while is this issue: #431

An options parameters thingy might allow shoehorning in something like that through an option.

We love this library and my number 1 priority is to minimize any headaches and ambiguities for you (and your other users).

@arp242
Copy link
Member

arp242 commented Jan 19, 2023

#431 can be a per-watch (rather than per-watcher) option I think? I'm not even sure we need an option for that in first place; just always flushing is probably fine (cc: @horahoradev).

@mattn
Copy link
Member

mattn commented Jan 19, 2023

channel with default size always block sending if data is not received. So this solution can be used to make seamless operations.

@arp242
Copy link
Member

arp242 commented Jan 25, 2023

So personally I'd lean towards NewBufferedWatcher(), as this is the simplest option and I don't expect there will be a need for additional options. I'd be okay with other solutions as well, but I don't really see any advantages to them.

Do you want to rewrite the PR @kris-watts-gravwell? Otherwise I'll take a look at it at some point.

This would basically be the commit from your fork, with FEN support that has been added since, the test you added here, and the API docs from this PR which explain why you might want this function.

If you do, a few small remarks:

  • API docs for NewBufferedWatcher() should be added in the mkdoc.zsh script, rather than "directly". Basically add a set-cmt at the bottom and a new variable like the others. It's a bit esoteric, but this way any future changes/typos/etc. don't need to be fixed 5 times, so it's pretty convenient. If you're having problems with this then I can just do it; don't spend too much time on it (might be difficult to do this from Windows machines for example, and also not 100% sure it'll work well on macOS as I only ever ran it on my Linux machine).

  • I prefer comments as // foo rather than //foo. It's a really minor issue, but it just looks "off" to my eyes without that space, and might as well change that if you're working on it anyway.


Another option not mentioned in my previous comment would be a Watcher.SetBufferSize(n int) function. We can't change the buffer size after a path is already watched, so it would panic or return an error if that's the case, which is why I wouldn't be in favour of it. Just mentioning this so all options are mentioned (all options I could think of anyway).

@kris-watts-gravwell
Copy link
Contributor Author

I will rewrite the pr and get the comments and docs addressed too.

Thanks!

@horahoradev
Copy link
Member

#431 can be a per-watch (rather than per-watcher) option I think? I'm not even sure we need an option for that in first place; just always flushing is probably fine (cc: @horahoradev).

I'd just be very wary of introducing a deadlock if the client's event receiver has already exited before watcher.Close() is called. If the events buffer is already full, and has no handler, and we're only returning from close after we flush all outstanding events, then Close() would block indefinitely, which could introduce dangerous behavior.
It might be better to make it an option just to keep everything backwards-compatible, and make new behavioral requirements opt-in.

@kris-watts-gravwell
Copy link
Contributor Author

kris-watts-gravwell commented Feb 5, 2023

ok, I think i have this PR into a state that matches the consensus. Just adding a new "NewBufferedWatcher" call which can generate a Watcher with a user land Event buffer. I tried to add a comment block that was adequately scary enough that users will only use the new interface if they know exactly what they are doing.

@arp242
Copy link
Member

arp242 commented Feb 9, 2023

One of the tests consistently fails if I run the tests with a buffered watcher:

--- FAIL: TestClose (0.00s)
    --- FAIL: TestClose/events_not_read (0.11s)
        fsnotify_test.go:1034: Events not closed

Not yet sure if that's a bug in the testcase or code 🤔 I've seen some intermittent failures on this before, so could be a bug in the test code that just shows up more consistently here. I'll have to look in to it.

I'll also work on refactoring things a bit so it always runs all tests against both a regular and buffered watcher.

Made my changes over here: https://github.com/fsnotify/fsnotify/commits/configOptions

@arp242 arp242 mentioned this pull request Jul 12, 2023
@arp242
Copy link
Member

arp242 commented Oct 21, 2023

Merged via #572; I finally spent some time fixing up the tests.

You do need to do FSNOTIFY_BUFFER=4096 go test, which is not ideal, but it's a right pain to refactor everything to run the tests more than once, and this is "good enough" for now: as long as it runs in the CI I'm okay with it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants