Optional Buffered Events channel. #550

kris-watts-gravwell · 2023-01-19T16:13:31Z

This PR slightly overhauls the NewWatcher() API so that a list of Options functions can be provided. Currently there is only one configured option but there are a few issues asking for more configuration options so this should allow for an extensible method to add configuration options to NewWatcher.

The goals for this PR:

Do not change the API in any way that would require any refactor by users of the library.
Make all options optional and ensure default behavior is unchanged.
Provide the option to configure a userland buffer in the event channel.

The motivation for this PR

We make extensive use of the fsnotify library but continually run into situations with older RedHat installations where the kernel fsnotify buffer is not large enough to accommodate high load scenarios like log file rotation and the end user cannot modify the kernel parameters. We have been running a fork that created a userland event channel buffer and found that we are able to deal with very large bursts of fsnotify events without being forced to catch errors and recover.

The userland channel buffer will never be as fast as the kernel buffer, and in most cases straight won't be used, but this PR gives users of the library the option to add some padding without changing kernel parameters.

…annels

arp242 · 2023-01-19T16:26:56Z

What kind of values do you want to set WithBufferedEventChannel() to? If it's not too large, I think always using a buffered channel might be a realistic solution?

kris-watts-gravwell · 2023-01-19T16:32:29Z

What kind of values do you want to set WithBufferedEventChannel() to? If it's not too large, I think always using a buffered channel might be a realistic solution?

It varies, for our existing fork we leave it to zero and provide a configuration parameter in the application to crank it up.

What we found through pretty aggressive testing is that when the kernel config parameters are sane leaving the channel as unbuffered and forcing the kernel buffer to do its thing was a better solution. The kernel dedups events and is generally really fast.

However, we have a few users that have watched log directories with 10s of thousands of files that all get rotated all at once via logrotate (I know, but we can't control what they do). They are also running unpriveleged on an older RedHat system with small kernel buffers. So we have them optionally crank that channel buffer (100k+) and then the log rotation becomes entirely survivable.

Long story short, leaving it to zero covers 99.5% of uses but having a configurable parameter lets us survive the 0.5% of dumb cases. I don't think an always buffered channel should be the default.

arp242 · 2023-01-19T17:06:10Z

Right; "hundreds of thousands" is rather a lot. If it was something like 100 then I'd be okay with just allocating that.

My main concern is that we will now have two kind of With... options: one for AddWith() and one for NewWatcher(), which is not exactly great API design. We only have WithBufferSize() for AddWith() right now, but intuitively it makes sense that either AddWith() or NewWatcher() can accept that, and that probably applies to options we will add in the future too.

My idea was that all options would be per-watch rather than per-watcher, but with WithBufferedEventChannel() that obviously won't work as the channel is already created.

None of this is in any tagged release yet, so we can still change the names or behaviour of anything here. I need to think what the best solution for this is. Some options might be:

Change the names to clarify which option belongs where; can't thing of anything decent that's not half a sentence though.
Maybe make both NewWatcher() and AddWith() accept the same set of options, where NewWatcher() will set the default? Passing WithBufferedEventChannel() to AddWith() will then return an error as a special case. Seems a bit overly complex though.
Instead of NewWatcher(opts ...newOpt) use NewWatcher(size ...int); I don't know if we ever want to add any other option that must be on the Watcher in the future? I can't really think of anything.
Add NewBufferedWatcher() like you did in your fork.

I'm kind of leaning towards option 3 or 4, unless someone can think of another option we really want to add in the future.

kris-watts-gravwell · 2023-01-19T17:13:14Z

I totally get it. I struggled with what the "right" thing was. The only reason I thought option style parameters might be worth while is this issue: #431

An options parameters thingy might allow shoehorning in something like that through an option.

We love this library and my number 1 priority is to minimize any headaches and ambiguities for you (and your other users).

arp242 · 2023-01-19T17:18:08Z

#431 can be a per-watch (rather than per-watcher) option I think? I'm not even sure we need an option for that in first place; just always flushing is probably fine (cc: @horahoradev).

mattn · 2023-01-19T23:24:01Z

channel with default size always block sending if data is not received. So this solution can be used to make seamless operations.

arp242 · 2023-01-25T14:49:54Z

So personally I'd lean towards NewBufferedWatcher(), as this is the simplest option and I don't expect there will be a need for additional options. I'd be okay with other solutions as well, but I don't really see any advantages to them.

Do you want to rewrite the PR @kris-watts-gravwell? Otherwise I'll take a look at it at some point.

This would basically be the commit from your fork, with FEN support that has been added since, the test you added here, and the API docs from this PR which explain why you might want this function.

If you do, a few small remarks:

API docs for NewBufferedWatcher() should be added in the mkdoc.zsh script, rather than "directly". Basically add a set-cmt at the bottom and a new variable like the others. It's a bit esoteric, but this way any future changes/typos/etc. don't need to be fixed 5 times, so it's pretty convenient. If you're having problems with this then I can just do it; don't spend too much time on it (might be difficult to do this from Windows machines for example, and also not 100% sure it'll work well on macOS as I only ever ran it on my Linux machine).
I prefer comments as // foo rather than //foo. It's a really minor issue, but it just looks "off" to my eyes without that space, and might as well change that if you're working on it anyway.

Another option not mentioned in my previous comment would be a Watcher.SetBufferSize(n int) function. We can't change the buffer size after a path is already watched, so it would panic or return an error if that's the case, which is why I wouldn't be in favour of it. Just mentioning this so all options are mentioned (all options I could think of anyway).

kris-watts-gravwell · 2023-01-26T02:02:33Z

I will rewrite the pr and get the comments and docs addressed too.

Thanks!

horahoradev · 2023-01-26T15:23:50Z

#431 can be a per-watch (rather than per-watcher) option I think? I'm not even sure we need an option for that in first place; just always flushing is probably fine (cc: @horahoradev).

I'd just be very wary of introducing a deadlock if the client's event receiver has already exited before watcher.Close() is called. If the events buffer is already full, and has no handler, and we're only returning from close after we flush all outstanding events, then Close() would block indefinitely, which could introduce dangerous behavior.
It might be better to make it an option just to keep everything backwards-compatible, and make new behavioral requirements opt-in.

…ying to update the existing NewWatcher

… configOptions

kris-watts-gravwell · 2023-02-05T22:33:51Z

ok, I think i have this PR into a state that matches the consensus. Just adding a new "NewBufferedWatcher" call which can generate a Watcher with a user land Event buffer. I tried to add a comment block that was adequately scary enough that users will only use the new interface if they know exactly what they are doing.

arp242 · 2023-02-09T13:43:54Z

One of the tests consistently fails if I run the tests with a buffered watcher:

--- FAIL: TestClose (0.00s)
    --- FAIL: TestClose/events_not_read (0.11s)
        fsnotify_test.go:1034: Events not closed

Not yet sure if that's a bug in the testcase or code 🤔 I've seen some intermittent failures on this before, so could be a bug in the test code that just shows up more consistently here. I'll have to look in to it.

I'll also work on refactoring things a bit so it always runs all tests against both a regular and buffered watcher.

Made my changes over here: https://github.com/fsnotify/fsnotify/commits/configOptions

arp242 · 2023-10-21T19:33:57Z

Merged via #572; I finally spent some time fixing up the tests.

You do need to do FSNOTIFY_BUFFER=4096 go test, which is not ideal, but it's a right pain to refactor everything to run the tests more than once, and this is "good enough" for now: as long as it runs in the CI I'm okay with it.

adding options parameters to NewWatcher to allow for buffered EventCh…

5e8ff90

…annels

kris-watts-gravwell mentioned this pull request Jan 19, 2023

go.mod replace directives are breaking out of tree builds gravwell/gravwell#575

Open

3 tasks

missed one

5e99d85

kris-watts-gravwell added 3 commits February 5, 2023 15:17

Addressing comments and moving to a new top level call rather than tr…

eafc1de

…ying to update the existing NewWatcher

udpate comment blocks

f5b05d8

Merge remote-tracking branch 'refs/remotes/origin/configOptions' into…

90ac4bb

… configOptions

arp242 mentioned this pull request Jul 12, 2023

Add NewBufferedWatcher() #572

Merged

arp242 closed this Oct 21, 2023

shogo82148 mentioned this pull request Mar 6, 2024

port upstream shogo82148/fsnotify#33

Open

25 tasks

arp242 mentioned this pull request May 1, 2024

Support filtering events, add unsupported events #629

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optional Buffered Events channel. #550

Optional Buffered Events channel. #550

kris-watts-gravwell commented Jan 19, 2023

arp242 commented Jan 19, 2023

kris-watts-gravwell commented Jan 19, 2023 •

edited

arp242 commented Jan 19, 2023

kris-watts-gravwell commented Jan 19, 2023 •

edited

arp242 commented Jan 19, 2023

mattn commented Jan 19, 2023

arp242 commented Jan 25, 2023 •

edited

kris-watts-gravwell commented Jan 26, 2023

horahoradev commented Jan 26, 2023

kris-watts-gravwell commented Feb 5, 2023 •

edited

arp242 commented Feb 9, 2023

arp242 commented Oct 21, 2023

Optional Buffered Events channel. #550

Optional Buffered Events channel. #550

Conversation

kris-watts-gravwell commented Jan 19, 2023

The goals for this PR:

The motivation for this PR

arp242 commented Jan 19, 2023

kris-watts-gravwell commented Jan 19, 2023 • edited

arp242 commented Jan 19, 2023

kris-watts-gravwell commented Jan 19, 2023 • edited

arp242 commented Jan 19, 2023

mattn commented Jan 19, 2023

arp242 commented Jan 25, 2023 • edited

kris-watts-gravwell commented Jan 26, 2023

horahoradev commented Jan 26, 2023

kris-watts-gravwell commented Feb 5, 2023 • edited

arp242 commented Feb 9, 2023

arp242 commented Oct 21, 2023

kris-watts-gravwell commented Jan 19, 2023 •

edited

kris-watts-gravwell commented Jan 19, 2023 •

edited

arp242 commented Jan 25, 2023 •

edited

kris-watts-gravwell commented Feb 5, 2023 •

edited