We got a report that adding a fanotify filsystem watch prevents tail -f from receiving events.
Reproducer:
1. Create 3 windows / login sessions. Become root in each session. 2. Choose a mounted filesystem that is pretty quiet; I picked /boot. 3. In the first window, run: fsnotifywait -S -m /boot 4. In the second window, run: echo data >> /boot/foo 5. In the third window, run: tail -f /boot/foo 6. Go back to the second window and run: echo more data >> /boot/foo 7. Observe that the tail command doesn't show the new data. 8. In the first window, hit control-C to interrupt fsnotifywait. 9. In the second window, run: echo still more data >> /boot/foo 10. Observe that the tail command in the third window has now printed the missing data.
When stracing tail, we observed that when fanotify filesystem mark is set, tail does get the inotify event, but the event is receieved with the filename:
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\20\0\0\0foo\0\0\0\0\0\0\0\0\0\0\0\0\0", 50) = 32
This is unexpected, because tail is watching the file itself and not its parent and is inconsistent with the inotify event received by tail when fanotify filesystem mark is not set:
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0", 50) = 16
The inteference between different fsnotify groups was caused by the fact that the mark on the sb requires the filename, so the filename is passed to fsnotify(). Later on, fsnotify_handle_event() tries to take care of not passing the filename to groups (such as inotify) that are interested in the filename only when the parent is watching.
But the logic was incorrect for the case that no group is watching the parent, some groups are watching the sb and some watching the inode.
Reported-by: Miklos Szeredi miklos@szeredi.hu Fixes: 7372e79c9eb9 ("fanotify: fix logic of reporting name info with watched parent") Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Jan Kara jack@suse.cz --- fs/notify/fsnotify.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-)
This is what I plan to merge into my tree.
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c index 82ae8254c068..f976949d2634 100644 --- a/fs/notify/fsnotify.c +++ b/fs/notify/fsnotify.c @@ -333,16 +333,19 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask, if (!inode_mark) return 0;
- if (mask & FS_EVENT_ON_CHILD) { - /* - * Some events can be sent on both parent dir and child marks - * (e.g. FS_ATTRIB). If both parent dir and child are - * watching, report the event once to parent dir with name (if - * interested) and once to child without name (if interested). - * The child watcher is expecting an event without a file name - * and without the FS_EVENT_ON_CHILD flag. - */ - mask &= ~FS_EVENT_ON_CHILD; + /* + * Some events can be sent on both parent dir and child marks (e.g. + * FS_ATTRIB). If both parent dir and child are watching, report the + * event once to parent dir with name (if interested) and once to child + * without name (if interested). + * + * In any case regardless whether the parent is watching or not, the + * child watcher is expecting an event without the FS_EVENT_ON_CHILD + * flag. The file name is expected if and only if this is a directory + * event. + */ + mask &= ~FS_EVENT_ON_CHILD; + if (!(mask & ALL_FSNOTIFY_DIRENT_EVENTS)) { dir = NULL; name = NULL; }
On Wed, Nov 13, 2024 at 4:55 PM Jan Kara jack@suse.cz wrote:
We got a report that adding a fanotify filsystem watch prevents tail -f from receiving events.
Reproducer:
- Create 3 windows / login sessions. Become root in each session.
- Choose a mounted filesystem that is pretty quiet; I picked /boot.
- In the first window, run: fsnotifywait -S -m /boot
- In the second window, run: echo data >> /boot/foo
- In the third window, run: tail -f /boot/foo
- Go back to the second window and run: echo more data >> /boot/foo
- Observe that the tail command doesn't show the new data.
- In the first window, hit control-C to interrupt fsnotifywait.
- In the second window, run: echo still more data >> /boot/foo
- Observe that the tail command in the third window has now printed
the missing data.
When stracing tail, we observed that when fanotify filesystem mark is set, tail does get the inotify event, but the event is receieved with the filename:
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\20\0\0\0foo\0\0\0\0\0\0\0\0\0\0\0\0\0", 50) = 32
This is unexpected, because tail is watching the file itself and not its parent and is inconsistent with the inotify event received by tail when fanotify filesystem mark is not set:
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0", 50) = 16
The inteference between different fsnotify groups was caused by the fact that the mark on the sb requires the filename, so the filename is passed to fsnotify(). Later on, fsnotify_handle_event() tries to take care of not passing the filename to groups (such as inotify) that are interested in the filename only when the parent is watching.
But the logic was incorrect for the case that no group is watching the parent, some groups are watching the sb and some watching the inode.
Reported-by: Miklos Szeredi miklos@szeredi.hu Fixes: 7372e79c9eb9 ("fanotify: fix logic of reporting name info with watched parent") Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Jan Kara jack@suse.cz
Looks good,
Thanks, Amir.
fs/notify/fsnotify.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-)
This is what I plan to merge into my tree.
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c index 82ae8254c068..f976949d2634 100644 --- a/fs/notify/fsnotify.c +++ b/fs/notify/fsnotify.c @@ -333,16 +333,19 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask, if (!inode_mark) return 0;
if (mask & FS_EVENT_ON_CHILD) {
/*
* Some events can be sent on both parent dir and child marks
* (e.g. FS_ATTRIB). If both parent dir and child are
* watching, report the event once to parent dir with name (if
* interested) and once to child without name (if interested).
* The child watcher is expecting an event without a file name
* and without the FS_EVENT_ON_CHILD flag.
*/
mask &= ~FS_EVENT_ON_CHILD;
/*
* Some events can be sent on both parent dir and child marks (e.g.
* FS_ATTRIB). If both parent dir and child are watching, report the
* event once to parent dir with name (if interested) and once to child
* without name (if interested).
*
* In any case regardless whether the parent is watching or not, the
* child watcher is expecting an event without the FS_EVENT_ON_CHILD
* flag. The file name is expected if and only if this is a directory
* event.
*/
mask &= ~FS_EVENT_ON_CHILD;
if (!(mask & ALL_FSNOTIFY_DIRENT_EVENTS)) { dir = NULL; name = NULL; }
-- 2.35.3
linux-stable-mirror@lists.linaro.org