The current calculation for splitting nodes tries to enforce a minimum
span on the leaf nodes. This code is complex and never worked correctly
to begin with, due to the min value being passed as 0 for all leaves.
The calculation should just split the data as equally as possible
between the new nodes. Note that b_end will be one more than the data,
so the left side is still favoured in the calculation.
The current code may also lead to a deficient node by not leaving enough
data for the right side of the split. This issue is also addressed with
the split calculation change.
[liam: rephrase the change log]
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
CC: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
CC: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
CC: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
---
v3:
* Liam helps rephrase the change log
* add fix tag and cc stable
---
lib/maple_tree.c | 23 ++++++-----------------
1 file changed, 6 insertions(+), 17 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index d0ae808f3a14..4f2950a1c38d 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1863,11 +1863,11 @@ static inline int mab_no_null_split(struct maple_big_node *b_node,
* Return: The first split location. The middle split is set in @mid_split.
*/
static inline int mab_calc_split(struct ma_state *mas,
- struct maple_big_node *bn, unsigned char *mid_split, unsigned long min)
+ struct maple_big_node *bn, unsigned char *mid_split)
{
unsigned char b_end = bn->b_end;
int split = b_end / 2; /* Assume equal split. */
- unsigned char slot_min, slot_count = mt_slots[bn->type];
+ unsigned char slot_count = mt_slots[bn->type];
/*
* To support gap tracking, all NULL entries are kept together and a node cannot
@@ -1900,18 +1900,7 @@ static inline int mab_calc_split(struct ma_state *mas,
split = b_end / 3;
*mid_split = split * 2;
} else {
- slot_min = mt_min_slots[bn->type];
-
*mid_split = 0;
- /*
- * Avoid having a range less than the slot count unless it
- * causes one node to be deficient.
- * NOTE: mt_min_slots is 1 based, b_end and split are zero.
- */
- while ((split < slot_count - 1) &&
- ((bn->pivot[split] - min) < slot_count - 1) &&
- (b_end - split > slot_min))
- split++;
}
/* Avoid ending a node on a NULL entry */
@@ -2377,7 +2366,7 @@ static inline struct maple_enode
static inline unsigned char mas_mab_to_node(struct ma_state *mas,
struct maple_big_node *b_node, struct maple_enode **left,
struct maple_enode **right, struct maple_enode **middle,
- unsigned char *mid_split, unsigned long min)
+ unsigned char *mid_split)
{
unsigned char split = 0;
unsigned char slot_count = mt_slots[b_node->type];
@@ -2390,7 +2379,7 @@ static inline unsigned char mas_mab_to_node(struct ma_state *mas,
if (b_node->b_end < slot_count) {
split = b_node->b_end;
} else {
- split = mab_calc_split(mas, b_node, mid_split, min);
+ split = mab_calc_split(mas, b_node, mid_split);
*right = mas_new_ma_node(mas, b_node);
}
@@ -2877,7 +2866,7 @@ static void mas_spanning_rebalance(struct ma_state *mas,
mast->bn->b_end--;
mast->bn->type = mte_node_type(mast->orig_l->node);
split = mas_mab_to_node(mas, mast->bn, &left, &right, &middle,
- &mid_split, mast->orig_l->min);
+ &mid_split);
mast_set_split_parents(mast, left, middle, right, split,
mid_split);
mast_cp_to_nodes(mast, left, middle, right, split, mid_split);
@@ -3365,7 +3354,7 @@ static void mas_split(struct ma_state *mas, struct maple_big_node *b_node)
if (mas_push_data(mas, height, &mast, false))
break;
- split = mab_calc_split(mas, b_node, &mid_split, prev_l_mas.min);
+ split = mab_calc_split(mas, b_node, &mid_split);
mast_split_data(&mast, mas, split);
/*
* Usually correct, mab_mas_cp in the above call overwrites
--
2.34.1
We got a report that adding a fanotify filsystem watch prevents tail -f
from receiving events.
Reproducer:
1. Create 3 windows / login sessions. Become root in each session.
2. Choose a mounted filesystem that is pretty quiet; I picked /boot.
3. In the first window, run: fsnotifywait -S -m /boot
4. In the second window, run: echo data >> /boot/foo
5. In the third window, run: tail -f /boot/foo
6. Go back to the second window and run: echo more data >> /boot/foo
7. Observe that the tail command doesn't show the new data.
8. In the first window, hit control-C to interrupt fsnotifywait.
9. In the second window, run: echo still more data >> /boot/foo
10. Observe that the tail command in the third window has now printed
the missing data.
When stracing tail, we observed that when fanotify filesystem mark is
set, tail does get the inotify event, but the event is receieved with
the filename:
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\20\0\0\0foo\0\0\0\0\0\0\0\0\0\0\0\0\0",
50) = 32
This is unexpected, because tail is watching the file itself and not its
parent and is inconsistent with the inotify event received by tail when
fanotify filesystem mark is not set:
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0", 50) = 16
The inteference between different fsnotify groups was caused by the fact
that the mark on the sb requires the filename, so the filename is passed
to fsnotify(). Later on, fsnotify_handle_event() tries to take care of
not passing the filename to groups (such as inotify) that are interested
in the filename only when the parent is watching.
But the logic was incorrect for the case that no group is watching the
parent, some groups are watching the sb and some watching the inode.
Reported-by: Miklos Szeredi <miklos(a)szeredi.hu>
Fixes: 7372e79c9eb9 ("fanotify: fix logic of reporting name info with watched parent")
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Amir Goldstein <amir73il(a)gmail.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/notify/fsnotify.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)
This is what I plan to merge into my tree.
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 82ae8254c068..f976949d2634 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -333,16 +333,19 @@ static int fsnotify_handle_event(struct fsnotify_group *group, __u32 mask,
if (!inode_mark)
return 0;
- if (mask & FS_EVENT_ON_CHILD) {
- /*
- * Some events can be sent on both parent dir and child marks
- * (e.g. FS_ATTRIB). If both parent dir and child are
- * watching, report the event once to parent dir with name (if
- * interested) and once to child without name (if interested).
- * The child watcher is expecting an event without a file name
- * and without the FS_EVENT_ON_CHILD flag.
- */
- mask &= ~FS_EVENT_ON_CHILD;
+ /*
+ * Some events can be sent on both parent dir and child marks (e.g.
+ * FS_ATTRIB). If both parent dir and child are watching, report the
+ * event once to parent dir with name (if interested) and once to child
+ * without name (if interested).
+ *
+ * In any case regardless whether the parent is watching or not, the
+ * child watcher is expecting an event without the FS_EVENT_ON_CHILD
+ * flag. The file name is expected if and only if this is a directory
+ * event.
+ */
+ mask &= ~FS_EVENT_ON_CHILD;
+ if (!(mask & ALL_FSNOTIFY_DIRENT_EVENTS)) {
dir = NULL;
name = NULL;
}
--
2.35.3
These are a few patches broken out from [1]. Kalle requested to limit
the number of patches per series to approximately 12 and Francesco to
move the fixes to the front of the series, so here we go.
First two patches are fixes. First one is for host mlme support which
currently is in wireless-next, so no stable tag needed, second one has a
stable tag.
The remaining patches except the last one I have chosen to upstream
first. I'll continue with the other patches after having this series
in shape and merged.
The last one is a new patch not included in [1].
Sascha
[1] https://lore.kernel.org/all/20240820-mwifiex-cleanup-v1-0-320d8de4a4b7@peng…
Signed-off-by: Sascha Hauer <s.hauer(a)pengutronix.de>
---
Changes in v2:
- Add refence to 7bff9c974e1a in commit message of "wifi: mwifiex: drop
asynchronous init waiting code"
- Add extra sentence about bss_started in "wifi: mwifiex: move common
settings out of switch/case"
- Kill now unused MWIFIEX_BSS_TYPE_ANY
- Collect reviewed-by tags from Francesco Dolcini
- Link to v1: https://lore.kernel.org/r/20240826-mwifiex-cleanup-1-v1-0-56e6f8e056ec@peng…
---
Sascha Hauer (12):
wifi: mwifiex: add missing locking
wifi: mwifiex: fix MAC address handling
wifi: mwifiex: deduplicate code in mwifiex_cmd_tx_rate_cfg()
wifi: mwifiex: use adapter as context pointer for mwifiex_hs_activated_event()
wifi: mwifiex: drop unnecessary initialization
wifi: mwifiex: make region_code_mapping_t const
wifi: mwifiex: pass adapter to mwifiex_dnld_cmd_to_fw()
wifi: mwifiex: simplify mwifiex_setup_ht_caps()
wifi: mwifiex: fix indention
wifi: mwifiex: make locally used function static
wifi: mwifiex: move common settings out of switch/case
wifi: mwifiex: drop asynchronous init waiting code
drivers/net/wireless/marvell/mwifiex/cfg80211.c | 38 ++++------
drivers/net/wireless/marvell/mwifiex/cfp.c | 4 +-
drivers/net/wireless/marvell/mwifiex/cmdevt.c | 76 +++++++-------------
drivers/net/wireless/marvell/mwifiex/decl.h | 1 -
drivers/net/wireless/marvell/mwifiex/init.c | 19 ++---
drivers/net/wireless/marvell/mwifiex/main.c | 94 +++++++++----------------
drivers/net/wireless/marvell/mwifiex/main.h | 16 ++---
drivers/net/wireless/marvell/mwifiex/sta_cmd.c | 49 ++++---------
drivers/net/wireless/marvell/mwifiex/txrx.c | 3 +-
drivers/net/wireless/marvell/mwifiex/util.c | 22 +-----
drivers/net/wireless/marvell/mwifiex/wmm.c | 12 ++--
11 files changed, 105 insertions(+), 229 deletions(-)
---
base-commit: 67a72043aa2e6f60f7bbe7bfa598ba168f16d04f
change-id: 20240826-mwifiex-cleanup-1-b5035c7faff6
Best regards,
--
Sascha Hauer <s.hauer(a)pengutronix.de>
From: Zijun Hu <quic_zijuhu(a)quicinc.com>
dev_pm_get_subsys_data() has below 2 issues under condition
(@dev->power.subsys_data != NULL):
- it will do unnecessary kzalloc() and kfree().
- it will return -ENOMEM if the kzalloc() fails, that is wrong
since the kzalloc() is not needed.
Fixed by not doing kzalloc() and returning 0 for the condition.
Fixes: ef27bed1870d ("PM: Reference counting of power.subsys_data")
Cc: stable(a)vger.kernel.org
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
drivers/base/power/common.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/base/power/common.c b/drivers/base/power/common.c
index 8c34ae1cd8d5..13cb1f2a06e7 100644
--- a/drivers/base/power/common.c
+++ b/drivers/base/power/common.c
@@ -26,6 +26,14 @@ int dev_pm_get_subsys_data(struct device *dev)
{
struct pm_subsys_data *psd;
+ spin_lock_irq(&dev->power.lock);
+ if (dev->power.subsys_data) {
+ dev->power.subsys_data->refcount++;
+ spin_unlock_irq(&dev->power.lock);
+ return 0;
+ }
+ spin_unlock_irq(&dev->power.lock);
+
psd = kzalloc(sizeof(*psd), GFP_KERNEL);
if (!psd)
return -ENOMEM;
---
base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
change-id: 20241010-fix_dev_pm_get_subsys_data-2478bb200fde
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
This patch series is to
- Fix an potential wild pointer dereference bug for API:
class_dev_iter_next()
- Improve the following APIs:
class_for_each_device()
class_find_device()
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
Zijun Hu (3):
driver core: class: Fix wild pointer dereference in API class_dev_iter_next()
driver core: class: Correct WARN() message in APIs class_(for_each|find)_device()
driver core: class: Delete a redundant check in APIs class_(for_each|find)_device()
drivers/base/class.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
---
base-commit: 9bd133f05b1dca5ca4399a76d04d0f6f4d454e44
change-id: 20241104-class_fix-f176bd9eba22
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
When testing large folio support with XFS on our servers, we observed that
only a few large folios are mapped when reading large files via mmap.
After a thorough analysis, I identified it was caused by the
`/sys/block/*/queue/read_ahead_kb` setting. On our test servers, this
parameter is set to 128KB. After I tune it to 2MB, the large folio can
work as expected. However, I believe the large folio behavior should not be
dependent on the value of read_ahead_kb. It would be more robust if the
kernel can automatically adopt to it.
With /sys/block/*/queue/read_ahead_kb set to 128KB and performing a
sequential read on a 1GB file using MADV_HUGEPAGE, the differences in
/proc/meminfo are as follows:
- before this patch
FileHugePages: 18432 kB
FilePmdMapped: 4096 kB
- after this patch
FileHugePages: 1067008 kB
FilePmdMapped: 1048576 kB
This shows that after applying the patch, the entire 1GB file is mapped to
huge pages. The stable list is CCed, as without this patch, large folios
don’t function optimally in the readahead path.
It's worth noting that if read_ahead_kb is set to a larger value that isn't
aligned with huge page sizes (e.g., 4MB + 128KB), it may still fail to map
to hugepages.
Fixes: 4687fdbb805a ("mm/filemap: Support VM_HUGEPAGE for file mappings")
Suggested-by: Matthew Wilcox <willy(a)infradead.org>
Signed-off-by: Yafang Shao <laoar.shao(a)gmail.com>
Cc: stable(a)vger.kernel.org
---
mm/readahead.c | 2 ++
1 file changed, 2 insertions(+)
Changes:
v1->v2:
- Drop the align (Matthew)
- Improve commit log (Andrew)
RFC->v1: https://lore.kernel.org/linux-mm/20241106092114.8408-1-laoar.shao@gmail.com/
- Simplify the code as suggested by Matthew
RFC: https://lore.kernel.org/linux-mm/20241104143015.34684-1-laoar.shao@gmail.co…
diff --git a/mm/readahead.c b/mm/readahead.c
index 3dc6c7a128dd..9b8a48e736c6 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -385,6 +385,8 @@ static unsigned long get_next_ra_size(struct file_ra_state *ra,
return 4 * cur;
if (cur <= max / 2)
return 2 * cur;
+ if (cur > max)
+ return cur;
return max;
}
--
2.43.5