Andrew, this should replace these patches in mm-hotfixes-unstable: 400922513f30 ("maple_tree: fix mas_skip_node() end slot detection") 4d4ec28ef3a4 ("test_maple_tree: add more testing for mas_empty_area()")
mas_empty_area() was incorrectly returning an error when there was room. The issue was tracked down to mas_skip_node() using the incorrect end-of-slot count. Instead of using the nodes hard limit, the limit of data should be used.
mas_skip_node() was also setting the min and max to that of the child node, which was unnecessary. Within these limits being set, there was also a bug that corrupted the maple state's max if the offset was set to the maximum node pivot. The bug was without consequence unless there was a sufficient gap in the next child node which would cause an error to be returned.
This patch set fixes these errors by removing the limit setting from mas_skip_node() and uses the mas_data_end() for slot limits, and adds tests for all failures discovered.
Liam R. Howlett (2): maple_tree: Fix mas_skip_node() end slot detection test_maple_tree: Add more testing for mas_empty_area()
lib/maple_tree.c | 24 +++++----------------- lib/test_maple_tree.c | 48 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 53 insertions(+), 19 deletions(-)
mas_skip_node() is used to move the maple state to the node with a higher limit. It does this by walking up the tree and increasing the slot count. Since slot count may not be able to be increased, it may need to walk up multiple times to find room to walk right to a higher limit node. The limit of slots that was being used was the node limit and not the last location of data in the node. This would cause the maple state to be shifted outside actual data and enter an error state, thus returning -EBUSY.
The result of the incorrect error state means that mas_awalk() would return an error instead of finding the allocation space.
The fix is to use mas_data_end() in mas_skip_node() to detect the nodes data end point and continue walking the tree up until it is safe to move to a node with a higher limit.
The walk up the tree also sets the maple state limits so remove the buggy code from mas_skip_node(). Setting the limits had the unfortunate side effect of triggering another bug if the parent node was full and the there was no suitable gap in the second last child, but room in the next child.
mas_skip_node() may also be passed a maple state in an error state from mas_anode_descend() when no allocations are available. Return on such an error state immediately.
Reported-by: Snild Dolkow snild@sony.com Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.c... Cc: Stable@vger.kernel.org Cc: Peng Zhang zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com --- lib/maple_tree.c | 24 +++++------------------- 1 file changed, 5 insertions(+), 19 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c index 2be86368237d..b1db0bd71aed 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -5188,35 +5188,21 @@ static inline bool mas_rewind_node(struct ma_state *mas) */ static inline bool mas_skip_node(struct ma_state *mas) { - unsigned char slot, slot_count; - unsigned long *pivots; - enum maple_type mt; + if (mas_is_err(mas)) + return false;
- mt = mte_node_type(mas->node); - slot_count = mt_slots[mt] - 1; do { if (mte_is_root(mas->node)) { - slot = mas->offset; - if (slot > slot_count) { + if (mas->offset >= mas_data_end(mas)) { mas_set_err(mas, -EBUSY); return false; } } else { mas_ascend(mas); - slot = mas->offset; - mt = mte_node_type(mas->node); - slot_count = mt_slots[mt] - 1; } - } while (slot > slot_count); - - mas->offset = ++slot; - pivots = ma_pivots(mas_mn(mas), mt); - if (slot > 0) - mas->min = pivots[slot - 1] + 1; - - if (slot <= slot_count) - mas->max = pivots[slot]; + } while (mas->offset >= mas_data_end(mas));
+ mas->offset++; return true; }
On 2023-03-07 19:02, Liam R. Howlett wrote:
mas_skip_node() is used to move the maple state to the node with a higher limit. It does this by walking up the tree and increasing the slot count. Since slot count may not be able to be increased, it may need to walk up multiple times to find room to walk right to a higher limit node. The limit of slots that was being used was the node limit and not the last location of data in the node. This would cause the maple state to be shifted outside actual data and enter an error state, thus returning -EBUSY.
The result of the incorrect error state means that mas_awalk() would return an error instead of finding the allocation space.
The fix is to use mas_data_end() in mas_skip_node() to detect the nodes data end point and continue walking the tree up until it is safe to move to a node with a higher limit.
The walk up the tree also sets the maple state limits so remove the buggy code from mas_skip_node(). Setting the limits had the unfortunate side effect of triggering another bug if the parent node was full and the there was no suitable gap in the second last child, but room in the next child.
mas_skip_node() may also be passed a maple state in an error state from mas_anode_descend() when no allocations are available. Return on such an error state immediately.
Reported-by: Snild Dolkow snild@sony.com Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.c... Cc: Stable@vger.kernel.org Cc: Peng Zhang zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com
With the same method as in https://lore.kernel.org/lkml/6f674f9e-9f32-dabb-60be-0e757e145b14@sony.com/, 1000 runs all pass:
linux$ egrep -o 'Failed|Success' qemu.log | sort | uniq -c 1000 Success
If you want it:
Tested-by: Snild Dolkow snild@sony.com
//Snild
Test robust filling of an entire area of the tree, then test one beyond. This is to test the walking back up the tree at the end of nodes and error condition. Test inspired by the reproducer code provided by Snild Dolkow.
The last test in the function tests for the case of a corrupted maple state caused by the incorrect limits set during mas_skip_node(). There needs to be a gap in the second last child and last child, but the search must rule out the second last child's gap. This would avoid correcting the maple state to the correct max limit and return an error.
Cc: Snild Dolkow snild@sony.com Link: https://lore.kernel.org/linux-mm/cb8dc31a-fef2-1d09-f133-e9f7b9f9e77a@sony.c... Cc: Stable@vger.kernel.org Cc: Peng Zhang zhangpeng.00@bytedance.com Fixes: e15e06a83923 ("lib/test_maple_tree: add testing for maple tree") Signed-off-by: Liam R. Howlett Liam.Howlett@oracle.com --- lib/test_maple_tree.c | 48 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 48 insertions(+)
diff --git a/lib/test_maple_tree.c b/lib/test_maple_tree.c index 3d19b1f78d71..f1db333270e9 100644 --- a/lib/test_maple_tree.c +++ b/lib/test_maple_tree.c @@ -2670,6 +2670,49 @@ static noinline void check_empty_area_window(struct maple_tree *mt) rcu_read_unlock(); }
+static noinline void check_empty_area_fill(struct maple_tree *mt) +{ + const unsigned long max = 0x25D78000; + unsigned long size; + int loop, shift; + MA_STATE(mas, mt, 0, 0); + + mt_set_non_kernel(99999); + for (shift = 12; shift <= 16; shift++) { + loop = 5000; + size = 1 << shift; + while (loop--) { + mas_set(&mas, 0); + mas_lock(&mas); + MT_BUG_ON(mt, mas_empty_area(&mas, 0, max, size) != 0); + MT_BUG_ON(mt, mas.last != mas.index + size - 1); + mas_store_gfp(&mas, (void *)size, GFP_KERNEL); + mas_unlock(&mas); + mas_reset(&mas); + } + } + + /* No space left. */ + size = 0x1000; + rcu_read_lock(); + MT_BUG_ON(mt, mas_empty_area(&mas, 0, max, size) != -EBUSY); + rcu_read_unlock(); + + /* Fill a depth 3 node to the maximum */ + for (unsigned long i = 629440511; i <= 629440800; i += 6) + mtree_store_range(mt, i, i + 5, (void *)i, GFP_KERNEL); + /* Make space in the second-last depth 4 node */ + mtree_erase(mt, 631668735); + /* Make space in the last depth 4 node */ + mtree_erase(mt, 629506047); + mas_reset(&mas); + /* Search from just after the gap in the second-last depth 4 */ + rcu_read_lock(); + MT_BUG_ON(mt, mas_empty_area(&mas, 629506048, 690000000, 0x5000) != 0); + rcu_read_unlock(); + mt_set_non_kernel(0); +} + static DEFINE_MTREE(tree); static int maple_tree_seed(void) { @@ -2926,6 +2969,11 @@ static int maple_tree_seed(void) check_empty_area_window(&tree); mtree_destroy(&tree);
+ mt_init_flags(&tree, MT_FLAGS_ALLOC_RANGE); + check_empty_area_fill(&tree); + mtree_destroy(&tree); + + #if defined(BENCH) skip: #endif
linux-stable-mirror@lists.linaro.org