On Tue, 02 Sep 2025, Arunpravin Paneer Selvam Arunpravin.PaneerSelvam@amd.com wrote:
Replace the freelist (O(n)) used for free block management with a red-black tree, providing more efficient O(log n) search, insert, and delete operations. This improves scalability and performance when managing large numbers of free blocks per order (e.g., hundreds or thousands).
In the VK-CTS memory stress subtest, the buddy manager merges fragmented memory and inserts freed blocks into the freelist. Since freelist insertion is O(n), this becomes a bottleneck as fragmentation increases. Benchmarking shows list_insert_sorted() consumes ~52.69% CPU with the freelist, compared to just 0.03% with the RB tree (rbtree_insert.isra.0), despite performing the same sorted insert.
This also improves performance in heavily fragmented workloads, such as games or graphics tests that stress memory.
v3(Matthew):
- Remove RB_EMPTY_NODE check in force_merge function.
- Rename rb for loop macros to have less generic names and move to .c file.
- Make the rb node rb and link field as union.
v4(Jani Nikula):
- The kernel-doc comment should be "/**"
- Move all the rbtree macros to rbtree.h and add parens to ensure correct precedence.
Signed-off-by: Arunpravin Paneer Selvam Arunpravin.PaneerSelvam@amd.com
drivers/gpu/drm/drm_buddy.c | 142 ++++++++++++++++++++++-------------- include/drm/drm_buddy.h | 9 ++- include/linux/rbtree.h | 56 ++++++++++++++ 3 files changed, 152 insertions(+), 55 deletions(-)
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index a94061f373de..978cabfbcf0f 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c
...
+static inline struct drm_buddy_block * +rbtree_last_entry(struct drm_buddy *mm, unsigned int order)
Drive-by reminder that "inline" in a .c file is, in absense of evidence to the contrary, superfluous. Please just let the compiler do its job.
BR, Jani.