Hello,
We discovered that the CONFIG_INFINIBAND_IRDMA configuration option in the Linux kernel causes excessive memory usage in idle mode on specific servers such as the Dell VEP4600 (https://www.dell.com/en-us/shop/ipovw/virtual-edge-platform-4600).
By default we were using Debian's linux-image-6.1.0-13-amd64, which is the stable 6.1.55-1 amd64 build. We then compiled the kernel ourselves with the same config file from the stable 6.1.55 tag and saw the same problem. We were able to resolve the memory problem by removing the `CONFIG_INFINIBAND_IRDMA` option from the kernel config.
The tag used to reproduce the problem is v6.1.55. Adding `CONFIG_INFINIBAND_IRDMA=m` to the config causes the idle memory usage to go from 1.4 GB to 7 GB.
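A quick way to confirm whether a given kernel config has the option set (illustrated on a small invented sample file so the snippet is self-contained; on a Debian system you would grep /boot/config-$(uname -r) instead):

```shell
# Check a kernel config for the irdma option. The sample file below is
# made up for illustration; on a real Debian system, grep the config of
# the running kernel at /boot/config-$(uname -r).
cfg=$(mktemp)
printf 'CONFIG_INFINIBAND=m\nCONFIG_INFINIBAND_IRDMA=m\n' > "$cfg"
grep '^CONFIG_INFINIBAND_IRDMA=' "$cfg"
rm -f "$cfg"
```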
Here are the two config files used and the outputs showing the memory usage in both cases.
Do you have any ideas regarding this bug? We only saw it on the specific hardware model mentioned above.
Debian version of Linux 6.1.55:

    # uname -a
    Linux wibox 6.1.0-13-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29) x86_64 GNU/Linux

    # free -m
                   total        used        free      shared  buff/cache   available
    Mem:           15724        6757        8840           3         410        8967
    Swap:              0           0           0
Compiled version without INFINIBAND:

    # uname -a
    Linux wibox 6.1.55-without-infiniband #75 SMP PREEMPT_DYNAMIC Mon Apr 29 19:06:23 CEST 2024 x86_64 GNU/Linux

    # free -m
                   total        used        free      shared  buff/cache   available
    Mem:           15724        1480       13339           3        1205       14244
    Swap:              0           0           0
Thank you,
On Mon, May 06, 2024 at 05:15:55PM +0200, Brian Baboch wrote:
> We discovered that the CONFIG_INFINIBAND_IRDMA configuration option in the linux kernel is causing excessive memory usage on idle mode on specific servers like the DELL VEP4600. [...]
Hi Brian,
Why do you think that this is a bug? The Dell VEP4600 supports RDMA, so by enabling CONFIG_INFINIBAND_IRDMA you compiled RDMA support for the Intel NIC: https://dl.dell.com/topicspdf/vep4600_tech_guide_en-us.pdf
You can unload the irdma.ko module and restore the memory footprint.
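In case it helps, the usual mechanics for that (standard modprobe/kmod usage; the blacklist file name below is just a convention):

```shell
# Unload the module for the current boot (releases its pre-allocated memory):
modprobe -r irdma

# To keep it from being auto-loaded on future boots, add a blacklist entry,
# e.g. in /etc/modprobe.d/irdma-blacklist.conf (the file name is arbitrary):
#   blacklist irdma
```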
Thanks
Hello Leon,
I feel that it's a bug because I don't understand why this module/option allocates 6 GB of RAM without any explicit configuration or usage on our part. It's also worth mentioning that we are using the default linux-image from Debian bookworm, and it took us a long time to find the reason behind this memory increase by bisecting the kernel's config file. Moreover, the documentation of the module doesn't mention anything about additional memory usage; an increase of 6 GB is huge given that we're not using the feature. So is it expected behavior to have this much of an increase in memory consumption when activating the RDMA option, even if we're not using it? If so, perhaps it would be good to mention this in the documentation.
Thank you
Hi Brian,
I do not think it is a bug. The high memory usage seems to come from these lines inside the irdma_initialize_hw_rsrc function:

    rsrc_size = irdma_calc_mem_rsrc_size(rf);
    rf->mem_rsrc = vzalloc(rsrc_size);

You can read the code of irdma_calc_mem_rsrc_size to understand the 6 GB memory usage.
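To illustrate the scaling (this is not the driver's actual code: the per-object byte sizes below are invented placeholders, and only the resource counts come from this thread):

```python
# Illustrative only: a pool sized once at load time as
#   sum(max_count_i * per_object_context_i)
# grows with the configured maxima even if no object is ever created.
# The QP/CQ counts are the defaults quoted in this thread; the byte
# sizes are made-up placeholders, NOT taken from the irdma driver.

def pool_size(resources):
    """Total bytes to pre-allocate for {name: (max_count, bytes_each)}."""
    return sum(count * size for count, size in resources.values())

defaults = {"qp": (4092, 1_048_576), "cq": (8189, 262_144)}  # placeholder sizes
minimal  = {"qp": (124, 1_048_576),  "cq": (256, 262_144)}   # placeholder sizes

print(pool_size(defaults) // 2**30, "GiB-ish with the default maxima")
print(pool_size(minimal) // 2**20, "MiB-ish with a minimal profile")
```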
You can ask the developers of irdma to optimize memory usage. By the way, module loaded == module used; there is no "loaded and unused".
Konstantin
On 2024/5/7 15:32, Konstantin Taranov wrote:
> I do not think it is a bug. The high memory usage seems to come from these lines: rsrc_size = irdma_calc_mem_rsrc_size(rf); rf->mem_rsrc = vzalloc(rsrc_size); [...]
Exactly. The memory usage is related to the number of QPs: with the irdma defaults, Queue Pairs is 4092 and Completion Queues is 8189, and the memory usage is about 4194302.
Passing the limits_sel parameter to "modprobe irdma" changes the QP numbers; 0 means the minimum, up to 124 QPs.
Please run "modprobe irdma limits_sel=0" to test, and let us know the results.
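A simple way to compare footprints across such runs (plain free/awk; demonstrated here on captured output so the snippet stands alone):

```shell
# Extract the "used" column (in MB) from `free -m` output, so runs with and
# without `modprobe irdma limits_sel=0` can be compared mechanically.
used_mb() { awk '/^Mem:/ {print $3}'; }

# Demonstrated on captured output; on a live system: free -m | used_mb
sample='              total        used        free
Mem:          15724        6757        8840
Swap:             0           0           0'
printf '%s\n' "$sample" | used_mb
```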
Zhu Yanjun
On Tue, May 07, 2024 at 05:24:51PM +0200, Zhu Yanjun wrote:
> The command "modprobe irdma limits_sel" will change QP numbers. 0 means minimum, up to 124 QPs. [...]
It seems like a really unfortunate design choice in this driver not to have dynamic memory allocation.
Burning 6 GB on every server that has your HW, regardless of whether any RDMA apps are run, seems completely excessive.
Jason
Subject: Re: Excessive memory usage when infiniband config is enabled
> It seems like a really unfortunate design choice in this driver to not have dynamic memory allocation. Burning 6G on every server that has your HW, regardless if any RDMA apps are run, seems completely excessive.
So the driver requires pre-allocating backing pages for HW context objects during device initialization, at least for the X722 and E800 series product lines.
And the amount of memory allocated is proportional (primarily) to the max QP count set up for the function.
One option is to set a lower default profile on driver load, which would reduce the memory footprint but expose fewer QPs and other verbs resources per ib_device, and to provide users with a devlink knob to choose a larger or smaller profile as they see fit.
This is roughly what the limits_sel module parameter Yanjun suggested realizes, but it is not available in the in-tree driver.
By the way, what is the specific Intel NIC model in use?

    lspci -vv | grep -E 'Eth.*Intel|Product'
Shiraz
Hello,
Thank you for your answers.
It's unfortunate that by default the irdma module uses an extra 5 GB of RAM, which is huge (more than 30% of the available RAM), and that there's practically no way to reduce it without deactivating the module, since the limits_sel parameter is not available in the in-tree driver, as Shiraz mentioned. (I can confirm that gen1_limits_sel=0 works on my X722 card, I tested it, but that seems off-topic since it's not in the tree.)
In my case, since I don't need the irdma module, I will just blacklist it as suggested. But I think it would be better to change the default value of the resource limit selector so that it doesn't consume this much RAM when the feature is not used.