While allocating all 512 buffers in one block (just over 4MB) is probably not a good idea, you may need to allocate (and dma map) them in groups.
Thanks for reviewing. I have a few questions to double-check the idea.
The original code allocates 512 skbs for the RX ring and DMA-maps them one by one. So the new code would allocate memory 512 times to get 512 buffer arrays. Will the 512 buffer arrays be in one block? Do you mean aggregating the buffers into a scatterlist and using dma_map_sg?
If you malloc a buffer of size (8192+32) the allocator will either round it up to a whole number of (often 4k) pages or to a power of 2 of pages - so either 12k or 16k. I think the Linux allocator does the latter. Some of the allocators also 'steal' a bit from the front of the buffer for 'red tape'.
OTOH malloc the space for 15 buffers and the allocator will round the 15*(8192 + 32) up to 32*4k - and you waste under 8k across all the buffers.
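To put numbers on the rounding argument, here is a small userspace sketch. It assumes 4k pages and takes the (8192+32) buffer size from above; the function names are made up for illustration and are not from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ   4096u            /* assumed page size */
#define RX_BUF_SZ (8192u + 32u)    /* buffer size quoted in the thread */

/* Round up to a whole number of pages. */
static size_t round_to_pages(size_t bytes)
{
    return (bytes + PAGE_SZ - 1) / PAGE_SZ * PAGE_SZ;
}

/* Round up to a power-of-2 number of pages. */
static size_t round_to_pow2_pages(size_t bytes)
{
    size_t size = PAGE_SZ;

    assert(bytes > 0);
    while (size < bytes)
        size *= 2;
    return size;
}

/* Bytes lost to rounding when nbufs buffers share one power-of-2 block. */
static size_t group_waste(size_t nbufs)
{
    size_t need = nbufs * RX_BUF_SZ;

    return round_to_pow2_pages(need) - need;
}
```

With these assumptions a single buffer rounds to 12288 or 16384 bytes, so with power-of-2 rounding each of the 512 separate allocations loses 8160 bytes. A group of 15 needs 123360 bytes, rounds to 131072, and loses 7712 bytes in total.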
You then dma_map the large buffer and split it into the actual rx buffers. Repeat until you've filled the entire ring. The only complication is remembering the base address (and size) for the dma_unmap and free, although there is plenty of padding to extend the buffer structure significantly without using more memory. Allocate in 15s and you (probably) have 512 bytes per buffer. Allocate in 31s and you have 256 bytes.
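A rough userspace sketch of the splitting step, with malloc standing in for the real allocation and DMA mapping. The struct, the function names and the 64-byte stride alignment are all assumptions for illustration, not anything from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RX_PAYLOAD 8192u   /* payload part of each rx buffer */
#define RX_ALIGN   64u     /* assumed per-buffer alignment */

/* Hypothetical bookkeeping for one large allocation. */
struct rx_group {
    unsigned char *base;   /* kept for the later unmap and free */
    size_t size;           /* kept for the later unmap */
    size_t nbufs;
    size_t stride;         /* distance between consecutive rx buffers */
};

/* Allocate one block and work out how to split it into nbufs buffers. */
static int rx_group_init(struct rx_group *g, size_t block_size, size_t nbufs)
{
    assert(nbufs > 0);
    /* Even share of the block, rounded down to the alignment. */
    g->stride = (block_size / nbufs) & ~(size_t)(RX_ALIGN - 1);
    if (g->stride < RX_PAYLOAD)
        return -1;
    g->base = malloc(block_size);   /* stand-in for alloc + dma map */
    if (!g->base)
        return -1;
    g->size = block_size;
    g->nbufs = nbufs;
    return 0;
}

/* Start of the i-th rx buffer inside the block. */
static unsigned char *rx_group_buf(const struct rx_group *g, size_t i)
{
    assert(i < g->nbufs);
    return g->base + i * g->stride;
}

/* Per-buffer room left over beyond the payload. */
static size_t rx_group_header_room(const struct rx_group *g)
{
    return g->stride - RX_PAYLOAD;
}

/* Release the block using the remembered base address. */
static void rx_group_free(struct rx_group *g)
{
    free(g->base);         /* stand-in for dma unmap + free */
    g->base = NULL;
}
```

Under these assumptions a 128k block split 15 ways leaves 512 bytes per buffer beyond the 8192-byte payload, and a 256k block split 31 ways leaves 256 bytes, which matches the figures above.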
The problem is that larger allocates are more likely to fail (especially if the system has been running for some time). So you almost certainly want to be able to fall back to smaller allocates even though they use more memory.
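The fallback could look something like the sketch below. It is userspace code: the allocator is passed in as a function pointer so a failing one can be simulated, and the group sizes 31, 15, 7, 3, 1 are only an assumed sequence. In the driver the allocator would be whatever the kernel path uses; nothing here is taken from the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RX_BUF_SZ (8192u + 32u)    /* buffer size quoted in the thread */

typedef void *(*alloc_fn)(size_t bytes);

/* Hypothetical allocator that mimics a fragmented system: it
 * refuses anything larger than 64k. */
static void *small_only_alloc(size_t bytes)
{
    return bytes > 65536 ? NULL : malloc(bytes);
}

/* Try group sizes 31, 15, 7, 3, 1 (never more than 'wanted').
 * On success return the block and store the group size in *nbufs. */
static void *alloc_rx_group(alloc_fn alloc, size_t wanted, size_t *nbufs)
{
    size_t n;

    assert(alloc && nbufs);
    for (n = 31; n >= 1; n /= 2) {
        void *block;

        if (n > wanted)
            continue;              /* do not over-allocate near the end */
        block = alloc(n * RX_BUF_SZ);
        if (block) {
            *nbufs = n;
            return block;
        }
    }
    *nbufs = 0;
    return NULL;
}
```

The caller would keep calling this with the number of ring slots still empty until the ring is full. With the 64k-limited allocator above, the attempts for 31 and 15 buffers fail and the group of 7 (57568 bytes) succeeds.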
I also wonder if you actually need 512 8k rx buffers to cover interrupt latency? I've not done any measurements for 20 years!
Thanks for the explanation. I am not sure about the combination of 512 8k RX buffers. Maybe the Realtek folks can give us some idea. Tony Chuang, any comment?
Jian-Hong Pan
I don't think 512 RX buffers are necessary, but I haven't had a chance to test whether reducing the number of RX SKBs affects latency. I can run some throughput tests and then decide on the minimum number the RX ring requires. Or you can try it, if you like.
Thanks. Yan-Hsuan