On Thu, Oct 9, 2025 at 8:33 AM Kairui Song <ryncsn@gmail.com> wrote:
On Thu, Oct 9, 2025 at 5:10 AM Chris Li <chrisl@kernel.org> wrote:
I suggest the allocation here detect that a discard is pending and that it is running out of free blocks, then return and indicate the need to discard. The caller performs the discard without holding the lock, similar to what you do in the order == 0 case.
Thanks for the suggestion. Right, that sounds even better. My initial thought was that maybe we can just remove this discard completely since it rarely helps, and if the SSD is really that slow, OOM under heavy
Your argument is that these cases happen very rarely, and I agree with you on that. The follow-up question is: if that rare case does happen, are we doing the best we can in that situation? The V1 patch is not doing the best we can; it pretty much says "I don't care much about the discard, just ignore it unless a failing order 0 allocation forces our hand". As far as I can tell, properly handling the discard-pending condition is not much more complicated than your V1 patch. It might even be simpler, because you no longer need the order 0 failure logic.
pressure might even be an acceptable behaviour. But to make it safer, I made it do discard only when order 0 is failing so the code is simpler.
Let me send a V2 to handle the discard carefully to reduce potential impact.
Great. Looking forward to it.
BTW, in the caller's retry loop, the caller can retry the very swap device that just had the discard performed on it; it does not need to restart from the very first swap device. In that regard, it is also better behavior than V1, or even the existing discard behavior, which waits for all devices to discard.
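
To make the shape of that retry loop concrete, here is a rough userspace sketch in C. It is a toy model only, not the kernel's swap code: every name in it (toy_swap_dev, toy_alloc, toy_discard, toy_alloc_any) is hypothetical, and the "lock" is just a flag so the sketch can check that the discard runs unlocked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
enum alloc_status { ALLOC_OK, ALLOC_NEED_DISCARD, ALLOC_FULL };

struct toy_swap_dev {
	int free_clusters;   /* clusters that can be allocated right now */
	int discard_pending; /* clusters queued for discard, unusable until then */
	bool locked;         /* stand-in for the device lock */
	int discards_done;   /* number of discard passes, for illustration */
};

/*
 * Allocation attempt under the (toy) lock. It never discards by itself;
 * when it runs out of free clusters but a discard is pending, it returns
 * and tells the caller that a discard is needed.
 */
static enum alloc_status toy_alloc(struct toy_swap_dev *dev)
{
	enum alloc_status ret;

	dev->locked = true;
	if (dev->free_clusters > 0) {
		dev->free_clusters--;
		ret = ALLOC_OK;
	} else if (dev->discard_pending > 0) {
		ret = ALLOC_NEED_DISCARD;
	} else {
		ret = ALLOC_FULL;
	}
	dev->locked = false;
	return ret;
}

/* The slow discard runs in the caller, without the lock held. */
static void toy_discard(struct toy_swap_dev *dev)
{
	assert(!dev->locked);
	dev->free_clusters += dev->discard_pending;
	dev->discard_pending = 0;
	dev->discards_done++;
}

/*
 * Caller-side loop: after discarding on a device, retry that same device
 * instead of restarting from the first one. Returns the index of the
 * device that satisfied the allocation, or -1 if all devices are full.
 */
static int toy_alloc_any(struct toy_swap_dev *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		for (;;) {
			enum alloc_status st = toy_alloc(&devs[i]);

			if (st == ALLOC_OK)
				return (int)i;
			if (st != ALLOC_NEED_DISCARD)
				break; /* device is full, try the next one */
			toy_discard(&devs[i]);
		}
	}
	return -1;
}
```

In this model only the device that reported the pending discard pays for it; other devices with pending discards are left alone until an allocation actually reaches them.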
Chris