- Linaro-mm-sig - lists.linaro.org

Re: [Linaro-mm-sig] [RFC PATCH v4 0/2] RDMA/rxe: Add dma-buf support

by Jason Gunthorpe

On Fri, Dec 03, 2021 at 12:51:44PM +0900, Shunsuke Mie wrote:
> Hi maintainers,
>
> Could you please review this patch series?

Why is it RFC?

I'm confused why this is useful?

This can't do copy from MMIO memory, so it shouldn't be compatible with things like Gaudi - does something prevent this?

Jason

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

On 03.12.21 at 15:50, Thomas Hellström wrote:
>
> On 12/3/21 15:26, Christian König wrote:
>> [Adding Daniel here as well]
>>
>> On 03.12.21 at 15:18, Thomas Hellström wrote:
>>> [SNIP]
>>>> Well that's ok as well. My question is why does this single dma_fence
>>>> then shows up in the dma_fence_chain representing the whole
>>>> migration?
>>> What we'd like to happen during eviction is that we
>>>
>>> 1) await any exclusive- or moving fences, then schedule the migration
>>> blit. The blit manages its own GPU ptes. Results in a single fence.
>>> 2) Schedule unbind of any gpu vmas, resulting possibly in multiple
>>> fences.
>>> 3) Most but not all of the remaining resv shared fences will have been
>>> finished in 2) We can't easily tell which so we have a couple of shared
>>> fences left.
>>
>> Stop, wait a second here. We are going a bit in circles.
>>
>> Before you migrate a buffer, you *MUST* wait for all shared fences to
>> complete. This is documented mandatory DMA-buf behavior.
>>
>> Daniel and I have discussed that quite extensively in the last few
>> months.
>>
>> So how does it come that you do the blit before all shared fences are
>> completed?
>
> Well we don't currently but wanted to... (I haven't consulted Daniel
> in the matter, tbh).
>
> I was under the impression that all writes would add an exclusive
> fence to the dma_resv.

Yes that's correct. I'm working on having more than one write fence, but that is currently under review.

> If that's not the case or this is otherwise against the mandatory
> DMA-buf behavior, we can certainly keep that part as is and that
> would eliminate 3).

Ah, now that somewhat starts to make sense.

So your blit only waits for the writes to finish before starting the blit. Yes that's legal as long as you don't change the original content with the blit.

But don't you then need to wait for both reads and writes before you unmap the VMAs?

Anyway the good news is your problem totally goes away with the DMA-resv rework I've already sent out. Basically it is now possible to have more than one fence in the DMA-resv object for migrations and all existing fences are kept around until they are finished.

Regards,
Christian.

>
> /Thomas
>

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

[Adding Daniel here as well]

On 03.12.21 at 15:18, Thomas Hellström wrote:
> [SNIP]
>> Well that's ok as well. My question is why does this single dma_fence
>> then shows up in the dma_fence_chain representing the whole
>> migration?
> What we'd like to happen during eviction is that we
>
> 1) await any exclusive- or moving fences, then schedule the migration
> blit. The blit manages its own GPU ptes. Results in a single fence.
> 2) Schedule unbind of any gpu vmas, resulting possibly in multiple
> fences.
> 3) Most but not all of the remaining resv shared fences will have been
> finished in 2) We can't easily tell which so we have a couple of shared
> fences left.

Stop, wait a second here. We are going a bit in circles.

Before you migrate a buffer, you *MUST* wait for all shared fences to complete. This is documented mandatory DMA-buf behavior.

Daniel and I have discussed that quite extensively in the last few months.

So how does it come that you do the blit before all shared fences are completed?

> 4) Add all fences resulting from 1) 2) and 3) into the per-memory-type
> dma-fence-chain.
> 5) hand the resulting dma-fence-chain representing the end of migration
> over to ttm's resource manager.
>
> Now this means we have a dma-fence-chain disguised as a dma-fence out
> in the wild, and it could in theory reappear as a 3) fence for another
> migration unless a very careful audit is done, or as an input to the
> dma-fence-array used for that single dependency.
>
>> That somehow doesn't seem to make sense because each individual step
>> of the migration needs to wait for those dependencies as well even when
>> it runs in parallel.
>>
>>> But that's not really the point, the point was that an (at least to
>>> me) seemingly harmless usage pattern, be it real or fictitious, ends
>>> up giving you severe internal- or cross-driver headaches.
>> Yeah, we probably should document that better. But in general I don't
>> see much reason to allow mixing containers. The dma_fence_array and
>> dma_fence_chain objects have some distinct use cases and using
>> them to build up larger dependency structures sounds really questionable.
> Yes, I tend to agree to some extent here. Perhaps add warnings when
> adding a chain or array as an input to array and when accidentally
> joining chains, and provide helpers for flattening if needed.

Yeah, that's probably a really good idea. Going to put it on my todo list.

Thanks,
Christian.

>
> /Thomas
>
>
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>
>
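The "helpers for flattening" idea in the message above can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct model_fence, emit(), flatten_array() and flatten() are hypothetical names and not kernel API, and the model accepts at most the worst-case nesting mentioned in this thread (chain -> array -> fence), rejecting anything deeper instead of recursing.

```c
#include <assert.h>
#include <stddef.h>

enum model_kind { MODEL_PLAIN, MODEL_ARRAY, MODEL_CHAIN };

/* Hypothetical stand-in for a dma_fence that may be a container. */
struct model_fence {
	enum model_kind kind;
	struct model_fence **children;	/* MODEL_ARRAY: member fences */
	size_t num_children;		/* MODEL_ARRAY: number of members */
	struct model_fence *fence;	/* MODEL_CHAIN: fence carried by this link */
	struct model_fence *prev;	/* MODEL_CHAIN: older link, or NULL */
};

/* Append one plain fence to out[]; fail when the buffer is full. */
static int emit(const struct model_fence *f, const struct model_fence **out,
		size_t cap, size_t *n)
{
	if (*n == cap)
		return -1;
	out[(*n)++] = f;
	return 0;
}

/* Copy the members of an array; a nested container is rejected. */
static int flatten_array(const struct model_fence *a,
			 const struct model_fence **out, size_t cap, size_t *n)
{
	size_t i;

	for (i = 0; i < a->num_children; i++) {
		if (a->children[i]->kind != MODEL_PLAIN)
			return -1;
		if (emit(a->children[i], out, cap, n))
			return -1;
	}
	return 0;
}

/*
 * Flatten f into out[] without recursion. Returns 0 on success and -1 if
 * out[] is too small or the nesting is deeper than chain -> array -> fence.
 */
static int flatten(const struct model_fence *f,
		   const struct model_fence **out, size_t cap, size_t *n)
{
	const struct model_fence *link;

	assert(f && out && n);
	*n = 0;
	if (f->kind == MODEL_PLAIN)
		return emit(f, out, cap, n);
	if (f->kind == MODEL_ARRAY)
		return flatten_array(f, out, cap, n);

	/* Walk the chain from the newest link to the oldest. */
	for (link = f; link; link = link->prev) {
		const struct model_fence *inner = link->fence;

		if (link->kind != MODEL_CHAIN || !inner ||
		    inner->kind == MODEL_CHAIN)
			return -1;	/* joined chains or malformed link */
		if (inner->kind == MODEL_ARRAY) {
			if (flatten_array(inner, out, cap, n))
				return -1;
		} else if (emit(inner, out, cap, n)) {
			return -1;
		}
	}
	return 0;
}
```

Returning an error for a chain inside a chain, or a container inside an array, plays the role of the warnings proposed in the thread; a real helper would presumably also have to deal with reference counting and RCU, which this model ignores.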

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

On 01.12.21 at 13:16, Thomas Hellström (Intel) wrote:
>
> On 12/1/21 12:25, Christian König wrote:
>> On 01.12.21 at 12:04, Thomas Hellström (Intel) wrote:
>>>
>>> On 12/1/21 11:32, Christian König wrote:
>>>> On 01.12.21 at 11:15, Thomas Hellström (Intel) wrote:
>>>>> [SNIP]
>>>>>>
>>>>>> What we could do is to avoid all this by not calling the callback
>>>>>> with the lock held in the first place.
>>>>>
>>>>> If that's possible that might be a good idea, pls also see below.
>>>>
>>>> The problem with that is
>>>> dma_fence_signal_locked()/dma_fence_signal_timestamp_locked(). If
>>>> we could avoid using that or at least allow it to drop the lock
>>>> then we could call the callback without holding it.
>>>>
>>>> Somebody would need to audit the drivers and see if holding the
>>>> lock is really necessary anywhere.
>>>>
>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> /Thomas
>>>>>>>>>
>>>>>>>>> Oh, and a follow up question:
>>>>>>>>>
>>>>>>>>> If there was a way to break the recursion on final put()
>>>>>>>>> (using the same basic approach as patch 2 in this series uses
>>>>>>>>> to break recursion in enable_signaling()), so that none of
>>>>>>>>> these containers did require any special treatment, would it
>>>>>>>>> be worth pursuing? I guess it might be possible by having the
>>>>>>>>> callbacks drop the references rather than the loop in the
>>>>>>>>> final put. + a couple of changes in code iterating over the
>>>>>>>>> fence pointers.
>>>>>>>>
>>>>>>>> That won't really help, you just move the recursion from the
>>>>>>>> final put into the callback.
>>>>>>>
>>>>>>> How do we recurse from the callback? The introduced fence_put()
>>>>>>> of individual fence pointers
>>>>>>> doesn't recurse anymore (at most 1 level), and any callback
>>>>>>> recursion is broken by the irq_work?
>>>>>>
>>>>>> Yeah, but then you would need to take another lock to avoid
>>>>>> racing with dma_fence_array_signaled().
>>>>>>
>>>>>>>
>>>>>>> I figure the big amount of work would be to adjust code that
>>>>>>> iterates over the individual fence pointers to recognize that
>>>>>>> they are rcu protected.
>>>>>>
>>>>>> Could be that we could solve this with RCU, but that sounds like
>>>>>> a lot of churn for no gain at all.
>>>>>>
>>>>>> In other words even with the problems solved I think it would be
>>>>>> a really bad idea to allow chaining of dma_fence_array objects.
>>>>>
>>>>> Yes, that was really the question, Is it worth pursuing this? I'm
>>>>> not really suggesting we should allow this as an intentional
>>>>> feature. I'm worried, however, that if we allow these containers
>>>>> to start floating around cross-driver (or even internally)
>>>>> disguised as ordinary dma_fences, they would require a lot of
>>>>> driver special casing, or else completely unexpected WARN_ON()s and
>>>>> lockdep splats would start to turn up, scaring people off from
>>>>> using them. And that would be a breeding ground for hairy
>>>>> driver-private constructs.
>>>>
>>>> Well the question is why we would want to do it?
>>>>
>>>> If it's to avoid inter driver lock dependencies by avoiding to call
>>>> the callback with the spinlock held, then yes please. We had tons
>>>> of problems with that, resulting in irq_work and work_item
>>>> delegation all over the place.
>>>
>>> Yes, that sounds like something desirable, but in these containers,
>>> what's causing the lock dependencies is the enable_signaling()
>>> callback that is typically called locked.
>>>
>>>
>>>>
>>>> If it's to allow nesting of dma_fence_array instances, then it's
>>>> most likely a really bad idea even if we fix all the locking order
>>>> problems.
>>>
>>> Well I think my use-case where I hit a dead end may illustrate what
>>> worries me here:
>>>
>>> 1) We use a dma-fence-array to coalesce all dependencies for ttm
>>> object migration.
>>> 2) We use a dma-fence-chain to order the resulting dma_fence into a
>>> timeline because the TTM resource manager code requires that.
>>>
>>> Initially seemingly harmless to me.
>>>
>>> But after a sequence evict->alloc->clear, the dma-fence-chain feeds
>>> into the dma-fence-array for the clearing operation. Code still
>>> works fine, and no deep recursion, no warnings. But if I were to add
>>> another driver to the system that instead feeds a dma-fence-array
>>> into a dma-fence-chain, this would give me a lockdep splat.
>>>
>>> So then if somebody were to come up with the splendid idea of using
>>> a dma-fence-chain to initially coalesce fences, I'd hit the same
>>> problem or risk illegally joining two dma-fence-chains together.
>>>
>>> To fix this, I would need to look at the incoming fences and iterate
>>> over any dma-fence-array or dma-fence-chain that is fed into the
>>> dma-fence-array to flatten out the input. In fact all
>>> dma-fence-array users would need to do that, and even
>>> dma-fence-chain users watching out for not joining chains together
>>> or accidentally add an array that perhaps came as a disguised
>>> dma-fence from another driver.
>>>
>>> So the purpose to me would be to allow these containers as input to
>>> each other without a lot of in-driver special-casing, be it by
>>> breaking recursion on built-in flattening to avoid
>>>
>>> a) Hitting issues in the future or with existing interoperating
>>> drivers.
>>> b) Avoid driver-private containers that also might break the
>>> interoperability. (For example the i915 currently driver-private
>>> dma_fence_work avoid all these problems, but we're attempting to
>>> address issues in common code rather than re-inventing stuff
>>> internally).
>>
>> I don't think that a dma_fence_array or dma_fence_chain is the right
>> thing to begin with in those use cases.
>>
>> When you want to coalesce the dependencies for a job you could either
>> use an xarray like Daniel did for the scheduler or some hashtable
>> like we use in amdgpu. But I don't see the need for exposing the
>> dma_fence interface for those.
>
> This is because the interface to our migration code takes just a
> single dma-fence as dependency. Now this is of course something we
> need to look at to mitigate this, but see below.

Yeah, that's actually fine.

>>
>> And why do you use dma_fence_chain to generate a timeline for TTM?
>> That should come naturally because all the moves must be ordered.
>
> Oh, in this case because we're looking at adding stuff at the end of
> migration (like coalescing object shared fences and / or async unbind
> fences), which may not complete in order.

Well that's ok as well. My question is why does this single dma_fence then shows up in the dma_fence_chain representing the whole migration?

That somehow doesn't seem to make sense because each individual step of the migration needs to wait for those dependencies as well even when it runs in parallel.

> But that's not really the point, the point was that an (at least to
> me) seemingly harmless usage pattern, be it real or fictitious, ends up
> giving you severe internal- or cross-driver headaches.

Yeah, we probably should document that better. But in general I don't see much reason to allow mixing containers. The dma_fence_array and dma_fence_chain objects have some distinct use cases and using them to build up larger dependency structures sounds really questionable.

Christian.

>
> /Thomas
>
>
>>
>> Regards,
>> Christian.
>>
>>

4 years, 2 months


[PATCH] drm/amdkfd: Use max() instead of doing it manually

by Jiapeng Chong

Fix following coccicheck warning:

./drivers/gpu/drm/amd/amdkfd/kfd_svm.c:2193:16-17: WARNING opportunity for max().

Reported-by: Abaci Robot <abaci(a)linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong(a)linux.alibaba.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index f2db49c..4f7e7b1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2190,7 +2190,7 @@ void schedule_deferred_list_work(struct svm_range_list *svms)
 	start = mni->interval_tree.start;
 	last = mni->interval_tree.last;
-	start = (start > range->start ? start : range->start) >> PAGE_SHIFT;
+	start = max(start, range->start) >> PAGE_SHIFT;
 	last = (last < (range->end - 1) ? last : range->end - 1) >> PAGE_SHIFT;
 	pr_debug("[0x%lx 0x%lx] range[0x%lx 0x%lx] notifier[0x%lx 0x%lx] %d\n",
 		 start, last, range->start >> PAGE_SHIFT,
--
1.8.3.1
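For readers less familiar with the hunk above, the two assignments clamp the notifier interval to the invalidated range and convert the result to page indices. The userspace sketch below shows that computation in isolation. MAX_UL, MIN_UL, PAGE_SHIFT_ASSUMED, struct page_span and clamp_to_range() are illustrative names only, and the 4 KiB page size is an assumption; the kernel's own max()/min() macros additionally check that both operands have compatible types.

```c
#include <assert.h>

/* Userspace stand-ins for the kernel's max()/min() (no type checking). */
#define MAX_UL(a, b) ((a) > (b) ? (a) : (b))
#define MIN_UL(a, b) ((a) < (b) ? (a) : (b))

#define PAGE_SHIFT_ASSUMED 12	/* assumption: 4 KiB pages */

struct page_span {
	unsigned long first;	/* first page index */
	unsigned long last;	/* last page index, inclusive */
};

/*
 * Clamp the interval [start, last] (byte addresses, inclusive) to the
 * range [rstart, rend) and convert both ends to page indices.
 */
static struct page_span clamp_to_range(unsigned long start, unsigned long last,
				       unsigned long rstart, unsigned long rend)
{
	struct page_span s;

	assert(rend > rstart);
	s.first = MAX_UL(start, rstart) >> PAGE_SHIFT_ASSUMED;
	s.last = MIN_UL(last, rend - 1) >> PAGE_SHIFT_ASSUMED;
	return s;
}
```

Written this way, the second assignment in the hunk reads as the mirror image of the first, a min() of last and range->end - 1, although the patch as posted only converts the max() case.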

4 years, 2 months


Re: [Linaro-mm-sig] [syzbot] WARNING in __dma_map_sg_attrs

by Christoph Hellwig

This means the virtgpu driver uses dma mapping helpers but has not set up a DMA mask (which most likely suggests it is some kind of virtual device).

On Wed, Dec 01, 2021 at 10:18:21AM -0800, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: c5c17547b778 Merge tag 'net-5.16-rc3' of git://git.kernel...
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=13a73609b00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=bf85c53718a1e697
> dashboard link: https://syzkaller.appspot.com/bug?extid=10e27961f4da37c443b2
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+10e27961f4da37c443b2(a)syzkaller.appspotmail.com
>
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 17169 at kernel/dma/mapping.c:188 __dma_map_sg_attrs+0x181/0x1f0 kernel/dma/mapping.c:188
> Modules linked in:
> CPU: 0 PID: 17169 Comm: syz-executor.3 Not tainted 5.16.0-rc2-syzkaller #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> RIP: 0010:__dma_map_sg_attrs+0x181/0x1f0 kernel/dma/mapping.c:188
> Code: 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 71 4c 8b 3d 70 6d b1 0d e9 db fe ff ff e8 86 ff 12 00 0f 0b e8 7f ff 12 00 <0f> 0b 45 31 e4 e9 54 ff ff ff e8 70 ff 12 00 49 8d 7f 50 48 b8 00
> RSP: 0018:ffffc90002c0fb20 EFLAGS: 00010216
> RAX: 0000000000013018 RBX: 0000000000000020 RCX: ffffc900037d4000
> RDX: 0000000000040000 RSI: ffffffff8163d361 RDI: ffff8880182ae4d0
> RBP: ffff8880182ae088 R08: 0000000000000002 R09: ffff888017ba054f
> R10: ffffffff8163d242 R11: 000000000008808a R12: 0000000000000000
> R13: ffff888024ca5700 R14: 0000000000000001 R15: 0000000000000000
> FS: 00007fa269e34700(0000) GS:ffff88802cb00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000040c120 CR3: 000000006c77c000 CR4: 0000000000150ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> dma_map_sgtable+0x70/0xf0 kernel/dma/mapping.c:264
> drm_gem_map_dma_buf+0x12a/0x1e0 drivers/gpu/drm/drm_prime.c:633
> __map_dma_buf drivers/dma-buf/dma-buf.c:675 [inline]
> dma_buf_map_attachment+0x39a/0x5b0 drivers/dma-buf/dma-buf.c:954
> drm_gem_prime_import_dev.part.0+0x85/0x220 drivers/gpu/drm/drm_prime.c:939
> drm_gem_prime_import_dev drivers/gpu/drm/drm_prime.c:982 [inline]
> drm_gem_prime_import+0xc8/0x200 drivers/gpu/drm/drm_prime.c:982
> virtgpu_gem_prime_import+0x49/0x150 drivers/gpu/drm/virtio/virtgpu_prime.c:166
> drm_gem_prime_fd_to_handle+0x21d/0x550 drivers/gpu/drm/drm_prime.c:318
> drm_prime_fd_to_handle_ioctl+0x9b/0xd0 drivers/gpu/drm/drm_prime.c:374
> drm_ioctl_kernel+0x27d/0x4e0 drivers/gpu/drm/drm_ioctl.c:782
> drm_ioctl+0x51e/0x9d0 drivers/gpu/drm/drm_ioctl.c:885
> vfs_ioctl fs/ioctl.c:51 [inline]
> __do_sys_ioctl fs/ioctl.c:874 [inline]
> __se_sys_ioctl fs/ioctl.c:860 [inline]
> __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7fa26c8beae9
> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fa269e34188 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00007fa26c9d1f60 RCX: 00007fa26c8beae9
> RDX: 00000000200004c0 RSI: 00000000c00c642e RDI: 0000000000000005
> RBP: 00007fa26c918f6d R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007ffc0019c51f R14: 00007fa269e34300 R15: 0000000000022000
> </TASK>
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller(a)googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
---end quoted text---

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

On 01.12.21 at 12:04, Thomas Hellström (Intel) wrote:
>
> On 12/1/21 11:32, Christian König wrote:
>> On 01.12.21 at 11:15, Thomas Hellström (Intel) wrote:
>>> [SNIP]
>>>>
>>>> What we could do is to avoid all this by not calling the callback
>>>> with the lock held in the first place.
>>>
>>> If that's possible that might be a good idea, pls also see below.
>>
>> The problem with that is
>> dma_fence_signal_locked()/dma_fence_signal_timestamp_locked(). If we
>> could avoid using that or at least allow it to drop the lock then we
>> could call the callback without holding it.
>>
>> Somebody would need to audit the drivers and see if holding the lock
>> is really necessary anywhere.
>>
>>>>
>>>>>>
>>>>>>>>
>>>>>>>> /Thomas
>>>>>>>
>>>>>>> Oh, and a follow up question:
>>>>>>>
>>>>>>> If there was a way to break the recursion on final put() (using
>>>>>>> the same basic approach as patch 2 in this series uses to break
>>>>>>> recursion in enable_signaling()), so that none of these
>>>>>>> containers did require any special treatment, would it be worth
>>>>>>> pursuing? I guess it might be possible by having the callbacks
>>>>>>> drop the references rather than the loop in the final put. + a
>>>>>>> couple of changes in code iterating over the fence pointers.
>>>>>>
>>>>>> That won't really help, you just move the recursion from the
>>>>>> final put into the callback.
>>>>>
>>>>> How do we recurse from the callback? The introduced fence_put() of
>>>>> individual fence pointers
>>>>> doesn't recurse anymore (at most 1 level), and any callback
>>>>> recursion is broken by the irq_work?
>>>>
>>>> Yeah, but then you would need to take another lock to avoid racing
>>>> with dma_fence_array_signaled().
>>>>
>>>>>
>>>>> I figure the big amount of work would be to adjust code that
>>>>> iterates over the individual fence pointers to recognize that they
>>>>> are rcu protected.
>>>>
>>>> Could be that we could solve this with RCU, but that sounds like a
>>>> lot of churn for no gain at all.
>>>>
>>>> In other words even with the problems solved I think it would be a
>>>> really bad idea to allow chaining of dma_fence_array objects.
>>>
>>> Yes, that was really the question, Is it worth pursuing this? I'm
>>> not really suggesting we should allow this as an intentional
>>> feature. I'm worried, however, that if we allow these containers to
>>> start floating around cross-driver (or even internally) disguised as
>>> ordinary dma_fences, they would require a lot of driver special
>>> casing, or else completely unexpected WARN_ON()s and lockdep splats
>>> would start to turn up, scaring people off from using them. And that
>>> would be a breeding ground for hairy driver-private constructs.
>>
>> Well the question is why we would want to do it?
>>
>> If it's to avoid inter driver lock dependencies by avoiding to call
>> the callback with the spinlock held, then yes please. We had tons of
>> problems with that, resulting in irq_work and work_item delegation
>> all over the place.
>
> Yes, that sounds like something desirable, but in these containers,
> what's causing the lock dependencies is the enable_signaling()
> callback that is typically called locked.
>
>
>>
>> If it's to allow nesting of dma_fence_array instances, then it's most
>> likely a really bad idea even if we fix all the locking order problems.
>
> Well I think my use-case where I hit a dead end may illustrate what
> worries me here:
>
> 1) We use a dma-fence-array to coalesce all dependencies for ttm
> object migration.
> 2) We use a dma-fence-chain to order the resulting dma_fence into a
> timeline because the TTM resource manager code requires that.
>
> Initially seemingly harmless to me.
>
> But after a sequence evict->alloc->clear, the dma-fence-chain feeds
> into the dma-fence-array for the clearing operation. Code still works
> fine, and no deep recursion, no warnings. But if I were to add another
> driver to the system that instead feeds a dma-fence-array into a
> dma-fence-chain, this would give me a lockdep splat.
>
> So then if somebody were to come up with the splendid idea of using a
> dma-fence-chain to initially coalesce fences, I'd hit the same problem
> or risk illegally joining two dma-fence-chains together.
>
> To fix this, I would need to look at the incoming fences and iterate
> over any dma-fence-array or dma-fence-chain that is fed into the
> dma-fence-array to flatten out the input. In fact all dma-fence-array
> users would need to do that, and even dma-fence-chain users watching
> out for not joining chains together or accidentally add an array that
> perhaps came as a disguised dma-fence from another driver.
>
> So the purpose to me would be to allow these containers as input to
> each other without a lot of in-driver special-casing, be it by breaking
> recursion on built-in flattening to avoid
>
> a) Hitting issues in the future or with existing interoperating drivers.
> b) Avoid driver-private containers that also might break the
> interoperability. (For example the i915 currently driver-private
> dma_fence_work avoid all these problems, but we're attempting to
> address issues in common code rather than re-inventing stuff internally).

I don't think that a dma_fence_array or dma_fence_chain is the right thing to begin with in those use cases.

When you want to coalesce the dependencies for a job you could either use an xarray like Daniel did for the scheduler or some hashtable like we use in amdgpu. But I don't see the need for exposing the dma_fence interface for those.

And why do you use dma_fence_chain to generate a timeline for TTM? That should come naturally because all the moves must be ordered.

Regards,
Christian.

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

On 01.12.21 at 11:15, Thomas Hellström (Intel) wrote:
> [SNIP]
>>
>> What we could do is to avoid all this by not calling the callback
>> with the lock held in the first place.
>
> If that's possible that might be a good idea, pls also see below.

The problem with that is dma_fence_signal_locked()/dma_fence_signal_timestamp_locked(). If we could avoid using that or at least allow it to drop the lock then we could call the callback without holding it.

Somebody would need to audit the drivers and see if holding the lock is really necessary anywhere.

>>
>>>>
>>>>>>
>>>>>> /Thomas
>>>>>
>>>>> Oh, and a follow up question:
>>>>>
>>>>> If there was a way to break the recursion on final put() (using
>>>>> the same basic approach as patch 2 in this series uses to break
>>>>> recursion in enable_signaling()), so that none of these containers
>>>>> did require any special treatment, would it be worth pursuing? I
>>>>> guess it might be possible by having the callbacks drop the
>>>>> references rather than the loop in the final put. + a couple of
>>>>> changes in code iterating over the fence pointers.
>>>>
>>>> That won't really help, you just move the recursion from the final
>>>> put into the callback.
>>>
>>> How do we recurse from the callback? The introduced fence_put() of
>>> individual fence pointers
>>> doesn't recurse anymore (at most 1 level), and any callback
>>> recursion is broken by the irq_work?
>>
>> Yeah, but then you would need to take another lock to avoid racing
>> with dma_fence_array_signaled().
>>
>>>
>>> I figure the big amount of work would be to adjust code that
>>> iterates over the individual fence pointers to recognize that they
>>> are rcu protected.
>>
>> Could be that we could solve this with RCU, but that sounds like a
>> lot of churn for no gain at all.
>>
>> In other words even with the problems solved I think it would be a
>> really bad idea to allow chaining of dma_fence_array objects.
>
> Yes, that was really the question, Is it worth pursuing this? I'm not
> really suggesting we should allow this as an intentional feature. I'm
> worried, however, that if we allow these containers to start floating
> around cross-driver (or even internally) disguised as ordinary
> dma_fences, they would require a lot of driver special casing, or else
> completely unexpected WARN_ON()s and lockdep splats would start to turn
> up, scaring people off from using them. And that would be a breeding
> ground for hairy driver-private constructs.

Well the question is why we would want to do it?

If it's to avoid inter driver lock dependencies by avoiding to call the callback with the spinlock held, then yes please. We had tons of problems with that, resulting in irq_work and work_item delegation all over the place.

If it's to allow nesting of dma_fence_array instances, then it's most likely a really bad idea even if we fix all the locking order problems.

Christian.

>
> /Thomas
>
>
>>
>> Christian.
>>
>>>
>>>
>>> Thanks,
>>>
>>> /Thomas
>>>
>>>
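The idea of "not calling the callback with the lock held" from this message can be illustrated with a small userspace model: detach the callback list while holding the lock, drop the lock, then run the callbacks. Everything here is hypothetical (struct model_cb, struct model_fence_lk, model_signal() and the int flag standing in for the fence spinlock); it is not how dma_fence_signal() is implemented, only a sketch of the pattern under discussion.

```c
#include <assert.h>
#include <stddef.h>

struct model_cb {
	struct model_cb *next;
	void (*func)(struct model_cb *cb);
};

/* Hypothetical fence; `locked` stands in for the fence spinlock. */
struct model_fence_lk {
	int locked;
	int signaled;
	struct model_cb *cbs;	/* pending callbacks */
};

static void model_lock(struct model_fence_lk *f)
{
	assert(!f->locked);	/* taking the lock twice would deadlock */
	f->locked = 1;
}

static void model_unlock(struct model_fence_lk *f)
{
	assert(f->locked);
	f->locked = 0;
}

/* Signal the fence, running callbacks only after the lock is dropped. */
static void model_signal(struct model_fence_lk *f)
{
	struct model_cb *list, *next;

	model_lock(f);
	f->signaled = 1;
	list = f->cbs;		/* detach the whole list under the lock */
	f->cbs = NULL;
	model_unlock(f);

	for (; list; list = next) {
		next = list->next;	/* callback may reuse or free cb */
		list->func(list);
	}
}

/* Example callback that itself needs the fence lock. */
struct counting_cb {
	struct model_cb base;	/* must stay the first member */
	struct model_fence_lk *owner;
	int calls;
};

static void count_and_relock(struct model_cb *cb)
{
	struct counting_cb *c = (struct counting_cb *)cb;

	model_lock(c->owner);	/* would trip the assert if called locked */
	c->calls++;
	model_unlock(c->owner);
}
```

The example callback takes the fence lock again, which is only safe because model_signal() has already released it. A caller that enters with the lock held, as with the `_locked` signalling variants named above, cannot use this pattern unless it is allowed to drop the lock, which is the audit question raised in the message.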

4 years, 2 months


Re: [Linaro-mm-sig] [PATCH v4] dma-buf: system_heap: Use 'for_each_sgtable_sg' in pages free flow

by John Stultz

On Thu, Nov 25, 2021 at 11:48 PM <guangming.cao(a)mediatek.com> wrote:
>
> From: Guangming <Guangming.Cao(a)mediatek.com>
>
> For previous version, it uses 'sg_table.nents' to traverse sg_table in pages
> free flow.
> However, 'sg_table.nents' is reassigned in 'dma_map_sg', it means the number of
> created entries in the DMA address space.
> So, use 'sg_table.nents' in pages free flow will cause some pages can't be freed.
>
> Here we should use sg_table.orig_nents to free pages memory, but use the
> sgtable helper 'for_each_sgtable_sg' (instead of the previous rather common
> helper 'for_each_sg', which may cause a memory leak) is much better.
>
> Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if available")
> Signed-off-by: Guangming <Guangming.Cao(a)mediatek.com>
> Reviewed-by: Robin Murphy <robin.murphy(a)arm.com>
> Cc: <stable(a)vger.kernel.org> # 5.11.*

Thanks so much for catching this and sending in all the revisions!

Reviewed-by: John Stultz <john.stultz(a)linaro.org>
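The leak described in the quoted commit message can be reproduced in a toy userspace model. The sketch below assumes nothing about the real struct sg_table beyond its two counters: struct model_sgt, model_dma_map() and model_free() are hypothetical names, and the "mapping" step simply shrinks nents the way a DMA mapping that merges entries may.

```c
#include <assert.h>

#define MODEL_MAX_ENTS 8

/* Hypothetical userspace model of the two counters in struct sg_table. */
struct model_sgt {
	unsigned int orig_nents;	/* entries allocated, one per page chunk */
	unsigned int nents;		/* entries after DMA mapping, may be fewer */
	int freed[MODEL_MAX_ENTS];	/* 1 once the chunk behind entry i is freed */
};

/* Model of a DMA mapping that merges chunks: only nents shrinks. */
static void model_dma_map(struct model_sgt *t, unsigned int mapped)
{
	assert(mapped <= t->orig_nents);
	t->nents = mapped;
}

/* Free path walking `count` entries; returns the chunks left unfreed. */
static unsigned int model_free(struct model_sgt *t, unsigned int count)
{
	unsigned int i, leaked = 0;

	assert(count <= MODEL_MAX_ENTS && t->orig_nents <= MODEL_MAX_ENTS);
	for (i = 0; i < count; i++)
		t->freed[i] = 1;
	for (i = 0; i < t->orig_nents; i++)
		leaked += !t->freed[i];
	return leaked;
}
```

Walking nents after mapping leaves the tail of the table unfreed, while walking orig_nents frees everything. In the kernel, for_each_sgtable_sg() iterates over orig_nents and for_each_sgtable_dma_sg() over nents, which is why the patch switches the free path to the former.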

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

On 01.12.21 at 09:23, Thomas Hellström (Intel) wrote:
> [SNIP]
>>>>> Jason and I came up with a deep dive iterator for his use case, but I
>>>>> think we don't want to use that any more after my dma_resv rework.
>>>>>
>>>>> In other words when you need to create a new dma_fence_array you
>>>>> flatten out the existing construct which is at worst case
>>>>> dma_fence_chain->dma_fence_array->dma_fence.
>>>> Ok, Are there any cross-driver contract here, Like every driver
>>>> using a dma_fence_array need to check for dma_fence_chain and
>>>> flatten like above?
>>
>> So far we only discussed that on the mailing list but haven't made
>> any documentation for that.
>
> OK, one other cross-driver pitfall I see is if someone accidentally
> joins two fence chains together by creating a fence chain unknowingly
> using another fence chain as the @fence argument?

That would indeed be illegal and we should probably add a WARN_ON() for that.

>
> The third cross-driver pitfall IMHO is the locking dependency these
> containers add. Other drivers (read at least i915) may have defined
> slightly different locking orders and that should also be addressed if
> needed, but that requires a cross driver agreement what the locking
> orders really are. Patch 1 actually addresses this, while keeping the
> container lockdep warnings for deep recursions, so at least I think
> that could serve as a discussion starter.

No, drivers should never make any assumptions on that.

E.g. when you need to take a lock from a callback you must guarantee that you never have that lock taken when you call any of the dma_fence functions. Your patch breaks the lockdep annotation for that.

What we could do is to avoid all this by not calling the callback with the lock held in the first place.

>>
>>>>
>>>> /Thomas
>>>
>>> Oh, and a follow up question:
>>>
>>> If there was a way to break the recursion on final put() (using the
>>> same basic approach as patch 2 in this series uses to break
>>> recursion in enable_signaling()), so that none of these containers
>>> did require any special treatment, would it be worth pursuing? I
>>> guess it might be possible by having the callbacks drop the
>>> references rather than the loop in the final put. + a couple of
>>> changes in code iterating over the fence pointers.
>>
>> That won't really help, you just move the recursion from the final
>> put into the callback.
>
> How do we recurse from the callback? The introduced fence_put() of
> individual fence pointers
> doesn't recurse anymore (at most 1 level), and any callback recursion
> is broken by the irq_work?

Yeah, but then you would need to take another lock to avoid racing with dma_fence_array_signaled().

>
> I figure the big amount of work would be to adjust code that iterates
> over the individual fence pointers to recognize that they are rcu
> protected.

Could be that we could solve this with RCU, but that sounds like a lot of churn for no gain at all.

In other words even with the problems solved I think it would be a really bad idea to allow chaining of dma_fence_array objects.

Christian.

>
>
> Thanks,
>
> /Thomas
>
>

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

Am 30.11.21 um 20:27 schrieb Thomas Hellström: > > On 11/30/21 19:12, Thomas Hellström wrote: >> On Tue, 2021-11-30 at 16:02 +0100, Christian König wrote: >>> Am 30.11.21 um 15:35 schrieb Thomas Hellström: >>>> On Tue, 2021-11-30 at 14:26 +0100, Christian König wrote: >>>>> Am 30.11.21 um 13:56 schrieb Thomas Hellström: >>>>>> On 11/30/21 13:42, Christian König wrote: >>>>>>> Am 30.11.21 um 13:31 schrieb Thomas Hellström: >>>>>>>> [SNIP] >>>>>>>>> Other than that, I didn't investigate the nesting fails >>>>>>>>> enough to >>>>>>>>> say I can accurately review this. :) >>>>>>>> Basically the problem is that within enable_signaling() >>>>>>>> which >>>>>>>> is >>>>>>>> called with the dma_fence lock held, we take the dma_fence >>>>>>>> lock >>>>>>>> of >>>>>>>> another fence. If that other fence is a dma_fence_array, or >>>>>>>> a >>>>>>>> dma_fence_chain which in turn tries to lock a >>>>>>>> dma_fence_array >>>>>>>> we hit >>>>>>>> a splat. >>>>>>> Yeah, I already thought that you constructed something like >>>>>>> that. >>>>>>> >>>>>>> You get the splat because what you do here is illegal, you >>>>>>> can't >>>>>>> mix >>>>>>> dma_fence_array and dma_fence_chain like this or you can end >>>>>>> up >>>>>>> in a >>>>>>> stack corruption. >>>>>> Hmm. Ok, so what is the stack corruption, is it that the >>>>>> enable_signaling() will end up with endless recursion? If so, >>>>>> wouldn't >>>>>> it be more usable we break that recursion chain and allow a >>>>>> more >>>>>> general use? >>>>> The problem is that this is not easily possible for >>>>> dma_fence_array >>>>> containers. Just imagine that you drop the last reference to the >>>>> containing fences during dma_fence_array destruction if any of >>>>> the >>>>> contained fences is another container you can easily run into >>>>> recursion >>>>> and with that stack corruption. >>>> Indeed, that would require some deeper surgery. 
>>>> >>>>> That's one of the major reasons I came up with the >>>>> dma_fence_chain >>>>> container. This one you can chain any number of elements together >>>>> without running into any recursion. >>>>> >>>>>> Also what are the mixing rules between these? Never use a >>>>>> dma-fence-chain as one of the array fences and never use a >>>>>> dma-fence-array as a dma-fence-chain fence? >>>>> You can't add any other container to a dma_fence_array, neither >>>>> other >>>>> dma_fence_array instances nor dma_fence_chain instances. >>>>> >>>>> IIRC at least technically a dma_fence_chain can contain a >>>>> dma_fence_array if you absolutely need that, but Daniel, Jason >>>>> and I >>>>> already had the same discussion a while back and came to the >>>>> conclusion >>>>> to avoid that as well if possible. >>>> Yes, this is actually the use-case. But what I can't easily >>>> guarantee >>>> is that that dma_fence_chain isn't fed into a dma_fence_array >>>> somewhere >>>> else. How do you typically avoid that? >>>> >>>> Meanwhile I guess I need to take a different approach in the driver >>>> to >>>> avoid this altogether. >>> Jason and I came up with a deep dive iterator for his use case, but I >>> think we don't want to use that any more after my dma_resv rework. >>> >>> In other words when you need to create a new dma_fence_array you >>> flatten >>> out the existing construct which is at worst case >>> dma_fence_chain->dma_fence_array->dma_fence. >> Ok, Are there any cross-driver contract here, Like every driver using a >> dma_fence_array need to check for dma_fence_chain and flatten like >> above? So far we only discussed that on the mailing list but haven't made any documentation for that. 
>> >> /Thomas > > Oh, and a follow up question: > > If there was a way to break the recursion on final put() (using the > same basic approach as patch 2 in this series uses to break recursion > in enable_signaling()), so that none of these containers did require > any special treatment, would it be worth pursuing? I guess it might be > possible by having the callbacks drop the references rather than the > loop in the final put. + a couple of changes in code iterating over > the fence pointers. That won't really help, you just move the recursion from the final put into the callback. What could be possible is to use an work item for any possible operation, e.g. enabling, signaling and destruction. But in the last discussion everybody agreed that it is better to just flatten out the array. Christian. > > > /Thomas > >> >>> Regards, >>> Christian. >>> >>>> /Thomas >>>> >>>> >>>>> Regards, >>>>> Christian. >>>>> >>>>>> /Thomas >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> Regards, >>>>>>> Christian. >>>>>>> >>>>>>>> But I'll update the commit message with a typical splat. >>>>>>>> >>>>>>>> /Thomas

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

Am 30.11.21 um 15:35 schrieb Thomas Hellström: > On Tue, 2021-11-30 at 14:26 +0100, Christian König wrote: >> Am 30.11.21 um 13:56 schrieb Thomas Hellström: >>> On 11/30/21 13:42, Christian König wrote: >>>> Am 30.11.21 um 13:31 schrieb Thomas Hellström: >>>>> [SNIP] >>>>>> Other than that, I didn't investigate the nesting fails >>>>>> enough to >>>>>> say I can accurately review this. :) >>>>> Basically the problem is that within enable_signaling() which >>>>> is >>>>> called with the dma_fence lock held, we take the dma_fence lock >>>>> of >>>>> another fence. If that other fence is a dma_fence_array, or a >>>>> dma_fence_chain which in turn tries to lock a dma_fence_array >>>>> we hit >>>>> a splat. >>>> Yeah, I already thought that you constructed something like that. >>>> >>>> You get the splat because what you do here is illegal, you can't >>>> mix >>>> dma_fence_array and dma_fence_chain like this or you can end up >>>> in a >>>> stack corruption. >>> Hmm. Ok, so what is the stack corruption, is it that the >>> enable_signaling() will end up with endless recursion? If so, >>> wouldn't >>> it be more usable we break that recursion chain and allow a more >>> general use? >> The problem is that this is not easily possible for dma_fence_array >> containers. Just imagine that you drop the last reference to the >> containing fences during dma_fence_array destruction if any of the >> contained fences is another container you can easily run into >> recursion >> and with that stack corruption. > Indeed, that would require some deeper surgery. > >> That's one of the major reasons I came up with the dma_fence_chain >> container. This one you can chain any number of elements together >> without running into any recursion. >> >>> Also what are the mixing rules between these? Never use a >>> dma-fence-chain as one of the array fences and never use a >>> dma-fence-array as a dma-fence-chain fence? 
>> You can't add any other container to a dma_fence_array, neither other >> dma_fence_array instances nor dma_fence_chain instances. >> >> IIRC at least technically a dma_fence_chain can contain a >> dma_fence_array if you absolutely need that, but Daniel, Jason and I >> already had the same discussion a while back and came to the >> conclusion >> to avoid that as well if possible. > Yes, this is actually the use-case. But what I can't easily guarantee > is that that dma_fence_chain isn't fed into a dma_fence_array somewhere > else. How do you typically avoid that? > > Meanwhile I guess I need to take a different approach in the driver to > avoid this altogether. Jason and I came up with a deep dive iterator for his use case, but I think we don't want to use that any more after my dma_resv rework. In other words when you need to create a new dma_fence_array you flatten out the existing construct which is at worst case dma_fence_chain->dma_fence_array->dma_fence. Regards, Christian. > > /Thomas > > >> Regards, >> Christian. >> >>> /Thomas >>> >>> >>> >>> >>>> Regards, >>>> Christian. >>>> >>>>> But I'll update the commit message with a typical splat. >>>>> >>>>> /Thomas >
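[Editor's sketch] The flattening rule described above (worst case dma_fence_chain->dma_fence_array->dma_fence) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the toy_* types and functions below are hypothetical stand-ins, not the kernel's dma_fence API, and the real containers carry reference counts and locking that are left out here. The point is that a bounded nesting depth lets the walk use plain loops, so no recursion is needed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space stand-ins; these are NOT the kernel's dma_fence types. */
enum toy_kind { TOY_PLAIN, TOY_ARRAY, TOY_CHAIN };

struct toy_fence {
	enum toy_kind kind;
	int id;                          /* label, only meaningful for TOY_PLAIN */
	const struct toy_fence **fences; /* TOY_ARRAY: contained fences */
	size_t num_fences;               /* TOY_ARRAY: number of contained fences */
	const struct toy_fence *fence;   /* TOY_CHAIN: fence carried by this link */
	const struct toy_fence *prev;    /* TOY_CHAIN: older link, or NULL */
};

/* Append f to out[] if it is a plain fence and there is room left. */
static int toy_emit_plain(const struct toy_fence *f,
			  const struct toy_fence **out, size_t cap, size_t *n)
{
	if (!f || f->kind != TOY_PLAIN || *n >= cap)
		return -1;
	out[(*n)++] = f;
	return 0;
}

/* Emit a plain fence, or the members of an array (one level, no nesting). */
static int toy_emit_leaves(const struct toy_fence *f,
			   const struct toy_fence **out, size_t cap, size_t *n)
{
	size_t i;

	if (!f || f->kind == TOY_CHAIN)
		return -1;
	if (f->kind == TOY_PLAIN)
		return toy_emit_plain(f, out, cap, n);
	for (i = 0; i < f->num_fences; i++)
		if (toy_emit_plain(f->fences[i], out, cap, n))
			return -1;
	return 0;
}

/*
 * Flatten chain -> array -> fence into out[] using loops only.
 * Returns the number of plain fences written, or -1 on illegal nesting
 * or when out[] is too small.
 */
static long toy_flatten(const struct toy_fence *f,
			const struct toy_fence **out, size_t cap)
{
	size_t n = 0;

	assert(f && out);
	while (f && f->kind == TOY_CHAIN) {
		if (toy_emit_leaves(f->fence, out, cap, &n))
			return -1;
		f = f->prev;
	}
	if (f && toy_emit_leaves(f, out, cap, &n))
		return -1;
	return (long)n;
}
```

In this model a container placed inside an array, or a chain used as the fence of another chain link, is rejected with -1, which mirrors the mixing rules stated in the thread rather than any check the kernel is known to perform.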

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

Am 30.11.21 um 13:56 schrieb Thomas Hellström:
>
> On 11/30/21 13:42, Christian König wrote:
>> Am 30.11.21 um 13:31 schrieb Thomas Hellström:
>>> [SNIP]
>>>> Other than that, I didn't investigate the nesting fails enough to
>>>> say I can accurately review this. :)
>>>
>>> Basically the problem is that within enable_signaling() which is
>>> called with the dma_fence lock held, we take the dma_fence lock of
>>> another fence. If that other fence is a dma_fence_array, or a
>>> dma_fence_chain which in turn tries to lock a dma_fence_array we hit
>>> a splat.
>>
>> Yeah, I already thought that you constructed something like that.
>>
>> You get the splat because what you do here is illegal, you can't mix
>> dma_fence_array and dma_fence_chain like this or you can end up in a
>> stack corruption.
>
> Hmm. Ok, so what is the stack corruption, is it that the
> enable_signaling() will end up with endless recursion? If so, wouldn't
> it be more usable we break that recursion chain and allow a more
> general use?

The problem is that this is not easily possible for dma_fence_array containers. Just imagine that you drop the last reference to the contained fences during dma_fence_array destruction: if any of the contained fences is another container, you can easily run into recursion and, with that, stack corruption.

That's one of the major reasons I came up with the dma_fence_chain container. With this one you can chain any number of elements together without running into any recursion.

> Also what are the mixing rules between these? Never use a
> dma-fence-chain as one of the array fences and never use a
> dma-fence-array as a dma-fence-chain fence?

You can't add any other container to a dma_fence_array, neither other dma_fence_array instances nor dma_fence_chain instances.

IIRC at least technically a dma_fence_chain can contain a dma_fence_array if you absolutely need that, but Daniel, Jason and I already had the same discussion a while back and came to the conclusion to avoid that as well if possible.

Regards,
Christian.

>
> /Thomas
>
>
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> But I'll update the commit message with a typical splat.
>>>
>>> /Thomas
>>

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Christian König

Am 30.11.21 um 13:31 schrieb Thomas Hellström:
> [SNIP]
>> Other than that, I didn't investigate the nesting fails enough to say
>> I can accurately review this. :)
>
> Basically the problem is that within enable_signaling() which is
> called with the dma_fence lock held, we take the dma_fence lock of
> another fence. If that other fence is a dma_fence_array, or a
> dma_fence_chain which in turn tries to lock a dma_fence_array we hit a
> splat.

Yeah, I already thought that you constructed something like that.

You get the splat because what you do here is illegal, you can't mix dma_fence_array and dma_fence_chain like this or you can end up in a stack corruption.

Regards,
Christian.

>
> But I'll update the commit message with a typical splat.
>
> /Thomas

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 0/2] Attempt to avoid dma-fence-[chain|array] lockdep splats

by Christian König

Am 30.11.21 um 13:19 schrieb Thomas Hellström:
> Introducing more usage of dma-fence-chain and dma-fence-array in the
> i915 driver we start to hit lockdep splats due to the recursive fence
> locking in the dma-fence-chain and dma-fence-array containers.
> This is a humble suggestion to try to establish a dma-fence locking order
> (patch 1) and to avoid excessive recursive locking in these containers
> (patch 2)

Well, a complete NAK to this.

These splats are intentional hints that something in the driver code is wrong (or that we messed up the chain and array containers somehow).

Those two containers are intentionally crafted in a way that avoids any dependency between their spinlocks. So as long as you use them correctly you should never see a splat.

Please provide the lockdep splat so that we can analyze the problem.

Thanks,
Christian.

>
> Thomas Hellström (2):
>   dma-fence: Avoid establishing a locking order between fence classes
>   dma-fence: Avoid excessive recursive fence locking from
>     enable_signaling() callbacks
>
>  drivers/dma-buf/dma-fence-array.c | 23 +++++++--
>  drivers/dma-buf/dma-fence-chain.c | 12 ++++-
>  drivers/dma-buf/dma-fence.c       | 79 +++++++++++++++++++++----------
>  include/linux/dma-fence.h         |  4 ++
>  4 files changed, 89 insertions(+), 29 deletions(-)
>
> Cc: linaro-mm-sig(a)lists.linaro.org
> Cc: dri-devel(a)lists.freedesktop.org
> Cc: Christian König <christian.koenig(a)amd.com>
>

4 years, 2 months


Re: [Linaro-mm-sig] [RFC PATCH 1/2] dma-fence: Avoid establishing a locking order between fence classes

by Maarten Lankhorst

On 30-11-2021 13:19, Thomas Hellström wrote: > The locking order for taking two fence locks is implicitly defined in > at least two ways in the code: > > 1) Fence containers first and other fences next, which is defined by > the enable_signaling() callbacks of dma_fence_chain and > dma_fence_array. > 2) Reverse signal order, which is used by __i915_active_fence_set(). > > Now 1) implies 2), except for the signal_on_any mode of dma_fence_array > and 2) does not imply 1), and also 1) makes locking order between > different containers confusing. > > Establish 2) and fix up the signal_on_any mode by calling > enable_signaling() on such fences unlocked at creation. > > Cc: linaro-mm-sig(a)lists.linaro.org > Cc: dri-devel(a)lists.freedesktop.org > Cc: Christian König <christian.koenig(a)amd.com> > Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com> > --- > drivers/dma-buf/dma-fence-array.c | 13 +++-- > drivers/dma-buf/dma-fence-chain.c | 3 +- > drivers/dma-buf/dma-fence.c | 79 +++++++++++++++++++++---------- > include/linux/dma-fence.h | 3 ++ > 4 files changed, 69 insertions(+), 29 deletions(-) > > diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c > index 3e07f961e2f3..0322b92909fe 100644 > --- a/drivers/dma-buf/dma-fence-array.c > +++ b/drivers/dma-buf/dma-fence-array.c > @@ -84,8 +84,8 @@ static bool dma_fence_array_enable_signaling(struct dma_fence *fence) > * insufficient). 
> */ > dma_fence_get(&array->base); > - if (dma_fence_add_callback(array->fences[i], &cb[i].cb, > - dma_fence_array_cb_func)) { > + if (dma_fence_add_callback_nested(array->fences[i], &cb[i].cb, > + dma_fence_array_cb_func)) { > int error = array->fences[i]->error; > > dma_fence_array_set_pending_error(array, error); > @@ -158,6 +158,7 @@ struct dma_fence_array *dma_fence_array_create(int num_fences, > { > struct dma_fence_array *array; > size_t size = sizeof(*array); > + struct dma_fence *fence; > > /* Allocate the callback structures behind the array. */ > size += num_fences * sizeof(struct dma_fence_array_cb); > @@ -165,8 +166,9 @@ struct dma_fence_array *dma_fence_array_create(int num_fences, > if (!array) > return NULL; > > + fence = &array->base; > spin_lock_init(&array->lock); > - dma_fence_init(&array->base, &dma_fence_array_ops, &array->lock, > + dma_fence_init(fence, &dma_fence_array_ops, &array->lock, > context, seqno); > init_irq_work(&array->work, irq_dma_fence_array_work); > > @@ -174,7 +176,10 @@ struct dma_fence_array *dma_fence_array_create(int num_fences, > atomic_set(&array->num_pending, signal_on_any ? 1 : num_fences); > array->fences = fences; > > - array->base.error = PENDING_ERROR; > + fence->error = PENDING_ERROR; > + > + if (signal_on_any) > + dma_fence_enable_sw_signaling(fence); > > return array; > } > diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c > index 1b4cb3e5cec9..0518e53880f6 100644 > --- a/drivers/dma-buf/dma-fence-chain.c > +++ b/drivers/dma-buf/dma-fence-chain.c > @@ -152,7 +152,8 @@ static bool dma_fence_chain_enable_signaling(struct dma_fence *fence) > struct dma_fence *f = chain ? 
chain->fence : fence; > > dma_fence_get(f); > - if (!dma_fence_add_callback(f, &head->cb, dma_fence_chain_cb)) { > + if (!dma_fence_add_callback_nested(f, &head->cb, > + dma_fence_chain_cb)) { > dma_fence_put(fence); > return true; > } > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > index 066400ed8841..90a3d5121746 100644 > --- a/drivers/dma-buf/dma-fence.c > +++ b/drivers/dma-buf/dma-fence.c > @@ -610,6 +610,37 @@ void dma_fence_enable_sw_signaling(struct dma_fence *fence) > } > EXPORT_SYMBOL(dma_fence_enable_sw_signaling); > > +static int __dma_fence_add_callback(struct dma_fence *fence, > + struct dma_fence_cb *cb, > + dma_fence_func_t func, > + int nest_level) > +{ > + unsigned long flags; > + int ret = 0; > + > + if (WARN_ON(!fence || !func)) > + return -EINVAL; > + > + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { > + INIT_LIST_HEAD(&cb->node); > + return -ENOENT; > + } > + > + spin_lock_irqsave_nested(fence->lock, flags, 0); Forgot to hook up nest_level here? 
> + > + if (__dma_fence_enable_signaling(fence)) { > + cb->func = func; > + list_add_tail(&cb->node, &fence->cb_list); > + } else { > + INIT_LIST_HEAD(&cb->node); > + ret = -ENOENT; > + } > + > + spin_unlock_irqrestore(fence->lock, flags); > + > + return ret; > +} > + > /** > * dma_fence_add_callback - add a callback to be called when the fence > * is signaled > @@ -635,33 +666,33 @@ EXPORT_SYMBOL(dma_fence_enable_sw_signaling); > int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb, > dma_fence_func_t func) > { > - unsigned long flags; > - int ret = 0; > - > - if (WARN_ON(!fence || !func)) > - return -EINVAL; > - > - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { > - INIT_LIST_HEAD(&cb->node); > - return -ENOENT; > - } > - > - spin_lock_irqsave(fence->lock, flags); > - > - if (__dma_fence_enable_signaling(fence)) { > - cb->func = func; > - list_add_tail(&cb->node, &fence->cb_list); > - } else { > - INIT_LIST_HEAD(&cb->node); > - ret = -ENOENT; > - } > - > - spin_unlock_irqrestore(fence->lock, flags); > - > - return ret; > + return __dma_fence_add_callback(fence, cb, func, 0); > } > EXPORT_SYMBOL(dma_fence_add_callback); > Other than that, I didn't investigate the nesting fails enough to say I can accurately review this. :) ~Maarten

4 years, 2 months


completely rework the dma_resv semantic

by Christian König

Hi everyone,

compared to the last version I've dropped the pruning as suggested by Maarten, split the new DMA_RESV_USAGE_* patches from the general introduction as suggested by Daniel, and renamed OTHER to BOOKKEEP as suggested by Pekka.

Please take a look and review,
Christian.

4 years, 2 months


Re: [Linaro-mm-sig] [PATCH] dma_fence_array: Fix PENDING_ERROR leak in dma_fence_array_signaled()

by Christian König

Am 29.11.21 um 13:46 schrieb Thomas Hellström: > On Mon, 2021-11-29 at 13:33 +0100, Christian König wrote: >> Am 29.11.21 um 13:23 schrieb Thomas Hellström: >>> Hi, Christian, >>> >>> On Mon, 2021-11-29 at 09:21 +0100, Christian König wrote: >>>> Am 29.11.21 um 08:35 schrieb Thomas Hellström: >>>>> If a dma_fence_array is reported signaled by a call to >>>>> dma_fence_is_signaled(), it may leak the PENDING_ERROR status. >>>>> >>>>> Fix this by clearing the PENDING_ERROR status if we return true >>>>> in >>>>> dma_fence_array_signaled(). >>>>> >>>>> Fixes: 1f70b8b812f3 ("dma-fence: Propagate errors to dma-fence- >>>>> array container") >>>>> Cc: linaro-mm-sig(a)lists.linaro.org >>>>> Cc: Christian König <ckoenig.leichtzumerken(a)gmail.com> >>>>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk> >>>>> Signed-off-by: Thomas Hellström >>>>> <thomas.hellstrom(a)linux.intel.com> >>>> Reviewed-by: Christian König <christian.koenig(a)amd.com> >>> How are the dma-buf / dma-fence patches typically merged? If i915 >>> is >>> the only fence->error user, could we take this through drm-intel to >>> avoid a backmerge for upcoming i915 work? >> Well that one here looks like a bugfix to me, so either through >> drm-misc-fixes ore some i915 -fixes branch sounds fine to me. >> >> If you have any new development based on that a backmerge of the - >> fixes >> into your -next branch is unavoidable anyway. > Ok, I'll check with Joonas if I can take it through > drm-intel-gt-next, since fixes are cherry-picked from that one. Patch > will then appear in both the -fixes and the -next branch. Well exactly that's the stuff Daniel told me to avoid :) But maybe your i915 workflow is somehow better handling that than the AMD workflow. Christian. > > Thanks, > /Thomas > > >> Regards, >> Christian. 
>> >>> /Thomas >>> >>> >>>>> --- >>>>> drivers/dma-buf/dma-fence-array.c | 6 +++++- >>>>> 1 file changed, 5 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma- >>>>> buf/dma-fence-array.c >>>>> index d3fbd950be94..3e07f961e2f3 100644 >>>>> --- a/drivers/dma-buf/dma-fence-array.c >>>>> +++ b/drivers/dma-buf/dma-fence-array.c >>>>> @@ -104,7 +104,11 @@ static bool >>>>> dma_fence_array_signaled(struct >>>>> dma_fence *fence) >>>>> { >>>>> struct dma_fence_array *array = >>>>> to_dma_fence_array(fence); >>>>> >>>>> - return atomic_read(&array->num_pending) <= 0; >>>>> + if (atomic_read(&array->num_pending) > 0) >>>>> + return false; >>>>> + >>>>> + dma_fence_array_clear_pending_error(array); >>>>> + return true; >>>>> } >>>>> >>>>> static void dma_fence_array_release(struct dma_fence *fence) >

4 years, 2 months


Re: [Linaro-mm-sig] [PATCH 22/26] dma-buf: add enum dma_resv_usage

by Daniel Vetter

On Tue, Nov 23, 2021 at 03:21:07PM +0100, Christian König wrote: > This change adds the dma_resv_usage enum and allows us to specify why a > dma_resv object is queried for its containing fences. > > Additional to that a dma_resv_usage_rw() helper function is added to aid > retrieving the fences for a read or write userspace submission. > > This is then deployed to the different query functions of the dma_resv > object and all of their users. > > Signed-off-by: Christian König <christian.koenig(a)amd.com> Just a few comments on the kenreldoc while I scroll through. > EXPORT_SYMBOL(ib_umem_dmabuf_map_pages); > diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h > index 062571c04bca..37552935bca6 100644 > --- a/include/linux/dma-resv.h > +++ b/include/linux/dma-resv.h > @@ -49,6 +49,86 @@ extern struct ww_class reservation_ww_class; > > struct dma_resv_list; > > +/** > + * enum dma_resv_usage - how the fences from a dma_resv obj are used > + * > + * This enum describes the different use cases for a dma_resv object and > + * controls which fences are returned when queried. > + * > + * An important fact is that there is the order KERNEL<WRITE<READ<OTHER and > + * when the dma_resv object is asked for fences for one use case the fences > + * for the lower use case are returned as well. Might be good to replicate this to all functions that take a dma_resv_usage flag, and then also add a "See enum dma_resv_usage for more information." so we get a clickable hyperlink too. > + * > + * For example when asking for WRITE fences then the KERNEL fences are returned > + * as well. Similar when asked for READ fences then both WRITE and KERNEL > + * fences are returned as well. > + */ > +enum dma_resv_usage { > + /** > + * @DMA_RESV_USAGE_KERNEL: For in kernel memory management only. > + * > + * This should only be used for things like copying or clearing memory > + * with a DMA hardware engine for the purpose of kernel memory > + * management. 
> + * > + * Drivers *always* need to wait for those fences before accessing the > + * resource protected by the dma_resv object. The only exception for > + * that is when the resource is known to be locked down in place by > + * pinning it previously. Should dma_buf_pin also do the wait for kernel fences? I think that would also ba e bit clearer semantics in the dma-buf patch which does these waits for us. Or should dma_buf_pin be pipelined and it's up to callers to wait? For kms that's definitely the semantics we want, but it's a bit playing with fire situation, so maybe dma-buf should get the more idiot proof semantics? > + */ > + DMA_RESV_USAGE_KERNEL, > + > + /** > + * @DMA_RESV_USAGE_WRITE: Implicit write synchronization. > + * > + * This should only be used for userspace command submissions which add > + * an implicit write dependency. > + */ > + DMA_RESV_USAGE_WRITE, > + > + /** > + * @DMA_RESV_USAGE_READ: Implicit read synchronization. > + * > + * This should only be used for userspace command submissions which add > + * an implicit read dependency. > + */ > + DMA_RESV_USAGE_READ, > + > + /** > + * @DMA_RESV_USAGE_OTHER: No implicit sync. > + * > + * This should be used for operations which don't want to add an > + * implicit dependency at all, but still have a dependency on memory > + * management. > + * > + * This might include things like preemption fences as well as device > + * page table updates or even userspace command submissions. I think we should highlight a bit more that for explicitly synchronized userspace like vk OTHER is the normal case. So really not an exception. Ofc aside from amdkgf there's currently no driver doing this, but really we should have lots of them ... > + * > + * The kernel memory management *always* need to wait for those fences > + * before moving or freeing the resource protected by the dma_resv > + * object. 
> + */ > + DMA_RESV_USAGE_OTHER > +}; > + > +/** > + * dma_resv_usage_rw - helper for implicit sync > + * @write: true if we create a new implicit sync write > + * > + * This returns the implicit synchronization usage for write or read accesses. Pls add "See enum dma_resv_usage for more details." or so. Never hurts to be plentiful with links :-) > + */ > +static inline enum dma_resv_usage dma_resv_usage_rw(bool write) > +{ > + /* This looks confusing at first sight, but is indeed correct. > + * > + * The rational is that new write operations needs to wait for the > + * existing read and write operations to finish. > + * But a new read operation only needs to wait for the existing write > + * operations to finish. > + */ > + return write ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE; > +} > + > /** > * struct dma_resv - a reservation object manages fences for a buffer > * > @@ -147,8 +227,8 @@ struct dma_resv_iter { > /** @obj: The dma_resv object we iterate over */ > struct dma_resv *obj; > > - /** @all_fences: If all fences should be returned */ > - bool all_fences; > + /** @usage: Controls which fences are returned */ > + enum dma_resv_usage usage; > > /** @fence: the currently handled fence */ > struct dma_fence *fence; > @@ -178,14 +258,14 @@ struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor); > * dma_resv_iter_begin - initialize a dma_resv_iter object > * @cursor: The dma_resv_iter object to initialize > * @obj: The dma_resv object which we want to iterate over > - * @all_fences: If all fences should be returned or just the exclusive one > + * @usage: controls which fences to return Please add the blurb here I mentioned above. Maybe adjust the text to use the neatly highlighted @usage. 
> */ > static inline void dma_resv_iter_begin(struct dma_resv_iter *cursor, > struct dma_resv *obj, > - bool all_fences) > + enum dma_resv_usage usage) > { > cursor->obj = obj; > - cursor->all_fences = all_fences; > + cursor->usage = usage; > cursor->fence = NULL; > } > > @@ -242,7 +322,7 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor) > * dma_resv_for_each_fence - fence iterator > * @cursor: a struct dma_resv_iter pointer > * @obj: a dma_resv object pointer > - * @all_fences: true if all fences should be returned > + * @usage: controls which fences to return > * @fence: the current fence > * Same, another place that needs the @usage clarification. > * Iterate over the fences in a struct dma_resv object while holding the > @@ -251,8 +331,8 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor) > * valid as long as the lock is held and so no extra reference to the fence is > * taken. > */ > -#define dma_resv_for_each_fence(cursor, obj, all_fences, fence) \ > - for (dma_resv_iter_begin(cursor, obj, all_fences), \ > +#define dma_resv_for_each_fence(cursor, obj, usage, fence) \ > + for (dma_resv_iter_begin(cursor, obj, usage), \ > fence = dma_resv_iter_first(cursor); fence; \ > fence = dma_resv_iter_next(cursor)) > > @@ -421,14 +501,14 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context, > void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence); > void dma_resv_prune(struct dma_resv *obj); > void dma_resv_prune_unlocked(struct dma_resv *obj); > -int dma_resv_get_fences(struct dma_resv *obj, bool write, > +int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage, > unsigned int *num_fences, struct dma_fence ***fences); > -int dma_resv_get_singleton(struct dma_resv *obj, bool write, > +int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage, > struct dma_fence **fence); > int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src); > 
-long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr, > - unsigned long timeout); > -bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all); > +long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage, > + bool intr, unsigned long timeout); > +bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage); > void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq); I took endless amounts of discussions, but I think we're arriving at something really neat and tiny here now finally. Both semantics, and how drivers use them. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
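[Editor's sketch] The ordering KERNEL<WRITE<READ<OTHER and the "looks confusing at first sight" return value of dma_resv_usage_rw() are easier to see in a small user-space model. The toy_* names below are hypothetical and only mirror the enum quoted in the patch; the comparison fence_usage <= query is an assumption about how "fences for the lower use case are returned as well" could be expressed, not a copy of the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy mirror of the proposed ordering KERNEL < WRITE < READ < OTHER. */
enum toy_usage {
	TOY_USAGE_KERNEL,
	TOY_USAGE_WRITE,
	TOY_USAGE_READ,
	TOY_USAGE_OTHER
};

struct toy_fence {
	enum toy_usage usage;
	bool signaled;
};

/* A query for one usage also returns every fence of a lower usage. */
static bool toy_usage_matches(enum toy_usage fence_usage, enum toy_usage query)
{
	return fence_usage <= query;
}

/* Same shape as the dma_resv_usage_rw() helper in the quoted patch. */
static enum toy_usage toy_usage_rw(bool write)
{
	return write ? TOY_USAGE_READ : TOY_USAGE_WRITE;
}

/* Count the unsignaled fences a new submission would have to wait for. */
static size_t toy_count_deps(const struct toy_fence *fences, size_t n,
			     enum toy_usage query)
{
	size_t i, count = 0;

	assert(fences || n == 0);
	for (i = 0; i < n; i++)
		if (!fences[i].signaled &&
		    toy_usage_matches(fences[i].usage, query))
			count++;
	return count;
}
```

With one unsignaled fence of each usage, a new write (query READ) depends on the KERNEL, WRITE and READ fences, while a new read (query WRITE) depends only on KERNEL and WRITE. That is the rationale given in the helper's comment: writers wait for readers and writers, readers wait only for writers.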

4 years, 2 months


Re: [Linaro-mm-sig] [PATCH] dma_fence_array: Fix PENDING_ERROR leak in dma_fence_array_signaled()

by Christian König

Am 29.11.21 um 13:23 schrieb Thomas Hellström: > Hi, Christian, > > On Mon, 2021-11-29 at 09:21 +0100, Christian König wrote: >> Am 29.11.21 um 08:35 schrieb Thomas Hellström: >>> If a dma_fence_array is reported signaled by a call to >>> dma_fence_is_signaled(), it may leak the PENDING_ERROR status. >>> >>> Fix this by clearing the PENDING_ERROR status if we return true in >>> dma_fence_array_signaled(). >>> >>> Fixes: 1f70b8b812f3 ("dma-fence: Propagate errors to dma-fence- >>> array container") >>> Cc: linaro-mm-sig(a)lists.linaro.org >>> Cc: Christian König <ckoenig.leichtzumerken(a)gmail.com> >>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk> >>> Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com> >> Reviewed-by: Christian König <christian.koenig(a)amd.com> > How are the dma-buf / dma-fence patches typically merged? If i915 is > the only fence->error user, could we take this through drm-intel to > avoid a backmerge for upcoming i915 work? Well that one here looks like a bugfix to me, so either through drm-misc-fixes ore some i915 -fixes branch sounds fine to me. If you have any new development based on that a backmerge of the -fixes into your -next branch is unavoidable anyway. Regards, Christian. 
> > /Thomas > > >>> --- >>> drivers/dma-buf/dma-fence-array.c | 6 +++++- >>> 1 file changed, 5 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma- >>> buf/dma-fence-array.c >>> index d3fbd950be94..3e07f961e2f3 100644 >>> --- a/drivers/dma-buf/dma-fence-array.c >>> +++ b/drivers/dma-buf/dma-fence-array.c >>> @@ -104,7 +104,11 @@ static bool dma_fence_array_signaled(struct >>> dma_fence *fence) >>> { >>> struct dma_fence_array *array = to_dma_fence_array(fence); >>> >>> - return atomic_read(&array->num_pending) <= 0; >>> + if (atomic_read(&array->num_pending) > 0) >>> + return false; >>> + >>> + dma_fence_array_clear_pending_error(array); >>> + return true; >>> } >>> >>> static void dma_fence_array_release(struct dma_fence *fence) >

4 years, 2 months

Re: [Linaro-mm-sig] [PATCH] dma_fence_array: Fix PENDING_ERROR leak in dma_fence_array_signaled()

by Christian König

On 29.11.21 at 08:35, Thomas Hellström wrote:
> If a dma_fence_array is reported signaled by a call to
> dma_fence_is_signaled(), it may leak the PENDING_ERROR status.
>
> Fix this by clearing the PENDING_ERROR status if we return true in
> dma_fence_array_signaled().
>
> Fixes: 1f70b8b812f3 ("dma-fence: Propagate errors to dma-fence-array container")
> Cc: linaro-mm-sig(a)lists.linaro.org
> Cc: Christian König <ckoenig.leichtzumerken(a)gmail.com>
> Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
> Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>

Reviewed-by: Christian König <christian.koenig(a)amd.com>

> ---
>  drivers/dma-buf/dma-fence-array.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c
> index d3fbd950be94..3e07f961e2f3 100644
> --- a/drivers/dma-buf/dma-fence-array.c
> +++ b/drivers/dma-buf/dma-fence-array.c
> @@ -104,7 +104,11 @@ static bool dma_fence_array_signaled(struct dma_fence *fence)
>  {
>  	struct dma_fence_array *array = to_dma_fence_array(fence);
>
> -	return atomic_read(&array->num_pending) <= 0;
> +	if (atomic_read(&array->num_pending) > 0)
> +		return false;
> +
> +	dma_fence_array_clear_pending_error(array);
> +	return true;
>  }
>
>  static void dma_fence_array_release(struct dma_fence *fence)
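
For readers who want to see the leak in isolation, here is a minimal userspace sketch of the polled path the patch touches. It is a toy model, not the kernel code: `struct toy_fence_array`, `TOY_PENDING_ERROR` and every `toy_*` helper are made-up names, and C11 atomics stand in for the kernel's atomic and cmpxchg helpers. The assumption is that the error field starts out holding a placeholder value, which must never be visible once the array is reported as signaled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Placeholder meaning "no real error recorded yet" (toy value). */
#define TOY_PENDING_ERROR 1

/* Toy stand-in for struct dma_fence_array; not the kernel structure. */
struct toy_fence_array {
	atomic_int num_pending; /* component fences that have not signaled */
	atomic_int error;       /* TOY_PENDING_ERROR, 0, or a negative errno */
};

static void toy_array_init(struct toy_fence_array *a, int num_fences)
{
	assert(num_fences > 0);
	atomic_init(&a->num_pending, num_fences);
	atomic_init(&a->error, TOY_PENDING_ERROR);
}

/* A component fence signaled; record the first real error, if any. */
static void toy_array_component_signaled(struct toy_fence_array *a, int error)
{
	int expected = TOY_PENDING_ERROR;

	if (error)
		atomic_compare_exchange_strong(&a->error, &expected, error);
	atomic_fetch_sub(&a->num_pending, 1);
}

/* Replace the placeholder with 0, but leave a real error untouched. */
static void toy_array_clear_pending_error(struct toy_fence_array *a)
{
	int expected = TOY_PENDING_ERROR;

	atomic_compare_exchange_strong(&a->error, &expected, 0);
}

/* Mirrors the patched check: clear the placeholder before reporting true. */
static bool toy_array_signaled(struct toy_fence_array *a)
{
	if (atomic_load(&a->num_pending) > 0)
		return false;

	toy_array_clear_pending_error(a);
	return true;
}
```

With the old one-line check, a caller polling the array after all components completed without error would still read the placeholder from the error field. In this sketch that caller reads 0, while a real error such as -5 survives the clear.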

4 years, 2 months

Re: [Linaro-mm-sig] [PATCH 11/15] iio: buffer-dma: Boost performance using write-combine cache setting

by Lars-Peter Clausen

On 11/27/21 5:05 PM, Jonathan Cameron wrote:
>> Non-coherent mapping with no cache sync:
>> - fileio:
>>     read: 156 MiB/s
>>     write: 123 MiB/s
>> - dmabuf:
>>     read: 234 MiB/s (capped by sample rate)
>>     write: 182 MiB/s
>>
>> Non-coherent reads with no cache sync + write-combine writes:
>> - fileio:
>>     read: 156 MiB/s
>>     write: 140 MiB/s
>> - dmabuf:
>>     read: 234 MiB/s (capped by sample rate)
>>     write: 210 MiB/s
>>
>>
>> A few things we can deduce from this:
>>
>> * Write-combine is not available on Zynq/ARM? If it was working, it
>>   should give a better performance than the coherent mapping, but it
>>   doesn't seem to do anything at all. At least it doesn't harm
>>   performance.
> I'm not sure it's very relevant to this sort of streaming write.
> If you write a sequence of addresses then nothing stops them getting combined
> into a single write whether or not it is write-combining.

The difference is the point at which they can get combined. With
write-combine they can be coalesced into a single transaction anywhere
in the interconnect, as early as the CPU itself. Without write-combine
the DDR controller might decide to combine them, but not earlier.

This can make a difference especially if the write is a narrow write,
i.e. the access size is smaller than the bus width.

Let's say you do 32-bit writes, but your bus is 64 bits wide. With WC
two 32-bit writes can be combined into one 64-bit write. Without WC
that is not possible and you are potentially not using the bus to its
fullest capacity. This is especially true if the memory bus is wider
than the widest access size of the CPU.
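
The narrow-write arithmetic above can be checked with a small counting model. This is only a sketch under simplifying assumptions: writes are naturally aligned, the bus width is a multiple of the access size, and a write is merged only when it lands in the same bus-width-aligned beat as the write immediately before it. The function name `toy_count_bus_transactions` is hypothetical, and real interconnects have their own buffering rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count bus transactions for a series of equally sized writes.
 *
 * addrs:       byte address of each write, in program order
 * n:           number of writes
 * access_size: bytes per write (4 for 32-bit stores)
 * bus_bytes:   bus width in bytes (8 for a 64-bit bus)
 * combine:     non-zero to merge a write into the previous transaction
 *              when both fall into the same bus-width-aligned beat
 */
static size_t toy_count_bus_transactions(const uint64_t *addrs, size_t n,
					 size_t access_size, size_t bus_bytes,
					 int combine)
{
	size_t transactions = 0;
	uint64_t open_beat = 0;
	int have_open_beat = 0;

	assert(access_size > 0 && access_size <= bus_bytes);
	assert(bus_bytes % access_size == 0);

	for (size_t i = 0; i < n; i++) {
		uint64_t beat = addrs[i] / bus_bytes;

		assert(addrs[i] % access_size == 0);

		if (combine && have_open_beat && beat == open_beat)
			continue; /* merged into the transaction already open */

		transactions++;
		open_beat = beat;
		have_open_beat = 1;
	}

	return transactions;
}
```

For eight sequential 32-bit stores at addresses 0, 4, ..., 28 on a 64-bit bus, the model counts eight transactions without combining and four with it, which is the "two 32-bit writes per 64-bit write" case described in the mail.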

4 years, 2 months

completely rework the dma_resv semantic

by Christian König

Hi guys,

as discussed before, this set of patches completely reworks the
dma_resv semantic and spreads the new handling over all the existing
drivers and users.

First of all this drops the DAG approach, because it requires that
every single driver implements those relatively complicated rules
correctly, and any violation of that immediately leads to either
corruption of freed memory or even more severe security problems.

Instead we just keep all fences around all the time until they are
signaled. Only fences with the same context are assumed to be signaled
in the correct order, since this is exercised elsewhere as well.
Replacing fences is now only supported for hardware mechanisms like VM
page table updates, where the hardware can guarantee that the resource
can't be accessed any more.

Then the concept of a single exclusive fence and multiple shared fences
is dropped as well.

Instead the dma_resv object is now just a container for dma_fence
objects where each fence has associated usage flags. Those usage flags
describe how the operation represented by the dma_fence object is using
the resource protected by the dma_resv object. This allows us to add
multiple fences for each usage type.

In addition to the existing WRITE/READ usages this patch set also adds
the new KERNEL and OTHER usages. The KERNEL usage is used in cases
where the kernel needs to do some operation with the resource protected
by the dma_resv object, like copies or clears. Those are mandatory to
wait for when dynamic memory management is used.

The OTHER usage is for cases where we don't want the operation
represented by the dma_fence object to participate in any implicit
sync, but where it still needs to be respected by the kernel memory
management. Examples for those are VM page table updates and preemption
fences.

While doing this the new implementation cleans up existing workarounds
all over the place, but especially in amdgpu and TTM. Surprisingly I
also found two use cases for the KERNEL/OTHER usage in i915 and
Nouveau; those might need more thought.

In general the existing functionality should be preserved. The only
downside is that we now always need to reserve a slot before adding a
fence. The newly added call to the reservation function can probably
use some more cleanup.

TODOs: Testing, testing, testing, double-checking the newly added
kerneldoc for any typos.

Please review and/or comment,
Christian.
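
To make the container-with-usage-flags idea concrete, here is a small userspace sketch. It is not the API from the patch set: all `toy_*` names are invented, the fixed-size array is only for brevity, and the wait rules in `toy_must_wait()` are one reading of the cover letter (readers wait for KERNEL and WRITE, writers additionally for READ, memory management for everything including OTHER). The reserve-then-add pairing models the stated downside that a slot has to be reserved before a fence is added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Usage flags named after the cover letter; the enum itself is hypothetical. */
enum toy_usage {
	TOY_USAGE_KERNEL, /* kernel copies or clears */
	TOY_USAGE_WRITE,
	TOY_USAGE_READ,
	TOY_USAGE_OTHER,  /* e.g. page table updates, preemption fences */
};

struct toy_fence {
	enum toy_usage usage;
	bool signaled;
};

#define TOY_MAX_FENCES 8

/* Toy container: any number of fences per usage type, up to a fixed cap. */
struct toy_resv {
	const struct toy_fence *fences[TOY_MAX_FENCES];
	size_t count;    /* fences currently stored */
	size_t reserved; /* slots reserved but not yet filled */
};

/* Reserving can fail; adding afterwards cannot. */
static bool toy_resv_reserve(struct toy_resv *resv, size_t n)
{
	if (resv->count + resv->reserved + n > TOY_MAX_FENCES)
		return false;
	resv->reserved += n;
	return true;
}

static void toy_resv_add(struct toy_resv *resv, const struct toy_fence *fence)
{
	assert(resv->reserved > 0); /* a slot must have been reserved first */
	resv->reserved--;
	resv->fences[resv->count++] = fence;
}

/* Who is about to touch the resource (hypothetical categories). */
enum toy_waiter {
	TOY_WAITER_READ,        /* implicit sync for a new read */
	TOY_WAITER_WRITE,       /* implicit sync for a new write */
	TOY_WAITER_MEMORY_MGMT, /* eviction, moves, freeing */
};

static bool toy_must_wait(enum toy_waiter waiter, enum toy_usage usage)
{
	switch (waiter) {
	case TOY_WAITER_READ:
		return usage == TOY_USAGE_KERNEL || usage == TOY_USAGE_WRITE;
	case TOY_WAITER_WRITE:
		return usage != TOY_USAGE_OTHER;
	case TOY_WAITER_MEMORY_MGMT:
		return true;
	}
	return true;
}

/* Number of unsignaled fences the given waiter has to wait for. */
static size_t toy_resv_count_waits(const struct toy_resv *resv,
				   enum toy_waiter waiter)
{
	size_t waits = 0;

	for (size_t i = 0; i < resv->count; i++) {
		const struct toy_fence *fence = resv->fences[i];

		if (!fence->signaled && toy_must_wait(waiter, fence->usage))
			waits++;
	}
	return waits;
}
```

With one unsignaled fence of each usage in the container, a reader would wait for two fences, a writer for three, and memory management for all four.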

4 years, 2 months

Re: [Linaro-mm-sig] [PATCH v4] dma-buf: system_heap: Use 'for_each_sgtable_sg' in pages free flow

by Christian König

On 26.11.21 at 08:49, guangming.cao(a)mediatek.com wrote:
> From: Guangming <Guangming.Cao(a)mediatek.com>
>
> The previous version uses 'sg_table.nents' to traverse the sg_table in the
> pages free flow.
> However, 'sg_table.nents' is reassigned in 'dma_map_sg'; it then means the
> number of created entries in the DMA address space.
> So using 'sg_table.nents' in the pages free flow will cause some pages not
> to be freed.
>
> Here we should use sg_table.orig_nents to free pages memory, but using the
> sgtable helper 'for_each_sgtable_sg' (instead of the previous, rather common
> helper 'for_each_sg', which may cause a memory leak) is much better.
>
> Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if available")
> Signed-off-by: Guangming <Guangming.Cao(a)mediatek.com>
> Reviewed-by: Robin Murphy <robin.murphy(a)arm.com>

Reviewed-by: Christian König <christian.koenig(a)amd.com>

> Cc: <stable(a)vger.kernel.org> # 5.11.*
> ---
> v4: Correct commit message
>     1. Cc stable(a)vger.kernel.org in commit message and add required kernel version.
>     2. Add reviewed-by since patch V2 and V4 are the same and V2 is reviewed by Robin.
>     3. There is no new code change in V4.
> V3: Cc stable(a)vger.kernel.org
>     1. This patch needs to be merged into the stable branch, add
>        stable(a)vger.kernel.org to the mail list.
>     2. Correct some spelling mistakes.
>     3. There is no new code change in V3.
> V2: use 'for_each_sgtable_sg' to replace 'for_each_sg' as suggested by Robin.
>
> ---
>  drivers/dma-buf/heaps/system_heap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 23a7e74ef966..8660508f3684 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -289,7 +289,7 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
>  	int i;
>
>  	table = &buffer->sg_table;
> -	for_each_sg(table->sgl, sg, table->nents, i) {
> +	for_each_sgtable_sg(table, sg, i) {
>  		struct page *page = sg_page(sg);
>
>  		__free_pages(page, compound_order(page));
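
The difference between the two counters can be shown with a toy model. This is a userspace sketch, not the kernel scatterlist code: `struct toy_sg_table` and the `toy_*` helpers are invented for illustration, and "mapping" is reduced to shrinking `nents` the way a DMA mapping that merges entries would. Walking `nents` entries on release then leaves the tail of the table allocated, while walking `orig_nents` (which is what `for_each_sgtable_sg` iterates over) frees everything.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CHUNKS 16

/* Toy scatter table: one flag per page chunk instead of real pages. */
struct toy_sg_table {
	bool allocated[TOY_MAX_CHUNKS];
	unsigned int orig_nents; /* entries created at allocation time */
	unsigned int nents;      /* entries in use after DMA mapping */
};

static void toy_table_alloc(struct toy_sg_table *t, unsigned int chunks)
{
	assert(chunks > 0 && chunks <= TOY_MAX_CHUNKS);

	for (unsigned int i = 0; i < TOY_MAX_CHUNKS; i++)
		t->allocated[i] = i < chunks;
	t->orig_nents = chunks;
	t->nents = chunks;
}

/* Pretend the DMA mapping merged the entries into `mapped` segments. */
static void toy_dma_map(struct toy_sg_table *t, unsigned int mapped)
{
	assert(mapped > 0 && mapped <= t->orig_nents);
	t->nents = mapped;
}

/* Free the first `walk_count` chunks; return how many chunks leaked. */
static unsigned int toy_release(struct toy_sg_table *t, unsigned int walk_count)
{
	unsigned int leaked = 0;

	assert(walk_count <= t->orig_nents);

	for (unsigned int i = 0; i < walk_count; i++)
		t->allocated[i] = false;

	for (unsigned int i = 0; i < t->orig_nents; i++)
		if (t->allocated[i])
			leaked++;

	return leaked;
}
```

With eight chunks merged into three DMA segments, releasing by `nents` leaks five chunks in this model, and releasing by `orig_nents` leaks none.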

4 years, 2 months

Re: [Linaro-mm-sig] [PATCH v3] dma-buf: system_heap: Use 'for_each_sgtable_sg' in pages free flow

by Greg KH

On Fri, Nov 26, 2021 at 11:16:05AM +0800, guangming.cao(a)mediatek.com wrote:
> From: Guangming <Guangming.Cao(a)mediatek.com>
>
> The previous version uses 'sg_table.nents' to traverse the sg_table in the
> pages free flow.
> However, 'sg_table.nents' is reassigned in 'dma_map_sg'; it then means the
> number of created entries in the DMA address space.
> So using 'sg_table.nents' in the pages free flow will cause some pages not
> to be freed.
>
> Here we should use sg_table.orig_nents to free pages memory, but using the
> sgtable helper 'for_each_sgtable_sg' (instead of the previous, rather common
> helper 'for_each_sg', which may cause a memory leak) is much better.
>
> Fixes: d963ab0f15fb0 ("dma-buf: system_heap: Allocate higher order pages if available")
>
> Signed-off-by: Guangming <Guangming.Cao(a)mediatek.com>
> ---
>  drivers/dma-buf/heaps/system_heap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
> index 23a7e74ef966..8660508f3684 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -289,7 +289,7 @@ static void system_heap_dma_buf_release(struct dma_buf *dmabuf)
>  	int i;
>
>  	table = &buffer->sg_table;
> -	for_each_sg(table->sgl, sg, table->nents, i) {
> +	for_each_sgtable_sg(table, sg, i) {
>  		struct page *page = sg_page(sg);
>
>  		__free_pages(page, compound_order(page));
> --
> 2.17.1
>

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

4 years, 2 months
