On 03/16/2018 12:14 PM, jglisse@redhat.com wrote:
From: Ralph Campbell <rcampbell@nvidia.com>
<snip>
+static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+	struct hmm *hmm = mm->hmm;
+	struct hmm_mirror *mirror;
+	struct hmm_mirror *mirror_next;
+	down_write(&hmm->mirrors_sem);
+	list_for_each_entry_safe(mirror, mirror_next, &hmm->mirrors, list) {
+		list_del_init(&mirror->list);
+		if (mirror->ops->release)
+			mirror->ops->release(mirror);
+	}
+	up_write(&hmm->mirrors_sem);
+}
OK, as for actual code review:
This part of the locking looks good. However, I think it can race against hmm_mirror_register(), because hmm_mirror_register() will just add a new mirror regardless of whether hmm_release() has already run.
So:
thread 1                                 thread 2
--------------                           -----------------
hmm_release                              hmm_mirror_register
    down_write(&hmm->mirrors_sem);           <blocked: waiting for sem>
        // deletes all list items
    up_write
                                             unblocked: adds new mirror
...so I think we need a way to back out of any pending hmm_mirror_register() calls, as part of the .release steps, right? It seems hard for the device driver, which could be inside of hmm_mirror_register(), to handle that. Especially considering that right now, hmm_mirror_register() will return success in this case--so there is no indication that anything is wrong.
Maybe hmm_mirror_register() could return an error (and not add to the mirror list) in such a situation. How does that sound?
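To make the suggestion concrete, here is a rough userspace sketch of the idea, not a proposed patch. Everything in it is an assumption for illustration: the `dead` flag is a hypothetical field that does not exist in the posted code, a pthread mutex stands in for mirrors_sem (both paths take it for write anyway), the `model_*` names are made up, and -EBUSY is just a placeholder error code.

```c
/*
 * Userspace model only, NOT kernel code. The "dead" flag, the model_*
 * names and the -EBUSY return value are hypothetical; a pthread mutex
 * stands in for hmm->mirrors_sem.
 */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct mirror {
	struct mirror *next;
	bool released;		/* stands in for ops->release() having run */
};

struct hmm_model {
	pthread_mutex_t mirrors_sem;
	struct mirror *mirrors;	/* head of a singly linked mirror list */
	bool dead;		/* hypothetical: set once release has run */
};

static void model_init(struct hmm_model *hmm)
{
	pthread_mutex_init(&hmm->mirrors_sem, NULL);
	hmm->mirrors = NULL;
	hmm->dead = false;
}

/* Models hmm_release(): mark the hmm dead, then drop every mirror. */
static void model_release(struct hmm_model *hmm)
{
	pthread_mutex_lock(&hmm->mirrors_sem);
	hmm->dead = true;
	while (hmm->mirrors) {
		struct mirror *m = hmm->mirrors;

		hmm->mirrors = m->next;
		m->next = NULL;
		m->released = true;
	}
	pthread_mutex_unlock(&hmm->mirrors_sem);
}

/* Models hmm_mirror_register(): refuse to add a mirror after release. */
static int model_mirror_register(struct hmm_model *hmm, struct mirror *m)
{
	int ret = 0;

	pthread_mutex_lock(&hmm->mirrors_sem);
	if (hmm->dead) {
		ret = -EBUSY;	/* placeholder error code */
	} else {
		m->released = false;
		m->next = hmm->mirrors;
		hmm->mirrors = m;
	}
	pthread_mutex_unlock(&hmm->mirrors_sem);
	return ret;
}
```

Because the flag is set and checked under the same lock, a register call that was blocked while release ran sees `dead` once it gets the lock and fails instead of silently adding to an already-drained list. The driver then gets an explicit error rather than a success it cannot trust.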
thanks,