Hello together,
On 11/5/21 22:34, Eric W. Biederman wrote:
I have to dash so this is short.
This is what I am thinking this change should look like.
I am not certain this is truly reviewable as a single change, so I will break it into a couple of smaller ones next time I get the chance.
I think we should concentrate to check the commit from Alexander.
What I did is to write two additional stress test apps - and now I'm able to trigger the use-after-free bug.
It is much simpler, the exclusion of exit_shm() and IPC_RMID didn't work - regardless if your approach or the approach from Alexander/myself is used.
+static inline void shm_clist_del(struct shmid_kernel *shp) +{
- struct task_struct *creator;
- rcu_read_lock();
- creator = rcu_dereference(shp->shm_creator);
- if (creator) {
task_lock(creator);
list_del(&shp->shm_clist);
task_unlock(creator);
- }
- rcu_read_unlock();
+}
shm_clist_del() only synchronizes against exit_shm() when shm_creator is not NULL.
list_del(&shp->shm_clist);
rcu_assign_pointer(shp->shm_creator, NULL);
We set shm_creator to NULL -> no more synchronization.
Now IPC_RMID can run in parallel - regardless if we test for list_empty() or shm_creator.
/* Guarantee shp lives after task_lock is dropped */
ipc_getref(&shp->shm_perm);
task_lock() doesn't help: As soon as shm_creator is set to NULL, IPC_RMID won't acquire task_lock() anymore.
Thus shp can disappear before we arrive at this ipc_getref.
[Yes, I think I have introduced this bug. ]
Corrected version attached.
I'll reboot and retest the patch, then I would send it to akpm as replacement for current patch in mmotm.
--
Manfred