On Wed, Feb 27, 2019 at 11:30:22AM +0100, Jan Kara wrote:
Hello!
Thanks for the detailed report and bisection!
On Wed 27-02-19 00:35:26, Thomas Lindroth wrote:
When I run "losetup --verbose --partscan --read-only --find /mnt/gemini.61rn.3T/Backups/debian.raw" on a 4.14.103 system, losetup hangs for exactly 3 minutes. After the hang the loopback device works like it should. The /mnt/gemini.61rn.3T mount is an ext4 fs on dm-crypt on a spinning SATA disk.
The hang was introduced in 4.14.95 and there are several loop related patches in 4.14.95. I bisected it down to commit c1e63df4f30c3918476ac9bc594355b0e9629893 "loop: Get rid of loop_index_mutex". Reverting that commit from 4.14.103 also fixes the problem.
So as you mention below, not all the problems with loop device deadlocks got fixed in stable kernels, as some changes were too intrusive for the stable tree. Now unfortunately the commit 0a42e99b58a "loop: Get rid of loop_index_mutex" that did get backported makes some deadlocks much easier to hit, as I see now that I'm looking into it. For example, when partitions are reread in loop_set_status(), it takes just one process trying to open the loop device to deadlock the kernel.
Actually that commit was already reverted in 4.4 stable because I pointed out to Greg earlier that it has a doubtful benefit without the followup fixes. But sadly it remained in other stable branches. Now going through the active branches, the summary seems to be:
3.18 and older: never applied
4.4: already reverted
4.9: needs revert
4.14: needs revert
4.19 and newer: followup fixes applied
So Greg, can you please revert the same three commits that you've reverted in 4.4 also in the 4.9 and 4.14 stable trees? These are:
0a42e99b58a "loop: Get rid of loop_index_mutex"
967d1dc144b "loop: Fold __loop_release into loop_release"
628bd859470 "loop: Fix double mutex_unlock(&loop_ctl_mutex) in loop_control_ioctl()"
Now all reverted, sorry about this.
greg k-h