On Fri, Sep 07, 2018 at 12:43:09PM +0200, Roger Pau Monné wrote:
> I would prefer if you could avoid open-coding this here, and instead use xen_vbd_create or similar. I would also prefer that the call to xen_vbd_create in backend_changed was removed and we had a single call to xen_vbd_create that's used for both initial device connection and reconnection.
>
> Also, I think this could cause issues if for some reason the frontend switches to state 'Connected' before hotplug scripts have run, in which case you would try to open an unexpected device because pdevice won't be correctly set.

Sure, this is just to test if the idea would work and needs a lot of cleanup. Unfortunately it does not seem to help with the original problem because this case is not executed on VM shutdown:

	case XenbusStateClosed:
		xen_blkif_disconnect(be->blkif);
		xen_vbd_free(&be->blkif->vbd);
		xenbus_switch_state(dev, XenbusStateClosed);

Instead, xen_vbd_free() gets run from a different code path (xen_blkif_deferred_free(), on the events workqueue) after the remove script has already failed:

[ 337.407634] block drbd0: State change failed: Device is held open by someone
[ 337.407673] block drbd0: state = { cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate r----- }
[ 337.407713] block drbd0: wanted = { cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate r----- }
...
[ 340.109459] Workqueue: events xen_blkif_deferred_free [xen_blkback]
[ 340.109461] 0000000000000000 ffffffff81331e54 ffff883f84d19d38 ffff883f84d19d32
[ 340.109463] ffffffffc058169e ffff883f84d19d88 ffff883f84d19d20 ffffffffc05816f7
[ 340.109465] ffff883f84d19d88 ffff883f87b5a900 ffffffff81092fea 0000000088ec3080
[ 340.109467] Call Trace:
[ 340.109471] [<ffffffff81331e54>] ? dump_stack+0x5c/0x78
[ 340.109473] [<ffffffffc058169e>] ? xen_vbd_free.isra.9+0x2e/0x60 [xen_blkback]
[ 340.109475] [<ffffffffc05816f7>] ? xen_blkif_deferred_free+0x27/0x70 [xen_blkback]
[ 340.109477] [<ffffffff81092fea>] ? process_one_work+0x18a/0x420
[ 340.109479] [<ffffffff810932cd>] ? worker_thread+0x4d/0x490
[ 340.109480] [<ffffffff81093280>] ? process_one_work+0x420/0x420
[ 340.109482] [<ffffffff81099329>] ? kthread+0xd9/0xf0
[ 340.109484] [<ffffffff81099250>] ? kthread_park+0x60/0x60
[ 340.109486] [<ffffffff81615df7>] ? ret_from_fork+0x57/0x70
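
To make the ordering problem concrete, here is a tiny user-space model of what the log seems to show. It is only a sketch, not kernel code: every name in it (model_bdev, model_remove_script, model_vbd_free and the two model_shutdown_* helpers) is made up for illustration, and it assumes the remove script fails for exactly one reason, namely that the backend still holds the device open.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in xen-blkback. */
struct model_bdev {
	int openers;		/* number of holders keeping the device open */
};

/* Stands in for the hotplug remove script (e.g. demoting DRBD). */
bool model_remove_script(const struct model_bdev *bdev)
{
	return bdev->openers == 0;	/* fails while the device is held open */
}

/* Stands in for xen_vbd_free() dropping the backend's reference. */
void model_vbd_free(struct model_bdev *bdev)
{
	assert(bdev->openers > 0);
	bdev->openers--;
}

/* Order seen in the log: remove script first, deferred free afterwards. */
bool model_shutdown_observed(struct model_bdev *bdev)
{
	bool script_ok = model_remove_script(bdev);

	model_vbd_free(bdev);
	return script_ok;
}

/* Order that would be needed: release the device, then run the script. */
bool model_shutdown_wanted(struct model_bdev *bdev)
{
	model_vbd_free(bdev);
	return model_remove_script(bdev);
}
```

Starting from a device with one opener, model_shutdown_observed() reports a failed script even though the reference is dropped right afterwards, while model_shutdown_wanted() succeeds. That is the whole point: the release has to happen before the script runs, not from the deferred free.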