This is the start of the stable review cycle for the 4.19.32 release.
There are 45 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu Mar 28 04:26:41 UTC 2019.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.32-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.32-rc1
Baolin Wang <baolin.wang(a)linaro.org>
power: supply: charger-manager: Fix incorrect return value
Hui Wang <hui.wang(a)canonical.com>
ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda - Record the current power state before suspend/resume calls
Vlastimil Babka <vbabka(a)suse.cz>
mm, mempolicy: fix uninit memory access
Waiman Long <longman(a)redhat.com>
locking/lockdep: Add debug_locks check in __lock_downgrade()
Jann Horn <jannh(a)google.com>
x86/unwind: Add hardcoded ORC entry for NULL
Jann Horn <jannh(a)google.com>
x86/unwind: Handle NULL pointer calls better in frame unwinder
Dongli Zhang <dongli.zhang(a)oracle.com>
loop: access lo_backing_file only when the loop device is Lo_bound
Florian Westphal <fw(a)strlen.de>
netfilter: ebtables: remove BUGPRINT messages
Chao Yu <yuchao0(a)huawei.com>
f2fs: fix to avoid deadlock of atomic file operations
Myungho Jung <mhjungk(a)gmail.com>
RDMA/cma: Rollback source IP address if failing to acquire device
Chris Wilson <chris(a)chris-wilson.co.uk>
drm: Reorder set_property_atomic to avoid returning with an active ww_ctx
Kefeng Wang <wangkefeng.wang(a)huawei.com>
Bluetooth: hci_ldisc: Postpone HCI_UART_PROTO_READY bit set in hci_uart_set_proto()
Jeremy Cline <jcline(a)redhat.com>
Bluetooth: hci_ldisc: Initialize hci_dev before open()
Myungho Jung <mhjungk(a)gmail.com>
Bluetooth: Fix decrementing reference count twice in releasing socket
Myungho Jung <mhjungk(a)gmail.com>
Bluetooth: hci_uart: Check if socket buffer is ERR_PTR in h4_recv_buf()
Hans Verkuil <hverkuil(a)xs4all.nl>
media: v4l2-ctrls.c/uvc: zero v4l2_event
zhangyi (F) <yi.zhang(a)huawei.com>
ext4: brelse all indirect buffer in ext4_ind_remove_space()
Lukas Czerner <lczerner(a)redhat.com>
ext4: fix data corruption caused by unaligned direct AIO
Jiufei Xue <jiufei.xue(a)linux.alibaba.com>
ext4: fix NULL pointer dereference while journal is aborted
Takashi Iwai <tiwai(a)suse.de>
ALSA: ac97: Fix of-node refcount unbalance
Arnd Bergmann <arnd(a)arndb.de>
ALSA: hda/ca0132 - make pci_iounmap() call conditional
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
ALSA: x86: Fix runtime PM for hdmi-lpe-audio
Steve French <stfrench(a)microsoft.com>
SMB3: Fix SMB3.1.1 guest mounts to Samba
Rasmus Villemoes <linux(a)rasmusvillemoes.dk>
irqchip/gic-v3-its: Fix comparison logic in lpi_range_cmp
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Move objtool_file struct off the stack
Adrian Hunter <adrian.hunter(a)intel.com>
perf probe: Fix getting the kernel map
Ronnie Sahlberg <lsahlber(a)redhat.com>
cifs: allow guest mounts to work for smb3.11
Chen Jie <chenjie6(a)huawei.com>
futex: Ensure that futex address is aligned in handle_futex_death()
Tyrel Datwyler <tyreld(a)linux.vnet.ibm.com>
scsi: ibmvscsi: Fix empty event pool access during host removal
Tyrel Datwyler <tyreld(a)linux.vnet.ibm.com>
scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modificaiton
Michael Ellerman <mpe(a)ellerman.id.au>
powerpc/vdso64: Fix CLOCK_MONOTONIC inconsistencies across Y2038
Archer Yan <ayan(a)wavecomp.com>
MIPS: Fix kernel crash for R6 in jump label branch function
Yasha Cherikovsky <yasha.che3(a)gmail.com>
MIPS: Ensure ELF appended dtb is relocated
Yifeng Li <tomli(a)tomli.me>
mips: loongson64: lemote-2f: Add IRQF_NO_SUSPEND to "cascade" irqaction.
Jan Kara <jack(a)suse.cz>
udf: Fix crash on IO error during truncate
Ilya Dryomov <idryomov(a)gmail.com>
libceph: wait for latest osdmap in ceph_monc_blacklist_add()
Stanislaw Gruszka <sgruszka(a)redhat.com>
iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE
Deepak Rawat <drawat(a)vmware.com>
drm/vmwgfx: Return 0 when gmrid::get_node runs out of ID's
Thomas Zimmermann <tzimmermann(a)suse.de>
drm/vmwgfx: Don't double-free the mode stored in par->set_mode
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
mmc: renesas_sdhi: limit block count to 16 bit for old revisions
Alexander Shiyan <shc_work(a)mail.ru>
mmc: mxcmmc: "Revert mmc: mxcmmc: handle highmem pages"
Arnd Bergmann <arnd(a)arndb.de>
mmc: pxamci: fix enum type confusion
Takashi Sakamoto <o-takashi(a)sakamocchi.jp>
ALSA: firewire-motu: use 'version' field of unit directory to identify model
Jaroslav Kysela <perex(a)perex.cz>
ALSA: hda - add Lenovo IdeaCentre B550 to the power_save_blacklist
-------------
Diffstat:
Makefile | 4 +-
arch/mips/include/asm/jump_label.h | 8 +-
arch/mips/kernel/vmlinux.lds.S | 12 ++-
arch/mips/loongson64/lemote-2f/irq.c | 2 +-
arch/powerpc/include/asm/vdso_datapage.h | 8 +-
arch/powerpc/kernel/vdso64/gettimeofday.S | 4 +-
arch/x86/include/asm/unwind.h | 6 ++
arch/x86/kernel/unwind_frame.c | 25 ++++-
arch/x86/kernel/unwind_orc.c | 17 ++++
drivers/block/loop.c | 2 +-
drivers/bluetooth/h4_recv.h | 4 +
drivers/bluetooth/hci_h4.c | 4 +
drivers/bluetooth/hci_ldisc.c | 24 +++--
drivers/gpu/drm/drm_mode_object.c | 5 +-
drivers/gpu/drm/vmwgfx/vmwgfx_fb.c | 12 +--
drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c | 2 +-
drivers/infiniband/core/cma.c | 13 ++-
drivers/iommu/amd_iommu.c | 7 +-
drivers/irqchip/irq-gic-v3-its.c | 2 +-
drivers/media/usb/uvc/uvc_ctrl.c | 2 +-
drivers/media/v4l2-core/v4l2-ctrls.c | 2 +-
drivers/mmc/host/mxcmmc.c | 16 +---
drivers/mmc/host/pxamci.c | 2 +-
drivers/mmc/host/renesas_sdhi_core.c | 8 +-
drivers/power/supply/charger-manager.c | 3 +-
drivers/scsi/ibmvscsi/ibmvscsi.c | 23 ++++-
fs/cifs/smb2pdu.c | 11 ++-
fs/ext4/ext4_jbd2.h | 2 +-
fs/ext4/file.c | 2 +-
fs/ext4/indirect.c | 12 ++-
fs/f2fs/segment.c | 43 ++++++---
fs/udf/truncate.c | 3 +
include/linux/ceph/libceph.h | 2 +
kernel/futex.c | 4 +
kernel/locking/lockdep.c | 3 +
mm/mempolicy.c | 2 +-
net/bluetooth/hci_sock.c | 3 +-
net/bridge/netfilter/ebtables.c | 131 ++++++++------------------
net/ceph/ceph_common.c | 18 +++-
net/ceph/mon_client.c | 9 ++
sound/ac97/bus.c | 2 +-
sound/firewire/motu/motu.c | 20 ++--
sound/pci/hda/hda_codec.c | 57 ++++++++++-
sound/pci/hda/hda_intel.c | 6 +-
sound/pci/hda/patch_ca0132.c | 2 +-
sound/x86/intel_hdmi_audio.c | 1 -
tools/objtool/check.c | 3 +-
tools/perf/util/probe-event.c | 6 +-
48 files changed, 355 insertions(+), 204 deletions(-)
When we hot-remove a device, usually the host sends us a PCI_EJECT message,
and a PCI_BUS_RELATIONS message with bus_rel->device_count == 0. But when
we do the quick hot-add/hot-remove test, the host may not send us the
PCI_EJECT message, if the guest has not fully finished the initialization
by sending the PCI_RESOURCES_ASSIGNED* message to the host, so it's
potentially unsafe to only depend on the pci_destroy_slot() in
hv_eject_device_work(), though create_root_hv_pci_bus() ->
hv_pci_assign_slots() is not called in this case. Note: in this case, the
host still sends the guest a PCI_BUS_RELATIONS message with
bus_rel->device_count == 0.
And, in the quick hot-add/hot-remove test, we can have such a race: before
pci_devices_present_work() -> new_pcichild_device() adds the new device
into hbus->children, we may have already received the PCI_EJECT message,
and hence the tasklet handler hv_pci_onchannelcallback() may fail to find
the "hpdev" by get_pcichild_wslot(hbus, dev_message->wslot.slot), so
hv_pci_eject_device() is NOT called; later create_root_hv_pci_bus() ->
hv_pci_assign_slots() creates the slot, and the PCI_BUS_RELATIONS message
with bus_rel->device_count == 0 removes the device from hbus->children, and
we end up being unable to remove the slot in hv_pci_remove() ->
hv_pci_remove_slots().
The patch removes the slot in pci_devices_present_work() when the device
is removed. This can address the above race. Note 1:
pci_devices_present_work() and hv_eject_device_work() run in the
single-threaded hbus->wq, so there is not a double-remove issue for the
slot. Note 2: we can't offload hv_pci_eject_device() from
hv_pci_onchannelcallback() to the workqueue, because we need
hv_pci_onchannelcallback() to synchronously call hv_pci_eject_device() to
poll the channel's ringbuffer to work around the
"hangs in hv_compose_msi_msg()" issue: see
commit de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
Fixes: a15f2c08c708 ("PCI: hv: support reporting serial number as slot information")
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Cc: stable(a)vger.kernel.org
---
drivers/pci/controller/pci-hyperv.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index b489412e3502..82acd6155adf 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1776,6 +1776,10 @@ static void pci_devices_present_work(struct work_struct *work)
 		hpdev = list_first_entry(&removed, struct hv_pci_dev,
 					 list_entry);
 		list_del(&hpdev->list_entry);
+
+		if (hpdev->pci_slot)
+			pci_destroy_slot(hpdev->pci_slot);
+
 		put_pcichild(hpdev);
 	}
--
2.19.1
This is the start of the stable review cycle for the 4.14.109 release.
There are 41 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu Mar 28 04:26:32 UTC 2019.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.109-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.109-rc1
Arnd Bergmann <arnd(a)arndb.de>
ath10k: avoid possible string overflow
Baolin Wang <baolin.wang(a)linaro.org>
power: supply: charger-manager: Fix incorrect return value
Enric Balletbo i Serra <enric.balletbo(a)collabora.com>
pwm-backlight: Enable/disable the PWM before/after LCD enable toggle.
Jules Maselbas <jules.maselbas(a)arm.com>
sched/cpufreq/schedutil: Fix error path mutex unlock
Baolin Wang <baolin.wang(a)linaro.org>
rtc: Fix overflow when converting time64_t to rtc_time
Kishon Vijay Abraham I <kishon(a)ti.com>
PCI: endpoint: Use EPC's device in dma_alloc_coherent()/dma_free_coherent()
Niklas Cassel <niklas.cassel(a)axis.com>
PCI: designware-ep: Read-only registers need DBI_RO_WR_EN to be writable
Niklas Cassel <niklas.cassel(a)axis.com>
PCI: designware-ep: dw_pcie_ep_set_msi() should only set MMC bits
kehuanlin <chgokhl(a)gmail.com>
scsi: ufs: fix wrong command type of UTRD for UFSHCI v2.1
Andrey Konovalov <andreyknvl(a)google.com>
USB: core: only clean up what we allocated
Peter Zijlstra <peterz(a)infradead.org>
lib/int_sqrt: optimize small argument
Hui Wang <hui.wang(a)canonical.com>
ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec
Takashi Iwai <tiwai(a)suse.de>
ALSA: hda - Record the current power state before suspend/resume calls
Vlastimil Babka <vbabka(a)suse.cz>
mm, mempolicy: fix uninit memory access
Waiman Long <longman(a)redhat.com>
locking/lockdep: Add debug_locks check in __lock_downgrade()
Jann Horn <jannh(a)google.com>
x86/unwind: Add hardcoded ORC entry for NULL
Jann Horn <jannh(a)google.com>
x86/unwind: Handle NULL pointer calls better in frame unwinder
Florian Westphal <fw(a)strlen.de>
netfilter: ebtables: remove BUGPRINT messages
Chris Wilson <chris(a)chris-wilson.co.uk>
drm: Reorder set_property_atomic to avoid returning with an active ww_ctx
Kefeng Wang <wangkefeng.wang(a)huawei.com>
Bluetooth: hci_ldisc: Postpone HCI_UART_PROTO_READY bit set in hci_uart_set_proto()
Jeremy Cline <jcline(a)redhat.com>
Bluetooth: hci_ldisc: Initialize hci_dev before open()
Myungho Jung <mhjungk(a)gmail.com>
Bluetooth: Fix decrementing reference count twice in releasing socket
Myungho Jung <mhjungk(a)gmail.com>
Bluetooth: hci_uart: Check if socket buffer is ERR_PTR in h4_recv_buf()
Hans Verkuil <hverkuil(a)xs4all.nl>
media: v4l2-ctrls.c/uvc: zero v4l2_event
zhangyi (F) <yi.zhang(a)huawei.com>
ext4: brelse all indirect buffer in ext4_ind_remove_space()
Lukas Czerner <lczerner(a)redhat.com>
ext4: fix data corruption caused by unaligned direct AIO
Jiufei Xue <jiufei.xue(a)linux.alibaba.com>
ext4: fix NULL pointer dereference while journal is aborted
Ville Syrjälä <ville.syrjala(a)linux.intel.com>
ALSA: x86: Fix runtime PM for hdmi-lpe-audio
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Move objtool_file struct off the stack
Adrian Hunter <adrian.hunter(a)intel.com>
perf probe: Fix getting the kernel map
Chen Jie <chenjie6(a)huawei.com>
futex: Ensure that futex address is aligned in handle_futex_death()
Tyrel Datwyler <tyreld(a)linux.vnet.ibm.com>
scsi: ibmvscsi: Fix empty event pool access during host removal
Tyrel Datwyler <tyreld(a)linux.vnet.ibm.com>
scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modificaiton
Archer Yan <ayan(a)wavecomp.com>
MIPS: Fix kernel crash for R6 in jump label branch function
Yasha Cherikovsky <yasha.che3(a)gmail.com>
MIPS: Ensure ELF appended dtb is relocated
Yifeng Li <tomli(a)tomli.me>
mips: loongson64: lemote-2f: Add IRQF_NO_SUSPEND to "cascade" irqaction.
Jan Kara <jack(a)suse.cz>
udf: Fix crash on IO error during truncate
Ilya Dryomov <idryomov(a)gmail.com>
libceph: wait for latest osdmap in ceph_monc_blacklist_add()
Stanislaw Gruszka <sgruszka(a)redhat.com>
iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE
Thomas Zimmermann <tzimmermann(a)suse.de>
drm/vmwgfx: Don't double-free the mode stored in par->set_mode
Arnd Bergmann <arnd(a)arndb.de>
mmc: pxamci: fix enum type confusion
-------------
Diffstat:
Makefile | 4 +-
arch/mips/include/asm/jump_label.h | 8 +-
arch/mips/kernel/vmlinux.lds.S | 12 +--
arch/mips/loongson64/lemote-2f/irq.c | 2 +-
arch/x86/include/asm/unwind.h | 6 ++
arch/x86/kernel/unwind_frame.c | 25 ++++++-
arch/x86/kernel/unwind_orc.c | 17 +++++
drivers/bluetooth/hci_h4.c | 4 +
drivers/bluetooth/hci_ldisc.c | 24 +++---
drivers/gpu/drm/drm_mode_object.c | 5 +-
drivers/gpu/drm/vmwgfx/vmwgfx_fb.c | 12 +--
drivers/iommu/amd_iommu.c | 7 +-
drivers/media/usb/uvc/uvc_ctrl.c | 2 +-
drivers/media/v4l2-core/v4l2-ctrls.c | 2 +-
drivers/mmc/host/pxamci.c | 2 +-
drivers/net/wireless/ath/ath10k/wmi.c | 2 +-
drivers/pci/dwc/pcie-designware-ep.c | 12 ++-
drivers/pci/dwc/pcie-designware.h | 1 +
drivers/pci/endpoint/pci-epc-core.c | 10 ---
drivers/pci/endpoint/pci-epf-core.c | 4 +-
drivers/power/supply/charger-manager.c | 3 +-
drivers/rtc/rtc-lib.c | 6 +-
drivers/scsi/ibmvscsi/ibmvscsi.c | 23 +++++-
drivers/scsi/ufs/ufshcd.c | 14 ++--
drivers/usb/core/config.c | 9 ++-
drivers/video/backlight/pwm_bl.c | 9 ++-
fs/ext4/ext4_jbd2.h | 2 +-
fs/ext4/file.c | 2 +-
fs/ext4/indirect.c | 12 ++-
fs/udf/truncate.c | 3 +
include/linux/ceph/libceph.h | 2 +
kernel/futex.c | 4 +
kernel/locking/lockdep.c | 3 +
kernel/sched/cpufreq_schedutil.c | 3 +-
lib/int_sqrt.c | 3 +
mm/mempolicy.c | 2 +-
net/bluetooth/hci_sock.c | 3 +-
net/bridge/netfilter/ebtables.c | 131 ++++++++++-----------------------
net/ceph/ceph_common.c | 18 ++++-
net/ceph/mon_client.c | 9 +++
sound/pci/hda/hda_codec.c | 57 +++++++++++++-
sound/x86/intel_hdmi_audio.c | 1 -
tools/objtool/check.c | 3 +-
tools/perf/util/probe-event.c | 6 +-
44 files changed, 303 insertions(+), 186 deletions(-)
After a device is just created in new_pcichild_device(), hpdev->refs is set
to 2 (i.e. the initial value of 1 plus the get_pcichild()).
When we hot-remove the device from the host, the Linux VM first calls
hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and
then schedules the work item hv_eject_device_work(), so hpdev->refs becomes 3
(let's ignore the paired get/put_pcichild() in other places). But in
hv_eject_device_work(), currently we only call put_pcichild() twice,
meaning the 'hpdev' struct can't be freed in put_pcichild(). This patch
adds one put_pcichild() to fix the memory leak.
BTW, the device can also be removed when we run "rmmod pci-hyperv". On this
path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()),
hpdev->refs is 2, and we do correctly call put_pcichild() twice in
pci_devices_present_work().
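The reference counting described above can be checked with a small
stand-alone model. This is only a sketch: struct model_dev and the model_*()
helpers are hypothetical stand-ins, not the pci-hyperv code, and the only
thing modelled is the counter itself.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct hv_pci_dev; only the refcount is modelled. */
struct model_dev {
	int refs;
	bool freed;
};

/* Models get_pcichild(): take one reference. */
void model_get(struct model_dev *d)
{
	d->refs++;
}

/* Models put_pcichild(): drop one reference, "free" on the last one. */
void model_put(struct model_dev *d)
{
	if (--d->refs == 0)
		d->freed = true;
}

/* Models new_pcichild_device(): initial value of 1 plus one get. */
void model_create(struct model_dev *d)
{
	d->refs = 1;
	d->freed = false;
	model_get(d);
}

/*
 * Models the hot-remove path: the eject request takes one more reference,
 * then the work item drops `puts` references. Returns true if the object
 * ends up freed.
 */
bool model_hot_remove(int puts)
{
	struct model_dev d;
	int i;

	model_create(&d);	/* refs == 2 */
	model_get(&d);		/* eject request: refs == 3 */
	for (i = 0; i < puts; i++)
		model_put(&d);
	return d.freed;
}
```

With two puts the model leaves one reference behind, which is the leak; with
three puts the object is freed.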
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/pci/controller/pci-hyperv.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 95441a35eceb..30f16b882746 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1900,6 +1900,9 @@ static void hv_eject_device_work(struct work_struct *work)
 			 sizeof(*ejct_pkt), (unsigned long)&ctxt.pkt,
 			 VM_PKT_DATA_INBAND, 0);

+	/* For the get_pcichild() in hv_pci_eject_device() */
+	put_pcichild(hpdev);
+	/* For the two refs got in new_pcichild_device() */
 	put_pcichild(hpdev);
 	put_pcichild(hpdev);
 	put_hvpcibus(hpdev->hbus);
--
2.19.1
Suppose more than one non-NPIV FCP device is active on the same channel.
Send I/O to storage and have some of the pending I/O run into a SCSI
command timeout, e.g. due to bit errors on the fibre. Now the error
situation stops. However, we saw FCP requests continue to time out in the
channel. The abort will be successful, but the subsequent TUR fails.
Scsi_eh starts. The LUN reset fails. The target reset fails.
The host reset only did an FCP device recovery. However, for non-NPIV
FCP devices, this does not close and reopen ports on the SAN-side
if other non-NPIV FCP device(s) share the same open ports.
In order to resolve the continuing FCP request timeouts, we need to
explicitly close and reopen ports on the SAN-side.
This was missing since the beginning of zfcp in v2.6.0 history commit
ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.").
Note: The FSF requests for forced port reopen could run into FSF request
timeouts due to other reasons. This would trigger an internal FCP device
recovery. Pending forced port reopen recoveries would get dismissed. So
some ports might not get fully reopened during this host reset handler.
However, subsequent I/O would trigger the above described escalation
and eventually all ports would be forcibly reopened to resolve any
continuing FCP request timeouts due to earlier bit errors.
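The decision made in the host reset handler can be sketched as a small
stand-alone model. Everything named model_* or MODEL_* below is a
hypothetical stand-in for illustration; the sketch ignores locking, ERP
waiting and FSF request timeouts, and only counts which recovery actions
would be triggered.

```c
#include <stddef.h>

/* Hypothetical stand-in for FSF_FEATURE_NPIV_MODE. */
#define MODEL_FEATURE_NPIV_MODE 0x1u

struct model_port {
	int forced_reopens;	/* how often a forced reopen was requested */
};

struct model_adapter {
	unsigned int connection_features;
	struct model_port *ports;
	size_t nr_ports;
	int adapter_reopens;	/* how often an adapter reopen was requested */
};

/* Request a forced reopen for every port known to the adapter. */
void model_port_forced_reopen_all(struct model_adapter *a)
{
	size_t i;

	for (i = 0; i < a->nr_ports; i++)
		a->ports[i].forced_reopens++;
}

/*
 * Host reset: non-NPIV adapters first force all ports closed and reopened,
 * then every adapter gets the usual FCP device recovery.
 */
void model_host_reset(struct model_adapter *a)
{
	if (!(a->connection_features & MODEL_FEATURE_NPIV_MODE))
		model_port_forced_reopen_all(a);
	a->adapter_reopens++;
}
```

In this model an NPIV adapter only sees the adapter reopen, while a non-NPIV
adapter additionally gets one forced reopen per port.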
Signed-off-by: Steffen Maier <maier(a)linux.ibm.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: <stable(a)vger.kernel.org> #3.0+
Reviewed-by: Jens Remus <jremus(a)linux.ibm.com>
Reviewed-by: Benjamin Block <bblock(a)linux.ibm.com>
---
drivers/s390/scsi/zfcp_erp.c | 14 ++++++++++++++
drivers/s390/scsi/zfcp_ext.h | 2 ++
drivers/s390/scsi/zfcp_scsi.c | 4 ++++
3 files changed, 20 insertions(+)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index c0b2348d7ce6..e8fc28dba8df 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -624,6 +624,20 @@ static void zfcp_erp_strategy_memwait(struct zfcp_erp_action *erp_action)
 	add_timer(&erp_action->timer);
 }

+void zfcp_erp_port_forced_reopen_all(struct zfcp_adapter *adapter,
+				     int clear, char *dbftag)
+{
+	unsigned long flags;
+	struct zfcp_port *port;
+
+	write_lock_irqsave(&adapter->erp_lock, flags);
+	read_lock(&adapter->port_list_lock);
+	list_for_each_entry(port, &adapter->port_list, list)
+		_zfcp_erp_port_forced_reopen(port, clear, dbftag);
+	read_unlock(&adapter->port_list_lock);
+	write_unlock_irqrestore(&adapter->erp_lock, flags);
+}
+
 static void _zfcp_erp_port_reopen_all(struct zfcp_adapter *adapter,
 				      int clear, char *dbftag)
 {
diff --git a/drivers/s390/scsi/zfcp_ext.h b/drivers/s390/scsi/zfcp_ext.h
index 3fce47b0b21b..c6acca521ffe 100644
--- a/drivers/s390/scsi/zfcp_ext.h
+++ b/drivers/s390/scsi/zfcp_ext.h
@@ -70,6 +70,8 @@ extern void zfcp_erp_port_reopen(struct zfcp_port *port, int clear,
 				 char *dbftag);
 extern void zfcp_erp_port_shutdown(struct zfcp_port *, int, char *);
 extern void zfcp_erp_port_forced_reopen(struct zfcp_port *, int, char *);
+extern void zfcp_erp_port_forced_reopen_all(struct zfcp_adapter *adapter,
+					    int clear, char *dbftag);
 extern void zfcp_erp_set_lun_status(struct scsi_device *, u32);
 extern void zfcp_erp_clear_lun_status(struct scsi_device *, u32);
 extern void zfcp_erp_lun_reopen(struct scsi_device *, int, char *);
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c
index f4f6a07c5222..221d0dfb8493 100644
--- a/drivers/s390/scsi/zfcp_scsi.c
+++ b/drivers/s390/scsi/zfcp_scsi.c
@@ -368,6 +368,10 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt)
 	struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
 	int ret = SUCCESS, fc_ret;

+	if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) {
+		zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p");
+		zfcp_erp_wait(adapter);
+	}
 	zfcp_erp_adapter_reopen(adapter, 0, "schrh_1");
 	zfcp_erp_wait(adapter);
 	fc_ret = fc_block_scsi_eh(scpnt);
--
2.16.4
An already deleted SCSI device can exist on the Scsi_Host and remain
there because something still holds a reference.
A new SCSI device with the same H:C:T:L and FCP device, target port WWPN,
and FCP LUN can be created.
When we try to unblock an rport, we still find the deleted SCSI device
and return early because the zfcp_scsi_dev of that SCSI device is
not ZFCP_STATUS_COMMON_UNBLOCKED. Hence we fail to unblock the rport,
even if the new proper SCSI device would be in good state.
Therefore, skip deleted SCSI devices when iterating the sdevs of the shost.
[cf. __scsi_device_lookup{_by_target}() or scsi_device_get()]
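The effect of skipping deleted devices can be illustrated with a small
stand-alone model. This is a sketch under simplifying assumptions: the
model_* types and the skip_deleted switch are hypothetical, ports are
reduced to an integer id, and the LUN status is reduced to a single
"unblocked" flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the relevant SCSI device states. */
enum model_sdev_state {
	MODEL_SDEV_RUNNING,
	MODEL_SDEV_CANCEL,
	MODEL_SDEV_DEL,
};

struct model_sdev {
	enum model_sdev_state state;
	int port_id;		/* which port this LUN sits under */
	bool unblocked;		/* models ZFCP_STATUS_COMMON_UNBLOCKED */
};

/*
 * Returns true if the rport of port_id would be unblocked. With
 * skip_deleted == false a stale, deleted device that is not unblocked
 * makes the loop return early, as described above.
 */
bool model_try_rport_unblock(const struct model_sdev *sdevs, size_t n,
			     int port_id, bool skip_deleted)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct model_sdev *s = &sdevs[i];

		if (skip_deleted &&
		    (s->state == MODEL_SDEV_DEL ||
		     s->state == MODEL_SDEV_CANCEL))
			continue;
		if (s->port_id != port_id)
			continue;
		/* LUN under port of interest */
		if (!s->unblocked)
			return false;
	}
	return true;
}
```

Given one deleted, still-blocked device and one healthy device under the same
port, the model only unblocks the rport when deleted devices are skipped.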
The following abbreviated trace sequence can indicate such a problem:
Area           : REC
Tag            : ersfs_3
LUN            : 0x4045400300000000
WWPN           : 0x50050763031bd327
LUN status     : 0x40000000     not ZFCP_STATUS_COMMON_UNBLOCKED
Ready count    : n              not incremented yet
Running count  : 0x00000000
ERP want       : 0x01
ERP need       : 0xc1           ZFCP_ERP_ACTION_NONE

Area           : REC
Tag            : ersfs_3
LUN            : 0x4045400300000000
WWPN           : 0x50050763031bd327
LUN status     : 0x41000000
Ready count    : n+1
Running count  : 0x00000000
ERP want       : 0x01
ERP need       : 0x01

...

Area           : REC
Level          : 4              only with increased trace level
Tag            : ertru_l
LUN            : 0x4045400300000000
WWPN           : 0x50050763031bd327
LUN status     : 0x40000000
Request ID     : 0x0000000000000000
ERP status     : 0x01800000
ERP step       : 0x1000
ERP action     : 0x01
ERP count      : 0x00

NOT followed by a trace record with tag "scpaddy"
for WWPN 0x50050763031bd327.
Signed-off-by: Steffen Maier <maier(a)linux.ibm.com>
Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery")
Cc: <stable(a)vger.kernel.org> #2.6.32+
Reviewed-by: Jens Remus <jremus(a)linux.ibm.com>
Reviewed-by: Benjamin Block <bblock(a)linux.ibm.com>
---
drivers/s390/scsi/zfcp_erp.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/s390/scsi/zfcp_erp.c b/drivers/s390/scsi/zfcp_erp.c
index 744a64680d5b..c0b2348d7ce6 100644
--- a/drivers/s390/scsi/zfcp_erp.c
+++ b/drivers/s390/scsi/zfcp_erp.c
@@ -1341,6 +1341,9 @@ static void zfcp_erp_try_rport_unblock(struct zfcp_port *port)
 		struct zfcp_scsi_dev *zsdev = sdev_to_zfcp(sdev);
 		int lun_status;

+		if (sdev->sdev_state == SDEV_DEL ||
+		    sdev->sdev_state == SDEV_CANCEL)
+			continue;
 		if (zsdev->port != port)
 			continue;
 		/* LUN under port of interest */
--
2.16.4
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: 239cc2c5a3c8 - Linux 5.0.4
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
  ,-.   ,-.
 ( C ) ( K )  Continuous
  `-',-.`-'   Kernel
    ( I )     Integration
     `-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out a ref:
Repo: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Ref: 239cc2c5a3c8 - Linux 5.0.4
We then merged the patchset with `git am`:
alsa-hda-add-lenovo-ideacentre-b550-to-the-power_save_blacklist.patch
alsa-firewire-motu-use-version-field-of-unit-directory-to-identify-model.patch
mmc-pxamci-fix-enum-type-confusion.patch
mmc-alcor-fix-dma-reads.patch
mmc-mxcmmc-revert-mmc-mxcmmc-handle-highmem-pages.patch
mmc-renesas_sdhi-limit-block-count-to-16-bit-for-old-revisions.patch
drm-amdgpu-fix-invalid-use-of-change_bit.patch
drm-vmwgfx-don-t-double-free-the-mode-stored-in-par-set_mode.patch
drm-vmwgfx-return-0-when-gmrid-get_node-runs-out-of-id-s.patch
iommu-amd-fix-sg-dma_address-for-sg-offset-bigger-than-page_size.patch
iommu-iova-fix-tracking-of-recently-failed-iova-address.patch
libceph-wait-for-latest-osdmap-in-ceph_monc_blacklist_add.patch
udf-fix-crash-on-io-error-during-truncate.patch
mips-loongson64-lemote-2f-add-irqf_no_suspend-to-cascade-irqaction.patch
mips-ensure-elf-appended-dtb-is-relocated.patch
mips-fix-kernel-crash-for-r6-in-jump-label-branch-function.patch
powerpc-vdso64-fix-clock_monotonic-inconsistencies-across-y2038.patch
powerpc-security-fix-spectre_v2-reporting.patch
net-mlx5-fix-dct-creation-bad-flow.patch
scsi-core-avoid-that-a-kernel-warning-appears-during-system-resume.patch
scsi-qla2xxx-fix-fc-al-connection-target-discovery.patch
scsi-ibmvscsi-protect-ibmvscsi_head-from-concurrent-modificaiton.patch
scsi-ibmvscsi-fix-empty-event-pool-access-during-host-removal.patch
futex-ensure-that-futex-address-is-aligned-in-handle_futex_death.patch
cifs-allow-guest-mounts-to-work-for-smb3.11.patch
perf-probe-fix-getting-the-kernel-map.patch
objtool-move-objtool_file-struct-off-the-stack.patch
irqchip-gic-v3-its-fix-comparison-logic-in-lpi_range_cmp.patch
clocksource-drivers-riscv-fix-clocksource-mask.patch
smb3-fix-smb3.1.1-guest-mounts-to-samba.patch
alsa-hda-don-t-trigger-jackpoll_work-in-azx_resume.patch
alsa-ac97-fix-of-node-refcount-unbalance.patch
ext4-fix-null-pointer-dereference-while-journal-is-aborted.patch
ext4-fix-data-corruption-caused-by-unaligned-direct-aio.patch
ext4-brelse-all-indirect-buffer-in-ext4_ind_remove_space.patch
media-v4l2-ctrls.c-uvc-zero-v4l2_event.patch
bluetooth-hci_uart-check-if-socket-buffer-is-err_ptr-in-h4_recv_buf.patch
bluetooth-fix-decrementing-reference-count-twice-in-releasing-socket.patch
bluetooth-hci_ldisc-initialize-hci_dev-before-open.patch
bluetooth-hci_ldisc-postpone-hci_uart_proto_ready-bit-set-in-hci_uart_set_proto.patch
drm-vkms-fix-flush_work-without-init_work.patch
rdma-cma-rollback-source-ip-address-if-failing-to-acquire-device.patch
f2fs-fix-to-avoid-deadlock-of-atomic-file-operations.patch
aio-simplify-and-fix-fget-fput-for-io_submit.patch
netfilter-ebtables-remove-bugprint-messages.patch
loop-access-lo_backing_file-only-when-the-loop-device-is-lo_bound.patch
x86-unwind-handle-null-pointer-calls-better-in-frame-unwinder.patch
x86-unwind-add-hardcoded-orc-entry-for-null.patch
locking-lockdep-add-debug_locks-check-in-__lock_downgrade.patch
alsa-hda-record-the-current-power-state-before-suspend-resume-calls.patch
alsa-hda-enforces-runtime_resume-after-s3-and-s4-for-each-codec.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: make INSTALL_MOD_STRIP=1 -j64 targz-pkg -j64
configuration: https://artifacts.cki-project.org/builds/aarch64/1a0e971af9d7121012aa2ed2a4…
ppc64le:
make options: make INSTALL_MOD_STRIP=1 -j64 targz-pkg -j64
configuration: https://artifacts.cki-project.org/builds/ppc64le/3b1f459123b54321696167667d…
x86_64:
make options: make INSTALL_MOD_STRIP=1 -j64 targz-pkg -j64
configuration: https://artifacts.cki-project.org/builds/x86_64/f2993845864d78a8d5286d9d34c…
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
✅ Boot test [0]
✅ LTP lite - release 20190115 [1]
✅ Loopdev Sanity [2]
✅ xfstests: ext4 [3]
✅ xfstests: xfs [3]
✅ AMTU (Abstract Machine Test Utility) [4]
🚧 ✅ audit: audit testsuite test [5]
✅ httpd: mod_ssl smoke sanity [6]
✅ httpd: php sanity [7]
🚧 ✅ iotop: sanity [8]
🚧 ✅ /CoreOS/net-snmp/Regression/bz251332-tcp-transport
🚧 ✅ tuned: tune-processes-through-perf [9]
✅ Usex - version 1.9-29 [10]
ppc64le:
✅ Boot test [0]
✅ LTP lite - release 20190115 [1]
✅ Loopdev Sanity [2]
✅ xfstests: ext4 [3]
✅ xfstests: xfs [3]
✅ AMTU (Abstract Machine Test Utility) [4]
🚧 ✅ audit: audit testsuite test [5]
✅ httpd: mod_ssl smoke sanity [6]
✅ httpd: php sanity [7]
🚧 ✅ iotop: sanity [8]
🚧 ✅ /CoreOS/net-snmp/Regression/bz251332-tcp-transport
🚧 ✅ tuned: tune-processes-through-perf [9]
✅ Usex - version 1.9-29 [10]
x86_64:
✅ Boot test [0]
✅ LTP lite - release 20190115 [1]
✅ Loopdev Sanity [2]
✅ xfstests: ext4 [3]
✅ xfstests: xfs [3]
✅ AMTU (Abstract Machine Test Utility) [4]
🚧 ✅ audit: audit testsuite test [5]
✅ httpd: mod_ssl smoke sanity [6]
✅ httpd: php sanity [7]
🚧 ✅ iotop: sanity [8]
🚧 ✅ /CoreOS/net-snmp/Regression/bz251332-tcp-transport
🚧 ✅ tuned: tune-processes-through-perf [9]
✅ Usex - version 1.9-29 [10]
Test source:
[0]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[1]: https://github.com/CKI-project/tests-beaker/archive/master.zip#distribution…
[2]: https://github.com/CKI-project/tests-beaker/archive/master.zip#filesystems/…
[3]: https://github.com/CKI-project/tests-beaker/archive/master.zip#/filesystems…
[4]: https://github.com/CKI-project/tests-beaker/archive/master.zip#misc/amtu
[5]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/aud…
[6]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/htt…
[7]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/htt…
[8]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/iot…
[9]: https://github.com/CKI-project/tests-beaker/archive/master.zip#packages/tun…
[10]: https://github.com/CKI-project/tests-beaker/archive/master.zip#standards/us…
Waived tests (marked with 🚧)
-----------------------------
This test run included waived tests. Such tests are executed but their results
are not taken into account. Tests are waived when their results are not
reliable enough, e.g. when they're just introduced or are being fixed.
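
As an illustration of the waiving rule described above, the overall result
can be thought of as follows. This is only a sketch of the stated rule, not
the CKI implementation; struct model_test and model_overall_passed() are
hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record for one executed test. */
struct model_test {
	bool passed;
	bool waived;	/* waived tests run, but their result is ignored */
};

/* The overall result fails only if a non-waived test failed. */
bool model_overall_passed(const struct model_test *tests, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tests[i].waived && !tests[i].passed)
			return false;
	}
	return true;
}
```

Under this rule a failing waived test leaves the overall result unchanged,
while a single failing non-waived test turns it into a failure.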