This is a note to let you know that I've just added the patch titled
iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iommu-io-pgtable-arm-v7s-check-for-leaf-entry-before-dereferencing-it.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:34 CET 2017
From: Oleksandr Tyshchenko <oleksandr_tyshchenko(a)epam.com>
Date: Mon, 27 Feb 2017 14:30:26 +0200
Subject: iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
From: Oleksandr Tyshchenko <oleksandr_tyshchenko(a)epam.com>
[ Upstream commit a03849e7210277fa212779b7cd9c30e1ab6194b2 ]
Do a check for already installed leaf entry at the current level before
dereferencing it in order to avoid walking the page table down with
wrong pointer to the next level.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko(a)epam.com>
CC: Will Deacon <will.deacon(a)arm.com>
CC: Robin Murphy <robin.murphy(a)arm.com>
Signed-off-by: Will Deacon <will.deacon(a)arm.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/iommu/io-pgtable-arm-v7s.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -418,8 +418,12 @@ static int __arm_v7s_map(struct arm_v7s_
pte |= ARM_V7S_ATTR_NS_TABLE;
__arm_v7s_set_pte(ptep, pte, 1, cfg);
- } else {
+ } else if (ARM_V7S_PTE_IS_TABLE(pte, lvl)) {
cptep = iopte_deref(pte, lvl);
+ } else {
+ /* We require an unmap first */
+ WARN_ON(!selftest_running);
+ return -EEXIST;
}
/* Rinse, repeat */
Patches currently in stable-queue which might be from oleksandr_tyshchenko(a)epam.com are
queue-4.9/iommu-io-pgtable-arm-v7s-check-for-leaf-entry-before-dereferencing-it.patch
This is a note to let you know that I've just added the patch titled
iommu/amd: Limit the IOVA page range to the specified addresses
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
iommu-amd-limit-the-iova-page-range-to-the-specified-addresses.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:35 CET 2017
From: Gary R Hook <gary.hook(a)amd.com>
Date: Fri, 3 Nov 2017 10:50:34 -0600
Subject: iommu/amd: Limit the IOVA page range to the specified addresses
From: Gary R Hook <gary.hook(a)amd.com>
[ Upstream commit b92b4fb5c14257c0e7eae291ecc1f7b1962e1699 ]
The extent of pages specified when applying a reserved region should
include up to the last page of the range, but not the page following
the range.
Signed-off-by: Gary R Hook <gary.hook(a)amd.com>
Fixes: 8d54d6c8b8f3 ('iommu/amd: Implement apply_dm_region call-back')
Signed-off-by: Alex Williamson <alex.williamson(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/iommu/amd_iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3211,7 +3211,7 @@ static void amd_iommu_apply_dm_region(st
unsigned long start, end;
start = IOVA_PFN(region->start);
- end = IOVA_PFN(region->start + region->length);
+ end = IOVA_PFN(region->start + region->length - 1);
WARN_ON_ONCE(reserve_iova(&dma_dom->iovad, start, end) == NULL);
}
Patches currently in stable-queue which might be from gary.hook(a)amd.com are
queue-4.9/iommu-amd-limit-the-iova-page-range-to-the-specified-addresses.patch
This is a note to let you know that I've just added the patch titled
intel_th: pci: Add Gemini Lake support
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
intel_th-pci-add-gemini-lake-support.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:34 CET 2017
From: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Date: Thu, 30 Jun 2016 16:10:51 +0300
Subject: intel_th: pci: Add Gemini Lake support
From: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
[ Upstream commit 340837f985c2cb87ca0868d4aa9ce42b0fab3a21 ]
This adds Intel(R) Trace Hub PCI ID for Gemini Lake SOC.
Signed-off-by: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/hwtracing/intel_th/pci.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/drivers/hwtracing/intel_th/pci.c
+++ b/drivers/hwtracing/intel_th/pci.c
@@ -95,6 +95,11 @@ static const struct pci_device_id intel_
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x9da6),
.driver_data = (kernel_ulong_t)0,
},
+ {
+ /* Gemini Lake */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x318e),
+ .driver_data = (kernel_ulong_t)0,
+ },
{ 0 },
};
Patches currently in stable-queue which might be from alexander.shishkin(a)linux.intel.com are
queue-4.9/intel_th-pci-add-gemini-lake-support.patch
This is a note to let you know that I've just added the patch titled
Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
input-i8042-add-tuxedo-bu1406-n24_25bu-to-the-nomux-list.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:34 CET 2017
From: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Date: Tue, 28 Feb 2017 17:14:41 -0800
Subject: Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list
From: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
[ Upstream commit a4c2a13129f7c5bcf81704c06851601593303fd5 ]
TUXEDO BU1406 does not implement active multiplexing mode properly,
and takes around 550 ms in i8042_set_mux_mode(). Given that the
device does not have external AUX port, there is no downside in
disabling the MUX mode.
Reported-by: Paul Menzel <pmenzel(a)molgen.mpg.de>
Suggested-by: Vojtech Pavlik <vojtech(a)suse.cz>
Reviewed-by: Marcos Paulo de Souza <marcos.souza.org(a)gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/input/serio/i8042-x86ia64io.h | 7 +++++++
1 file changed, 7 insertions(+)
--- a/drivers/input/serio/i8042-x86ia64io.h
+++ b/drivers/input/serio/i8042-x86ia64io.h
@@ -520,6 +520,13 @@ static const struct dmi_system_id __init
DMI_MATCH(DMI_PRODUCT_NAME, "IC4I"),
},
},
+ {
+ /* TUXEDO BU1406 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "Notebook"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "N24_25BU"),
+ },
+ },
{ }
};
Patches currently in stable-queue which might be from dmitry.torokhov(a)gmail.com are
queue-4.9/input-i8042-add-tuxedo-bu1406-n24_25bu-to-the-nomux-list.patch
This is a note to let you know that I've just added the patch titled
icmp: don't fail on fragment reassembly time exceeded
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
icmp-don-t-fail-on-fragment-reassembly-time-exceeded.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:35 CET 2017
From: Matteo Croce <mcroce(a)redhat.com>
Date: Thu, 12 Oct 2017 16:12:37 +0200
Subject: icmp: don't fail on fragment reassembly time exceeded
From: Matteo Croce <mcroce(a)redhat.com>
[ Upstream commit 258bbb1b0e594ad5f5652cb526b3c63e6a7fad3d ]
The ICMP implementation currently replies to an ICMP time exceeded message
(type 11) with an ICMP host unreachable message (type 3, code 1).
However, time exceeded messages can either represent "time to live exceeded
in transit" (code 0) or "fragment reassembly time exceeded" (code 1).
Unconditionally replying to "fragment reassembly time exceeded" with
host unreachable messages might cause unjustified connection resets
which are now easily triggered as UFO has been removed, because, in turn,
sending large buffers triggers IP fragmentation.
The issue can be easily reproduced by running a lot of UDP streams
which is likely to trigger IP fragmentation:
# start netserver in the test namespace
ip netns add test
ip netns exec test netserver
# create a VETH pair
ip link add name veth0 type veth peer name veth0 netns test
ip link set veth0 up
ip -n test link set veth0 up
for i in $(seq 20 29); do
# assign addresses to both ends
ip addr add dev veth0 192.168.$i.1/24
ip -n test addr add dev veth0 192.168.$i.2/24
# start the traffic
netperf -L 192.168.$i.1 -H 192.168.$i.2 -t UDP_STREAM -l 0 &
done
# wait
send_data: data send error: No route to host (errno 113)
netperf: send_omni: send_data failed: No route to host
We need to differentiate instead: if fragment reassembly time exceeded
is reported, we need to silently drop the packet,
if time to live exceeded is reported, maintain the current behaviour.
In both cases increment the related error count "icmpInTimeExcds".
While at it, fix a typo in a comment, and convert the if statement
into a switch to mate it more readable.
Signed-off-by: Matteo Croce <mcroce(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/icmp.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -766,7 +766,7 @@ static bool icmp_tag_validation(int prot
}
/*
- * Handle ICMP_DEST_UNREACH, ICMP_TIME_EXCEED, ICMP_QUENCH, and
+ * Handle ICMP_DEST_UNREACH, ICMP_TIME_EXCEEDED, ICMP_QUENCH, and
* ICMP_PARAMETERPROB.
*/
@@ -794,7 +794,8 @@ static bool icmp_unreach(struct sk_buff
if (iph->ihl < 5) /* Mangled header, drop. */
goto out_err;
- if (icmph->type == ICMP_DEST_UNREACH) {
+ switch (icmph->type) {
+ case ICMP_DEST_UNREACH:
switch (icmph->code & 15) {
case ICMP_NET_UNREACH:
case ICMP_HOST_UNREACH:
@@ -830,8 +831,16 @@ static bool icmp_unreach(struct sk_buff
}
if (icmph->code > NR_ICMP_UNREACH)
goto out;
- } else if (icmph->type == ICMP_PARAMETERPROB)
+ break;
+ case ICMP_PARAMETERPROB:
info = ntohl(icmph->un.gateway) >> 24;
+ break;
+ case ICMP_TIME_EXCEEDED:
+ __ICMP_INC_STATS(net, ICMP_MIB_INTIMEEXCDS);
+ if (icmph->code == ICMP_EXC_FRAGTIME)
+ goto out;
+ break;
+ }
/*
* Throw it at our lower layers
Patches currently in stable-queue which might be from mcroce(a)redhat.com are
queue-4.9/icmp-don-t-fail-on-fragment-reassembly-time-exceeded.patch
This is a note to let you know that I've just added the patch titled
IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-ipoib-grab-rtnl-lock-on-heavy-flush-when-calling-ndo_open-stop.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:35 CET 2017
From: Alex Vesker <valex(a)mellanox.com>
Date: Tue, 10 Oct 2017 10:36:41 +0300
Subject: IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
From: Alex Vesker <valex(a)mellanox.com>
[ Upstream commit b4b678b06f6eef18bff44a338c01870234db0bc9 ]
When ndo_open and ndo_stop are called RTNL lock should be held.
In this specific case ipoib_ib_dev_open calls the offloaded ndo_open
which re-sets the number of TX queue assuming RTNL lock is held.
Since RTNL lock is not held, RTNL assert will fail.
Signed-off-by: Alex Vesker <valex(a)mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -1177,10 +1177,15 @@ static void __ipoib_ib_dev_flush(struct
ipoib_ib_dev_down(dev);
if (level == IPOIB_FLUSH_HEAVY) {
+ rtnl_lock();
if (test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags))
ipoib_ib_dev_stop(dev);
- if (ipoib_ib_dev_open(dev) != 0)
+
+ result = ipoib_ib_dev_open(dev);
+ rtnl_unlock();
+ if (result)
return;
+
if (netif_queue_stopped(dev))
netif_start_queue(dev);
}
Patches currently in stable-queue which might be from valex(a)mellanox.com are
queue-4.9/ib-ipoib-grab-rtnl-lock-on-heavy-flush-when-calling-ndo_open-stop.patch
This is a note to let you know that I've just added the patch titled
Ib/hfi1: Return actual operational VLs in port info query
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-hfi1-return-actual-operational-vls-in-port-info-query.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:35 CET 2017
From: Patel Jay P <jay.p.patel(a)intel.com>
Date: Mon, 23 Oct 2017 06:05:53 -0700
Subject: Ib/hfi1: Return actual operational VLs in port info query
From: Patel Jay P <jay.p.patel(a)intel.com>
[ Upstream commit 00f9203119dd2774564407c7a67b17d81916298b ]
__subn_get_opa_portinfo stores value returned by hfi1_get_ib_cfg() as
operational vls. hfi1_get_ib_cfg() returns vls_operational field in
hfi1_pportdata. The problem with this is that the value is always equal
to vls_supported field in hfi1_pportdata.
The logic to calculate operational_vls is to set value passed by FM
(in __subn_set_opa_portinfo routine). If no value is passed then
default value is stored in operational_vls.
Field actual_vls_operational is calculated on the basis of buffer
control table. Hence, modifying hfi1_get_ib_cfg() to return
actual_operational_vls when used with HFI1_IB_CFG_OP_VLS parameter
Reviewed-by: Mike Marciniszyn <mike.marciniszyn(a)intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Signed-off-by: Patel Jay P <jay.p.patel(a)intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro(a)intel.com>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/hfi1/chip.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -9769,7 +9769,7 @@ int hfi1_get_ib_cfg(struct hfi1_pportdat
goto unimplemented;
case HFI1_IB_CFG_OP_VLS:
- val = ppd->vls_operational;
+ val = ppd->actual_vls_operational;
break;
case HFI1_IB_CFG_VL_HIGH_CAP: /* VL arb high priority table size */
val = VL_ARB_HIGH_PRIO_TABLE_SIZE;
Patches currently in stable-queue which might be from jay.p.patel(a)intel.com are
queue-4.9/ib-hfi1-return-actual-operational-vls-in-port-info-query.patch
This is a note to let you know that I've just added the patch titled
IB/core: Fix calculation of maximum RoCE MTU
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ib-core-fix-calculation-of-maximum-roce-mtu.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:35 CET 2017
From: Parav Pandit <parav(a)mellanox.com>
Date: Mon, 16 Oct 2017 08:45:16 +0300
Subject: IB/core: Fix calculation of maximum RoCE MTU
From: Parav Pandit <parav(a)mellanox.com>
[ Upstream commit 99260132fde7bddc6e0132ce53da94d1c9ccabcb ]
The original code only took into consideration the largest header
possible after the IB_BTH_BYTES. This was incorrect, as the largest
possible header size is the largest possible combination of headers we
might run into. The new code accounts for all possible headers in the
largest possible combination and subtracts that from the MTU to make
sure that all packets will fit on the wire.
Link: https://www.spinics.net/lists/linux-rdma/msg54558.html
Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices")
Signed-off-by: Parav Pandit <parav(a)mellanox.com>
Reviewed-by: Daniel Jurgens <danielj(a)mellanox.com>
Reported-by: Roland Dreier <roland(a)purestorage.com>
Signed-off-by: Leon Romanovsky <leon(a)kernel.org>
Signed-off-by: Doug Ledford <dledford(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/rdma/ib_addr.h | 7 ++++---
include/rdma/ib_pack.h | 19 +++++++++++--------
2 files changed, 15 insertions(+), 11 deletions(-)
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -243,10 +243,11 @@ static inline void rdma_addr_set_dgid(st
static inline enum ib_mtu iboe_get_mtu(int mtu)
{
/*
- * reduce IB headers from effective IBoE MTU. 28 stands for
- * atomic header which is the biggest possible header after BTH
+ * Reduce IB headers from effective IBoE MTU.
*/
- mtu = mtu - IB_GRH_BYTES - IB_BTH_BYTES - 28;
+ mtu = mtu - (IB_GRH_BYTES + IB_UDP_BYTES + IB_BTH_BYTES +
+ IB_EXT_XRC_BYTES + IB_EXT_ATOMICETH_BYTES +
+ IB_ICRC_BYTES);
if (mtu >= ib_mtu_enum_to_int(IB_MTU_4096))
return IB_MTU_4096;
--- a/include/rdma/ib_pack.h
+++ b/include/rdma/ib_pack.h
@@ -37,14 +37,17 @@
#include <uapi/linux/if_ether.h>
enum {
- IB_LRH_BYTES = 8,
- IB_ETH_BYTES = 14,
- IB_VLAN_BYTES = 4,
- IB_GRH_BYTES = 40,
- IB_IP4_BYTES = 20,
- IB_UDP_BYTES = 8,
- IB_BTH_BYTES = 12,
- IB_DETH_BYTES = 8
+ IB_LRH_BYTES = 8,
+ IB_ETH_BYTES = 14,
+ IB_VLAN_BYTES = 4,
+ IB_GRH_BYTES = 40,
+ IB_IP4_BYTES = 20,
+ IB_UDP_BYTES = 8,
+ IB_BTH_BYTES = 12,
+ IB_DETH_BYTES = 8,
+ IB_EXT_ATOMICETH_BYTES = 28,
+ IB_EXT_XRC_BYTES = 4,
+ IB_ICRC_BYTES = 4
};
struct ib_field {
Patches currently in stable-queue which might be from parav(a)mellanox.com are
queue-4.9/ib-core-fix-calculation-of-maximum-roce-mtu.patch
This is a note to let you know that I've just added the patch titled
GFS2: Take inode off order_write list when setting jdata flag
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gfs2-take-inode-off-order_write-list-when-setting-jdata-flag.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Dec 18 14:12:35 CET 2017
From: Bob Peterson <rpeterso(a)redhat.com>
Date: Wed, 20 Sep 2017 08:30:04 -0500
Subject: GFS2: Take inode off order_write list when setting jdata flag
From: Bob Peterson <rpeterso(a)redhat.com>
[ Upstream commit cc555b09d8c3817aeebda43a14ab67049a5653f7 ]
This patch fixes a deadlock caused when the jdata flag is set for
inodes that are already on the ordered write list. Since it is
on the ordered write list, log_flush calls gfs2_ordered_write which
calls filemap_fdatawrite. But since the inode had the jdata flag
set, that calls gfs2_jdata_writepages, which tries to start a new
transaction. A new transaction cannot be started because it tries
to acquire the log_flush rwsem which is already locked by the log
flush operation.
The bottom line is: We cannot switch an inode from ordered to jdata
until we eliminate any ordered data pages (via log flush) or any
log_flush operation afterward will create the circular dependency
above. So we need to flush the log before setting the diskflags to
switch the file mode, then we need to remove the inode from the
ordered writes list.
Before this patch, the log flush was done for jdata->ordered, but
that's wrong. If we're going from jdata to ordered, we don't need
to call gfs2_log_flush because the call to filemap_fdatawrite will
do it for us:
filemap_fdatawrite() -> __filemap_fdatawrite_range()
__filemap_fdatawrite_range() -> do_writepages()
do_writepages() -> gfs2_jdata_writepages()
gfs2_jdata_writepages() -> gfs2_log_flush()
This patch modifies function do_gfs2_set_flags so that if a file
has its jdata flag set, and it's already on the ordered write list,
the log will be flushed and it will be removed from the list
before setting the flag.
Signed-off-by: Bob Peterson <rpeterso(a)redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
Acked-by: Abhijith Das <adas(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/gfs2/file.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -256,7 +256,7 @@ static int do_gfs2_set_flags(struct file
goto out;
}
if ((flags ^ new_flags) & GFS2_DIF_JDATA) {
- if (flags & GFS2_DIF_JDATA)
+ if (new_flags & GFS2_DIF_JDATA)
gfs2_log_flush(sdp, ip->i_gl, NORMAL_FLUSH);
error = filemap_fdatawrite(inode->i_mapping);
if (error)
@@ -264,6 +264,8 @@ static int do_gfs2_set_flags(struct file
error = filemap_fdatawait(inode->i_mapping);
if (error)
goto out;
+ if (new_flags & GFS2_DIF_JDATA)
+ gfs2_ordered_del_inode(ip);
}
error = gfs2_trans_begin(sdp, RES_DINODE, 0);
if (error)
Patches currently in stable-queue which might be from rpeterso(a)redhat.com are
queue-4.9/gfs2-take-inode-off-order_write-list-when-setting-jdata-flag.patch