The patch titled
Subject: libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
has been removed from the -mm tree. Its filename was
libnvdimm-pfn-fix-fsdax-mode-namespace-info-block-zero-fields.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Dan Williams <dan.j.williams(a)intel.com>
Subject: libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
At namespace creation time there is the potential for the "expected to be
zero" fields of a 'pfn' info-block to be filled with indeterminate data.
While the kernel buffer is zeroed on allocation it is immediately
overwritten by nd_pfn_validate() filling it with the current contents of
the on-media info-block location. For fields like, 'flags' and the
'padding' it potentially means that future implementations can not rely on
those fields being zero.
In preparation to stop using the 'start_pad' and 'end_trunc' fields for
section alignment, arrange for fields that are not explicitly initialized
to be guaranteed zero. Bump the minor version to indicate it is safe to
assume the 'padding' and 'flags' are zero. Otherwise, this corruption is
expected to benign since all other critical fields are explicitly
initialized.
Note The cc: stable is about spreading this new policy to as many kernels
as possible not fixing an issue in those kernels. It is not until the
change titled "libnvdimm/pfn: Stop padding pmem namespaces to section
alignment" where this improper initialization becomes a problem. So if
someone decides to backport "libnvdimm/pfn: Stop padding pmem namespaces
to section alignment" (which is not tagged for stable), make sure this
pre-requisite is flagged.
Link: http://lkml.kernel.org/r/156092356065.979959.6681003754765958296.stgit@dwil…
Fixes: 32ab0a3f5170 ("libnvdimm, pmem: 'struct page' for pmem")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com> [ppc64]
Cc: <stable(a)vger.kernel.org>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Jane Chu <jane.chu(a)oracle.com>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Jérôme Glisse <jglisse(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Toshi Kani <toshi.kani(a)hpe.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Yang <richardw.yang(a)linux.intel.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/nvdimm/dax_devs.c | 2 +-
drivers/nvdimm/pfn.h | 1 +
drivers/nvdimm/pfn_devs.c | 18 +++++++++++++++---
3 files changed, 17 insertions(+), 4 deletions(-)
--- a/drivers/nvdimm/dax_devs.c~libnvdimm-pfn-fix-fsdax-mode-namespace-info-block-zero-fields
+++ a/drivers/nvdimm/dax_devs.c
@@ -118,7 +118,7 @@ int nd_dax_probe(struct device *dev, str
nvdimm_bus_unlock(&ndns->dev);
if (!dax_dev)
return -ENOMEM;
- pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+ pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
nd_pfn->pfn_sb = pfn_sb;
rc = nd_pfn_validate(nd_pfn, DAX_SIG);
dev_dbg(dev, "dax: %s\n", rc == 0 ? dev_name(dax_dev) : "<none>");
--- a/drivers/nvdimm/pfn_devs.c~libnvdimm-pfn-fix-fsdax-mode-namespace-info-block-zero-fields
+++ a/drivers/nvdimm/pfn_devs.c
@@ -412,6 +412,15 @@ static int nd_pfn_clear_memmap_errors(st
return 0;
}
+/**
+ * nd_pfn_validate - read and validate info-block
+ * @nd_pfn: fsdax namespace runtime state / properties
+ * @sig: 'devdax' or 'fsdax' signature
+ *
+ * Upon return the info-block buffer contents (->pfn_sb) are
+ * indeterminate when validation fails, and a coherent info-block
+ * otherwise.
+ */
int nd_pfn_validate(struct nd_pfn *nd_pfn, const char *sig)
{
u64 checksum, offset;
@@ -557,7 +566,7 @@ int nd_pfn_probe(struct device *dev, str
nvdimm_bus_unlock(&ndns->dev);
if (!pfn_dev)
return -ENOMEM;
- pfn_sb = devm_kzalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
+ pfn_sb = devm_kmalloc(dev, sizeof(*pfn_sb), GFP_KERNEL);
nd_pfn = to_nd_pfn(pfn_dev);
nd_pfn->pfn_sb = pfn_sb;
rc = nd_pfn_validate(nd_pfn, PFN_SIG);
@@ -693,7 +702,7 @@ static int nd_pfn_init(struct nd_pfn *nd
u64 checksum;
int rc;
- pfn_sb = devm_kzalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
+ pfn_sb = devm_kmalloc(&nd_pfn->dev, sizeof(*pfn_sb), GFP_KERNEL);
if (!pfn_sb)
return -ENOMEM;
@@ -702,11 +711,14 @@ static int nd_pfn_init(struct nd_pfn *nd
sig = DAX_SIG;
else
sig = PFN_SIG;
+
rc = nd_pfn_validate(nd_pfn, sig);
if (rc != -ENODEV)
return rc;
/* no info block, do init */;
+ memset(pfn_sb, 0, sizeof(*pfn_sb));
+
nd_region = to_nd_region(nd_pfn->dev.parent);
if (nd_region->ro) {
dev_info(&nd_pfn->dev,
@@ -759,7 +771,7 @@ static int nd_pfn_init(struct nd_pfn *nd
memcpy(pfn_sb->uuid, nd_pfn->uuid, 16);
memcpy(pfn_sb->parent_uuid, nd_dev_to_uuid(&ndns->dev), 16);
pfn_sb->version_major = cpu_to_le16(1);
- pfn_sb->version_minor = cpu_to_le16(2);
+ pfn_sb->version_minor = cpu_to_le16(3);
pfn_sb->start_pad = cpu_to_le32(start_pad);
pfn_sb->end_trunc = cpu_to_le32(end_trunc);
pfn_sb->align = cpu_to_le32(nd_pfn->align);
--- a/drivers/nvdimm/pfn.h~libnvdimm-pfn-fix-fsdax-mode-namespace-info-block-zero-fields
+++ a/drivers/nvdimm/pfn.h
@@ -28,6 +28,7 @@ struct nd_pfn_sb {
__le32 end_trunc;
/* minor-version-2 record the base alignment of the mapping */
__le32 align;
+ /* minor-version-3 guarantee the padding and flags are zero */
u8 padding[4000];
__le64 checksum;
};
_
Patches currently in -mm which might be from dan.j.williams(a)intel.com are
The patch titled
Subject: resource: fix locking in find_next_iomem_res()
has been removed from the -mm tree. Its filename was
resource-fix-locking-in-find_next_iomem_res.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Nadav Amit <namit(a)vmware.com>
Subject: resource: fix locking in find_next_iomem_res()
Since resources can be removed, locking should ensure that the resource is
not removed while accessing it. However, find_next_iomem_res() does not
hold the lock while copying the data of the resource.
Keep holding the lock while the data is copied. While at it, change the
return value to a more informative value. It is disregarded by the
callers.
[akpm(a)linux-foundation.org: fix find_next_iomem_res() documentation]
Link: http://lkml.kernel.org/r/20190613045903.4922-2-namit@vmware.com
Fixes: ff3cc952d3f00 ("resource: Add remove_resource interface")
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Cc: Borislav Petkov <bp(a)suse.de>
Cc: Toshi Kani <toshi.kani(a)hpe.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/resource.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
--- a/kernel/resource.c~resource-fix-locking-in-find_next_iomem_res
+++ a/kernel/resource.c
@@ -326,7 +326,7 @@ EXPORT_SYMBOL(release_resource);
*
* If a resource is found, returns 0 and @*res is overwritten with the part
* of the resource that's within [@start..@end]; if none is found, returns
- * -1 or -EINVAL for other invalid parameters.
+ * -ENODEV. Returns -EINVAL for invalid parameters.
*
* This function walks the whole tree and not just first level children
* unless @first_lvl is true.
@@ -365,16 +365,16 @@ static int find_next_iomem_res(resource_
break;
}
- read_unlock(&resource_lock);
- if (!p)
- return -1;
+ if (p) {
+ /* copy data */
+ res->start = max(start, p->start);
+ res->end = min(end, p->end);
+ res->flags = p->flags;
+ res->desc = p->desc;
+ }
- /* copy data */
- res->start = max(start, p->start);
- res->end = min(end, p->end);
- res->flags = p->flags;
- res->desc = p->desc;
- return 0;
+ read_unlock(&resource_lock);
+ return p ? 0 : -ENODEV;
}
static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
_
Patches currently in -mm which might be from namit(a)vmware.com are
Please apply commit a1078e821b605813 ("xen: let
alloc_xenballooned_pages() fail if not enough memory free")
to the stable kernel tree.
It is mitigating a security bug related to Xen (XSA-300).
Juergen
From: Brian Foster <bfoster(a)redhat.com>
commit 6958d11f77d45db80f7e22a21a74d4d5f44dc667 upstream.
We've had rather rare reports of bmap btree block corruption where
the bmap root block has a level count of zero. The root cause of the
corruption is so far unknown. We do have verifier checks to detect
this form of on-disk corruption, but this doesn't cover a memory
corruption variant of the problem. The latter is a reasonable
possibility because the root block is part of the inode fork and can
reside in-core for some time before inode extents are read.
If this occurs, it leads to a system crash such as the following:
BUG: unable to handle kernel paging request at ffffffff00000221
PF error: [normal kernel read fault]
...
RIP: 0010:xfs_trans_brelse+0xf/0x200 [xfs]
...
Call Trace:
xfs_iread_extents+0x379/0x540 [xfs]
xfs_file_iomap_begin_delay+0x11a/0xb40 [xfs]
? xfs_attr_get+0xd1/0x120 [xfs]
? iomap_write_begin.constprop.40+0x2d0/0x2d0
xfs_file_iomap_begin+0x4c4/0x6d0 [xfs]
? __vfs_getxattr+0x53/0x70
? iomap_write_begin.constprop.40+0x2d0/0x2d0
iomap_apply+0x63/0x130
? iomap_write_begin.constprop.40+0x2d0/0x2d0
iomap_file_buffered_write+0x62/0x90
? iomap_write_begin.constprop.40+0x2d0/0x2d0
xfs_file_buffered_aio_write+0xe4/0x3b0 [xfs]
__vfs_write+0x150/0x1b0
vfs_write+0xba/0x1c0
ksys_pwrite64+0x64/0xa0
do_syscall_64+0x5a/0x1d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The crash occurs because xfs_iread_extents() attempts to release an
uninitialized buffer pointer as the level == 0 value prevented the
buffer from ever being allocated or read. Change the level > 0
assert to an explicit error check in xfs_iread_extents() to avoid
crashing the kernel in the event of localized, in-core inode
corruption.
Signed-off-by: Brian Foster <bfoster(a)redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong(a)oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong(a)oracle.com>
[mcgrof: fixes kz#204223 ]
Signed-off-by: Luis Chamberlain <mcgrof(a)kernel.org>
---
fs/xfs/libxfs/xfs_bmap.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 3a496ffe6551..ab2465bc413a 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1178,7 +1178,10 @@ xfs_iread_extents(
* Root level must use BMAP_BROOT_PTR_ADDR macro to get ptr out.
*/
level = be16_to_cpu(block->bb_level);
- ASSERT(level > 0);
+ if (unlikely(level == 0)) {
+ XFS_ERROR_REPORT(__func__, XFS_ERRLEVEL_LOW, mp);
+ return -EFSCORRUPTED;
+ }
pp = XFS_BMAP_BROOT_PTR_ADDR(mp, block, 1, ifp->if_broot_bytes);
bno = be64_to_cpu(*pp);
--
2.18.0
When a pin is active-low, logical trigger edge should be inverted to match
the same interrupt opportunity.
For example, a button pushed triggers falling edge in ACTIVE_HIGH case; in
ACTIVE_LOW case, the button pushed triggers rising edge. For user space the
IRQ requesting doesn't need to do any modification except to configuring
GPIOHANDLE_REQUEST_ACTIVE_LOW.
For example, we want to catch the event when the button is pushed. The
button on the original board drives level to be low when it is pushed, and
drives level to be high when it is released.
In user space we can do:
req.handleflags = GPIOHANDLE_REQUEST_INPUT;
req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;
while (1) {
read(fd, &dat, sizeof(dat));
if (dat.id == GPIOEVENT_EVENT_FALLING_EDGE)
printf("button pushed\n");
}
Run the same logic on another board which the polarity of the button is
inverted; it drives level to be high when pushed, and level to be low when
released. For this inversion we add flag GPIOHANDLE_REQUEST_ACTIVE_LOW:
req.handleflags = GPIOHANDLE_REQUEST_INPUT |
GPIOHANDLE_REQUEST_ACTIVE_LOW;
req.eventflags = GPIOEVENT_REQUEST_FALLING_EDGE;
At the result, there are no any events caught when the button is pushed.
By the way, button releasing will emit a "falling" event. The timing of
"falling" catching is not expected.
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Wu <michael.wu(a)vatics.com>
---
Changes from v1:
- Correct undeclared 'IRQ_TRIGGER_RISING'
- Add an example to descibe the issue
---
drivers/gpio/gpiolib.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index e013d417a936..9c9597f929d7 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -956,9 +956,11 @@ static int lineevent_create(struct gpio_device *gdev, void __user *ip)
}
if (eflags & GPIOEVENT_REQUEST_RISING_EDGE)
- irqflags |= IRQF_TRIGGER_RISING;
+ irqflags |= test_bit(FLAG_ACTIVE_LOW, &desc->flags) ?
+ IRQF_TRIGGER_FALLING : IRQF_TRIGGER_RISING;
if (eflags & GPIOEVENT_REQUEST_FALLING_EDGE)
- irqflags |= IRQF_TRIGGER_FALLING;
+ irqflags |= test_bit(FLAG_ACTIVE_LOW, &desc->flags) ?
+ IRQF_TRIGGER_RISING : IRQF_TRIGGER_FALLING;
irqflags |= IRQF_ONESHOT;
INIT_KFIFO(le->events);
--
2.17.1
objtool points out several conditions that it does not like, depending
on the combination with other configuration options and compiler
variants:
stack protector:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0xbf: call to __stack_chk_fail() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0xbe: call to __stack_chk_fail() with UACCESS enabled
stackleak plugin:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x4a: call to stackleak_track_stack() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x4a: call to stackleak_track_stack() with UACCESS enabled
kasan:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x25: call to memcpy() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x25: call to memcpy() with UACCESS enabled
The stackleak and kasan options just need to be disabled for this file
as we do for other files already. For the stack protector, we already
attempt to disable it, but this fails on clang because the check is
mixed with the gcc specific -fno-conserve-stack option. According
to Andrey Ryabinin, that option is not even needed, dropping it here
fixes the stackprotector issue.
Fixes: d08965a27e84 ("x86/uaccess, ubsan: Fix UBSAN vs. SMAP")
Link: https://lore.kernel.org/lkml/20190617123109.667090-1-arnd@arndb.de/t/
Link: https://lore.kernel.org/lkml/20190722091050.2188664-1-arnd@arndb.de/t/
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
v2:
- drop -fno-conserve-stack
- fix the Fixes: line
---
lib/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/Makefile b/lib/Makefile
index 095601ce371d..29c02a924973 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -279,7 +279,8 @@ obj-$(CONFIG_UCS2_STRING) += ucs2_string.o
obj-$(CONFIG_UBSAN) += ubsan.o
UBSAN_SANITIZE_ubsan.o := n
-CFLAGS_ubsan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
+KASAN_SANITIZE_ubsan.o := n
+CFLAGS_ubsan.o := $(call cc-option, -fno-stack-protector) $(DISABLE_STACKLEAK_PLUGIN)
obj-$(CONFIG_SBITMAP) += sbitmap.o
--
2.20.0
objtool points out several conditions that it does not like, depending
on the combination with other configuration options and compiler
variants:
stack protector:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0xbf: call to __stack_chk_fail() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0xbe: call to __stack_chk_fail() with UACCESS enabled
stackleak plugin:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x4a: call to stackleak_track_stack() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x4a: call to stackleak_track_stack() with UACCESS enabled
kasan:
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x25: call to memcpy() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x25: call to memcpy() with UACCESS enabled
The stackleak and kasan options just need to be disabled for this file
as we do for other files already. For the stack protector, we already
attempt to disable it, but this fails on clang because the check is
mixed with the gcc specific -fno-conserve-stack option, so we need to
test them separately.
Fixes: 42440c1f9911 ("lib/ubsan: add type mismatch handler for new GCC/Clang")
Link: https://lore.kernel.org/lkml/20190617123109.667090-1-arnd@arndb.de/t/
Cc: stable(a)vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd(a)arndb.de>
---
lib/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/Makefile b/lib/Makefile
index 095601ce371d..320e3b632dd3 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -279,7 +279,8 @@ obj-$(CONFIG_UCS2_STRING) += ucs2_string.o
obj-$(CONFIG_UBSAN) += ubsan.o
UBSAN_SANITIZE_ubsan.o := n
-CFLAGS_ubsan.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
+KASAN_SANITIZE_ubsan.o := n
+CFLAGS_ubsan.o := $(call cc-option, -fno-conserve-stack) $(call cc-option, -fno-stack-protector) $(DISABLE_STACKLEAK_PLUGIN)
obj-$(CONFIG_SBITMAP) += sbitmap.o
--
2.20.0
Shakeel Butt reported premature oom on kernel with
"cgroup_disable=memory" since mem_cgroup_is_root() returns false even
though memcg is actually NULL. The drop_caches is also broken.
It is because commit aeed1d325d42 ("mm/vmscan.c: generalize shrink_slab()
calls in shrink_node()") removed the !memcg check before
!mem_cgroup_is_root(). And, surprisingly root memcg is allocated even
though memory cgroup is disabled by kernel boot parameter.
Add mem_cgroup_disabled() check to make reclaimer work as expected.
Fixes: aeed1d325d42 ("mm/vmscan.c: generalize shrink_slab() calls in shrink_node()")
Reported-by: Shakeel Butt <shakeelb(a)google.com>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Kirill Tkhai <ktkhai(a)virtuozzo.com>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: stable(a)vger.kernel.org 4.19+
Signed-off-by: Yang Shi <yang.shi(a)linux.alibaba.com>
---
mm/vmscan.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index f8e3dcd..c10dc02 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -684,7 +684,14 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
unsigned long ret, freed = 0;
struct shrinker *shrinker;
- if (!mem_cgroup_is_root(memcg))
+ /*
+ * The root memcg might be allocated even though memcg is disabled
+ * via "cgroup_disable=memory" boot parameter. This could make
+ * mem_cgroup_is_root() return false, then just run memcg slab
+ * shrink, but skip global shrink. This may result in premature
+ * oom.
+ */
+ if (!mem_cgroup_disabled() && !mem_cgroup_is_root(memcg))
return shrink_slab_memcg(gfp_mask, nid, memcg, priority);
if (!down_read_trylock(&shrinker_rwsem))
--
1.8.3.1