From: Hans Verkuil <hverkuil(a)xs4all.nl>
onetland: cherry-picked from cust/cisco/r28n-cisco: def9fd009919
If the wait for completion was interrupted, then make sure to cancel
any delayed work.
This can only happen if a transmit is waiting for a reply, and you press
Ctrl-C or reboot/poweroff or something like that which interrupts the
thread waiting for the reply and then proceeds to delete the CEC message.
Since the delayed work wasn't canceled, once it would trigger it referred
to stale data and resulted in a kernel oops.
Signed-off-by: Hans Verkuil <hans.verkuil(a)cisco.com>
Fixes: 05f525b833e8 ("cec: add new tx/rx status bits to detect aborts/timeouts")
Cc: <stable(a)vger.kernel.org> # for v4.18 and up
diff --git a/drivers/media/cec/cec-adap.c b/drivers/media/cec/cec-adap.c
index 0ab3a2a73b23..c00ab539ab90 100644
--- a/drivers/media/cec/cec-adap.c
+++ b/drivers/media/cec/cec-adap.c
@@ -822,6 +822,8 @@ int cec_transmit_msg_fh(struct cec_adapter *adap, struct cec_msg *msg,
*/
mutex_unlock(&adap->lock);
wait_for_completion_killable(&data->c);
+ if (!data->completed)
+ cancel_delayed_work_sync(&data->work);
mutex_lock(&adap->lock);
/* Cancel the transmit if it was interrupted */
--
2.20.1
Commit 03c4749dd6c7 ("gpio / ACPI: Drop unnecessary ACPI GPIO to Linux
GPIO translation") has made the cherryview gpio numbers sparse, to get
a 1:1 mapping between ACPI pin numbers and gpio numbers in Linux.
This has greatly simplified things, but the code setting the
irq_valid_mask was not updated for this, so the valid mask is still in
the old "compressed" numbering with the gaps in the pin numbers skipped,
which is wrong as irq_valid_mask needs to be expressed in gpio numbers.
This results in the following error on devices using pin 24 (0x0018) on
the north GPIO controller as an ACPI event source:
[ 0.422452] cherryview-pinctrl INT33FF:01: Failed to translate GPIO to IRQ
This has been reported (by email) to be happening on a Caterpillar CAT T20
tablet and I've reproduced this myself on a Medion Akoya e2215t 2-in-1.
This commit uses the pin number instead of the compressed index into
community->pins to clear the correct bits in irq_valid_mask for GPIOs
using GPEs for interrupts, fixing these errors and in case of the
Medion Akoya e2215t also fixing the LID switch not working.
Cc: stable(a)vger.kernel.org
Cc: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Fixes: 03c4749dd6c7 ("gpio / ACPI: Drop unnecessary ACPI GPIO to Linux GPIO translation")
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/pinctrl/intel/pinctrl-cherryview.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pinctrl/intel/pinctrl-cherryview.c b/drivers/pinctrl/intel/pinctrl-cherryview.c
index aae51c507f59..02ff5e8b0510 100644
--- a/drivers/pinctrl/intel/pinctrl-cherryview.c
+++ b/drivers/pinctrl/intel/pinctrl-cherryview.c
@@ -1563,7 +1563,7 @@ static void chv_init_irq_valid_mask(struct gpio_chip *chip,
intsel >>= CHV_PADCTRL0_INTSEL_SHIFT;
if (intsel >= community->nirqs)
- clear_bit(i, valid_mask);
+ clear_bit(desc->number, valid_mask);
}
}
--
2.23.0
Hello,
We ran automated tests on a patchset that was proposed for merging into this
kernel tree. The patches were applied to:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: 365dab61f74e - Linux 5.3.7
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/238409
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Merge testing
-------------
We cloned this repository and checked out the following commit:
Repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Commit: 365dab61f74e - Linux 5.3.7
We grabbed the f1ba1badc8f7 commit of the stable queue repository.
We then merged the patchset with `git am`:
drm-free-the-writeback_job-when-it-with-an-empty-fb.patch
drm-clear-the-fence-pointer-when-writeback-job-signa.patch
clk-ti-dra7-fix-mcasp8-clock-bits.patch
arm-dts-fix-wrong-clocks-for-dra7-mcasp.patch
nvme-pci-fix-a-race-in-controller-removal.patch
scsi-ufs-skip-shutdown-if-hba-is-not-powered.patch
scsi-megaraid-disable-device-when-probe-failed-after.patch
scsi-qla2xxx-silence-fwdump-template-message.patch
scsi-qla2xxx-fix-unbound-sleep-in-fcport-delete-path.patch
scsi-qla2xxx-fix-stale-mem-access-on-driver-unload.patch
scsi-qla2xxx-fix-n2n-link-reset.patch
scsi-qla2xxx-fix-n2n-link-up-fail.patch
arm-dts-fix-gpio0-flags-for-am335x-icev2.patch
arm-omap2-fix-missing-reset-done-flag-for-am3-and-am.patch
arm-omap2-add-missing-lcdc-midlemode-for-am335x.patch
arm-omap2-fix-warnings-with-broken-omap2_set_init_vo.patch
nvme-tcp-fix-wrong-stop-condition-in-io_work.patch
nvme-pci-save-pci-state-before-putting-drive-into-de.patch
nvme-fix-an-error-code-in-nvme_init_subsystem.patch
nvme-rdma-fix-max_hw_sectors-calculation.patch
added-quirks-for-adata-xpg-sx8200-pro-512gb.patch
nvme-add-quirk-for-kingston-nvme-ssd-running-fw-e8fk.patch
nvme-allow-64-bit-results-in-passthru-commands.patch
drm-komeda-prevent-memory-leak-in-komeda_wb_connecto.patch
nvme-rdma-fix-possible-use-after-free-in-connect-tim.patch
blk-mq-honor-io-scheduler-for-multiqueue-devices.patch
ieee802154-ca8210-prevent-memory-leak.patch
arm-dts-am4372-set-memory-bandwidth-limit-for-dispc.patch
net-dsa-qca8k-use-up-to-7-ports-for-all-operations.patch
mips-dts-ar9331-fix-interrupt-controller-size.patch
xen-efi-set-nonblocking-callbacks.patch
loop-change-queue-block-size-to-match-when-using-dio.patch
nl80211-fix-null-pointer-dereference.patch
mac80211-fix-txq-null-pointer-dereference.patch
netfilter-nft_connlimit-disable-bh-on-garbage-collec.patch
net-mscc-ocelot-add-missing-of_node_put-after-callin.patch
net-dsa-rtl8366rb-add-missing-of_node_put-after-call.patch
net-stmmac-xgmac-not-all-unicast-addresses-may-be-av.patch
net-stmmac-dwmac4-always-update-the-mac-hash-filter.patch
net-stmmac-correctly-take-timestamp-for-ptpv2.patch
net-stmmac-do-not-stop-phy-if-wol-is-enabled.patch
net-ag71xx-fix-mdio-subnode-support.patch
risc-v-clear-load-reservations-while-restoring-hart-.patch
riscv-fix-memblock-reservation-for-device-tree-blob.patch
drm-amdgpu-fix-multiple-memory-leaks-in-acp_hw_init.patch
drm-amd-display-memory-leak.patch
mips-loongson-fix-the-link-time-qualifier-of-serial_.patch
net-hisilicon-fix-usage-of-uninitialized-variable-in.patch
net-stmmac-avoid-deadlock-on-suspend-resume.patch
selftests-kvm-fix-libkvm-build-error.patch
lib-textsearch-fix-escapes-in-example-code.patch
s390-mm-fix-wunused-but-set-variable-warnings.patch
r8152-set-macpassthru-in-reset_resume-callback.patch
net-phy-allow-for-reset-line-to-be-tied-to-a-sleepy-.patch
net-phy-fix-write-to-mii-ctrl1000-register.patch
namespace-fix-namespace.pl-script-to-support-relativ.patch
convert-filldir-64-from-__put_user-to-unsafe_put_use.patch
elf-don-t-use-map_fixed_noreplace-for-elf-executable.patch
make-filldir-64-verify-the-directory-entry-filename-.patch
uaccess-implement-a-proper-unsafe_copy_to_user-and-s.patch
filldir-64-remove-warn_on_once-for-bad-directory-ent.patch
net_sched-fix-backward-compatibility-for-tca_kind.patch
net_sched-fix-backward-compatibility-for-tca_act_kin.patch
libata-ahci-fix-pcs-quirk-application.patch
md-raid0-fix-warning-message-for-parameter-default_l.patch
revert-drm-radeon-fix-eeh-during-kexec.patch
Compile testing
---------------
We compiled the kernel for 3 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests: xfs
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test (as root)
✅ Podman system integration test (as user)
✅ LTP lite
✅ Loopdev Sanity
✅ jvm test suite
✅ AMTU (Abstract Machine Test Utility)
✅ LTP: openposix test suite
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ audit: audit testsuite test
✅ httpd: mod_ssl smoke sanity
✅ iotop: sanity
✅ tuned: tune-processes-through-perf
✅ Usex - version 1.9-29
✅ storage: SCSI VPD
✅ stress: stress-ng
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ storage: dm/common
ppc64le:
Host 1:
✅ Boot test
✅ Podman system integration test (as root)
✅ Podman system integration test (as user)
✅ LTP lite
✅ Loopdev Sanity
✅ jvm test suite
✅ AMTU (Abstract Machine Test Utility)
✅ LTP: openposix test suite
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ audit: audit testsuite test
✅ httpd: mod_ssl smoke sanity
✅ iotop: sanity
✅ tuned: tune-processes-through-perf
✅ Usex - version 1.9-29
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ storage: dm/common
Host 2:
✅ Boot test
✅ xfstests: xfs
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
🚧 ✅ Storage blktests
x86_64:
Host 1:
✅ Boot test
✅ xfstests: xfs
✅ selinux-policy: serge-testsuite
✅ lvm thinp sanity
✅ storage: software RAID testing
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test (as root)
✅ Podman system integration test (as user)
✅ LTP lite
✅ Loopdev Sanity
✅ jvm test suite
✅ AMTU (Abstract Machine Test Utility)
✅ LTP: openposix test suite
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ audit: audit testsuite test
✅ httpd: mod_ssl smoke sanity
✅ iotop: sanity
✅ tuned: tune-processes-through-perf
✅ pciutils: sanity smoke test
✅ Usex - version 1.9-29
✅ storage: SCSI VPD
✅ stress: stress-ng
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ storage: dm/common
Test sources: https://github.com/CKI-project/tests-beaker
💚 Pull requests are welcome for new tests or improvements to existing tests!
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running are marked with ⏱. Reports for non-upstream kernels have
a Beaker recipe linked to next to each host.
In do_hres(), we currently use whether the return value of __arch_get_
hw_counter() is negtive to indicate fallback, but this is not a good
idea. Because:
1, ARM64 returns ULL_MAX but MIPS returns 0 when clock_mode is invalid;
2, For a 64bit counter, a "negtive" value of counter is actually valid.
To solve this problem, we use U64_MAX as the only "invalid" return
value -- this is still not fully correct, but has no problem in most
cases. Moreover, all vdso time-related functions should rely on the
return value of __arch_use_vsyscall(), because update_vdso_data() and
update_vsyscall_tz() also rely on it. So, in the core functions of
__cvdso_gettimeofday(), __cvdso_clock_gettime() and __cvdso_clock_
getres(), if __arch_use_vsyscall() returns false, we use the fallback
functions directly.
Fixes: 00b26474c2f1613d7ab894c5 ("lib/vdso: Provide generic VDSO implementation")
Cc: stable(a)vger.kernel.org
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Paul Burton <paul.burton(a)mips.com>
Cc: linux-mips(a)vger.kernel.org
Cc: linux-arm-kernel(a)lists.infradead.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
arch/arm64/include/asm/vdso/vsyscall.h | 2 +-
arch/mips/include/asm/vdso/vsyscall.h | 2 +-
include/asm-generic/vdso/vsyscall.h | 2 +-
lib/vdso/gettimeofday.c | 12 +++++++++++-
4 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/include/asm/vdso/vsyscall.h b/arch/arm64/include/asm/vdso/vsyscall.h
index 0c731bf..406e6de 100644
--- a/arch/arm64/include/asm/vdso/vsyscall.h
+++ b/arch/arm64/include/asm/vdso/vsyscall.h
@@ -31,7 +31,7 @@ int __arm64_get_clock_mode(struct timekeeper *tk)
#define __arch_get_clock_mode __arm64_get_clock_mode
static __always_inline
-int __arm64_use_vsyscall(struct vdso_data *vdata)
+int __arm64_use_vsyscall(const struct vdso_data *vdata)
{
return !vdata[CS_HRES_COARSE].clock_mode;
}
diff --git a/arch/mips/include/asm/vdso/vsyscall.h b/arch/mips/include/asm/vdso/vsyscall.h
index 1953147..8b10dd7 100644
--- a/arch/mips/include/asm/vdso/vsyscall.h
+++ b/arch/mips/include/asm/vdso/vsyscall.h
@@ -29,7 +29,7 @@ int __mips_get_clock_mode(struct timekeeper *tk)
#define __arch_get_clock_mode __mips_get_clock_mode
static __always_inline
-int __mips_use_vsyscall(struct vdso_data *vdata)
+int __mips_use_vsyscall(const struct vdso_data *vdata)
{
return (vdata[CS_HRES_COARSE].clock_mode != VDSO_CLOCK_NONE);
}
diff --git a/include/asm-generic/vdso/vsyscall.h b/include/asm-generic/vdso/vsyscall.h
index e94b1978..ac05a625 100644
--- a/include/asm-generic/vdso/vsyscall.h
+++ b/include/asm-generic/vdso/vsyscall.h
@@ -26,7 +26,7 @@ static __always_inline int __arch_get_clock_mode(struct timekeeper *tk)
#endif /* __arch_get_clock_mode */
#ifndef __arch_use_vsyscall
-static __always_inline int __arch_use_vsyscall(struct vdso_data *vdata)
+static __always_inline int __arch_use_vsyscall(const struct vdso_data *vdata)
{
return 1;
}
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index e630e7f..4ad062e 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -9,6 +9,7 @@
#include <linux/hrtimer_defs.h>
#include <vdso/datapage.h>
#include <vdso/helpers.h>
+#include <vdso/vsyscall.h>
/*
* The generic vDSO implementation requires that gettimeofday.h
@@ -50,7 +51,7 @@ static int do_hres(const struct vdso_data *vd, clockid_t clk,
cycles = __arch_get_hw_counter(vd->clock_mode);
ns = vdso_ts->nsec;
last = vd->cycle_last;
- if (unlikely((s64)cycles < 0))
+ if (unlikely(cycles == U64_MAX))
return -1;
ns += vdso_calc_delta(cycles, last, vd->mask, vd->mult);
@@ -91,6 +92,9 @@ __cvdso_clock_gettime_common(clockid_t clock, struct __kernel_timespec *ts)
if (unlikely((u32) clock >= MAX_CLOCKS))
return -1;
+ if (!__arch_use_vsyscall(vd))
+ return -1;
+
/*
* Convert the clockid to a bitmask and use it to check which
* clocks are handled in the VDSO directly.
@@ -145,6 +149,9 @@ __cvdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
{
const struct vdso_data *vd = __arch_get_vdso_data();
+ if (!__arch_use_vsyscall(vd))
+ return gettimeofday_fallback(tv, tz);
+
if (likely(tv != NULL)) {
struct __kernel_timespec ts;
@@ -189,6 +196,9 @@ int __cvdso_clock_getres_common(clockid_t clock, struct __kernel_timespec *res)
if (unlikely((u32) clock >= MAX_CLOCKS))
return -1;
+ if (!__arch_use_vsyscall(vd))
+ return -1;
+
hrtimer_res = READ_ONCE(vd[CS_HRES_COARSE].hrtimer_res);
/*
* Convert the clockid to a bitmask and use it to check which
--
2.7.0
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/memory_hotplug: don't access uninitialized memmaps in shrink_pgdat_span()
We might use the nid of memmaps that were never initialized. For example,
if the memmap was poisoned, we will crash the kernel in pfn_to_nid() right
now. Let's use the calculated boundaries of the separate zones instead.
This now also avoids having to iterate over a whole bunch of subsections
again, after shrinking one zone.
Before commit d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug"),
the memmap was initialized to 0 and the node was set to the right value.
After that commit, the node might be garbage.
We'll have to fix shrink_zone_span() next.
Link: http://lkml.kernel.org/r/20191006085646.5768-4-david@redhat.com
Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [d0dc12e86b319]
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Wei Yang <richardw.yang(a)linux.intel.com>
Cc: Alexander Duyck <alexander.h.duyck(a)linux.intel.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: Christophe Leroy <christophe.leroy(a)c-s.fr>
Cc: Damian Tometzki <damian.tometzki(a)gmail.com>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Gerald Schaefer <gerald.schaefer(a)de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Halil Pasic <pasic(a)linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Jun Yao <yaojun8558363(a)gmail.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Pankaj Gupta <pagupta(a)redhat.com>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Pavel Tatashin <pavel.tatashin(a)microsoft.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Rich Felker <dalias(a)libc.org>
Cc: Robin Murphy <robin.murphy(a)arm.com>
Cc: Steve Capper <steve.capper(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Yoshinori Sato <ysato(a)users.sourceforge.jp>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory_hotplug.c | 74 +++++++++---------------------------------
1 file changed, 16 insertions(+), 58 deletions(-)
--- a/mm/memory_hotplug.c~mm-memory_hotplug-dont-access-uninitialized-memmaps-in-shrink_pgdat_span
+++ a/mm/memory_hotplug.c
@@ -436,67 +436,25 @@ static void shrink_zone_span(struct zone
zone_span_writeunlock(zone);
}
-static void shrink_pgdat_span(struct pglist_data *pgdat,
- unsigned long start_pfn, unsigned long end_pfn)
+static void update_pgdat_span(struct pglist_data *pgdat)
{
- unsigned long pgdat_start_pfn = pgdat->node_start_pfn;
- unsigned long p = pgdat_end_pfn(pgdat); /* pgdat_end_pfn namespace clash */
- unsigned long pgdat_end_pfn = p;
- unsigned long pfn;
- int nid = pgdat->node_id;
-
- if (pgdat_start_pfn == start_pfn) {
- /*
- * If the section is smallest section in the pgdat, it need
- * shrink pgdat->node_start_pfn and pgdat->node_spanned_pages.
- * In this case, we find second smallest valid mem_section
- * for shrinking zone.
- */
- pfn = find_smallest_section_pfn(nid, NULL, end_pfn,
- pgdat_end_pfn);
- if (pfn) {
- pgdat->node_start_pfn = pfn;
- pgdat->node_spanned_pages = pgdat_end_pfn - pfn;
- }
- } else if (pgdat_end_pfn == end_pfn) {
- /*
- * If the section is biggest section in the pgdat, it need
- * shrink pgdat->node_spanned_pages.
- * In this case, we find second biggest valid mem_section for
- * shrinking zone.
- */
- pfn = find_biggest_section_pfn(nid, NULL, pgdat_start_pfn,
- start_pfn);
- if (pfn)
- pgdat->node_spanned_pages = pfn - pgdat_start_pfn + 1;
- }
-
- /*
- * If the section is not biggest or smallest mem_section in the pgdat,
- * it only creates a hole in the pgdat. So in this case, we need not
- * change the pgdat.
- * But perhaps, the pgdat has only hole data. Thus it check the pgdat
- * has only hole or not.
- */
- pfn = pgdat_start_pfn;
- for (; pfn < pgdat_end_pfn; pfn += PAGES_PER_SUBSECTION) {
- if (unlikely(!pfn_valid(pfn)))
- continue;
-
- if (pfn_to_nid(pfn) != nid)
- continue;
-
- /* Skip range to be removed */
- if (pfn >= start_pfn && pfn < end_pfn)
- continue;
+ unsigned long node_start_pfn = 0, node_end_pfn = 0;
+ struct zone *zone;
- /* If we find valid section, we have nothing to do */
- return;
+ for (zone = pgdat->node_zones;
+ zone < pgdat->node_zones + MAX_NR_ZONES; zone++) {
+ unsigned long zone_end_pfn = zone->zone_start_pfn +
+ zone->spanned_pages;
+
+ /* No need to lock the zones, they can't change. */
+ if (zone_end_pfn > node_end_pfn)
+ node_end_pfn = zone_end_pfn;
+ if (zone->zone_start_pfn < node_start_pfn)
+ node_start_pfn = zone->zone_start_pfn;
}
- /* The pgdat has no valid section */
- pgdat->node_start_pfn = 0;
- pgdat->node_spanned_pages = 0;
+ pgdat->node_start_pfn = node_start_pfn;
+ pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
}
static void __remove_zone(struct zone *zone, unsigned long start_pfn,
@@ -507,7 +465,7 @@ static void __remove_zone(struct zone *z
pgdat_resize_lock(zone->zone_pgdat, &flags);
shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
- shrink_pgdat_span(pgdat, start_pfn, start_pfn + nr_pages);
+ update_pgdat_span(pgdat);
pgdat_resize_unlock(zone->zone_pgdat, &flags);
}
_
From: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.ibm.com>
Subject: mm/memunmap: don't access uninitialized memmap in memunmap_pages()
Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6.
This series fixes the access of uninitialized memmaps when shrinking
zones/nodes and when removing memory. Also, it contains all fixes for
crashes that can be triggered when removing certain namespace using
memunmap_pages() - ZONE_DEVICE, reported by Aneesh.
We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be
more involved (we don't have SECTION_IS_ONLINE as an indicator), and
shrinking is only of limited use (set_zone_contiguous() cannot detect the
ZONE_DEVICE as contiguous).
We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of
code to a minimum. Shrinking is especially necessary to keep
zone->contiguous set where possible, especially, on memory unplug of DIMMs
at zone boundaries.
--------------------------------------------------------------------------
Zones are now properly shrunk when offlining memory blocks or when
onlining failed. This allows to properly shrink zones on memory unplug
even if the separate memory blocks of a DIMM were onlined to different
zones or re-onlined to a different zone after offlining.
Example:
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 0
present 0
managed 0
:/# echo "online_movable" > /sys/devices/system/memory/memory41/state
:/# echo "online_movable" > /sys/devices/system/memory/memory43/state
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 98304
present 65536
managed 65536
:/# echo 0 > /sys/devices/system/memory/memory43/online
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 32768
present 32768
managed 32768
:/# echo 0 > /sys/devices/system/memory/memory41/online
:/# cat /proc/zoneinfo
Node 1, zone Movable
spanned 0
present 0
managed 0
This patch (of 10):
With an altmap, the memmap falling into the reserved altmap space are not
initialized and, therefore, contain a garbage NID and a garbage zone.
Make sure to read the NID/zone from a memmap that was initialized.
This fixes a kernel crash that is observed when destroying a namespace:
[ 81.356173] kernel BUG at include/linux/mm.h:1107!
cpu 0x1: Vector: 700 (Program Check) at [c000000274087890]
pc: c0000000004b9728: memunmap_pages+0x238/0x340
lr: c0000000004b9724: memunmap_pages+0x234/0x340
...
pid = 3669, comm = ndctl
kernel BUG at include/linux/mm.h:1107!
[c000000274087ba0] c0000000009e3500 devm_action_release+0x30/0x50
[c000000274087bc0] c0000000009e4758 release_nodes+0x268/0x2d0
[c000000274087c30] c0000000009dd144 device_release_driver_internal+0x174/0x240
[c000000274087c70] c0000000009d9dfc unbind_store+0x13c/0x190
[c000000274087cb0] c0000000009d8a24 drv_attr_store+0x44/0x60
[c000000274087cd0] c0000000005a7470 sysfs_kf_write+0x70/0xa0
[c000000274087d10] c0000000005a5cac kernfs_fop_write+0x1ac/0x290
[c000000274087d60] c0000000004be45c __vfs_write+0x3c/0x70
[c000000274087d80] c0000000004c26e4 vfs_write+0xe4/0x200
[c000000274087dd0] c0000000004c2a6c ksys_write+0x7c/0x140
[c000000274087e20] c00000000000bbd0 system_call+0x5c/0x68
The "page_zone(pfn_to_page(pfn)" was introduced by 69324b8f4833 ("mm,
devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support"), however, I
think we will never have driver reserved memory with
MEMORY_DEVICE_PRIVATE (no altmap AFAIKS).
[david(a)redhat.com: minimze code changes, rephrase description]
Link: http://lkml.kernel.org/r/20191006085646.5768-2-david@redhat.com
Fixes: 2c2a5af6fed2 ("mm, memory_hotplug: add nid parameter to arch_remove_memory")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Damian Tometzki <damian.tometzki(a)gmail.com>
Cc: Alexander Duyck <alexander.h.duyck(a)linux.intel.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Anshuman Khandual <anshuman.khandual(a)arm.com>
Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: Christophe Leroy <christophe.leroy(a)c-s.fr>
Cc: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc: Fenghua Yu <fenghua.yu(a)intel.com>
Cc: Gerald Schaefer <gerald.schaefer(a)de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Halil Pasic <pasic(a)linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens(a)de.ibm.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Jun Yao <yaojun8558363(a)gmail.com>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Masahiro Yamada <yamada.masahiro(a)socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Pankaj Gupta <pagupta(a)redhat.com>
Cc: Paul Mackerras <paulus(a)samba.org>
Cc: Pavel Tatashin <pasha.tatashin(a)soleen.com>
Cc: Pavel Tatashin <pavel.tatashin(a)microsoft.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Rich Felker <dalias(a)libc.org>
Cc: Robin Murphy <robin.murphy(a)arm.com>
Cc: Steve Capper <steve.capper(a)arm.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Tony Luck <tony.luck(a)intel.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Wei Yang <richardw.yang(a)linux.intel.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Yoshinori Sato <ysato(a)users.sourceforge.jp>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memremap.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
--- a/mm/memremap.c~mm-memunmap-dont-access-uninitialized-memmap-in-memunmap_pages
+++ a/mm/memremap.c
@@ -103,6 +103,7 @@ static void dev_pagemap_cleanup(struct d
void memunmap_pages(struct dev_pagemap *pgmap)
{
struct resource *res = &pgmap->res;
+ struct page *first_page;
unsigned long pfn;
int nid;
@@ -111,14 +112,16 @@ void memunmap_pages(struct dev_pagemap *
put_page(pfn_to_page(pfn));
dev_pagemap_cleanup(pgmap);
+ /* make sure to access a memmap that was actually initialized */
+ first_page = pfn_to_page(pfn_first(pgmap));
+
/* pages are dead and unused, undo the arch mapping */
- nid = page_to_nid(pfn_to_page(PHYS_PFN(res->start)));
+ nid = page_to_nid(first_page);
mem_hotplug_begin();
if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
- pfn = PHYS_PFN(res->start);
- __remove_pages(page_zone(pfn_to_page(pfn)), pfn,
- PHYS_PFN(resource_size(res)), NULL);
+ __remove_pages(page_zone(first_page), PHYS_PFN(res->start),
+ PHYS_PFN(resource_size(res)), NULL);
} else {
arch_remove_memory(nid, res->start, resource_size(res),
pgmap_altmap(pgmap));
_
This brings some fixes and allows to start more VMs with an in-kernel
XIVE or XICS-on-XIVE device.
Changes since v1 (https://patchwork.ozlabs.org/cover/1166099/):
- drop a useless patch
- add a patch to show VP ids in debugfs
- update some changelogs
- fix buggy check in patch 5
- Cc: stable
--
Greg
---
Greg Kurz (6):
KVM: PPC: Book3S HV: XIVE: Set kvm->arch.xive when VPs are allocated
KVM: PPC: Book3S HV: XIVE: Ensure VP isn't already in use
KVM: PPC: Book3S HV: XIVE: Show VP id in debugfs
KVM: PPC: Book3S HV: XIVE: Compute the VP id in a common helper
KVM: PPC: Book3S HV: XIVE: Make VP block size configurable
KVM: PPC: Book3S HV: XIVE: Allow userspace to set the # of VPs
Documentation/virt/kvm/devices/xics.txt | 14 +++
Documentation/virt/kvm/devices/xive.txt | 8 ++
arch/powerpc/include/uapi/asm/kvm.h | 3 +
arch/powerpc/kvm/book3s_xive.c | 142 ++++++++++++++++++++++++-------
arch/powerpc/kvm/book3s_xive.h | 17 ++++
arch/powerpc/kvm/book3s_xive_native.c | 40 +++------
6 files changed, 167 insertions(+), 57 deletions(-)
When number of free space in the journal is very low, the arithmetic in
jbd2_log_space_left() could underflow resulting in very high number of
free blocks and thus triggering assertion failure in transaction commit
code complaining there's not enough space in the journal:
J_ASSERT(journal->j_free > 1);
Properly check for the low number of free blocks.
CC: stable(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
include/linux/jbd2.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 603fbc4e2f70..10e6049c0ba9 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1582,7 +1582,7 @@ static inline int jbd2_space_needed(journal_t *journal)
static inline unsigned long jbd2_log_space_left(journal_t *journal)
{
/* Allow for rounding errors */
- unsigned long free = journal->j_free - 32;
+ long free = journal->j_free - 32;
if (journal->j_committing_transaction) {
unsigned long committing = atomic_read(&journal->
@@ -1591,7 +1591,7 @@ static inline unsigned long jbd2_log_space_left(journal_t *journal)
/* Transaction + control blocks */
free -= committing + (committing >> JBD2_CONTROL_BLOCKS_SHIFT);
}
- return free;
+ return max_t(long, free, 0);
}
/*
--
2.16.4
In commit 8020919a9b99 ("mac80211: Properly handle SKB with radiotap
only"), buffers whose length is too short cause a WARN_ON(1) to be
executed. This change exposed a fault in rtlwifi drivers, which is fixed
by increasing the length of the affected buffer before it is sent to
mac80211.
Cc: Stable <stable(a)vger.kernel.org> # v5.0+
Signed-off-by: Larry Finger <Larry.Finger(a)lwfinger.net>
---
V2 - added missing usage of new len
---
Please Apply to 5.4
---
drivers/net/wireless/realtek/rtlwifi/pci.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/wireless/realtek/rtlwifi/pci.c b/drivers/net/wireless/realtek/rtlwifi/pci.c
index 6087ec7a90a6..3e9185162e51 100644
--- a/drivers/net/wireless/realtek/rtlwifi/pci.c
+++ b/drivers/net/wireless/realtek/rtlwifi/pci.c
@@ -692,12 +692,15 @@ static void _rtl_pci_rx_to_mac80211(struct ieee80211_hw *hw,
dev_kfree_skb_any(skb);
} else {
struct sk_buff *uskb = NULL;
+ int len = skb->len;
+ if (unlikely(len <= FCS_LEN))
+ len = FCS_LEN + 2;
uskb = dev_alloc_skb(skb->len + 128);
if (likely(uskb)) {
memcpy(IEEE80211_SKB_RXCB(uskb), &rx_status,
sizeof(rx_status));
- skb_put_data(uskb, skb->data, skb->len);
+ skb_put_data(uskb, skb->data, len);
dev_kfree_skb_any(skb);
ieee80211_rx_irqsafe(hw, uskb);
} else {
--
2.23.0