Changes since v2 [1]:
- Drop the "mem-quiet" pm-debug interface in favor of an explicit
hibernate_quiet_exec() helper that executes firmware activation, or
any other subsystem provided routine, in a system-quiet context.
(Rafael)
- Rework the sysfs interface to add an explicit trigger to run
activation under hibernate_quiet_exec(). Rename
ndbusX/firmware_activate to ndbusX/firmware/activate, and add a
ndbusX/firmware/capability. Some ndctl reworks are needed to catch up
with this change.
- The new ndbusX/firmware/capability attribute indicates the default
activation method / execution context between "live" and "suspend".
[1]: http://lore.kernel.org/r/159408711335.2385045.2567600405906448375.stgit@dwi…
---
Quoting the documentation:
Some persistent memory devices run a firmware locally on the device /
"DIMM" to perform tasks like media management, capacity provisioning,
and health monitoring. The process of updating that firmware typically
involves a reboot because it has implications for in-flight memory
transactions. However, reboots are disruptive and at least the Intel
persistent memory platform implementation, described by the Intel ACPI
DSM specification [1], has added support for activating firmware at
runtime.
[1]: https://docs.pmem.io/persistent-memory/
The approach taken is to abstract the Intel platform specific mechanism
behind a libnvdimm-generic sysfs interface. The interface could support
runtime-firmware-activation on another architecture without need to
change userspace tooling.
The ACPI NFIT implementation involves a set of device-specific-methods
(DSMs) to 'arm' individual devices for activation and bus-level
'trigger' method to execute the activation. Informational / enumeration
methods are also provided at the bus and device level.
One complicating aspect of the memory device firmware activation is that
the memory controller may need to be quiesced, no memory cycles, during
the activation. While the platform has mechanisms to support holding off
in-flight DMA during the activation, the device response to that delay
is potentially undefined. The platform may reject a runtime firmware
update if, for example a PCI-E device does not support its completion
timeout value being increased to meet the activation time. Outside of
device timeouts the quiesce period may also violate application
timeouts.
Given the above device and application timeout considerations the
implementation uses a new hibernate_quiet_exec() facility to carry-out
firmware activation. This imposes the same conditions that allow for a
stable memory image snapshot to be taken for a hibernate-to-disk
sequence. However, if desired, runtime activation without the hibernate
freeze can be forced as an override.
The ndctl utility grows the following extensions / commands to drive
this mechanism:
1/ The existing update-firmware command will 'arm' devices where the
firmware image is staged by default.
ndctl update-firmware all -f firmware_image.bin
2/ The existing ability to enumerate firmware-update capabilities now
includes firmware activate capabilities at the 'bus' and 'dimm/device'
level:
ndctl list -BDF -b nfit_test.0
[
{
"provider":"nfit_test.0",
"dev":"ndbus2",
"scrub_state":"idle",
"firmware":{
"activate_method":"suspend",
"activate_state":"idle"
},
"dimms":[
{
"dev":"nmem1",
"id":"cdab-0a-07e0-ffffffff",
"handle":0,
"phys_id":0,
"security":"disabled",
"firmware":{
"current_version":0,
"can_update":true
}
},
...
3/ The new activate-firmware command triggers firmware activation per
the platform enumerated context, "suspend" vs "live", or can be forced
to "live" if there is a explicit knowledge that allowing applications
and devices to race the quiesce timeout will have no adverse effects.
ndctl activate-firmware nfit_test.0 [--force]
These patches are passing an updated version of the ndctl
"firmware-update.sh" unit test (to be posted).
---
Dan Williams (11):
libnvdimm: Validate command family indices
ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor
ACPI: NFIT: Define runtime firmware activation commands
tools/testing/nvdimm: Cleanup dimm index passing
tools/testing/nvdimm: Add command debug messages
tools/testing/nvdimm: Prepare nfit_ctl_test() for ND_CMD_CALL emulation
tools/testing/nvdimm: Emulate firmware activation commands
driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW}
libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO()
PM, libnvdimm: Add runtime firmware activation support
ACPI: NFIT: Add runtime firmware activate support
Documentation/ABI/testing/sysfs-bus-nfit | 19 +
Documentation/ABI/testing/sysfs-bus-nvdimm | 2
.../driver-api/nvdimm/firmware-activate.rst | 86 ++++
drivers/acpi/nfit/core.c | 142 +++++--
drivers/acpi/nfit/intel.c | 386 ++++++++++++++++++++
drivers/acpi/nfit/intel.h | 61 +++
drivers/acpi/nfit/nfit.h | 38 ++
drivers/nvdimm/bus.c | 16 +
drivers/nvdimm/core.c | 149 ++++++++
drivers/nvdimm/dimm_devs.c | 119 ++++++
drivers/nvdimm/namespace_devs.c | 2
drivers/nvdimm/nd-core.h | 1
drivers/nvdimm/pfn_devs.c | 2
drivers/nvdimm/region_devs.c | 2
include/linux/device.h | 4
include/linux/libnvdimm.h | 52 +++
include/linux/suspend.h | 6
include/linux/sysfs.h | 7
include/uapi/linux/ndctl.h | 5
kernel/power/hibernate.c | 97 +++++
tools/testing/nvdimm/test/nfit.c | 367 +++++++++++++++----
21 files changed, 1449 insertions(+), 114 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-nvdimm
create mode 100644 Documentation/driver-api/nvdimm/firmware-activate.rst
base-commit: 48778464bb7d346b47157d21ffde2af6b2d39110
The !ATOMIC_IOMAP version of io_maping_init_wc will always return
success, even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
During a device probe, where the ioremap failed, a crash can look
like this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had
been propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
v2: reflect review comments
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
--
2.21.0
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: de2b41be8fcccb2f5b6c480d35df590476344201
Gitweb: https://git.kernel.org/tip/de2b41be8fcccb2f5b6c480d35df590476344201
Author: Joerg Roedel <jroedel(a)suse.de>
AuthorDate: Tue, 21 Jul 2020 11:34:48 +02:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 22 Jul 2020 09:38:37 +02:00
x86, vmlinux.lds: Page-align end of ..page_aligned sections
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.
As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentially get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.
This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.
Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.
Enforcing page sized objects for these sections would wreckage memory
sanitizers, because the object becomes artificially larger than it should
be and out of bound access becomes legit.
Align the end of the .bss..page_aligned and .data..page_aligned section on
page-size so all objects places in these sections are guaranteed to have
their own page.
[ tglx: Amended changelog ]
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
---
arch/x86/kernel/vmlinux.lds.S | 1 +
include/asm-generic/vmlinux.lds.h | 5 ++++-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 3bfc8dd..9a03e5b 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -358,6 +358,7 @@ SECTIONS
.bss : AT(ADDR(.bss) - LOAD_OFFSET) {
__bss_start = .;
*(.bss..page_aligned)
+ . = ALIGN(PAGE_SIZE);
*(BSS_MAIN)
BSS_DECRYPTED
. = ALIGN(PAGE_SIZE);
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index db600ef..052e0f0 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -341,7 +341,8 @@
#define PAGE_ALIGNED_DATA(page_align) \
. = ALIGN(page_align); \
- *(.data..page_aligned)
+ *(.data..page_aligned) \
+ . = ALIGN(page_align);
#define READ_MOSTLY_DATA(align) \
. = ALIGN(align); \
@@ -737,7 +738,9 @@
. = ALIGN(bss_align); \
.bss : AT(ADDR(.bss) - LOAD_OFFSET) { \
BSS_FIRST_SECTIONS \
+ . = ALIGN(PAGE_SIZE); \
*(.bss..page_aligned) \
+ . = ALIGN(PAGE_SIZE); \
*(.dynbss) \
*(BSS_MAIN) \
*(COMMON) \
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.
GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.
To better model how GCC's -B/--prefix takes in effect in practice, newer
Clang (since
https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6…)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.
Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).
Cc: stable(a)vger.kernel.org
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Fangrui Song <maskray(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
Changes in v2:
* Updated description to add tags and the llvm-project commit link.
* Fixed a typo.
Changes in v3:
* Add Cc: stable(a)vger.kernel.org
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
ifneq ($(CROSS_COMPILE),)
CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
endif
ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog
PASID and PRI capabilities are only enumerated in PF devices. VF devices
do not enumerate these capabilites. IOMMU drivers also need to enumerate
them before enabling features in the IOMMU. Extending the same support as
PASID feature discovery (pci_pasid_features) for PRI.
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
v2: Fixed build failure from lkp when CONFIG_PRI=n
Almost all the PRI functions were called only when CONFIG_PASID is
set. Except the new pci_pri_supported().
previous version sent in error because i missed dry-run before recompile
:-(
To: Bjorn Helgaas <bhelgaas(a)google.com>
To: Joerg Roedel <joro(a)8bytes.com>
To: Lu Baolu <baolu.lu(a)intel.com>
Cc: stable(a)vger.kernel.org
Cc: linux-pci(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: iommu(a)lists.linux-foundation.org
---
drivers/iommu/intel/iommu.c | 2 +-
drivers/pci/ats.c | 13 +++++++++++++
include/linux/pci-ats.h | 4 ++++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d759e7234e98..276452f5e6a7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2560,7 +2560,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
}
if (info->ats_supported && ecap_prs(iommu->ecap) &&
- pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI))
+ pci_pri_supported(pdev))
info->pri_supported = 1;
}
}
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index b761c1f72f67..2e6cf0c700f7 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -325,6 +325,19 @@ int pci_prg_resp_pasid_required(struct pci_dev *pdev)
return pdev->pasid_required;
}
+
+/**
+ * pci_pri_supported - Check if PRI is supported.
+ * @pdev: PCI device structure
+ *
+ * Returns true if PRI capability is present, false otherwise.
+ */
+bool pci_pri_supported(struct pci_dev *pdev)
+{
+ /* VFs share the PF PRI configuration */
+ return !!(pci_physfn(pdev)->pri_cap);
+}
+EXPORT_SYMBOL_GPL(pci_pri_supported);
#endif /* CONFIG_PCI_PRI */
#ifdef CONFIG_PCI_PASID
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index f75c307f346d..df54cd5b15db 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -28,6 +28,10 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
void pci_disable_pri(struct pci_dev *pdev);
int pci_reset_pri(struct pci_dev *pdev);
int pci_prg_resp_pasid_required(struct pci_dev *pdev);
+bool pci_pri_supported(struct pci_dev *pdev);
+#else
+static inline bool pci_pri_supported(struct pci_dev *pdev)
+{ return false; }
#endif /* CONFIG_PCI_PRI */
#ifdef CONFIG_PCI_PASID
--
2.7.4
The patch titled
Subject: mm: fix kthread_use_mm() vs TLB invalidate
has been added to the -mm tree. Its filename is
mm-fix-kthread_use_mm-vs-tlb-invalidate.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-fix-kthread_use_mm-vs-tlb-inval…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-kthread_use_mm-vs-tlb-inval…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Zijlstra <peterz(a)infradead.org>
Subject: mm: fix kthread_use_mm() vs TLB invalidate
For SMP systems using IPI based TLB invalidation, looking at
current->active_mm is entirely reasonable. This then presents the
following race condition:
CPU0 CPU1
flush_tlb_mm(mm) use_mm(mm)
<send-IPI>
tsk->active_mm = mm;
<IPI>
if (tsk->active_mm == mm)
// flush TLBs
</IPI>
switch_mm(old_mm,mm,tsk);
Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.
Avoid this by disabling IRQs across changing ->active_mm and switch_mm().
[ There are all sorts of reasons this might be harmless for various
architecture specific reasons, but best not leave the door open at all. ]
Link: http://lkml.kernel.org/r/20200721154106.GE10769@hirez.programming.kicks-ass…
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reported-by: Andy Lutomirski <luto(a)amacapital.net>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/kthread.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/kthread.c~mm-fix-kthread_use_mm-vs-tlb-invalidate
+++ a/kernel/kthread.c
@@ -1239,13 +1239,15 @@ void kthread_use_mm(struct mm_struct *mm
WARN_ON_ONCE(tsk->mm);
task_lock(tsk);
+ local_irq_disable();
active_mm = tsk->active_mm;
if (active_mm != mm) {
mmgrab(mm);
tsk->active_mm = mm;
}
tsk->mm = mm;
- switch_mm(active_mm, mm, tsk);
+ switch_mm_irqs_off(active_mm, mm, tsk);
+ local_irq_enable();
task_unlock(tsk);
#ifdef finish_arch_post_lock_switch
finish_arch_post_lock_switch();
@@ -1274,9 +1276,11 @@ void kthread_unuse_mm(struct mm_struct *
task_lock(tsk);
sync_mm_rss(mm);
+ local_irq_disable();
tsk->mm = NULL;
/* active_mm is still 'mm' */
enter_lazy_tlb(mm, tsk);
+ local_irq_enable();
task_unlock(tsk);
}
EXPORT_SYMBOL_GPL(kthread_unuse_mm);
_
Patches currently in -mm which might be from peterz(a)infradead.org are
mm-fix-kthread_use_mm-vs-tlb-invalidate.patch
The patch titled
Subject: io-mapping: indicate mapping failure
has been added to the -mm tree. Its filename is
io-mapping-indicate-mapping-failure.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/io-mapping-indicate-mapping-failur…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/io-mapping-indicate-mapping-failur…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Michael J. Ruhl" <michael.j.ruhl(a)intel.com>
Subject: io-mapping: indicate mapping failure
The !ATOMIC_IOMAP version of io_maping_init_wc will always return success,
even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
During a device probe, where the ioremap failed, a crash can look
like this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had
been propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure
+++ a/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *io
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
_
Patches currently in -mm which might be from michael.j.ruhl(a)intel.com are
io-mapping-indicate-mapping-failure.patch
The patch titled
Subject: fork: silence a false postive warning in __mmdrop
has been added to the -mm tree. Its filename is
fork-silence-a-false-postive-warning-in-__mmdrop.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fork-silence-a-false-postive-warni…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fork-silence-a-false-postive-warni…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Qian Cai <cai(a)lca.pw>
Subject: fork: silence a false postive warning in __mmdrop
commit bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
delayed,
idle->active_mm = &init_mm;
into finish_cpu() instead of idle_task_exit() which results in a false
positive warning that was originally designed in the commit 3eda69c92d47
("kernel/fork.c: detect early free of a live mm").
WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
__mmdrop+0x230/0x2c0
do_exit+0x424/0xfa0
Call Trace:
do_exit+0x424/0xfa0
do_group_exit+0x64/0xd0
sys_exit_group+0x24/0x30
system_call_exception+0x108/0x1d0
system_call_common+0xf0/0x278
Link: http://lkml.kernel.org/r/20200604150344.1796-1-cai@lca.pw
Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
Signed-off-by: Qian Cai <cai(a)lca.pw>
Cc: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au> (powerpc)
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 1 -
1 file changed, 1 deletion(-)
--- a/kernel/fork.c~fork-silence-a-false-postive-warning-in-__mmdrop
+++ a/kernel/fork.c
@@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
{
BUG_ON(mm == &init_mm);
WARN_ON_ONCE(mm == current->mm);
- WARN_ON_ONCE(mm == current->active_mm);
mm_free_pgd(mm);
destroy_context(mm);
mmu_notifier_subscriptions_destroy(mm);
_
Patches currently in -mm which might be from cai(a)lca.pw are
fork-silence-a-false-postive-warning-in-__mmdrop.patch
mm-page_alloc-silence-a-kasan-false-positive.patch
mm-kmemleak-silence-kcsan-splats-in-checksum.patch
mm-frontswap-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races-v2.patch
mm-swap_state-mark-various-intentional-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races-v2.patch
mm-page_counter-fix-various-data-races-at-memsw.patch
mm-memcontrol-fix-a-data-race-in-scan-count.patch
mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
mm-mempool-fix-a-data-race-in-mempool_free.patch
mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
mm-annotate-a-data-race-in-page_zonenum.patch
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.
GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.
To better model how GCC's -B/--prefix takes in effect in practice, newer
Clang (since
https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6…)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.
Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Fangrui Song <maskray(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
Changes in v2:
* Updated description to add tags and the llvm-project commit link.
* Fixed a typo.
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
ifneq ($(CROSS_COMPILE),)
CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
endif
ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog
PASID and PRI capabilities are only enumerated in PF devices. VF devices
do not enumerate these capabilites. IOMMU drivers also need to enumerate
them before enabling features in the IOMMU. Extending the same support as
PASID feature discovery (pci_pasid_features) for PRI.
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
v2: Fixed build failure from lkp when CONFIG_PRI=n
Almost all the PRI functions were called only when CONFIG_PASID is
set. Except the new pci_pri_supported().
To: Bjorn Helgaas <bhelgaas(a)google.com>
To: Joerg Roedel <joro(a)8bytes.com>
To: Lu Baolu <baolu.lu(a)intel.com>
Cc: stable(a)vger.kernel.org
Cc: linux-pci(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: iommu(a)lists.linux-foundation.org
---
drivers/iommu/intel/iommu.c | 2 +-
drivers/pci/ats.c | 14 ++++++++++++++
include/linux/pci-ats.h | 4 ++++
3 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d759e7234e98..276452f5e6a7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2560,7 +2560,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
}
if (info->ats_supported && ecap_prs(iommu->ecap) &&
- pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI))
+ pci_pri_supported(pdev))
info->pri_supported = 1;
}
}
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index b761c1f72f67..ffb4de8c5a77 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -461,6 +461,20 @@ int pci_pasid_features(struct pci_dev *pdev)
}
EXPORT_SYMBOL_GPL(pci_pasid_features);
+/**
+ * pci_pri_supported - Check if PRI is supported.
+ * @pdev: PCI device structure
+ *
+ * Returns false when no PRI capability is present.
+ * Returns true if PRI feature is supported and enabled
+ */
+bool pci_pri_supported(struct pci_dev *pdev)
+{
+ /* VFs share the PF PRI configuration */
+ return !!(pci_physfn(pdev)->pri_cap);
+}
+EXPORT_SYMBOL_GPL(pci_pri_supported);
+
#define PASID_NUMBER_SHIFT 8
#define PASID_NUMBER_MASK (0x1f << PASID_NUMBER_SHIFT)
/**
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index f75c307f346d..fc989295daf3 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -28,6 +28,10 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
void pci_disable_pri(struct pci_dev *pdev);
int pci_reset_pri(struct pci_dev *pdev);
int pci_prg_resp_pasid_required(struct pci_dev *pdev);
+bool pci_pri_supported(struct pci_dev *pdev);
+#else
+bool pci_pri_supported(struct pci_dev *pdev)
+{ return false; }
#endif /* CONFIG_PCI_PRI */
#ifdef CONFIG_PCI_PASID
--
2.7.4
The IMA_APPRAISE_BOOTPARAM config allows enabling different "ima_appraise="
modes - log, fix, enforce - at run time, but not when IMA architecture
specific policies are enabled. This prevents properly labeling the
filesystem on systems where secure boot is supported, but not enabled on the
platform. Only when secure boot is actually enabled should these IMA
appraise modes be disabled.
This patch removes the compile time dependency and makes it a runtime
decision, based on the secure boot state of that platform.
Test results as follows:
-> x86-64 with secure boot enabled
[ 0.015637] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[ 0.015668] ima: Secure boot enabled: ignoring ima_appraise=fix boot parameter option
-> powerpc with secure boot disabled
[ 0.000000] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[ 0.000000] Secure boot mode disabled
-> Running the system without secure boot and with both options set:
CONFIG_IMA_APPRAISE_BOOTPARAM=y
CONFIG_IMA_ARCH_POLICY=y
Audit prompts "missing-hash" but still allow execution and, consequently,
filesystem labeling:
type=INTEGRITY_DATA msg=audit(07/09/2020 12:30:27.778:1691) : pid=4976
uid=root auid=root ses=2
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 op=appraise_data
cause=missing-hash comm=bash name=/usr/bin/evmctl dev="dm-0" ino=493150
res=no
Cc: stable(a)vger.kernel.org
Fixes: d958083a8f64 ("x86/ima: define arch_get_ima_policy() for x86")
Signed-off-by: Bruno Meneguele <bmeneg(a)redhat.com>
---
v6:
- explictly print the bootparam being ignored to the user (Mimi)
v5:
- add pr_info() to inform user the ima_appraise= boot param is being
ignored due to secure boot enabled (Nayna)
- add some testing results to commit log
v4:
- instead of change arch_policy loading code, check secure boot state at
"ima_appraise=" parameter handler (Mimi)
v3:
- extend secure boot arch checker to also consider trusted boot
- enforce IMA appraisal when secure boot is effectively enabled (Nayna)
- fix ima_appraise flag assignment by or'ing it (Mimi)
v2:
- pr_info() message prefix correction
security/integrity/ima/Kconfig | 2 +-
security/integrity/ima/ima_appraise.c | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index edde88dbe576..62dc11a5af01 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -232,7 +232,7 @@ config IMA_APPRAISE_REQUIRE_POLICY_SIGS
config IMA_APPRAISE_BOOTPARAM
bool "ima_appraise boot parameter"
- depends on IMA_APPRAISE && !IMA_ARCH_POLICY
+ depends on IMA_APPRAISE
default y
help
This option enables the different "ima_appraise=" modes
diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index a9649b04b9f1..28a59508c6bd 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -19,6 +19,12 @@
static int __init default_appraise_setup(char *str)
{
#ifdef CONFIG_IMA_APPRAISE_BOOTPARAM
+ if (arch_ima_get_secureboot()) {
+ pr_info("Secure boot enabled: ignoring ima_appraise=%s boot parameter option",
+ str);
+ return 1;
+ }
+
if (strncmp(str, "off", 3) == 0)
ima_appraise = 0;
else if (strncmp(str, "log", 3) == 0)
--
2.26.2
This is a note to let you know that I've just added the patch titled
serial: 8250_mtk: Fix high-speed baud rates clamping
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 551e553f0d4ab623e2a6f424ab5834f9c7b5229c Mon Sep 17 00:00:00 2001
From: Serge Semin <Sergey.Semin(a)baikalelectronics.ru>
Date: Tue, 14 Jul 2020 15:41:12 +0300
Subject: serial: 8250_mtk: Fix high-speed baud rates clamping
Commit 7b668c064ec3 ("serial: 8250: Fix max baud limit in generic 8250
port") fixed limits of a baud rate setting for a generic 8250 port.
In other words since that commit the baud rate has been permitted to be
within [uartclk / 16 / UART_DIV_MAX; uartclk / 16], which is absolutely
normal for a standard 8250 UART port. But there are custom 8250 ports,
which provide extended baud rate limits. In particular the Mediatek 8250
port can work with baud rates up to "uartclk" speed.
Normally that and any other peculiarity is supposed to be handled in a
custom set_termios() callback implemented in the vendor-specific
8250-port glue-driver. Currently that is how it's done for the most of
the vendor-specific 8250 ports, but for some reason for Mediatek a
solution has been spread out to both the glue-driver and to the generic
8250-port code. Due to that a bug has been introduced, which permitted the
extended baud rate limit for all even for standard 8250-ports. The bug
has been fixed by the commit 7b668c064ec3 ("serial: 8250: Fix max baud
limit in generic 8250 port") by narrowing the baud rates limit back down to
the normal bounds. Unfortunately by doing so we also broke the
Mediatek-specific extended bauds feature.
A fix of the problem described above is twofold. First since we can't get
back the extended baud rate limits feature to the generic set_termios()
function and that method supports only a standard baud rates range, the
requested baud rate must be locally stored before calling it and then
restored back to the new termios structure after the generic set_termios()
finished its magic business. By doing so we still use the
serial8250_do_set_termios() method to set the LCR/MCR/FCR/etc. registers,
while the extended baud rate setting procedure will be performed later in
the custom Mediatek-specific set_termios() callback. Second since a true
baud rate is now fully calculated in the custom set_termios() method we
need to locally update the port timeout by calling the
uart_update_timeout() function. After the fixes described above are
implemented in the 8250_mtk.c driver, the Mediatek 8250-port should
get back to normally working with extended baud rates.
Link: https://lore.kernel.org/linux-serial/20200701211337.3027448-1-danielwinkler…
Fixes: 7b668c064ec3 ("serial: 8250: Fix max baud limit in generic 8250 port")
Reported-by: Daniel Winkler <danielwinkler(a)google.com>
Signed-off-by: Serge Semin <Sergey.Semin(a)baikalelectronics.ru>
Cc: stable <stable(a)vger.kernel.org>
Tested-by: Claire Chang <tientzu(a)chromium.org>
Link: https://lore.kernel.org/r/20200714124113.20918-1-Sergey.Semin@baikalelectro…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/8250/8250_mtk.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c
index f839380c2f4c..98b8a3e30733 100644
--- a/drivers/tty/serial/8250/8250_mtk.c
+++ b/drivers/tty/serial/8250/8250_mtk.c
@@ -306,8 +306,21 @@ mtk8250_set_termios(struct uart_port *port, struct ktermios *termios,
}
#endif
+ /*
+ * Store the requested baud rate before calling the generic 8250
+ * set_termios method. Standard 8250 port expects bauds to be
+ * no higher than (uartclk / 16) so the baud will be clamped if it
+ * gets out of that bound. Mediatek 8250 port supports speed
+ * higher than that, therefore we'll get original baud rate back
+ * after calling the generic set_termios method and recalculate
+ * the speed later in this method.
+ */
+ baud = tty_termios_baud_rate(termios);
+
serial8250_do_set_termios(port, termios, old);
+ tty_termios_encode_baud_rate(termios, baud, baud);
+
/*
* Mediatek UARTs use an extra highspeed register (MTK_UART_HIGHS)
*
@@ -339,6 +352,11 @@ mtk8250_set_termios(struct uart_port *port, struct ktermios *termios,
*/
spin_lock_irqsave(&port->lock, flags);
+ /*
+ * Update the per-port timeout.
+ */
+ uart_update_timeout(port, termios->c_cflag, baud);
+
/* set DLAB we have cval saved in up->lcr from the call to the core */
serial_port_out(port, UART_LCR, up->lcr | UART_LCR_DLAB);
serial_dl_write(up, quot);
--
2.27.0
This is a note to let you know that I've just added the patch titled
serial: tegra: fix CREAD handling for PIO
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From b374c562ee7ab3f3a1daf959c01868bae761571c Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Fri, 10 Jul 2020 15:59:46 +0200
Subject: serial: tegra: fix CREAD handling for PIO
Commit 33ae787b74fc ("serial: tegra: add support to ignore read") added
support for dropping input in case CREAD isn't set, but for PIO the
ignore_status_mask wasn't checked until after the character had been
put in the receive buffer.
Note that the NULL tty-port test is bogus and will be removed by a
follow-on patch.
Fixes: 33ae787b74fc ("serial: tegra: add support to ignore read")
Cc: stable <stable(a)vger.kernel.org> # 5.4
Cc: Shardar Shariff Md <smohammed(a)nvidia.com>
Cc: Krishna Yarlagadda <kyarlagadda(a)nvidia.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Acked-by: Thierry Reding <treding(a)nvidia.com>
Link: https://lore.kernel.org/r/20200710135947.2737-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/serial-tegra.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c
index 8de8bac9c6c7..b3bbee6b6702 100644
--- a/drivers/tty/serial/serial-tegra.c
+++ b/drivers/tty/serial/serial-tegra.c
@@ -653,11 +653,14 @@ static void tegra_uart_handle_rx_pio(struct tegra_uart_port *tup,
ch = (unsigned char) tegra_uart_read(tup, UART_RX);
tup->uport.icount.rx++;
- if (!uart_handle_sysrq_char(&tup->uport, ch) && tty)
- tty_insert_flip_char(tty, ch, flag);
+ if (uart_handle_sysrq_char(&tup->uport, ch))
+ continue;
if (tup->uport.ignore_status_mask & UART_LSR_DR)
continue;
+
+ if (tty)
+ tty_insert_flip_char(tty, ch, flag);
} while (1);
}
--
2.27.0
This is the start of the stable review cycle for the 4.4.231 release.
There are 58 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 22 Jul 2020 15:27:31 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.231-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.231-rc1
Vincent Guittot <vincent.guittot(a)linaro.org>
sched/fair: handle case of task_h_load() returning 0
Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
misc: atmel-ssc: lock with mutex instead of spinlock
Krzysztof Kozlowski <krzk(a)kernel.org>
dmaengine: fsl-edma: Fix NULL pointer exception in fsl_edma_tx_handler
Vishwas M <vishwas.reddy.vr(a)gmail.com>
hwmon: (emc2103) fix unable to change fan pwm1_enable attribute
Huacai Chen <chenhc(a)lemote.com>
MIPS: Fix build for LTS kernel caused by backporting lpj adjustment
Esben Haabendal <esben(a)geanix.com>
uio_pdrv_genirq: fix use without device tree and no interrupt
David Pedersen <limero1337(a)gmail.com>
Input: i8042 - add Lenovo XiaoXin Air 12 to i8042 nomux list
Alexander Usyskin <alexander.usyskin(a)intel.com>
mei: bus: don't clean driver pointer
Chirantan Ekbote <chirantan(a)chromium.org>
fuse: Fix parameter for FS_IOC_{GET,SET}FLAGS
Alexander Lobakin <alobakin(a)pm.me>
virtio: virtio_console: add missing MODULE_DEVICE_TABLE() for rproc serial
AceLan Kao <acelan.kao(a)canonical.com>
USB: serial: option: add Quectel EG95 LTE modem
Jörgen Storvist <jorgen.storvist(a)gmail.com>
USB: serial: option: add GosunCn GM500 series
Igor Moura <imphilippini(a)gmail.com>
USB: serial: ch341: add new Product ID for CH340
James Hilliard <james.hilliard1(a)gmail.com>
USB: serial: cypress_m8: enable Simply Automated UPB PIM
Johan Hovold <johan(a)kernel.org>
USB: serial: iuu_phoenix: fix memory corruption
Zhang Qiang <qiang.zhang(a)windriver.com>
usb: gadget: function: fix missing spinlock in f_uac1_legacy
Peter Chen <peter.chen(a)nxp.com>
usb: chipidea: core: add wakeup support for extcon
Tom Rix <trix(a)redhat.com>
USB: c67x00: fix use after free in c67x00_giveback_urb
Takashi Iwai <tiwai(a)suse.de>
ALSA: usb-audio: Fix race against the error recovery URB submission
Takashi Iwai <tiwai(a)suse.de>
ALSA: line6: Perform sanity check for each URB creation
Takashi Iwai <tiwai(a)suse.de>
usb: core: Add a helper function to check the validity of EP type in URB
Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
HID: magicmouse: do not set up autorepeat
Álvaro Fernández Rojas <noltari(a)gmail.com>
mtd: rawnand: brcmnand: fix CS0 layout
Jin Yao <yao.jin(a)linux.intel.com>
perf stat: Zero all the 'ena' and 'run' array slot stats for interval mode
Dan Carpenter <dan.carpenter(a)oracle.com>
staging: comedi: verify array index is correct before using it
Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
usb: gadget: udc: atmel: fix uninitialized read in debug printk
Sasha Levin <sashal(a)kernel.org>
Revert "usb/ohci-platform: Fix a warning when hibernating"
Sasha Levin <sashal(a)kernel.org>
Revert "usb/xhci-plat: Set PM runtime as active on resume"
Sasha Levin <sashal(a)kernel.org>
Revert "usb/ehci-platform: Set PM runtime as active on resume"
Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
i2c: eg20t: Load module automatically if ID matches
Eric Dumazet <edumazet(a)google.com>
tcp: md5: allow changing MD5 keys in all socket states
Eric Dumazet <edumazet(a)google.com>
tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers
Eric Dumazet <edumazet(a)google.com>
tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()
Christoph Paasch <cpaasch(a)apple.com>
tcp: make sure listeners don't initialize congestion-control state
Sean Tranchetti <stranche(a)codeaurora.org>
genetlink: remove genl_bind
Martin Varghese <martin.varghese(a)nokia.com>
net: Added pointer check for dst->ops->neigh_lookup in dst_neigh_lookup_skb
Eric Dumazet <edumazet(a)google.com>
llc: make sure applications use ARPHRD_ETHER
Xin Long <lucien.xin(a)gmail.com>
l2tp: remove skb_dst_set() from l2tp_xmit_skb()
Sabrina Dubroca <sd(a)queasysnail.net>
ipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsg
Davide Caratti <dcaratti(a)redhat.com>
bnxt_en: fix NULL dereference in case SR-IOV configuration fails
Vineet Gupta <vgupta(a)synopsys.com>
ARC: elf: use right ELF_ARCH
Vineet Gupta <vgupta(a)synopsys.com>
ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE
Tom Rix <trix(a)redhat.com>
drm/radeon: fix double free
Boris Burkov <boris(a)bur.io>
btrfs: fix fatal extent_buffer readahead vs releasepage race
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb"
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: bit 8 of non-leaf PDPEs is not reserved
Hector Martin <marcan(a)marcan.st>
ALSA: usb-audio: add quirk for MacroSilicon MS2109
Hui Wang <hui.wang(a)canonical.com>
ALSA: hda - let hs_mic be picked ahead of hp_mic
xidongwang <wangxidong_97(a)163.com>
ALSA: opl3: fix infoleak in opl3
Wei Li <liwei391(a)huawei.com>
arm64: kgdb: Fix single-step exception handling oops
Vinod Koul <vkoul(a)kernel.org>
ALSA: compress: fix partial_drain completion state
Andre Edich <andre.edich(a)microchip.com>
smsc95xx: avoid memory leak in smsc95xx_bind
Andre Edich <andre.edich(a)microchip.com>
smsc95xx: check return value of smsc95xx_reset
Li Heng <liheng40(a)huawei.com>
net: cxgb4: fix return error value in t4_prep_fw
Tomas Henzl <thenzl(a)redhat.com>
scsi: mptscsih: Fix read sense data size
Zhenzhong Duan <zhenzhong.duan(a)gmail.com>
spi: spidev: fix a potential use-after-free in spidev_release()
Zhenzhong Duan <zhenzhong.duan(a)gmail.com>
spi: spidev: fix a race between spidev_release and spidev_remove
Christian Borntraeger <borntraeger(a)de.ibm.com>
KVM: s390: reduce number of IO pins to 1
-------------
Diffstat:
Makefile | 4 +-
arch/arc/include/asm/elf.h | 2 +-
arch/arc/kernel/entry.S | 16 +++-----
arch/arm64/kernel/kgdb.c | 2 +-
arch/mips/kernel/time.c | 13 ++-----
arch/s390/include/asm/kvm_host.h | 8 ++--
arch/x86/kvm/mmu.c | 2 +-
drivers/char/virtio_console.c | 3 +-
drivers/dma/fsl-edma.c | 7 ++++
drivers/gpu/drm/radeon/ci_dpm.c | 7 ++--
drivers/hid/hid-magicmouse.c | 6 +++
drivers/hwmon/emc2103.c | 2 +-
drivers/i2c/busses/i2c-eg20t.c | 1 +
drivers/input/serio/i8042-x86ia64io.h | 7 ++++
drivers/message/fusion/mptscsih.c | 4 +-
drivers/misc/atmel-ssc.c | 24 ++++++------
drivers/misc/mei/bus.c | 3 +-
drivers/mtd/nand/brcmnand/brcmnand.c | 5 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 2 +-
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 8 ++--
drivers/net/usb/smsc95xx.c | 9 ++++-
drivers/net/wireless/ath/ath9k/hif_usb.c | 48 ++++++-----------------
drivers/net/wireless/ath/ath9k/hif_usb.h | 5 ---
drivers/spi/spidev.c | 24 ++++++------
drivers/staging/comedi/drivers/addi_apci_1500.c | 10 +++--
drivers/uio/uio_pdrv_genirq.c | 2 +-
drivers/usb/c67x00/c67x00-sched.c | 2 +-
drivers/usb/chipidea/core.c | 24 ++++++++++++
drivers/usb/core/urb.c | 30 ++++++++++++--
drivers/usb/gadget/function/f_uac1.c | 2 +
drivers/usb/gadget/udc/atmel_usba_udc.c | 2 +-
drivers/usb/host/ehci-platform.c | 5 ---
drivers/usb/host/ohci-platform.c | 5 ---
drivers/usb/host/xhci-plat.c | 11 +-----
drivers/usb/serial/ch341.c | 1 +
drivers/usb/serial/cypress_m8.c | 2 +
drivers/usb/serial/cypress_m8.h | 3 ++
drivers/usb/serial/iuu_phoenix.c | 8 ++--
drivers/usb/serial/option.c | 6 +++
fs/btrfs/extent_io.c | 40 +++++++++++--------
fs/fuse/file.c | 12 +++++-
include/linux/usb.h | 2 +
include/net/dst.h | 10 ++++-
include/net/genetlink.h | 8 ----
include/sound/compress_driver.h | 10 ++++-
kernel/sched/fair.c | 10 ++++-
net/ipv4/ping.c | 3 ++
net/ipv4/tcp.c | 13 ++++---
net/ipv4/tcp_cong.c | 2 +-
net/ipv4/tcp_ipv4.c | 15 +++++--
net/l2tp/l2tp_core.c | 5 +--
net/llc/af_llc.c | 10 +++--
net/netlink/genetlink.c | 52 -------------------------
sound/core/compress_offload.c | 4 ++
sound/drivers/opl3/opl3_synth.c | 2 +
sound/pci/hda/hda_auto_parser.c | 6 +++
sound/usb/line6/capture.c | 2 +
sound/usb/line6/playback.c | 2 +
sound/usb/midi.c | 17 +++++---
sound/usb/quirks-table.h | 52 +++++++++++++++++++++++++
tools/perf/util/stat.c | 6 ++-
61 files changed, 358 insertions(+), 250 deletions(-)
This is a note to let you know that I've just added the patch titled
tty: xilinx_uartps: Really fix id assignment
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 22a82fa7d6c3e16d56a036b1fa697a39b954adf0 Mon Sep 17 00:00:00 2001
From: Helmut Grohne <helmut.grohne(a)intenta.de>
Date: Mon, 13 Jul 2020 09:32:28 +0200
Subject: tty: xilinx_uartps: Really fix id assignment
The problems started with the revert (18cc7ac8a28e28). The
cdns_uart_console.index is statically assigned -1. When the port is
registered, Linux assigns consecutive numbers to it. It turned out that
when using ttyPS1 as console, the index is not updated as we are reusing
the same cdns_uart_console instance for multiple ports. When registering
ttyPS0, it gets updated from -1 to 0, but when registering ttyPS1, it
already is 0 and not updated.
That led to 2ae11c46d5fdc4. It assigns the index prior to registering
the uart_driver once. Unfortunately, that ended up breaking the
situation where the probe order does not match the id order. When using
the same device tree for both uboot and linux, it is important that the
serial0 alias points to the console. So some boards reverse those
aliases. This was reported by Jan Kiszka. The proposed fix was reverting
the index assignment and going back to the previous iteration.
However such a reversed assignement (serial0 -> uart1, serial1 -> uart0)
was already partially broken by the revert (18cc7ac8a28e28). While the
ttyPS device works, the kmsg connection is already broken and kernel
messages go missing. Reverting the id assignment does not fix this.
>From the xilinx_uartps driver pov (after reverting the refactoring
commits), there can be only one console. This manifests in static
variables console_pprt and cdns_uart_console. These variables are not
properly linked and can go out of sync. The cdns_uart_console.index is
important for uart_add_one_port. We call that function for each port -
one of which hopefully is the console. If it isn't, the CON_ENABLED flag
is not set and console_port is cleared. The next cdns_uart_probe call
then tries to register the next port using that same cdns_uart_console.
It is important that console_port and cdns_uart_console (and its index
in particular) stay in sync. The index assignment implemented by
Shubhrajyoti Datta is correct in principle. It just may have to happen a
second time if the first cdns_uart_probe call didn't encounter the
console device. And we shouldn't change the index once the console uart
is registered.
Reported-by: Shubhrajyoti Datta <shubhrajyoti.datta(a)xilinx.com>
Reported-by: Jan Kiszka <jan.kiszka(a)web.de>
Link: https://lore.kernel.org/linux-serial/f4092727-d8f5-5f91-2c9f-76643aace993@s…
Fixes: 18cc7ac8a28e28 ("Revert "serial: uartps: Register own uart console and driver structures"")
Fixes: 2ae11c46d5fdc4 ("tty: xilinx_uartps: Fix missing id assignment to the console")
Fixes: 76ed2e10579671 ("Revert "tty: xilinx_uartps: Fix missing id assignment to the console"")
Signed-off-by: Helmut Grohne <helmut.grohne(a)intenta.de>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200713073227.GA3805@laureti-dev
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 672cfa075e28..2833f1418d6d 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -1580,8 +1580,10 @@ static int cdns_uart_probe(struct platform_device *pdev)
* If register_console() don't assign value, then console_port pointer
* is cleanup.
*/
- if (!console_port)
+ if (!console_port) {
+ cdns_uart_console.index = id;
console_port = port;
+ }
#endif
rc = uart_add_one_port(&cdns_uart_uart_driver, port);
@@ -1594,8 +1596,10 @@ static int cdns_uart_probe(struct platform_device *pdev)
#ifdef CONFIG_SERIAL_XILINX_PS_UART_CONSOLE
/* This is not port which is used for console that's why clean it up */
if (console_port == port &&
- !(cdns_uart_uart_driver.cons->flags & CON_ENABLED))
+ !(cdns_uart_uart_driver.cons->flags & CON_ENABLED)) {
console_port = NULL;
+ cdns_uart_console.index = -1;
+ }
#endif
cdns_uart_data->cts_override = of_property_read_bool(pdev->dev.of_node,
--
2.27.0
This is a note to let you know that I've just added the patch titled
vt: Reject zero-sized screen buffer size.
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From ce684552a266cb1c7cc2f7e623f38567adec6653 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Date: Sun, 12 Jul 2020 20:10:12 +0900
Subject: vt: Reject zero-sized screen buffer size.
syzbot is reporting general protection fault in do_con_write() [1] caused
by vc->vc_screenbuf == ZERO_SIZE_PTR caused by vc->vc_screenbuf_size == 0
caused by vc->vc_cols == vc->vc_rows == vc->vc_size_row == 0 caused by
fb_set_var() from ioctl(FBIOPUT_VSCREENINFO) on /dev/fb0 , for
gotoxy(vc, 0, 0) from reset_terminal() from vc_init() from vc_allocate()
from con_install() from tty_init_dev() from tty_open() on such console
causes vc->vc_pos == 0x10000000e due to
((unsigned long) ZERO_SIZE_PTR) + -1U * 0 + (-1U << 1).
I don't think that a console with 0 column or 0 row makes sense. And it
seems that vc_do_resize() does not intend to allow resizing a console to
0 column or 0 row due to
new_cols = (cols ? cols : vc->vc_cols);
new_rows = (lines ? lines : vc->vc_rows);
exception.
Theoretically, cols and rows can be any range as long as
0 < cols * rows * 2 <= KMALLOC_MAX_SIZE is satisfied (e.g.
cols == 1048576 && rows == 2 is possible) because of
vc->vc_size_row = vc->vc_cols << 1;
vc->vc_screenbuf_size = vc->vc_rows * vc->vc_size_row;
in visual_init() and kzalloc(vc->vc_screenbuf_size) in vc_allocate().
Since we can detect cols == 0 or rows == 0 via screenbuf_size = 0 in
visual_init(), we can reject kzalloc(0). Then, vc_allocate() will return
an error, and con_write() will not be called on a console with 0 column
or 0 row.
We need to make sure that integer overflow in visual_init() won't happen.
Since vc_do_resize() restricts cols <= 32767 and rows <= 32767, applying
1 <= cols <= 32767 and 1 <= rows <= 32767 restrictions to vc_allocate()
will be practically fine.
This patch does not touch con_init(), for returning -EINVAL there
does not help when we are not returning -ENOMEM.
[1] https://syzkaller.appspot.com/bug?extid=017265e8553724e514e8
Reported-and-tested-by: syzbot <syzbot+017265e8553724e514e8(a)syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200712111013.11881-1-penguin-kernel@I-love.SAKU…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 48a8199f7845..42d8c67a481f 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1092,10 +1092,19 @@ static const struct tty_port_operations vc_port_ops = {
.destruct = vc_port_destruct,
};
+/*
+ * Change # of rows and columns (0 means unchanged/the size of fg_console)
+ * [this is to be used together with some user program
+ * like resize that changes the hardware videomode]
+ */
+#define VC_MAXCOL (32767)
+#define VC_MAXROW (32767)
+
int vc_allocate(unsigned int currcons) /* return 0 on success */
{
struct vt_notifier_param param;
struct vc_data *vc;
+ int err;
WARN_CONSOLE_UNLOCKED();
@@ -1125,6 +1134,11 @@ int vc_allocate(unsigned int currcons) /* return 0 on success */
if (!*vc->vc_uni_pagedir_loc)
con_set_default_unimap(vc);
+ err = -EINVAL;
+ if (vc->vc_cols > VC_MAXCOL || vc->vc_rows > VC_MAXROW ||
+ vc->vc_screenbuf_size > KMALLOC_MAX_SIZE || !vc->vc_screenbuf_size)
+ goto err_free;
+ err = -ENOMEM;
vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_KERNEL);
if (!vc->vc_screenbuf)
goto err_free;
@@ -1143,7 +1157,7 @@ int vc_allocate(unsigned int currcons) /* return 0 on success */
visual_deinit(vc);
kfree(vc);
vc_cons[currcons].d = NULL;
- return -ENOMEM;
+ return err;
}
static inline int resize_screen(struct vc_data *vc, int width, int height,
@@ -1158,14 +1172,6 @@ static inline int resize_screen(struct vc_data *vc, int width, int height,
return err;
}
-/*
- * Change # of rows and columns (0 means unchanged/the size of fg_console)
- * [this is to be used together with some user program
- * like resize that changes the hardware videomode]
- */
-#define VC_RESIZE_MAXCOL (32767)
-#define VC_RESIZE_MAXROW (32767)
-
/**
* vc_do_resize - resizing method for the tty
* @tty: tty being resized
@@ -1201,7 +1207,7 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
user = vc->vc_resize_user;
vc->vc_resize_user = 0;
- if (cols > VC_RESIZE_MAXCOL || lines > VC_RESIZE_MAXROW)
+ if (cols > VC_MAXCOL || lines > VC_MAXROW)
return -EINVAL;
new_cols = (cols ? cols : vc->vc_cols);
@@ -1212,7 +1218,7 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
if (new_cols == vc->vc_cols && new_rows == vc->vc_rows)
return 0;
- if (new_screen_size > KMALLOC_MAX_SIZE)
+ if (new_screen_size > KMALLOC_MAX_SIZE || !new_screen_size)
return -EINVAL;
newscreen = kzalloc(new_screen_size, GFP_USER);
if (!newscreen)
@@ -3393,6 +3399,7 @@ static int __init con_init(void)
INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK);
tty_port_init(&vc->port);
visual_init(vc, currcons, 1);
+ /* Assuming vc->vc_{cols,rows,screenbuf_size} are sane here. */
vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_NOWAIT);
vc_init(vc, vc->vc_rows, vc->vc_cols,
currcons || !vc->vc_sw->con_save_screen);
--
2.27.0
The !ATOMIC_IOMAP version of io_maping_init_wc will always return
success, even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
Return NULL on ioremap failure.
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping"
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
--
2.21.0
Sometimes it is good to know when your mapping failed.
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping"
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
--
2.21.0
plane->index is NOT the index of the color plane in a YUV frame.
Actually, a YUV frame is represented by a single drm_plane, even though
it contains three Y, U, V planes.
v2-v3: No change
Cc: stable(a)vger.kernel.org # v5.3
Fixes: 90b86fcc47b4 ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Acked-by: Sam Ravnborg <sam(a)ravnborg.org>
---
drivers/gpu/drm/ingenic/ingenic-drm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm.c b/drivers/gpu/drm/ingenic/ingenic-drm.c
index deb37b4a8e91..606d8acb0954 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm.c
@@ -386,7 +386,7 @@ static void ingenic_drm_plane_atomic_update(struct drm_plane *plane,
addr = drm_fb_cma_get_gem_addr(state->fb, state, 0);
width = state->src_w >> 16;
height = state->src_h >> 16;
- cpp = state->fb->format->cpp[plane->index];
+ cpp = state->fb->format->cpp[0];
priv->dma_hwdesc->addr = addr;
priv->dma_hwdesc->cmd = width * height * cpp / 4;
--
2.27.0
This is a note to let you know that I've just added the patch titled
usb: xhci: Fix ASM2142/ASM3142 DMA addressing
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From dbb0897e805f2ab1b8bc358f6c3d878a376b8897 Mon Sep 17 00:00:00 2001
From: Forest Crossman <cyrozap(a)gmail.com>
Date: Fri, 17 Jul 2020 06:27:34 -0500
Subject: usb: xhci: Fix ASM2142/ASM3142 DMA addressing
The ASM2142/ASM3142 (same PCI IDs) does not support full 64-bit DMA
addresses, which can cause silent memory corruption or IOMMU errors on
platforms that use the upper bits. Add the XHCI_NO_64BIT_SUPPORT quirk
to fix this issue.
Signed-off-by: Forest Crossman <cyrozap(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200717112734.328432-1-cyrozap@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index ef513c2fb843..9234c82e70e4 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -265,6 +265,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == 0x1142)
xhci->quirks |= XHCI_TRUST_TX_LENGTH;
+ if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
+ pdev->device == 0x2142)
+ xhci->quirks |= XHCI_NO_64BIT_SUPPORT;
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == PCI_DEVICE_ID_ASMEDIA_1042A_XHCI)
--
2.27.0
This is a note to let you know that I've just added the patch titled
usb: xhci-mtk: fix the failure of bandwidth allocation
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 5ce1a24dd98c00a57a8fa13660648abf7e08e3ef Mon Sep 17 00:00:00 2001
From: Chunfeng Yun <chunfeng.yun(a)mediatek.com>
Date: Fri, 10 Jul 2020 13:57:52 +0800
Subject: usb: xhci-mtk: fix the failure of bandwidth allocation
The wMaxPacketSize field of endpoint descriptor may be zero
as default value in alternate interface, and they are not
actually selected when start stream, so skip them when try to
allocate bandwidth.
Cc: stable <stable(a)vger.kernel.org>
Fixes: 0cbd4b34cda9 ("xhci: mediatek: support MTK xHCI host controller")
Signed-off-by: Chunfeng Yun <chunfeng.yun(a)mediatek.com>
Link: https://lore.kernel.org/r/1594360672-2076-1-git-send-email-chunfeng.yun@med…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-mtk-sch.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/usb/host/xhci-mtk-sch.c b/drivers/usb/host/xhci-mtk-sch.c
index fea555570ad4..45c54d56ecbd 100644
--- a/drivers/usb/host/xhci-mtk-sch.c
+++ b/drivers/usb/host/xhci-mtk-sch.c
@@ -557,6 +557,10 @@ static bool need_bw_sch(struct usb_host_endpoint *ep,
if (is_fs_or_ls(speed) && !has_tt)
return false;
+ /* skip endpoint with zero maxpkt */
+ if (usb_endpoint_maxp(&ep->desc) == 0)
+ return false;
+
return true;
}
--
2.27.0
From: Marc Kleine-Budde <mkl(a)pengutronix.de>
[ Upstream commit e84861fec32dee8a2e62bbaa52cded6b05a2a456 ]
This function is used by dev_get_regmap() to retrieve a regmap for the
specified device. If the device has more than one regmap, the name parameter
can be used to specify one.
The code here uses a pointer comparison to check for equal strings. This
however will probably always fail, as the regmap->name is allocated via
kstrdup_const() from the regmap's config->name.
Fix this by using strcmp() instead.
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
Link: https://lore.kernel.org/r/20200703103315.267996-1-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/base/regmap/regmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 77cabde977edd..4a4efc6f54b55 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -1106,7 +1106,7 @@ static int dev_get_regmap_match(struct device *dev, void *res, void *data)
/* If the user didn't specify a name match any */
if (data)
- return (*r)->name == data;
+ return !strcmp((*r)->name, data);
else
return 1;
}
--
2.25.1
Hi,
Multiple users reported NFS causes NULL pointer dereference [1] on Ubuntu, due to commit "SUNRPC: Add "@len" parameter to gss_unwrap()" and commit "SUNRPC: Fix GSS privacy computation of auth->au_ralign".
The same issue happens on upstream stable 5.4.y branch.
The mainline kernel doesn't have this issue though.
Should we revert them? Or is there any missing commits need to be backported to v5.4?
[1] https://bugs.launchpad.net/bugs/1886277
Kai-Heng
Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But for pages allocated by __get_free_pages() without
__GFP_COMP, which also have refcount as 0, they are still sent by
kernel_sendpage() to remote end, this is problematic.
When bcache uses a remote NVMe SSD via nvme-over-tcp as its cache
device, writing meta data e.g. cache_set->disk_buckets to remote SSD may
trigger a kernel panic due to the above problem. Bcause the meta data
pages for cache_set->disk_buckets are allocated by __get_free_pages()
without __GFP_COMP.
This problem should be fixed both in upper layer driver (bcache) and
nvme-over-tcp code. This patch fixes the nvme-over-tcp code by checking
whether the page refcount is 0, if yes then don't use kernel_sendpage()
and call sock_no_sendpage() to send the page into network stack.
The code comments in this patch is copied and modified from drbd where
the similar problem already gets solved by Philipp Reisner. This is the
best code comment including my own version.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
Changelog:
v2: fix typo in patch subject.
v1: the initial version.
drivers/nvme/host/tcp.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 79ef2b8e2b3c..faa71db7522a 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -887,8 +887,17 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
else
flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
- /* can't zcopy slab pages */
- if (unlikely(PageSlab(page))) {
+ /*
+ * e.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+ if (unlikely(PageSlab(page) || page_count(page) < 1)) {
ret = sock_no_sendpage(queue->sock, page, offset, len,
flags);
} else {
--
2.26.2
arm build failed on stable-rc 5.4 branch.
make -sk KBUILD_BUILD_USER=TuxBuild -C/linux -j32 ARCH=arm
CROSS_COMPILE=arm-linux-gnueabihf- HOSTCC=gcc CC="sccache
arm-linux-gnueabihf-gcc" O=build zImage
#
../drivers/firmware/efi/arm-init.c: In function ‘efifb_add_links’:
../drivers/firmware/efi/arm-init.c:327:12: error: implicit declaration
of function ‘get_dev_from_fwnode’
[-Werror=implicit-function-declaration]
327 | sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
| ^~~~~~~~~~~~~~~~~~~
../drivers/firmware/efi/arm-init.c:327:10: warning: assignment to
‘struct device *’ from ‘int’ makes pointer from integer without a cast
[-Wint-conversion]
327 | sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
| ^
../drivers/firmware/efi/arm-init.c: At top level:
../drivers/firmware/efi/arm-init.c:352:3: error: ‘const struct
fwnode_operations’ has no member named ‘add_links’
352 | .add_links = efifb_add_links,
| ^~~~~~~~~
../drivers/firmware/efi/arm-init.c:352:15: error: initialization of
‘struct fwnode_handle * (*)(struct fwnode_handle *)’ from incompatible
pointer type ‘int (*)(const struct fwnode_handle *, struct device *)’
[-Werror=incompatible-pointer-types]
352 | .add_links = efifb_add_links,
| ^~~~~~~~~~~~~~~
../drivers/firmware/efi/arm-init.c:352:15: note: (near initialization
for ‘efifb_fwnode_ops.get’)
seems like this is coming from the below patch
--
efi/arm: Defer probe of PCIe backed efifb on DT systems
[ Upstream commit 64c8a0cd0a535891d5905c3a1651150f0f141439 ]
The new of_devlink support breaks PCIe probing on ARM platforms booting
via UEFI if the firmware exposes a EFI framebuffer that is backed by a
PCI device. The reason is that the probing order gets reversed,
resulting in a resource conflict on the framebuffer memory window when
the PCIe probes last, causing it to give up entirely.
Given that we rely on PCI quirks to deal with EFI framebuffers that get
moved around in memory, we cannot simply drop the memory reservation, so
instead, let's use the device link infrastructure to register this
dependency, and force the probing to occur in the expected order.
Co-developed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Link: https://lore.kernel.org/r/20200113172245.27925-9-ardb@kernel.org
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
--
Linaro LKFT
https://lkft.linaro.org
Although I was not originally involved in the development of these
patches, I recently came across them while looking over the source:
upstream commit 429120f3df2d ("block: fix splitting segments on boundary masks")
Cc: stable(a)vger.kernel.org # v5.1+
Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
upstream commit 4a2f704eb2d8 ("block: fix get_max_segment_size() overflow on 32bit arch")
Fixes: 429120f3df2d ("block: fix splitting segments on boundary masks")
https://www.spinics.net/lists/linux-block/msg48605.htmlhttps://www.spinics.net/lists/linux-block/msg48959.html
The first patch mentions fixing problems with filesystem corruption, so
it seems important, but it has never been included in any -stable
kernel. Is there a specific reason these patches have been excluded
from -stable, or is it just a mistake?
These patches would be relevant to kernel versions 5.1 - 5.4, with 5.4.y
being the only active stable kernel series in that range. The patches
apply cleanly on top of 5.4.51.
Tony Battersby
Cybernetics
Hello,
Commit ab6f762f0f53162d41 Linus' HEAD.
printk_deferred() does not make sure that it's safe to write to
per-CPU data, which causes problems when printk_deferred() is
invoked "too early", before per-CPU areas are initialized. There
are multiple bug reports, e.g.
https://bugzilla.kernel.org/show_bug.cgi?id=206847
-ss
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 92e0575b99835b5b3aaab2132dd551e0e04eb96a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala(a)linux.intel.com>
Date: Sat, 11 Jul 2020 11:03:36 +0300
Subject: [PATCH] drm/i915: Recalculate FBC w/a stride when needed
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently we're failing to recalculate the gen9 FBC w/a stride
unless something more drastic than just the modifier itself has
changed. This often leaves us with FBC enabled with the linear
fbdev framebuffer without the w/a stride enabled. That will cause
an immediate underrun and FBC will get promptly disabled.
Fix the problem by checking if the w/a stride is about to change,
and go through the full dance if so. This part of the FBC code
is still pretty much a disaster and will need lots more work.
But this should at least fix the immediate issue.
v2: Deactivate FBC when the modifier changes since that will
likely require resetting the w/a CFB stride
Cc: stable(a)vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200711080336.13423-1-ville.…
Reviewed-by: José Roberto de Souza <jose.souza(a)intel.com>
(cherry picked from commit 0428ab013fdd39dbfb8f4cd8ad2b60af3776c6b9)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index a65d9d8b79a7..412572f88b67 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -719,6 +719,25 @@ static bool intel_fbc_cfb_size_changed(struct drm_i915_private *dev_priv)
fbc->compressed_fb.size * fbc->threshold;
}
+static u16 intel_fbc_gen9_wa_cfb_stride(struct drm_i915_private *dev_priv)
+{
+ struct intel_fbc *fbc = &dev_priv->fbc;
+ struct intel_fbc_state_cache *cache = &fbc->state_cache;
+
+ if ((IS_GEN9_BC(dev_priv) || IS_BROXTON(dev_priv)) &&
+ cache->fb.modifier != I915_FORMAT_MOD_X_TILED)
+ return DIV_ROUND_UP(cache->plane.src_w, 32 * fbc->threshold) * 8;
+ else
+ return 0;
+}
+
+static bool intel_fbc_gen9_wa_cfb_stride_changed(struct drm_i915_private *dev_priv)
+{
+ struct intel_fbc *fbc = &dev_priv->fbc;
+
+ return fbc->params.gen9_wa_cfb_stride != intel_fbc_gen9_wa_cfb_stride(dev_priv);
+}
+
static bool intel_fbc_can_enable(struct drm_i915_private *dev_priv)
{
struct intel_fbc *fbc = &dev_priv->fbc;
@@ -877,6 +896,7 @@ static void intel_fbc_get_reg_params(struct intel_crtc *crtc,
params->crtc.i9xx_plane = to_intel_plane(crtc->base.primary)->i9xx_plane;
params->fb.format = cache->fb.format;
+ params->fb.modifier = cache->fb.modifier;
params->fb.stride = cache->fb.stride;
params->cfb_size = intel_fbc_calculate_cfb_size(dev_priv, cache);
@@ -906,6 +926,9 @@ static bool intel_fbc_can_flip_nuke(const struct intel_crtc_state *crtc_state)
if (params->fb.format != cache->fb.format)
return false;
+ if (params->fb.modifier != cache->fb.modifier)
+ return false;
+
if (params->fb.stride != cache->fb.stride)
return false;
@@ -1185,7 +1208,8 @@ void intel_fbc_enable(struct intel_atomic_state *state,
if (fbc->crtc) {
if (fbc->crtc != crtc ||
- !intel_fbc_cfb_size_changed(dev_priv))
+ (!intel_fbc_cfb_size_changed(dev_priv) &&
+ !intel_fbc_gen9_wa_cfb_stride_changed(dev_priv)))
goto out;
__intel_fbc_disable(dev_priv);
@@ -1207,12 +1231,7 @@ void intel_fbc_enable(struct intel_atomic_state *state,
goto out;
}
- if ((IS_GEN9_BC(dev_priv) || IS_BROXTON(dev_priv)) &&
- plane_state->hw.fb->modifier != I915_FORMAT_MOD_X_TILED)
- cache->gen9_wa_cfb_stride =
- DIV_ROUND_UP(cache->plane.src_w, 32 * fbc->threshold) * 8;
- else
- cache->gen9_wa_cfb_stride = 0;
+ cache->gen9_wa_cfb_stride = intel_fbc_gen9_wa_cfb_stride(dev_priv);
drm_dbg_kms(&dev_priv->drm, "Enabling FBC on pipe %c\n",
pipe_name(crtc->pipe));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f79f118bf192..ae99a9190200 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -440,6 +440,7 @@ struct intel_fbc {
struct {
const struct drm_format_info *format;
unsigned int stride;
+ u64 modifier;
} fb;
int cfb_size;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 110f9efa858f584c6bed177cd48d0c0f526940e1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 13 Jul 2020 17:05:49 +0100
Subject: [PATCH] drm/i915/gt: Only swap to a random sibling once upon creation
The danger in switching at random upon intel_context_pin is that the
context may still actually be inflight, as it will not be scheduled out
until a context switch after it is complete -- that may be a long time
after we do a final intel_context_unpin.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2118
Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713160549.17344-1-chris@…
(cherry picked from commit 90a987205c6cf74116a102ed446d22d92cdaf915)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index d270d2db6f0a..cb07e1d2a353 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -5396,13 +5396,8 @@ static void virtual_engine_initial_hint(struct virtual_engine *ve)
* typically be the first we inspect for submission.
*/
swp = prandom_u32_max(ve->num_siblings);
- if (!swp)
- return;
-
- swap(ve->siblings[swp], ve->siblings[0]);
- if (!intel_engine_has_relative_mmio(ve->siblings[0]))
- virtual_update_register_offsets(ve->context.lrc_reg_state,
- ve->siblings[0]);
+ if (swp)
+ swap(ve->siblings[swp], ve->siblings[0]);
}
static int virtual_context_alloc(struct intel_context *ce)
@@ -5415,15 +5410,9 @@ static int virtual_context_alloc(struct intel_context *ce)
static int virtual_context_pin(struct intel_context *ce)
{
struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
- int err;
/* Note: we must use a real engine class for setting up reg state */
- err = __execlists_context_pin(ce, ve->siblings[0]);
- if (err)
- return err;
-
- virtual_engine_initial_hint(ve);
- return 0;
+ return __execlists_context_pin(ce, ve->siblings[0]);
}
static void virtual_context_enter(struct intel_context *ce)
@@ -5770,6 +5759,7 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
ve->base.flags |= I915_ENGINE_IS_VIRTUAL;
+ virtual_engine_initial_hint(ve);
return &ve->context;
err_put:
On Mon, Jun 29, 2020 at 04:28:05PM +0200, SeongJae Park wrote:
> Hello,
>
>
> With my little script, I found below commits in the mainline tree are more than
> 1 week old and fixing commits that back-ported in v5.4..v5.4.49, but not merged
> in the stable/linux-5.4.y tree. Are those need to be merged in but missed or
> dealyed?
>
> 9210c075cef2 ("nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()")
I tried this first patch, and it doesn't apply to the 5.4.y tree, so are
you sure you tried these yourself?
If so, please send a series of backported patches that you have
successfully tested, or if a patch applies cleanly, just the git id.
thanks,
greg k-h
Commit 005c34ae4b44f085120d7f371121ec7ded677761 upstream.
The GIC driver uses a RMW sequence to update the affinity, and
relies on the gic_lock_irqsave/gic_unlock_irqrestore sequences
to update it atomically.
But these sequences only expand into anything meaningful if
the BL_SWITCHER option is selected, which almost never happens.
It also turns out that using a RMW and locks is just as silly,
as the GIC distributor supports byte accesses for the GICD_TARGETRn
registers, which when used make the update atomic by definition.
Drop the terminally broken code and replace it by a byte write.
Fixes: 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature")
Cc: stable(a)vger.kernel.org
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
---
drivers/irqchip/irq-gic.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d6c404b3584d..006b17593c12 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -324,10 +324,8 @@ static int gic_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu)
static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
bool force)
{
- void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
- unsigned int cpu, shift = (gic_irq(d) % 4) * 8;
- u32 val, mask, bit;
- unsigned long flags;
+ void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + gic_irq(d);
+ unsigned int cpu;
if (!force)
cpu = cpumask_any_and(mask_val, cpu_online_mask);
@@ -337,12 +335,7 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
return -EINVAL;
- gic_lock_irqsave(flags);
- mask = 0xff << shift;
- bit = gic_cpu_map[cpu] << shift;
- val = readl_relaxed(reg) & ~mask;
- writel_relaxed(val | bit, reg);
- gic_unlock_irqrestore(flags);
+ writeb_relaxed(gic_cpu_map[cpu], reg);
return IRQ_SET_MASK_OK_DONE;
}
--
2.27.0
[ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]
task_h_load() can return 0 in some situations like running stress-ng
mmapfork, which forks thousands of threads, in a sched group on a 224 cores
system. The load balance doesn't handle this correctly because
env->imbalance never decreases and it will stop pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance will be decreased by at least 1.
We can't simply ensure that task_h_load() returns at least one because it
would imply to handle underflow in other places.
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
[removed misfit part which was not implemented yet]
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Reviewed-by: Valentin Schneider <valentin.schneider(a)arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: <stable(a)vger.kernel.org> # v4.19 v4.14 v4.9 v4.4
cc: Sasha Levin <sashal(a)kernel.org>
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
---
This patch also applies on v4.14.188 v4.9.230 and v4.4.230
kernel/sched/fair.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 92b1e71f13c8..d8c249e6dcb7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7337,7 +7337,15 @@ static int detach_tasks(struct lb_env *env)
if (!can_migrate_task(p, env))
goto next;
- load = task_h_load(p);
+ /*
+ * Depending of the number of CPUs and tasks and the
+ * cgroup hierarchy, task_h_load() can return a null
+ * value. Make sure that env->imbalance decreases
+ * otherwise detach_tasks() will stop only after
+ * detaching up to loop_max tasks.
+ */
+ load = max_t(unsigned long, task_h_load(p), 1);
+
if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
goto next;
--
2.17.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 15956689a0e60aa0c795174f3c310b60d8794235 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 3 Jul 2020 12:08:42 +0100
Subject: [PATCH] arm64: compat: Ensure upper 32 bits of x0 are zero on syscall
return
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.
Explicitly clear the upper 32 bits of x0 on return from a compat system
call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Keno Fischer <keno(a)juliacomputing.com>
Cc: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 65299a2dcf9c..cfc0672013f6 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -34,6 +34,10 @@ static inline long syscall_get_error(struct task_struct *task,
struct pt_regs *regs)
{
unsigned long error = regs->regs[0];
+
+ if (is_compat_thread(task_thread_info(task)))
+ error = sign_extend64(error, 31);
+
return IS_ERR_VALUE(error) ? error : 0;
}
@@ -47,7 +51,13 @@ static inline void syscall_set_return_value(struct task_struct *task,
struct pt_regs *regs,
int error, long val)
{
- regs->regs[0] = (long) error ? error : val;
+ if (error)
+ val = error;
+
+ if (is_compat_thread(task_thread_info(task)))
+ val = lower_32_bits(val);
+
+ regs->regs[0] = val;
}
#define SYSCALL_MAX_ARGS 6
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 7c14466a12af..98a26d4e7b0c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -50,6 +50,9 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
ret = do_ni_syscall(regs, scno);
}
+ if (is_compat_task())
+ ret = lower_32_bits(ret);
+
regs->regs[0] = ret;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 15956689a0e60aa0c795174f3c310b60d8794235 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 3 Jul 2020 12:08:42 +0100
Subject: [PATCH] arm64: compat: Ensure upper 32 bits of x0 are zero on syscall
return
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.
Explicitly clear the upper 32 bits of x0 on return from a compat system
call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Keno Fischer <keno(a)juliacomputing.com>
Cc: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 65299a2dcf9c..cfc0672013f6 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -34,6 +34,10 @@ static inline long syscall_get_error(struct task_struct *task,
struct pt_regs *regs)
{
unsigned long error = regs->regs[0];
+
+ if (is_compat_thread(task_thread_info(task)))
+ error = sign_extend64(error, 31);
+
return IS_ERR_VALUE(error) ? error : 0;
}
@@ -47,7 +51,13 @@ static inline void syscall_set_return_value(struct task_struct *task,
struct pt_regs *regs,
int error, long val)
{
- regs->regs[0] = (long) error ? error : val;
+ if (error)
+ val = error;
+
+ if (is_compat_thread(task_thread_info(task)))
+ val = lower_32_bits(val);
+
+ regs->regs[0] = val;
}
#define SYSCALL_MAX_ARGS 6
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 7c14466a12af..98a26d4e7b0c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -50,6 +50,9 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
ret = do_ni_syscall(regs, scno);
}
+ if (is_compat_task())
+ ret = lower_32_bits(ret);
+
regs->regs[0] = ret;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 15956689a0e60aa0c795174f3c310b60d8794235 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 3 Jul 2020 12:08:42 +0100
Subject: [PATCH] arm64: compat: Ensure upper 32 bits of x0 are zero on syscall
return
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.
Explicitly clear the upper 32 bits of x0 on return from a compat system
call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Keno Fischer <keno(a)juliacomputing.com>
Cc: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 65299a2dcf9c..cfc0672013f6 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -34,6 +34,10 @@ static inline long syscall_get_error(struct task_struct *task,
struct pt_regs *regs)
{
unsigned long error = regs->regs[0];
+
+ if (is_compat_thread(task_thread_info(task)))
+ error = sign_extend64(error, 31);
+
return IS_ERR_VALUE(error) ? error : 0;
}
@@ -47,7 +51,13 @@ static inline void syscall_set_return_value(struct task_struct *task,
struct pt_regs *regs,
int error, long val)
{
- regs->regs[0] = (long) error ? error : val;
+ if (error)
+ val = error;
+
+ if (is_compat_thread(task_thread_info(task)))
+ val = lower_32_bits(val);
+
+ regs->regs[0] = val;
}
#define SYSCALL_MAX_ARGS 6
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 7c14466a12af..98a26d4e7b0c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -50,6 +50,9 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
ret = do_ni_syscall(regs, scno);
}
+ if (is_compat_task())
+ ret = lower_32_bits(ret);
+
regs->regs[0] = ret;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ac2081cdc4d99c57f219c1a6171526e0fa0a6fff Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Thu, 2 Jul 2020 21:16:20 +0100
Subject: [PATCH] arm64: ptrace: Consistently use pseudo-singlestep exceptions
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.
1. Stepping out of a ptrace signal stop into a signal handler where
SIGTRAP is blocked. Fast-forwarding the stepping state machine in
this case will result in a forced SIGTRAP, with the handler reset to
SIG_DFL.
2. The hardware implicitly fast-forwards the state machine when executing
an SVC instruction for issuing a system call. This can interact badly
with subsequent ptrace stops signalled during the execution of the
system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
the stepping state by updating the PSTATE for the tracee.
Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Luis Machado <luis.machado(a)linaro.org>
Reported-by: Keno Fischer <keno(a)juliacomputing.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 6ea8b6a26ae9..5e784e16ee89 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -93,6 +93,7 @@ void arch_release_task_struct(struct task_struct *tsk);
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_FSCHECK (1 << TIF_FSCHECK)
+#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 68b7f34a08f5..057d4aa1af4d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1818,12 +1818,23 @@ static void tracehook_report_syscall(struct pt_regs *regs,
saved_reg = regs->regs[regno];
regs->regs[regno] = dir;
- if (dir == PTRACE_SYSCALL_EXIT)
+ if (dir == PTRACE_SYSCALL_ENTER) {
+ if (tracehook_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else if (!test_thread_flag(TIF_SINGLESTEP)) {
tracehook_report_syscall_exit(regs, 0);
- else if (tracehook_report_syscall_entry(regs))
- forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else {
+ regs->regs[regno] = saved_reg;
- regs->regs[regno] = saved_reg;
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ tracehook_report_syscall_exit(regs, 1);
+ }
}
int syscall_trace_enter(struct pt_regs *regs)
@@ -1851,12 +1862,14 @@ int syscall_trace_enter(struct pt_regs *regs)
void syscall_trace_exit(struct pt_regs *regs)
{
+ unsigned long flags = READ_ONCE(current_thread_info()->flags);
+
audit_syscall_exit(regs);
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ if (flags & _TIF_SYSCALL_TRACEPOINT)
trace_sys_exit(regs, regs_return_value(regs));
- if (test_thread_flag(TIF_SYSCALL_TRACE))
+ if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
rseq_syscall(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 801d56cdf701..3b4f31f35e45 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -800,7 +800,6 @@ static void setup_restart_syscall(struct pt_regs *regs)
*/
static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
- struct task_struct *tsk = current;
sigset_t *oldset = sigmask_to_save();
int usig = ksig->sig;
int ret;
@@ -824,14 +823,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
*/
ret |= !valid_user_regs(®s->user_regs, current);
- /*
- * Fast forward the stepping logic so we step into the signal
- * handler.
- */
- if (!ret)
- user_fastforward_single_step(tsk);
-
- signal_setup_done(ret, ksig, 0);
+ /* Step into the signal handler if we are stepping */
+ signal_setup_done(ret, ksig, test_thread_flag(TIF_SINGLESTEP));
}
/*
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 5f5b868292f5..7c14466a12af 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -139,7 +139,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
local_daif_mask();
flags = current_thread_info()->flags;
- if (!has_syscall_work(flags)) {
+ if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP)) {
/*
* We're off to userspace, where interrupts are
* always enabled after we restore the flags from
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ac2081cdc4d99c57f219c1a6171526e0fa0a6fff Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Thu, 2 Jul 2020 21:16:20 +0100
Subject: [PATCH] arm64: ptrace: Consistently use pseudo-singlestep exceptions
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.
1. Stepping out of a ptrace signal stop into a signal handler where
SIGTRAP is blocked. Fast-forwarding the stepping state machine in
this case will result in a forced SIGTRAP, with the handler reset to
SIG_DFL.
2. The hardware implicitly fast-forwards the state machine when executing
an SVC instruction for issuing a system call. This can interact badly
with subsequent ptrace stops signalled during the execution of the
system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
the stepping state by updating the PSTATE for the tracee.
Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Luis Machado <luis.machado(a)linaro.org>
Reported-by: Keno Fischer <keno(a)juliacomputing.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 6ea8b6a26ae9..5e784e16ee89 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -93,6 +93,7 @@ void arch_release_task_struct(struct task_struct *tsk);
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_FSCHECK (1 << TIF_FSCHECK)
+#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 68b7f34a08f5..057d4aa1af4d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1818,12 +1818,23 @@ static void tracehook_report_syscall(struct pt_regs *regs,
saved_reg = regs->regs[regno];
regs->regs[regno] = dir;
- if (dir == PTRACE_SYSCALL_EXIT)
+ if (dir == PTRACE_SYSCALL_ENTER) {
+ if (tracehook_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else if (!test_thread_flag(TIF_SINGLESTEP)) {
tracehook_report_syscall_exit(regs, 0);
- else if (tracehook_report_syscall_entry(regs))
- forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else {
+ regs->regs[regno] = saved_reg;
- regs->regs[regno] = saved_reg;
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ tracehook_report_syscall_exit(regs, 1);
+ }
}
int syscall_trace_enter(struct pt_regs *regs)
@@ -1851,12 +1862,14 @@ int syscall_trace_enter(struct pt_regs *regs)
void syscall_trace_exit(struct pt_regs *regs)
{
+ unsigned long flags = READ_ONCE(current_thread_info()->flags);
+
audit_syscall_exit(regs);
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ if (flags & _TIF_SYSCALL_TRACEPOINT)
trace_sys_exit(regs, regs_return_value(regs));
- if (test_thread_flag(TIF_SYSCALL_TRACE))
+ if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
rseq_syscall(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 801d56cdf701..3b4f31f35e45 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -800,7 +800,6 @@ static void setup_restart_syscall(struct pt_regs *regs)
*/
static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
- struct task_struct *tsk = current;
sigset_t *oldset = sigmask_to_save();
int usig = ksig->sig;
int ret;
@@ -824,14 +823,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
*/
ret |= !valid_user_regs(®s->user_regs, current);
- /*
- * Fast forward the stepping logic so we step into the signal
- * handler.
- */
- if (!ret)
- user_fastforward_single_step(tsk);
-
- signal_setup_done(ret, ksig, 0);
+ /* Step into the signal handler if we are stepping */
+ signal_setup_done(ret, ksig, test_thread_flag(TIF_SINGLESTEP));
}
/*
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 5f5b868292f5..7c14466a12af 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -139,7 +139,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
local_daif_mask();
flags = current_thread_info()->flags;
- if (!has_syscall_work(flags)) {
+ if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP)) {
/*
* We're off to userspace, where interrupts are
* always enabled after we restore the flags from
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ac2081cdc4d99c57f219c1a6171526e0fa0a6fff Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Thu, 2 Jul 2020 21:16:20 +0100
Subject: [PATCH] arm64: ptrace: Consistently use pseudo-singlestep exceptions
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.
1. Stepping out of a ptrace signal stop into a signal handler where
SIGTRAP is blocked. Fast-forwarding the stepping state machine in
this case will result in a forced SIGTRAP, with the handler reset to
SIG_DFL.
2. The hardware implicitly fast-forwards the state machine when executing
an SVC instruction for issuing a system call. This can interact badly
with subsequent ptrace stops signalled during the execution of the
system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
the stepping state by updating the PSTATE for the tracee.
Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Luis Machado <luis.machado(a)linaro.org>
Reported-by: Keno Fischer <keno(a)juliacomputing.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 6ea8b6a26ae9..5e784e16ee89 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -93,6 +93,7 @@ void arch_release_task_struct(struct task_struct *tsk);
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_FSCHECK (1 << TIF_FSCHECK)
+#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 68b7f34a08f5..057d4aa1af4d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1818,12 +1818,23 @@ static void tracehook_report_syscall(struct pt_regs *regs,
saved_reg = regs->regs[regno];
regs->regs[regno] = dir;
- if (dir == PTRACE_SYSCALL_EXIT)
+ if (dir == PTRACE_SYSCALL_ENTER) {
+ if (tracehook_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else if (!test_thread_flag(TIF_SINGLESTEP)) {
tracehook_report_syscall_exit(regs, 0);
- else if (tracehook_report_syscall_entry(regs))
- forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else {
+ regs->regs[regno] = saved_reg;
- regs->regs[regno] = saved_reg;
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ tracehook_report_syscall_exit(regs, 1);
+ }
}
int syscall_trace_enter(struct pt_regs *regs)
@@ -1851,12 +1862,14 @@ int syscall_trace_enter(struct pt_regs *regs)
void syscall_trace_exit(struct pt_regs *regs)
{
+ unsigned long flags = READ_ONCE(current_thread_info()->flags);
+
audit_syscall_exit(regs);
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ if (flags & _TIF_SYSCALL_TRACEPOINT)
trace_sys_exit(regs, regs_return_value(regs));
- if (test_thread_flag(TIF_SYSCALL_TRACE))
+ if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
rseq_syscall(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 801d56cdf701..3b4f31f35e45 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -800,7 +800,6 @@ static void setup_restart_syscall(struct pt_regs *regs)
*/
static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
- struct task_struct *tsk = current;
sigset_t *oldset = sigmask_to_save();
int usig = ksig->sig;
int ret;
@@ -824,14 +823,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
*/
ret |= !valid_user_regs(®s->user_regs, current);
- /*
- * Fast forward the stepping logic so we step into the signal
- * handler.
- */
- if (!ret)
- user_fastforward_single_step(tsk);
-
- signal_setup_done(ret, ksig, 0);
+ /* Step into the signal handler if we are stepping */
+ signal_setup_done(ret, ksig, test_thread_flag(TIF_SINGLESTEP));
}
/*
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 5f5b868292f5..7c14466a12af 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -139,7 +139,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
local_daif_mask();
flags = current_thread_info()->flags;
- if (!has_syscall_work(flags)) {
+ if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP)) {
/*
* We're off to userspace, where interrupts are
* always enabled after we restore the flags from
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 263a0269a59c0b4145829462a107fe7f7327105f Mon Sep 17 00:00:00 2001
From: Dinh Nguyen <dinguyen(a)kernel.org>
Date: Mon, 29 Jun 2020 11:25:43 -0500
Subject: [PATCH] arm64: dts: stratix10: add status to qspi dts node
Add status = "okay" to QSPI node.
Fixes: 0cb140d07fc75 ("arm64: dts: stratix10: Add QSPI support for Stratix10")
Cc: linux-stable <stable(a)vger.kernel.org> # >= v5.6
Signed-off-by: Dinh Nguyen <dinguyen(a)kernel.org>
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
index f6c4a15079d3..feadd21bc0dc 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
@@ -155,6 +155,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
index 9946515b8afd..4000c393243d 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
@@ -188,6 +188,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 263a0269a59c0b4145829462a107fe7f7327105f Mon Sep 17 00:00:00 2001
From: Dinh Nguyen <dinguyen(a)kernel.org>
Date: Mon, 29 Jun 2020 11:25:43 -0500
Subject: [PATCH] arm64: dts: stratix10: add status to qspi dts node
Add status = "okay" to QSPI node.
Fixes: 0cb140d07fc75 ("arm64: dts: stratix10: Add QSPI support for Stratix10")
Cc: linux-stable <stable(a)vger.kernel.org> # >= v5.6
Signed-off-by: Dinh Nguyen <dinguyen(a)kernel.org>
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
index f6c4a15079d3..feadd21bc0dc 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
@@ -155,6 +155,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
index 9946515b8afd..4000c393243d 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
@@ -188,6 +188,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4237c625304b212a3f30adf787901082082511ec Mon Sep 17 00:00:00 2001
From: Tim Harvey <tharvey(a)gateworks.com>
Date: Tue, 23 Jun 2020 12:06:54 -0700
Subject: [PATCH] ARM: dts: imx6qdl-gw551x: fix audio SSI
The audio codec on the GW551x routes to ssi1. It fixes audio capture on
the device.
Cc: stable(a)vger.kernel.org
Fixes: 3117e851cef1 ("ARM: dts: imx: Add TDA19971 HDMI Receiver to GW551x")
Signed-off-by: Tim Harvey <tharvey(a)gateworks.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/arch/arm/boot/dts/imx6qdl-gw551x.dtsi b/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
index c38e86eedcc0..8c33510c9519 100644
--- a/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
@@ -110,7 +110,7 @@
simple-audio-card,frame-master = <&sound_codec>;
sound_cpu: simple-audio-card,cpu {
- sound-dai = <&ssi2>;
+ sound-dai = <&ssi1>;
};
sound_codec: simple-audio-card,codec {
From: Finley Xiao <finley.xiao(a)rock-chips.com>
commit 371a3bc79c11b707d7a1b7a2c938dc3cc042fffb upstream.
The function cpu_power_to_freq is used to find a frequency and set the
cooling device to consume at most the power to be converted. For example,
if the power to be converted is 80mW, and the em table is as follow.
struct em_cap_state table[] = {
/* KHz mW */
{ 1008000, 36, 0 },
{ 1200000, 49, 0 },
{ 1296000, 59, 0 },
{ 1416000, 72, 0 },
{ 1512000, 86, 0 },
};
The target frequency should be 1416000KHz, not 1512000KHz.
Fixes: 349d39dc5739 ("thermal: cpu_cooling: merge frequency and power tables")
Cc: <stable(a)vger.kernel.org> # v4.13+
Signed-off-by: Finley Xiao <finley.xiao(a)rock-chips.com>
Acked-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria(a)linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://lore.kernel.org/r/20200619090825.32747-1-finley.xiao@rock-chips.com
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
Hi Greg,
I am resending this as I got your emails of this failing on 4.14, 4.19
and 5.4. This should be applied to all three of them.
@Finley: I hope I have done it correctly, please do check it as this
required me to rewrite the code to adapt to previous kernels.
drivers/thermal/cpu_cooling.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 908a8014cf76..1f4387a5ceae 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -280,11 +280,11 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
int i;
struct freq_table *freq_table = cpufreq_cdev->freq_table;
- for (i = 1; i <= cpufreq_cdev->max_level; i++)
- if (power > freq_table[i].power)
+ for (i = 0; i < cpufreq_cdev->max_level; i++)
+ if (power >= freq_table[i].power)
break;
- return freq_table[i - 1].frequency;
+ return freq_table[i].frequency;
}
/**
--
2.25.0.rc1.19.g042ed3e048af
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6348dd291e3653534a9e28e6917569bc9967b35b Mon Sep 17 00:00:00 2001
From: Charan Teja Kalla <charante(a)codeaurora.org>
Date: Fri, 19 Jun 2020 17:27:19 +0530
Subject: [PATCH] dmabuf: use spinlock to access dmabuf->name
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
There exists a sleep-while-atomic bug while accessing the dmabuf->name
under mutex in the dmabuffs_dname(). This is caused from the SELinux
permissions checks on a process where it tries to validate the inherited
files from fork() by traversing them through iterate_fd() (which
traverse files under spin_lock) and call
match_file(security/selinux/hooks.c) where the permission checks happen.
This audit information is logged using dump_common_audit_data() where it
calls d_path() to get the file path name. If the file check happen on
the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to
access dmabuf->name. The flow will be like below:
flush_unauthorized_files()
iterate_fd()
spin_lock() --> Start of the atomic section.
match_file()
file_has_perm()
avc_has_perm()
avc_audit()
slow_avc_audit()
common_lsm_audit()
dump_common_audit_data()
audit_log_d_path()
d_path()
dmabuffs_dname()
mutex_lock()--> Sleep while atomic.
Call trace captured (on 4.19 kernels) is below:
___might_sleep+0x204/0x208
__might_sleep+0x50/0x88
__mutex_lock_common+0x5c/0x1068
__mutex_lock_common+0x5c/0x1068
mutex_lock_nested+0x40/0x50
dmabuffs_dname+0xa0/0x170
d_path+0x84/0x290
audit_log_d_path+0x74/0x130
common_lsm_audit+0x334/0x6e8
slow_avc_audit+0xb8/0xf8
avc_has_perm+0x154/0x218
file_has_perm+0x70/0x180
match_file+0x60/0x78
iterate_fd+0x128/0x168
selinux_bprm_committing_creds+0x178/0x248
security_bprm_committing_creds+0x30/0x48
install_exec_creds+0x1c/0x68
load_elf_binary+0x3a4/0x14e0
search_binary_handler+0xb0/0x1e0
So, use spinlock to access dmabuf->name to avoid sleep-while-atomic.
Cc: <stable(a)vger.kernel.org> [5.3+]
Signed-off-by: Charan Teja Kalla <charante(a)codeaurora.org>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Acked-by: Christian König <christian.koenig(a)amd.com>
[sumits: added comment to spinlock_t definition to avoid warning]
Signed-off-by: Sumit Semwal <sumit.semwal(a)linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/a83e7f0d-4e54-9848-4b58-e1acd…
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 412629601ad3..1ca609f66fdf 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -45,10 +45,10 @@ static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen)
size_t ret = 0;
dmabuf = dentry->d_fsdata;
- dma_resv_lock(dmabuf->resv, NULL);
+ spin_lock(&dmabuf->name_lock);
if (dmabuf->name)
ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN);
- dma_resv_unlock(dmabuf->resv);
+ spin_unlock(&dmabuf->name_lock);
return dynamic_dname(dentry, buffer, buflen, "/%s:%s",
dentry->d_name.name, ret > 0 ? name : "");
@@ -338,8 +338,10 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
kfree(name);
goto out_unlock;
}
+ spin_lock(&dmabuf->name_lock);
kfree(dmabuf->name);
dmabuf->name = name;
+ spin_unlock(&dmabuf->name_lock);
out_unlock:
dma_resv_unlock(dmabuf->resv);
@@ -402,10 +404,10 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
/* Don't count the temporary reference taken inside procfs seq_show */
seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
- dma_resv_lock(dmabuf->resv, NULL);
+ spin_lock(&dmabuf->name_lock);
if (dmabuf->name)
seq_printf(m, "name:\t%s\n", dmabuf->name);
- dma_resv_unlock(dmabuf->resv);
+ spin_unlock(&dmabuf->name_lock);
}
static const struct file_operations dma_buf_fops = {
@@ -542,6 +544,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
dmabuf->size = exp_info->size;
dmabuf->exp_name = exp_info->exp_name;
dmabuf->owner = exp_info->owner;
+ spin_lock_init(&dmabuf->name_lock);
init_waitqueue_head(&dmabuf->poll);
dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index ab0c156abee6..a2ca294eaebe 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -311,6 +311,7 @@ struct dma_buf {
void *vmap_ptr;
const char *exp_name;
const char *name;
+ spinlock_t name_lock; /* spinlock to protect name access */
struct module *owner;
struct list_head list_node;
void *priv;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a683f390b93f4d1292f849fc48d28e322046120f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Jul 2016 09:50:36 +0000
Subject: [PATCH] timers: Forward the wheel clock whenever possible
The wheel clock is stale when a CPU goes into a long idle sleep. This has the
side effect that timers which are queued end up in the outer wheel levels.
That results in coarser granularity.
To solve this, we keep track of the idle state and forward the wheel clock
whenever possible.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Arjan van de Ven <arjan(a)infradead.org>
Cc: Chris Mason <clm(a)fb.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Frederic Weisbecker <fweisbec(a)gmail.com>
Cc: George Spelvin <linux(a)sciencehorizons.net>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Len Brown <lenb(a)kernel.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: rt(a)linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.de
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 966a5a6fdd0a..f738251000fe 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -164,3 +164,4 @@ static inline void timers_update_migration(bool update_nohz) { }
DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
+void timer_clear_idle(void);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 69abc7bfe80f..5d81f9aa30d2 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -700,6 +700,12 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
delta = next_tick - basemono;
if (delta <= (u64)TICK_NSEC) {
tick.tv64 = 0;
+
+ /*
+ * Tell the timer code that the base is not idle, i.e. undo
+ * the effect of get_next_timer_interrupt():
+ */
+ timer_clear_idle();
/*
* We've not stopped the tick yet, and there's a timer in the
* next period, so no point in stopping it either, bail.
@@ -809,6 +815,12 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
tick_do_update_jiffies64(now);
cpu_load_update_nohz_stop();
+ /*
+ * Clear the timer idle flag, so we avoid IPIs on remote queueing and
+ * the clock forward checks in the enqueue path:
+ */
+ timer_clear_idle();
+
calc_load_exit_idle();
touch_softlockup_watchdog_sched();
/*
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 658051c97a3c..9339d71ee998 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -196,9 +196,11 @@ struct timer_base {
spinlock_t lock;
struct timer_list *running_timer;
unsigned long clk;
+ unsigned long next_expiry;
unsigned int cpu;
bool migration_enabled;
bool nohz_active;
+ bool is_idle;
DECLARE_BITMAP(pending_map, WHEEL_SIZE);
struct hlist_head vectors[WHEEL_SIZE];
} ____cacheline_aligned;
@@ -519,24 +521,37 @@ static void internal_add_timer(struct timer_base *base, struct timer_list *timer
{
__internal_add_timer(base, timer);
+ if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+ return;
+
/*
- * Check whether the other CPU is in dynticks mode and needs
- * to be triggered to reevaluate the timer wheel. We are
- * protected against the other CPU fiddling with the timer by
- * holding the timer base lock. This also makes sure that a
- * CPU on the way to stop its tick can not evaluate the timer
- * wheel.
- *
- * Spare the IPI for deferrable timers on idle targets though.
- * The next busy ticks will take care of it. Except full dynticks
- * require special care against races with idle_cpu(), lets deal
- * with that later.
+ * TODO: This wants some optimizing similar to the code below, but we
+ * will do that when we switch from push to pull for deferrable timers.
*/
- if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active) {
- if (!(timer->flags & TIMER_DEFERRABLE) ||
- tick_nohz_full_cpu(base->cpu))
+ if (timer->flags & TIMER_DEFERRABLE) {
+ if (tick_nohz_full_cpu(base->cpu))
wake_up_nohz_cpu(base->cpu);
+ return;
}
+
+ /*
+ * We might have to IPI the remote CPU if the base is idle and the
+ * timer is not deferrable. If the other CPU is on the way to idle
+ * then it can't set base->is_idle as we hold the base lock:
+ */
+ if (!base->is_idle)
+ return;
+
+ /* Check whether this is the new first expiring timer: */
+ if (time_after_eq(timer->expires, base->next_expiry))
+ return;
+
+ /*
+ * Set the next expiry time and kick the CPU so it can reevaluate the
+ * wheel:
+ */
+ base->next_expiry = timer->expires;
+ wake_up_nohz_cpu(base->cpu);
}
#ifdef CONFIG_TIMER_STATS
@@ -844,10 +859,11 @@ static inline struct timer_base *get_timer_base(u32 tflags)
return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
}
-static inline struct timer_base *get_target_base(struct timer_base *base,
- unsigned tflags)
+#ifdef CONFIG_NO_HZ_COMMON
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
{
-#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+#ifdef CONFIG_SMP
if ((tflags & TIMER_PINNED) || !base->migration_enabled)
return get_timer_this_cpu_base(tflags);
return get_timer_cpu_base(tflags, get_nohz_timer_target());
@@ -856,6 +872,43 @@ static inline struct timer_base *get_target_base(struct timer_base *base,
#endif
}
+static inline void forward_timer_base(struct timer_base *base)
+{
+ /*
+ * We only forward the base when it's idle and we have a delta between
+ * base clock and jiffies.
+ */
+ if (!base->is_idle || (long) (jiffies - base->clk) < 2)
+ return;
+
+ /*
+ * If the next expiry value is > jiffies, then we fast forward to
+ * jiffies otherwise we forward to the next expiry value.
+ */
+ if (time_after(base->next_expiry, jiffies))
+ base->clk = jiffies;
+ else
+ base->clk = base->next_expiry;
+}
+#else
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
+{
+ return get_timer_this_cpu_base(tflags);
+}
+
+static inline void forward_timer_base(struct timer_base *base) { }
+#endif
+
+static inline struct timer_base *
+get_target_base(struct timer_base *base, unsigned tflags)
+{
+ struct timer_base *target = __get_target_base(base, tflags);
+
+ forward_timer_base(target);
+ return target;
+}
+
/*
* We are using hashed locking: Holding per_cpu(timer_bases[x]).lock means
* that all timers which are tied to this base are locked, and the base itself
@@ -1417,16 +1470,49 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
spin_lock(&base->lock);
nextevt = __next_timer_interrupt(base);
- spin_unlock(&base->lock);
+ base->next_expiry = nextevt;
+ /*
+ * We have a fresh next event. Check whether we can forward the base:
+ */
+ if (time_after(nextevt, jiffies))
+ base->clk = jiffies;
+ else if (time_after(nextevt, base->clk))
+ base->clk = nextevt;
- if (time_before_eq(nextevt, basej))
+ if (time_before_eq(nextevt, basej)) {
expires = basem;
- else
+ base->is_idle = false;
+ } else {
expires = basem + (nextevt - basej) * TICK_NSEC;
+ /*
+ * If we expect to sleep more than a tick, mark the base idle:
+ */
+ if ((expires - basem) > TICK_NSEC)
+ base->is_idle = true;
+ }
+ spin_unlock(&base->lock);
return cmp_next_hrtimer_event(basem, expires);
}
+/**
+ * timer_clear_idle - Clear the idle state of the timer base
+ *
+ * Called with interrupts disabled
+ */
+void timer_clear_idle(void)
+{
+ struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
+
+ /*
+ * We do this unlocked. The worst outcome is a remote enqueue sending
+ * a pointless IPI, but taking the lock would just make the window for
+ * sending the IPI a few instructions smaller for the cost of taking
+ * the lock in the exit from idle path.
+ */
+ base->is_idle = false;
+}
+
static int collect_expired_timers(struct timer_base *base,
struct hlist_head *heads)
{
@@ -1440,7 +1526,7 @@ static int collect_expired_timers(struct timer_base *base,
/*
* If the next timer is ahead of time forward to current
- * jiffies, otherwise forward to the next expiry time.
+ * jiffies, otherwise forward to the next expiry time:
*/
if (time_after(next, jiffies)) {
/* The call site will increment clock! */
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a683f390b93f4d1292f849fc48d28e322046120f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Jul 2016 09:50:36 +0000
Subject: [PATCH] timers: Forward the wheel clock whenever possible
The wheel clock is stale when a CPU goes into a long idle sleep. This has the
side effect that timers which are queued end up in the outer wheel levels.
That results in coarser granularity.
To solve this, we keep track of the idle state and forward the wheel clock
whenever possible.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Arjan van de Ven <arjan(a)infradead.org>
Cc: Chris Mason <clm(a)fb.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Frederic Weisbecker <fweisbec(a)gmail.com>
Cc: George Spelvin <linux(a)sciencehorizons.net>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Len Brown <lenb(a)kernel.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: rt(a)linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.de
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 966a5a6fdd0a..f738251000fe 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -164,3 +164,4 @@ static inline void timers_update_migration(bool update_nohz) { }
DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
+void timer_clear_idle(void);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 69abc7bfe80f..5d81f9aa30d2 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -700,6 +700,12 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
delta = next_tick - basemono;
if (delta <= (u64)TICK_NSEC) {
tick.tv64 = 0;
+
+ /*
+ * Tell the timer code that the base is not idle, i.e. undo
+ * the effect of get_next_timer_interrupt():
+ */
+ timer_clear_idle();
/*
* We've not stopped the tick yet, and there's a timer in the
* next period, so no point in stopping it either, bail.
@@ -809,6 +815,12 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
tick_do_update_jiffies64(now);
cpu_load_update_nohz_stop();
+ /*
+ * Clear the timer idle flag, so we avoid IPIs on remote queueing and
+ * the clock forward checks in the enqueue path:
+ */
+ timer_clear_idle();
+
calc_load_exit_idle();
touch_softlockup_watchdog_sched();
/*
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 658051c97a3c..9339d71ee998 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -196,9 +196,11 @@ struct timer_base {
spinlock_t lock;
struct timer_list *running_timer;
unsigned long clk;
+ unsigned long next_expiry;
unsigned int cpu;
bool migration_enabled;
bool nohz_active;
+ bool is_idle;
DECLARE_BITMAP(pending_map, WHEEL_SIZE);
struct hlist_head vectors[WHEEL_SIZE];
} ____cacheline_aligned;
@@ -519,24 +521,37 @@ static void internal_add_timer(struct timer_base *base, struct timer_list *timer
{
__internal_add_timer(base, timer);
+ if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+ return;
+
/*
- * Check whether the other CPU is in dynticks mode and needs
- * to be triggered to reevaluate the timer wheel. We are
- * protected against the other CPU fiddling with the timer by
- * holding the timer base lock. This also makes sure that a
- * CPU on the way to stop its tick can not evaluate the timer
- * wheel.
- *
- * Spare the IPI for deferrable timers on idle targets though.
- * The next busy ticks will take care of it. Except full dynticks
- * require special care against races with idle_cpu(), lets deal
- * with that later.
+ * TODO: This wants some optimizing similar to the code below, but we
+ * will do that when we switch from push to pull for deferrable timers.
*/
- if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active) {
- if (!(timer->flags & TIMER_DEFERRABLE) ||
- tick_nohz_full_cpu(base->cpu))
+ if (timer->flags & TIMER_DEFERRABLE) {
+ if (tick_nohz_full_cpu(base->cpu))
wake_up_nohz_cpu(base->cpu);
+ return;
}
+
+ /*
+ * We might have to IPI the remote CPU if the base is idle and the
+ * timer is not deferrable. If the other CPU is on the way to idle
+ * then it can't set base->is_idle as we hold the base lock:
+ */
+ if (!base->is_idle)
+ return;
+
+ /* Check whether this is the new first expiring timer: */
+ if (time_after_eq(timer->expires, base->next_expiry))
+ return;
+
+ /*
+ * Set the next expiry time and kick the CPU so it can reevaluate the
+ * wheel:
+ */
+ base->next_expiry = timer->expires;
+ wake_up_nohz_cpu(base->cpu);
}
#ifdef CONFIG_TIMER_STATS
@@ -844,10 +859,11 @@ static inline struct timer_base *get_timer_base(u32 tflags)
return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
}
-static inline struct timer_base *get_target_base(struct timer_base *base,
- unsigned tflags)
+#ifdef CONFIG_NO_HZ_COMMON
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
{
-#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+#ifdef CONFIG_SMP
if ((tflags & TIMER_PINNED) || !base->migration_enabled)
return get_timer_this_cpu_base(tflags);
return get_timer_cpu_base(tflags, get_nohz_timer_target());
@@ -856,6 +872,43 @@ static inline struct timer_base *get_target_base(struct timer_base *base,
#endif
}
+static inline void forward_timer_base(struct timer_base *base)
+{
+ /*
+ * We only forward the base when it's idle and we have a delta between
+ * base clock and jiffies.
+ */
+ if (!base->is_idle || (long) (jiffies - base->clk) < 2)
+ return;
+
+ /*
+ * If the next expiry value is > jiffies, then we fast forward to
+ * jiffies otherwise we forward to the next expiry value.
+ */
+ if (time_after(base->next_expiry, jiffies))
+ base->clk = jiffies;
+ else
+ base->clk = base->next_expiry;
+}
+#else
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
+{
+ return get_timer_this_cpu_base(tflags);
+}
+
+static inline void forward_timer_base(struct timer_base *base) { }
+#endif
+
+static inline struct timer_base *
+get_target_base(struct timer_base *base, unsigned tflags)
+{
+ struct timer_base *target = __get_target_base(base, tflags);
+
+ forward_timer_base(target);
+ return target;
+}
+
/*
* We are using hashed locking: Holding per_cpu(timer_bases[x]).lock means
* that all timers which are tied to this base are locked, and the base itself
@@ -1417,16 +1470,49 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
spin_lock(&base->lock);
nextevt = __next_timer_interrupt(base);
- spin_unlock(&base->lock);
+ base->next_expiry = nextevt;
+ /*
+ * We have a fresh next event. Check whether we can forward the base:
+ */
+ if (time_after(nextevt, jiffies))
+ base->clk = jiffies;
+ else if (time_after(nextevt, base->clk))
+ base->clk = nextevt;
- if (time_before_eq(nextevt, basej))
+ if (time_before_eq(nextevt, basej)) {
expires = basem;
- else
+ base->is_idle = false;
+ } else {
expires = basem + (nextevt - basej) * TICK_NSEC;
+ /*
+ * If we expect to sleep more than a tick, mark the base idle:
+ */
+ if ((expires - basem) > TICK_NSEC)
+ base->is_idle = true;
+ }
+ spin_unlock(&base->lock);
return cmp_next_hrtimer_event(basem, expires);
}
+/**
+ * timer_clear_idle - Clear the idle state of the timer base
+ *
+ * Called with interrupts disabled
+ */
+void timer_clear_idle(void)
+{
+ struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
+
+ /*
+ * We do this unlocked. The worst outcome is a remote enqueue sending
+ * a pointless IPI, but taking the lock would just make the window for
+ * sending the IPI a few instructions smaller for the cost of taking
+ * the lock in the exit from idle path.
+ */
+ base->is_idle = false;
+}
+
static int collect_expired_timers(struct timer_base *base,
struct hlist_head *heads)
{
@@ -1440,7 +1526,7 @@ static int collect_expired_timers(struct timer_base *base,
/*
* If the next timer is ahead of time forward to current
- * jiffies, otherwise forward to the next expiry time.
+ * jiffies, otherwise forward to the next expiry time:
*/
if (time_after(next, jiffies)) {
/* The call site will increment clock! */
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 517a660b46de - regmap: debugfs: Don't sleep while atomic for fast_io regmaps
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - DaCapo Benchmark Suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
🚧 ⚡⚡⚡ kdump - file-load
Host 3:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ power-management: cpupower/sanity test
🚧 ✅ Storage blktests
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
[ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]
task_h_load() can return 0 in some situations like running stress-ng
mmapfork, which forks thousands of threads, in a sched group on a 224 cores
system. The load balance doesn't handle this correctly because
env->imbalance never decreases and it will stop pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance will be decreased by at least 1.
misfit task is the other feature that doesn't handle correctly such
situation although it's probably more difficult to face the problem
because of the smaller number of CPUs and running tasks on heterogenous
system.
We can't simply ensure that task_h_load() returns at least one because it
would imply to handle underflow in other places.
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider(a)arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: <stable(a)vger.kernel.org> # v5.4
cc: Sasha Levin <sashal(a)kernel.org>
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
---
kernel/sched/fair.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2f81e4ae844e..9b16080093be 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3824,7 +3824,11 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
return;
}
- rq->misfit_task_load = task_h_load(p);
+ /*
+ * Make sure that misfit_task_load will not be null even if
+ * task_h_load() returns 0.
+ */
+ rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
}
#else /* CONFIG_SMP */
@@ -7407,7 +7411,15 @@ static int detach_tasks(struct lb_env *env)
if (!can_migrate_task(p, env))
goto next;
- load = task_h_load(p);
+ /*
+ * Depending of the number of CPUs and tasks and the
+ * cgroup hierarchy, task_h_load() can return a null
+ * value. Make sure that env->imbalance decreases
+ * otherwise detach_tasks() will stop only after
+ * detaching up to loop_max tasks.
+ */
+ load = max_t(unsigned long, task_h_load(p), 1);
+
if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
goto next;
--
2.17.1
This is a note to let you know that I've just added the patch titled
staging: comedi: addi_apci_1564: check INSN_CONFIG_DIGITAL_TRIG shift
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 926234f1b8434c4409aa4c53637aa3362ca07cea Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:56 +0100
Subject: staging: comedi: addi_apci_1564: check INSN_CONFIG_DIGITAL_TRIG shift
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this.
Fixes: 1e15687ea472 ("staging: comedi: addi_apci_1564: add Change-of-State interrupt subdevice and required functions")
Cc: <stable(a)vger.kernel.org> #3.17+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-4-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.../staging/comedi/drivers/addi_apci_1564.c | 20 +++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/staging/comedi/drivers/addi_apci_1564.c b/drivers/staging/comedi/drivers/addi_apci_1564.c
index 10501fe6bb25..1268ba34be5f 100644
--- a/drivers/staging/comedi/drivers/addi_apci_1564.c
+++ b/drivers/staging/comedi/drivers/addi_apci_1564.c
@@ -331,14 +331,22 @@ static int apci1564_cos_insn_config(struct comedi_device *dev,
unsigned int *data)
{
struct apci1564_private *devpriv = dev->private;
- unsigned int shift, oldmask;
+ unsigned int shift, oldmask, himask, lomask;
switch (data[0]) {
case INSN_CONFIG_DIGITAL_TRIG:
if (data[1] != 0)
return -EINVAL;
shift = data[3];
- oldmask = (1U << shift) - 1;
+ if (shift < 32) {
+ oldmask = (1U << shift) - 1;
+ himask = data[4] << shift;
+ lomask = data[5] << shift;
+ } else {
+ oldmask = 0xffffffffu;
+ himask = 0;
+ lomask = 0;
+ }
switch (data[2]) {
case COMEDI_DIGITAL_TRIG_DISABLE:
devpriv->ctrl = 0;
@@ -362,8 +370,8 @@ static int apci1564_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
case COMEDI_DIGITAL_TRIG_ENABLE_LEVELS:
if (devpriv->ctrl != (APCI1564_DI_IRQ_ENA |
@@ -380,8 +388,8 @@ static int apci1564_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
default:
return -EINVAL;
--
2.27.0
This is a note to let you know that I've just added the patch titled
staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From fc846e9db67c7e808d77bf9e2ef3d49e3820ce5d Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:57 +0100
Subject: staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this, adjusting the checks
for invalid channels so that enabled channel bits that would have been
lost by shifting are also checked for validity. Only channels 0 to 15
are valid.
Fixes: a8c66b684efaf ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions")
Cc: <stable(a)vger.kernel.org> #4.0+: ef75e14a6c93: staging: comedi: verify array index is correct before using it
Cc: <stable(a)vger.kernel.org> #4.0+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-5-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.../staging/comedi/drivers/addi_apci_1500.c | 24 +++++++++++++++----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/staging/comedi/drivers/addi_apci_1500.c b/drivers/staging/comedi/drivers/addi_apci_1500.c
index 689acd69a1b9..816dd25b9d0e 100644
--- a/drivers/staging/comedi/drivers/addi_apci_1500.c
+++ b/drivers/staging/comedi/drivers/addi_apci_1500.c
@@ -452,13 +452,14 @@ static int apci1500_di_cfg_trig(struct comedi_device *dev,
struct apci1500_private *devpriv = dev->private;
unsigned int trig = data[1];
unsigned int shift = data[3];
- unsigned int hi_mask = data[4] << shift;
- unsigned int lo_mask = data[5] << shift;
- unsigned int chan_mask = hi_mask | lo_mask;
- unsigned int old_mask = (1 << shift) - 1;
+ unsigned int hi_mask;
+ unsigned int lo_mask;
+ unsigned int chan_mask;
+ unsigned int old_mask;
unsigned int pm;
unsigned int pt;
unsigned int pp;
+ unsigned int invalid_chan;
if (trig > 1) {
dev_dbg(dev->class_dev,
@@ -466,7 +467,20 @@ static int apci1500_di_cfg_trig(struct comedi_device *dev,
return -EINVAL;
}
- if (chan_mask > 0xffff) {
+ if (shift <= 16) {
+ hi_mask = data[4] << shift;
+ lo_mask = data[5] << shift;
+ old_mask = (1U << shift) - 1;
+ invalid_chan = (data[4] | data[5]) >> (16 - shift);
+ } else {
+ hi_mask = 0;
+ lo_mask = 0;
+ old_mask = 0xffff;
+ invalid_chan = data[4] | data[5];
+ }
+ chan_mask = hi_mask | lo_mask;
+
+ if (invalid_chan) {
dev_dbg(dev->class_dev, "invalid digital trigger channel\n");
return -EINVAL;
}
--
2.27.0
This is a note to let you know that I've just added the patch titled
staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 0bd0db42a030b75c20028c7ba6e327b9cb554116 Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:55 +0100
Subject: staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this.
Fixes: 33cdce6293dcc ("staging: comedi: addi_apci_1032: conform to new INSN_CONFIG_DIGITAL_TRIG")
Cc: <stable(a)vger.kernel.org> #3.8+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-3-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.../staging/comedi/drivers/addi_apci_1032.c | 20 +++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/staging/comedi/drivers/addi_apci_1032.c b/drivers/staging/comedi/drivers/addi_apci_1032.c
index 560649be9d13..e035c9f757a1 100644
--- a/drivers/staging/comedi/drivers/addi_apci_1032.c
+++ b/drivers/staging/comedi/drivers/addi_apci_1032.c
@@ -106,14 +106,22 @@ static int apci1032_cos_insn_config(struct comedi_device *dev,
unsigned int *data)
{
struct apci1032_private *devpriv = dev->private;
- unsigned int shift, oldmask;
+ unsigned int shift, oldmask, himask, lomask;
switch (data[0]) {
case INSN_CONFIG_DIGITAL_TRIG:
if (data[1] != 0)
return -EINVAL;
shift = data[3];
- oldmask = (1U << shift) - 1;
+ if (shift < 32) {
+ oldmask = (1U << shift) - 1;
+ himask = data[4] << shift;
+ lomask = data[5] << shift;
+ } else {
+ oldmask = 0xffffffffu;
+ himask = 0;
+ lomask = 0;
+ }
switch (data[2]) {
case COMEDI_DIGITAL_TRIG_DISABLE:
devpriv->ctrl = 0;
@@ -136,8 +144,8 @@ static int apci1032_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
case COMEDI_DIGITAL_TRIG_ENABLE_LEVELS:
if (devpriv->ctrl != (APCI1032_CTRL_INT_ENA |
@@ -154,8 +162,8 @@ static int apci1032_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
default:
return -EINVAL;
--
2.27.0
This is a note to let you know that I've just added the patch titled
staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From f07804ec77d77f8a9dcf570a24154e17747bc82f Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:54 +0100
Subject: staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support
`ni6527_intr_insn_config()` processes `INSN_CONFIG` comedi instructions
for the "interrupt" subdevice. When `data[0]` is
`INSN_CONFIG_DIGITAL_TRIG` it is configuring the digital trigger. When
`data[2]` is `COMEDI_DIGITAL_TRIG_ENABLE_EDGES` it is configuring rising
and falling edge detection for the digital trigger, using a base channel
number (or shift amount) in `data[3]`, a rising edge bitmask in
`data[4]` and falling edge bitmask in `data[5]`.
If the base channel number (shift amount) is greater than or equal to
the number of channels (24) of the digital input subdevice, there are no
changes to the rising and falling edges, so the mask of channels to be
changed can be set to 0, otherwise the mask of channels to be changed,
and the rising and falling edge bitmasks are shifted by the base channel
number before calling `ni6527_set_edge_detection()` to change the
appropriate registers. Unfortunately, the code is comparing the base
channel (shift amount) to the interrupt subdevice's number of channels
(1) instead of the digital input subdevice's number of channels (24).
Fix it by comparing to 32 because all shift amounts for an `unsigned
int` must be less than that and everything from bit 24 upwards is
ignored by `ni6527_set_edge_detection()` anyway.
Fixes: 110f9e687c1a8 ("staging: comedi: ni_6527: support INSN_CONFIG_DIGITAL_TRIG")
Cc: <stable(a)vger.kernel.org> # 3.17+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-2-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/comedi/drivers/ni_6527.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/comedi/drivers/ni_6527.c b/drivers/staging/comedi/drivers/ni_6527.c
index 4d1eccb5041d..4518c2680b7c 100644
--- a/drivers/staging/comedi/drivers/ni_6527.c
+++ b/drivers/staging/comedi/drivers/ni_6527.c
@@ -332,7 +332,7 @@ static int ni6527_intr_insn_config(struct comedi_device *dev,
case COMEDI_DIGITAL_TRIG_ENABLE_EDGES:
/* check shift amount */
shift = data[3];
- if (shift >= s->n_chan) {
+ if (shift >= 32) {
mask = 0;
rising = 0;
falling = 0;
--
2.27.0
From: Michael Trimarchi <michael(a)amarulasolutions.com>
The current pin muxing scheme muxes GPIO_1 pad for USB_OTG_ID
because of which when card is inserted, usb otg is enumerated
and the card is never detected.
[ 64.492645] cfg80211: failed to load regulatory.db
[ 64.492657] imx-sdma 20ec000.sdma: external firmware not found, using ROM firmware
[ 76.343711] ci_hdrc ci_hdrc.0: EHCI Host Controller
[ 76.349742] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 2
[ 76.388862] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[ 76.396650] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[ 76.405412] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 76.412763] usb usb2: Product: EHCI Host Controller
[ 76.417666] usb usb2: Manufacturer: Linux 5.8.0-rc1-next-20200618 ehci_hcd
[ 76.424623] usb usb2: SerialNumber: ci_hdrc.0
[ 76.431755] hub 2-0:1.0: USB hub found
[ 76.435862] hub 2-0:1.0: 1 port detected
The TRM mentions GPIO_1 pad should be muxed/assigned for card detect
and ENET_RX_ER pad for USB_OTG_ID for proper operation.
This patch fixes pin muxing as per TRM and is tested on a
i.Core 1.5 MX6 DL SOM.
[ 22.449165] mmc0: host does not support reading read-only switch, assuming write-enable
[ 22.459992] mmc0: new high speed SDHC card at address 0001
[ 22.469725] mmcblk0: mmc0:0001 EB1QT 29.8 GiB
[ 22.478856] mmcblk0: p1 p2
Fixes: 6df11287f7c9 ("ARM: dts: imx6q: Add Engicam i.CoreM6 Quad/Dual initial support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Trimarchi <michael(a)amarulasolutions.com>
Signed-off-by: Suniel Mahesh <sunil(a)amarulasolutions.com>
---
Changes for v3:
- Changed subject of the patch, added fixes tag and copied stable kernel
as suggested by Shawn Guo.
Changes for v2:
- Changed patch description as suggested by Michael Trimarchi to make it
more readable/understandable.
NOTE:
- patch tested on i.Core 1.5 MX6 DL
---
arch/arm/boot/dts/imx6qdl-icore.dtsi | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/imx6qdl-icore.dtsi b/arch/arm/boot/dts/imx6qdl-icore.dtsi
index f2f475e..23c318d 100644
--- a/arch/arm/boot/dts/imx6qdl-icore.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-icore.dtsi
@@ -398,7 +398,7 @@
pinctrl_usbotg: usbotggrp {
fsl,pins = <
- MX6QDL_PAD_GPIO_1__USB_OTG_ID 0x17059
+ MX6QDL_PAD_ENET_RX_ER__USB_OTG_ID 0x17059
>;
};
@@ -410,6 +410,7 @@
MX6QDL_PAD_SD1_DAT1__SD1_DATA1 0x17070
MX6QDL_PAD_SD1_DAT2__SD1_DATA2 0x17070
MX6QDL_PAD_SD1_DAT3__SD1_DATA3 0x17070
+ MX6QDL_PAD_GPIO_1__GPIO1_IO01 0x1b0b0
>;
};
--
2.7.4