Linux-stable-mirror June 2019

linux-stable-mirror@lists.linaro.org

338 participants
1243 discussions

[PATCH v2] riscv: mm: synchronize MMU after pte change

by ShihPo Hung

Because RISC-V compliant implementations can cache invalid entries in TLB, an SFENCE.VMA is necessary after changes to the page table. This patch adds an SFENCE.vma for the vmalloc_fault path. Signed-off-by: ShihPo Hung <shihpo.hung(a)sifive.com> Cc: Palmer Dabbelt <palmer(a)sifive.com> Cc: Albert Ou <aou(a)eecs.berkeley.edu> Cc: Paul Walmsley <paul.walmsley(a)sifive.com> Cc: linux-riscv(a)lists.infradead.org Cc: stable(a)vger.kernel.org --- arch/riscv/mm/fault.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 88401d5..a1c7b39 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -29,6 +29,7 @@ #include <asm/pgalloc.h> #include <asm/ptrace.h> +#include <asm/tlbflush.h> /* * This routine handles page faults. It determines the address and the @@ -281,6 +282,16 @@ asmlinkage void do_page_fault(struct pt_regs *regs) pte_k = pte_offset_kernel(pmd_k, addr); if (!pte_present(*pte_k)) goto no_context; + + /* + * The kernel assumes that TLBs don't cache invalid entries, but + * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a + * cache flush; it is necessary even after writing invalid entries. + * Relying on flush_tlb_fix_spurious_fault would suffice, but + * the extra traps reduce performance. So, eagerly SFENCE.VMA. + */ + local_flush_tlb_page(addr); + return; } } -- 2.7.4

6 years, 6 months

[PATCH v2] scsi: ufs: Avoid runtime suspend possibly being blocked forever

by Stanley Chu

UFS runtime suspend can be triggered after pm_runtime_enable() is invoked in ufshcd_pltfrm_init(). However if the first runtime suspend is triggered before binding ufs_hba structure to ufs device structure via platform_set_drvdata(), then UFS runtime suspend will be no longer triggered in the future because its dev->power.runtime_error was set in the first triggering and does not have any chance to be cleared. To be more clear, dev->power.runtime_error is set if hba is NULL in ufshcd_runtime_suspend() which returns -EINVAL to rpm_callback() where dev->power.runtime_error is set as -EINVAL. In this case, any future rpm_suspend() for UFS device fails because rpm_check_suspend_allowed() fails due to non-zero dev->power.runtime_error. To resolve this issue, make sure the first UFS runtime suspend get valid "hba" in ufshcd_runtime_suspend(): Enable UFS runtime PM only after hba is successfully bound to UFS device structure. Fixes: 62694735ca95 ([SCSI] ufs: Add runtime PM support for UFS host controller driver) Cc: stable(a)vger.kernel.org Signed-off-by: Stanley Chu <stanley.chu(a)mediatek.com> --- drivers/scsi/ufs/ufshcd-pltfrm.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/drivers/scsi/ufs/ufshcd-pltfrm.c b/drivers/scsi/ufs/ufshcd-pltfrm.c index 8a74ec30c3d2..d7d521b394c3 100644 --- a/drivers/scsi/ufs/ufshcd-pltfrm.c +++ b/drivers/scsi/ufs/ufshcd-pltfrm.c @@ -430,24 +430,21 @@ int ufshcd_pltfrm_init(struct platform_device *pdev, goto dealloc_host; } - pm_runtime_set_active(&pdev->dev); - pm_runtime_enable(&pdev->dev); - ufshcd_init_lanes_per_dir(hba); err = ufshcd_init(hba, mmio_base, irq); if (err) { dev_err(dev, "Initialization failed\n"); - goto out_disable_rpm; + goto dealloc_host; } platform_set_drvdata(pdev, hba); + pm_runtime_set_active(&pdev->dev); + pm_runtime_enable(&pdev->dev); + return 0; -out_disable_rpm: - pm_runtime_disable(&pdev->dev); - pm_runtime_set_suspended(&pdev->dev); dealloc_host: ufshcd_dealloc_host(hba); out: -- 2.18.0

6 years, 6 months

[PATCH v2 0/7] NCR5380 drivers: fixes and other improvements

by Finn Thain

Among other improvements, this patch series fixes a data corruption bug in the mac_scsi driver and a bug in the EH abort routine in the core 5380 driver. For consistency I have ignored certain checkpatch.pl complaints about the indentation in mac_scsi.c. The remaining complaints seem to be false positives. Some of these patches are not trivial to backport. Those patches have been nominated for recent -stable branches only. Changed since v1: - Added acked-by tag. - Dropped new mac_pdma.h file. Finn Thain (7): Revert "scsi: ncr5380: Increase register polling limit" scsi: NCR5380: Always re-enable reselection interrupt scsi: NCR5380: Handle PDMA failure reliably scsi: mac_scsi: Increase PIO/PDMA transfer length threshold scsi: mac_scsi: Fix pseudo DMA implementation, take 2 scsi: mac_scsi: Enable PDMA on Mac IIfx scsi: mac_scsi: Treat Last Byte Sent time-out as failure arch/m68k/mac/config.c | 10 +- drivers/scsi/NCR5380.c | 18 +- drivers/scsi/NCR5380.h | 2 +- drivers/scsi/mac_scsi.c | 421 +++++++++++++++++++++++++--------------- 4 files changed, 273 insertions(+), 178 deletions(-) -- 2.21.0

6 years, 6 months

+ mm-idle-page-fix-oops-because-end_pfn-is-larger-than-max_pfn.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled Subject: mm/page_idle.c: fix oops because end_pfn is larger than max_pfn has been added to the -mm tree. Its filename is mm-idle-page-fix-oops-because-end_pfn-is-larger-than-max_pfn.patch This patch should soon appear at http://ozlabs.org/~akpm/mmots/broken-out/mm-idle-page-fix-oops-because-end_… and later at http://ozlabs.org/~akpm/mmotm/broken-out/mm-idle-page-fix-oops-because-end_… Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next and is updated there every 3-4 working days ------------------------------------------------------ From: Colin Ian King <colin.king(a)canonical.com> Subject: mm/page_idle.c: fix oops because end_pfn is larger than max_pfn Currently the calcuation of end_pfn can round up the pfn number to more than the actual maximum number of pfns, causing an Oops. Fix this by ensuring end_pfn is never more than max_pfn. This can be easily triggered when on systems where the end_pfn gets rounded up to more than max_pfn using the idle-page stress-ng stress test: sudo stress-ng --idle-page 0 [ 3812.222790] BUG: unable to handle kernel paging request at 00000000000020d8 [ 3812.224341] #PF error: [normal kernel read fault] [ 3812.225144] PGD 0 P4D 0 [ 3812.225626] Oops: 0000 [#1] SMP PTI [ 3812.226264] CPU: 1 PID: 11039 Comm: stress-ng-idle- Not tainted 5.0.0-5-generic #6-Ubuntu [ 3812.227643] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 3812.229286] RIP: 0010:page_idle_get_page+0xc8/0x1a0 [ 3812.230173] Code: 0f b1 0a 75 7d 48 8b 03 48 89 c2 48 c1 e8 33 83 e0 07 48 c1 ea 36 48 8d 0c 40 4c 8d 24 88 49 c1 e4 07 4c 03 24 d5 00 89 c3 be <49> 8b 44 24 58 48 8d b8 80 a1 02 00 e8 07 d5 77 00 48 8b 53 08 48 [ 3812.234641] RSP: 0018:ffffafd7c672fde8 EFLAGS: 00010202 [ 3812.235792] RAX: 0000000000000005 RBX: ffffe36341fff700 RCX: 000000000000000f [ 3812.237739] RDX: 0000000000000284 RSI: 0000000000000275 RDI: 0000000001fff700 [ 3812.239225] RBP: ffffafd7c672fe00 R08: ffffa0bc34056410 R09: 0000000000000276 [ 3812.241027] R10: ffffa0bc754e9b40 R11: ffffa0bc330f6400 R12: 0000000000002080 [ 3812.242555] R13: ffffe36341fff700 R14: 0000000000080000 R15: ffffa0bc330f6400 [ 3812.244073] FS: 00007f0ec1ea5740(0000) GS:ffffa0bc7db00000(0000) knlGS:0000000000000000 [ 3812.245968] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3812.247162] CR2: 00000000000020d8 CR3: 0000000077d68000 CR4: 00000000000006e0 [ 3812.249045] Call Trace: [ 3812.249625] page_idle_bitmap_write+0x8c/0x140 [ 3812.250567] sysfs_kf_bin_write+0x5c/0x70 [ 3812.251406] kernfs_fop_write+0x12e/0x1b0 [ 3812.252282] __vfs_write+0x1b/0x40 [ 3812.253002] vfs_write+0xab/0x1b0 [ 3812.253941] ksys_write+0x55/0xc0 [ 3812.254660] __x64_sys_write+0x1a/0x20 [ 3812.255446] do_syscall_64+0x5a/0x110 [ 3812.256254] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Link: http://lkml.kernel.org/r/20190618124352.28307-1-colin.king@canonical.com Fixes: 33c3fc71c8cf ("mm: introduce idle page tracking") Signed-off-by: Colin Ian King <colin.king(a)canonical.com> Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org> Cc: Michal Hocko <mhocko(a)suse.com> Cc: Mike Rapoport <rppt(a)linux.vnet.ibm.com> Cc: Mel Gorman <mgorman(a)techsingularity.net> Cc: Stephen Rothwell <sfr(a)canb.auug.org.au> Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com> Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/page_idle.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/mm/page_idle.c~mm-idle-page-fix-oops-because-end_pfn-is-larger-than-max_pfn +++ a/mm/page_idle.c @@ -136,7 +136,7 @@ static ssize_t page_idle_bitmap_read(str end_pfn = pfn + count * BITS_PER_BYTE; if (end_pfn > max_pfn) - end_pfn = ALIGN(max_pfn, BITMAP_CHUNK_BITS); + end_pfn = max_pfn; for (; pfn < end_pfn; pfn++) { bit = pfn % BITMAP_CHUNK_BITS; @@ -181,7 +181,7 @@ static ssize_t page_idle_bitmap_write(st end_pfn = pfn + count * BITS_PER_BYTE; if (end_pfn > max_pfn) - end_pfn = ALIGN(max_pfn, BITMAP_CHUNK_BITS); + end_pfn = max_pfn; for (; pfn < end_pfn; pfn++) { bit = pfn % BITMAP_CHUNK_BITS; _ Patches currently in -mm which might be from colin.king(a)canonical.com are mm-idle-page-fix-oops-because-end_pfn-is-larger-than-max_pfn.patch scripts-spellingtxt-add-more-spellings-to-spellingtxt.patch z3fold-add-inter-page-compaction-fix.patch coda-clean-up-indentation-replace-spaces-with-tab.patch

6 years, 6 months

[PATCH v3] Documentation: Add section about CPU vulnerabilities for Spectre

by Tim Chen

Add documentation for Spectre vulnerability and the mitigation mechanisms: - Explain the problem and risks - Document the mitigation mechanisms - Document the command line controls - Document the sysfs files Co-developed-by: Andi Kleen <ak(a)linux.intel.com> Signed-off-by: Andi Kleen <ak(a)linux.intel.com> Co-developed-by: Tim Chen <tim.c.chen(a)linux.intel.com> Signed-off-by: Tim Chen <tim.c.chen(a)linux.intel.com> Cc: stable(a)vger.kernel.org --- Documentation/admin-guide/hw-vuln/index.rst | 1 + Documentation/admin-guide/hw-vuln/spectre.rst | 651 ++++++++++++++++++ Documentation/userspace-api/spec_ctrl.rst | 2 + 3 files changed, 654 insertions(+) create mode 100644 Documentation/admin-guide/hw-vuln/spectre.rst diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst index ffc064c1ec68..49311f3da6f2 100644 --- a/Documentation/admin-guide/hw-vuln/index.rst +++ b/Documentation/admin-guide/hw-vuln/index.rst @@ -9,5 +9,6 @@ are configurable at compile, boot or run time. .. toctree:: :maxdepth: 1 + spectre l1tf mds diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst new file mode 100644 index 000000000000..dbcb356b2699 --- /dev/null +++ b/Documentation/admin-guide/hw-vuln/spectre.rst @@ -0,0 +1,651 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Spectre Side Channels +===================== + +Spectre is a class of side channel attacks that exploit branch prediction +and speculative execution on modern CPUs to read memory, possibly +bypassing access controls. Speculative execution side channel exploits +do not modify memory but attempt to infer privileged data in the memory. + +This document covers Spectre variant 1 and Spectre variant 2. + +Affected processors +------------------- + +Speculative execution side channel methods affect a wide range of modern +high performance processors, since most modern high speed processors +use branch prediction and speculative execution. + +The following CPUs are vulnerable: + + - Intel Core, Atom, Pentium, and Xeon processors + + - AMD Phenom, EPYC, and Zen processors + + - IBM POWER and zSeries processors + + - Higher end ARM processors + + - Apple CPUs + + - Higher end MIPS CPUs + + - Likely most other high performance CPUs. Contact your CPU vendor for details. + +Whether a processor is affected or not can be read out from the Spectre +vulnerability files in sysfs. See :ref:`spectre_sys_info`. + +Related CVEs +------------ + +The following CVE entries describe Spectre variants: + + ============= ======================= ================= + CVE-2017-5753 Bounds check bypass Spectre variant 1 + CVE-2017-5715 Branch target injection Spectre variant 2 + ============= ======================= ================= + +Problem +------- + +CPUs use speculative operations to improve performance. That may leave +traces of memory accesses or computations in the processor's caches, +buffers, and branch predictors. Malicious software may be able to +influence the speculative execution paths, and then use the side effects +of the speculative execution in the CPUs' caches and buffers to infer +privileged data touched during the speculative execution. + +Spectre variant 1 attacks take advantage of speculative execution of +conditional branches, while Spectre variant 2 attacks use speculative +execution of indirect branches to leak privileged memory. See [1] [5] +[7] [10] [11]. + +Spectre variant 1 (Bounds Check Bypass) +--------------------------------------- + +The bounds check bypass attack [2] takes advantage of speculative +execution that bypass conditional branch instructions used for memory +access bounds check (e.g. checking if the index of an array results in +memory access within a valid range). This results in memory accesses +to invalid memory (with out-of-bound index) that are done speculatively +before validation checks resolve. Such speculative memory accesses can +leave side effects, creating side channels which leak information to +the attacker. + +There are some extensions of Spectre variant 1 attacks for reading +data over the network, see [12]. However such attacks are difficult, +low bandwidth, fragile, and are considered low risk. + +Spectre variant 2 (Branch Target Injection) +------------------------------------------- + +The branch target injection attack takes advantage of speculative +execution of indirect branches [3]. The indirect branch predictors +inside the processor used to guess the target of indirect branches can +be influenced by an attacker, causing gadget code to be speculatively +executed, thus exposing sensitive data touched by the victim. The side +effects left in the CPU's caches during speculative execution can be +measured to infer data values. + +.. _poison_btb: + +In Spectre variant 2 attacks, the attacker can steer speculative indirect +branches in the victim to gadget code by poisoning the branch target +buffer of a CPU used for predicting indirect branch addresses. Such +poisoning could be done by indirect branching into existing code, with the +address offset of the indirect branch under the attacker's control. Since +the branch prediction hardware does not fully disambiguate branch address +and uses the offset for prediction, this could cause privileged code's +indirect branch to jump to a gadget code with the same offset. + +The most useful gadgets take an attacker-controlled input parameter (such +as a register value) so that the memory read can be controlled. Gadgets +without input parameters might be possible, but the attacker would have +very little control over what memory can be read, reducing the risk of +the attack revealing useful data. + +One other variant 2 attack vector is for the attacker to poison the +return stack buffer (RSB) [13] to cause speculative RET execution to go +to an gadget. An attacker's imbalanced CALL instructions might "poison" +entries in the return stack buffer which are later consumed by a victim's +RET instruction. This attack can be mitigated by flushing the return +stack buffer on context switch, or VM exit. + +On systems with simultaneous multi-threading (SMT), attacks are possible +from from the sibling thread, as level 1 cache and branch target buffer +(BTB) may be shared between hardware threads in a CPU core. A malicious +program running on the sibling thread may influence its peer's BTB to +steer its indirect branch speculations to gadget code, and measure the +speculative execution's side effects left in level 1 cache to infer the +victim's data. + +Attack scenarios +---------------- + +The following list of attack scenarios have been anticipated, but may +not cover all possible attack vectors. + +1. A user process attacking the kernel +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + + The attacker passes a parameter to the kernel via a register or + via a known address in memory during a syscall. Such parameter may + be used later by the kernel as an index to an array or to derive + a pointer for a Spectre variant 1 attack. The index or pointer + is invalid, but bound checks are bypassed in the code branch taken + for speculative execution. This could cause privileged memory to be + accessed and leaked. + + For kernel code that has been identified where data pointers could + potentially be influenced for Spectre attacks, new "nospec" accessor + macros are used to prevent speculative loading of data. + + Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch + target buffer (BTB) before issuing syscall to launch an attack. + After entering the kernel, the kernel could use the poisoned branch + target buffer on indirect jump and jump to gadget code in speculative + execution. + + If an attacker tries to control the memory addresses leaked during + speculative execution, he would also need to pass a parameter to the + gadget, either through a register or a known address in memory. After + the gadget has executed, he can measure the side effect. + + The kernel can protect itself against consuming poisoned branch + target buffer entries by using return trampolines (also known as + "retpoline") [3] [9] for all indirect branches. Return trampolines + trap speculative execution paths to prevent jumping to gadget code + during speculative execution. x86 CPUs with Enhanced Indirect + Branch Restricted Speculation (Enhanced IBRS) available in hardware + should use the feature to mitigate Spectre variant 2 instead of + retpoline. Enhanced IBRS is more efficient than retpoline. + + There may be gadget code in firmware which could be exploited with + Spectre variant 2 attack by a rogue user process. To mitigate such + attacks on x86, Indirect Branch Restricted Speculation (IBRS) feature + is turned on before the kernel invokes any firmware code. + +2. A user process attacking another user process +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + + A malicious user process can try to attack another user process, + either via a context switch on the same hardware thread, or from the + sibling hyperthread sharing a physical processor core on simultaneous + multi-threading (SMT) system. + + Spectre variant 1 attacks generally require passing parameters + between the processes, which needs a data passing relationship, such + as remote procedure calls (RPC). Those parameters are used in gadget + code to derive invalid data pointers accessing privileged memory in + the attacked process. + + Spectre variant 2 attacks can be launched from a rogue process by + :ref:`poisoning <poison_btb>` the branch target buffer. This can + influence the indirect branch targets for a victim process that either + runs later on the same hardware thread, or running concurrently on + a sibling hardware thread sharing the same physical core. + + On x86, a user process can protect itself against Spectre variant + 2 attacks by using the prctl() syscall to disable indirect branch + speculation for itself. An administrator can also cordon off an + unsafe process from polluting the branch target buffer by disabling the + process's indirect branch speculation. This comes with a performance + cost from not using indirect branch speculation and clearing the + branch target buffer. When SMT is enabled, for a process that has + indirect branch speculation disabled, Single Threaded Indirect Branch + Predictors (STIBP) [4] are turned on to prevent the sibling thread + from controlling branch target buffer. In addition, the Indirect + Branch Prediction Barrier (IBPB) is issued to clear the branch target + buffer when context switching to and from such process. + + On x86, the return stack buffer is stuffed on context switch. + This prevents the branch target buffer from being used for branch + prediction when the return stack buffer underflows while switching to + a deeper call stack. Any poisoned entries in the return stack buffer + left by the previous process will also be cleared. + + User programs should use address space randomization to make attacks + more difficult (Set /proc/sys/kernel/randomize_va_space = 1 or 2). + +3. A virtualized guest attacking the host +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + + The attack mechanism is similar to how user processes attack the + kernel. The kernel is entered via hyper-calls or other virtualization + exit paths. + + For Spectre variant 1 attacks, rogue guests can pass parameters + (e.g. in registers) via hyper-calls to derive invalid pointers to + speculate into privileged memory after entering the kernel. For places + where such kernel code has been identified, nospec accessor macros + are used to stop speculative memory access. + + For Spectre variant 2 attacks, rogue guests can :ref:`poison + <poison_btb>` the branch target buffer or return stack buffer, causing + the kernel to jump to gadget code in the speculative execution paths. + + To mitigate variant 2, the host kernel can use return trampolines + for indirect branches to bypass the poisoned branch target buffer, + and flushing the return stack buffer on VM exit. This prevents rogue + guests from affecting indirect branching in the host kernel. + + To protect host processes from rogue guests, host processes can have + indirect branch speculation disabled via prctl(). The branch target + buffer is cleared before context switching to such processes. + +4. A virtualized guest attacking other guest +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + + A rogue guest may attack another guest to get data accessible by the + other guest. + + Spectre variant 1 attacks are possible if parameters can be passed + between guests. This may be done via mechanisms such as shared memory + or message passing. Such parameters could be used to derive data + pointers to privileged data in guest. The privileged data could be + accessed by gadget code in the victim's speculation paths. + + Spectre variant 2 attacks can be launched from a rogue guest by + :ref:`poisoning <poison_btb>` the branch target buffer or the return + stack buffer. Such poisoned entries could be used to influence + speculation execution paths in the victim guest. + + Linux kernel mitigates attacks to other guests running in the same + CPU hardware thread by flushing the return stack buffer on VM exit, + and clearing the branch target buffer before switching to a new guest. + + If SMT is used, Spectre variant 2 attacks from an untrusted guest + in the sibling hyperthread can be mitigated by the administrator, + by turning off the unsafe guest's indirect branch speculation via + prctl(). A guest can also protect itself by turning on microcode + based mitigations (such as IBPB or STIBP on x86) within the guest. + +.. _spectre_sys_info: + +Spectre system information +-------------------------- + +The Linux kernel provides a sysfs interface to enumerate the current +mitigation status of the system for Spectre: whether the system is +vulnerable, and which mitigations are active. + +The sysfs file showing Spectre variant 1 mitigation status is: + + /sys/devices/system/cpu/vulnerabilities/spectre_v1 + +The possible values in this file are: + + ======================================= ================================= + 'Mitigation: __user pointer sanitation' Protection in kernel on a case by + case base with explicit pointer + sanitation. + ======================================= ================================= + +However, the protections are put in place on a case by case basis, +and there is no guarantee that all possible attack vectors for Spectre +variant 1 are covered. + +The spectre_v2 kernel file reports if the kernel has been compiled with +retpoline mitigation or if the CPU has hardware mitigation, and if the +CPU has support for additional process-specific mitigation. + +This file also reports CPU features enabled by microcode to mitigate +attack between user processes: + +1. Indirect Branch Prediction Barrier (IBPB) to add additional + isolation between processes of different users. +2. Single Thread Indirect Branch Predictors (STIBP) to add additional + isolation between CPU threads running on the same core. + +These CPU features may impact performance when used and can be enabled +per process on a case-by-case base. + +The sysfs file showing Spectre variant 2 mitigation status is: + + /sys/devices/system/cpu/vulnerabilities/spectre_v2 + +The possible values in this file are: + + - Kernel status: + + ==================================== ================================= + 'Not affected' The processor is not vulnerable + 'Vulnerable' Vulnerable, no mitigation + 'Mitigation: Full generic retpoline' Software-focused mitigation + 'Mitigation: Full AMD retpoline' AMD-specific software mitigation + 'Mitigation: Enhanced IBRS' Hardware-focused mitigation + ==================================== ================================= + + - Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is + used to protect against Spectre variant 2 attacks when calling firmware (x86 only). + + ========== ============================================================= + 'IBRS_FW' Protection against user program attacks when calling firmware + ========== ============================================================= + + - Indirect branch prediction barrier (IBPB) status for protection between + processes of different users. This feature can be controlled through + prctl() per process, or through kernel command line options. This is + an x86 only feature. For more details see below. + + =================== ======================================================== + 'IBPB: disabled' IBPB unused + 'IBPB: always-on' Use IBPB on all tasks + 'IBPB: conditional' Use IBPB on SECCOMP or indirect branch restricted tasks + =================== ======================================================== + + - Single threaded indirect branch prediction (STIBP) status for protection + between different hyper threads. This feature can be controlled through + prctl per process, or through kernel command line options. This is x86 + only feature. For more details see below. + + ==================== ======================================================== + 'STIBP: disabled' STIBP unused + 'STIBP: forced' Use STIBP on all tasks + 'STIBP: conditional' Use STIBP on SECCOMP or indirect branch restricted tasks + ==================== ======================================================== + + - Return stack buffer (RSB) protection status: + + ============= =========================================== + 'RSB filling' Protection of RSB on context switch enabled + ============= =========================================== + +Full mitigation might require an microcode update from the CPU +vendor. When the necessary microcode is not available, the kernel will +report vulnerability. + +Turning on mitigation for Spectre variant 1 and Spectre variant 2 +----------------------------------------------------------------- + +1. Kernel mitigation +^^^^^^^^^^^^^^^^^^^^ + + For the Spectre variant 1, vulnerable kernel code (as determined by + code audit or scanning tools) are annotated on a case by case basis to + use nospec accessor macros for bounds clipping [2] to avoid any usable + disclosure gadgets. However, it may not cover all attack vectors for + Spectre variant 1. + + For Spectre variant 2 mitigation, the compiler turns indirect calls or + jumps in the kernel into equivalent return trampolines (retpolines) + [3] [9] to go to the target addresses. Speculative execution paths + under retpolines are trapped in an infinite loop to prevent any + speculative execution jumping to a gadget. + + To turn on retpoline mitigation on a vulnerable CPU, the kernel + needs to be compiled with a gcc compiler that supports the + -mindirect-branch=thunk-extern -mindirect-branch-register options. + If the kernel is compiled with a Clang compiler, the compiler needs + to support -mretpoline-external-thunk option. The kernel config + CONFIG_RETPOLINE needs to be turned on, and the CPU needs to run with + the latest updated microcode. + + On Intel Skylake-era systems the mitigation covers most, but not all, + cases. See [3] for more details. + + On CPUs with hardware mitigation for Spectre variant 2 (e.g. Enhanced + IBRS on x86), retpoline is automatically disabled at run time. + + The retpoline mitigation is turned on by default on vulnerable + CPUs. It can be forced on or off by the administrator + via the kernel command line and sysfs control files. See + :ref:`spectre_mitigation_control_command_line`. + + On x86, indirect branch restricted speculation is turned on by default + before invoking any firmware code to prevent Spectre variant 2 exploits + using the firmware. + + Using kernel address space randomization (CONFIG_RANDOMIZE_SLAB=y + and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) makes + attacks on the kernel generally more difficult. + +2. User program mitigation +^^^^^^^^^^^^^^^^^^^^^^^^^^ + + User programs can mitigate Spectre variant 1 using LFENCE or "bounds + clipping". For more details see [2]. + + For Spectre variant 2 mitigation, individual user programs + can be compiled with return trampolines for indirect branches. + This protects them from consuming poisoned entries in the branch + target buffer left by malicious software. Alternatively, the + programs can disable their indirect branch speculation via prctl() + (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`) + On x86, this will turn on STIBP to guard against attacks from the + sibling thread when the user program is running, and use IBPB to + flush the branch target buffer when switching to/from the program. + + Restricting indirect branch speculation on a user program will + also prevent the program from launching a variant 2 attack + on x86. All sand-boxed SECCOMP programs have indirect branch + speculation restricted by default. Administrators can change + that behavior via the kernel command line and sysfs control files. + See :ref:`spectre_mitigation_control_command_line`. + + Programs that disable their indirect branch speculation will have + more overheads and run slower. + + User programs should use address space randomization + (/proc/sys/kernel/randomize_va_space = 1 or 2) to make attacks more + difficult. + +3. VM mitigation +^^^^^^^^^^^^^^^^ + + Within the kernel, Spectre variant 1 attacks from rogue guests are + mitigated on a case by case basis in VM exit paths. Vulnerable code + uses nospec accessor macros for "bounds clipping", to avoid any + usable disclosure gadgets. However, this may not cover all variant + 1 attack vectors. + + For Spectre variant 2 attacks from rogue guests to the kernel, the + Linux kernel uses retpoline or Enhanced IBRS to prevent consumption of + poisoned entries in branch target buffer left by rogue guests. It also + flushes the return stack buffer on every VM exit to prevent a return + stack buffer underflow so poisoned branch target buffer could be used, + or attacker guests leaving poisoned entries in the return stack buffer. + + To mitigate guest-to-guest attacks in the same CPU hardware thread, + the branch target buffer is sanitized by flushing before switching + to a new guest on a CPU. + + The above mitigations are turned on by default on vulnerable CPUs. + + To mitigate guest-to-guest attacks from sibling thread when SMT is + in use, an untrusted guest running in the sibling thread can have + its indirect branch speculation disabled by administrator via prctl(). + + The kernel also allows guests to use any microcode based mitigation + they chose to use (such as IBPB or STIBP on x86) to protect themselves. + +.. _spectre_mitigation_control_command_line: + +Mitigation control on the kernel command line +--------------------------------------------- + +Spectre variant 2 mitigation can be disabled or force enabled at the +kernel command line. + + nospectre_v2 [X86] Disable all mitigations for the Spectre variant 2 + (indirect branch prediction) vulnerability. System may + allow data leaks with this option, which is equivalent + to spectre_v2=off. + + + spectre_v2= [X86] Control mitigation of Spectre variant 2 + (indirect branch speculation) vulnerability. + The default operation protects the kernel from + user space attacks. + + on - unconditionally enable, implies + spectre_v2_user=on + off - unconditionally disable, implies + spectre_v2_user=off + auto - kernel detects whether your CPU model is + vulnerable + + Selecting 'on' will, and 'auto' may, choose a + mitigation method at run time according to the + CPU, the available microcode, the setting of the + CONFIG_RETPOLINE configuration option, and the + compiler with which the kernel was built. + + Selecting 'on' will also enable the mitigation + against user space to user space task attacks. + + Selecting 'off' will disable both the kernel and + the user space protections. + + Specific mitigations can also be selected manually: + + retpoline - replace indirect branches + retpoline,generic - google's original retpoline + retpoline,amd - AMD-specific minimal thunk + + Not specifying this option is equivalent to + spectre_v2=auto. + +For user space mitigation: + + spectre_v2_user= + [X86] Control mitigation of Spectre variant 2 + (indirect branch speculation) vulnerability between + user space tasks + + on - Unconditionally enable mitigations. Is + enforced by spectre_v2=on + + off - Unconditionally disable mitigations. Is + enforced by spectre_v2=off + + prctl - Indirect branch speculation is enabled, + but mitigation can be enabled via prctl + per thread. The mitigation control state + is inherited on fork. + + prctl,ibpb + - Like "prctl" above, but only STIBP is + controlled per thread. IBPB is issued + always when switching between different user + space processes. + + seccomp + - Same as "prctl" above, but all seccomp + threads will enable the mitigation unless + they explicitly opt out. + + seccomp,ibpb + - Like "seccomp" above, but only STIBP is + controlled per thread. IBPB is issued + always when switching between different + user space processes. + + auto - Kernel selects the mitigation depending on + the available CPU features and vulnerability. + + Default mitigation: + If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl" + + Not specifying this option is equivalent to + spectre_v2_user=auto. + + In general the kernel by default selects + reasonable mitigations for the current CPU. To + disable Spectre variant 2 mitigations boot with + spectre_v2=off. Spectre variant 1 mitigations + cannot be disabled. + +Mitigation selection guide +-------------------------- + +1. Trusted userspace +^^^^^^^^^^^^^^^^^^^^ + + If all userspace applications are from trusted sources and do not + execute externally supplied untrusted code, then the mitigations can + be disabled. + +2. Protect sensitive programs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + + For security-sensitive programs that have secrets (e.g. crypto + keys), protection against Spectre variant 2 can be put in place by + disabling indirect branch speculation when the program is running + (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`). + +3. Sandbox untrusted programs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + + Untrusted programs that could be a source of attacks can be cordoned + off by disabling their indirect branch speculation when they are run + (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`). + This prevents untrusted programs from polluting the branch target + buffer. All programs running in SECCOMP sandboxes have indirect + branch speculation restricted by default. This behavior can be + changed via the kernel command line and sysfs control files. See + :ref:`spectre_mitigation_control_command_line`. + +3. High security mode +^^^^^^^^^^^^^^^^^^^^^ + + All Spectre variant 2 mitigations can be forced on + at boot time for all programs (See the "on" option in + :ref:`spectre_mitigation_control_command_line`). This will add + overhead as indirect branch speculations for all programs will be + restricted. + + On x86, branch target buffer will be flushed with IBPB when switching + to a new program. STIBP is left on all the time to protect programs + against variant 2 attacks originating from programs running on + sibling threads. + + Alternatively, STIBP can be used only when running programs + whose indirect branch speculation is explicitly disabled, + while IBPB is still used all the time when switching to a new + program to clear the branch target buffer (See "ibpb" option in + :ref:`spectre_mitigation_control_command_line`). This "ibpb" option + has less performance cost than the "on" option, which leaves STIBP + on all the time. + +References on Spectre +--------------------- + +Intel white papers: + +[1] `Intel analysis of speculative execution side channels <https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analys…>`_. + +[2] `Bounds check bypass <https://software.intel.com/security-software-guidance/software-guidance/bou…>`_. + +[3] `Deep dive: Retpoline: A branch target injection mitigation <https://software.intel.com/security-software-guidance/insights/deep-dive-re…>`_. + +[4] `Deep Dive: Single Thread Indirect Branch Predictors <https://software.intel.com/security-software-guidance/insights/deep-dive-si…>`_. + +AMD white papers: + +[5] `AMD64 technology indirect branch control extension <https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Upda…>`_. + +[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesfo…>`_. + +ARM white papers: + +[7] `Cache speculation side-channels <https://developer.arm.com/support/arm-security-updates/speculative-processo…>`_. + +[8] `Cache speculation issues update <https://developer.arm.com/support/arm-security-updates/speculative-processo…>`_. + +Google white paper: + +[9] `Retpoline: a software construct for preventing branch-target-injection <https://support.google.com/faqs/answer/7625886>`_. + +MIPS white paper: + +[10] `MIPS: response on speculative execution and side channel vulnerabilities <https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-c…>`_. + +Academic papers: + +[11] `Spectre Attacks: Exploiting Speculative Execution <https://spectreattack.com/spectre.pdf>`_. + +[12] `NetSpectre: Read Arbitrary Memory over Network <https://arxiv.org/abs/1807.10535>`_. + +[13] `Spectre Returns! Speculation Attacks using the Return Stack Buffer <https://www.usenix.org/system/files/conference/woot18/woot18-paper-koruyeh.…>`_. diff --git a/Documentation/userspace-api/spec_ctrl.rst b/Documentation/userspace-api/spec_ctrl.rst index 1129c7550a48..7ddd8f667459 100644 --- a/Documentation/userspace-api/spec_ctrl.rst +++ b/Documentation/userspace-api/spec_ctrl.rst @@ -49,6 +49,8 @@ If PR_SPEC_PRCTL is set, then the per-task control of the mitigation is available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation misfeature will fail. +.. _set_spec_ctrl: + PR_SET_SPECULATION_CTRL ----------------------- -- 2.20.1

6 years, 6 months

crypto/NX: Set receive window credits to max number of CRBs in RxFIFO

by Haren Myneni

System gets checkstop if RxFIFO overruns with more requests than the maximum possible number of CRBs in FIFO at the same time. So find max CRBs from FIFO size and set it to receive window credits. CC: stable(a)vger.kernel.org # v4.14+ Signed-off-by:Haren Myneni <haren(a)us.ibm.com> diff --git a/drivers/crypto/nx/nx-842-powernv.c b/drivers/crypto/nx/nx-842-powernv.c index 4acbc47..e78ff5c 100644 --- a/drivers/crypto/nx/nx-842-powernv.c +++ b/drivers/crypto/nx/nx-842-powernv.c @@ -27,8 +27,6 @@ #define WORKMEM_ALIGN (CRB_ALIGN) #define CSB_WAIT_MAX (5000) /* ms */ #define VAS_RETRIES (10) -/* # of requests allowed per RxFIFO at a time. 0 for unlimited */ -#define MAX_CREDITS_PER_RXFIFO (1024) struct nx842_workmem { /* Below fields must be properly aligned */ @@ -812,7 +810,11 @@ static int __init vas_cfg_coproc_info(struct device_node *dn, int chip_id, rxattr.lnotify_lpid = lpid; rxattr.lnotify_pid = pid; rxattr.lnotify_tid = tid; - rxattr.wcreds_max = MAX_CREDITS_PER_RXFIFO; + /* + * Maximum RX window credits can not be more than #CRBs in + * RxFIFO. Otherwise, can get checkstop if RxFIFO overruns. + */ + rxattr.wcreds_max = fifo_size / CRB_SIZE; /* * Open a VAS receice window which is used to configure RxFIFO

6 years, 6 months

[PATCH stable v4.19.52] x86/resctrl: Don't stop walking closids when a locksetup group is found

by James Morse

commit 87d3aa28f345bea77c396855fa5d5fec4c24461f upstream. When a new control group is created __init_one_rdt_domain() walks all the other closids to calculate the sets of used and unused bits. If it discovers a pseudo_locksetup group, it breaks out of the loop. This means any later closid doesn't get its used bits added to used_b. These bits will then get set in unused_b, and added to the new control group's configuration, even if they were marked as exclusive for a later closid. When encountering a pseudo_locksetup group, we should continue. This is because "a resource group enters 'pseudo-locked' mode after the schemata is written while the resource group is in 'pseudo-locksetup' mode." When we find a pseudo_locksetup group, its configuration is expected to be overwritten, we can skip it. Fixes: dfe9674b04ff6 ("x86/intel_rdt: Enable entering of pseudo-locksetup mode") Signed-off-by: James Morse <james.morse(a)arm.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Acked-by: Reinette Chatre <reinette.chatre(a)intel.com> Cc: Fenghua Yu <fenghua.yu(a)intel.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: H Peter Avin <hpa(a)zytor.com> Cc: <stable(a)vger.kernel.org> Link: https://lkml.kernel.org/r/20190603172531.178830-1-james.morse@arm.com [Dropped comment due to lack space] Signed-off-by: James Morse <james.morse(a)arm.com> --- arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c index 643670fb8943..274d220d0a83 100644 --- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c +++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c @@ -2379,7 +2379,7 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp) if (closid_allocated(i) && i != closid) { mode = rdtgroup_mode_by_closid(i); if (mode == RDT_MODE_PSEUDO_LOCKSETUP) - break; + continue; used_b |= *ctrl; if (mode == RDT_MODE_SHAREABLE) d->new_ctrl |= *ctrl; -- 2.20.1

6 years, 6 months

[PATCH stable v5.1.11] x86/resctrl: Don't stop walking closids when a locksetup group is found

by James Morse

commit 87d3aa28f345bea77c396855fa5d5fec4c24461f upstream. When a new control group is created __init_one_rdt_domain() walks all the other closids to calculate the sets of used and unused bits. If it discovers a pseudo_locksetup group, it breaks out of the loop. This means any later closid doesn't get its used bits added to used_b. These bits will then get set in unused_b, and added to the new control group's configuration, even if they were marked as exclusive for a later closid. When encountering a pseudo_locksetup group, we should continue. This is because "a resource group enters 'pseudo-locked' mode after the schemata is written while the resource group is in 'pseudo-locksetup' mode." When we find a pseudo_locksetup group, its configuration is expected to be overwritten, we can skip it. Fixes: dfe9674b04ff6 ("x86/intel_rdt: Enable entering of pseudo-locksetup mode") Signed-off-by: James Morse <james.morse(a)arm.com> Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de> Acked-by: Reinette Chatre <reinette.chatre(a)intel.com> Cc: Fenghua Yu <fenghua.yu(a)intel.com> Cc: Borislav Petkov <bp(a)alien8.de> Cc: H Peter Avin <hpa(a)zytor.com> Cc: <stable(a)vger.kernel.org> Link: https://lkml.kernel.org/r/20190603172531.178830-1-james.morse@arm.com [Dropped comment due to lack of space] Signed-off-by: James Morse <james.morse(a)arm.com> --- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 85212a32b54d..c51b56e29948 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2556,7 +2556,7 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp) if (closid_allocated(i) && i != closid) { mode = rdtgroup_mode_by_closid(i); if (mode == RDT_MODE_PSEUDO_LOCKSETUP) - break; + continue; /* * If CDP is active include peer * domain's usage to ensure there -- 2.20.1

6 years, 6 months

patch "xhci: detect USB 3.2 capable host controllers correctly" added to usb-linus

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled xhci: detect USB 3.2 capable host controllers correctly to my usb git tree which can be found at git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git in the usb-linus branch. The patch will show up in the next release of the linux-next tree (usually sometime within the next 24 hours during the week.) The patch will hopefully also be merged in Linus's tree for the next -rc kernel release. If you have any questions about this process, please let me know. >From ddd57980a0fde30f7b5d14b888a2cc84d01610e8 Mon Sep 17 00:00:00 2001 From: Mathias Nyman <mathias.nyman(a)linux.intel.com> Date: Tue, 18 Jun 2019 17:27:48 +0300 Subject: xhci: detect USB 3.2 capable host controllers correctly USB 3.2 capability in a host can be detected from the xHCI Supported Protocol Capability major and minor revision fields. If major is 0x3 and minor 0x20 then the host is USB 3.2 capable. For USB 3.2 capable hosts set the root hub lane count to 2. The Major Revision and Minor Revision fields contain a BCD version number. The value of the Major Revision field is JJh and the value of the Minor Revision field is MNh for version JJ.M.N, where JJ = major revision number, M - minor version number, N = sub-minor version number, e.g. version 3.1 is represented with a value of 0310h. Also fix the extra whitespace printed out when announcing regular SuperSpeed hosts. Cc: <stable(a)vger.kernel.org> # v4.18+ Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/usb/host/xhci.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 78a2a937dd83..3f79f35d0b19 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -5065,16 +5065,26 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks) } else { /* * Some 3.1 hosts return sbrn 0x30, use xhci supported protocol - * minor revision instead of sbrn + * minor revision instead of sbrn. Minor revision is a two digit + * BCD containing minor and sub-minor numbers, only show minor. */ - minor_rev = xhci->usb3_rhub.min_rev; - if (minor_rev) { + minor_rev = xhci->usb3_rhub.min_rev / 0x10; + + switch (minor_rev) { + case 2: + hcd->speed = HCD_USB32; + hcd->self.root_hub->speed = USB_SPEED_SUPER_PLUS; + hcd->self.root_hub->rx_lanes = 2; + hcd->self.root_hub->tx_lanes = 2; + break; + case 1: hcd->speed = HCD_USB31; hcd->self.root_hub->speed = USB_SPEED_SUPER_PLUS; + break; } - xhci_info(xhci, "Host supports USB 3.%x %s SuperSpeed\n", + xhci_info(xhci, "Host supports USB 3.%x %sSuperSpeed\n", minor_rev, - minor_rev ? "Enhanced" : ""); + minor_rev ? "Enhanced " : ""); xhci->usb3_rhub.hcd = hcd; /* xHCI private pointer was set in xhci_pci_probe for the second -- 2.22.0

6 years, 6 months

patch "usb: xhci: Don't try to recover an endpoint if port is in error" added to usb-linus

by gregkh＠linuxfoundation.org

This is a note to let you know that I've just added the patch titled usb: xhci: Don't try to recover an endpoint if port is in error to my usb git tree which can be found at git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git in the usb-linus branch. The patch will show up in the next release of the linux-next tree (usually sometime within the next 24 hours during the week.) The patch will hopefully also be merged in Linus's tree for the next -rc kernel release. If you have any questions about this process, please let me know. >From b8c3b718087bf7c3c8e388eb1f72ac1108a4926e Mon Sep 17 00:00:00 2001 From: Mathias Nyman <mathias.nyman(a)linux.intel.com> Date: Tue, 18 Jun 2019 17:27:47 +0300 Subject: usb: xhci: Don't try to recover an endpoint if port is in error state. A USB3 device needs to be reset and re-enumarated if the port it connects to goes to a error state, with link state inactive. There is no use in trying to recover failed transactions by resetting endpoints at this stage. Tests show that in rare cases, after multiple endpoint resets of a roothub port the whole host controller might stop completely. Several retries to recover from transaction error can happen as it can take a long time before the hub thread discovers the USB3 port error and inactive link. We can't reliably detect the port error from slot or endpoint context due to a limitation in xhci, see xhci specs section 4.8.3: "There are several cases where the EP State field in the Output Endpoint Context may not reflect the current state of an endpoint" and "Software should maintain an accurate value for EP State, by tracking it with an internal variable that is driven by Events and Doorbell accesses" Same appears to be true for slot state. set a flag to the corresponding slot if a USB3 roothub port link goes inactive to prevent both queueing new URBs and resetting endpoints. Reported-by: Rapolu Chiranjeevi <chiranjeevi.rapolu(a)intel.com> Tested-by: Rapolu Chiranjeevi <chiranjeevi.rapolu(a)intel.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Mathias Nyman <mathias.nyman(a)linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> --- drivers/usb/host/xhci-ring.c | 15 ++++++++++++++- drivers/usb/host/xhci.c | 5 +++++ drivers/usb/host/xhci.h | 9 +++++++++ 3 files changed, 28 insertions(+), 1 deletion(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index feffceb31e8a..121782e22c01 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -1612,8 +1612,13 @@ static void handle_port_status(struct xhci_hcd *xhci, usb_hcd_resume_root_hub(hcd); } - if (hcd->speed >= HCD_USB3 && (portsc & PORT_PLS_MASK) == XDEV_INACTIVE) + if (hcd->speed >= HCD_USB3 && + (portsc & PORT_PLS_MASK) == XDEV_INACTIVE) { + slot_id = xhci_find_slot_id_by_port(hcd, xhci, hcd_portnum + 1); + if (slot_id && xhci->devs[slot_id]) + xhci->devs[slot_id]->flags |= VDEV_PORT_ERROR; bus_state->port_remote_wakeup &= ~(1 << hcd_portnum); + } if ((portsc & PORT_PLC) && (portsc & PORT_PLS_MASK) == XDEV_RESUME) { xhci_dbg(xhci, "port resume event for port %d\n", port_id); @@ -1801,6 +1806,14 @@ static void xhci_cleanup_halted_endpoint(struct xhci_hcd *xhci, { struct xhci_virt_ep *ep = &xhci->devs[slot_id]->eps[ep_index]; struct xhci_command *command; + + /* + * Avoid resetting endpoint if link is inactive. Can cause host hang. + * Device will be reset soon to recover the link so don't do anything + */ + if (xhci->devs[slot_id]->flags & VDEV_PORT_ERROR) + return; + command = xhci_alloc_command(xhci, false, GFP_ATOMIC); if (!command) return; diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 20db378a6012..78a2a937dd83 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1466,6 +1466,10 @@ static int xhci_urb_enqueue(struct usb_hcd *hcd, struct urb *urb, gfp_t mem_flag xhci_dbg(xhci, "urb submitted during PCI suspend\n"); return -ESHUTDOWN; } + if (xhci->devs[slot_id]->flags & VDEV_PORT_ERROR) { + xhci_dbg(xhci, "Can't queue urb, port error, link inactive\n"); + return -ENODEV; + } if (usb_endpoint_xfer_isoc(&urb->ep->desc)) num_tds = urb->number_of_packets; @@ -3754,6 +3758,7 @@ static int xhci_discover_or_reset_device(struct usb_hcd *hcd, } /* If necessary, update the number of active TTs on this root port */ xhci_update_tt_active_eps(xhci, virt_dev, old_active_eps); + virt_dev->flags = 0; ret = 0; command_cleanup: diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h index 7f8b950d1a73..92e764c54154 100644 --- a/drivers/usb/host/xhci.h +++ b/drivers/usb/host/xhci.h @@ -1010,6 +1010,15 @@ struct xhci_virt_device { u8 real_port; struct xhci_interval_bw_table *bw_table; struct xhci_tt_bw_info *tt_info; + /* + * flags for state tracking based on events and issued commands. + * Software can not rely on states from output contexts because of + * latency between events and xHC updating output context values. + * See xhci 1.1 section 4.8.3 for more details + */ + unsigned long flags; +#define VDEV_PORT_ERROR BIT(0) /* Port error, link inactive */ + /* The current max exit latency for the enabled USB3 link states. */ u16 current_mel; /* Used for the debugfs interfaces. */ -- 2.22.0

6 years, 6 months

Jump to page:

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror June 2019