From: Dexuan Cui <decui(a)microsoft.com>
The pci-hyperv driver's channel callback hv_pci_onchannelcallback() is not
really a hot path, so we don't need to mark it as a perf_device, meaning
with this patch all HV_PCIE channels' target_cpu will be CPU0.
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
Cc: stable(a)vger.kernel.org
Cc: Stephen Hemminger <sthemmin(a)microsoft.com>
Cc: K. Y. Srinivasan <kys(a)microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys(a)microsoft.com>
---
drivers/hv/channel_mgmt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index c6d9d19bc04e..ecc2bd275a73 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -71,7 +71,7 @@ static const struct vmbus_device vmbus_devs[] = {
/* PCIE */
{ .dev_type = HV_PCIE,
HV_PCIE_GUID,
- .perf_device = true,
+ .perf_device = false,
},
/* Synthetic Frame Buffer */
--
2.15.1
From: Thomas Petazzoni <thomas.petazzoni(a)free-electrons.com>
In other to mimic other PCIe host controller drivers, introduce an
advk_pcie_valid_device() helper, used in the configuration read/write
functions.
This patch by itself is not a fix, but it is required for a follow-up
patch that is a fix, hence the Fixes tag and the Cc to stable.
Fixes: 8c39d710363c1 ("PCI: aardvark: Add Aardvark PCI host controller driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni(a)free-electrons.com>
---
Changes since v2:
- New patch
---
drivers/pci/host/pci-aardvark.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/host/pci-aardvark.c b/drivers/pci/host/pci-aardvark.c
index b04d37b3c5de..ccd0304a0c21 100644
--- a/drivers/pci/host/pci-aardvark.c
+++ b/drivers/pci/host/pci-aardvark.c
@@ -430,6 +430,15 @@ static int advk_pcie_wait_pio(struct advk_pcie *pcie)
return -ETIMEDOUT;
}
+static int advk_pcie_valid_device(struct advk_pcie *pcie, struct pci_bus *bus,
+ int devfn)
+{
+ if (PCI_SLOT(devfn) != 0)
+ return false;
+
+ return true;
+}
+
static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
int where, int size, u32 *val)
{
@@ -437,7 +446,7 @@ static int advk_pcie_rd_conf(struct pci_bus *bus, u32 devfn,
u32 reg;
int ret;
- if (PCI_SLOT(devfn) != 0) {
+ if (!advk_pcie_valid_device(pcie, bus, devfn)) {
*val = 0xffffffff;
return PCIBIOS_DEVICE_NOT_FOUND;
}
@@ -491,7 +500,7 @@ static int advk_pcie_wr_conf(struct pci_bus *bus, u32 devfn,
int offset;
int ret;
- if (PCI_SLOT(devfn) != 0)
+ if (!advk_pcie_valid_device(pcie, bus, devfn))
return PCIBIOS_DEVICE_NOT_FOUND;
if (where % size)
--
2.14.3
The kernel documentation states that the locking of the irq-chip
registers should be handled by the irq-chip driver. In the irq-gic,
the accesses to the irqchip are seemingly not protected and multiple
writes to SPIs from different irq descriptors do RMW requests without
taking the irq-chip lock. When multiple irqs call the request_irq at
the same time, there can be a simultaneous write at the gic
distributor, leading to a race. Acquire the gic_lock when the
irq_type is updated.
Signed-off-by: Aniruddha Banerjee <aniruddha.nitd(a)gmail.com>
---
drivers/irqchip/irq-gic-common.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-common.c b/drivers/irqchip/irq-gic-common.c
index 9ae71804b5dd..73dd39959e6e 100644
--- a/drivers/irqchip/irq-gic-common.c
+++ b/drivers/irqchip/irq-gic-common.c
@@ -21,6 +21,8 @@
#include "irq-gic-common.h"
+static DEFINE_RAW_SPINLOCK(irq_controller_lock);
+
static const struct gic_kvm_info *gic_kvm_info;
const struct gic_kvm_info *gic_get_kvm_info(void)
@@ -57,6 +59,7 @@ int gic_configure_irq(unsigned int irq, unsigned int type,
* Read current configuration register, and insert the config
* for "irq", depending on "type".
*/
+ raw_spin_lock(&irq_controller_lock);
val = oldval = readl_relaxed(base + GIC_DIST_CONFIG + confoff);
if (type & IRQ_TYPE_LEVEL_MASK)
val &= ~confmask;
@@ -64,8 +67,10 @@ int gic_configure_irq(unsigned int irq, unsigned int type,
val |= confmask;
/* If the current configuration is the same, then we are done */
- if (val == oldval)
+ if (val == oldval) {
+ raw_spin_unlock(&irq_controller_lock);
return 0;
+ }
/*
* Write back the new configuration, and possibly re-enable
@@ -83,6 +88,7 @@ int gic_configure_irq(unsigned int irq, unsigned int type,
pr_warn("GIC: PPI%d is secure or misconfigured\n",
irq - 16);
}
+ raw_spin_unlock(&irq_controller_lock);
if (sync_access)
sync_access();
--
2.16.2
Driver currently crashes due to NULL pointer deference
while updating PHY tune register if nvmem cell is NULL.
Since, fused value for Tune1/2 register is optional,
we'd rather bail out.
Fixes: ca04d9d3e1b1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips")
Reviewed-by: Vivek Gautam <vivek.gautam(a)codeaurora.org>
Cc: stable <stable(a)vger.kernel.org> # 4.14+
Signed-off-by: Manu Gautam <mgautam(a)codeaurora.org>
---
drivers/phy/qualcomm/phy-qcom-qusb2.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/phy/qualcomm/phy-qcom-qusb2.c b/drivers/phy/qualcomm/phy-qcom-qusb2.c
index 94afeac..40fdef8 100644
--- a/drivers/phy/qualcomm/phy-qcom-qusb2.c
+++ b/drivers/phy/qualcomm/phy-qcom-qusb2.c
@@ -315,6 +315,10 @@ static void qusb2_phy_set_tune2_param(struct qusb2_phy *qphy)
const struct qusb2_phy_cfg *cfg = qphy->cfg;
u8 *val;
+ /* efuse register is optional */
+ if (!qphy->cell)
+ return;
+
/*
* Read efuse register having TUNE2/1 parameter's high nibble.
* If efuse register shows value as 0x0, or if we fail to find
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
From: AMAN DEEP <aman.deep(a)samsung.com>
There is a race condition between finish_unlinks->finish_urb() function
and usb_kill_urb() in ohci controller case. The finish_urb calls
spin_unlock(&ohci->lock) before usb_hcd_giveback_urb() function call,
then if during this time, usb_kill_urb is called for another endpoint,
then new ed will be added to ed_rm_list at beginning for unlink, and
ed_rm_list will point to newly added.
When finish_urb() is completed in finish_unlinks() and ed->td_list
becomes empty as in below code (in finish_unlinks() function):
if (list_empty(&ed->td_list)) {
*last = ed->ed_next;
ed->ed_next = NULL;
} else if (ohci->rh_state == OHCI_RH_RUNNING) {
*last = ed->ed_next;
ed->ed_next = NULL;
ed_schedule(ohci, ed);
}
The *last = ed->ed_next will make ed_rm_list to point to ed->ed_next
and previously added ed by usb_kill_urb will be left unreferenced by
ed_rm_list. This causes usb_kill_urb() hang forever waiting for
finish_unlink to remove added ed from ed_rm_list.
The main reason for hang in this race condtion is addition and removal
of ed from ed_rm_list in the beginning during usb_kill_urb and later
last* is modified in finish_unlinks().
As suggested by Alan Stern, the solution for proper handling of
ohci->ed_rm_list is to remove ed from the ed_rm_list before finishing
any URBs. Then at the end, we can add ed back to the list if necessary.
This properly handle the updated ohci->ed_rm_list in usb_kill_urb().
Fixes:977dcfdc6031("USB:OHCI:don't lose track of EDs when a controller dies")
Acked-by: Alan Stern <stern(a)rowland.harvard.edu>
CC: <stable(a)vger.kernel.org>
Signed-off-by: Aman Deep <aman.deep(a)samsung.com>
Signed-off-by: Jeffy Chen <jeffy.chen(a)rock-chips.com>
---
Changes in v6:
This is a resend of Aman Deep's v5 patch [0], which solved the hang we
hit [1]. (Thanks Aman :)
The v5 has some format issues, so i slightly adjust the commit message.
[0] https://www.spinics.net/lists/linux-usb/msg129010.html
[1] https://bugs.chromium.org/p/chromium/issues/detail?id=803749
drivers/usb/host/ohci-q.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/usb/host/ohci-q.c b/drivers/usb/host/ohci-q.c
index b2ec8c399363..4ccb85a67bb3 100644
--- a/drivers/usb/host/ohci-q.c
+++ b/drivers/usb/host/ohci-q.c
@@ -1019,6 +1019,8 @@ static void finish_unlinks(struct ohci_hcd *ohci)
* have modified this list. normally it's just prepending
* entries (which we'd ignore), but paranoia won't hurt.
*/
+ *last = ed->ed_next;
+ ed->ed_next = NULL;
modified = 0;
/* unlink urbs as requested, but rescan the list after
@@ -1077,21 +1079,22 @@ static void finish_unlinks(struct ohci_hcd *ohci)
goto rescan_this;
/*
- * If no TDs are queued, take ED off the ed_rm_list.
+ * If no TDs are queued, ED is now idle.
* Otherwise, if the HC is running, reschedule.
- * If not, leave it on the list for further dequeues.
+ * If the HC isn't running, add ED back to the
+ * start of the list for later processing.
*/
if (list_empty(&ed->td_list)) {
- *last = ed->ed_next;
- ed->ed_next = NULL;
ed->state = ED_IDLE;
list_del(&ed->in_use_list);
} else if (ohci->rh_state == OHCI_RH_RUNNING) {
- *last = ed->ed_next;
- ed->ed_next = NULL;
ed_schedule(ohci, ed);
} else {
- last = &ed->ed_next;
+ ed->ed_next = ohci->ed_rm_list;
+ ohci->ed_rm_list = ed;
+ /* Don't loop on the same ED */
+ if (last == &ohci->ed_rm_list)
+ last = &ed->ed_next;
}
if (modified)
--
2.11.0
This is a note to let you know that I've just added the patch titled
kbuild: disable clang's default use of -fmerge-all-constants
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kbuild-disable-clang-s-default-use-of-fmerge-all-constants.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 87e0d4f0f37fb0c8c4aeeac46fff5e957738df79 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Wed, 21 Mar 2018 01:18:24 +0100
Subject: kbuild: disable clang's default use of -fmerge-all-constants
From: Daniel Borkmann <daniel(a)iogearbox.net>
commit 87e0d4f0f37fb0c8c4aeeac46fff5e957738df79 upstream.
Prasad reported that he has seen crashes in BPF subsystem with netd
on Android with arm64 in the form of (note, the taint is unrelated):
[ 4134.721483] Unable to handle kernel paging request at virtual address 800000001
[ 4134.820925] Mem abort info:
[ 4134.901283] Exception class = DABT (current EL), IL = 32 bits
[ 4135.016736] SET = 0, FnV = 0
[ 4135.119820] EA = 0, S1PTW = 0
[ 4135.201431] Data abort info:
[ 4135.301388] ISV = 0, ISS = 0x00000021
[ 4135.359599] CM = 0, WnR = 0
[ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000
[ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000
[ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP
[ 4135.674610] Modules linked in:
[ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S W 4.14.19+ #1
[ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000
[ 4135.731599] PC is at bpf_prog_add+0x20/0x68
[ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
[ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145
[ 4135.769062] sp : ffffff801d4e3ce0
[...]
[ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
[ 4136.273746] Call trace:
[...]
[ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
[ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
[ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
[ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
[ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
[ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
Android's netd was basically pinning the uid cookie BPF map in BPF
fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
again resulting in above panic. Issue is that the map was wrongly
identified as a prog! Above kernel was compiled with clang 4.0,
and it turns out that clang decided to merge the bpf_prog_iops and
bpf_map_iops into a single memory location, such that the two i_ops
could then not be distinguished anymore.
Reason for this miscompilation is that clang has the more aggressive
-fmerge-all-constants enabled by default. In fact, clang source code
has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
to do so:
Pointers with different bases cannot represent the same object.
(Note that clang defaults to -fmerge-all-constants, which can
lead to inconsistent results for comparisons involving the address
of a constant; this generally doesn't matter in practice.)
The issue never appeared with gcc however, since gcc does not enable
-fmerge-all-constants by default and even *explicitly* states in
it's option description that using this flag results in non-conforming
behavior, quote from man gcc:
Languages like C or C++ require each variable, including multiple
instances of the same variable in recursive calls, to have distinct
locations, so using this option results in non-conforming behavior.
There are also various clang bug reports open on that matter [1],
where clang developers acknowledge the non-conforming behavior,
and refer to disabling it with -fno-merge-all-constants. But even
if this gets fixed in clang today, there are already users out there
that triggered this. Thus, fix this issue by explicitly adding
-fno-merge-all-constants to the kernel's Makefile to generically
disable this optimization, since potentially other places in the
kernel could subtly break as well.
Note, there is also a flag called -fmerge-constants (not supported
by clang), which is more conservative and only applies to strings
and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
gcc's code, the two flags -fmerge-{all-,}constants share the same
variable internally, so when disabling it via -fno-merge-all-constants,
then we really don't merge any const data (e.g. strings), and text
size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
$ gcc -fverbose-asm -O2 foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
-> foo.S doesn't list -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
Thus, as a workaround we need to set both -fno-merge-all-constants
*and* -fmerge-constants in the Makefile in order for text size to
stay as is.
[1] https://bugs.llvm.org/show_bug.cgi?id=18538
Reported-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Chenbo Feng <fengc(a)google.com>
Cc: Richard Smith <richard-llvm(a)metafoo.co.uk>
Cc: Chandler Carruth <chandlerc(a)gmail.com>
Cc: linux-kernel(a)vger.kernel.org
Tested-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Makefile | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/Makefile
+++ b/Makefile
@@ -790,6 +790,15 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# disable invalid "can't wrap" optimizations for signed / pointers
KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
+# clang sets -fmerge-all-constants by default as optimization, but this
+# is non-conforming behavior for C and in fact breaks the kernel, so we
+# need to disable it here generally.
+KBUILD_CFLAGS += $(call cc-option,-fno-merge-all-constants)
+
+# for gcc -fno-merge-all-constants disables everything, but it is fine
+# to have actual conforming behavior enabled.
+KBUILD_CFLAGS += $(call cc-option,-fmerge-constants)
+
# Make sure -fstack-check isn't enabled (like gentoo apparently did)
KBUILD_CFLAGS += $(call cc-option,-fno-stack-check,)
Patches currently in stable-queue which might be from daniel(a)iogearbox.net are
queue-4.9/kbuild-disable-clang-s-default-use-of-fmerge-all-constants.patch
queue-4.9/bpf-skip-unnecessary-capability-check.patch
queue-4.9/bpf-x64-increase-number-of-passes.patch
This is a note to let you know that I've just added the patch titled
bpf, x64: increase number of passes
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bpf-x64-increase-number-of-passes.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6007b080d2e2adb7af22bf29165f0594ea12b34c Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Wed, 7 Mar 2018 22:10:01 +0100
Subject: bpf, x64: increase number of passes
From: Daniel Borkmann <daniel(a)iogearbox.net>
commit 6007b080d2e2adb7af22bf29165f0594ea12b34c upstream.
In Cilium some of the main programs we run today are hitting 9 passes
on x64's JIT compiler, and we've had cases already where we surpassed
the limit where the JIT then punts the program to the interpreter
instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON
or insertion failures due to the prog array owner being JITed but the
program to insert not (both must have the same JITed/non-JITed property).
One concrete case the program image shrunk from 12,767 bytes down to
10,288 bytes where the image converged after 16 steps. I've measured
that this took 340us in the JIT until it converges on my i7-6600U. Thus,
increase the original limit we had from day one where the JIT covered
cBPF only back then before we run into the case (as similar with the
complexity limit) where we trip over this and hit program rejections.
Also add a cond_resched() into the compilation loop, the JIT process
runs without any locks and may sleep anyway.
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/net/bpf_jit_comp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1135,7 +1135,7 @@ struct bpf_prog *bpf_int_jit_compile(str
* may converge on the last pass. In such case do one more
* pass to emit the final image
*/
- for (pass = 0; pass < 10 || image; pass++) {
+ for (pass = 0; pass < 20 || image; pass++) {
proglen = do_jit(prog, addrs, image, oldproglen, &ctx);
if (proglen <= 0) {
image = NULL;
@@ -1162,6 +1162,7 @@ struct bpf_prog *bpf_int_jit_compile(str
}
}
oldproglen = proglen;
+ cond_resched();
}
if (bpf_jit_enable > 1)
Patches currently in stable-queue which might be from daniel(a)iogearbox.net are
queue-4.9/kbuild-disable-clang-s-default-use-of-fmerge-all-constants.patch
queue-4.9/bpf-skip-unnecessary-capability-check.patch
queue-4.9/bpf-x64-increase-number-of-passes.patch
This is a note to let you know that I've just added the patch titled
bpf: skip unnecessary capability check
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bpf-skip-unnecessary-capability-check.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 0fa4fe85f4724fff89b09741c437cbee9cf8b008 Mon Sep 17 00:00:00 2001
From: Chenbo Feng <fengc(a)google.com>
Date: Mon, 19 Mar 2018 17:57:27 -0700
Subject: bpf: skip unnecessary capability check
From: Chenbo Feng <fengc(a)google.com>
commit 0fa4fe85f4724fff89b09741c437cbee9cf8b008 upstream.
The current check statement in BPF syscall will do a capability check
for CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
code path will trigger unnecessary security hooks on capability checking
and cause false alarms on unprivileged process trying to get CAP_SYS_ADMIN
access. This can be resolved by simply switch the order of the statement
and CAP_SYS_ADMIN is not required anyway if unprivileged bpf syscall is
allowed.
Signed-off-by: Chenbo Feng <fengc(a)google.com>
Acked-by: Lorenzo Colitti <lorenzo(a)google.com>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/bpf/syscall.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -801,7 +801,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf
union bpf_attr attr = {};
int err;
- if (!capable(CAP_SYS_ADMIN) && sysctl_unprivileged_bpf_disabled)
+ if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
return -EPERM;
if (!access_ok(VERIFY_READ, uattr, 1))
Patches currently in stable-queue which might be from fengc(a)google.com are
queue-4.9/kbuild-disable-clang-s-default-use-of-fmerge-all-constants.patch
queue-4.9/bpf-skip-unnecessary-capability-check.patch
This is a note to let you know that I've just added the patch titled
kbuild: disable clang's default use of -fmerge-all-constants
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kbuild-disable-clang-s-default-use-of-fmerge-all-constants.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 87e0d4f0f37fb0c8c4aeeac46fff5e957738df79 Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Wed, 21 Mar 2018 01:18:24 +0100
Subject: kbuild: disable clang's default use of -fmerge-all-constants
From: Daniel Borkmann <daniel(a)iogearbox.net>
commit 87e0d4f0f37fb0c8c4aeeac46fff5e957738df79 upstream.
Prasad reported that he has seen crashes in BPF subsystem with netd
on Android with arm64 in the form of (note, the taint is unrelated):
[ 4134.721483] Unable to handle kernel paging request at virtual address 800000001
[ 4134.820925] Mem abort info:
[ 4134.901283] Exception class = DABT (current EL), IL = 32 bits
[ 4135.016736] SET = 0, FnV = 0
[ 4135.119820] EA = 0, S1PTW = 0
[ 4135.201431] Data abort info:
[ 4135.301388] ISV = 0, ISS = 0x00000021
[ 4135.359599] CM = 0, WnR = 0
[ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffe39b946000
[ 4135.499757] [0000000800000001] *pgd=0000000000000000, *pud=0000000000000000
[ 4135.660725] Internal error: Oops: 96000021 [#1] PREEMPT SMP
[ 4135.674610] Modules linked in:
[ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S W 4.14.19+ #1
[ 4135.716188] task: ffffffe39f4aa380 task.stack: ffffff801d4e0000
[ 4135.731599] PC is at bpf_prog_add+0x20/0x68
[ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
[ 4135.751788] pc : [<ffffff94ab7ad584>] lr : [<ffffff94ab7ad638>] pstate: 60400145
[ 4135.769062] sp : ffffff801d4e3ce0
[...]
[ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
[ 4136.273746] Call trace:
[...]
[ 4136.442494] 3ca0: ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
[ 4136.460936] 3cc0: 0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
[ 4136.479241] [<ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
[ 4136.491767] [<ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
[ 4136.504536] [<ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
[ 4136.518746] [<ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
Android's netd was basically pinning the uid cookie BPF map in BPF
fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
again resulting in above panic. Issue is that the map was wrongly
identified as a prog! Above kernel was compiled with clang 4.0,
and it turns out that clang decided to merge the bpf_prog_iops and
bpf_map_iops into a single memory location, such that the two i_ops
could then not be distinguished anymore.
Reason for this miscompilation is that clang has the more aggressive
-fmerge-all-constants enabled by default. In fact, clang source code
has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
to do so:
Pointers with different bases cannot represent the same object.
(Note that clang defaults to -fmerge-all-constants, which can
lead to inconsistent results for comparisons involving the address
of a constant; this generally doesn't matter in practice.)
The issue never appeared with gcc however, since gcc does not enable
-fmerge-all-constants by default and even *explicitly* states in
it's option description that using this flag results in non-conforming
behavior, quote from man gcc:
Languages like C or C++ require each variable, including multiple
instances of the same variable in recursive calls, to have distinct
locations, so using this option results in non-conforming behavior.
There are also various clang bug reports open on that matter [1],
where clang developers acknowledge the non-conforming behavior,
and refer to disabling it with -fno-merge-all-constants. But even
if this gets fixed in clang today, there are already users out there
that triggered this. Thus, fix this issue by explicitly adding
-fno-merge-all-constants to the kernel's Makefile to generically
disable this optimization, since potentially other places in the
kernel could subtly break as well.
Note, there is also a flag called -fmerge-constants (not supported
by clang), which is more conservative and only applies to strings
and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
gcc's code, the two flags -fmerge-{all-,}constants share the same
variable internally, so when disabling it via -fno-merge-all-constants,
then we really don't merge any const data (e.g. strings), and text
size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
$ gcc -fverbose-asm -O2 foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
-> foo.S doesn't list -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
Thus, as a workaround we need to set both -fno-merge-all-constants
*and* -fmerge-constants in the Makefile in order for text size to
stay as is.
[1] https://bugs.llvm.org/show_bug.cgi?id=18538
Reported-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Chenbo Feng <fengc(a)google.com>
Cc: Richard Smith <richard-llvm(a)metafoo.co.uk>
Cc: Chandler Carruth <chandlerc(a)gmail.com>
Cc: linux-kernel(a)vger.kernel.org
Tested-by: Prasad Sodagudi <psodagud(a)codeaurora.org>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
Makefile | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/Makefile
+++ b/Makefile
@@ -784,6 +784,15 @@ KBUILD_CFLAGS += $(call cc-disable-warni
# disable invalid "can't wrap" optimizations for signed / pointers
KBUILD_CFLAGS += $(call cc-option,-fno-strict-overflow)
+# clang sets -fmerge-all-constants by default as optimization, but this
+# is non-conforming behavior for C and in fact breaks the kernel, so we
+# need to disable it here generally.
+KBUILD_CFLAGS += $(call cc-option,-fno-merge-all-constants)
+
+# for gcc -fno-merge-all-constants disables everything, but it is fine
+# to have actual conforming behavior enabled.
+KBUILD_CFLAGS += $(call cc-option,-fmerge-constants)
+
# Make sure -fstack-check isn't enabled (like gentoo apparently did)
KBUILD_CFLAGS += $(call cc-option,-fno-stack-check,)
Patches currently in stable-queue which might be from daniel(a)iogearbox.net are
queue-4.4/kbuild-disable-clang-s-default-use-of-fmerge-all-constants.patch
queue-4.4/bpf-skip-unnecessary-capability-check.patch
queue-4.4/bpf-x64-increase-number-of-passes.patch
This is a note to let you know that I've just added the patch titled
bpf, x64: increase number of passes
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
bpf-x64-increase-number-of-passes.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6007b080d2e2adb7af22bf29165f0594ea12b34c Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel(a)iogearbox.net>
Date: Wed, 7 Mar 2018 22:10:01 +0100
Subject: bpf, x64: increase number of passes
From: Daniel Borkmann <daniel(a)iogearbox.net>
commit 6007b080d2e2adb7af22bf29165f0594ea12b34c upstream.
In Cilium some of the main programs we run today are hitting 9 passes
on x64's JIT compiler, and we've had cases already where we surpassed
the limit where the JIT then punts the program to the interpreter
instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON
or insertion failures due to the prog array owner being JITed but the
program to insert not (both must have the same JITed/non-JITed property).
One concrete case the program image shrunk from 12,767 bytes down to
10,288 bytes where the image converged after 16 steps. I've measured
that this took 340us in the JIT until it converges on my i7-6600U. Thus,
increase the original limit we had from day one where the JIT covered
cBPF only back then before we run into the case (as similar with the
complexity limit) where we trip over this and hit program rejections.
Also add a cond_resched() into the compilation loop, the JIT process
runs without any locks and may sleep anyway.
Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net>
Acked-by: Alexei Starovoitov <ast(a)kernel.org>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Alexei Starovoitov <ast(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/net/bpf_jit_comp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1077,7 +1077,7 @@ void bpf_int_jit_compile(struct bpf_prog
* may converge on the last pass. In such case do one more
* pass to emit the final image
*/
- for (pass = 0; pass < 10 || image; pass++) {
+ for (pass = 0; pass < 20 || image; pass++) {
proglen = do_jit(prog, addrs, image, oldproglen, &ctx);
if (proglen <= 0) {
image = NULL;
@@ -1100,6 +1100,7 @@ void bpf_int_jit_compile(struct bpf_prog
goto out;
}
oldproglen = proglen;
+ cond_resched();
}
if (bpf_jit_enable > 1)
Patches currently in stable-queue which might be from daniel(a)iogearbox.net are
queue-4.4/kbuild-disable-clang-s-default-use-of-fmerge-all-constants.patch
queue-4.4/bpf-skip-unnecessary-capability-check.patch
queue-4.4/bpf-x64-increase-number-of-passes.patch