Hi Greg,
This series fixes issues we've seen with softirq time accounting in 4.9:
- when ksoftirqd is running at 100% on a CPU, none of the values
reported by /proc/stat for that CPU will change, sometimes for
dozens of seconds,
- large deviations in the total number of ticks accumulated over a
fixed time for a CPU, probably because of the first issue hitting
for shorter periods.
We found out that something pretty similar had been reported 9 months
ago, see the reference link below. In that discussion, Rabin Vincent had
made a 4.9 specific patch which fixes our first issue, but we were still
seeing some deviation from the total number of ticks (up to 1.7% from
expected, where we had only 0.2% on older kernels), and you had also
asked for a direct backport from the mainline series, if possible.
As mentioned in that thread, a lot of changes (probably 50+) went into
4.11 to remove cputime, but we could get something working with only the
4 attached patches to fix these two issues. Three of these patches apply
without change, and the second one in the series ("sched/cputime:
Convert kcpustat to nsecs") needed a minor change as a cast had been
added in 527b0a76f41d ("sched/cpuacct: Avoid %lld seq_printf warning")
to fix a build warning on s390. I guess we could also include that patch
in this series, let me know if this is the preferred way to handle this.
We ran our tests on 3.18, 4.4 and 4.9 and confirmed that only 4.9 would
need this series, and that this series indeed restores the behavior we
were seeing on those older kernels.
Thanks!
Reference: http://lkml.kernel.org/r/%3C1513159876-5125-1-git-send-email-rabin.vincent@…
Frederic Weisbecker (4):
time: Introduce jiffies64_to_nsecs()
sched/cputime: Convert kcpustat to nsecs
sched/cputime: Increment kcpustat directly on irqtime account
sched/cputime: Fix ksoftirqd cputime accounting regression
arch/s390/appldata/appldata_os.c | 16 +++----
drivers/cpufreq/cpufreq.c | 6 +--
drivers/cpufreq/cpufreq_governor.c | 2 +-
drivers/cpufreq/cpufreq_stats.c | 1 -
drivers/macintosh/rack-meter.c | 2 +-
fs/proc/stat.c | 68 +++++++++++++--------------
fs/proc/uptime.c | 7 +--
include/linux/jiffies.h | 2 +
kernel/sched/cpuacct.c | 2 +-
kernel/sched/cputime.c | 75 +++++++++++++-----------------
kernel/sched/sched.h | 12 +++--
kernel/time/time.c | 10 ++++
kernel/time/timeconst.bc | 6 +++
13 files changed, 109 insertions(+), 100 deletions(-)
--
2.18.0
If the size of spi-nor flash is larger than 16MB, the read_opcode
is set to SPINOR_OP_READ_1_1_4_4B, and fsl_qspi_get_seqid() will
return -EINVAL when cmd is SPINOR_OP_READ_1_1_4_4B. This can
cause read operation fail.
Fixes: e46ecda764dc ("mtd: spi-nor: Add Freescale QuadSPI driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Liu Xiang <liu.xiang6(a)zte.com.cn>
---
Changes in v3:
move changelog position.
drivers/mtd/spi-nor/fsl-quadspi.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/mtd/spi-nor/fsl-quadspi.c b/drivers/mtd/spi-nor/fsl-quadspi.c
index 7d9620c..64304a3 100644
--- a/drivers/mtd/spi-nor/fsl-quadspi.c
+++ b/drivers/mtd/spi-nor/fsl-quadspi.c
@@ -478,6 +478,7 @@ static int fsl_qspi_get_seqid(struct fsl_qspi *q, u8 cmd)
{
switch (cmd) {
case SPINOR_OP_READ_1_1_4:
+ case SPINOR_OP_READ_1_1_4_4B:
return SEQID_READ;
case SPINOR_OP_WREN:
return SEQID_WREN;
--
1.9.1
Add more PCI IDs to the Intel GPU "spurious interrupt" quirk table,
which are known to break.
See commit f67fd55fa96f ("PCI: Add quirk for still enabled interrupts
on Intel Sandy Bridge GPUs"), and commit 7c82126a94e6 ("PCI: Add new
ID for Intel GPU "spurious interrupt" quirk") for some history.
Based on current findings, it is highly possible that all Intel
1st/2nd/3rd generation Core processors' IGD has such quirk.
Signed-off-by: Bin Meng <bmeng.cn(a)gmail.com>
Cc: <stable(a)vger.kernel.org> # v3.4+
---
drivers/pci/quirks.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6bc27b7..c0673a7 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3190,7 +3190,11 @@ static void disable_igfx_irq(struct pci_dev *dev)
pci_iounmap(dev, regs);
}
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0042, disable_igfx_irq);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0046, disable_igfx_irq);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x004a, disable_igfx_irq);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0102, disable_igfx_irq);
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0106, disable_igfx_irq);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x010a, disable_igfx_irq);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x0152, disable_igfx_irq);
--
2.7.4
Hello Greg,
This is my 3rd attempt* at sending to you the FPU backported patches that
complete the removal of FPU lazy-mode code, documentation and comments.
Please apply the next patch before the 4.4.y 3-patch series.
- a1dbf52c1a38 ("KVM: x86: remove eager_fpu field of struct kvm_vcpu_arch")
[PATCH v3 4.4.y 1/3] x86/fpu: Remove use_eager_fpu()
[PATCH v3 4.4.y 2/3] x86/fpu: Remove struct fpu::counter
[PATCH v3 4.4.y 3/3] x86/fpu: Finish excising 'eagerfpu'
[Note] I already sent to you a patch backported for 4.9.y (please make sure you follow the annotations).
https://www.spinics.net/lists/stable/msg247014.html
Thanks,
Daniel
*Sorry if I mess up again. I am going to use the get_maintainer.pl from the upstream kernel
because last time I had some "returned e-mails".
Hello Greg,
Please consider this backport for 4.9.y. It completes the removal
of FPU lazy-mode code, documentation and comments.
[PATCH v3 4.9] x86/fpu: Remove use_eager_fpu()
After applying the patch, please do the following:
- cherry-pick: 3913cc350757 ("x86/fpu: Remove struct fpu::counter")
- revert: f09a7b0eead7 ("perf: sync up x86/.../cpufeatures.h")
- Because the modification in this patch actually belongs to e63650840e8b ("x86/fpu: Finish excising 'eagerfpu'")
- cherry-pick: e63650840e8b ("x86/fpu: Finish excising 'eagerfpu'")
Thanks,
Daniel
Hi Greg / stable maintainer(s),
IIUC, this commit is a good candidate for -stable. I just hit the bug on
4.14, because of new vendor firmware that noticed the mismatched length
(an internal "assert"), but it seems like this is equally problematic
from the kernel side for as long as this code existed [1]:
commit c8291988806407e02a01b4b15b4504eafbcc04e0
Author: Zhi Chen <zhichen(a)codeaurora.org>
Date: Mon Jun 18 17:00:39 2018 +0300
ath10k: fix scan crash due to incorrect length calculation
Thanks,
Brian
[1] I guess that would be:
Fixes: ca996ec56608 ("ath10k: implement wmi-tlv backend")
which was in v4.0.
Hi Greg,
This was not marked for stable but seems it should be in stable.
Please apply to your queue of 4.14-stable.
And at the same time, there was another later patch which fixed a bug
in this patch. Both are attached as a series to this mail.
--
Regards
Sudip
There are five patches to fix CVE-2018-5390 in latest mainline
branch, but only two patches exist in stable 4.4 and 3.18:
dc6ae4d tcp: detect malicious patterns in tcp_collapse_ofo_queue()
5fbec48 tcp: avoid collapses in tcp_prune_queue() if possible
I have tested with stable 4.4 kernel, and found the cpu usage was very high.
So I think only two patches can't fix the CVE-2018-5390.
test results:
with fix patch: 78.2% ksoftirqd
withoutfix patch: 90% ksoftirqd
Then I try to imitate 72cd43ba(tcp: free batches of packets in tcp_prune_ofo_queue())
to drop at least 12.5 % of sk_rcvbuf to avoid malicious attacks with simple queue
instead of RB tree. The result is not very well.
After analysing the codes of stable 4.4, and debuging the
system, shows that search of ofo_queue(tcp ofo using a simple queue) cost more cycles.
So I try to backport "tcp: use an RB tree for ooo receive queue" using RB tree
instead of simple queue, then backport Eric Dumazet 5 fixed patches in mainline,
good news is that ksoftirqd is turn to about 20%, which is the same with mainline now.
Stable 4.4 have already back port two patches,
f4a3313d(tcp: avoid collapses in tcp_prune_queue() if possible)
3d4bf93a(tcp: detect malicious patterns in tcp_collapse_ofo_queue())
If we want to change simple queue to RB tree to finally resolve, we should apply previous
patch 9f5afeae(tcp: use an RB tree for ooo receive queue.) firstly, but 9f5afeae have many
conflicts with 3d4bf93a and f4a3313d, which are part of patch series from Eric in
mainline to fix CVE-2018-5390, so I need revert part of patches in stable 4.4 firstly,
then apply 9f5afeae, and reapply five patches from Eric.
V1->V2:
1) Don't revert 3d4bf93a and f4a3313d firstly, all of 6 patches based on 4.4.155.
2) Add one bug fix patch for RB tree:76f0dcbb5ae1a7c3dbeec13dd98233b8e6b0b32a tcp: fix a stale ooo_last_skb
Eric Dumazet (5):
tcp: increment sk_drops for dropped rx packets
tcp: fix a stale ooo_last_skb after a replace
tcp: free batches of packets in tcp_prune_ofo_queue()
tcp: call tcp_drop() from tcp_data_queue_ofo()
tcp: add tcp_ooo_try_coalesce() helper
Yaogong Wang (1):
tcp: use an RB tree for ooo receive queue
include/linux/skbuff.h | 8 +
include/linux/tcp.h | 7 +-
include/net/sock.h | 7 +
include/net/tcp.h | 2 +-
net/core/skbuff.c | 19 +++
net/ipv4/tcp.c | 4 +-
net/ipv4/tcp_input.c | 417 +++++++++++++++++++++++++++++------------------
net/ipv4/tcp_ipv4.c | 3 +-
net/ipv4/tcp_minisocks.c | 1 -
net/ipv6/tcp_ipv6.c | 1 +
10 files changed, 297 insertions(+), 172 deletions(-)
--
1.8.3.1
Currently we return NOTIFY_DONE for any event which we don't think is
ours. However, many laptops will send more then just an ATIF event and
will also send an ACPI_VIDEO_NOTIFY_PROBE event as well. Since we don't
check for this, we return NOTIFY_DONE which causes a keypress for the
ACPI event to be propogated to userspace. This is the equivalent of
someone pressing the display key on a laptop every time there's a
hotplug event.
So, check for ACPI_VIDEO_NOTIFY_PROBE events and suppress keypresses
from them.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
---
drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 353993218f21..f008804f0b97 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -358,7 +358,9 @@ static int amdgpu_atif_get_sbios_requests(struct amdgpu_atif *atif,
*
* Checks the acpi event and if it matches an atif event,
* handles it.
- * Returns NOTIFY code
+ *
+ * Returns:
+ * NOTIFY_BAD or NOTIFY_DONE, depending on the event.
*/
static int amdgpu_atif_handler(struct amdgpu_device *adev,
struct acpi_bus_event *event)
@@ -372,11 +374,16 @@ static int amdgpu_atif_handler(struct amdgpu_device *adev,
if (strcmp(event->device_class, ACPI_VIDEO_CLASS) != 0)
return NOTIFY_DONE;
+ /* Is this actually our event? */
if (!atif ||
!atif->notification_cfg.enabled ||
- event->type != atif->notification_cfg.command_code)
- /* Not our event */
- return NOTIFY_DONE;
+ event->type != atif->notification_cfg.command_code) {
+ /* These events will generate keypresses otherwise */
+ if (event->type == ACPI_VIDEO_NOTIFY_PROBE)
+ return NOTIFY_BAD;
+ else
+ return NOTIFY_DONE;
+ }
if (atif->functions.sbios_requests) {
struct atif_sbios_requests req;
@@ -385,7 +392,7 @@ static int amdgpu_atif_handler(struct amdgpu_device *adev,
count = amdgpu_atif_get_sbios_requests(atif, &req);
if (count <= 0)
- return NOTIFY_DONE;
+ return NOTIFY_BAD;
DRM_DEBUG_DRIVER("ATIF: %d pending SBIOS requests\n", count);
--
2.17.1