The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 783735f84fea6aad9b1e5931d6ea632796feaae3 Mon Sep 17 00:00:00 2001
From: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Date: Thu, 2 Apr 2020 12:45:34 +0300
Subject: [PATCH] thunderbolt: Fix path indices used in USB3 tunnel discovery
The USB3 discovery used the wrong indices when a tunnel is discovered. It
should use TB_USB3_PATH_DOWN for the path that flows downstream and
TB_USB3_PATH_UP for the path that flows upstream. This should not affect
functionality, but it is better to fix it.
Fixes: e6f818585713 ("thunderbolt: Add support for USB 3.x tunnels")
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.6+
diff --git a/drivers/thunderbolt/tunnel.c b/drivers/thunderbolt/tunnel.c
index dbe90bcf4ad4..c144ca9b032c 100644
--- a/drivers/thunderbolt/tunnel.c
+++ b/drivers/thunderbolt/tunnel.c
@@ -913,21 +913,21 @@ struct tb_tunnel *tb_tunnel_discover_usb3(struct tb *tb, struct tb_port *down)
* case.
*/
path = tb_path_discover(down, TB_USB3_HOPID, NULL, -1,
- &tunnel->dst_port, "USB3 Up");
+ &tunnel->dst_port, "USB3 Down");
if (!path) {
/* Just disable the downstream port */
tb_usb3_port_enable(down, false);
goto err_free;
}
- tunnel->paths[TB_USB3_PATH_UP] = path;
- tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_UP]);
+ tunnel->paths[TB_USB3_PATH_DOWN] = path;
+ tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_DOWN]);
path = tb_path_discover(tunnel->dst_port, -1, down, TB_USB3_HOPID, NULL,
- "USB3 Down");
+ "USB3 Up");
if (!path)
goto err_deactivate;
- tunnel->paths[TB_USB3_PATH_DOWN] = path;
- tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_DOWN]);
+ tunnel->paths[TB_USB3_PATH_UP] = path;
+ tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_UP]);
/* Validate that the tunnel is complete */
if (!tb_port_is_usb3_up(tunnel->dst_port)) {
The patch below does not apply to the 5.8-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 783735f84fea6aad9b1e5931d6ea632796feaae3 Mon Sep 17 00:00:00 2001
From: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Date: Thu, 2 Apr 2020 12:45:34 +0300
Subject: [PATCH] thunderbolt: Fix path indices used in USB3 tunnel discovery
The USB3 discovery used the wrong indices when a tunnel is discovered. It
should use TB_USB3_PATH_DOWN for the path that flows downstream and
TB_USB3_PATH_UP for the path that flows upstream. This should not affect
functionality, but it is better to fix it.
Fixes: e6f818585713 ("thunderbolt: Add support for USB 3.x tunnels")
Signed-off-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.6+
diff --git a/drivers/thunderbolt/tunnel.c b/drivers/thunderbolt/tunnel.c
index dbe90bcf4ad4..c144ca9b032c 100644
--- a/drivers/thunderbolt/tunnel.c
+++ b/drivers/thunderbolt/tunnel.c
@@ -913,21 +913,21 @@ struct tb_tunnel *tb_tunnel_discover_usb3(struct tb *tb, struct tb_port *down)
* case.
*/
path = tb_path_discover(down, TB_USB3_HOPID, NULL, -1,
- &tunnel->dst_port, "USB3 Up");
+ &tunnel->dst_port, "USB3 Down");
if (!path) {
/* Just disable the downstream port */
tb_usb3_port_enable(down, false);
goto err_free;
}
- tunnel->paths[TB_USB3_PATH_UP] = path;
- tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_UP]);
+ tunnel->paths[TB_USB3_PATH_DOWN] = path;
+ tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_DOWN]);
path = tb_path_discover(tunnel->dst_port, -1, down, TB_USB3_HOPID, NULL,
- "USB3 Down");
+ "USB3 Up");
if (!path)
goto err_deactivate;
- tunnel->paths[TB_USB3_PATH_DOWN] = path;
- tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_DOWN]);
+ tunnel->paths[TB_USB3_PATH_UP] = path;
+ tb_usb3_init_path(tunnel->paths[TB_USB3_PATH_UP]);
/* Validate that the tunnel is complete */
if (!tb_port_is_usb3_up(tunnel->dst_port)) {
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c7614ff9b73a1e6fb2b1b51396da132ed22fecdb Mon Sep 17 00:00:00 2001
From: Brant Merryman <brant.merryman(a)silabs.com>
Date: Fri, 26 Jun 2020 04:24:20 +0000
Subject: [PATCH] USB: serial: cp210x: re-enable auto-RTS on open
CP210x hardware disables auto-RTS but leaves auto-CTS enabled when the UART
is disabled while in hardware flow control mode. When re-opening the port,
if auto-CTS is enabled on the cp210x, then auto-RTS must be re-enabled in
the driver.
Signed-off-by: Brant Merryman <brant.merryman(a)silabs.com>
Co-developed-by: Phu Luu <phu.luu(a)silabs.com>
Signed-off-by: Phu Luu <phu.luu(a)silabs.com>
Link: https://lore.kernel.org/r/ECCF8E73-91F3-4080-BE17-1714BC8818FB@silabs.com
[ johan: fix up tags and problem description ]
Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control")
Cc: stable <stable(a)vger.kernel.org> # 2.6.12
Signed-off-by: Johan Hovold <johan(a)kernel.org>
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index bcceb4ad8be0..a90801ef0055 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -917,6 +917,7 @@ static void cp210x_get_termios_port(struct usb_serial_port *port,
u32 baud;
u16 bits;
u32 ctl_hs;
+ u32 flow_repl;
cp210x_read_u32_reg(port, CP210X_GET_BAUDRATE, &baud);
@@ -1017,6 +1018,22 @@ static void cp210x_get_termios_port(struct usb_serial_port *port,
ctl_hs = le32_to_cpu(flow_ctl.ulControlHandshake);
if (ctl_hs & CP210X_SERIAL_CTS_HANDSHAKE) {
dev_dbg(dev, "%s - flow control = CRTSCTS\n", __func__);
+ /*
+ * When the port is closed, the CP210x hardware disables
+ * auto-RTS and RTS is deasserted but it leaves auto-CTS when
+ * in hardware flow control mode. When re-opening the port, if
+ * auto-CTS is enabled on the cp210x, then auto-RTS must be
+ * re-enabled in the driver.
+ */
+ flow_repl = le32_to_cpu(flow_ctl.ulFlowReplace);
+ flow_repl &= ~CP210X_SERIAL_RTS_MASK;
+ flow_repl |= CP210X_SERIAL_RTS_SHIFT(CP210X_SERIAL_RTS_FLOW_CTL);
+ flow_ctl.ulFlowReplace = cpu_to_le32(flow_repl);
+ cp210x_write_reg_block(port,
+ CP210X_SET_FLOW,
+ &flow_ctl,
+ sizeof(flow_ctl));
+
cflag |= CRTSCTS;
} else {
dev_dbg(dev, "%s - flow control = NONE\n", __func__);
hi Greg,
Please apply upstream 8ab49526b53d to all stable kernels containing
07e1d88adaae, which should be v4.20 and higher stable kernels.
Thanks,
Ingo
----- Forwarded message from Eric Dumazet <edumazet(a)google.com> -----
Date: Sat, 15 Aug 2020 10:38:58 -0700
From: Eric Dumazet <edumazet(a)google.com>
To: Ingo Molnar <mingo(a)kernel.org>
Cc: linux-kernel <linux-kernel(a)vger.kernel.org>, Eric Dumazet <eric.dumazet(a)gmail.com>, Jann Horn <jannh(a)google.com>, syzbot <syzkaller(a)googlegroups.com>, Andy Lutomirski <luto(a)kernel.org>, "Chang S . Bae" <chang.seok.bae(a)intel.com>, Andy Lutomirski <luto(a)amacapital.net>,
Borislav Petkov <bp(a)alien8.de>, Brian Gerst <brgerst(a)gmail.com>, Dave Hansen <dave.hansen(a)linux.intel.com>, Denys Vlasenko <dvlasenk(a)redhat.com>, "H . Peter Anvin" <hpa(a)zytor.com>, Linus Torvalds <torvalds(a)linux-foundation.org>, Markus T Metzger
<markus.t.metzger(a)intel.com>, Peter Zijlstra <peterz(a)infradead.org>, Ravi Shankar <ravi.v.shankar(a)intel.com>, Rik van Riel <riel(a)surriel.com>, Thomas Gleixner <tglx(a)linutronix.de>
Subject: Re: [PATCH] x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task
On Sat, Aug 15, 2020 at 4:48 AM Ingo Molnar <mingo(a)kernel.org> wrote:
>
>
> * Eric Dumazet <edumazet(a)google.com> wrote:
>
> > syzbot found its way in 86_fsgsbase_read_task() [1]
> >
> > Fix is to make sure ldt pointer is not NULL.
>
> Thanks for this fix. Linus has picked it up (including the typos in
> the x86_fsgsbase_read_task() function name ;-), it's now upstream
> under:
>
> 8ab49526b53d: ("x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task")
>
> By the fixes tag it looks like this should probably be backported all
> the way back to ~v4.20 or so?
This is absolutely right, sorry about the lack of a stable tag.
Most of my patches usually land in David Miller's trees, where the
stable tag is not welcome.
We use Fixes: tags to convey the exact information needed for stable backports.
Thanks.
----- End forwarded message -----
The original problem was in the nvme-over-tcp code, which mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
the __GFP_COMP flag. Such pages don't have a refcount (page_count is 0) on
their tail pages, and sending them by kernel_sendpage() may trigger a kernel
panic from a corrupted kernel heap, because these pages are incorrectly freed
in the network stack as page_count 0 pages.
This patch introduces a helper, sendpage_ok(), which returns true if the
page being checked
- is not a slab page: PageSlab(page) is false.
- has a page refcount: page_count(page) is not zero.
All drivers that want to send a page to a remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should fall back to a non-sendpage method
(e.g. sock_no_sendpage()) to handle the page.
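As an illustration (not part of this patch; the wrapper function below is
hypothetical), a caller might use the helper roughly like this:

static ssize_t example_send_page(struct socket *sock, struct page *page,
				 int offset, size_t size, int flags)
{
	/* The sendpage path takes get_page()/put_page() references, which
	 * is only safe for pages that sendpage_ok() approves of.
	 */
	if (sendpage_ok(page))
		return kernel_sendpage(sock, page, offset, size, flags);

	/* Slab pages or pages with page_count 0: fall back to a
	 * copy-based path instead.
	 */
	return sock_no_sendpage(sock, page, offset, size, flags);
}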
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
include/linux/net.h | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..97e8f1a8a427 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -286,6 +286,21 @@ do { \
#define net_get_random_once_wait(buf, nbytes) \
get_random_once_wait((buf), (nbytes))
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+ return (!PageSlab(page) && page_count(page) >= 1);
+}
+
int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
size_t num, size_t len);
int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
--
2.26.2
From: Coly Li <colyli(a)suse.de>
The original problem was in the nvme-over-tcp code, which mistakenly uses
kernel_sendpage() to send pages allocated by __get_free_pages() without
the __GFP_COMP flag. Such pages don't have a refcount (page_count is 0) on
their tail pages, and sending them by kernel_sendpage() may trigger a kernel
panic from a corrupted kernel heap, because these pages are incorrectly freed
in the network stack as page_count 0 pages.
This patch introduces a helper, sendpage_ok(), which returns true if the
page being checked
- is not a slab page: PageSlab(page) is false.
- has a page refcount: page_count(page) is not zero.
All drivers that want to send a page to a remote end by kernel_sendpage()
may use this helper to check whether the page is OK. If the helper does
not return true, the driver should fall back to a non-sendpage method
(e.g. sock_no_sendpage()) to handle the page.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
Changelog:
v4, change sendpage_ok() as an inline helper, and post it as
separate patch.
v3, introduce a more common sendpage_ok()
v2, fix typo in patch subject
v1, the initial version.
include/linux/net.h | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/include/linux/net.h b/include/linux/net.h
index d48ff1180879..97e8f1a8a427 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -286,6 +286,21 @@ do { \
#define net_get_random_once_wait(buf, nbytes) \
get_random_once_wait((buf), (nbytes))
+/*
+ * E.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+static inline bool sendpage_ok(struct page *page)
+{
+ return (!PageSlab(page) && page_count(page) >= 1);
+}
+
int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec,
size_t num, size_t len);
int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
--
2.26.2
On Tigerlake, we are seeing a repeat of commit d8f505311717 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.
However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait:
513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f505311717 ("drm/i915/icl: Forcibly evict stale csb entries")
Suggested-by: Bruce Chang <yu.bruce.chang(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.4
---
drivers/gpu/drm/i915/gt/intel_lrc.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index db982fc0f0bc..3b8161c6b601 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2498,9 +2498,22 @@ invalidate_csb_entries(const u64 *first, const u64 *last)
*/
static inline bool gen12_csb_parse(const u64 *csb)
{
- u64 entry = READ_ONCE(*csb);
- bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry));
- bool new_queue =
+ bool ctx_away_valid;
+ bool new_queue;
+ u64 entry;
+
+ /* XXX HSD */
+ entry = READ_ONCE(*csb);
+ if (unlikely(entry == -1)) {
+ preempt_disable();
+ if (wait_for_atomic_us((entry = READ_ONCE(*csb)) != -1, 50))
+ GEM_WARN_ON("50us CSB timeout");
+ preempt_enable();
+ }
+ WRITE_ONCE(*(u64 *)csb, -1);
+
+ ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry));
+ new_queue =
lower_32_bits(entry) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
/*
@@ -3995,6 +4008,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
WRITE_ONCE(*execlists->csb_write, reset_value);
wmb(); /* Make sure this is visible to HW (paranoia?) */
+ /* Check that the GPU does indeed update the CSB entries! */
+ memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
invalidate_csb_entries(&execlists->csb_status[0],
&execlists->csb_status[reset_value]);
--
2.20.1
From: Baoquan He <bhe(a)redhat.com>
Subject: Revert "mm/vmstat.c: do not show lowmem reserve protection information of empty zone"
This reverts commit 26e7deadaae175.
Sonny reported that one of their tests started failing on the latest
kernel on their Chrome OS platform. The root cause is that the above
commit removed the protection line for empty zones, while the parser used in
the test relies on the protection line to mark the end of each zone.
Let's revert it to avoid breaking userspace testing or applications.
Link: http://lkml.kernel.org/r/20200811075412.12872-1-bhe@redhat.com
Fixes: 26e7deadaae175 ("mm/vmstat.c: do not show lowmem reserve protection information of empty zone")
Signed-off-by: Baoquan He <bhe(a)redhat.com>
Reported-by: Sonny Rao <sonnyrao(a)chromium.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Cc: <stable(a)vger.kernel.org> [5.8.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmstat.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/mm/vmstat.c~revert-mm-vmstatc-do-not-show-lowmem-reserve-protection-information-of-empty-zone
+++ a/mm/vmstat.c
@@ -1642,12 +1642,6 @@ static void zoneinfo_show_print(struct s
zone->present_pages,
zone_managed_pages(zone));
- /* If unpopulated, no other information is useful */
- if (!populated_zone(zone)) {
- seq_putc(m, '\n');
- return;
- }
-
seq_printf(m,
"\n protection: (%ld",
zone->lowmem_reserve[0]);
@@ -1655,6 +1649,12 @@ static void zoneinfo_show_print(struct s
seq_printf(m, ", %ld", zone->lowmem_reserve[i]);
seq_putc(m, ')');
+ /* If unpopulated, no other information is useful */
+ if (!populated_zone(zone)) {
+ seq_putc(m, '\n');
+ return;
+ }
+
for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
seq_printf(m, "\n %-12s %lu", zone_stat_name(i),
zone_page_state(zone, i));
_
The patch titled
Subject: khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
has been added to the -mm tree. Its filename is
khugepaged-adjust-vm_bug_on_mm-in-__khugepaged_enter.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/khugepaged-adjust-vm_bug_on_mm-in-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/khugepaged-adjust-vm_bug_on_mm-in-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
syzbot crashes on the VM_BUG_ON_MM(khugepaged_test_exit(mm), mm) in
__khugepaged_enter(): yes, when one thread is about to dump core, has set
core_state, and is waiting for others, another might do something calling
__khugepaged_enter(), which now crashes because I lumped the core_state
test (known as "mmget_still_valid") into khugepaged_test_exit(). I still
think it's best to lump them together, so just in this exceptional case,
check mm->mm_users directly instead of khugepaged_test_exit().
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008141503370.18085@eggly.anvils
Fixes: bbe98f9cadff ("khugepaged: khugepaged_test_exit() check mmget_still_valid()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/khugepaged.c~khugepaged-adjust-vm_bug_on_mm-in-__khugepaged_enter
+++ a/mm/khugepaged.c
@@ -466,7 +466,7 @@ int __khugepaged_enter(struct mm_struct
return -ENOMEM;
/* __khugepaged_exit() must not run from under us */
- VM_BUG_ON_MM(khugepaged_test_exit(mm), mm);
+ VM_BUG_ON_MM(atomic_read(&mm->mm_users) == 0, mm);
if (unlikely(test_and_set_bit(MMF_VM_HUGEPAGE, &mm->flags))) {
free_mm_slot(mm_slot);
return 0;
_
Patches currently in -mm which might be from hughd(a)google.com are
dma-debug-fix-debug_dma_assert_idle-use-rcu_read_lock.patch
khugepaged-adjust-vm_bug_on_mm-in-__khugepaged_enter.patch
Hi,
Can you queue this up for 5.4 as well? Thanks!
-------- Forwarded Message --------
Subject: [PATCH] fs/io_uring.c: Fix uninitialized variable is referenced in io_submit_sqe
Date: Wed, 12 Aug 2020 23:56:44 -0700
From: Liu Yong <pkfxxxing(a)gmail.com>
To: Jens Axboe <axboe(a)kernel.dk>
CC: linux-block(a)vger.kernel.org, linux-kernel(a)vger.kernel.org, linux-fsdevel(a)vger.kernel.org
The commit a4d61e66ee4a ("io_uring: prevent re-read of sqe->opcode")
caused another vulnerability. After io_get_req(), the sqe_submit struct
in req is not initialized, but the following code assumes that
req->submit.opcode is available.
Signed-off-by: Liu Yong <pkfxxxing(a)gmail.com>
---
fs/io_uring.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c
index be3d595a607f..c1aaee061dae 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2559,6 +2559,7 @@ static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
goto err;
}
+ memcpy(&req->submit, s, sizeof(*s));
ret = io_req_set_file(ctx, s, state, req);
if (unlikely(ret)) {
err_req:
--
2.17.1
Add the skcd->no_refcnt check that was missed when backporting
ad0f75e5f57c ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()").
This patch is needed in stable-4.9, stable-4.14 and stable-4.19.
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/cgroup/cgroup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 6ae98c714edd..2a879d34bbe5 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5957,6 +5957,8 @@ void cgroup_sk_clone(struct sock_cgroup_data *skcd)
{
/* Socket clone path */
if (skcd->val) {
+ if (skcd->no_refcnt)
+ return;
/*
* We might be cloning a socket which is left in an empty
* cgroup and the cgroup might have already been rmdir'd.
--
2.25.1
The v4.4 stable kernel lacks this bugfix:
commit 327868212381 ("make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error").
As a result, the v4.4 kernel can deliver corrupt data to the application
when a corrupt UDP packet is closely followed by a valid UDP packet: the
same invocation of the recvmsg() syscall can deliver the corrupt packet's
UDP payload to the application with the UDP payload length and the
"from IP/Port" of the valid packet.
Details:
For a UDP packet longer than 76 bytes (see the v5.8-rc6 kernel's
include/linux/skbuff.h:3951), Linux delays the UDP checksum verification
until the application invokes the syscall recvmsg().
In the recvmsg() syscall handler, while Linux is copying the UDP payload
to the application's memory, it calculates the UDP checksum. If the
calculated checksum doesn't match the received checksum, Linux drops the
corrupt UDP packet, and then starts to process the next packet (if any),
and if the next packet is valid (i.e. the checksum is correct), Linux
will copy the valid UDP packet's payload to the application's receiver
buffer.
The bug is: before Linux starts to copy the valid UDP packet, the data
structure used to track how many more bytes should be copied to the
application memory is not reset to what it was when the application just
entered the kernel by the syscall! Consequently, only a small portion or
none of the valid packet's payload is copied to the application's
receive buffer, and later when the application exits from the kernel,
actually most of the application's receive buffer contains the payload
of the corrupt packet while recvmsg() returns the length of the UDP
payload of the valid packet.
For the mainline kernel, the bug was fixed in commit 327868212381,
but unluckily the bugfix was only backported to v4.9+. It turns out that
backporting 327868212381 to v4.4 means some supporting patches must be
backported first, so the overall change would be too big; the alternative
is to perform the csum validation earlier and drop the corrupt packets
earlier.
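For context, a rough sketch of what "preserve ->msg_iter on error" means
in practice (an illustration of the mainline approach only, not the v4.4
change below; the helper name is made up): snapshot the iterator before
the copy-and-checksum step and restore it on failure, so that a following
valid packet is copied from the start of the user buffer:

static int copy_and_csum_sketch(struct sk_buff *skb, int hlen,
				struct msghdr *msg)
{
	/* Snapshot the iterator state before copying to user space. */
	struct iov_iter save = msg->msg_iter;
	int err = skb_copy_and_csum_datagram_msg(skb, hlen, msg);

	/* Copy or checksum failed: rewind so the next packet starts at
	 * the beginning of the user buffer again.
	 */
	if (err)
		msg->msg_iter = save;
	return err;
}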
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Dexuan Cui <decui(a)microsoft.com>
---
net/ipv4/udp.c | 3 +--
net/ipv6/udp.c | 6 ++----
2 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index bb30699..49ab587 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
}
}
- if (rcu_access_pointer(sk->sk_filter) &&
- udp_lib_checksum_complete(skb))
+ if (udp_lib_checksum_complete(skb))
goto csum_error;
if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 73f1112..2d6703d 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -686,10 +686,8 @@ int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
}
}
- if (rcu_access_pointer(sk->sk_filter)) {
- if (udp_lib_checksum_complete(skb))
- goto csum_error;
- }
+ if (udp_lib_checksum_complete(skb))
+ goto csum_error;
if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) {
UDP6_INC_STATS_BH(sock_net(sk),
--
1.8.3.1
From: Oleksij Rempel <o.rempel(a)pengutronix.de>
This patch adds a check to ensure that struct net_device::ml_priv is
allocated, as it is used later by the j1939 stack.
The allocation is done by all mainline CAN network drivers, but when using
bond or team devices this is not the case.
Bail out if no ml_priv is allocated.
Reported-by: syzbot+f03d384f3455d28833eb(a)syzkaller.appspotmail.com
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Cc: linux-stable <stable(a)vger.kernel.org> # >= v5.4
Signed-off-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
Link: https://lore.kernel.org/r/20200807105200.26441-4-o.rempel@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
net/can/j1939/socket.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index ad973370de12..b93876c57fc4 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -467,6 +467,14 @@ static int j1939_sk_bind(struct socket *sock, struct sockaddr *uaddr, int len)
goto out_release_sock;
}
+ if (!ndev->ml_priv) {
+ netdev_warn_once(ndev,
+ "No CAN mid layer private allocated, please fix your driver and use alloc_candev()!\n");
+ dev_put(ndev);
+ ret = -ENODEV;
+ goto out_release_sock;
+ }
+
priv = j1939_netdev_start(ndev);
dev_put(ndev);
if (IS_ERR(priv)) {
--
2.28.0
From: Oleksij Rempel <o.rempel(a)pengutronix.de>
The current stack implementation does not support ECTS requests for
TP-sized blocks that are not aligned.
If ECTS requests a block whose size and offset span two TP blocks, this
will cause memcpy() to read beyond the queued skb (which only contains
one TP-sized block).
Sometimes KASAN will detect this read if the memory region beyond the skb
was previously allocated and freed; in other situations it stays
undetected. In any case, the ETP transfer will be corrupted.
This patch adds a sanity check to avoid this kind of read and aborts the
session with error J1939_XTP_ABORT_ECTS_TOO_BIG.
Reported-by: syzbot+5322482fe520b02aea30(a)syzkaller.appspotmail.com
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Cc: linux-stable <stable(a)vger.kernel.org> # >= v5.4
Signed-off-by: Oleksij Rempel <o.rempel(a)pengutronix.de>
Link: https://lore.kernel.org/r/20200807105200.26441-3-o.rempel@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
---
net/can/j1939/transport.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index b135c5e2a86e..30957c9a8eb7 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -787,6 +787,18 @@ static int j1939_session_tx_dat(struct j1939_session *session)
if (len > 7)
len = 7;
+ if (offset + len > se_skb->len) {
+ netdev_err_once(priv->ndev,
+ "%s: 0x%p: requested data outside of queued buffer: offset %i, len %i, pkt.tx: %i\n",
+ __func__, session, skcb->offset, se_skb->len , session->pkt.tx);
+ return -EOVERFLOW;
+ }
+
+ if (!len) {
+ ret = -ENOBUFS;
+ break;
+ }
+
memcpy(&dat[1], &tpdat[offset], len);
ret = j1939_tp_tx_dat(session, dat, len + 1);
if (ret < 0) {
@@ -1120,6 +1132,9 @@ static enum hrtimer_restart j1939_tp_txtimer(struct hrtimer *hrtimer)
* cleanup including propagation of the error to user space.
*/
break;
+ case -EOVERFLOW:
+ j1939_session_cancel(session, J1939_XTP_ABORT_ECTS_TOO_BIG);
+ break;
case 0:
session->tx_retry = 0;
break;
--
2.28.0
This patch fixes an ext4 direct I/O read error that occurs when the read
size is not aligned with the block size.
The following test demonstrates the error.
(1) Make a file whose size is not aligned with the block size:
$dd if=/dev/zero of=./test.jar bs=1000 count=3
(2) I wrote a source file named "direct_io_read_file.c" as following:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/file.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#define BUF_SIZE 1024
int main()
{
	int fd;
	int ret;
	unsigned char *buf;

	ret = posix_memalign((void **)&buf, 512, BUF_SIZE);
	if (ret) {
		perror("posix_memalign failed");
		exit(1);
	}

	fd = open("./test.jar", O_RDONLY | O_DIRECT, 0755);
	if (fd < 0) {
		perror("open ./test.jar failed");
		exit(1);
	}

	do {
		ret = read(fd, buf, BUF_SIZE);
		printf("ret=%d\n", ret);
		if (ret < 0) {
			perror("write test.jar failed");
		}
	} while (ret > 0);

	free(buf);
	close(fd);
}
(3) Compile the source file:
$gcc direct_io_read_file.c -D_GNU_SOURCE
(4) Run the test program:
$./a.out
The result is as following:
ret=1024
ret=1024
ret=952
ret=-1
write test.jar failed: Invalid argument.
I have tested this program on the XFS filesystem; XFS does not have
this problem, because XFS uses iomap_dio_rw() to do direct I/O reads,
and the comparison between the read offset and the file size is done
in iomap_dio_rw(). The code is as follows:
	if (pos < size) {
		retval = filemap_write_and_wait_range(mapping, pos,
				pos + iov_length(iov, nr_segs) - 1);
		if (!retval) {
			retval = mapping->a_ops->direct_IO(READ, iocb,
					iov, pos, nr_segs);
		}
		...
	}
...only when "pos < size" can direct I/O be done; otherwise 0 will be
returned.
I have tested the fix patch on ext4; it passes muster against the
EINVAL description in read(2), quoted below:
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
EINVAL
fd is attached to an object which is unsuitable for reading;
or the file was opened with the O_DIRECT flag, and either the
address specified in buf, the value specified in count, or the
current file offset is not suitably aligned.
So I think this patch can be applied to fix the ext4 direct I/O error.
However, ext4 introduced direct I/O reads using the iomap infrastructure
in kernel 5.5 with commit b1b4705d54ab
("ext4: introduce direct I/O read using iomap infrastructure"),
after which ext4 behaves the same as XFS: both use iomap_dio_rw() to do
direct I/O reads. So this problem does not exist on kernel 5.5 for ext4.
From the above description, we can see this problem exists on all kernel
versions between kernel 3.14 and kernel 5.4. It causes applications to
fail to read. For example, when the search service downloads a new full
index file while the search engine is loading the previous index file and
processing search requests, it cannot use buffered I/O, which may squeeze
the in-use previous index file out of the page cache, so the search
service must use direct I/O reads.
Please apply this patch to these kernel versions, or use the approach
taken in kernel 5.5 to fix this problem.
Fixes: 9fe55eea7e4b ("Fix race when checking i_size on direct i/o read")
Reviewed-by: Jan Kara <jack(a)suse.cz>
Co-developed-by: Wang Long <wanglong19(a)meituan.com>
Signed-off-by: Wang Long <wanglong19(a)meituan.com>
Signed-off-by: Jiang Ying <jiangying8582(a)126.com>
---
fs/ext4/inode.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 516faa2..a66b0ac 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3821,6 +3821,11 @@ static ssize_t ext4_direct_IO_read(struct kiocb *iocb, struct iov_iter *iter)
struct inode *inode = mapping->host;
size_t count = iov_iter_count(iter);
ssize_t ret;
+ loff_t offset = iocb->ki_pos;
+ loff_t size = i_size_read(inode);
+
+ if (offset >= size)
+ return 0;
/*
* Shared inode_lock is enough for us - it protects against concurrent
--
1.8.3.1
The patch titled
Subject: bootconfig: fix off-by-one in xbc_node_compose_key_after()
has been removed from the -mm tree. Its filename was
bootconfig-fix-off-by-one-in-xbc_node_compose_key_after.patch
This patch was dropped because it was withdrawn
------------------------------------------------------
From: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Subject: bootconfig: fix off-by-one in xbc_node_compose_key_after()
While reviewing some patches for bootconfig, I noticed the following
code in xbc_node_compose_key_after():
	ret = snprintf(buf, size, "%s%s", xbc_node_get_data(node),
		       depth ? "." : "");
	if (ret < 0)
		return ret;
	if (ret > size) {
		size = 0;
	} else {
		size -= ret;
		buf += ret;
	}
But snprintf() returns the number of bytes that would be written, not
the number of bytes that are written (ignoring the nul terminator).
This means that if the number of non null bytes written were to equal
size, then the nul byte, which snprintf() always adds, will overwrite
that last byte.
ret = snprintf(buf, 5, "hello");
printf("buf = '%s'
", buf);
printf("ret = %d
", ret);
produces:
buf = 'hell'
ret = 5
The string was truncated without ret being greater than 5.
Test (ret >= size) for overwrite.
Link: http://lkml.kernel.org/r/20200813183050.029a6003@oasis.local.home
Fixes: 76db5a27a827c ("bootconfig: Add Extra Boot Config support")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/bootconfig.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/lib/bootconfig.c~bootconfig-fix-off-by-one-in-xbc_node_compose_key_after
+++ a/lib/bootconfig.c
@@ -248,7 +248,7 @@ int __init xbc_node_compose_key_after(st
depth ? "." : "");
if (ret < 0)
return ret;
- if (ret > size) {
+ if (ret >= size) {
size = 0;
} else {
size -= ret;
_
Patches currently in -mm which might be from rostedt(a)goodmis.org are
The patch titled
Subject: mm, page_alloc: fix core hung in free_pcppages_bulk()
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-core-hung-in-free_pcppages_bulk.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-core-hung-in-fre…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-core-hung-in-fre…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Charan Teja Reddy <charante(a)codeaurora.org>
Subject: mm, page_alloc: fix core hung in free_pcppages_bulk()
The following race is observed with repeated online, offline and a delay
between two successive onlines of memory blocks in the movable zone.

P1: Online the first memory block in the movable zone. The pcp struct
    values are initialized to their defaults, i.e., pcp->high = 0 and
    pcp->batch = 1.

P2: Allocate the pages from the movable zone.

P1: Try to online the second memory block in the movable zone; it has
    entered online_pages() but has not yet called zone_pcp_update().

P2: This process enters the exit path and tries to release its order-0
    pages to the pcp lists through free_unref_page_commit(). As
    pcp->high = 0 and pcp->count = 1, it proceeds to call
    free_pcppages_bulk().

P1: Update the pcp values; the new pcp values are, say, pcp->high = 378
    and pcp->batch = 63.

P2: Read the pcp's batch value using READ_ONCE() and pass it to
    free_pcppages_bulk(); the pcp values passed here are batch = 63,
    count = 1. Since the number of pages in the pcp lists is less than
    ->batch, it gets stuck in the while (list_empty(list)) loop with
    interrupts disabled, and thus the core hangs.
Avoid this by ensuring free_pcppages_bulk() is called with a proper count
of pcp list pages.
The mentioned race is somewhat easily reproducible without [1] because
pcp's are not updated for the first memory block onlined, and thus there
is a wide enough race window for P2 between the alloc+free and the pcp
struct values update through the onlining of the second memory block.
With [1], the race still exists but it is very narrow, as we update the
pcp struct values for the first memory block online itself.
This is not limited to the movable zone, it could also happen in cases
with the normal zone (e.g., hotplug to a node that only has DMA memory, or
no other memory yet).
[1]: https://patchwork.kernel.org/patch/11696389/
Link: http://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaur…
Fixes: 5f8dcc21211a ("page-allocator: split per-cpu list into one-list-per-migrate-type")
Signed-off-by: Charan Teja Reddy <charante(a)codeaurora.org>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Vinayak Menon <vinmenon(a)codeaurora.org>
Cc: <stable(a)vger.kernel.org> [2.6+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 5 +++++
1 file changed, 5 insertions(+)
--- a/mm/page_alloc.c~mm-page_alloc-fix-core-hung-in-free_pcppages_bulk
+++ a/mm/page_alloc.c
@@ -1301,6 +1301,11 @@ static void free_pcppages_bulk(struct zo
struct page *page, *tmp;
LIST_HEAD(head);
+ /*
+ * Ensure proper count is passed which otherwise would stuck in the
+ * below while (list_empty(list)) loop.
+ */
+ count = min(pcp->count, count);
while (count) {
struct list_head *list;
_
Patches currently in -mm which might be from charante(a)codeaurora.org are
mm-page_alloc-fix-core-hung-in-free_pcppages_bulk.patch
The patch titled
Subject: bootconfig: fix off-by-one in xbc_node_compose_key_after()
has been added to the -mm tree. Its filename is
bootconfig-fix-off-by-one-in-xbc_node_compose_key_after.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/bootconfig-fix-off-by-one-in-xbc_n…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/bootconfig-fix-off-by-one-in-xbc_n…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Subject: bootconfig: fix off-by-one in xbc_node_compose_key_after()
While reviewing some patches for bootconfig, I noticed the following
code in xbc_node_compose_key_after():
	ret = snprintf(buf, size, "%s%s", xbc_node_get_data(node),
		       depth ? "." : "");
	if (ret < 0)
		return ret;
	if (ret > size) {
		size = 0;
	} else {
		size -= ret;
		buf += ret;
	}
But snprintf() returns the number of bytes that would be written, not
the number of bytes that are written (ignoring the nul terminator).
This means that if the number of non null bytes written were to equal
size, then the nul byte, which snprintf() always adds, will overwrite
that last byte.
ret = snprintf(buf, 5, "hello");
printf("buf = '%s'
", buf);
printf("ret = %d
", ret);
produces:
buf = 'hell'
ret = 5
The string was truncated without ret being greater than 5.
Test (ret >= size) for overwrite.
Link: http://lkml.kernel.org/r/20200813183050.029a6003@oasis.local.home
Fixes: 76db5a27a827c ("bootconfig: Add Extra Boot Config support")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
Cc: Masami Hiramatsu <mhiramat(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/bootconfig.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/lib/bootconfig.c~bootconfig-fix-off-by-one-in-xbc_node_compose_key_after
+++ a/lib/bootconfig.c
@@ -248,7 +248,7 @@ int __init xbc_node_compose_key_after(st
depth ? "." : "");
if (ret < 0)
return ret;
- if (ret > size) {
+ if (ret >= size) {
size = 0;
} else {
size -= ret;
_
Patches currently in -mm which might be from rostedt(a)goodmis.org are
bootconfig-fix-off-by-one-in-xbc_node_compose_key_after.patch
From: Paul Hsieh <paul.hsieh(a)amd.com>
[Why]
Placing the cursor in the center of the screen between two pipes and then
adjusting the viewport doesn't update the cursor, causing a DFPstate hang.
[How]
If the viewport changed, update the cursor as well.
Cc: stable(a)vger.kernel.org
Signed-off-by: Paul Hsieh <paul.hsieh(a)amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr(a)amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira(a)amd.com>
---
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
index 66180b4332f1..c8cfd3ba1c15 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -1457,8 +1457,8 @@ static void dcn20_update_dchubp_dpp(
/* Any updates are handled in dc interface, just need to apply existing for plane enable */
if ((pipe_ctx->update_flags.bits.enable || pipe_ctx->update_flags.bits.opp_changed ||
- pipe_ctx->update_flags.bits.scaler || pipe_ctx->update_flags.bits.viewport)
- && pipe_ctx->stream->cursor_attributes.address.quad_part != 0) {
+ pipe_ctx->update_flags.bits.scaler || viewport_changed == true) &&
+ pipe_ctx->stream->cursor_attributes.address.quad_part != 0) {
dc->hwss.set_cursor_position(pipe_ctx);
dc->hwss.set_cursor_attribute(pipe_ctx);
--
2.28.0
Add the skcd->no_refcnt check that was missed when backporting
ad0f75e5f57c ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()").
This patch is needed in stable-4.9, stable-4.14 and stable-4.19.
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/cgroup/cgroup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index a2fed8fbd2bd..ada060e628ce 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -5827,6 +5827,8 @@ void cgroup_sk_clone(struct sock_cgroup_data *skcd)
{
/* Socket clone path */
if (skcd->val) {
+ if (skcd->no_refcnt)
+ return;
/*
* We might be cloning a socket which is left in an empty
* cgroup and the cgroup might have already been rmdir'd.
--
2.25.1
Add the skcd->no_refcnt check that was missed when backporting
ad0f75e5f57c ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()").
This patch is needed in stable-4.9, stable-4.14 and stable-4.19.
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/cgroup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f047c73189f3..684d02f343b4 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -6355,6 +6355,8 @@ void cgroup_sk_clone(struct sock_cgroup_data *skcd)
{
/* Socket clone path */
if (skcd->val) {
+ if (skcd->no_refcnt)
+ return;
/*
* We might be cloning a socket which is left in an empty
* cgroup and the cgroup might have already been rmdir'd.
--
2.25.1
Add the skcd->no_refcnt check that was missed when backporting
ad0f75e5f57c ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()").
This patch is needed in stable-4.9, stable-4.14 and stable-4.19.
Signed-off-by: Yang Yingliang <yangyingliang(a)huawei.com>
---
kernel/cgroup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index f047c73189f3..684d02f343b4 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -6355,6 +6355,8 @@ void cgroup_sk_clone(struct sock_cgroup_data *skcd)
{
/* Socket clone path */
if (skcd->val) {
+ if (skcd->no_refcnt)
+ return;
/*
* We might be cloning a socket which is left in an empty
* cgroup and the cgroup might have already been rmdir'd.
--
2.25.1
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: f107cee94ba4d2c7357fde59a1d84346c73d4958
Gitweb: https://git.kernel.org/tip/f107cee94ba4d2c7357fde59a1d84346c73d4958
Author: Guenter Roeck <linux(a)roeck-us.net>
AuthorDate: Tue, 11 Aug 2020 11:00:12 -07:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Thu, 13 Aug 2020 09:35:59 +02:00
genirq: Unlock irq descriptor after errors
In irq_set_irqchip_state(), the irq descriptor is not unlocked after an
error is encountered. While that should never happen in practice, a buggy
driver may trigger it. This would result in a lockup, so fix it.
Fixes: 1d0326f352bb ("genirq: Check irq_data_get_irq_chip() return value before use")
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200811180012.80269-1-linux@roeck-us.net
---
kernel/irq/manage.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index d55ba62..52ac539 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -2731,8 +2731,10 @@ int irq_set_irqchip_state(unsigned int irq, enum irqchip_irq_state which,
do {
chip = irq_data_get_irq_chip(data);
- if (WARN_ON_ONCE(!chip))
- return -ENODEV;
+ if (WARN_ON_ONCE(!chip)) {
+ err = -ENODEV;
+ goto out_unlock;
+ }
if (chip->irq_set_irqchip_state)
break;
#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
@@ -2745,6 +2747,7 @@ int irq_set_irqchip_state(unsigned int irq, enum irqchip_irq_state which,
if (data)
err = chip->irq_set_irqchip_state(data, which, val);
+out_unlock:
irq_put_desc_busunlock(desc, flags);
return err;
}
The binfmt_flat loader uses the gap between text and data to store
data-segment pointers for shared libraries. Even in the absence of shared
libraries it stores at least one pointer, to the executable's own data
segment. Text and data can sit back to back in the flat binary image, and
without offsetting the data segment the last few instructions in the text
segment may get corrupted by the data-segment pointer.
Fix it by reverting commit a2357223c50a ("binfmt_flat: don't offset the
data start").
Cc: stable(a)vger.kernel.org
Fixes: a2357223c50a ("binfmt_flat: don't offset the data start")
Signed-off-by: Max Filippov <jcmvbkbc(a)gmail.com>
---
fs/binfmt_flat.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/fs/binfmt_flat.c b/fs/binfmt_flat.c
index f2f9086ebe98..b9c658e0548e 100644
--- a/fs/binfmt_flat.c
+++ b/fs/binfmt_flat.c
@@ -576,7 +576,7 @@ static int load_flat_file(struct linux_binprm *bprm,
goto err;
}
- len = data_len + extra;
+ len = data_len + extra + MAX_SHARED_LIBS * sizeof(unsigned long);
len = PAGE_ALIGN(len);
realdatastart = vm_mmap(NULL, 0, len,
PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 0);
@@ -590,7 +590,9 @@ static int load_flat_file(struct linux_binprm *bprm,
vm_munmap(textpos, text_len);
goto err;
}
- datapos = ALIGN(realdatastart, FLAT_DATA_ALIGN);
+ datapos = ALIGN(realdatastart +
+ MAX_SHARED_LIBS * sizeof(unsigned long),
+ FLAT_DATA_ALIGN);
pr_debug("Allocated data+bss+stack (%u bytes): %lx\n",
data_len + bss_len + stack_len, datapos);
@@ -620,7 +622,7 @@ static int load_flat_file(struct linux_binprm *bprm,
memp_size = len;
} else {
- len = text_len + data_len + extra;
+ len = text_len + data_len + extra + MAX_SHARED_LIBS * sizeof(u32);
len = PAGE_ALIGN(len);
textpos = vm_mmap(NULL, 0, len,
PROT_READ | PROT_EXEC | PROT_WRITE, MAP_PRIVATE, 0);
@@ -635,7 +637,9 @@ static int load_flat_file(struct linux_binprm *bprm,
}
realdatastart = textpos + ntohl(hdr->data_start);
- datapos = ALIGN(realdatastart, FLAT_DATA_ALIGN);
+ datapos = ALIGN(realdatastart +
+ MAX_SHARED_LIBS * sizeof(u32),
+ FLAT_DATA_ALIGN);
reloc = (__be32 __user *)
(datapos + (ntohl(hdr->reloc_start) - text_len));
@@ -652,9 +656,8 @@ static int load_flat_file(struct linux_binprm *bprm,
(text_len + full_data
- sizeof(struct flat_hdr)),
0);
- if (datapos != realdatastart)
- memmove((void *)datapos, (void *)realdatastart,
- full_data);
+ memmove((void *) datapos, (void *) realdatastart,
+ full_data);
#else
/*
* This is used on MMU systems mainly for testing.
@@ -710,7 +713,8 @@ static int load_flat_file(struct linux_binprm *bprm,
if (IS_ERR_VALUE(result)) {
ret = result;
pr_err("Unable to read code+data+bss, errno %d\n", ret);
- vm_munmap(textpos, text_len + data_len + extra);
+ vm_munmap(textpos, text_len + data_len + extra +
+ MAX_SHARED_LIBS * sizeof(u32));
goto err;
}
}
--
2.20.1
The patch titled
Subject: fs/minix: reject too-large maximum file size
has been removed from the -mm tree. Its filename was
fs-minix-reject-too-large-maximum-file-size.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Eric Biggers <ebiggers(a)google.com>
Subject: fs/minix: reject too-large maximum file size
If the minix filesystem tries to map a very large logical block number to
its on-disk location, block_to_path() can return offsets that are too
large, causing out-of-bounds memory accesses when accessing indirect index
blocks. This should be prevented by the check against the maximum file
size, but this doesn't work because the maximum file size is read directly
from the on-disk superblock and isn't validated itself.
Fix this by validating the maximum file size at mount time.
Link: http://lkml.kernel.org/r/20200628060846.682158-4-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+c7d9ec7a1a7272dd71b3(a)syzkaller.appspotmail.com
Reported-by: syzbot+3b7b03a0c28948054fb5(a)syzkaller.appspotmail.com
Reported-by: syzbot+6e056ee473568865f3e6(a)syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/minix/inode.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
--- a/fs/minix/inode.c~fs-minix-reject-too-large-maximum-file-size
+++ a/fs/minix/inode.c
@@ -150,6 +150,23 @@ static int minix_remount (struct super_b
return 0;
}
+static bool minix_check_superblock(struct minix_sb_info *sbi)
+{
+ if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
+ return false;
+
+ /*
+ * s_max_size must not exceed the block mapping limitation. This check
+ * is only needed for V1 filesystems, since V2/V3 support an extra level
+ * of indirect blocks which places the limit well above U32_MAX.
+ */
+ if (sbi->s_version == MINIX_V1 &&
+ sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+ return false;
+
+ return true;
+}
+
static int minix_fill_super(struct super_block *s, void *data, int silent)
{
struct buffer_head *bh;
@@ -228,11 +245,12 @@ static int minix_fill_super(struct super
} else
goto out_no_fs;
+ if (!minix_check_superblock(sbi))
+ goto out_illegal_sb;
+
/*
* Allocate the buffer map to keep the superblock small.
*/
- if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
- goto out_illegal_sb;
i = (sbi->s_imap_blocks + sbi->s_zmap_blocks) * sizeof(bh);
map = kzalloc(i, GFP_KERNEL);
if (!map)
_
Patches currently in -mm which might be from ebiggers(a)google.com are
The patch titled
Subject: fs/minix: don't allow getting deleted inodes
has been removed from the -mm tree. Its filename was
fs-minix-dont-allow-getting-deleted-inodes.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Eric Biggers <ebiggers(a)google.com>
Subject: fs/minix: don't allow getting deleted inodes
If an inode has no links, we need to mark it bad rather than allowing it
to be accessed. This avoids WARNINGs in inc_nlink() and drop_nlink() when
doing directory operations on a fuzzed filesystem.
Link: http://lkml.kernel.org/r/20200628060846.682158-3-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+a9ac3de1b5de5fb10efc(a)syzkaller.appspotmail.com
Reported-by: syzbot+df958cf5688a96ad3287(a)syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/minix/inode.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
--- a/fs/minix/inode.c~fs-minix-dont-allow-getting-deleted-inodes
+++ a/fs/minix/inode.c
@@ -468,6 +468,13 @@ static struct inode *V1_minix_iget(struc
iget_failed(inode);
return ERR_PTR(-EIO);
}
+ if (raw_inode->i_nlinks == 0) {
+ printk("MINIX-fs: deleted inode referenced: %lu\n",
+ inode->i_ino);
+ brelse(bh);
+ iget_failed(inode);
+ return ERR_PTR(-ESTALE);
+ }
inode->i_mode = raw_inode->i_mode;
i_uid_write(inode, raw_inode->i_uid);
i_gid_write(inode, raw_inode->i_gid);
@@ -501,6 +508,13 @@ static struct inode *V2_minix_iget(struc
iget_failed(inode);
return ERR_PTR(-EIO);
}
+ if (raw_inode->i_nlinks == 0) {
+ printk("MINIX-fs: deleted inode referenced: %lu\n",
+ inode->i_ino);
+ brelse(bh);
+ iget_failed(inode);
+ return ERR_PTR(-ESTALE);
+ }
inode->i_mode = raw_inode->i_mode;
i_uid_write(inode, raw_inode->i_uid);
i_gid_write(inode, raw_inode->i_gid);
_
Patches currently in -mm which might be from ebiggers(a)google.com are
The patch titled
Subject: fs/minix: check return value of sb_getblk()
has been removed from the -mm tree. Its filename was
fs-minix-check-return-value-of-sb_getblk.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Eric Biggers <ebiggers(a)google.com>
Subject: fs/minix: check return value of sb_getblk()
Patch series "fs/minix: fix syzbot bugs and set s_maxbytes".
This series fixes all syzbot bugs in the minix filesystem:
KASAN: null-ptr-deref Write in get_block
KASAN: use-after-free Write in get_block
KASAN: use-after-free Read in get_block
WARNING in inc_nlink
KMSAN: uninit-value in get_block
WARNING in drop_nlink
It also fixes the minix filesystem to set s_maxbytes correctly, so that
userspace sees the correct behavior when exceeding the max file size.
This patch (of 6):
sb_getblk() can fail, so check its return value.
This fixes a NULL pointer dereference.
Originally from Qiujun Huang.
Link: http://lkml.kernel.org/r/20200628060846.682158-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200628060846.682158-2-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Reported-by: syzbot+4a88b2b9dc280f47baf4(a)syzkaller.appspotmail.com
Cc: Qiujun Huang <anenbupt(a)gmail.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/minix/itree_common.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/minix/itree_common.c~fs-minix-check-return-value-of-sb_getblk
+++ a/fs/minix/itree_common.c
@@ -75,6 +75,7 @@ static int alloc_branch(struct inode *in
int n = 0;
int i;
int parent = minix_new_block(inode);
+ int err = -ENOSPC;
branch[0].key = cpu_to_block(parent);
if (parent) for (n = 1; n < num; n++) {
@@ -85,6 +86,11 @@ static int alloc_branch(struct inode *in
break;
branch[n].key = cpu_to_block(nr);
bh = sb_getblk(inode->i_sb, parent);
+ if (!bh) {
+ minix_free_block(inode, nr);
+ err = -ENOMEM;
+ break;
+ }
lock_buffer(bh);
memset(bh->b_data, 0, bh->b_size);
branch[n].bh = bh;
@@ -103,7 +109,7 @@ static int alloc_branch(struct inode *in
bforget(branch[i].bh);
for (i = 0; i < n; i++)
minix_free_block(inode, block_to_cpu(branch[i].key));
- return -ENOSPC;
+ return err;
}
static inline int splice_branch(struct inode *inode,
_
Patches currently in -mm which might be from ebiggers(a)google.com are
The patch titled
Subject: cma: don't quit at first error when activating reserved areas
has been removed from the -mm tree. Its filename was
cma-dont-quit-at-first-error-when-activating-reserved-areas.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: cma: don't quit at first error when activating reserved areas
The routine cma_init_reserved_areas is designed to activate all
reserved cma areas. It quits when it first encounters an error.
This can leave some areas in a state where they are reserved but
not activated. There is no feedback to code which performed the
reservation. Attempting to allocate memory from areas in such a
state will result in a BUG.
Modify cma_init_reserved_areas to always attempt to activate all
areas. The called routine, cma_activate_area is responsible for
leaving the area in a valid state. No one is making active use
of returned error codes, so change the routine to void.
How to reproduce: This example uses kernelcore, hugetlb and cma
as an easy way to reproduce. However, this is a more general cma
issue.
Two node x86 VM 16GB total, 8GB per node
Kernel command line parameters, kernelcore=4G hugetlb_cma=8G
Related boot time messages,
hugetlb_cma: reserve 8192 MiB, up to 4096 MiB per node
cma: Reserved 4096 MiB at 0x0000000100000000
hugetlb_cma: reserved 4096 MiB on node 0
cma: Reserved 4096 MiB at 0x0000000300000000
hugetlb_cma: reserved 4096 MiB on node 1
cma: CMA area hugetlb could not be activated
# echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
...
Call Trace:
bitmap_find_next_zero_area_off+0x51/0x90
cma_alloc+0x1a5/0x310
alloc_fresh_huge_page+0x78/0x1a0
alloc_pool_huge_page+0x6f/0xf0
set_max_huge_pages+0x10c/0x250
nr_hugepages_store_common+0x92/0x120
? __kmalloc+0x171/0x270
kernfs_fop_write+0xc1/0x1a0
vfs_write+0xc7/0x1f0
ksys_write+0x5f/0xe0
do_syscall_64+0x4d/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Link: http://lkml.kernel.org/r/20200730163123.6451-1-mike.kravetz@oracle.com
Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Barry Song <song.bao.hua(a)hisilicon.com>
Cc: Marek Szyprowski <m.szyprowski(a)samsung.com>
Cc: Michal Nazarewicz <mina86(a)mina86.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/cma.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--- a/mm/cma.c~cma-dont-quit-at-first-error-when-activating-reserved-areas
+++ a/mm/cma.c
@@ -93,17 +93,15 @@ static void cma_clear_bitmap(struct cma
mutex_unlock(&cma->lock);
}
-static int __init cma_activate_area(struct cma *cma)
+static void __init cma_activate_area(struct cma *cma)
{
unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
unsigned i = cma->count >> pageblock_order;
struct zone *zone;
cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL);
- if (!cma->bitmap) {
- cma->count = 0;
- return -ENOMEM;
- }
+ if (!cma->bitmap)
+ goto out_error;
WARN_ON_ONCE(!pfn_valid(pfn));
zone = page_zone(pfn_to_page(pfn));
@@ -133,25 +131,22 @@ static int __init cma_activate_area(stru
spin_lock_init(&cma->mem_head_lock);
#endif
- return 0;
+ return;
not_in_zone:
- pr_err("CMA area %s could not be activated\n", cma->name);
bitmap_free(cma->bitmap);
+out_error:
cma->count = 0;
- return -EINVAL;
+ pr_err("CMA area %s could not be activated\n", cma->name);
+ return;
}
static int __init cma_init_reserved_areas(void)
{
int i;
- for (i = 0; i < cma_area_count; i++) {
- int ret = cma_activate_area(&cma_areas[i]);
-
- if (ret)
- return ret;
- }
+ for (i = 0; i < cma_area_count; i++)
+ cma_activate_area(&cma_areas[i]);
return 0;
}
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
The patch titled
Subject: hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsem
has been removed from the -mm tree. Its filename was
hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsem
Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization") requires callers of huge_pte_alloc to hold i_mmap_rwsem
in at least read mode. This is because the explicit locking in
huge_pmd_share (called by huge_pte_alloc) was removed. When restructuring
the code, the call to huge_pte_alloc in the else block at the beginning of
hugetlb_fault was missed.
Unfortunately, that else clause is exercised when there is no page table
entry. This will likely lead to a call to huge_pmd_share. If
huge_pmd_share thinks pmd sharing is possible, it will traverse the
mapping tree (i_mmap) without holding i_mmap_rwsem. If someone else is
modifying the tree, bad things such as addressing exceptions or worse
could happen.
Simply remove the else clause. It should have been removed previously.
The code following the else will call huge_pte_alloc with the appropriate
locking.
To prevent this type of issue in the future, add routines to assert that
i_mmap_rwsem is held, and call these routines in huge pmd sharing
routines.
Link: http://lkml.kernel.org/r/e670f327-5cf9-1959-96e4-6dc7cc30d3d5@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Suggested-by: Matthew Wilcox <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A.Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/fs.h | 10 ++++++++++
include/linux/hugetlb.h | 8 +++++---
mm/hugetlb.c | 15 +++++++--------
mm/rmap.c | 2 +-
4 files changed, 23 insertions(+), 12 deletions(-)
--- a/include/linux/fs.h~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/include/linux/fs.h
@@ -518,6 +518,16 @@ static inline void i_mmap_unlock_read(st
up_read(&mapping->i_mmap_rwsem);
}
+static inline void i_mmap_assert_locked(struct address_space *mapping)
+{
+ lockdep_assert_held(&mapping->i_mmap_rwsem);
+}
+
+static inline void i_mmap_assert_write_locked(struct address_space *mapping)
+{
+ lockdep_assert_held_write(&mapping->i_mmap_rwsem);
+}
+
/*
* Might pages of this file be mapped into userspace?
*/
--- a/include/linux/hugetlb.h~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/include/linux/hugetlb.h
@@ -164,7 +164,8 @@ pte_t *huge_pte_alloc(struct mm_struct *
unsigned long addr, unsigned long sz);
pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz);
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep);
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep);
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end);
struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
@@ -203,8 +204,9 @@ static inline struct address_space *huge
return NULL;
}
-static inline int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr,
- pte_t *ptep)
+static inline int huge_pmd_unshare(struct mm_struct *mm,
+ struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
return 0;
}
--- a/mm/hugetlb.c~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/mm/hugetlb.c
@@ -3967,7 +3967,7 @@ void __unmap_hugepage_range(struct mmu_g
continue;
ptl = huge_pte_lock(h, mm, ptep);
- if (huge_pmd_unshare(mm, &address, ptep)) {
+ if (huge_pmd_unshare(mm, vma, &address, ptep)) {
spin_unlock(ptl);
/*
* We just unmapped a page of PMDs by clearing a PUD.
@@ -4554,10 +4554,6 @@ vm_fault_t hugetlb_fault(struct mm_struc
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
VM_FAULT_SET_HINDEX(hstate_index(h));
- } else {
- ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
- if (!ptep)
- return VM_FAULT_OOM;
}
/*
@@ -5034,7 +5030,7 @@ unsigned long hugetlb_change_protection(
if (!ptep)
continue;
ptl = huge_pte_lock(h, mm, ptep);
- if (huge_pmd_unshare(mm, &address, ptep)) {
+ if (huge_pmd_unshare(mm, vma, &address, ptep)) {
pages++;
spin_unlock(ptl);
shared_pmd = true;
@@ -5415,12 +5411,14 @@ out:
* returns: 1 successfully unmapped a shared pte page
* 0 the underlying pte page is not shared, or it is the last user
*/
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
pgd_t *pgd = pgd_offset(mm, *addr);
p4d_t *p4d = p4d_offset(pgd, *addr);
pud_t *pud = pud_offset(p4d, *addr);
+ i_mmap_assert_write_locked(vma->vm_file->f_mapping);
BUG_ON(page_count(virt_to_page(ptep)) == 0);
if (page_count(virt_to_page(ptep)) == 1)
return 0;
@@ -5438,7 +5436,8 @@ pte_t *huge_pmd_share(struct mm_struct *
return NULL;
}
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
return 0;
}
--- a/mm/rmap.c~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/mm/rmap.c
@@ -1469,7 +1469,7 @@ static bool try_to_unmap_one(struct page
* do this outside rmap routines.
*/
VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
- if (huge_pmd_unshare(mm, &address, pvmw.pte)) {
+ if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
/*
* huge_pmd_unshare unmapped an entire PMD
* page. There is no way of knowing exactly
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
commit 7ef282e05132d56b6f6b71e3873f317664bea78b upstream
If a process has the trace_pipe open on a trace_array, the current tracer
for that trace array should not be changed. This was original enforced by a
global lock, but when instances were introduced, it was moved to the
current_trace. But this structure is shared by all instances, and a
trace_pipe is for a single instance. There's no reason that a process that
has trace_pipe open on one instance should prevent another instance from
changing its current tracer. Move the reference counter to the trace_array
instead.
This is marked as "Fixes" but is more of a cleanup than a true fix.
Backport if you want, but it's not critical.
Fixes: cf6ab6d9143b1 ("tracing: Add ref count to tracer for when they are being read by pipe")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
This addresses an issue we've seen with users trying to change
current_tracer when they happen to have rasdaemon installed.
rasdaemon uses the trace_pipe interface at runtime, which therefore
blocks changing the current tracer. But of course, unless
you know about rasdaemon internals, it isn't exactly an obvious
failure mode.
kernel/trace/trace.c | 12 ++++++------
kernel/trace/trace.h | 2 +-
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index bb62269724d5..6fc6da55b94e 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5887,7 +5887,7 @@ int tracing_set_tracer(struct trace_array *tr, const char *buf)
}
/* If trace pipe files are being read, we can't change the tracer */
- if (tr->current_trace->ref) {
+ if (tr->trace_ref) {
ret = -EBUSY;
goto out;
}
@@ -6103,7 +6103,7 @@ static int tracing_open_pipe(struct inode *inode, struct file *filp)
nonseekable_open(inode, filp);
- tr->current_trace->ref++;
+ tr->trace_ref++;
out:
mutex_unlock(&trace_types_lock);
return ret;
@@ -6122,7 +6122,7 @@ static int tracing_release_pipe(struct inode *inode, struct file *file)
mutex_lock(&trace_types_lock);
- tr->current_trace->ref--;
+ tr->trace_ref--;
if (iter->trace->pipe_close)
iter->trace->pipe_close(iter);
@@ -7424,7 +7424,7 @@ static int tracing_buffers_open(struct inode *inode, struct file *filp)
filp->private_data = info;
- tr->current_trace->ref++;
+ tr->trace_ref++;
mutex_unlock(&trace_types_lock);
@@ -7525,7 +7525,7 @@ static int tracing_buffers_release(struct inode *inode, struct file *file)
mutex_lock(&trace_types_lock);
- iter->tr->current_trace->ref--;
+ iter->tr->trace_ref--;
__trace_array_put(iter->tr);
@@ -8733,7 +8733,7 @@ static int __remove_instance(struct trace_array *tr)
int i;
/* Reference counter for a newly created trace array = 1. */
- if (tr->ref > 1 || (tr->current_trace && tr->current_trace->ref))
+ if (tr->ref > 1 || (tr->current_trace && tr->trace_ref))
return -EBUSY;
list_del(&tr->list);
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 13db4000af3f..f21607f87189 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -356,6 +356,7 @@ struct trace_array {
struct trace_event_file *trace_marker_file;
cpumask_var_t tracing_cpumask; /* only trace on set CPUs */
int ref;
+ int trace_ref;
#ifdef CONFIG_FUNCTION_TRACER
struct ftrace_ops *ops;
struct trace_pid_list __rcu *function_pids;
@@ -547,7 +548,6 @@ struct tracer {
struct tracer *next;
struct tracer_flags *flags;
int enabled;
- int ref;
bool print_max;
bool allow_instances;
#ifdef CONFIG_TRACER_MAX_TRACE
--
2.28.0
Before commit 9495b7e92f716ab2 ("driver core: platform: Initialize
dma_parms for platform devices"), the R-Car SATA device didn't have DMA
parameters. Hence the DMA boundary mask supplied by its driver was
silently ignored, as __scsi_init_queue() doesn't check the return value
of dma_set_seg_boundary(), and the default value of 0xffffffff was used.
Now the device has gained DMA parameters, the driver-supplied value is
used, and the following warning is printed on Salvator-XS:
DMA-API: sata_rcar ee300000.sata: mapping sg segment across boundary [start=0x00000000ffffe000] [end=0x00000000ffffefff] [boundary=0x000000001ffffffe]
WARNING: CPU: 5 PID: 38 at kernel/dma/debug.c:1233 debug_dma_map_sg+0x298/0x300
(the range of start/end values depend on whether IOMMU support is
enabled or not)
The issue here is that SATA_RCAR_DMA_BOUNDARY doesn't have bit 0 set, so
any typical end value, which is odd, will trigger the check.
Fix this by increasing the DMA boundary value by 1.
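For reference, the segment check that fires here boils down to something
like the following (a simplified sketch of the DMA-API debug logic, not a
verbatim copy of kernel/dma/debug.c):

	/*
	 * A segment must not cross a (boundary + 1) aligned border, i.e.
	 * start and end must agree in every bit that is clear in the mask.
	 */
	static bool crosses_boundary(u64 start, u64 len, u64 boundary)
	{
		u64 end = start + len - 1;

		return ((start ^ end) & ~boundary) != 0;
	}

With the old mask 0x1FFFFFFE, ~boundary has bit 0 set, so any segment with
an even start and an odd end (i.e. practically every segment) is flagged.
With 0x1FFFFFFF only genuine 512 MiB border crossings are reported.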
Fixes: 8bfbeed58665dbbf ("sata_rcar: correct 'sata_rcar_sht'")
Fixes: 9495b7e92f716ab2 ("driver core: platform: Initialize dma_parms for platform devices")
Signed-off-by: Geert Uytterhoeven <geert+renesas(a)glider.be>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov(a)cogentembedded.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj(a)bp.renesas.com>
Cc: stable <stable(a)vger.kernel.org>
---
v2:
- Add Reviewed-by, Tested-by, Cc.
This is a fix for a regression in v5.7-rc5 that fell through the cracks.
https://lore.kernel.org/linux-ide/20200513110426.22472-1-geert+renesas@glid…
As by default the DMA debug code prints the first error only, this issue
may be hidden on plain v5.7-rc5, where the FCP driver triggers a similar
warning. Merging commit dd844fb8e50b12e6 ("media: platform: fcp: Set
appropriate DMA parameters", in v5.8-rc1) from the media tree fixes the
FCP issue, and exposes the SATA issue.
I added the second fixes tag because that commit is already being
backported to stable kernels, and this patch thus needs backporting,
too.
---
drivers/ata/sata_rcar.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ata/sata_rcar.c b/drivers/ata/sata_rcar.c
index 141ac600b64c87ef..44b0ed8f6bb8a120 100644
--- a/drivers/ata/sata_rcar.c
+++ b/drivers/ata/sata_rcar.c
@@ -120,7 +120,7 @@
/* Descriptor table word 0 bit (when DTA32M = 1) */
#define SATA_RCAR_DTEND BIT(0)
-#define SATA_RCAR_DMA_BOUNDARY 0x1FFFFFFEUL
+#define SATA_RCAR_DMA_BOUNDARY 0x1FFFFFFFUL
/* Gen2 Physical Layer Control Registers */
#define RCAR_GEN2_PHY_CTL1_REG 0x1704
--
2.17.1
Hi,
On Mon, Mar 9, 2020 at 2:11 PM Stephen Boyd <sboyd(a)kernel.org> wrote:
>
> Quoting Mike Tipton (2020-02-14 18:12:32)
> > The current implementation always uses rpmh_write_async, which doesn't
> > wait for completion. That's fine for disable requests since there's no
> > immediate need for the clocks and they can be disabled in the
> > background. However, for enable requests we need to ensure the clocks
> > are actually enabled before returning to the client. Otherwise, clients
> > can end up accessing their HW before the necessary clocks are enabled,
> > which can lead to bus errors.
> >
> > Use the synchronous version of this API (rpmh_write) for enable requests
> > in the active set to ensure completion.
> >
> > Completion isn't required for sleep/wake sets, since they don't take
> > effect until after we enter sleep. All rpmh requests are automatically
> > flushed prior to entering sleep.
> >
> > Fixes: 9c7e47025a6b ("clk: qcom: clk-rpmh: Add QCOM RPMh clock driver")
> > Signed-off-by: Mike Tipton <mdtipton(a)codeaurora.org>
> > ---
>
> Applied to clk-next but I squashed in some changes to make it easier for
> me to read.
This landed upstream as commit dad4e7fda4bd ("clk: qcom: clk-rpmh:
Wait for completion when enabling clocks") but seems to have missed
stable. Can stable pick it up? It has a Fixes tag, so presumably it
should be easy to track down where it needs to go.
Thanks!
-Doug
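For reference, the change being requested boils down to the pattern below.
This is a rough sketch only, assuming the mainline rpmh_write()/
rpmh_write_async() signatures from <soc/qcom/rpmh.h>; it is not the actual
clk-rpmh driver code and the helper name is hypothetical:

	#include <soc/qcom/rpmh.h>

	/* Block only when enabling a clock in the active set, since the
	 * consumer is about to touch its hardware; disables and sleep/wake
	 * set updates can stay asynchronous. */
	static int clk_rpmh_send_cmd(const struct device *dev,
				     enum rpmh_state state,
				     struct tcs_cmd *cmd, bool enable)
	{
		if (state == RPMH_ACTIVE_ONLY_STATE && enable)
			return rpmh_write(dev, state, cmd, 1);

		return rpmh_write_async(dev, state, cmd, 1);
	}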
Is it possible for 5bedd3afee8eb01ccd256f0cd2cc0fa6f841417a to be
backported to 5.4.x to fix c64141c68f725068440fbc13eb63dbb283e99168,
just as it was backported to 5.7 as
02963f5752032ab987fae7b450d5e1e357e7425b to fix
e2cb0c5635ecf7d8f2bde9971edbe00a0b8b8536?
> Sent: Wednesday, 12 August 2020 at 11:37
> From: "Wenbin Mei" <wenbin.mei(a)mediatek.com>
> Subject: [PATCH 3/3] mmc: mediatek: add optional module reset property
> This patch adds a optional reset management for msdc.
> Sometimes the bootloader does not bring msdc register
> to default state, so need reset the msdc controller.
>
> Signed-off-by: Wenbin Mei <wenbin.mei(a)mediatek.com>
Thanks for posting the fix to Mainline
Same as for 3/3, the dts patch is also needed to fix the eMMC issue on the R64.
Fixes: 966580ad236e ("mmc: mediatek: add support for MT7622 SoC")
Tested-By: Frank Wunderlich <frank-w(a)public-files.de>
and it needs to be fixed at least for 5.4+, so adding stable-CC
Cc: stable(a)vger.kernel.org
> Sent: Wednesday, 12 August 2020 at 11:37
> From: "Wenbin Mei" <wenbin.mei(a)mediatek.com>
> Subject: [PATCH 3/3] mmc: mediatek: add optional module reset property
> This patch adds a optional reset management for msdc.
> Sometimes the bootloader does not bring msdc register
> to default state, so need reset the msdc controller.
>
> Signed-off-by: Wenbin Mei <wenbin.mei(a)mediatek.com>
Thanks for posting the fix to Mainline
IMHO this should contain a Fixes: tag, as it fixes eMMC access on mt7622/Bpi-R64.
Before this change we got these errors when mounting the eMMC on the R64:
[ 48.664925] blk_update_request: I/O error, dev mmcblk0, sector 204800 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[ 48.676019] Buffer I/O error on dev mmcblk0p1, logical block 0, lost sync page write
Fixes: 966580ad236e ("mmc: mediatek: add support for MT7622 SoC")
Tested-By: Frank Wunderlich <frank-w(a)public-files.de>
and it needs to be fixed at least for 5.4+, so adding stable-CC
Cc: stable(a)vger.kernel.org
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: e27b1636e9337d1a1d174b191e53d0f86421a822
Gitweb: https://git.kernel.org/tip/e27b1636e9337d1a1d174b191e53d0f86421a822
Author: Guenter Roeck <linux(a)roeck-us.net>
AuthorDate: Tue, 11 Aug 2020 11:00:01 -07:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 12 Aug 2020 11:04:05 +02:00
genirq/PM: Always unlock IRQ descriptor in rearm_wake_irq()
rearm_wake_irq() does not unlock the irq descriptor if the interrupt
is not suspended or if wakeup is not enabled on it.
Restructure the exit conditions so the unlock is always ensured.
Fixes: 3a79bc63d9075 ("PCI: irq: Introduce rearm_wake_irq()")
Signed-off-by: Guenter Roeck <linux(a)roeck-us.net>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200811180001.80203-1-linux@roeck-us.net
---
kernel/irq/pm.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
index 8f557fa..c6c7e18 100644
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -185,14 +185,18 @@ void rearm_wake_irq(unsigned int irq)
unsigned long flags;
struct irq_desc *desc = irq_get_desc_buslock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
- if (!desc || !(desc->istate & IRQS_SUSPENDED) ||
- !irqd_is_wakeup_set(&desc->irq_data))
+ if (!desc)
return;
+ if (!(desc->istate & IRQS_SUSPENDED) ||
+ !irqd_is_wakeup_set(&desc->irq_data))
+ goto unlock;
+
desc->istate &= ~IRQS_SUSPENDED;
irqd_set(&desc->irq_data, IRQD_WAKEUP_ARMED);
__enable_irq(desc);
+unlock:
irq_put_desc_busunlock(desc, flags);
}
Commit 88b7381a939d ("USB: Select better matching USB drivers when
available") introduced the use of a "match" function to select a
non-generic/better driver for a particular USB device. This
unfortunately breaks the operation of usbip in general, as reported in
the kernel bugzilla with bug 208267 (linked below).
Upon inspecting the aforementioned commit, one can observe that the
original code in the usb_device_match function used to return 1
unconditionally, but the aforementioned commit makes the usb_device_match
function use identifier tables and "match" virtual functions, if either of
them are available.
Hence, this commit implements a match function for usbip that
unconditionally returns true to ensure that usbip is functional again.
This change has been verified to restore usbip functionality, with a
v5.7.y kernel on an up-to-date version of Qubes OS 4.0, which uses
usbip to redirect USB devices between VMs.
Thanks to Jonathan Dieter for the effort in bisecting this issue down
to the aforementioned commit.
Fixes: 88b7381a939d ("USB: Select better matching USB drivers when available")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=208267
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1856443
Link: https://github.com/QubesOS/qubes-issues/issues/5905
Signed-off-by: M. Vefa Bicakci <m.v.b(a)runbox.com>
Cc: <stable(a)vger.kernel.org> # 5.7
Cc: Valentina Manea <valentina.manea.m(a)gmail.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Bastien Nocera <hadess(a)hadess.net>
Cc: Alan Stern <stern(a)rowland.harvard.edu>
---
drivers/usb/usbip/stub_dev.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/usb/usbip/stub_dev.c b/drivers/usb/usbip/stub_dev.c
index 2305d425e6c9..9d7d642022d1 100644
--- a/drivers/usb/usbip/stub_dev.c
+++ b/drivers/usb/usbip/stub_dev.c
@@ -461,6 +461,11 @@ static void stub_disconnect(struct usb_device *udev)
return;
}
+static bool usbip_match(struct usb_device *udev)
+{
+ return true;
+}
+
#ifdef CONFIG_PM
/* These functions need usb_port_suspend and usb_port_resume,
@@ -486,6 +491,7 @@ struct usb_device_driver stub_driver = {
.name = "usbip-host",
.probe = stub_probe,
.disconnect = stub_disconnect,
+ .match = usbip_match,
#ifdef CONFIG_PM
.suspend = stub_suspend,
.resume = stub_resume,
--
2.26.2
Don't recheck the CAP_SYS_ADMIN capability here, since xattr_permission()
already checks it.
This follows 5d3ce4f70172 ("f2fs: avoid duplicated permission check for "trusted." xattrs").
Reported-by: Hongyu Jin <hongyu.jin(a)unisoc.com>
[ Gao Xiang: since this could also cause some complex Android overlay
  permission issues on android-5.4+, it'd be better to backport it to
  5.4+ rather than treat it as a pure cleanup on mainline. ]
Cc: <stable(a)vger.kernel.org> # 5.4+
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
---
related commit:
https://android-review.googlesource.com/c/kernel/common/+/1121623/6/fs/xatt…
fs/erofs/xattr.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/erofs/xattr.c b/fs/erofs/xattr.c
index 87e437e7b34f..f86e3247febc 100644
--- a/fs/erofs/xattr.c
+++ b/fs/erofs/xattr.c
@@ -473,8 +473,6 @@ static int erofs_xattr_generic_get(const struct xattr_handler *handler,
return -EOPNOTSUPP;
break;
case EROFS_XATTR_INDEX_TRUSTED:
- if (!capable(CAP_SYS_ADMIN))
- return -EPERM;
break;
case EROFS_XATTR_INDEX_SECURITY:
break;
--
2.18.1
From: Eric Biggers <ebiggers(a)google.com>
Subject: fs/minix: reject too-large maximum file size
If the minix filesystem tries to map a very large logical block number to
its on-disk location, block_to_path() can return offsets that are too
large, causing out-of-bounds memory accesses when accessing indirect index
blocks. This should be prevented by the check against the maximum file
size, but this doesn't work because the maximum file size is read directly
from the on-disk superblock and isn't validated itself.
Fix this by validating the maximum file size at mount time.
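For reference, with the standard 1 KiB Minix block size the V1 limit used
below works out to (7 + 512 + 512*512) * 1024 = 268,966,912 bytes, i.e.
just over 256 MiB: 7 direct zones, one indirect block of 512 entries and
one double-indirect block of 512 * 512 entries.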
Link: http://lkml.kernel.org/r/20200628060846.682158-4-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+c7d9ec7a1a7272dd71b3(a)syzkaller.appspotmail.com
Reported-by: syzbot+3b7b03a0c28948054fb5(a)syzkaller.appspotmail.com
Reported-by: syzbot+6e056ee473568865f3e6(a)syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Qiujun Huang <anenbupt(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/minix/inode.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
--- a/fs/minix/inode.c~fs-minix-reject-too-large-maximum-file-size
+++ a/fs/minix/inode.c
@@ -150,6 +150,23 @@ static int minix_remount (struct super_b
return 0;
}
+static bool minix_check_superblock(struct minix_sb_info *sbi)
+{
+ if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
+ return false;
+
+ /*
+ * s_max_size must not exceed the block mapping limitation. This check
+ * is only needed for V1 filesystems, since V2/V3 support an extra level
+ * of indirect blocks which places the limit well above U32_MAX.
+ */
+ if (sbi->s_version == MINIX_V1 &&
+ sbi->s_max_size > (7 + 512 + 512*512) * BLOCK_SIZE)
+ return false;
+
+ return true;
+}
+
static int minix_fill_super(struct super_block *s, void *data, int silent)
{
struct buffer_head *bh;
@@ -228,11 +245,12 @@ static int minix_fill_super(struct super
} else
goto out_no_fs;
+ if (!minix_check_superblock(sbi))
+ goto out_illegal_sb;
+
/*
* Allocate the buffer map to keep the superblock small.
*/
- if (sbi->s_imap_blocks == 0 || sbi->s_zmap_blocks == 0)
- goto out_illegal_sb;
i = (sbi->s_imap_blocks + sbi->s_zmap_blocks) * sizeof(bh);
map = kzalloc(i, GFP_KERNEL);
if (!map)
_
From: Eric Biggers <ebiggers(a)google.com>
Subject: fs/minix: check return value of sb_getblk()
Patch series "fs/minix: fix syzbot bugs and set s_maxbytes".
This series fixes all syzbot bugs in the minix filesystem:
KASAN: null-ptr-deref Write in get_block
KASAN: use-after-free Write in get_block
KASAN: use-after-free Read in get_block
WARNING in inc_nlink
KMSAN: uninit-value in get_block
WARNING in drop_nlink
It also fixes the minix filesystem to set s_maxbytes correctly, so that
userspace sees the correct behavior when exceeding the max file size.
This patch (of 6):
sb_getblk() can fail, so check its return value.
This fixes a NULL pointer dereference.
Originally from Qiujun Huang.
Link: http://lkml.kernel.org/r/20200628060846.682158-1-ebiggers@kernel.org
Link: http://lkml.kernel.org/r/20200628060846.682158-2-ebiggers@kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Reported-by: syzbot+4a88b2b9dc280f47baf4(a)syzkaller.appspotmail.com
Cc: Qiujun Huang <anenbupt(a)gmail.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/minix/itree_common.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/minix/itree_common.c~fs-minix-check-return-value-of-sb_getblk
+++ a/fs/minix/itree_common.c
@@ -75,6 +75,7 @@ static int alloc_branch(struct inode *in
int n = 0;
int i;
int parent = minix_new_block(inode);
+ int err = -ENOSPC;
branch[0].key = cpu_to_block(parent);
if (parent) for (n = 1; n < num; n++) {
@@ -85,6 +86,11 @@ static int alloc_branch(struct inode *in
break;
branch[n].key = cpu_to_block(nr);
bh = sb_getblk(inode->i_sb, parent);
+ if (!bh) {
+ minix_free_block(inode, nr);
+ err = -ENOMEM;
+ break;
+ }
lock_buffer(bh);
memset(bh->b_data, 0, bh->b_size);
branch[n].bh = bh;
@@ -103,7 +109,7 @@ static int alloc_branch(struct inode *in
bforget(branch[i].bh);
for (i = 0; i < n; i++)
minix_free_block(inode, block_to_cpu(branch[i].key));
- return -ENOSPC;
+ return err;
}
static inline int splice_branch(struct inode *inode,
_
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: cma: don't quit at first error when activating reserved areas
The routine cma_init_reserved_areas is designed to activate all
reserved cma areas. It quits when it first encounters an error.
This can leave some areas in a state where they are reserved but
not activated. There is no feedback to code which performed the
reservation. Attempting to allocate memory from areas in such a
state will result in a BUG.
Modify cma_init_reserved_areas to always attempt to activate all
areas. The called routine, cma_activate_area is responsible for
leaving the area in a valid state. No one is making active use
of returned error codes, so change the routine to void.
How to reproduce: This example uses kernelcore, hugetlb and cma
as an easy way to reproduce. However, this is a more general cma
issue.
Two node x86 VM 16GB total, 8GB per node
Kernel command line parameters, kernelcore=4G hugetlb_cma=8G
Related boot time messages,
hugetlb_cma: reserve 8192 MiB, up to 4096 MiB per node
cma: Reserved 4096 MiB at 0x0000000100000000
hugetlb_cma: reserved 4096 MiB on node 0
cma: Reserved 4096 MiB at 0x0000000300000000
hugetlb_cma: reserved 4096 MiB on node 1
cma: CMA area hugetlb could not be activated
# echo 8 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
...
Call Trace:
bitmap_find_next_zero_area_off+0x51/0x90
cma_alloc+0x1a5/0x310
alloc_fresh_huge_page+0x78/0x1a0
alloc_pool_huge_page+0x6f/0xf0
set_max_huge_pages+0x10c/0x250
nr_hugepages_store_common+0x92/0x120
? __kmalloc+0x171/0x270
kernfs_fop_write+0xc1/0x1a0
vfs_write+0xc7/0x1f0
ksys_write+0x5f/0xe0
do_syscall_64+0x4d/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Link: http://lkml.kernel.org/r/20200730163123.6451-1-mike.kravetz@oracle.com
Fixes: c64be2bb1c6e ("drivers: add Contiguous Memory Allocator")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Roman Gushchin <guro(a)fb.com>
Acked-by: Barry Song <song.bao.hua(a)hisilicon.com>
Cc: Marek Szyprowski <m.szyprowski(a)samsung.com>
Cc: Michal Nazarewicz <mina86(a)mina86.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/cma.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--- a/mm/cma.c~cma-dont-quit-at-first-error-when-activating-reserved-areas
+++ a/mm/cma.c
@@ -93,17 +93,15 @@ static void cma_clear_bitmap(struct cma
mutex_unlock(&cma->lock);
}
-static int __init cma_activate_area(struct cma *cma)
+static void __init cma_activate_area(struct cma *cma)
{
unsigned long base_pfn = cma->base_pfn, pfn = base_pfn;
unsigned i = cma->count >> pageblock_order;
struct zone *zone;
cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL);
- if (!cma->bitmap) {
- cma->count = 0;
- return -ENOMEM;
- }
+ if (!cma->bitmap)
+ goto out_error;
WARN_ON_ONCE(!pfn_valid(pfn));
zone = page_zone(pfn_to_page(pfn));
@@ -133,25 +131,22 @@ static int __init cma_activate_area(stru
spin_lock_init(&cma->mem_head_lock);
#endif
- return 0;
+ return;
not_in_zone:
- pr_err("CMA area %s could not be activated\n", cma->name);
bitmap_free(cma->bitmap);
+out_error:
cma->count = 0;
- return -EINVAL;
+ pr_err("CMA area %s could not be activated\n", cma->name);
+ return;
}
static int __init cma_init_reserved_areas(void)
{
int i;
- for (i = 0; i < cma_area_count; i++) {
- int ret = cma_activate_area(&cma_areas[i]);
-
- if (ret)
- return ret;
- }
+ for (i = 0; i < cma_area_count; i++)
+ cma_activate_area(&cma_areas[i]);
return 0;
}
_
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: remove call to huge_pte_alloc without i_mmap_rwsem
Commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing
synchronization") requires callers of huge_pte_alloc to hold i_mmap_rwsem
in at least read mode. This is because the explicit locking in
huge_pmd_share (called by huge_pte_alloc) was removed. When restructuring
the code, the call to huge_pte_alloc in the else block at the beginning of
hugetlb_fault was missed.
Unfortunately, that else clause is exercised when there is no page table
entry. This will likely lead to a call to huge_pmd_share. If
huge_pmd_share thinks pmd sharing is possible, it will traverse the
mapping tree (i_mmap) without holding i_mmap_rwsem. If someone else is
modifying the tree, bad things such as addressing exceptions or worse
could happen.
Simply remove the else clause. It should have been removed previously.
The code following the else will call huge_pte_alloc with the appropriate
locking.
To prevent this type of issue in the future, add routines to assert that
i_mmap_rwsem is held, and call these routines in huge pmd sharing
routines.
Link: http://lkml.kernel.org/r/e670f327-5cf9-1959-96e4-6dc7cc30d3d5@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Suggested-by: Matthew Wilcox <willy(a)infradead.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A.Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Prakash Sangappa <prakash.sangappa(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/fs.h | 10 ++++++++++
include/linux/hugetlb.h | 8 +++++---
mm/hugetlb.c | 15 +++++++--------
mm/rmap.c | 2 +-
4 files changed, 23 insertions(+), 12 deletions(-)
--- a/include/linux/fs.h~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/include/linux/fs.h
@@ -518,6 +518,16 @@ static inline void i_mmap_unlock_read(st
up_read(&mapping->i_mmap_rwsem);
}
+static inline void i_mmap_assert_locked(struct address_space *mapping)
+{
+ lockdep_assert_held(&mapping->i_mmap_rwsem);
+}
+
+static inline void i_mmap_assert_write_locked(struct address_space *mapping)
+{
+ lockdep_assert_held_write(&mapping->i_mmap_rwsem);
+}
+
/*
* Might pages of this file be mapped into userspace?
*/
--- a/include/linux/hugetlb.h~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/include/linux/hugetlb.h
@@ -164,7 +164,8 @@ pte_t *huge_pte_alloc(struct mm_struct *
unsigned long addr, unsigned long sz);
pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz);
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep);
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep);
void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
unsigned long *start, unsigned long *end);
struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
@@ -203,8 +204,9 @@ static inline struct address_space *huge
return NULL;
}
-static inline int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr,
- pte_t *ptep)
+static inline int huge_pmd_unshare(struct mm_struct *mm,
+ struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
return 0;
}
--- a/mm/hugetlb.c~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/mm/hugetlb.c
@@ -3967,7 +3967,7 @@ void __unmap_hugepage_range(struct mmu_g
continue;
ptl = huge_pte_lock(h, mm, ptep);
- if (huge_pmd_unshare(mm, &address, ptep)) {
+ if (huge_pmd_unshare(mm, vma, &address, ptep)) {
spin_unlock(ptl);
/*
* We just unmapped a page of PMDs by clearing a PUD.
@@ -4554,10 +4554,6 @@ vm_fault_t hugetlb_fault(struct mm_struc
} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
return VM_FAULT_HWPOISON_LARGE |
VM_FAULT_SET_HINDEX(hstate_index(h));
- } else {
- ptep = huge_pte_alloc(mm, haddr, huge_page_size(h));
- if (!ptep)
- return VM_FAULT_OOM;
}
/*
@@ -5034,7 +5030,7 @@ unsigned long hugetlb_change_protection(
if (!ptep)
continue;
ptl = huge_pte_lock(h, mm, ptep);
- if (huge_pmd_unshare(mm, &address, ptep)) {
+ if (huge_pmd_unshare(mm, vma, &address, ptep)) {
pages++;
spin_unlock(ptl);
shared_pmd = true;
@@ -5415,12 +5411,14 @@ out:
* returns: 1 successfully unmapped a shared pte page
* 0 the underlying pte page is not shared, or it is the last user
*/
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
pgd_t *pgd = pgd_offset(mm, *addr);
p4d_t *p4d = p4d_offset(pgd, *addr);
pud_t *pud = pud_offset(p4d, *addr);
+ i_mmap_assert_write_locked(vma->vm_file->f_mapping);
BUG_ON(page_count(virt_to_page(ptep)) == 0);
if (page_count(virt_to_page(ptep)) == 1)
return 0;
@@ -5438,7 +5436,8 @@ pte_t *huge_pmd_share(struct mm_struct *
return NULL;
}
-int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
+int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long *addr, pte_t *ptep)
{
return 0;
}
--- a/mm/rmap.c~hugetlbfs-remove-call-to-huge_pte_alloc-without-i_mmap_rwsem
+++ a/mm/rmap.c
@@ -1469,7 +1469,7 @@ static bool try_to_unmap_one(struct page
* do this outside rmap routines.
*/
VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
- if (huge_pmd_unshare(mm, &address, pvmw.pte)) {
+ if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
/*
* huge_pmd_unshare unmapped an entire PMD
* page. There is no way of knowing exactly
_
OK, some patches in the series add buggy code which is then fixed by
follow-up patches, but none of the bugs fixed are severe regressions on
common configs (e.g. compiler warnings, lockdep/rt errors, or bugs in
new drivers). So I thought it's more important to preserve the credit
for the fixes.
I had to pull 5 patches from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5-next
to get the mlx5 things to work; this seems to be how the Mellanox developers
usually manage things, and they told me they are OK with it.
The following changes since commit bcf876870b95592b52519ed4aafcf9d95999bc9c:
Linux 5.8 (2020-08-02 14:21:45 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
for you to fetch changes up to 8a7c3213db068135e816a6a517157de6443290d6:
vdpa/mlx5: fix up endian-ness for mtu (2020-08-10 10:38:55 -0400)
----------------------------------------------------------------
virtio: fixes, features
IRQ bypass support for vdpa and IFC
MLX5 vdpa driver
Endian-ness fixes for virtio drivers
Misc other fixes
Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>
----------------------------------------------------------------
Alex Dewar (1):
vdpa/mlx5: Fix uninitialised variable in core/mr.c
Colin Ian King (1):
vdpa/mlx5: fix memory allocation failure checks
Dan Carpenter (2):
vdpa/mlx5: Fix pointer math in mlx5_vdpa_get_config()
vdpa: Fix pointer math bug in vdpasim_get_config()
Eli Cohen (9):
net/mlx5: Support setting access rights of dma addresses
net/mlx5: Add VDPA interface type to supported enumerations
net/mlx5: Add interface changes required for VDPA
net/vdpa: Use struct for set/get vq state
vdpa: Modify get_vq_state() to return error code
vdpa/mlx5: Add hardware descriptive header file
vdpa/mlx5: Add support library for mlx5 VDPA implementation
vdpa/mlx5: Add shared memory registration code
vdpa/mlx5: Add VDPA driver for supported mlx5 devices
Gustavo A. R. Silva (1):
vhost: Use flex_array_size() helper in copy_from_user()
Jason Wang (6):
vhost: vdpa: remove per device feature whitelist
vhost-vdpa: refine ioctl pre-processing
vhost: generialize backend features setting/getting
vhost-vdpa: support get/set backend features
vhost-vdpa: support IOTLB batching hints
vdpasim: support batch updating
Liao Pingfang (1):
virtio_pci_modern: Fix the comment of virtio_pci_find_capability()
Mao Wenan (1):
virtio_ring: Avoid loop when vq is broken in virtqueue_poll
Maor Gottlieb (2):
net/mlx5: Export resource dump interface
net/mlx5: Add support in query QP, CQ and MKEY segments
Max Gurtovoy (2):
vdpasim: protect concurrent access to iommu iotlb
vdpa: remove hard coded virtq num
Meir Lichtinger (1):
RDMA/mlx5: ConnectX-7 new capabilities to set relaxed ordering by UMR
Michael Guralnik (2):
net/mlx5: Enable QP number request when creating IPoIB underlay QP
net/mlx5: Enable count action for rules with allow action
Michael S. Tsirkin (44):
virtio: VIRTIO_F_IOMMU_PLATFORM -> VIRTIO_F_ACCESS_PLATFORM
virtio: virtio_has_iommu_quirk -> virtio_has_dma_quirk
virtio_balloon: fix sparse warning
virtio_ring: sparse warning fixup
virtio: allow __virtioXX, __leXX in config space
virtio_9p: correct tags for config space fields
virtio_balloon: correct tags for config space fields
virtio_blk: correct tags for config space fields
virtio_console: correct tags for config space fields
virtio_crypto: correct tags for config space fields
virtio_fs: correct tags for config space fields
virtio_gpu: correct tags for config space fields
virtio_input: correct tags for config space fields
virtio_iommu: correct tags for config space fields
virtio_mem: correct tags for config space fields
virtio_net: correct tags for config space fields
virtio_pmem: correct tags for config space fields
virtio_scsi: correct tags for config space fields
virtio_config: disallow native type fields
mlxbf-tmfifo: sparse tags for config access
vdpa: make sure set_features is invoked for legacy
vhost/vdpa: switch to new helpers
virtio_vdpa: legacy features handling
vdpa_sim: fix endian-ness of config space
virtio_config: cread/write cleanup
virtio_config: rewrite using _Generic
virtio_config: disallow native type fields (again)
virtio_config: LE config space accessors
virtio_caif: correct tags for config space fields
virtio_config: add virtio_cread_le_feature
virtio_balloon: use LE config space accesses
virtio_input: convert to LE accessors
virtio_fs: convert to LE accessors
virtio_crypto: convert to LE accessors
virtio_pmem: convert to LE accessors
drm/virtio: convert to LE accessors
virtio_mem: convert to LE accessors
virtio-iommu: convert to LE accessors
virtio_config: drop LE option from config space
virtio_net: use LE accessors for speed/duplex
Merge branch 'mlx5-next' of git://git.kernel.org/.../mellanox/linux into HEAD
virtio_config: fix up warnings on parisc
vdpa_sim: init iommu lock
vdpa/mlx5: fix up endian-ness for mtu
Parav Pandit (2):
net/mlx5: Avoid RDMA file inclusion in core driver
net/mlx5: Avoid eswitch header inclusion in fs core layer
Tariq Toukan (1):
net/mlx5: kTLS, Improve TLS params layout structures
Zhu Lingshan (7):
vhost: introduce vhost_vring_call
kvm: detect assigned device via irqbypass manager
vDPA: add get_vq_irq() in vdpa_config_ops
vhost_vdpa: implement IRQ offloading in vhost_vdpa
ifcvf: implement vdpa_config_ops.get_vq_irq()
irqbypass: do not start cons/prod when failed connect
vDPA: dont change vq irq after DRIVER_OK
arch/um/drivers/virtio_uml.c | 2 +-
arch/x86/kvm/x86.c | 12 +-
drivers/crypto/virtio/virtio_crypto_core.c | 46 +-
drivers/gpu/drm/virtio/virtgpu_kms.c | 16 +-
drivers/gpu/drm/virtio/virtgpu_object.c | 2 +-
drivers/gpu/drm/virtio/virtgpu_vq.c | 4 +-
drivers/iommu/virtio-iommu.c | 34 +-
drivers/net/ethernet/mellanox/mlx5/core/alloc.c | 11 +-
.../ethernet/mellanox/mlx5/core/diag/rsc_dump.c | 6 +
.../ethernet/mellanox/mlx5/core/diag/rsc_dump.h | 33 +-
drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h | 2 +-
.../ethernet/mellanox/mlx5/core/en_accel/ktls.h | 2 +-
.../ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c | 14 +-
.../mellanox/mlx5/core/en_accel/tls_rxtx.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 10 -
drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/fs_core.h | 10 +
.../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 7 +
drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 +
drivers/net/virtio_net.c | 9 +-
drivers/nvdimm/virtio_pmem.c | 4 +-
drivers/platform/mellanox/mlxbf-tmfifo.c | 13 +-
drivers/scsi/virtio_scsi.c | 4 +-
drivers/vdpa/Kconfig | 19 +
drivers/vdpa/Makefile | 1 +
drivers/vdpa/ifcvf/ifcvf_base.c | 4 +-
drivers/vdpa/ifcvf/ifcvf_base.h | 6 +-
drivers/vdpa/ifcvf/ifcvf_main.c | 31 +-
drivers/vdpa/mlx5/Makefile | 4 +
drivers/vdpa/mlx5/core/mlx5_vdpa.h | 91 +
drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h | 168 ++
drivers/vdpa/mlx5/core/mr.c | 486 +++++
drivers/vdpa/mlx5/core/resources.c | 284 +++
drivers/vdpa/mlx5/net/main.c | 76 +
drivers/vdpa/mlx5/net/mlx5_vnet.c | 1974 ++++++++++++++++++++
drivers/vdpa/mlx5/net/mlx5_vnet.h | 24 +
drivers/vdpa/vdpa.c | 4 +
drivers/vdpa/vdpa_sim/vdpa_sim.c | 124 +-
drivers/vhost/Kconfig | 1 +
drivers/vhost/net.c | 22 +-
drivers/vhost/vdpa.c | 183 +-
drivers/vhost/vhost.c | 39 +-
drivers/vhost/vhost.h | 11 +-
drivers/virtio/virtio_balloon.c | 30 +-
drivers/virtio/virtio_input.c | 32 +-
drivers/virtio/virtio_mem.c | 30 +-
drivers/virtio/virtio_pci_modern.c | 1 +
drivers/virtio/virtio_ring.c | 7 +-
drivers/virtio/virtio_vdpa.c | 9 +-
fs/fuse/virtio_fs.c | 4 +-
include/linux/mlx5/cq.h | 1 -
include/linux/mlx5/device.h | 13 +-
include/linux/mlx5/driver.h | 2 +
include/linux/mlx5/mlx5_ifc.h | 134 +-
include/linux/mlx5/qp.h | 2 +-
include/linux/mlx5/rsc_dump.h | 51 +
include/linux/vdpa.h | 66 +-
include/linux/virtio_caif.h | 6 +-
include/linux/virtio_config.h | 191 +-
include/linux/virtio_ring.h | 19 +-
include/uapi/linux/vhost.h | 2 +
include/uapi/linux/vhost_types.h | 11 +
include/uapi/linux/virtio_9p.h | 4 +-
include/uapi/linux/virtio_balloon.h | 10 +-
include/uapi/linux/virtio_blk.h | 26 +-
include/uapi/linux/virtio_config.h | 10 +-
include/uapi/linux/virtio_console.h | 8 +-
include/uapi/linux/virtio_crypto.h | 26 +-
include/uapi/linux/virtio_fs.h | 2 +-
include/uapi/linux/virtio_gpu.h | 8 +-
include/uapi/linux/virtio_input.h | 18 +-
include/uapi/linux/virtio_iommu.h | 12 +-
include/uapi/linux/virtio_mem.h | 14 +-
include/uapi/linux/virtio_net.h | 8 +-
include/uapi/linux/virtio_pmem.h | 4 +-
include/uapi/linux/virtio_scsi.h | 20 +-
tools/virtio/linux/virtio_config.h | 6 +-
virt/lib/irqbypass.c | 16 +-
78 files changed, 4116 insertions(+), 487 deletions(-)
create mode 100644 drivers/vdpa/mlx5/Makefile
create mode 100644 drivers/vdpa/mlx5/core/mlx5_vdpa.h
create mode 100644 drivers/vdpa/mlx5/core/mlx5_vdpa_ifc.h
create mode 100644 drivers/vdpa/mlx5/core/mr.c
create mode 100644 drivers/vdpa/mlx5/core/resources.c
create mode 100644 drivers/vdpa/mlx5/net/main.c
create mode 100644 drivers/vdpa/mlx5/net/mlx5_vnet.c
create mode 100644 drivers/vdpa/mlx5/net/mlx5_vnet.h
create mode 100644 include/linux/mlx5/rsc_dump.h
The patch titled
Subject: Revert "mm/vmstat.c: do not show lowmem reserve protection information of empty zone"
has been added to the -mm tree. Its filename is
revert-mm-vmstatc-do-not-show-lowmem-reserve-protection-information-of-empty-zone.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-vmstatc-do-not-show-lowm…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/revert-mm-vmstatc-do-not-show-lowm…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Baoquan He <bhe(a)redhat.com>
Subject: Revert "mm/vmstat.c: do not show lowmem reserve protection information of empty zone"
This reverts commit 26e7deadaae175.
Sonny reported that one of their tests started failing on the latest
kernel on their Chrome OS platform. The root cause is that the above
commit removed the protection line for empty zones, while the parser used in
the test relies on that line to mark the end of each zone.
Let's revert it to avoid breaking userspace testing or applications.
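For context, the parser keys off the per-zone "protection:" line in
/proc/zoneinfo, whose layout looks roughly like this (values are
illustrative, not taken from the report):

	Node 0, zone      DMA
	  pages free     3968
	        min      67
	        low      83
	        high     99
	        spanned  4095
	        present  3997
	        managed  3975
	        protection: (0, 1725, 7557, 7557)
	Node 0, zone    DMA32
	  ...

With the reverted commit applied, an unpopulated zone's block ends right
after the "managed" line, so a parser that treats "protection:" as the
zone terminator gets out of sync.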
Link: http://lkml.kernel.org/r/20200811075412.12872-1-bhe@redhat.com
Fixes: 26e7deadaae175 ("mm/vmstat.c: do not show lowmem reserve protection information of empty zone")
Signed-off-by: Baoquan He <bhe(a)redhat.com>
Reported-by: Sonny Rao <sonnyrao(a)chromium.org>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Cc: <stable(a)vger.kernel.org> [5.8.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmstat.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/mm/vmstat.c~revert-mm-vmstatc-do-not-show-lowmem-reserve-protection-information-of-empty-zone
+++ a/mm/vmstat.c
@@ -1618,12 +1618,6 @@ static void zoneinfo_show_print(struct s
zone->present_pages,
zone_managed_pages(zone));
- /* If unpopulated, no other information is useful */
- if (!populated_zone(zone)) {
- seq_putc(m, '\n');
- return;
- }
-
seq_printf(m,
"\n protection: (%ld",
zone->lowmem_reserve[0]);
@@ -1631,6 +1625,12 @@ static void zoneinfo_show_print(struct s
seq_printf(m, ", %ld", zone->lowmem_reserve[i]);
seq_putc(m, ')');
+ /* If unpopulated, no other information is useful */
+ if (!populated_zone(zone)) {
+ seq_putc(m, '\n');
+ return;
+ }
+
for (i = 0; i < NR_VM_ZONE_STAT_ITEMS; i++)
seq_printf(m, "\n %-12s %lu", zone_stat_name(i),
zone_page_state(zone, i));
_
Patches currently in -mm which might be from bhe(a)redhat.com are
revert-mm-vmstatc-do-not-show-lowmem-reserve-protection-information-of-empty-zone.patch
The patch titled
Subject: mm: slub: fix conversion of freelist_corrupted()
has been added to the -mm tree. Its filename is
mm-slub-fix-conversion-of-freelist_corrupted.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-slub-fix-conversion-of-freelist…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-slub-fix-conversion-of-freelist…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Eugeniu Rosca <erosca(a)de.adit-jv.com>
Subject: mm: slub: fix conversion of freelist_corrupted()
Commit 52f23478081ae0 ("mm/slub.c: fix corrupted freechain in
deactivate_slab()") suffered an update when picked up from LKML [1].
Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()'
turned the assignment into a no-op: the helper receives 'freelist' by
value, so clearing it there never reaches the caller in deactivate_slab().
Fix it by sticking to the behavior intended in the original patch [1],
preferring the solution with the lowest line count.
[1] https://lore.kernel.org/linux-mm/20200331031450.12182-1-dongli.zhang@oracle…
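To illustrate the pass-by-value pitfall in plain C (a generic example, not
the slub code itself):

	static bool helper(void *list)
	{
		list = NULL;	/* clears only the local copy */
		return true;
	}

	void caller(void)
	{
		void *freelist = (void *)0x1234;

		if (helper(freelist)) {
			/* freelist is still 0x1234 here; the caller has to
			 * clear it itself, which is what the fix does. */
		}
	}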
Link: http://lkml.kernel.org/r/20200811124656.10308-1-erosca@de.adit-jv.com
Fixes: 52f23478081ae0 ("mm/slub.c: fix corrupted freechain in deactivate_slab()")
Signed-off-by: Eugeniu Rosca <erosca(a)de.adit-jv.com>
Cc: Dongli Zhang <dongli.zhang(a)oracle.com>
Cc: Joe Jin <joe.jin(a)oracle.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/mm/slub.c~mm-slub-fix-conversion-of-freelist_corrupted
+++ a/mm/slub.c
@@ -677,7 +677,6 @@ static bool freelist_corrupted(struct km
if ((s->flags & SLAB_CONSISTENCY_CHECKS) &&
!check_valid_pointer(s, page, nextfree)) {
object_err(s, page, freelist, "Freechain corrupt");
- freelist = NULL;
slab_fix(s, "Isolate corrupted freechain");
return true;
}
@@ -2184,8 +2183,10 @@ static void deactivate_slab(struct kmem_
* 'freelist' is already corrupted. So isolate all objects
* starting at 'freelist'.
*/
- if (freelist_corrupted(s, page, freelist, nextfree))
+ if (freelist_corrupted(s, page, freelist, nextfree)) {
+ freelist = NULL;
break;
+ }
do {
prior = page->freelist;
_
Patches currently in -mm which might be from erosca(a)de.adit-jv.com are
mm-slub-fix-conversion-of-freelist_corrupted.patch
Hi,
Can we queue up a backport of:
commit 4c6e277c4cc4a6b3b2b9c66a7b014787ae757cc1
Author: Jens Axboe <axboe(a)kernel.dk>
Date: Wed Jul 1 11:29:10 2020 -0600
io_uring: abstract out task work running
for 5.7 and 5.8 stable? It fixes an issue reported by Dave Chinner,
since the abstraction also ensures that we always set the current
task state appropriately before running task work.
I've attached both a 5.8 and 5.7 port of the patch.
Thanks!
--
Jens Axboe
This is the start of the stable review cycle for the 4.19.139 release.
There are 48 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 12 Aug 2020 15:17:47 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.139-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.139-rc1
Eric Biggers <ebiggers(a)google.com>
Smack: fix use-after-free in smk_write_relabel_self()
Martyna Szapar <martyna.szapar(a)intel.com>
i40e: Memory leak in i40e_config_iwarp_qvlist
Martyna Szapar <martyna.szapar(a)intel.com>
i40e: Fix of memory leak and integer truncation in i40e_virtchnl.c
Grzegorz Siwik <grzegorz.siwik(a)intel.com>
i40e: Wrong truncation from u16 to u8
Sergey Nemov <sergey.nemov(a)intel.com>
i40e: add num_vectors checker in iwarp handler
David Howells <dhowells(a)redhat.com>
rxrpc: Fix race between recvmsg and sendmsg on immediate call failure
Willem de Bruijn <willemb(a)google.com>
selftests/net: relax cpu affinity requirement in msg_zerocopy test
Hangbin Liu <liuhangbin(a)gmail.com>
Revert "vxlan: fix tos value before xmit"
Peilin Ye <yepeilin.cs(a)gmail.com>
openvswitch: Prevent kernel-infoleak in ovs_ct_put_key()
Xin Long <lucien.xin(a)gmail.com>
net: thunderx: use spin_lock_bh in nicvf_set_rx_mode_task()
Lorenzo Bianconi <lorenzo(a)kernel.org>
net: gre: recompute gre csum for sctp over gre tunnels
Stephen Hemminger <stephen(a)networkplumber.org>
hv_netvsc: do not use VF device if link is down
Johan Hovold <johan(a)kernel.org>
net: lan78xx: replace bogus endpoint lookup
Ido Schimmel <idosch(a)mellanox.com>
vxlan: Ensure FDB dump is performed under RCU
Landen Chao <landen.chao(a)mediatek.com>
net: ethernet: mtk_eth_soc: fix MTU warnings
Cong Wang <xiyou.wangcong(a)gmail.com>
ipv6: fix memory leaks on IPV6_ADDRFORM path
Ido Schimmel <idosch(a)mellanox.com>
ipv4: Silence suspicious RCU usage warning
Frank van der Linden <fllinden(a)amazon.com>
xattr: break delegations in {set,remove}xattr
Dexuan Cui <decui(a)microsoft.com>
Drivers: hv: vmbus: Ignore CHANNELMSG_TL_CONNECT_RESULT(23)
Philippe Duplessis-Guindon <pduplessis(a)efficios.com>
tools lib traceevent: Fix memory leak in process_dynamic_array_len
Xin Xiong <xiongx18(a)fudan.edu.cn>
atm: fix atm_dev refcnt leaks in atmtcp_remove_persistent
Francesco Ruggeri <fruggeri(a)arista.com>
igb: reinit_locked() should be called with rtnl_lock
Julian Squires <julian(a)cipht.net>
cfg80211: check vendor command doit pointer before use
Qiushi Wu <wu000273(a)umn.edu>
firmware: Fix a reference count leak.
Rustam Kovhaev <rkovhaev(a)gmail.com>
usb: hso: check for return value in hso_serial_common_create()
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: slave: add sanity check when unregistering
Wolfram Sang <wsa+renesas(a)sang-engineering.com>
i2c: slave: improve sanity check when registering
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/fbcon: zero-initialise the mode_cmd2 structure
Ben Skeggs <bskeggs(a)redhat.com>
drm/nouveau/fbcon: fix module unload when fbcon init has failed for some reason
Christoph Hellwig <hch(a)lst.de>
net/9p: validate fds in p9_fd_open
Johan Hovold <johan(a)kernel.org>
leds: 88pm860x: fix use-after-free on unbind
Johan Hovold <johan(a)kernel.org>
leds: lm3533: fix use-after-free on unbind
Johan Hovold <johan(a)kernel.org>
leds: da903x: fix use-after-free on unbind
Johan Hovold <johan(a)kernel.org>
leds: wm831x-status: fix use-after-free on unbind
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
mtd: properly check all write ioctls for permissions
Yunhai Zhang <zhangyunhai(a)nsfocus.com>
vgacon: Fix for missing check in scrollback handling
Jann Horn <jannh(a)google.com>
binder: Prevent context manager from incrementing ref 0
Adam Ford <aford173(a)gmail.com>
omapfb: dss: Fix max fclk divider for omap36xx
Peilin Ye <yepeilin.cs(a)gmail.com>
Bluetooth: Prevent out-of-bounds read in hci_inquiry_result_with_rssi_evt()
Peilin Ye <yepeilin.cs(a)gmail.com>
Bluetooth: Prevent out-of-bounds read in hci_inquiry_result_evt()
Peilin Ye <yepeilin.cs(a)gmail.com>
Bluetooth: Fix slab-out-of-bounds read in hci_extended_inquiry_result_evt()
Suren Baghdasaryan <surenb(a)google.com>
staging: android: ashmem: Fix lockdep warning for write operation
Takashi Iwai <tiwai(a)suse.de>
ALSA: seq: oss: Serialize ioctls
Hui Wang <hui.wang(a)canonical.com>
Revert "ALSA: hda: call runtime_allow() for all hda controllers"
Forest Crossman <cyrozap(a)gmail.com>
usb: xhci: Fix ASMedia ASM1142 DMA addressing
Forest Crossman <cyrozap(a)gmail.com>
usb: xhci: define IDs for various ASMedia host controllers
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
USB: iowarrior: fix up report size handling for some devices
Erik Ekman <erik(a)kryo.se>
USB: serial: qcserial: add EM7305 QDL product ID
-------------
Diffstat:
Makefile | 4 +-
drivers/android/binder.c | 15 ++-
drivers/atm/atmtcp.c | 10 +-
drivers/firmware/qemu_fw_cfg.c | 7 +-
drivers/gpu/drm/nouveau/nouveau_fbcon.c | 3 +-
drivers/hv/channel_mgmt.c | 21 ++--
drivers/hv/vmbus_drv.c | 4 +
drivers/i2c/i2c-core-slave.c | 7 +-
drivers/leds/leds-88pm860x.c | 14 ++-
drivers/leds/leds-da903x.c | 14 ++-
drivers/leds/leds-lm3533.c | 12 ++-
drivers/leds/leds-wm831x-status.c | 14 ++-
drivers/mtd/mtdchar.c | 56 ++++++++--
drivers/net/ethernet/cavium/thunder/nicvf_main.c | 4 +-
drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 51 ++++++---
drivers/net/ethernet/intel/igb/igb_main.c | 9 ++
drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 +
drivers/net/hyperv/netvsc_drv.c | 7 +-
drivers/net/usb/hso.c | 5 +-
drivers/net/usb/lan78xx.c | 117 ++++++---------------
drivers/net/vxlan.c | 10 +-
drivers/staging/android/ashmem.c | 12 +++
drivers/usb/host/xhci-pci.c | 10 +-
drivers/usb/misc/iowarrior.c | 35 ++++--
drivers/usb/serial/qcserial.c | 1 +
drivers/video/console/vgacon.c | 4 +
drivers/video/fbdev/omap2/omapfb/dss/dss.c | 2 +-
fs/xattr.c | 84 +++++++++++++--
include/linux/hyperv.h | 2 +
include/linux/xattr.h | 2 +
include/net/addrconf.h | 1 +
net/9p/trans_fd.c | 24 +++--
net/bluetooth/hci_event.c | 11 +-
net/ipv4/fib_trie.c | 2 +-
net/ipv4/gre_offload.c | 13 ++-
net/ipv6/anycast.c | 17 ++-
net/ipv6/ipv6_sockglue.c | 1 +
net/openvswitch/conntrack.c | 38 +++----
net/rxrpc/call_object.c | 27 +++--
net/rxrpc/conn_object.c | 8 +-
net/rxrpc/recvmsg.c | 2 +-
net/rxrpc/sendmsg.c | 3 +
net/wireless/nl80211.c | 6 +-
security/smack/smackfs.c | 13 ++-
sound/core/seq/oss/seq_oss.c | 8 +-
sound/pci/hda/hda_intel.c | 1 -
tools/lib/traceevent/event-parse.c | 1 +
tools/testing/selftests/net/msg_zerocopy.c | 5 +-
48 files changed, 488 insertions(+), 231 deletions(-)