Due to using gcc defines for configuration, some labels might be unused in
certain configurations. While adding a __maybe_unused to the label is
fine in general, the line has to be terminated with ';'. This is also
reflected in the gcc documentation, but gcc parsed the previous variant
without an error message.
This has been spotted while compiling with goto-cc, the compiler for the
CPROVER tool suite.
Signed-off-by: Norbert Manthey <nmanthey(a)amazon.de>
Signed-off-by: Michael Tautschnig <tautschn(a)amazon.co.uk>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 26a71eb..3af8fb6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6716,7 +6716,7 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
p = task_of(se);
-done: __maybe_unused
+done: __maybe_unused;
#ifdef CONFIG_SMP
/*
* Move the next running task to the front of
--
2.7.4
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
The patch
ASoC: topology: Fix logical inversion in set_link_hw_format()
has been applied to the asoc tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
>From 7f34c3f38e9666216fda155b6ebededd1073f5aa Mon Sep 17 00:00:00 2001
From: Pan Xiuli <xiuli.pan(a)linux.intel.com>
Date: Fri, 23 Feb 2018 06:01:28 +0800
Subject: [PATCH] ASoC: topology: Fix logical inversion in set_link_hw_format()
The master/slave conventions are wrt. the codec. The topology code
contains a logical inversion, fix to follow ASoC usage.
Signed-off-by: Pan Xiuli <xiuli.pan(a)linux.intel.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
sound/soc/soc-topology.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c
index 01a50413c66f..5f507993c43e 100644
--- a/sound/soc/soc-topology.c
+++ b/sound/soc/soc-topology.c
@@ -1996,11 +1996,11 @@ static void set_link_hw_format(struct snd_soc_dai_link *link,
/* clock masters */
bclk_master = hw_config->bclk_master;
fsync_master = hw_config->fsync_master;
- if (!bclk_master && !fsync_master)
+ if (bclk_master && fsync_master)
link->dai_fmt |= SND_SOC_DAIFMT_CBM_CFM;
- else if (bclk_master && !fsync_master)
- link->dai_fmt |= SND_SOC_DAIFMT_CBS_CFM;
else if (!bclk_master && fsync_master)
+ link->dai_fmt |= SND_SOC_DAIFMT_CBS_CFM;
+ else if (bclk_master && !fsync_master)
link->dai_fmt |= SND_SOC_DAIFMT_CBM_CFS;
else
link->dai_fmt |= SND_SOC_DAIFMT_CBS_CFS;
--
2.16.1
On 02/26/18 20:47, Mark Brown wrote:
> On Mon, Feb 26, 2018 at 07:34:07PM +0100, Kirill Marinushkin wrote:
>
>> 2. The suggested commit breaks the existing ASoC.
>> The existing functionality already works with several existing ASoC by
>> Intel. The suggested commit will break it for the following reason:
>> The ALSA topology mechanism loads the binary topology file into ASoC. The
>> suggested commit modifies the parser, but the binaries are already created for
>> the existing functionality. As a result, all existing binaries will be parsed
>> incorrectly.
> Are there actual binaries using the existing code or is that just
> theoretical? My impression was that this was fixing things for existing
> binaries which were shipped for non-mainline kernels, is that not the
> case?
There is an actual conf, which can be converted into an actual binary:
`alsa-lib/src/conf/topology/broadwell/broadwell.conf`
Theoretically, there can be an unknown number of the binaries, not available publicly.
Broadwell exists for a long time, and there is no way to be sure that all the theoretically
existing binaries are modified together with this kernel patch.
On Mon, 2018-02-12 at 08:24 +0100, Paolo Valente wrote:
>
> > Il giorno 10 feb 2018, alle ore 09:29, Oleksandr Natalenko <oleksandr(a)natalenko.name> ha scritto:
> >
> > Hi.
> >
> > On pátek 9. února 2018 18:29:39 CET Mike Galbraith wrote:
> >> On Fri, 2018-02-09 at 14:21 +0100, Oleksandr Natalenko wrote:
> >>> In addition to this I think it should be worth considering CC'ing Greg
> >>> to pull this fix into 4.15 stable tree.
> >>
> >> This isn't one he can cherry-pick, some munging required, in which case
> >> he usually wants a properly tested backport.
> >>
> >> -Mike
> >
> > Maybe, this could be a good opportunity to push all the pending BFQ patches
> > into the stable 4.15 branch? Because IIUC currently BFQ in 4.15 is just
> > unusable.
> >
> > Paolo?
> >
>
> Of course ok for me, and thanks Oleksandr for proposing this. These
> commits should apply cleanly on 4.15, and FWIW have been tested, by me
> and BFQ users, on 4.15 too in these months.
How does that work without someone actually submitting patches? CC
stable and pass along a conveniently sorted cherry-pick list?
9b25bd0368d5 block, bfq: remove batches of confusing ifdefs
a34b024448eb block, bfq: consider also past I/O in soft real-time detection
4403e4e467c3 block, bfq: remove superfluous check in queue-merging setup
7b8fa3b900a0 block, bfq: let a queue be merged only shortly after starting I/O
1be6e8a964ee block, bfq: check low_latency flag in bfq_bfqq_save_state()
05e902835616 block, bfq: add missing rq_pos_tree update on rq removal
f0ba5ea2fe45 block, bfq: increase threshold to deem I/O as random
0d52af590552 block, bfq: release oom-queue ref to root group on exit
52257ffbfcaf block, bfq: put async queues for root bfq groups too
8abef10b3de1 bfq-iosched: don't call bfqg_and_blkg_put for !CONFIG_BFQ_GROUP_IOSCHED
8993d445df38 block, bfq: fix occurrences of request finish method's old name
8a8747dc01ce block, bfq: limit sectors served with interactive weight raising
a52a69ea89dc block, bfq: limit tags for writes and async I/O
a78773906147 block, bfq: add requeue-request hook
> ---
> >
> > block, bfq: add requeue-request hook
> > bfq-iosched: don't call bfqg_and_blkg_put for !CONFIG_BFQ_GROUP_IOSCHED
> > block, bfq: release oom-queue ref to root group on exit
> > block, bfq: put async queues for root bfq groups too
> > block, bfq: limit sectors served with interactive weight raising
> > block, bfq: limit tags for writes and async I/O
> > block, bfq: increase threshold to deem I/O as random
> > block, bfq: remove superfluous check in queue-merging setup
> > block, bfq: let a queue be merged only shortly after starting I/O
> > block, bfq: check low_latency flag in bfq_bfqq_save_state()
> > block, bfq: add missing rq_pos_tree update on rq removal
> > block, bfq: fix occurrences of request finish method's old name
> > block, bfq: consider also past I/O in soft real-time detection
> > block, bfq: remove batches of confusing ifdefs
> >
> >
>
When an MSI descriptor was not available, the error path would try to
unbind an irq that was never acquired - potentially unbinding an
unrelated irq.
Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
Reported-by: Hooman Mirhadi <mirhadih(a)amazon.com>
CC: <stable(a)vger.kernel.org>
CC: Roger Pau Monné <roger.pau(a)citrix.com>
CC: David Vrabel <david.vrabel(a)citrix.com>
CC: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
CC: Eduardo Valentin <eduval(a)amazon.com>
CC: Juergen Gross <jgross(a)suse.com>
CC: Thomas Gleixner <tglx(a)linutronix.de>
CC: "K. Y. Srinivasan" <kys(a)microsoft.com>
CC: Liu Shuo <shuo.a.liu(a)intel.com>
CC: Anoob Soman <anoob.soman(a)citrix.com>
Signed-off-by: Amit Shah <aams(a)amazon.com>
---
drivers/xen/events/events_base.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 1ab4bd1..b6b8b29 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -749,6 +749,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
}
ret = irq_set_msi_desc(irq, msidesc);
+ i--;
if (ret < 0)
goto error_irq;
out:
--
2.7.3.AMZN
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
commit 8e1eb3fa009aa7c0b944b3c8b26b07de0efb3200 upstream.
At entry userspace may have (maliciously) populated the extra registers
outside the syscall calling convention with arbitrary values that could
be useful in a speculative execution (Spectre style) attack.
Clear these registers to minimize the kernel's attack surface.
Note, this only clears the extra registers and not the unused
registers for syscalls less than 6 arguments, since those registers are
likely to be clobbered well before their values could be put to use
under speculation.
Note, Linus found that the XOR instructions can be executed with
minimized cost if interleaved with the PUSH instructions, and Ingo's
analysis found that R10 and R11 should be included in the register
clearing beyond the typical 'extra' syscall calling convention
registers.
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Reported-by: Andi Kleen <ak(a)linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/151787988577.7847.16733592218894189003.stgit@dwill…
[ Made small improvements to the changelog and the code comments. ]
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
---
arch/x86/entry/entry_64.S | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index db5009ce065a..8d7e4d48db0d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -176,13 +176,26 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
pushq %r8 /* pt_regs->r8 */
pushq %r9 /* pt_regs->r9 */
pushq %r10 /* pt_regs->r10 */
+ /*
+ * Clear extra registers that a speculation attack might
+ * otherwise want to exploit. Interleave XOR with PUSH
+ * for better uop scheduling:
+ */
+ xorq %r10, %r10 /* nospec r10 */
pushq %r11 /* pt_regs->r11 */
+ xorq %r11, %r11 /* nospec r11 */
pushq %rbx /* pt_regs->rbx */
+ xorl %ebx, %ebx /* nospec rbx */
pushq %rbp /* pt_regs->rbp */
+ xorl %ebp, %ebp /* nospec rbp */
pushq %r12 /* pt_regs->r12 */
+ xorq %r12, %r12 /* nospec r12 */
pushq %r13 /* pt_regs->r13 */
+ xorq %r13, %r13 /* nospec r13 */
pushq %r14 /* pt_regs->r14 */
+ xorq %r14, %r14 /* nospec r14 */
pushq %r15 /* pt_regs->r15 */
+ xorq %r15, %r15 /* nospec r15 */
/* IRQs are off. */
movq %rsp, %rdi
commit b70131de648c2b997d22f4653934438013f407a1 upstream.
V4L2 memory registrations are incompatible with filesystem-dax that
needs the ability to revoke dma access to a mapping at will, or
otherwise allow the kernel to wait for completion of DMA. The
filesystem-dax implementation breaks the traditional solution of
truncate of active file backed mappings since there is no page-cache
page we can orphan to sustain ongoing DMA.
If v4l2 wants to support long lived DMA mappings it needs to arrange to
hold a file lease or use some other mechanism so that the kernel can
coordinate revoking DMA access when the filesystem needs to truncate
mappings.
Link: http://lkml.kernel.org/r/151068940499.7446.12846708245365671207.stgit@dwill…
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Reported-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Doug Ledford <dledford(a)redhat.com>
Cc: Hal Rosenstock <hal.rosenstock(a)gmail.com>
Cc: Inki Dae <inki.dae(a)samsung.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Joonyoung Shim <jy0922.shim(a)samsung.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Sean Hefty <sean.hefty(a)intel.com>
Cc: Seung-Woo Kim <sw0312.kim(a)samsung.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
drivers/media/v4l2-core/videobuf-dma-sg.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c b/drivers/media/v4l2-core/videobuf-dma-sg.c
index 1db0af6c7f94..b6189a4958c5 100644
--- a/drivers/media/v4l2-core/videobuf-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf-dma-sg.c
@@ -185,12 +185,13 @@ static int videobuf_dma_init_user_locked(struct videobuf_dmabuf *dma,
dprintk(1, "init user [0x%lx+0x%lx => %d pages]\n",
data, size, dma->nr_pages);
- err = get_user_pages(data & PAGE_MASK, dma->nr_pages,
+ err = get_user_pages_longterm(data & PAGE_MASK, dma->nr_pages,
flags, dma->pages, NULL);
if (err != dma->nr_pages) {
dma->nr_pages = (err >= 0) ? err : 0;
- dprintk(1, "get_user_pages: err=%d [%d]\n", err, dma->nr_pages);
+ dprintk(1, "get_user_pages_longterm: err=%d [%d]\n", err,
+ dma->nr_pages);
return err < 0 ? err : -EINVAL;
}
return 0;
commit 2bb6d2837083de722bfdc369cb0d76ce188dd9b4 upstream.
Patch series "introduce get_user_pages_longterm()", v2.
Here is a new get_user_pages api for cases where a driver intends to
keep an elevated page count indefinitely. This is distinct from usages
like iov_iter_get_pages where the elevated page counts are transient.
The iov_iter_get_pages cases immediately turn around and submit the
pages to a device driver which will put_page when the i/o operation
completes (under kernel control).
In the longterm case userspace is responsible for dropping the page
reference at some undefined point in the future. This is untenable for
filesystem-dax case where the filesystem is in control of the lifetime
of the block / page and needs reasonable limits on how long it can wait
for pages in a mapping to become idle.
Fixing filesystems to actually wait for dax pages to be idle before
blocks from a truncate/hole-punch operation are repurposed is saved for
a later patch series.
Also, allowing longterm registration of dax mappings is a future patch
series that introduces a "map with lease" semantic where the kernel can
revoke a lease and force userspace to drop its page references.
I have also tagged these for -stable to purposely break cases that might
assume that longterm memory registrations for filesystem-dax mappings
were supported by the kernel. The behavior regression this policy
change implies is one of the reasons we maintain the "dax enabled.
Warning: EXPERIMENTAL, use at your own risk" notification when mounting
a filesystem in dax mode.
It is worth noting the device-dax interface does not suffer the same
constraints since it does not support file space management operations
like hole-punch.
This patch (of 4):
Until there is a solution to the dma-to-dax vs truncate problem it is
not safe to allow long standing memory registrations against
filesytem-dax vmas. Device-dax vmas do not have this problem and are
explicitly allowed.
This is temporary until a "memory registration with layout-lease"
mechanism can be implemented for the affected sub-systems (RDMA and
V4L2).
[akpm(a)linux-foundation.org: use kcalloc()]
Link: http://lkml.kernel.org/r/151068939435.7446.13560129395419350737.stgit@dwill…
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Suggested-by: Christoph Hellwig <hch(a)lst.de>
Cc: Doug Ledford <dledford(a)redhat.com>
Cc: Hal Rosenstock <hal.rosenstock(a)gmail.com>
Cc: Inki Dae <inki.dae(a)samsung.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Joonyoung Shim <jy0922.shim(a)samsung.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Sean Hefty <sean.hefty(a)intel.com>
Cc: Seung-Woo Kim <sw0312.kim(a)samsung.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
include/linux/dax.h | 5 ----
include/linux/fs.h | 20 ++++++++++++++++
include/linux/mm.h | 13 ++++++++++
mm/gup.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 97 insertions(+), 5 deletions(-)
diff --git a/include/linux/dax.h b/include/linux/dax.h
index add6c4bc568f..ed9cf2f5cd06 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -61,11 +61,6 @@ static inline int dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr,
int dax_pfn_mkwrite(struct vm_area_struct *, struct vm_fault *);
#define dax_mkwrite(vma, vmf, gb) dax_fault(vma, vmf, gb)
-static inline bool vma_is_dax(struct vm_area_struct *vma)
-{
- return vma->vm_file && IS_DAX(vma->vm_file->f_mapping->host);
-}
-
static inline bool dax_mapping(struct address_space *mapping)
{
return mapping->host && IS_DAX(mapping->host);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d705ae084edd..745ea1b2e02c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -18,6 +18,7 @@
#include <linux/bug.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
+#include <linux/mm_types.h>
#include <linux/capability.h>
#include <linux/semaphore.h>
#include <linux/fiemap.h>
@@ -3033,6 +3034,25 @@ static inline bool io_is_direct(struct file *filp)
return (filp->f_flags & O_DIRECT) || IS_DAX(filp->f_mapping->host);
}
+static inline bool vma_is_dax(struct vm_area_struct *vma)
+{
+ return vma->vm_file && IS_DAX(vma->vm_file->f_mapping->host);
+}
+
+static inline bool vma_is_fsdax(struct vm_area_struct *vma)
+{
+ struct inode *inode;
+
+ if (!vma->vm_file)
+ return false;
+ if (!vma_is_dax(vma))
+ return false;
+ inode = file_inode(vma->vm_file);
+ if (inode->i_mode == S_IFCHR)
+ return false; /* device-dax */
+ return true;
+}
+
static inline int iocb_flags(struct file *file)
{
int res = 0;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2217e2f18247..8e506783631b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1288,6 +1288,19 @@ long __get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm,
struct page **pages, unsigned int gup_flags);
long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
struct page **pages, unsigned int gup_flags);
+#ifdef CONFIG_FS_DAX
+long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
+ unsigned int gup_flags, struct page **pages,
+ struct vm_area_struct **vmas);
+#else
+static inline long get_user_pages_longterm(unsigned long start,
+ unsigned long nr_pages, unsigned int gup_flags,
+ struct page **pages, struct vm_area_struct **vmas)
+{
+ return get_user_pages(start, nr_pages, gup_flags, pages, vmas);
+}
+#endif /* CONFIG_FS_DAX */
+
int get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages);
diff --git a/mm/gup.c b/mm/gup.c
index c63a0341ae38..6c3b4e822946 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -982,6 +982,70 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
}
EXPORT_SYMBOL(get_user_pages);
+#ifdef CONFIG_FS_DAX
+/*
+ * This is the same as get_user_pages() in that it assumes we are
+ * operating on the current task's mm, but it goes further to validate
+ * that the vmas associated with the address range are suitable for
+ * longterm elevated page reference counts. For example, filesystem-dax
+ * mappings are subject to the lifetime enforced by the filesystem and
+ * we need guarantees that longterm users like RDMA and V4L2 only
+ * establish mappings that have a kernel enforced revocation mechanism.
+ *
+ * "longterm" == userspace controlled elevated page count lifetime.
+ * Contrast this to iov_iter_get_pages() usages which are transient.
+ */
+long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
+ unsigned int gup_flags, struct page **pages,
+ struct vm_area_struct **vmas_arg)
+{
+ struct vm_area_struct **vmas = vmas_arg;
+ struct vm_area_struct *vma_prev = NULL;
+ long rc, i;
+
+ if (!pages)
+ return -EINVAL;
+
+ if (!vmas) {
+ vmas = kcalloc(nr_pages, sizeof(struct vm_area_struct *),
+ GFP_KERNEL);
+ if (!vmas)
+ return -ENOMEM;
+ }
+
+ rc = get_user_pages(start, nr_pages, gup_flags, pages, vmas);
+
+ for (i = 0; i < rc; i++) {
+ struct vm_area_struct *vma = vmas[i];
+
+ if (vma == vma_prev)
+ continue;
+
+ vma_prev = vma;
+
+ if (vma_is_fsdax(vma))
+ break;
+ }
+
+ /*
+ * Either get_user_pages() failed, or the vma validation
+ * succeeded, in either case we don't need to put_page() before
+ * returning.
+ */
+ if (i >= rc)
+ goto out;
+
+ for (i = 0; i < rc; i++)
+ put_page(pages[i]);
+ rc = -EOPNOTSUPP;
+out:
+ if (vmas != vmas_arg)
+ kfree(vmas);
+ return rc;
+}
+EXPORT_SYMBOL(get_user_pages_longterm);
+#endif /* CONFIG_FS_DAX */
+
/**
* populate_vma_page_range() - populate a range of pages in the vma.
* @vma: target vma
This is a note to let you know that I've just added the patch titled
mm: Fix devm_memremap_pages() collision handling
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-fix-devm_memremap_pages-collision-handling.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Feb 26 20:55:53 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Fri, 23 Feb 2018 14:06:10 -0800
Subject: mm: Fix devm_memremap_pages() collision handling
To: gregkh(a)linuxfoundation.org
Cc:
Message-ID: <151942357089.21775.3486425046348885247.stgit(a)dwillia2-desk3.amr.corp.intel.com>
From: Jan H. Schönherr <jschoenh(a)amazon.de>
commit 77dd66a3c67c93ab401ccc15efff25578be281fd upstream.
If devm_memremap_pages() detects a collision while adding entries
to the radix-tree, we call pgmap_radix_release(). Unfortunately,
the function removes *all* entries for the range -- including the
entries that caused the collision in the first place.
Modify pgmap_radix_release() to take an additional argument to
indicate where to stop, so that only newly added entries are removed
from the tree.
Cc: <stable(a)vger.kernel.org>
Fixes: 9476df7d80df ("mm: introduce find_dev_pagemap()")
Signed-off-by: Jan H. Schönherr <jschoenh(a)amazon.de>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/memremap.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -194,7 +194,7 @@ void put_zone_device_page(struct page *p
}
EXPORT_SYMBOL(put_zone_device_page);
-static void pgmap_radix_release(struct resource *res)
+static void pgmap_radix_release(struct resource *res, resource_size_t end_key)
{
resource_size_t key, align_start, align_size, align_end;
@@ -203,8 +203,11 @@ static void pgmap_radix_release(struct r
align_end = align_start + align_size - 1;
mutex_lock(&pgmap_lock);
- for (key = res->start; key <= res->end; key += SECTION_SIZE)
+ for (key = res->start; key <= res->end; key += SECTION_SIZE) {
+ if (key >= end_key)
+ break;
radix_tree_delete(&pgmap_radix, key >> PA_SECTION_SHIFT);
+ }
mutex_unlock(&pgmap_lock);
}
@@ -255,7 +258,7 @@ static void devm_memremap_pages_release(
unlock_device_hotplug();
untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
- pgmap_radix_release(res);
+ pgmap_radix_release(res, -1);
dev_WARN_ONCE(dev, pgmap->altmap && pgmap->altmap->alloc,
"%s: failed to free all reserved pages\n", __func__);
}
@@ -289,7 +292,7 @@ struct dev_pagemap *find_dev_pagemap(res
void *devm_memremap_pages(struct device *dev, struct resource *res,
struct percpu_ref *ref, struct vmem_altmap *altmap)
{
- resource_size_t key, align_start, align_size, align_end;
+ resource_size_t key = 0, align_start, align_size, align_end;
pgprot_t pgprot = PAGE_KERNEL;
struct dev_pagemap *pgmap;
struct page_map *page_map;
@@ -392,7 +395,7 @@ void *devm_memremap_pages(struct device
untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
err_pfn_remap:
err_radix:
- pgmap_radix_release(res);
+ pgmap_radix_release(res, key);
devres_free(page_map);
return ERR_PTR(error);
}
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/mm-fix-devm_memremap_pages-collision-handling.patch
queue-4.9/ib-core-disable-memory-registration-of-filesystem-dax-vmas.patch
queue-4.9/mm-avoid-spurious-bad-pmd-warning-messages.patch
queue-4.9/mm-introduce-get_user_pages_longterm.patch
queue-4.9/mm-fail-get_vaddr_frames-for-filesystem-dax-mappings.patch
queue-4.9/fs-dax.c-fix-inefficiency-in-dax_writeback_mapping_range.patch
queue-4.9/device-dax-implement-split-to-catch-invalid-munmap-attempts.patch
queue-4.9/v4l2-disable-filesystem-dax-mapping-support.patch
queue-4.9/libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch
queue-4.9/x86-entry-64-clear-extra-registers-beyond-syscall-arguments-to-reduce-speculation-attack-surface.patch
queue-4.9/libnvdimm-fix-integer-overflow-static-analysis-warning.patch
commit b7f0554a56f21fb3e636a627450a9add030889be upstream.
Until there is a solution to the dma-to-dax vs truncate problem it is
not safe to allow V4L2, Exynos, and other frame vector users to create
long standing / irrevocable memory registrations against filesytem-dax
vmas.
[dan.j.williams(a)intel.com: add comment for vma_is_fsdax() check in get_vaddr_frames(), per Jan]
Link: http://lkml.kernel.org/r/151197874035.26211.4061781453123083667.stgit@dwill…
Link: http://lkml.kernel.org/r/151068939985.7446.15684639617389154187.stgit@dwill…
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Inki Dae <inki.dae(a)samsung.com>
Cc: Seung-Woo Kim <sw0312.kim(a)samsung.com>
Cc: Joonyoung Shim <jy0922.shim(a)samsung.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Doug Ledford <dledford(a)redhat.com>
Cc: Hal Rosenstock <hal.rosenstock(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Sean Hefty <sean.hefty(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
mm/frame_vector.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index db77dcb38afd..375a103d7a56 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -52,6 +52,18 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
ret = -EFAULT;
goto out;
}
+
+ /*
+ * While get_vaddr_frames() could be used for transient (kernel
+ * controlled lifetime) pinning of memory pages all current
+ * users establish long term (userspace controlled lifetime)
+ * page pinning. Treat get_vaddr_frames() like
+ * get_user_pages_longterm() and disallow it for filesystem-dax
+ * mappings.
+ */
+ if (vma_is_fsdax(vma))
+ return -EOPNOTSUPP;
+
if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
vec->got_ref = true;
vec->is_pfns = false;