On Fri, Mar 02, 2018 at 02:58:54PM +0100, Wolfram Sang wrote:
>
> > It needs platform maintainers to be motivated to fix it, and one way to
> > provide that motivation is for subsystem maintainers to say no to patches
> > like this. If patches like this get accepted, then the "problem" gets
> > solved, and there is very little motivation to fix the platform itself.
>
> Yes, I can see this. I will drop / revert the patch.
>
TBH, I can't find the threads from November so I feel a bit lost and
there is no documentation for platform_get_irq().
regards,
dan carpenter
With the alc289, the Pin 0x1b is Headphone-Mic, so we should assign
ALC269_FIXUP_DELL4_MIC_NO_PRESENCE rather than
ALC225_FIXUP_DELL1_MIC_NO_PRESENCE to it. And this change is suggested
by Kailang of Realtek and is verified on the machine.
(This fixes the commit 3f2f7c55)
Cc: Kailang Yang <kailang(a)realtek.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang(a)canonical.com>
---
sound/pci/hda/patch_realtek.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index b9c93fa..7a9a867 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -6872,7 +6872,7 @@ static const struct snd_hda_pin_quirk alc269_pin_fixup_tbl[] = {
{0x12, 0x90a60120},
{0x14, 0x90170110},
{0x21, 0x0321101f}),
- SND_HDA_PIN_QUIRK(0x10ec0289, 0x1028, "Dell", ALC225_FIXUP_DELL1_MIC_NO_PRESENCE,
+ SND_HDA_PIN_QUIRK(0x10ec0289, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE,
{0x12, 0xb7a60130},
{0x14, 0x90170110},
{0x21, 0x04211020}),
--
2.7.4
This is the start of the stable review cycle for the 4.4.115 release.
There are 67 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Feb 4 14:07:31 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.115-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.115-rc1
Stefan Agner <stefan(a)agner.ch>
spi: imx: do not access registers while clocks disabled
Fabio Estevam <fabio.estevam(a)nxp.com>
serial: imx: Only wakeup via RTSDEN bit if the system has RTS/CTS
Mark Salyzyn <salyzyn(a)android.com>
selinux: general protection fault in sock_has_perm
Oliver Neukum <oneukum(a)suse.com>
usb: uas: unconditionally bring back host after reset
Hemant Kumar <hemantk(a)codeaurora.org>
usb: f_fs: Prevent gadget unbind if it is already unbound
Johan Hovold <johan(a)kernel.org>
USB: serial: simple: add Motorola Tetra driver
Shuah Khan <shuahkh(a)osg.samsung.com>
usbip: list: don't list devices attached to vhci_hcd
Shuah Khan <shuahkh(a)osg.samsung.com>
usbip: prevent bind loops on devices attached to vhci_hcd
Jia-Ju Bai <baijiaju1990(a)gmail.com>
USB: serial: io_edgeport: fix possible sleep-in-atomic
Oliver Neukum <oneukum(a)suse.com>
CDC-ACM: apply quirk for card reader
Hans de Goede <hdegoede(a)redhat.com>
USB: cdc-acm: Do not log urb submission errors on disconnect
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
USB: serial: pl2303: new device id for Chilitag
OKAMOTO Yoshiaki <yokamoto(a)allied-telesis.co.jp>
usb: option: Add support for FS040U modem
Larry Finger <Larry.Finger(a)lwfinger.net>
staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID
Colin Ian King <colin.king(a)canonical.com>
usb: gadget: don't dereference g until after it has been null checked
Icenowy Zheng <icenowy(a)aosc.io>
media: usbtv: add a new usbid
Gustavo A. R. Silva <garsilva(a)embeddedor.com>
scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg
Guilherme G. Piccoli <gpiccoli(a)linux.vnet.ibm.com>
scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path
Darrick J. Wong <darrick.wong(a)oracle.com>
xfs: ubsan fixes
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
drm/omap: Fix error handling path in 'omap_dmm_probe()'
Yisheng Xie <xieyisheng1(a)huawei.com>
kmemleak: add scheduling point to kmemleak_scan()
Trond Myklebust <trond.myklebust(a)primarydata.com>
SUNRPC: Allow connect to return EHOSTUNREACH
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
quota: Check for register_shrinker() failure.
Geert Uytterhoeven <geert+renesas(a)glider.be>
net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit
Robert Lippert <roblip(a)gmail.com>
hwmon: (pmbus) Use 64bit math for DIRECT format values
Vasily Averin <vvs(a)virtuozzo.com>
lockd: fix "list_add double add" caused by legacy signal interface
Andrew Elble <aweits(a)rit.edu>
nfsd: check for use of the closed special stateid
Vasily Averin <vvs(a)virtuozzo.com>
grace: replace BUG_ON by WARN_ONCE in exit_net hook
Trond Myklebust <trond.myklebust(a)primarydata.com>
nfsd: Ensure we check stateid validity in the seqid operation checks
Trond Myklebust <trond.myklebust(a)primarydata.com>
nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0)
Eduardo Otubo <otubo(a)redhat.com>
xen-netfront: remove warning when unloading module
Wanpeng Li <wanpeng.li(a)hotmail.com>
KVM: VMX: Fix rflags cache during vCPU reset
Josef Bacik <jbacik(a)fb.com>
btrfs: fix deadlock when writing out space cache
Chun-Yeow Yeoh <yeohchunyeow(a)gmail.com>
mac80211: fix the update of path metric for RANN frame
zhangliping <zhangliping02(a)baidu.com>
openvswitch: fix the incorrect flow action alloc size
Felix Kuehling <Felix.Kuehling(a)amd.com>
drm/amdkfd: Fix SDMA oversubsription handling
shaoyunl <Shaoyun.Liu(a)amd.com>
drm/amdkfd: Fix SDMA ring buffer size calculation
Felix Kuehling <Felix.Kuehling(a)amd.com>
drm/amdgpu: Fix SDMA load/unload sequence on HWS disabled mode
Michael Lyle <mlyle(a)lyle.org>
bcache: check return value of register_shrinker
James Hogan <jhogan(a)kernel.org>
cpufreq: Add Loongson machine dependencies
Hans de Goede <hdegoede(a)redhat.com>
ACPI / bus: Leave modalias empty for devices which are not present
Nikita Leshenko <nikita.leshchenko(a)oracle.com>
KVM: x86: ioapic: Preserve read-only values in the redirection table
Nikita Leshenko <nikita.leshchenko(a)oracle.com>
KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered
Nikita Leshenko <nikita.leshchenko(a)oracle.com>
KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race
Wanpeng Li <wanpeng.li(a)hotmail.com>
KVM: X86: Fix operand/address-size during instruction decoding
Liran Alon <liran.alon(a)oracle.com>
KVM: x86: Don't re-execute instruction when not passing CR2 value
Liran Alon <liran.alon(a)oracle.com>
KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
Lyude Paul <lyude(a)redhat.com>
igb: Free IRQs when device is hotplugged
Jesse Chan <jc(a)linux.com>
mtd: nand: denali_pci: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Jesse Chan <jc(a)linux.com>
gpio: ath79: add missing MODULE_DESCRIPTION/LICENSE
Jesse Chan <jc(a)linux.com>
gpio: iop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Jesse Chan <jc(a)linux.com>
power: reset: zx-reboot: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE
Stephan Mueller <smueller(a)chronox.de>
crypto: af_alg - whitelist mask and type
Stephan Mueller <smueller(a)chronox.de>
crypto: aesni - handle zero length dst buffer
Takashi Iwai <tiwai(a)suse.de>
ALSA: seq: Make ioctls race-free
Hugh Dickins <hughd(a)google.com>
kaiser: fix intel_bts perf crashes
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/pti: Make unpoison of pgd for trusted boot work for real
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: reject stores into ctx via st and xadd
Alexei Starovoitov <ast(a)kernel.org>
bpf: fix 32-bit divide by zero
Eric Dumazet <edumazet(a)google.com>
bpf: fix divides by zero
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: avoid false sharing of map refcount with max_entries
Daniel Borkmann <daniel(a)iogearbox.net>
bpf: arsh is not supported in 32 bit alu thus reject it
Alexei Starovoitov <ast(a)kernel.org>
bpf: introduce BPF_JIT_ALWAYS_ON config
Alexei Starovoitov <ast(a)fb.com>
bpf: fix bpf_tail_call() x64 JIT
Eric Dumazet <edumazet(a)google.com>
x86: bpf_jit: small optimization in emit_bpf_tail_call()
Alexei Starovoitov <ast(a)fb.com>
bpf: fix branch pruning logic
Linus Torvalds <torvalds(a)linux-foundation.org>
loop: fix concurrent lo_open/lo_release
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/Kconfig | 1 +
arch/s390/Kconfig | 1 +
arch/x86/Kconfig | 1 +
arch/x86/crypto/aesni-intel_glue.c | 2 +-
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kernel/cpu/perf_event_intel_bts.c | 44 ++++++++++----
arch/x86/kernel/tboot.c | 10 ++++
arch/x86/kvm/emulate.c | 7 +++
arch/x86/kvm/ioapic.c | 20 ++++++-
arch/x86/kvm/vmx.c | 4 +-
arch/x86/kvm/x86.c | 2 +-
arch/x86/net/bpf_jit_comp.c | 13 ++--
crypto/af_alg.c | 10 ++--
drivers/acpi/device_sysfs.c | 4 ++
drivers/block/loop.c | 10 +++-
drivers/cpufreq/Kconfig | 2 +
drivers/gpio/gpio-ath79.c | 3 +
drivers/gpio/gpio-iop.c | 4 ++
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 47 +++++++++++----
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 4 +-
.../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 18 ++++++
drivers/gpu/drm/omapdrm/omap_dmm_tiler.c | 3 +-
drivers/hwmon/pmbus/pmbus_core.c | 21 ++++---
drivers/md/bcache/btree.c | 5 +-
drivers/media/usb/usbtv/usbtv-core.c | 1 +
drivers/mtd/nand/denali_pci.c | 4 ++
drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
drivers/net/ethernet/xilinx/Kconfig | 1 +
drivers/net/xen-netfront.c | 18 ++++++
drivers/power/reset/zx-reboot.c | 4 ++
drivers/scsi/aacraid/commsup.c | 2 +-
drivers/scsi/ufs/ufshcd.c | 7 ++-
drivers/spi/spi-imx.c | 15 ++++-
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 14 ++---
drivers/tty/serial/imx.c | 14 +++--
drivers/usb/class/cdc-acm.c | 5 +-
drivers/usb/gadget/composite.c | 7 ++-
drivers/usb/gadget/function/f_fs.c | 3 +-
drivers/usb/serial/Kconfig | 1 +
drivers/usb/serial/io_edgeport.c | 1 -
drivers/usb/serial/option.c | 5 ++
drivers/usb/serial/pl2303.c | 1 +
drivers/usb/serial/pl2303.h | 1 +
drivers/usb/serial/usb-serial-simple.c | 7 +++
drivers/usb/storage/uas.c | 7 +--
fs/btrfs/free-space-cache.c | 3 +-
fs/nfs_common/grace.c | 10 +++-
fs/nfsd/nfs4state.c | 34 ++++++-----
fs/quota/dquot.c | 3 +-
fs/xfs/xfs_aops.c | 6 +-
include/linux/bpf.h | 16 +++--
init/Kconfig | 7 +++
kernel/bpf/core.c | 30 ++++++++--
kernel/bpf/verifier.c | 70 ++++++++++++++++++++++
lib/test_bpf.c | 13 ++--
mm/kmemleak.c | 2 +
net/Kconfig | 3 +
net/core/filter.c | 8 ++-
net/core/sysctl_net_core.c | 6 ++
net/mac80211/mesh_hwmp.c | 15 +++--
net/openvswitch/flow_netlink.c | 16 ++---
net/socket.c | 9 +++
net/sunrpc/xprtsock.c | 1 +
security/selinux/hooks.c | 2 +
sound/core/seq/seq_clientmgr.c | 10 +++-
sound/core/seq/seq_clientmgr.h | 1 +
tools/usb/usbip/src/usbip_bind.c | 9 +++
tools/usb/usbip/src/usbip_list.c | 9 +++
69 files changed, 504 insertions(+), 142 deletions(-)
Rediffed for RHEL7.4.z.
Conflicts:
The return value of md_make_request is different from RHEL to upstream.
And RHEL7 MD code does not use bio->bi_opf, but bio->bi_rw.
-Nigel Croxon
commit 393debc23c7820211d1c8253dd6a8408a7628fe7
Author: Shaohua Li <shli(a)fb.com>
Date: Thu Sep 21 10:23:35 2017 -0700
md: separate request handling
With commit cc27b0c78c79, pers->make_request could bail out without handling
the bio. If that happens, we should retry. The commit fixes md_make_request
but not other call sites. Separate the request handling part, so other call
sites can use it.
Reported-by: Nate Dailey <nate.dailey(a)stratus.com>
Fix: cc27b0c78c79(md: fix deadlock between mddev_suspend() and md_write_start())
Cc: stable(a)vger.kernel.org
Reviewed-by: NeilBrown <neilb(a)suse.com>
Signed-off-by: Shaohua Li <shli(a)fb.com>
Signed-off-by: Denys Vlasenko <dvlasenk(a)redhat.com>
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 85fe7a99290..407e15f4bfe 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -248,21 +248,8 @@ static DEFINE_SPINLOCK(all_mddevs_lock);
* call has finished, the bio has been linked into some internal structure
* and so is visible to ->quiesce(), so we don't need the refcount any more.
*/
-static void md_make_request(struct request_queue *q, struct bio *bio)
+void md_handle_request(struct mddev *mddev, struct bio *bio)
{
- const int rw = bio_data_dir(bio);
- struct mddev *mddev = q->queuedata;
- int cpu;
- unsigned int sectors;
-
- if (mddev == NULL || mddev->pers == NULL) {
- bio_io_error(bio);
- return;
- }
- if (mddev->ro == 1 && unlikely(rw == WRITE)) {
- bio_endio(bio, bio_sectors(bio) == 0 ? 0 : -EROFS);
- return;
- }
check_suspended:
smp_rmb(); /* Ensure implications of 'active' are visible */
rcu_read_lock();
@@ -282,24 +269,45 @@ check_suspended:
atomic_inc(&mddev->active_io);
rcu_read_unlock();
- /*
- * save the sectors now since our bio can
- * go away inside make_request
- */
- sectors = bio_sectors(bio);
if (!mddev->pers->make_request(mddev, bio)) {
atomic_dec(&mddev->active_io);
wake_up(&mddev->sb_wait);
goto check_suspended;
}
+ if (atomic_dec_and_test(&mddev->active_io) && mddev->suspended)
+ wake_up(&mddev->sb_wait);
+}
+EXPORT_SYMBOL(md_handle_request);
+
+static void md_make_request(struct request_queue *q, struct bio *bio)
+{
+ const int rw = bio_data_dir(bio);
+ struct mddev *mddev = q->queuedata;
+ int cpu;
+ unsigned int sectors;
+
+ if (mddev == NULL || mddev->pers == NULL) {
+ bio_io_error(bio);
+ return;
+ }
+ if (mddev->ro == 1 && unlikely(rw == WRITE)) {
+ bio_endio(bio, bio_sectors(bio) == 0 ? 0 : -EROFS);
+ return;
+ }
+
+ /*
+ * save the sectors now since our bio can
+ * go away inside make_request
+ */
+ sectors = bio_sectors(bio);
+
+ md_handle_request(mddev, bio);
+
cpu = part_stat_lock();
part_stat_inc(cpu, &mddev->gendisk->part0, ios[rw]);
part_stat_add(cpu, &mddev->gendisk->part0, sectors[rw], sectors);
part_stat_unlock();
-
- if (atomic_dec_and_test(&mddev->active_io) && mddev->suspended)
- wake_up(&mddev->sb_wait);
}
/* mddev_suspend makes sure no new requests are submitted
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 0d13bf88f41..12e19d6d373 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -679,6 +679,7 @@ extern void md_stop_writes(struct mddev *mddev);
extern int md_rdev_init(struct md_rdev *rdev);
extern void md_rdev_clear(struct md_rdev *rdev);
+extern void md_handle_request(struct mddev *mddev, struct bio *bio);
extern void mddev_suspend(struct mddev *mddev);
extern void mddev_resume(struct mddev *mddev);
extern struct bio *bio_clone_mddev(struct bio *bio, gfp_t gfp_mask,
--
2.16.2
Changes since v4 [1]:
* Fix the changelog of "dax: introduce IS_DEVDAX() and IS_FSDAX()" to
better clarify the need for new helpers (Jan)
* Replace dax_sem_is_locked() with dax_sem_assert_held() (Jan)
* Use file_inode() in vma_is_dax() (Jan)
* Resend the full series to linux-xfs@ (Dave)
* Collect Jan's Reviewed-by
[1]: https://lists.01.org/pipermail/linux-nvdimm/2018-February/014271.html
---
The vfio interface, like RDMA, wants to setup long term (indefinite)
pins of the pages backing an address range so that a guest or userspace
driver can perform DMA to the with physical address. Given that this
pinning may lead to filesystem operations deadlocking in the
filesystem-dax case, the pinning request needs to be rejected.
The longer term fix for vfio, RDMA, and any other long term pin user, is
to provide a 'pin with lease' mechanism. Similar to the leases that are
hold for pNFS RDMA layouts, this userspace lease gives the kernel a way
to notify userspace that the block layout of the file is changing and
the kernel is revoking access to pinned pages.
Related to this change is the discovery that vma_is_fsdax() was causing
device-dax inode detection to fail. That lead to series of fixes and
cleanups to make sure that S_DAX is defined correctly in the
CONFIG_FS_DAX=n + CONFIG_DEV_DAX=y case.
---
Dan Williams (12):
dax: fix vma_is_fsdax() helper
dax: introduce IS_DEVDAX() and IS_FSDAX()
ext2, dax: finish implementing dax_sem helpers
ext2, dax: define ext2_dax_*() infrastructure in all cases
ext4, dax: define ext4_dax_*() infrastructure in all cases
ext2, dax: replace IS_DAX() with IS_FSDAX()
ext4, dax: replace IS_DAX() with IS_FSDAX()
xfs, dax: replace IS_DAX() with IS_FSDAX()
mm, dax: replace IS_DAX() with IS_DEVDAX() or IS_FSDAX()
fs, dax: kill IS_DAX()
dax: fix S_DAX definition
vfio: disable filesystem-dax page pinning
drivers/vfio/vfio_iommu_type1.c | 18 ++++++++++++++--
fs/ext2/ext2.h | 6 +++++
fs/ext2/file.c | 19 +++++------------
fs/ext2/inode.c | 10 ++++-----
fs/ext4/file.c | 18 +++++-----------
fs/ext4/inode.c | 4 ++--
fs/ext4/ioctl.c | 2 +-
fs/ext4/super.c | 2 +-
fs/iomap.c | 2 +-
fs/xfs/xfs_file.c | 14 ++++++-------
fs/xfs/xfs_ioctl.c | 4 ++--
fs/xfs/xfs_iomap.c | 6 +++--
fs/xfs/xfs_reflink.c | 2 +-
include/linux/dax.h | 12 ++++++++---
include/linux/fs.h | 43 ++++++++++++++++++++++++++++-----------
mm/fadvise.c | 3 ++-
mm/filemap.c | 4 ++--
mm/huge_memory.c | 4 +++-
mm/madvise.c | 3 ++-
19 files changed, 102 insertions(+), 74 deletions(-)
The patch titled
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memmap_init_zone…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memmap_init_zone…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Vacek <neelx(a)redhat.com>
Subject: mm/page_alloc: fix memmap_init_zone pageblock alignment
b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where
possible") introduced a bug where move_freepages() triggers a VM_BUG_ON()
on uninitialized page structure due to pageblock alignment. To fix this,
simply align the skipped pfns in memmap_init_zone() the same way as in
move_freepages_block().
Seen in one of the RHEL reports:
crash> log | grep -e BUG -e RIP -e Call.Trace -e move_freepages_block -e rmqueue -e freelist -A1
kernel BUG at mm/page_alloc.c:1389!
invalid opcode: 0000 [#1] SMP
--
RIP: 0010:[<ffffffff8118833e>] [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP: 0018:ffff88054d727688 EFLAGS: 00010087
--
Call Trace:
[<ffffffff811883b3>] move_freepages_block+0x73/0x80
[<ffffffff81189e63>] __rmqueue+0x263/0x460
[<ffffffff8118c781>] get_page_from_freelist+0x7e1/0x9e0
[<ffffffff8118caf6>] __alloc_pages_nodemask+0x176/0x420
--
RIP [<ffffffff8118833e>] move_freepages+0x15e/0x160
RSP <ffff88054d727688>
crash> page_init_bug -v | grep RAM
<struct resource 0xffff88067fffd2f8> 1000 - 9bfff System RAM (620.00 KiB)
<struct resource 0xffff88067fffd3a0> 100000 - 430bffff System RAM ( 1.05 GiB = 1071.75 MiB = 1097472.00 KiB)
<struct resource 0xffff88067fffd410> 4b0c8000 - 4bf9cfff System RAM ( 14.83 MiB = 15188.00 KiB)
<struct resource 0xffff88067fffd480> 4bfac000 - 646b1fff System RAM (391.02 MiB = 400408.00 KiB)
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct resource 0xffff88067fffd640> 100000000 - 67fffffff System RAM ( 22.00 GiB)
crash> page_init_bug | head -6
<struct resource 0xffff88067fffd560> 7b788000 - 7b7fffff System RAM (480.00 KiB)
<struct page 0xffffea0001ede200> 1fffff00000000 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
<struct page 0xffffea0001ede200> 505736 505344 <struct page 0xffffea0001ed8000> 505855 <struct page 0xffffea0001edffc0>
<struct page 0xffffea0001ed8000> 0 0 <struct pglist_data 0xffff88047ffd9000> 0 <struct zone 0xffff88047ffd9000> DMA 1 4095
<struct page 0xffffea0001edffc0> 1fffff00000400 0 <struct pglist_data 0xffff88047ffd9000> 1 <struct zone 0xffff88047ffd9800> DMA32 4096 1048575
BUG, zones differ!
Note that this range follows two not populated sections 68000000-77ffffff
in this zone. 7b788000-7b7fffff is the first one after a gap. This makes
memmap_init_zone() skip all the pfns up to the beginning of this range.
But this range is not pageblock (2M) aligned. In fact no range has to be.
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b787000 7b788000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001ed7fc0 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 0 0 <<<<
ffffea0001ede1c0 7b787000 0 0 0 0
ffffea0001ede200 7b788000 0 0 1 1fffff00000000
Top part of page flags should contain nodeid and zonenr, which is not
the case for page ffffea0001ed8000 here (<<<<).
crash> log | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded20
fffea0001edffc0
crash> bt -r | grep -o fffea0001ed[^\ ]* | sort -u
fffea0001ed8000
fffea0001eded00
fffea0001eded20
fffea0001edffc0
Initialization of the whole beginning of the section is skipped up to the
start of the range due to the commit b92df1de5d28. Now any code calling
move_freepages_block() (like reusing the page from a freelist as in this
example) with a page from the beginning of the range will get the page
rounded down to start_page ffffea0001ed8000 and passed to move_freepages()
which crashes on assertion getting wrong zonenr.
> VM_BUG_ON(page_zone(start_page) != page_zone(end_page));
Note, page_zone() derives the zone from page flags here.
>From similar machine before commit b92df1de5d28:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
fffff73941e00000 78000000 0 0 1 1fffff00000000
fffff73941ed7fc0 7b5ff000 0 0 1 1fffff00000000
fffff73941ed8000 7b600000 0 0 1 1fffff00000000
fffff73941edff80 7b7fe000 0 0 1 1fffff00000000
fffff73941edffc0 7b7ff000 ffff8e67e04d3ae0 ad84 1 1fffff00020068 uptodate,lru,active,mappedtodisk
All the pages since the beginning of the section are initialized.
move_freepages()' not gonna blow up.
The same machine with this fix applied:
crash> kmem -p 77fff000 78000000 7b5ff000 7b600000 7b7fe000 7b7ff000
PAGE PHYSICAL MAPPING INDEX CNT FLAGS
ffffea0001e00000 78000000 0 0 0 0
ffffea0001e00000 7b5ff000 0 0 0 0
ffffea0001ed8000 7b600000 0 0 1 1fffff00000000
ffffea0001edff80 7b7fe000 0 0 1 1fffff00000000
ffffea0001edffc0 7b7ff000 ffff88017fb13720 8 2 1fffff00020068 uptodate,lru,active,mappedtodisk
At least the bare minimum of pages is initialized preventing the crash as well.
Link: http://lkml.kernel.org/r/0485727b2e82da7efbce5f6ba42524b429d0391a.152001194…
Fixes: b92df1de5d28 ("mm: page_alloc: skip over regions of invalid pfns where possible")
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff -puN mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment mm/page_alloc.c
--- a/mm/page_alloc.c~mm-page_alloc-fix-memmap_init_zone-pageblock-alignment
+++ a/mm/page_alloc.c
@@ -5359,9 +5359,14 @@ void __meminit memmap_init_zone(unsigned
/*
* Skip to the pfn preceding the next valid one (or
* end_pfn), such that we hit a valid pfn (or end_pfn)
- * on our next iteration of the loop.
+ * on our next iteration of the loop. Note that it needs
+ * to be pageblock aligned even when the region itself
+ * is not. move_freepages_block() can shift ahead of
+ * the valid region but still depends on correct page
+ * metadata.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = (memblock_next_valid_pfn(pfn, end_pfn) &
+ ~(pageblock_nr_pages-1)) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from neelx(a)redhat.com are
mm-memblock-hardcode-the-end_pfn-being-1.patch
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch
Hi All,
Resent without non-upstream patches.
This backport patchset fixed the spectre issue, it's original branch:
https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=kpti
A few dependency or fixingpatches are also picked up, if they are necessary
and no functional changes.
No bug found from kernelci.org and lkft testing. It also could be gotten from:
git://git.linaro.org/kernel/linux-linaro-stable.git v4.9-spectre-upstream-only
Comments are appreciated!
Regards
Alex
[PATCH 01/45] mm: Introduce lm_alias
[PATCH 02/45] arm64: alternatives: apply boot time fixups via the
[PATCH 03/45] arm64: barrier: Add CSDB macros to control data-value
[PATCH 04/45] arm64: Implement array_index_mask_nospec()
[PATCH 05/45] arm64: move TASK_* definitions to <asm/processor.h>
[PATCH 06/45] arm64: Factor out PAN enabling/disabling into separate
[PATCH 07/45] arm64: Factor out TTBR0_EL1 post-update workaround into
[PATCH 08/45] arm64: uaccess: consistently check object sizes
[PATCH 09/45] arm64: Make USER_DS an inclusive limit
[PATCH 10/45] arm64: Use pointer masking to limit uaccess speculation
[PATCH 11/45] arm64: syscallno is secretly an int, make it official
[PATCH 12/45] arm64: entry: Ensure branch through syscall table is
[PATCH 13/45] arm64: uaccess: Prevent speculative use of the current
[PATCH 14/45] arm64: uaccess: Don't bother eliding access_ok checks
[PATCH 15/45] arm64: uaccess: Mask __user pointers for __arch_{clear,
[PATCH 16/45] arm64: futex: Mask __user pointers prior to dereference
[PATCH 17/45] drivers/firmware: Expose psci_get_version through
[PATCH 18/45] arm64: cpufeature: __this_cpu_has_cap() shouldn't stop
[PATCH 19/45] arm64: cpu_errata: Allow an erratum to be match for all
[PATCH 20/45] arm64: Run enable method for errata work arounds on
[PATCH 21/45] arm64: cpufeature: Pass capability structure to
[PATCH 22/45] arm64: Move post_ttbr_update_workaround to C code
[PATCH 23/45] arm64: Add skeleton to harden the branch predictor
[PATCH 24/45] arm64: Move BP hardening to check_and_switch_context
[PATCH 25/45] arm64: KVM: Use per-CPU vector when BP hardening is
[PATCH 26/45] arm64: entry: Apply BP hardening for high-priority
[PATCH 27/45] arm64: entry: Apply BP hardening for suspicious
[PATCH 28/45] arm64: cputype: Add missing MIDR values for Cortex-A72
[PATCH 29/45] arm64: Implement branch predictor hardening for
[PATCH 30/45] arm64: KVM: Increment PC after handling an SMC trap
[PATCH 31/45] arm/arm64: KVM: Consolidate the PSCI include files
[PATCH 32/45] arm/arm64: KVM: Add PSCI_VERSION helper
[PATCH 33/45] arm/arm64: KVM: Add smccc accessors to PSCI code
[PATCH 34/45] arm/arm64: KVM: Implement PSCI 1.0 support
[PATCH 35/45] arm/arm64: KVM: Advertise SMCCC v1.1
[PATCH 36/45] arm64: KVM: Make PSCI_VERSION a fast path
[PATCH 37/45] arm/arm64: KVM: Turn kvm_psci_version into a static
[PATCH 38/45] arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening
[PATCH 39/45] arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
[PATCH 40/45] firmware/psci: Expose PSCI conduit
[PATCH 41/45] firmware/psci: Expose SMCCC version through psci_ops
[PATCH 42/45] arm/arm64: smccc: Make function identifiers an unsigned
[PATCH 43/45] arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
[PATCH 44/45] arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening
[PATCH 45/45] arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
The patch titled
Subject: mm-memblock-hardcode-the-end_pfn-being-1-fix
has been added to the -mm tree. Its filename is
mm-memblock-hardcode-the-end_pfn-being-1-fix.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memblock-hardcode-the-end_pfn-b…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memblock-hardcode-the-end_pfn-b…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Andrew Morton <akpm(a)linux-foundation.org>
Subject: mm-memblock-hardcode-the-end_pfn-being-1-fix
make it work against current -linus, not against -mm
Cc: Daniel Vacek <neelx(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1-fix mm/page_alloc.c
--- a/mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1-fix
+++ a/mm/page_alloc.c
@@ -5361,7 +5361,7 @@ void __meminit memmap_init_zone(unsigned
* end_pfn), such that we hit a valid pfn (or end_pfn)
* on our next iteration of the loop.
*/
- pfn = memblock_next_valid_pfn(pfn) - 1;
+ pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from akpm(a)linux-foundation.org are
i-need-old-gcc.patch
mm-memblock-hardcode-the-end_pfn-being-1-fix.patch
arm-arch-arm-include-asm-pageh-needs-personalityh.patch
mm.patch
mm-page_alloc-skip-over-regions-of-invalid-pfns-on-uma-fix.patch
direct-io-minor-cleanups-in-do_blockdev_direct_io-fix.patch
list_lru-prefetch-neighboring-list-entries-before-acquiring-lock-fix.patch
mm-oom-cgroup-aware-oom-killer-fix.patch
mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2-fix.patch
proc-add-seq_put_decimal_ull_width-to-speed-up-proc-pid-smaps-fix.patch
sysctl-add-flags-to-support-min-max-range-clamping-fix.patch
linux-next-git-rejects.patch
kernel-forkc-export-kernel_thread-to-modules.patch
slab-leaks3-default-y.patch
The patch titled
Subject: mm/memblock.c: hardcode the end_pfn being -1
has been added to the -mm tree. Its filename is
mm-memblock-hardcode-the-end_pfn-being-1.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-memblock-hardcode-the-end_pfn-b…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-memblock-hardcode-the-end_pfn-b…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Daniel Vacek <neelx(a)redhat.com>
Subject: mm/memblock.c: hardcode the end_pfn being -1
This is just a cleanup. It aids handling the special end case in the next
commit.
Link: http://lkml.kernel.org/r/1ca478d4269125a99bcfb1ca04d7b88ac1aee924.152001194…
Signed-off-by: Daniel Vacek <neelx(a)redhat.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Pavel Tatashin <pasha.tatashin(a)oracle.com>
Cc: Paul Burton <paul.burton(a)imgtec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memblock.c | 13 ++++++-------
mm/page_alloc.c | 2 +-
2 files changed, 7 insertions(+), 8 deletions(-)
diff -puN mm/memblock.c~mm-memblock-hardcode-the-end_pfn-being-1 mm/memblock.c
--- a/mm/memblock.c~mm-memblock-hardcode-the-end_pfn-being-1
+++ a/mm/memblock.c
@@ -1101,13 +1101,12 @@ void __init_memblock __next_mem_pfn_rang
*out_nid = r->nid;
}
-unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn,
- unsigned long max_pfn)
+unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
{
struct memblock_type *type = &memblock.memory;
unsigned int right = type->cnt;
unsigned int mid, left = 0;
- phys_addr_t addr = PFN_PHYS(pfn + 1);
+ phys_addr_t addr = PFN_PHYS(++pfn);
do {
mid = (right + left) / 2;
@@ -1118,15 +1117,15 @@ unsigned long __init_memblock memblock_n
type->regions[mid].size))
left = mid + 1;
else {
- /* addr is within the region, so pfn + 1 is valid */
- return min(pfn + 1, max_pfn);
+ /* addr is within the region, so pfn is valid */
+ return pfn;
}
} while (left < right);
if (right == type->cnt)
- return max_pfn;
+ return -1UL;
else
- return min(PHYS_PFN(type->regions[right].base), max_pfn);
+ return PHYS_PFN(type->regions[right].base);
}
/**
diff -puN mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1 mm/page_alloc.c
--- a/mm/page_alloc.c~mm-memblock-hardcode-the-end_pfn-being-1
+++ a/mm/page_alloc.c
@@ -5361,7 +5361,7 @@ void __meminit memmap_init_zone(unsigned
* end_pfn), such that we hit a valid pfn (or end_pfn)
* on our next iteration of the loop.
*/
- pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1;
+ pfn = memblock_next_valid_pfn(pfn) - 1;
#endif
continue;
}
_
Patches currently in -mm which might be from neelx(a)redhat.com are
mm-memblock-hardcode-the-end_pfn-being-1.patch
mm-page_alloc-fix-memmap_init_zone-pageblock-alignment.patch