- Linux-stable-mirror - lists.linaro.org

by Sasha Levin

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Hi Greg, Pleae pull commits for Linux 4.4 . I've sent a review request for all commits over a week ago and all comments were addressed. Thanks, Sasha ===== The following changes since commit 9c6cd3f3a4b8194e82fa927bc00028c7a505e3b3: Linux 4.4.159 (2018-09-29 03:08:55 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-4.4-08102018 for you to fetch changes up to 78c9795a56513a101b1620f0e34d89c114a7a59a: xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage (2018-09-29 13:03:41 -0400) - ---------------------------------------------------------------- for-greg-4.4-08102018 - ---------------------------------------------------------------- Anton Vasilyev (1): usb: gadget: fotg210-udc: Fix memory leak of fotg210->ep[i] Ben Hutchings (1): USB: yurex: Check for truncation in yurex_read() Ben Skeggs (1): drm/nouveau/TBDdevinit: don't fail when PMU/PRE_OS is missing from VBIOS Dan Carpenter (1): cifs: read overflow in is_valid_oplock_break() Jann Horn (1): RDMA/ucma: check fd type in ucma_migrate_id() Joe Thornber (1): dm thin metadata: try to avoid ever aborting transactions Josh Abraham (1): xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage Julian Wiedmann (1): s390/qeth: don't dump past end of unknown HW header Kai-Heng Feng (1): r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED Miguel Ojeda (1): arm64: jump_label.h: use asm_volatile_goto macro instead of "asm goto" Olaf Hering (1): xen: avoid crash in disable_hotplug_cpu Randy Dunlap (2): arch/hexagon: fix kernel/dma.c build warning hexagon: modify ffs() and fls() to return int Sandipan Das (1): perf probe powerpc: Ignore SyS symbols irrespective of endianness Stephen Boyd (1): pinctrl: msm: Really mask level interrupts to prevent latching Stephen Rothwell (1): fs/cifs: suppress a string overflow warning Vitaly Kuznetsov (1): xen/manage: don't complain about an empty value in control/sysrq node arch/arm64/include/asm/jump_label.h | 4 +- arch/hexagon/include/asm/bitops.h | 4 +- arch/hexagon/kernel/dma.c | 2 +- .../gpu/drm/nouveau/nvkm/subdev/devinit/gm204.c | 3 +- drivers/infiniband/core/ucma.c | 6 ++ drivers/md/dm-thin-metadata.c | 36 ++++++++++- drivers/md/dm-thin.c | 73 +++++++++++++++++++--- drivers/net/ethernet/realtek/r8169.c | 9 ++- drivers/pinctrl/qcom/pinctrl-msm.c | 24 +++++++ drivers/s390/net/qeth_l2_main.c | 2 +- drivers/s390/net/qeth_l3_main.c | 2 +- drivers/usb/gadget/udc/fotg210-udc.c | 15 +++-- drivers/usb/misc/yurex.c | 3 + drivers/xen/cpu_hotplug.c | 15 ++--- drivers/xen/events/events_base.c | 2 +- drivers/xen/manage.c | 6 +- fs/cifs/cifssmb.c | 11 +++- fs/cifs/misc.c | 8 +++ tools/perf/arch/powerpc/util/sym-handling.c | 4 +- 19 files changed, 190 insertions(+), 39 deletions(-) -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAlu7cZoACgkQ3qZv95d3 LNwdwQ//dsOKScH3qjTVEsX+fIwRTTnw9qTnQeCobQm2C2ibPnoVKST+oqaXoffQ 7QwdtrIXxNc/i4Ga4JuqV4AZUSuyrgsppueoHwoF+8v5Pa5hiwexlbSBJ0CmtTg9 mma8c6QGjyObaxtytWohX0AZ/1awLrmh2RYYngGTeMyTyfylRDwZoj9DA1lzT1Vv M9Qm+aDV7IApjiQW0Jb7UHZVGV8pyps0NRm0MUiwzNeoTTTOBW+s3uY1yu7pytpt +Gw1s39pmaAwmPfrxsjyzZMbFVogK9KFa/0nSVN56oo+qamq3IzvFYFIj78GJv1P hPhCH1T1OVprZPt3FAdYPnpuWvb/4maogGyrEu1LME09erC9NpLs9ovTiORTmc5h ucUJmgtHF7EXb53R//m8jLcWKWnlIRPezKxIGv/vUx5N86LOWixX/yFtZU3QanMj yKbQBz/eBYhe0Mj8G+roI3kIN0naplH5FuluissaMWRzdBXTDUMe+YKmEEYWVIHs V2KX2aiYbaAojv4/JLaKnPV0HMIG6+DUJq/h8yFDxMn8kUeiKyW7JVpNWrNgz38p oIYK5qf/JO9RqKbQK1qLOi8HkFNTq9VOFoEBz9SnEHwiJq39wLs3W3wZIAGcLnhd sfy1yJM5ihbGX51oSAbc4O/5bxzmnPzycq8AA84JnYfN65zYBjw= =+003 -----END PGP SIGNATURE-----

6 years, 11 months

1
0
0 0

[GIT PULL] commits for Linux 4.9

by Sasha Levin

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Hi Greg, Pleae pull commits for Linux 4.9 . I've sent a review request for all commits over a week ago and all comments were addressed. Thanks, Sasha ===== The following changes since commit cdd48f386d7e6671e7cc21e517ae258b298ec877: Linux 4.9.131 (2018-10-03 17:01:55 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-4.9-08102018 for you to fetch changes up to 3c29b011970e1edb3d203e2e2517daad893c8ed4: xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage (2018-10-03 22:23:44 -0400) - ---------------------------------------------------------------- for-greg-4.9-08102018 - ---------------------------------------------------------------- Anton Vasilyev (1): usb: gadget: fotg210-udc: Fix memory leak of fotg210->ep[i] Ben Hutchings (1): USB: yurex: Check for truncation in yurex_read() Ben Skeggs (1): drm/nouveau/TBDdevinit: don't fail when PMU/PRE_OS is missing from VBIOS Dan Carpenter (1): cifs: read overflow in is_valid_oplock_break() Daniel Jurgens (1): net/mlx5: Consider PCI domain in search for next dev Harry Mallon (1): HID: hid-saitek: Add device ID for RAT 7 Contagion Heinz Mauelshagen (1): dm raid: fix rebuild of specific devices by updating superblock Hisao Tanabe (1): perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx() Jacek Tomaka (1): perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs Jann Horn (1): RDMA/ucma: check fd type in ucma_migrate_id() Joe Thornber (1): dm thin metadata: try to avoid ever aborting transactions Josh Abraham (1): xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage Julian Wiedmann (1): s390/qeth: don't dump past end of unknown HW header Kai-Heng Feng (1): r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED Miguel Ojeda (1): arm64: jump_label.h: use asm_volatile_goto macro instead of "asm goto" Netanel Belgazal (1): net: ena: fix driver when PAGE_SIZE == 64kB Olaf Hering (1): xen: avoid crash in disable_hotplug_cpu Randy Dunlap (2): arch/hexagon: fix kernel/dma.c build warning hexagon: modify ffs() and fls() to return int Sagi Grimberg (1): nvmet-rdma: fix possible bogus dereference under heavy load Sandipan Das (1): perf probe powerpc: Ignore SyS symbols irrespective of endianness Sean O'Brien (1): HID: add support for Apple Magic Keyboards Stephen Boyd (1): pinctrl: msm: Really mask level interrupts to prevent latching Stephen Rothwell (1): fs/cifs: suppress a string overflow warning Vitaly Kuznetsov (1): xen/manage: don't complain about an empty value in control/sysrq node Wenjia Zhang (1): s390/qeth: use vzalloc for QUERY OAT buffer arch/arm64/include/asm/jump_label.h | 4 +- arch/hexagon/include/asm/bitops.h | 4 +- arch/hexagon/kernel/dma.c | 2 +- arch/x86/events/intel/lbr.c | 4 ++ .../gpu/drm/nouveau/nvkm/subdev/devinit/gm200.c | 3 +- drivers/hid/hid-apple.c | 9 ++- drivers/hid/hid-ids.h | 3 + drivers/hid/hid-saitek.c | 2 + drivers/infiniband/core/ucma.c | 6 ++ drivers/md/dm-raid.c | 5 ++ drivers/md/dm-thin-metadata.c | 36 ++++++++++- drivers/md/dm-thin.c | 73 +++++++++++++++++++--- drivers/net/ethernet/amazon/ena/ena_netdev.c | 10 +-- drivers/net/ethernet/amazon/ena/ena_netdev.h | 11 ++++ drivers/net/ethernet/mellanox/mlx5/core/dev.c | 7 ++- drivers/net/ethernet/realtek/r8169.c | 9 ++- drivers/nvme/target/rdma.c | 27 +++++++- drivers/pinctrl/qcom/pinctrl-msm.c | 24 +++++++ drivers/s390/net/qeth_core_main.c | 5 +- drivers/s390/net/qeth_l2_main.c | 2 +- drivers/s390/net/qeth_l3_main.c | 2 +- drivers/usb/gadget/udc/fotg210-udc.c | 15 +++-- drivers/usb/misc/yurex.c | 3 + drivers/xen/cpu_hotplug.c | 15 ++--- drivers/xen/events/events_base.c | 2 +- drivers/xen/manage.c | 6 +- fs/cifs/cifssmb.c | 11 +++- fs/cifs/misc.c | 8 +++ tools/perf/arch/powerpc/util/sym-handling.c | 4 +- tools/perf/util/evsel.c | 5 +- 30 files changed, 263 insertions(+), 54 deletions(-) -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAlu7cZMACgkQ3qZv95d3 LNyMbQ//fC/jcjVOdeQmwRDBUo13zt25Eb/MeXdLdAS2NleK44wY+/e804oq3CBV qACG1CIWtP0tV0V295NrblM6xWxroPnGU8X3GdXgfT97MIByGkGEv3kU+4Nsc4c6 Gv/QlBRHcZM5QFLAk6J6nOwzEk5BKNh/XyJuaniz9DNnQGTfDMLFpN4xpFJHIZlU 5NFlcTeycPnU8ljPFxfupGQ78XVQp1QCJKT3Zrc9iPnMUDlxMdW3ERHIi3uYkyva RXrYIZJ8KYSq2RqLT8nMMiNybKy3F+20/UdfTIl7uk71AkLUB75JjWFMu9MkOWCm k0KRJ89whIN9pH5imEt7pcqFKwkyjyVT0Zk+SJQJ4Xi6+Yc0Q8p2uBteopcerwJ9 tShMyUcOSgmEctg4ujchfUalOXB2CWewV+CKyblhiiURqSTBjPZFxKtwwQF8SF/r 28MlCx8gHL9zjNpGL6z0TEp9CSPlrCjggdKca07aYZj/XJz9rV0zXh7DHNa1gUTt s7mYPZDK9RryNHVaJMss/W5eIfdy2JE1vpsQlyFgVxWxkFKmWT4iZgZwO+NqTtjd 1x1nVtw1KRnGQP748d3iBtUApsH9jS1PDer9ZJXyjVEy2jfiK6exHRcC7HGkCtz4 UTLoqw65N+bTUfGvAnlOJ8EqD9hjqLFZyTKNQ1uiK19/UDzhdp0= =nX+I -----END PGP SIGNATURE-----

6 years, 11 months

1
0
0 0

[GIT PULL] commits for Linux 4.14

by Sasha Levin

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Hi Greg, Pleae pull commits for Linux 4.14 . I've sent a review request for all commits over a week ago and all comments were addressed. Thanks, Sasha ===== The following changes since commit e6abbe80c8838e9c0bdb51835e6218008fa49386: Linux 4.14.74 (2018-10-03 17:01:00 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-4.14-08102018 for you to fetch changes up to 3ab4f07c10e5c704c2c46d772debf0b2fb7c3467: xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage (2018-10-03 22:23:37 -0400) - ---------------------------------------------------------------- for-greg-4.14-08102018 - ---------------------------------------------------------------- Anton Vasilyev (1): usb: gadget: fotg210-udc: Fix memory leak of fotg210->ep[i] Ben Hutchings (1): USB: yurex: Check for truncation in yurex_read() Ben Skeggs (2): drm/nouveau/TBDdevinit: don't fail when PMU/PRE_OS is missing from VBIOS drm/nouveau/disp: fix DP disable race Chris Phlipot (1): perf util: Fix bad memory access in trace info. Christian König (1): drm/amdgpu: fix error handling in amdgpu_cs_user_fence_chunk Dan Carpenter (1): cifs: read overflow in is_valid_oplock_break() Daniel Jurgens (1): net/mlx5: Consider PCI domain in search for next dev Hans de Goede (1): HID: sensor-hub: Restore fixup for Lenovo ThinkPad Helix 2 sensor hub report Harry Mallon (1): HID: hid-saitek: Add device ID for RAT 7 Contagion Heinz Mauelshagen (1): dm raid: fix rebuild of specific devices by updating superblock Hisao Tanabe (1): perf evsel: Fix potential null pointer dereference in perf_evsel__new_idx() Jacek Tomaka (1): perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs Jann Horn (1): RDMA/ucma: check fd type in ucma_migrate_id() Joe Thornber (1): dm thin metadata: try to avoid ever aborting transactions Josh Abraham (1): xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage Julian Wiedmann (1): s390/qeth: don't dump past end of unknown HW header Kai-Heng Feng (1): r8169: Clear RTL_FLAG_TASK_*_PENDING when clearing RTL_FLAG_TASK_ENABLED Martin Willi (1): netfilter: xt_cluster: add dependency on conntrack module Matt Ranostay (1): Revert "iio: temperature: maxim_thermocouple: add MAX31856 part" Miguel Ojeda (1): arm64: jump_label.h: use asm_volatile_goto macro instead of "asm goto" Netanel Belgazal (2): net: ena: fix driver when PAGE_SIZE == 64kB net: ena: fix missing calls to READ_ONCE Nilesh Javali (1): scsi: qedi: Add the CRC size within iSCSI NVM image Olaf Hering (1): xen: avoid crash in disable_hotplug_cpu Pablo Neira Ayuso (1): netfilter: conntrack: timeout interface depend on CONFIG_NF_CONNTRACK_TIMEOUT Randy Dunlap (2): arch/hexagon: fix kernel/dma.c build warning hexagon: modify ffs() and fls() to return int Sagi Grimberg (1): nvmet-rdma: fix possible bogus dereference under heavy load Sandipan Das (1): perf probe powerpc: Ignore SyS symbols irrespective of endianness Sean O'Brien (1): HID: add support for Apple Magic Keyboards Stephen Boyd (1): pinctrl: msm: Really mask level interrupts to prevent latching Stephen Rothwell (1): fs/cifs: suppress a string overflow warning Taehee Yoo (1): netfilter: nf_tables: release chain in flushing set Vincent Pelletier (1): scsi: iscsi: target: Set conn->sess to NULL when iscsi_login_set_conn_values fails Vitaly Kuznetsov (1): xen/manage: don't complain about an empty value in control/sysrq node Wenjia Zhang (1): s390/qeth: use vzalloc for QUERY OAT buffer arch/arm64/include/asm/jump_label.h | 4 +- arch/hexagon/include/asm/bitops.h | 4 +- arch/hexagon/kernel/dma.c | 2 +- arch/x86/events/intel/lbr.c | 4 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 23 ++++--- drivers/gpu/drm/nouveau/nvkm/engine/disp/dp.c | 17 +++-- drivers/gpu/drm/nouveau/nvkm/engine/disp/nv50.c | 6 +- drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.c | 2 + drivers/gpu/drm/nouveau/nvkm/engine/disp/outp.h | 3 +- .../gpu/drm/nouveau/nvkm/subdev/devinit/gm200.c | 3 +- drivers/hid/hid-apple.c | 9 ++- drivers/hid/hid-ids.h | 3 + drivers/hid/hid-saitek.c | 2 + drivers/hid/hid-sensor-hub.c | 23 +++++++ drivers/iio/temperature/maxim_thermocouple.c | 1 - drivers/infiniband/core/ucma.c | 6 ++ drivers/md/dm-raid.c | 5 ++ drivers/md/dm-thin-metadata.c | 36 ++++++++++- drivers/md/dm-thin.c | 73 +++++++++++++++++++--- drivers/net/ethernet/amazon/ena/ena_com.c | 8 +-- drivers/net/ethernet/amazon/ena/ena_netdev.c | 10 +-- drivers/net/ethernet/amazon/ena/ena_netdev.h | 11 ++++ drivers/net/ethernet/mellanox/mlx5/core/dev.c | 7 ++- drivers/net/ethernet/realtek/r8169.c | 9 ++- drivers/nvme/target/rdma.c | 27 +++++++- drivers/pinctrl/qcom/pinctrl-msm.c | 24 +++++++ drivers/s390/net/qeth_core_main.c | 5 +- drivers/s390/net/qeth_l2_main.c | 2 +- drivers/s390/net/qeth_l3_main.c | 2 +- drivers/scsi/qedi/qedi.h | 7 ++- drivers/scsi/qedi/qedi_main.c | 28 +++++---- drivers/target/iscsi/iscsi_target_login.c | 8 +-- drivers/usb/gadget/udc/fotg210-udc.c | 15 +++-- drivers/usb/misc/yurex.c | 3 + drivers/xen/cpu_hotplug.c | 15 ++--- drivers/xen/events/events_base.c | 2 +- drivers/xen/manage.c | 6 +- fs/cifs/cifssmb.c | 11 +++- fs/cifs/misc.c | 8 +++ net/ipv4/netfilter/nf_conntrack_proto_icmp.c | 8 +-- net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c | 8 +-- net/netfilter/nf_conntrack_proto_dccp.c | 12 ++-- net/netfilter/nf_conntrack_proto_generic.c | 8 +-- net/netfilter/nf_conntrack_proto_gre.c | 8 +-- net/netfilter/nf_conntrack_proto_sctp.c | 14 ++--- net/netfilter/nf_conntrack_proto_tcp.c | 12 ++-- net/netfilter/nf_conntrack_proto_udp.c | 20 +++--- net/netfilter/nf_tables_api.c | 1 + net/netfilter/xt_cluster.c | 14 ++++- tools/perf/arch/powerpc/util/sym-handling.c | 4 +- tools/perf/util/evsel.c | 5 +- tools/perf/util/trace-event-info.c | 2 +- 52 files changed, 408 insertions(+), 142 deletions(-) -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEE4n5dijQDou9mhzu83qZv95d3LNwFAlu7cYwACgkQ3qZv95d3 LNwxIg//XuOMBAUlGyM9sgqUgbHSoTCeq+NssR49ijpiK/Qa068KIBCcBFUHd9SX fBdxDbF80d56CVC34EbdltN7OjafPlVPbOnA2MAUel64pvrNNkBEEecXKXk7rOrq xnSX4wbQX/vJcdbPf1RgE4RQuSCLhRVpbCSL74W+MP9FZv9gHlF9Cfle6tVnWR6j nSVYw8Uyh4Hh3vAncf8tdBwrm2+eg7QEAalssD2mGiNdVnxWcuw30y8eUDyPI8p9 iBYhmLMrXKdrcupBHRMYv7bA/BA3fW9fBb2iOQYiuWCRe82dyFrcp6Zqu0MlzGQn TzgxnROGvujDMkRyMTcwPEISrirDWnMwUyyk4cCKl+K/sxsT+PTW/yMkLgluldLv KHmyeYNGyTXxOozTUSUmF3VcXanxM++xUGVsEP/zGn5UjO919iVjF71mqgZPUXpC bUfdcI0bdmpDnbayVQDoztidg+Ru9aRKA1DmMdBX2B/MSE6/xAZ8my3FPUJ+/Nkq YH8SypNLG4v45KGwLaVO4mPZqhMW6bFc8l/edrtNoaPuDtSwd/2YREvIY0W039Ux 6PkWPwrIa+L4eW8MjpCCmRstgma3CrMFfOwHNmvtvsvCgtxMMnSvrdqqm3PtE3Pv FylEUBHdkuU0XUJAV2kd5XwiczsokrN7G1WENuvylWYwJ+39io0= =jkO/ -----END PGP SIGNATURE-----

6 years, 11 months

1
0
0 0

[PATCH] kernel/bounds: Provide prototype for foo

by Kieran Bingham

kernel/bounds.c is recompiled on every build, and shows the following warning when compiling with W=1: CC kernel/bounds.s linux/kernel/bounds.c:16:6: warning: no previous prototype for ‘foo’ [-Wmissing-prototypes] void foo(void) ^~~ Provide a prototype to satisfy the compiler. Signed-off-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com> Cc: stable(a)vger.kernel.org --- I compile all of my incremental builds with W=1, which allows me to know instantly if I add a new compiler warning in code I generate. This warning always comes up and seems trivial to clean up. --- kernel/bounds.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/kernel/bounds.c b/kernel/bounds.c index c373e887c066..60136d937800 100644 --- a/kernel/bounds.c +++ b/kernel/bounds.c @@ -13,6 +13,8 @@ #include <linux/log2.h> #include <linux/spinlock_types.h> +void foo(void); + void foo(void) { /* The enum constants to put into include/generated/bounds.h */ -- 2.17.1

6 years, 11 months

7
16
0 0

[PATCH v2] filesystem-dax: Fix dax_layout_busy_page() livelock

by Dan Williams

In the presence of multi-order entries the typical pagevec_lookup_entries() pattern may loop forever: while (index < end && pagevec_lookup_entries(&pvec, mapping, index, min(end - index, (pgoff_t)PAGEVEC_SIZE), indices)) { ... for (i = 0; i < pagevec_count(&pvec); i++) { index = indices[i]; ... } index++; /* BUG */ } The loop updates 'index' for each index found and then increments to the next possible page to continue the lookup. However, if the last entry in the pagevec is multi-order then the next possible page index is more than 1 page away. Fix this locally for the filesystem-dax case by checking for dax-multi-order entries. Going forward new users of multi-order entries need to be similarly careful, or we need a generic way to report the page increment in the radix iterator. Fixes: 5fac7408d828 ("mm, fs, dax: handle layout changes to pinned dax...") Cc: <stable(a)vger.kernel.org> Cc: Jan Kara <jack(a)suse.cz> Cc: Ross Zwisler <zwisler(a)kernel.org> Cc: Matthew Wilcox <willy(a)infradead.org> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com> --- Changes in v2: * Only update nr_pages if the last entry in the pagevec is multi-order. fs/dax.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 4becbf168b7f..0fb270f0a0ef 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -666,6 +666,8 @@ struct page *dax_layout_busy_page(struct address_space *mapping) while (index < end && pagevec_lookup_entries(&pvec, mapping, index, min(end - index, (pgoff_t)PAGEVEC_SIZE), indices)) { + pgoff_t nr_pages = 1; + for (i = 0; i < pagevec_count(&pvec); i++) { struct page *pvec_ent = pvec.pages[i]; void *entry; @@ -680,8 +682,15 @@ struct page *dax_layout_busy_page(struct address_space *mapping) xa_lock_irq(&mapping->i_pages); entry = get_unlocked_mapping_entry(mapping, index, NULL); - if (entry) + if (entry) { page = dax_busy_page(entry); + /* + * Account for multi-order entries at + * the end of the pagevec. + */ + if (i + 1 >= pagevec_count(&pvec)) + nr_pages = 1UL << dax_radix_order(entry); + } put_unlocked_mapping_entry(mapping, index, entry); xa_unlock_irq(&mapping->i_pages); if (page) @@ -696,7 +705,7 @@ struct page *dax_layout_busy_page(struct address_space *mapping) */ pagevec_remove_exceptionals(&pvec); pagevec_release(&pvec); - index++; + index += nr_pages; if (page) break;

6 years, 11 months

2
1
0 0

FAILED: patch "[PATCH] powerpc/lib: fix book3s/32 boot failure due to code patching" failed to apply to 4.14-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From b45ba4a51cde29b2939365ef0c07ad34c8321789 Mon Sep 17 00:00:00 2001 From: Christophe Leroy <christophe.leroy(a)c-s.fr> Date: Mon, 1 Oct 2018 12:21:10 +0000 Subject: [PATCH] powerpc/lib: fix book3s/32 boot failure due to code patching Commit 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections") accesses 'init_mem_is_free' flag too early, before the kernel is relocated. This provokes early boot failure (before the console is active). As it is not necessary to do this verification that early, this patch moves the test into patch_instruction() instead of __patch_instruction(). This modification also has the advantage of avoiding unnecessary remappings. Fixes: 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections") Cc: stable(a)vger.kernel.org # 4.13+ Signed-off-by: Christophe Leroy <christophe.leroy(a)c-s.fr> Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c index 6ae2777c220d..5ffee298745f 100644 --- a/arch/powerpc/lib/code-patching.c +++ b/arch/powerpc/lib/code-patching.c @@ -28,12 +28,6 @@ static int __patch_instruction(unsigned int *exec_addr, unsigned int instr, { int err; - /* Make sure we aren't patching a freed init section */ - if (init_mem_is_free && init_section_contains(exec_addr, 4)) { - pr_debug("Skipping init section patching addr: 0x%px\n", exec_addr); - return 0; - } - __put_user_size(instr, patch_addr, 4, err); if (err) return err; @@ -148,7 +142,7 @@ static inline int unmap_patch_area(unsigned long addr) return 0; } -int patch_instruction(unsigned int *addr, unsigned int instr) +static int do_patch_instruction(unsigned int *addr, unsigned int instr) { int err; unsigned int *patch_addr = NULL; @@ -188,12 +182,22 @@ out: } #else /* !CONFIG_STRICT_KERNEL_RWX */ -int patch_instruction(unsigned int *addr, unsigned int instr) +static int do_patch_instruction(unsigned int *addr, unsigned int instr) { return raw_patch_instruction(addr, instr); } #endif /* CONFIG_STRICT_KERNEL_RWX */ + +int patch_instruction(unsigned int *addr, unsigned int instr) +{ + /* Make sure we aren't patching a freed init section */ + if (init_mem_is_free && init_section_contains(addr, 4)) { + pr_debug("Skipping init section patching addr: 0x%px\n", addr); + return 0; + } + return do_patch_instruction(addr, instr); +} NOKPROBE_SYMBOL(patch_instruction); int patch_branch(unsigned int *addr, unsigned long target, int flags)

6 years, 11 months

1
0
0 0

[PATCH stable 4.14, 4.18] bpf: 32-bit RSH verification must truncate input before the ALU op

by Daniel Borkmann

From: Jann Horn <jannh(a)google.com> [ upstream commit b799207e1e1816b09e7a5920fbb2d5fcf6edd681 ] When I wrote commit 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification"), I assumed that, in order to emulate 64-bit arithmetic with 32-bit logic, it is sufficient to just truncate the output to 32 bits; and so I just moved the register size coercion that used to be at the start of the function to the end of the function. That assumption is true for almost every op, but not for 32-bit right shifts, because those can propagate information towards the least significant bit. Fix it by always truncating inputs for 32-bit ops to 32 bits. Also get rid of the coerce_reg_to_size() after the ALU op, since that has no effect. Fixes: 468f6eafa6c4 ("bpf: fix 32-bit ALU op verification") Acked-by: Daniel Borkmann <daniel(a)iogearbox.net> Signed-off-by: Jann Horn <jannh(a)google.com> Signed-off-by: Daniel Borkmann <daniel(a)iogearbox.net> --- kernel/bpf/verifier.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index adbe21c..82e8ede 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2865,6 +2865,15 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, u64 umin_val, umax_val; u64 insn_bitness = (BPF_CLASS(insn->code) == BPF_ALU64) ? 64 : 32; + if (insn_bitness == 32) { + /* Relevant for 32-bit RSH: Information can propagate towards + * LSB, so it isn't sufficient to only truncate the output to + * 32 bits. + */ + coerce_reg_to_size(dst_reg, 4); + coerce_reg_to_size(&src_reg, 4); + } + smin_val = src_reg.smin_value; smax_val = src_reg.smax_value; umin_val = src_reg.umin_value; @@ -3100,7 +3109,6 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env, if (BPF_CLASS(insn->code) != BPF_ALU64) { /* 32-bit ALU ops are (32,32)->32 */ coerce_reg_to_size(dst_reg, 4); - coerce_reg_to_size(&src_reg, 4); } __reg_deduce_bounds(dst_reg); -- 2.9.5

6 years, 11 months

2
1
0 0

Re: [PATCH 03/16] vfs: don't evict uninitialized inode

by Amir Goldstein

> Miklos, > > Seeing that it wasn't fixed in 4.18.. > > > I've nothing against applying "new primitive: discard_new_inode() now > > + this patch, but if it is deemed too risky at this point, we could > > just revert the buggy commit 80ea09a002bf ("vfs: factor out > > inode_insert5()") and its dependencies. > > > > Should we propose for stable the upstream commits: > e950564b97fd vfs: don't evict uninitialized inode > c2b6d621c4ff new primitive: discard_new_inode() > > Or should we go with the independent v1 patch: > https://patchwork.kernel.org/patch/10511969/ > Greg, To fix a 4.18 overlayfs regression please apply the following 3 upstream commits (in apply order): c2b6d621c4ff new primitive: discard_new_inode() e950564b97fd vfs: don't evict uninitialized inode 6faf05c2b2b4 ovl: set I_CREATING on inode being created Thanks, Amir.

6 years, 11 months

3
2
0 0

[PATCH 08/15] bcache: fix miss key refill->end in writeback

by Coly Li

From: Tang Junhui <tang.junhui.linux(a)gmail.com> refill->end record the last key of writeback, for example, at the first time, keys (1,128K) to (1,1024K) are flush to the backend device, but the end key (1,1024K) is not included, since the bellow code: if (bkey_cmp(k, refill->end) >= 0) { ret = MAP_DONE; goto out; } And in the next time when we refill writeback keybuf again, we searched key start from (1,1024K), and got a key bigger than it, so the key (1,1024K) missed. This patch modify the above code, and let the end key to be included to the writeback key buffer. Signed-off-by: Tang Junhui <tang.junhui.linux(a)gmail.com> Cc: stable(a)vger.kernel.org Signed-off-by: Coly Li <colyli(a)suse.de> --- drivers/md/bcache/btree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index e7d4817681f2..3f4211b5cd33 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -2434,7 +2434,7 @@ static int refill_keybuf_fn(struct btree_op *op, struct btree *b, struct keybuf *buf = refill->buf; int ret = MAP_CONTINUE; - if (bkey_cmp(k, refill->end) >= 0) { + if (bkey_cmp(k, refill->end) > 0) { ret = MAP_DONE; goto out; } -- 2.19.0

6 years, 11 months

1
0
0 0

[PATCH 06/15] bcache: correct dirty data statistics

by Coly Li

From: Tang Junhui <tang.junhui.linux(a)gmail.com> When bcache device is clean, dirty keys may still exist after journal replay, so we need to count these dirty keys even device in clean status, otherwise after writeback, the amount of dirty data would be incorrect. Signed-off-by: Tang Junhui <tang.junhui.linux(a)gmail.com> Cc: stable(a)vger.kernel.org Signed-off-by: Coly Li <colyli(a)suse.de> --- drivers/md/bcache/super.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index a99af19d2f91..4989c7d4d4d0 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -1152,11 +1152,12 @@ int bch_cached_dev_attach(struct cached_dev *dc, struct cache_set *c, } if (BDEV_STATE(&dc->sb) == BDEV_STATE_DIRTY) { - bch_sectors_dirty_init(&dc->disk); atomic_set(&dc->has_dirty, 1); bch_writeback_queue(dc); } + bch_sectors_dirty_init(&dc->disk); + bch_cached_dev_run(dc); bcache_device_link(&dc->disk, c, "bdev"); atomic_inc(&c->attached_dev_nr); -- 2.19.0

6 years, 11 months

1
0
0 0

[PATCH 04/15] bcache: fix ioctl in flash device

by Coly Li

From: Tang Junhui <tang.junhui.linux(a)gmail.com> When doing ioctl in flash device, it will call ioctl_dev() in super.c, then we should not to get cached device since flash only device has no backend device. This patch just move the jugement dc->io_disable to cached_dev_ioctl() to make ioctl in flash device correctly. Fixes: 0f0709e6bfc3c ("bcache: stop bcache device when backing device is offline") Signed-off-by: Tang Junhui <tang.junhui.linux(a)gmail.com> Cc: stable(a)vger.kernel.org Signed-off-by: Coly Li <colyli(a)suse.de> --- drivers/md/bcache/request.c | 3 +++ drivers/md/bcache/super.c | 4 ---- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c index ee15fb039fd0..3bf35914bb57 100644 --- a/drivers/md/bcache/request.c +++ b/drivers/md/bcache/request.c @@ -1218,6 +1218,9 @@ static int cached_dev_ioctl(struct bcache_device *d, fmode_t mode, { struct cached_dev *dc = container_of(d, struct cached_dev, disk); + if (dc->io_disable) + return -EIO; + return __blkdev_driver_ioctl(dc->bdev, mode, cmd, arg); } diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c index 448e531e8c2d..a99af19d2f91 100644 --- a/drivers/md/bcache/super.c +++ b/drivers/md/bcache/super.c @@ -647,10 +647,6 @@ static int ioctl_dev(struct block_device *b, fmode_t mode, unsigned int cmd, unsigned long arg) { struct bcache_device *d = b->bd_disk->private_data; - struct cached_dev *dc = container_of(d, struct cached_dev, disk); - - if (dc->io_disable) - return -EIO; return d->ioctl(d, mode, cmd, arg); } -- 2.19.0

6 years, 11 months

1
0
0 0

[PATCH 02/15] bcache: trace missed reading by cache_missed

by Coly Li

From: Tang Junhui <tang.junhui.linux(a)gmail.com> Missed reading IOs are identified by s->cache_missed, not the s->cache_miss, so in trace_bcache_read() using trace_bcache_read to identify whether the IO is missed or not. Signed-off-by: Tang Junhui <tang.junhui.linux(a)gmail.com> Cc: stable(a)vger.kernel.org Signed-off-by: Coly Li <colyli(a)suse.de> --- drivers/md/bcache/request.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c index 51be355a3309..4946d486f734 100644 --- a/drivers/md/bcache/request.c +++ b/drivers/md/bcache/request.c @@ -850,7 +850,7 @@ static void cached_dev_read_done_bh(struct closure *cl) bch_mark_cache_accounting(s->iop.c, s->d, !s->cache_missed, s->iop.bypass); - trace_bcache_read(s->orig_bio, !s->cache_miss, s->iop.bypass); + trace_bcache_read(s->orig_bio, !s->cache_missed, s->iop.bypass); if (s->iop.status) continue_at_nobarrier(cl, cached_dev_read_error, bcache_wq); -- 2.19.0

6 years, 11 months

1
0
0 0

[PATCH AUTOSEL 4.18 01/48] ASoC: dapm: Fix NULL pointer deference on CODEC to CODEC DAIs

by Sasha Levin

From: Charles Keepax <ckeepax(a)opensource.cirrus.com> [ Upstream commit 249dc49576fc953a7378b916c6a6d47ea81e4da2 ] Commit a655de808cbde ("ASoC: core: Allow topology to override machine driver FE DAI link config.") caused soc_dai_hw_params to be come dependent on the substream private_data being set with a pointer to the snd_soc_pcm_runtime. Currently, CODEC to CODEC links don't set this, which causes a NULL pointer dereference: [<4069de54>] (soc_dai_hw_params) from [<40694b68>] (snd_soc_dai_link_event+0x1a0/0x380) Since the ASoC core in general assumes that the substream private_data will be set to a pointer to the snd_soc_pcm_runtime, update the CODEC to CODEC links to respect this. Signed-off-by: Charles Keepax <ckeepax(a)opensource.cirrus.com> Signed-off-by: Mark Brown <broonie(a)kernel.org> Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com> --- include/sound/soc-dapm.h | 1 + sound/soc/soc-core.c | 4 ++-- sound/soc/soc-dapm.c | 4 ++++ 3 files changed, 7 insertions(+), 2 deletions(-) diff --git a/include/sound/soc-dapm.h b/include/sound/soc-dapm.h index a6ce2de4e20a..be3bee1cf91f 100644 --- a/include/sound/soc-dapm.h +++ b/include/sound/soc-dapm.h @@ -410,6 +410,7 @@ int snd_soc_dapm_new_dai_widgets(struct snd_soc_dapm_context *dapm, int snd_soc_dapm_link_dai_widgets(struct snd_soc_card *card); void snd_soc_dapm_connect_dai_link_widgets(struct snd_soc_card *card); int snd_soc_dapm_new_pcm(struct snd_soc_card *card, + struct snd_soc_pcm_runtime *rtd, const struct snd_soc_pcm_stream *params, unsigned int num_params, struct snd_soc_dapm_widget *source, diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c index 4663de3cf495..0b4896d411f9 100644 --- a/sound/soc/soc-core.c +++ b/sound/soc/soc-core.c @@ -1430,7 +1430,7 @@ static int soc_link_dai_widgets(struct snd_soc_card *card, sink = codec_dai->playback_widget; source = cpu_dai->capture_widget; if (sink && source) { - ret = snd_soc_dapm_new_pcm(card, dai_link->params, + ret = snd_soc_dapm_new_pcm(card, rtd, dai_link->params, dai_link->num_params, source, sink); if (ret != 0) { @@ -1443,7 +1443,7 @@ static int soc_link_dai_widgets(struct snd_soc_card *card, sink = cpu_dai->playback_widget; source = codec_dai->capture_widget; if (sink && source) { - ret = snd_soc_dapm_new_pcm(card, dai_link->params, + ret = snd_soc_dapm_new_pcm(card, rtd, dai_link->params, dai_link->num_params, source, sink); if (ret != 0) { diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c index a099c3e45504..577f6178af57 100644 --- a/sound/soc/soc-dapm.c +++ b/sound/soc/soc-dapm.c @@ -3658,6 +3658,7 @@ static int snd_soc_dai_link_event(struct snd_soc_dapm_widget *w, { struct snd_soc_dapm_path *source_p, *sink_p; struct snd_soc_dai *source, *sink; + struct snd_soc_pcm_runtime *rtd = w->priv; const struct snd_soc_pcm_stream *config = w->params + w->params_select; struct snd_pcm_substream substream; struct snd_pcm_hw_params *params = NULL; @@ -3717,6 +3718,7 @@ static int snd_soc_dai_link_event(struct snd_soc_dapm_widget *w, goto out; } substream.runtime = runtime; + substream.private_data = rtd; switch (event) { case SND_SOC_DAPM_PRE_PMU: @@ -3901,6 +3903,7 @@ snd_soc_dapm_alloc_kcontrol(struct snd_soc_card *card, } int snd_soc_dapm_new_pcm(struct snd_soc_card *card, + struct snd_soc_pcm_runtime *rtd, const struct snd_soc_pcm_stream *params, unsigned int num_params, struct snd_soc_dapm_widget *source, @@ -3969,6 +3972,7 @@ int snd_soc_dapm_new_pcm(struct snd_soc_card *card, w->params = params; w->num_params = num_params; + w->priv = rtd; ret = snd_soc_dapm_add_path(&card->dapm, source, w, NULL, NULL); if (ret) -- 2.17.1

6 years, 11 months

3
51
0 0

[PATCH] percpu: stop leaking bitmap metadata blocks

by Mike Rapoport

The commit ca460b3c9627 ("percpu: introduce bitmap metadata blocks") introduced bitmap metadata blocks. These metadata blocks are allocated whenever a new chunk is created, but they are never freed. Fix it. Fixes: ca460b3c9627 ("percpu: introduce bitmap metadata blocks") Signed-off-by: Mike Rapoport <rppt(a)linux.vnet.ibm.com> Cc: stable(a)vger.kernel.org --- mm/percpu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/percpu.c b/mm/percpu.c index d21cb13..25104cd 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -1212,6 +1212,7 @@ static void pcpu_free_chunk(struct pcpu_chunk *chunk) { if (!chunk) return; + pcpu_mem_free(chunk->md_blocks); pcpu_mem_free(chunk->bound_map); pcpu_mem_free(chunk->alloc_map); pcpu_mem_free(chunk); -- 2.7.4

6 years, 11 months

2
1
0 0

[PATCH v2 03/29] clk: sunxi-ng: Use u64 for calculation of NM rate

by Jernej Skrabec

Allwinner H6 SoC has multiplier N range between 1 and 254. Since parent rate is 24MHz, intermediate result when calculating final rate easily overflows 32 bit variable. Because of that, introduce function for calculating clock rate which uses 64 bit variable for intermediate result. Fixes: 6174a1e24b0d ("clk: sunxi-ng: Add N-M-factor clock support") Fixes: ee28648cb2b4 ("clk: sunxi-ng: Remove the use of rational computations") CC: <stable(a)vger.kernel.org> Signed-off-by: Jernej Skrabec <jernej.skrabec(a)siol.net> --- drivers/clk/sunxi-ng/ccu_nm.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/clk/sunxi-ng/ccu_nm.c b/drivers/clk/sunxi-ng/ccu_nm.c index 6fe3c14f7b2d..424d8635b053 100644 --- a/drivers/clk/sunxi-ng/ccu_nm.c +++ b/drivers/clk/sunxi-ng/ccu_nm.c @@ -19,6 +19,17 @@ struct _ccu_nm { unsigned long m, min_m, max_m; }; +static unsigned long ccu_nm_calc_rate(unsigned long parent, + unsigned long n, unsigned long m) +{ + u64 rate = parent; + + rate *= n; + do_div(rate, m); + + return rate; +} + static void ccu_nm_find_best(unsigned long parent, unsigned long rate, struct _ccu_nm *nm) { @@ -28,7 +39,8 @@ static void ccu_nm_find_best(unsigned long parent, unsigned long rate, for (_n = nm->min_n; _n <= nm->max_n; _n++) { for (_m = nm->min_m; _m <= nm->max_m; _m++) { - unsigned long tmp_rate = parent * _n / _m; + unsigned long tmp_rate = ccu_nm_calc_rate(parent, + _n, _m); if (tmp_rate > rate) continue; @@ -100,7 +112,7 @@ static unsigned long ccu_nm_recalc_rate(struct clk_hw *hw, if (ccu_sdm_helper_is_enabled(&nm->common, &nm->sdm)) rate = ccu_sdm_helper_read_rate(&nm->common, &nm->sdm, m, n); else - rate = parent_rate * n / m; + rate = ccu_nm_calc_rate(parent_rate, n, m); if (nm->common.features & CCU_FEATURE_FIXED_POSTDIV) rate /= nm->fixed_post_div; @@ -149,7 +161,7 @@ static long ccu_nm_round_rate(struct clk_hw *hw, unsigned long rate, _nm.max_m = nm->m.max ?: 1 << nm->m.width; ccu_nm_find_best(*parent_rate, rate, &_nm); - rate = *parent_rate * _nm.n / _nm.m; + rate = ccu_nm_calc_rate(*parent_rate, _nm.n, _nm.m); if (nm->common.features & CCU_FEATURE_FIXED_POSTDIV) rate /= nm->fixed_post_div; -- 2.19.0

6 years, 11 months

1
0
0 0

[PATCH] filesystem-dax: Fix dax_layout_busy_page() livelock

by Dan Williams

In the presence of multi-order entries the typical pagevec_lookup_entries() pattern may loop forever: while (index < end && pagevec_lookup_entries(&pvec, mapping, index, min(end - index, (pgoff_t)PAGEVEC_SIZE), indices)) { ... for (i = 0; i < pagevec_count(&pvec); i++) { index = indices[i]; ... } index++; /* BUG */ } The loop updates 'index' for each index found and then increments to the next possible page to continue the lookup. However, if the last entry in the pagevec is multi-order then the next possible page index is more than 1 page away. Fix this locally for the filesystem-dax case by checking for dax-multi-order entries. Going forward new users of multi-order entries need to be similarly careful, or we need a generic way to report the page increment in the radix iterator. Fixes: 5fac7408d828 ("mm, fs, dax: handle layout changes to pinned dax...") Cc: <stable(a)vger.kernel.org> Cc: Jan Kara <jack(a)suse.cz> Cc: Ross Zwisler <zwisler(a)kernel.org> Cc: Matthew Wilcox <willy(a)infradead.org> Signed-off-by: Dan Williams <dan.j.williams(a)intel.com> --- fs/dax.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index 4becbf168b7f..c1472eede1f7 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -666,6 +666,8 @@ struct page *dax_layout_busy_page(struct address_space *mapping) while (index < end && pagevec_lookup_entries(&pvec, mapping, index, min(end - index, (pgoff_t)PAGEVEC_SIZE), indices)) { + pgoff_t nr_pages = 1; + for (i = 0; i < pagevec_count(&pvec); i++) { struct page *pvec_ent = pvec.pages[i]; void *entry; @@ -680,8 +682,11 @@ struct page *dax_layout_busy_page(struct address_space *mapping) xa_lock_irq(&mapping->i_pages); entry = get_unlocked_mapping_entry(mapping, index, NULL); - if (entry) + if (entry) { page = dax_busy_page(entry); + /* account for multi-order entries */ + nr_pages = 1UL << dax_radix_order(entry); + } put_unlocked_mapping_entry(mapping, index, entry); xa_unlock_irq(&mapping->i_pages); if (page) @@ -696,7 +701,7 @@ struct page *dax_layout_busy_page(struct address_space *mapping) */ pagevec_remove_exceptionals(&pvec); pagevec_release(&pvec); - index++; + index += nr_pages; if (page) break;

6 years, 11 months

1
1
0 0

please apply d41aa5252394 ("mm: madvise(MADV_DODUMP): allow hugetlbfs pages")

by Daniel Black

This is a bugfix on 0103bd16fb90 which exists in 3.16+ stable trees Mike Kravetz confirmed this as a regression: https://marc.info/?l=linux-mm&m=153843190414206&w=2 Thanks -- Daniel Black

6 years, 11 months

2
1
0 0

[PATCH stable 0/2] termios: Alpha BOTHER/IBSHIFT, tty_baudrate fix

by H. Peter Anvin

From: "H. Peter Anvin (Intel)" <hpa(a)zytor.com> It turns out that Alpha is the only architecture that never implemented BOTHER and IBSHIFT, which is otherwise ages old. This is one thing that has held up glibc support for this feature (all other architectures have supported these for about a decade, at least before the current 3.2 glibc cutoff.) Furthermore, in the process of dealing with this, I discovered that the current code in tty_baudrate.c can read past the end of the baud_table[] on Alpha and PowerPC. The second patch in this series fixes that, but it also cleans up the code substantially by auto-generating the table and, since all architectures now have them, removing all conditionals for BOTHER and IBSHIFT existing. Tagging for stable because these are concrete problems. I have a much bigger update in process, nearly done, which will clean up a lot of duplicated code and make the uapi headers usable for libc, but that is not critical on the same level. Tested on x86, compile-tested on Alpha. Signed-off-by: H. Peter Anvin (Intel) <hpa(a)zytor.com> Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Cc: Jiri Slaby <jslaby(a)suse.com> Cc: Al Viro <viro(a)zeniv.linux.org.uk> Cc: Richard Henderson <rth(a)twiddle.net> Cc: Ivan Kokshaysky <ink(a)jurassic.park.msu.ru> Cc: Matt Turner <mattst88(a)gmail.com> Cc: Thomas Gleixner <tglx(a)linutronix.de> Cc: Kate Stewart <kstewart(a)linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne(a)nexb.com> Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Cc: Eugene Syromiatnikov <esyr(a)redhat.com> Cc: <linux-alpha(a)vger.kernel.org> Cc: <linux-serial(a)vger.kernel.org> Cc: Johan Hovold <johan(a)kernel.org> Cc: Alan Cox <alan(a)lxorguk.ukuu.org.uk> Cc: Benjamin Herrenschmidt <benh(a)kernel.crashing.org> Cc: Paul Mackerras <paulus(a)samba.org> Cc: Michael Ellerman <mpe(a)ellerman.id.au> Cc: <stable(a)vger.kernel.org> --- arch/alpha/include/asm/termios.h | 8 +- arch/alpha/include/uapi/asm/ioctls.h | 5 + arch/alpha/include/uapi/asm/termbits.h | 17 +++ drivers/tty/.gitignore | 1 + drivers/tty/Makefile | 16 +++ drivers/tty/bmacros.c | 2 + drivers/tty/tty_baudrate.c | 190 +++++++++++++-------------------- 7 files changed, 122 insertions(+), 117 deletions(-)

6 years, 11 months

1
2
0 0

[PATCH] remoteproc: qcom: q6v5: Propagate EPROBE_DEFER

by Bjorn Andersson

In the case that the interrupts fail to result because of the interrupt-controller not yet being registered the platform_get_irq_byname() call will fail with -EPROBE_DEFER, but passing this into devm_request_threaded_irq() will result in -EINVAL being returned, the driver is therefor not reprobed later. Fixes: 3b415c8fb263 ("remoteproc: q6v5: Extract common resource handling") Cc: stable(a)vger.kernel.org Signed-off-by: Bjorn Andersson <bjorn.andersson(a)linaro.org> --- drivers/remoteproc/qcom_q6v5.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/remoteproc/qcom_q6v5.c b/drivers/remoteproc/qcom_q6v5.c index 61a760ee4aac..e9ab90c19304 100644 --- a/drivers/remoteproc/qcom_q6v5.c +++ b/drivers/remoteproc/qcom_q6v5.c @@ -198,6 +198,9 @@ int qcom_q6v5_init(struct qcom_q6v5 *q6v5, struct platform_device *pdev, } q6v5->fatal_irq = platform_get_irq_byname(pdev, "fatal"); + if (q6v5->fatal_irq == -EPROBE_DEFER) + return -EPROBE_DEFER; + ret = devm_request_threaded_irq(&pdev->dev, q6v5->fatal_irq, NULL, q6v5_fatal_interrupt, IRQF_TRIGGER_RISING | IRQF_ONESHOT, @@ -208,6 +211,9 @@ int qcom_q6v5_init(struct qcom_q6v5 *q6v5, struct platform_device *pdev, } q6v5->ready_irq = platform_get_irq_byname(pdev, "ready"); + if (q6v5->ready_irq == -EPROBE_DEFER) + return -EPROBE_DEFER; + ret = devm_request_threaded_irq(&pdev->dev, q6v5->ready_irq, NULL, q6v5_ready_interrupt, IRQF_TRIGGER_RISING | IRQF_ONESHOT, @@ -218,6 +224,9 @@ int qcom_q6v5_init(struct qcom_q6v5 *q6v5, struct platform_device *pdev, } q6v5->handover_irq = platform_get_irq_byname(pdev, "handover"); + if (q6v5->handover_irq == -EPROBE_DEFER) + return -EPROBE_DEFER; + ret = devm_request_threaded_irq(&pdev->dev, q6v5->handover_irq, NULL, q6v5_handover_interrupt, IRQF_TRIGGER_RISING | IRQF_ONESHOT, @@ -229,6 +238,9 @@ int qcom_q6v5_init(struct qcom_q6v5 *q6v5, struct platform_device *pdev, disable_irq(q6v5->handover_irq); q6v5->stop_irq = platform_get_irq_byname(pdev, "stop-ack"); + if (q6v5->stop_irq == -EPROBE_DEFER) + return -EPROBE_DEFER; + ret = devm_request_threaded_irq(&pdev->dev, q6v5->stop_irq, NULL, q6v5_stop_interrupt, IRQF_TRIGGER_RISING | IRQF_ONESHOT, -- 2.18.0

6 years, 11 months

2
2
0 0

[PATCH] jbd2: Fix use after free in jbd2_log_do_checkpoint()

by Jan Kara

The code cleaning transaction's lists of checkpoint buffers has a bug where it increases bh refcount only after releasing journal->j_list_lock. Thus the following race is possible: CPU0 CPU1 jbd2_log_do_checkpoint() jbd2_journal_try_to_free_buffers() __journal_try_to_free_buffer(bh) ... while (transaction->t_checkpoint_io_list) ... if (buffer_locked(bh)) { <-- IO completes now, buffer gets unlocked --> spin_unlock(&journal->j_list_lock); spin_lock(&journal->j_list_lock); __jbd2_journal_remove_checkpoint(jh); spin_unlock(&journal->j_list_lock); try_to_free_buffers(page); get_bh(bh) <-- accesses freed bh Fix the problem by grabbing bh reference before unlocking journal->j_list_lock. Fixes: dc6e8d669cf5cb3ff84707c372c0a2a8a5e80845 Fixes: be1158cc615fd723552f0d9912087423c7cadda5 Reported-by: syzbot+7f4a27091759e2fe7453(a)syzkaller.appspotmail.com CC: stable(a)vger.kernel.org Signed-off-by: Jan Kara <jack(a)suse.cz> --- fs/jbd2/checkpoint.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index c125d662777c..26f8d7e46462 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -251,8 +251,8 @@ int jbd2_log_do_checkpoint(journal_t *journal) bh = jh2bh(jh); if (buffer_locked(bh)) { - spin_unlock(&journal->j_list_lock); get_bh(bh); + spin_unlock(&journal->j_list_lock); wait_on_buffer(bh); /* the journal_head may have gone by now */ BUFFER_TRACE(bh, "brelse"); @@ -333,8 +333,8 @@ int jbd2_log_do_checkpoint(journal_t *journal) jh = transaction->t_checkpoint_io_list; bh = jh2bh(jh); if (buffer_locked(bh)) { - spin_unlock(&journal->j_list_lock); get_bh(bh); + spin_unlock(&journal->j_list_lock); wait_on_buffer(bh); /* the journal_head may have gone by now */ BUFFER_TRACE(bh, "brelse"); -- 2.16.4

6 years, 11 months

4
8
0 0

[patch 13/14] ocfs2: fix locking for res->tracking and dlm->tracking_list

by akpm＠linux-foundation.org

From: Ashish Samant <ashish.samant(a)oracle.com> Subject: ocfs2: fix locking for res->tracking and dlm->tracking_list In dlm_init_lockres() we access and modify res->tracking and dlm->tracking_list without holding dlm->track_lock. This can cause list corruptions and can end up in kernel panic. Fix this by locking res->tracking and dlm->tracking_list with dlm->track_lock instead of dlm->spinlock. Link: http://lkml.kernel.org/r/1529951192-4686-1-git-send-email-ashish.samant@ora… Signed-off-by: Ashish Samant <ashish.samant(a)oracle.com> Reviewed-by: Changwei Ge <ge.changwei(a)h3c.com> Acked-by: Joseph Qi <jiangqi903(a)gmail.com> Acked-by: Jun Piao <piaojun(a)huawei.com> Cc: Mark Fasheh <mark(a)fasheh.com> Cc: Joel Becker <jlbec(a)evilplan.org> Cc: Junxiao Bi <junxiao.bi(a)oracle.com> Cc: Changwei Ge <ge.changwei(a)h3c.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/ocfs2/dlm/dlmmaster.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/fs/ocfs2/dlm/dlmmaster.c~ocfs2-fix-locking-for-res-tracking-and-dlm-tracking_list +++ a/fs/ocfs2/dlm/dlmmaster.c @@ -584,9 +584,9 @@ static void dlm_init_lockres(struct dlm_ res->last_used = 0; - spin_lock(&dlm->spinlock); + spin_lock(&dlm->track_lock); list_add_tail(&res->tracking, &dlm->tracking_list); - spin_unlock(&dlm->spinlock); + spin_unlock(&dlm->track_lock); memset(res->lvb, 0, DLM_LVB_LEN); memset(res->refmap, 0, sizeof(res->refmap)); _

6 years, 11 months

1
0
0 0

[patch 11/14] mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly

by akpm＠linux-foundation.org

From: Jann Horn <jannh(a)google.com> Subject: mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly 5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP") made the availability of the NR_TLB_REMOTE_FLUSH* counters inside the kernel unconditional to reduce #ifdef soup, but (either to avoid showing dummy zero counters to userspace, or because that code was missed) didn't update the vmstat_array, meaning that all following counters would be shown with incorrect values. This only affects kernel builds with CONFIG_VM_EVENT_COUNTERS=y && CONFIG_DEBUG_TLBFLUSH=y && CONFIG_SMP=n. Link: http://lkml.kernel.org/r/20181001143138.95119-2-jannh@google.com Fixes: 5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP") Signed-off-by: Jann Horn <jannh(a)google.com> Reviewed-by: Kees Cook <keescook(a)chromium.org> Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org> Acked-by: Michal Hocko <mhocko(a)suse.com> Acked-by: Roman Gushchin <guro(a)fb.com> Cc: Davidlohr Bueso <dave(a)stgolabs.net> Cc: Oleg Nesterov <oleg(a)redhat.com> Cc: Christoph Lameter <clameter(a)sgi.com> Cc: Kemi Wang <kemi.wang(a)intel.com> Cc: Andy Lutomirski <luto(a)kernel.org> Cc: Ingo Molnar <mingo(a)kernel.org> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/vmstat.c | 3 +++ 1 file changed, 3 insertions(+) --- a/mm/vmstat.c~mm-vmstat-skip-nr_tlb_remote_flush-properly +++ a/mm/vmstat.c @@ -1275,6 +1275,9 @@ const char * const vmstat_text[] = { #ifdef CONFIG_SMP "nr_tlb_remote_flush", "nr_tlb_remote_flush_received", +#else + "", /* nr_tlb_remote_flush */ + "", /* nr_tlb_remote_flush_received */ #endif /* CONFIG_SMP */ "nr_tlb_local_flush_all", "nr_tlb_local_flush_one", _

6 years, 11 months

1
0
0 0

[patch 09/14] proc: restrict kernel stack dumps to root

by akpm＠linux-foundation.org

From: Jann Horn <jannh(a)google.com> Subject: proc: restrict kernel stack dumps to root Currently, you can use /proc/self/task/*/stack to cause a stack walk on a task you control while it is running on another CPU. That means that the stack can change under the stack walker. The stack walker does have guards against going completely off the rails and into random kernel memory, but it can interpret random data from your kernel stack as instruction pointers and stack pointers. This can cause exposure of kernel stack contents to userspace. Restrict the ability to inspect kernel stacks of arbitrary tasks to root in order to prevent a local attacker from exploiting racy stack unwinding to leak kernel task stack contents. See the added comment for a longer rationale. There don't seem to be any users of this userspace API that can't gracefully bail out if reading from the file fails. Therefore, I believe that this change is unlikely to break things. In the case that this patch does end up needing a revert, the next-best solution might be to fake a single-entry stack based on wchan. Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com Fixes: 2ec220e27f50 ("proc: add /proc/*/stack") Signed-off-by: Jann Horn <jannh(a)google.com> Acked-by: Kees Cook <keescook(a)chromium.org> Cc: Alexey Dobriyan <adobriyan(a)gmail.com> Cc: Ken Chen <kenchen(a)google.com> Cc: Will Deacon <will.deacon(a)arm.com> Cc: Laura Abbott <labbott(a)redhat.com> Cc: Andy Lutomirski <luto(a)amacapital.net> Cc: Catalin Marinas <catalin.marinas(a)arm.com> Cc: Josh Poimboeuf <jpoimboe(a)redhat.com> Cc: Thomas Gleixner <tglx(a)linutronix.de> Cc: Ingo Molnar <mingo(a)redhat.com> Cc: "H . Peter Anvin" <hpa(a)zytor.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/proc/base.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) --- a/fs/proc/base.c~proc-restrict-kernel-stack-dumps-to-root +++ a/fs/proc/base.c @@ -407,6 +407,20 @@ static int proc_pid_stack(struct seq_fil unsigned long *entries; int err; + /* + * The ability to racily run the kernel stack unwinder on a running task + * and then observe the unwinder output is scary; while it is useful for + * debugging kernel issues, it can also allow an attacker to leak kernel + * stack contents. + * Doing this in a manner that is at least safe from races would require + * some work to ensure that the remote task can not be scheduled; and + * even then, this would still expose the unwinder as local attack + * surface. + * Therefore, this interface is restricted to root. + */ + if (!file_ns_capable(m->file, &init_user_ns, CAP_SYS_ADMIN)) + return -EACCES; + entries = kmalloc_array(MAX_STACK_TRACE_DEPTH, sizeof(*entries), GFP_KERNEL); if (!entries) _

6 years, 11 months

1
0
0 0

[patch 04/14] mm, thp: fix mlocking THP page with migration enabled

by akpm＠linux-foundation.org

From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com> Subject: mm, thp: fix mlocking THP page with migration enabled A transparent huge page is represented by a single entry on an LRU list. Therefore, we can only make unevictable an entire compound page, not individual subpages. If a user tries to mlock() part of a huge page, we want the rest of the page to be reclaimable. We handle this by keeping PTE-mapped huge pages on normal LRU lists: the PMD on border of VM_LOCKED VMA will be split into PTE table. Introduction of THP migration breaks[1] the rules around mlocking THP pages. If we had a single PMD mapping of the page in mlocked VMA, the page will get mlocked, regardless of PTE mappings of the page. For tmpfs/shmem it's easy to fix by checking PageDoubleMap() in remove_migration_pmd(). Anon THP pages can only be shared between processes via fork(). Mlocked page can only be shared if parent mlocked it before forking, otherwise CoW will be triggered on mlock(). For Anon-THP, we can fix the issue by munlocking the page on removing PTE migration entry for the page. PTEs for the page will always come after mlocked PMD: rmap walks VMAs from oldest to newest. Test-case: #include <unistd.h> #include <sys/mman.h> #include <sys/wait.h> #include <linux/mempolicy.h> #include <numaif.h> int main(void) { unsigned long nodemask = 4; void *addr; addr = mmap((void *)0x20000000UL, 2UL << 20, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0); if (fork()) { wait(NULL); return 0; } mlock(addr, 4UL << 10); mbind(addr, 2UL << 20, MPOL_PREFERRED | MPOL_F_RELATIVE_NODES, &nodemask, 4, MPOL_MF_MOVE); return 0; } [1] https://lkml.kernel.org/r/CAOMGZ=G52R-30rZvhGxEbkTw7rLLwBGadVYeo--iizcD3upL… Link: http://lkml.kernel.org/r/20180917133816.43995-1-kirill.shutemov@linux.intel… Fixes: 616b8371539a ("mm: thp: enable thp migration in generic path") Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com> Reported-by: Vegard Nossum <vegard.nossum(a)oracle.com> Reviewed-by: Zi Yan <zi.yan(a)cs.rutgers.edu> Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: Andrea Arcangeli <aarcange(a)redhat.com> Cc: <stable(a)vger.kernel.org> [4.14+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/huge_memory.c | 2 +- mm/migrate.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) --- a/mm/huge_memory.c~mm-thp-fix-mlocking-thp-page-with-migration-enabled +++ a/mm/huge_memory.c @@ -2931,7 +2931,7 @@ void remove_migration_pmd(struct page_vm else page_add_file_rmap(new, true); set_pmd_at(mm, mmun_start, pvmw->pmd, pmde); - if (vma->vm_flags & VM_LOCKED) + if ((vma->vm_flags & VM_LOCKED) && !PageDoubleMap(new)) mlock_vma_page(new); update_mmu_cache_pmd(vma, address, pvmw->pmd); } --- a/mm/migrate.c~mm-thp-fix-mlocking-thp-page-with-migration-enabled +++ a/mm/migrate.c @@ -275,6 +275,9 @@ static bool remove_migration_pte(struct if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new)) mlock_vma_page(new); + if (PageTransHuge(page) && PageMlocked(page)) + clear_page_mlock(page); + /* No need to invalidate - it was non-present before */ update_mmu_cache(vma, pvmw.address, pvmw.pte); } _

6 years, 11 months

1
0
0 0

[patch 01/14] mm: migration: fix migration of huge PMD shared pages

by akpm＠linux-foundation.org

From: Mike Kravetz <mike.kravetz(a)oracle.com> Subject: mm: migration: fix migration of huge PMD shared pages The page migration code employs try_to_unmap() to try and unmap the source page. This is accomplished by using rmap_walk to find all vmas where the page is mapped. This search stops when page mapcount is zero. For shared PMD huge pages, the page map count is always 1 no matter the number of mappings. Shared mappings are tracked via the reference count of the PMD page. Therefore, try_to_unmap stops prematurely and does not completely unmap all mappings of the source page. This problem can result is data corruption as writes to the original source page can happen after contents of the page are copied to the target page. Hence, data is lost. This problem was originally seen as DB corruption of shared global areas after a huge page was soft offlined due to ECC memory errors. DB developers noticed they could reproduce the issue by (hotplug) offlining memory used to back huge pages. A simple testcase can reproduce the problem by creating a shared PMD mapping (note that this must be at least PUD_SIZE in size and PUD_SIZE aligned (1GB on x86)), and using migrate_pages() to migrate process pages between nodes while continually writing to the huge pages being migrated. To fix, have the try_to_unmap_one routine check for huge PMD sharing by calling huge_pmd_unshare for hugetlbfs huge pages. If it is a shared mapping it will be 'unshared' which removes the page table entry and drops the reference on the PMD page. After this, flush caches and TLB. mmu notifiers are called before locking page tables, but we can not be sure of PMD sharing until page tables are locked. Therefore, check for the possibility of PMD sharing before locking so that notifiers can prepare for the worst possible case. Link: http://lkml.kernel.org/r/20180823205917.16297-2-mike.kravetz@oracle.com [mike.kravetz(a)oracle.com: make _range_in_vma() a static inline] Link: http://lkml.kernel.org/r/6063f215-a5c8-2f0c-465a-2c515ddc952d@oracle.com Fixes: 39dde65c9940 ("shared page table for hugetlb page") Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com> Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com> Reviewed-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com> Acked-by: Michal Hocko <mhocko(a)suse.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: Davidlohr Bueso <dave(a)stgolabs.net> Cc: Jerome Glisse <jglisse(a)redhat.com> Cc: Mike Kravetz <mike.kravetz(a)oracle.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/hugetlb.h | 14 ++++++++++++ include/linux/mm.h | 6 +++++ mm/hugetlb.c | 37 +++++++++++++++++++++++++++++++-- mm/rmap.c | 42 +++++++++++++++++++++++++++++++++++--- 4 files changed, 94 insertions(+), 5 deletions(-) --- a/include/linux/hugetlb.h~mm-migration-fix-migration-of-huge-pmd-shared-pages +++ a/include/linux/hugetlb.h @@ -140,6 +140,8 @@ pte_t *huge_pte_alloc(struct mm_struct * pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz); int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep); +void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, + unsigned long *start, unsigned long *end); struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address, int write); struct page *follow_huge_pd(struct vm_area_struct *vma, @@ -170,6 +172,18 @@ static inline unsigned long hugetlb_tota return 0; } +static inline int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, + pte_t *ptep) +{ + return 0; +} + +static inline void adjust_range_if_pmd_sharing_possible( + struct vm_area_struct *vma, + unsigned long *start, unsigned long *end) +{ +} + #define follow_hugetlb_page(m,v,p,vs,a,b,i,w,n) ({ BUG(); 0; }) #define follow_huge_addr(mm, addr, write) ERR_PTR(-EINVAL) #define copy_hugetlb_page_range(src, dst, vma) ({ BUG(); 0; }) --- a/include/linux/mm.h~mm-migration-fix-migration-of-huge-pmd-shared-pages +++ a/include/linux/mm.h @@ -2455,6 +2455,12 @@ static inline struct vm_area_struct *fin return vma; } +static inline bool range_in_vma(struct vm_area_struct *vma, + unsigned long start, unsigned long end) +{ + return (vma && vma->vm_start <= start && end <= vma->vm_end); +} + #ifdef CONFIG_MMU pgprot_t vm_get_page_prot(unsigned long vm_flags); void vma_set_page_prot(struct vm_area_struct *vma); --- a/mm/hugetlb.c~mm-migration-fix-migration-of-huge-pmd-shared-pages +++ a/mm/hugetlb.c @@ -4545,13 +4545,41 @@ static bool vma_shareable(struct vm_area /* * check on proper vm_flags and page table alignment */ - if (vma->vm_flags & VM_MAYSHARE && - vma->vm_start <= base && end <= vma->vm_end) + if (vma->vm_flags & VM_MAYSHARE && range_in_vma(vma, base, end)) return true; return false; } /* + * Determine if start,end range within vma could be mapped by shared pmd. + * If yes, adjust start and end to cover range associated with possible + * shared pmd mappings. + */ +void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, + unsigned long *start, unsigned long *end) +{ + unsigned long check_addr = *start; + + if (!(vma->vm_flags & VM_MAYSHARE)) + return; + + for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) { + unsigned long a_start = check_addr & PUD_MASK; + unsigned long a_end = a_start + PUD_SIZE; + + /* + * If sharing is possible, adjust start/end if necessary. + */ + if (range_in_vma(vma, a_start, a_end)) { + if (a_start < *start) + *start = a_start; + if (a_end > *end) + *end = a_end; + } + } +} + +/* * Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc() * and returns the corresponding pte. While this is not necessary for the * !shared pmd case because we can allocate the pmd later as well, it makes the @@ -4648,6 +4676,11 @@ int huge_pmd_unshare(struct mm_struct *m { return 0; } + +void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, + unsigned long *start, unsigned long *end) +{ +} #define want_pmd_share() (0) #endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */ --- a/mm/rmap.c~mm-migration-fix-migration-of-huge-pmd-shared-pages +++ a/mm/rmap.c @@ -1362,11 +1362,21 @@ static bool try_to_unmap_one(struct page } /* - * We have to assume the worse case ie pmd for invalidation. Note that - * the page can not be free in this function as call of try_to_unmap() - * must hold a reference on the page. + * For THP, we have to assume the worse case ie pmd for invalidation. + * For hugetlb, it could be much worse if we need to do pud + * invalidation in the case of pmd sharing. + * + * Note that the page can not be free in this function as call of + * try_to_unmap() must hold a reference on the page. */ end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page))); + if (PageHuge(page)) { + /* + * If sharing is possible, start and end will be adjusted + * accordingly. + */ + adjust_range_if_pmd_sharing_possible(vma, &start, &end); + } mmu_notifier_invalidate_range_start(vma->vm_mm, start, end); while (page_vma_mapped_walk(&pvmw)) { @@ -1409,6 +1419,32 @@ static bool try_to_unmap_one(struct page subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte); address = pvmw.address; + if (PageHuge(page)) { + if (huge_pmd_unshare(mm, &address, pvmw.pte)) { + /* + * huge_pmd_unshare unmapped an entire PMD + * page. There is no way of knowing exactly + * which PMDs may be cached for this mm, so + * we must flush them all. start/end were + * already adjusted above to cover this range. + */ + flush_cache_range(vma, start, end); + flush_tlb_range(vma, start, end); + mmu_notifier_invalidate_range(mm, start, end); + + /* + * The ref count of the PMD page was dropped + * which is part of the way map counting + * is done for shared PMDs. Return 'true' + * here. When there is no other sharing, + * huge_pmd_unshare returns false and we will + * unmap the actual page and drop map count + * to zero. + */ + page_vma_mapped_walk_done(&pvmw); + break; + } + } if (IS_ENABLED(CONFIG_MIGRATION) && (flags & TTU_MIGRATION) && _

6 years, 11 months

1
0
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror