This is the start of the stable review cycle for the 4.14.215 release.
There are 57 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 13 Jan 2021 13:00:19 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.215-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.14.215-rc1
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: fix shift out of bounds reported by UBSAN
Ying-Tsun Huang <ying-tsun.huang(a)amd.com>
x86/mtrr: Correct the range check before performing MTRR type lookups
Florian Westphal <fw(a)strlen.de>
netfilter: xt_RATEEST: reject non-null terminated string from userspace
Vasily Averin <vvs(a)virtuozzo.com>
netfilter: ipset: fix shift-out-of-bounds in htable_bits()
Bard Liao <yung-chuan.liao(a)linux.intel.com>
Revert "device property: Keep secondary firmware node secondary by type"
Kailang Yang <kailang(a)realtek.com>
ALSA: hda/realtek - Fix speaker volume control on Lenovo C940
bo liu <bo.liu(a)senarytech.com>
ALSA: hda/conexant: add a new hda codec CX11970
Dan Williams <dan.j.williams(a)intel.com>
x86/mm: Fix leak of pmd ptlock
Johan Hovold <johan(a)kernel.org>
USB: serial: keyspan_pda: remove unused variable
Eddie Hung <eddie.hung(a)mediatek.com>
usb: gadget: configfs: Fix use-after-free issue with udc_name
Chandana Kishori Chiluveru <cchiluve(a)codeaurora.org>
usb: gadget: configfs: Preserve function ordering after bind failure
Sriharsha Allenki <sallenki(a)codeaurora.org>
usb: gadget: Fix spinlock lockup on usb_function_deactivate
Yang Yingliang <yangyingliang(a)huawei.com>
USB: gadget: legacy: fix return error code in acm_ms_bind()
Zqiang <qiang.zhang(a)windriver.com>
usb: gadget: function: printer: Fix a memory leak for interface descriptor
Jerome Brunet <jbrunet(a)baylibre.com>
usb: gadget: f_uac2: reset wMaxPacketSize
Arnd Bergmann <arnd(a)arndb.de>
usb: gadget: select CONFIG_CRC32
Takashi Iwai <tiwai(a)suse.de>
ALSA: usb-audio: Fix UBSAN warnings for MIDI jacks
Johan Hovold <johan(a)kernel.org>
USB: usblp: fix DMA to stack
Johan Hovold <johan(a)kernel.org>
USB: yurex: fix control-URB timeout handling
Bjørn Mork <bjorn(a)mork.no>
USB: serial: option: add Quectel EM160R-GL
Daniel Palmer <daniel(a)0x0f.com>
USB: serial: option: add LongSung M5710 module support
Johan Hovold <johan(a)kernel.org>
USB: serial: iuu_phoenix: fix DMA from stack
Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
usb: uas: Add PNY USB Portable SSD to unusual_uas
Randy Dunlap <rdunlap(a)infradead.org>
usb: usbip: vhci_hcd: protect shift size
Michael Grzeschik <m.grzeschik(a)pengutronix.de>
USB: xhci: fix U1/U2 handling for hardware with XHCI_INTEL_HOST quirk set
Yu Kuai <yukuai3(a)huawei.com>
usb: chipidea: ci_hdrc_imx: add missing put_device() call in usbmisc_get_init_data()
Serge Semin <Sergey.Semin(a)baikalelectronics.ru>
usb: dwc3: ulpi: Use VStsDone to detect PHY regs access completion
Sean Young <sean(a)mess.org>
USB: cdc-acm: blacklist another IR Droid device
taehyun.cho <taehyun.cho(a)samsung.com>
usb: gadget: enable super speed plus
Ard Biesheuvel <ardb(a)kernel.org>
crypto: ecdh - avoid buffer overflow in ecdh_set_secret()
Dexuan Cui <decui(a)microsoft.com>
video: hyperv_fb: Fix the mmap() regression for v5.4.y and older
Florian Fainelli <f.fainelli(a)gmail.com>
net: systemport: set dev->max_mtu to UMAC_MAX_MTU_SIZE
Stefan Chulski <stefanc(a)marvell.com>
net: mvpp2: Fix GoP port 3 Networking Complex Control configurations
Antoine Tenart <atenart(a)kernel.org>
net-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tc
Randy Dunlap <rdunlap(a)infradead.org>
net: sched: prevent invalid Scell_log shift count
Yunjian Wang <wangyunjian(a)huawei.com>
vhost_net: fix ubuf refcount incorrectly when sendmsg fails
Bjørn Mork <bjorn(a)mork.no>
net: usb: qmi_wwan: add Quectel EM160R-GL
Roland Dreier <roland(a)kernel.org>
CDC-NCM: remove "connected" log message
Xie He <xie.he.0141(a)gmail.com>
net: hdlc_ppp: Fix issues when mod_timer is called while timer is running
Yunjian Wang <wangyunjian(a)huawei.com>
net: hns: fix return value check in __lb_other_process()
Guillaume Nault <gnault(a)redhat.com>
ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()
Grygorii Strashko <grygorii.strashko(a)ti.com>
net: ethernet: ti: cpts: fix ethtool output when no ptp_clock registered
Antoine Tenart <atenart(a)kernel.org>
net-sysfs: take the rtnl lock when storing xps_cpus
Dinghao Liu <dinghao.liu(a)zju.edu.cn>
net: ethernet: Fix memleak in ethoc_probe
John Wang <wangzhiqiang.bj(a)bytedance.com>
net/ncsi: Use real net-device for response handler
Petr Machata <me(a)pmachata.org>
net: dcb: Validate netlink message in DCB handler
Jeff Dike <jdike(a)akamai.com>
virtio_net: Fix recursive call to cpus_read_lock()
Manish Chopra <manishc(a)marvell.com>
qede: fix offload for IPIP tunnel packets
Dan Carpenter <dan.carpenter(a)oracle.com>
atm: idt77252: call pci_disable_device() on error path
Rasmus Villemoes <rasmus.villemoes(a)prevas.dk>
ethernet: ucc_geth: set dev->max_mtu to 1518
Rasmus Villemoes <rasmus.villemoes(a)prevas.dk>
ethernet: ucc_geth: fix use-after-free in ucc_geth_remove()
Linus Torvalds <torvalds(a)linux-foundation.org>
depmod: handle the case of /sbin/depmod without /sbin in PATH
Huang Shijie <sjhuang(a)iluvatar.ai>
lib/genalloc: fix the overflow when size is too big
Bart Van Assche <bvanassche(a)acm.org>
scsi: ide: Do not set the RQF_PREEMPT flag for sense requests
Adrian Hunter <adrian.hunter(a)intel.com>
scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff()
Yunfeng Ye <yeyunfeng(a)huawei.com>
workqueue: Kick a worker based on the actual activation of delayed works
Dominique Martinet <asmadeus(a)codewreck.org>
kbuild: don't hardcode depmod path
-------------
Diffstat:
Makefile | 6 +--
arch/x86/kernel/cpu/mtrr/generic.c | 6 +--
arch/x86/kvm/mmu.h | 2 +-
arch/x86/mm/pgtable.c | 2 +
crypto/ecdh.c | 3 +-
drivers/atm/idt77252.c | 2 +-
drivers/base/core.c | 2 +-
drivers/ide/ide-atapi.c | 1 -
drivers/ide/ide-io.c | 5 --
drivers/net/ethernet/broadcom/bcmsysport.c | 1 +
drivers/net/ethernet/ethoc.c | 3 +-
drivers/net/ethernet/freescale/ucc_geth.c | 3 +-
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c | 4 ++
drivers/net/ethernet/marvell/mvpp2.c | 2 +-
drivers/net/ethernet/qlogic/qede/qede_fp.c | 5 ++
drivers/net/ethernet/ti/cpts.c | 2 +
drivers/net/usb/cdc_ncm.c | 3 --
drivers/net/usb/qmi_wwan.c | 1 +
drivers/net/virtio_net.c | 12 +++--
drivers/net/wan/hdlc_ppp.c | 7 +++
drivers/scsi/ufs/ufshcd-pci.c | 34 +++++++++++-
drivers/usb/chipidea/ci_hdrc_imx.c | 6 ++-
drivers/usb/class/cdc-acm.c | 4 ++
drivers/usb/class/usblp.c | 21 +++++++-
drivers/usb/dwc3/core.h | 1 +
drivers/usb/dwc3/ulpi.c | 2 +-
drivers/usb/gadget/Kconfig | 2 +
drivers/usb/gadget/composite.c | 10 +++-
drivers/usb/gadget/configfs.c | 19 ++++---
drivers/usb/gadget/function/f_printer.c | 1 +
drivers/usb/gadget/function/f_uac2.c | 69 +++++++++++++++++++-----
drivers/usb/gadget/legacy/acm_ms.c | 4 +-
drivers/usb/host/xhci.c | 24 ++++-----
drivers/usb/misc/yurex.c | 3 ++
drivers/usb/serial/iuu_phoenix.c | 20 +++++--
drivers/usb/serial/keyspan_pda.c | 2 -
drivers/usb/serial/option.c | 3 ++
drivers/usb/storage/unusual_uas.h | 7 +++
drivers/usb/usbip/vhci_hcd.c | 2 +
drivers/vhost/net.c | 6 +--
drivers/video/fbdev/hyperv_fb.c | 6 +--
include/net/red.h | 4 +-
kernel/workqueue.c | 13 +++--
lib/genalloc.c | 25 ++++-----
net/core/net-sysfs.c | 29 ++++++++--
net/dcb/dcbnl.c | 2 +
net/ipv4/fib_frontend.c | 2 +-
net/ncsi/ncsi-rsp.c | 2 +-
net/netfilter/ipset/ip_set_hash_gen.h | 20 ++-----
net/netfilter/xt_RATEEST.c | 3 ++
net/sched/sch_choke.c | 2 +-
net/sched/sch_gred.c | 2 +-
net/sched/sch_red.c | 2 +-
net/sched/sch_sfq.c | 2 +-
scripts/depmod.sh | 2 +
sound/pci/hda/patch_conexant.c | 1 +
sound/pci/hda/patch_realtek.c | 6 +++
sound/usb/midi.c | 4 ++
58 files changed, 315 insertions(+), 124 deletions(-)
Upon a communication error, the interrupt handler can call
tegra_i2c_disable_packet_mode. This causes a sleeping poll to happen
unless the current transaction was marked atomic. Fix this by
making the poll happen atomically if we are in an IRQ.
This matches the behavior prior to the patch mentioned
in the Fixes tag.
Fixes: ede2299f7101 ("i2c: tegra: Support atomic transfers")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mikko Perttunen <mperttunen(a)nvidia.com>
---
v2:
* Use in_irq() instead of passing a flag from the ISR.
Thanks to Dmitry for the suggestion.
* Update commit message.
---
drivers/i2c/busses/i2c-tegra.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 6f08c0c3238d..0727383f4940 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -533,7 +533,7 @@ static int tegra_i2c_poll_register(struct tegra_i2c_dev *i2c_dev,
void __iomem *addr = i2c_dev->base + tegra_i2c_reg_addr(i2c_dev, reg);
u32 val;
- if (!i2c_dev->atomic_mode)
+ if (!i2c_dev->atomic_mode && !in_irq())
return readl_relaxed_poll_timeout(addr, val, !(val & mask),
delay_us, timeout_us);
--
2.30.0
The patch titled
Subject: mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
has been added to the -mm tree. Its filename is
mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-remove-vm_bug_on_page-…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-remove-vm_bug_on_page-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
The page_huge_active() can be called from scan_movable_pages() which
do not hold a reference count to the HugeTLB page. So when we call
page_huge_active() from scan_movable_pages(), the HugeTLB page can
be freed parallel. Then we will trigger a BUG_ON which is in the
page_huge_active() when CONFIG_DEBUG_VM is enabled. Just remove the
VM_BUG_ON_PAGE.
Link: https://lkml.kernel.org/r/20210110124017.86750-7-songmuchun@bytedance.com
Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active
+++ a/mm/hugetlb.c
@@ -1361,8 +1361,7 @@ struct hstate *size_to_hstate(unsigned l
*/
bool page_huge_active(struct page *page)
{
- VM_BUG_ON_PAGE(!PageHuge(page), page);
- return PageHead(page) && PagePrivate(&page[1]);
+ return PageHeadHuge(page) && PagePrivate(&page[1]);
}
/* never called for tail page */
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page.patch
mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page.patch
mm-hugetlb-fix-a-race-between-isolating-and-freeing-page.patch
mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active.patch
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-migrate-do-not-migrate-hugetlb-page-whose-refcount-is-one.patch
mm-hugetlb-add-return-eagain-for-dissolve_free_huge_page.patch
The patch titled
Subject: mm: hugetlb: fix a race between isolating and freeing page
has been added to the -mm tree. Its filename is
mm-hugetlb-fix-a-race-between-isolating-and-freeing-page.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-fix-a-race-between-iso…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-fix-a-race-between-iso…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: fix a race between isolating and freeing page
There is a race between isolate_huge_page() and __free_huge_page().
CPU0: CPU1:
if (PageHuge(page))
put_page(page)
__free_huge_page(page)
spin_lock(&hugetlb_lock)
update_and_free_page(page)
set_compound_page_dtor(page,
NULL_COMPOUND_DTOR)
spin_unlock(&hugetlb_lock)
isolate_huge_page(page)
// trigger BUG_ON
VM_BUG_ON_PAGE(!PageHead(page), page)
spin_lock(&hugetlb_lock)
page_huge_active(page)
// trigger BUG_ON
VM_BUG_ON_PAGE(!PageHuge(page), page)
spin_unlock(&hugetlb_lock)
When we isolate a HugeTLB page on CPU0. Meanwhile, we free it to the
buddy allocator on CPU1. Then, we can trigger a BUG_ON on CPU0. Because
it is already freed to the buddy allocator.
Link: https://lkml.kernel.org/r/20210110124017.86750-6-songmuchun@bytedance.com
Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-a-race-between-isolating-and-freeing-page
+++ a/mm/hugetlb.c
@@ -5581,9 +5581,9 @@ bool isolate_huge_page(struct page *page
{
bool ret = true;
- VM_BUG_ON_PAGE(!PageHead(page), page);
spin_lock(&hugetlb_lock);
- if (!page_huge_active(page) || !get_page_unless_zero(page)) {
+ if (!PageHeadHuge(page) || !page_huge_active(page) ||
+ !get_page_unless_zero(page)) {
ret = false;
goto unlock;
}
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page.patch
mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page.patch
mm-hugetlb-fix-a-race-between-isolating-and-freeing-page.patch
mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active.patch
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-migrate-do-not-migrate-hugetlb-page-whose-refcount-is-one.patch
mm-hugetlb-add-return-eagain-for-dissolve_free_huge_page.patch
The patch titled
Subject: mm: hugetlb: fix a race between freeing and dissolving the page
has been added to the -mm tree. Its filename is
mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-fix-a-race-between-fre…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-fix-a-race-between-fre…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: fix a race between freeing and dissolving the page
There is a race condition between __free_huge_page()
and dissolve_free_huge_page().
CPU0: CPU1:
// page_count(page) == 1
put_page(page)
__free_huge_page(page)
dissolve_free_huge_page(page)
spin_lock(&hugetlb_lock)
// PageHuge(page) && !page_count(page)
update_and_free_page(page)
// page is freed to the buddy
spin_unlock(&hugetlb_lock)
spin_lock(&hugetlb_lock)
clear_page_huge_active(page)
enqueue_huge_page(page)
// It is wrong, the page is already freed
spin_unlock(&hugetlb_lock)
The race windows is between put_page() and dissolve_free_huge_page().
We should make sure that the page is already on the free list
when it is dissolved.
Link: https://lkml.kernel.org/r/20210110124017.86750-4-songmuchun@bytedance.com
Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
--- a/mm/hugetlb.c~mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page
+++ a/mm/hugetlb.c
@@ -79,6 +79,21 @@ DEFINE_SPINLOCK(hugetlb_lock);
static int num_fault_mutexes;
struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
+static inline bool PageHugeFreed(struct page *head)
+{
+ return page_private(head + 4) == -1UL;
+}
+
+static inline void SetPageHugeFreed(struct page *head)
+{
+ set_page_private(head + 4, -1UL);
+}
+
+static inline void ClearPageHugeFreed(struct page *head)
+{
+ set_page_private(head + 4, 0);
+}
+
/* Forward declaration */
static int hugetlb_acct_memory(struct hstate *h, long delta);
@@ -1028,6 +1043,7 @@ static void enqueue_huge_page(struct hst
list_move(&page->lru, &h->hugepage_freelists[nid]);
h->free_huge_pages++;
h->free_huge_pages_node[nid]++;
+ SetPageHugeFreed(page);
}
static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -1044,6 +1060,7 @@ static struct page *dequeue_huge_page_no
list_move(&page->lru, &h->hugepage_activelist);
set_page_refcounted(page);
+ ClearPageHugeFreed(page);
h->free_huge_pages--;
h->free_huge_pages_node[nid]--;
return page;
@@ -1505,6 +1522,7 @@ static void prep_new_huge_page(struct hs
spin_lock(&hugetlb_lock);
h->nr_huge_pages++;
h->nr_huge_pages_node[nid]++;
+ ClearPageHugeFreed(page);
spin_unlock(&hugetlb_lock);
}
@@ -1771,6 +1789,14 @@ int dissolve_free_huge_page(struct page
int nid = page_to_nid(head);
if (h->free_huge_pages - h->resv_huge_pages == 0)
goto out;
+
+ /*
+ * We should make sure that the page is already on the free list
+ * when it is dissolved.
+ */
+ if (unlikely(!PageHugeFreed(head)))
+ goto out;
+
/*
* Move PageHWPoison flag from head page to the raw error page,
* which makes any subpages rather than the error page reusable.
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page.patch
mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page.patch
mm-hugetlb-fix-a-race-between-isolating-and-freeing-page.patch
mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active.patch
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-migrate-do-not-migrate-hugetlb-page-whose-refcount-is-one.patch
mm-hugetlb-add-return-eagain-for-dissolve_free_huge_page.patch
The patch titled
Subject: mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
has been added to the -mm tree. Its filename is
mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlbfs-fix-cannot-migrate-t…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlbfs-fix-cannot-migrate-t…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
If a new hugetlb page is allocated during fallocate it will not be marked
as active (set_page_huge_active) which will result in a later
isolate_huge_page failure when the page migration code would like to move
that page. Such a failure would be unexpected and wrong.
Only export set_page_huge_active, just leave clear_page_huge_active
as static. Because there are no external users.
Link: https://lkml.kernel.org/r/20210110124017.86750-3-songmuchun@bytedance.com
Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 3 ++-
include/linux/hugetlb.h | 2 ++
mm/hugetlb.c | 2 +-
3 files changed, 5 insertions(+), 2 deletions(-)
--- a/fs/hugetlbfs/inode.c~mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page
+++ a/fs/hugetlbfs/inode.c
@@ -735,9 +735,10 @@ static long hugetlbfs_fallocate(struct f
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ set_page_huge_active(page);
/*
* unlock_page because locked by add_to_page_cache()
- * page_put due to reference from alloc_huge_page()
+ * put_page() due to reference from alloc_huge_page()
*/
unlock_page(page);
put_page(page);
--- a/include/linux/hugetlb.h~mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page
+++ a/include/linux/hugetlb.h
@@ -770,6 +770,8 @@ static inline void huge_ptep_modify_prot
}
#endif
+void set_page_huge_active(struct page *page);
+
#else /* CONFIG_HUGETLB_PAGE */
struct hstate {};
--- a/mm/hugetlb.c~mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page
+++ a/mm/hugetlb.c
@@ -1349,7 +1349,7 @@ bool page_huge_active(struct page *page)
}
/* never called for tail page */
-static void set_page_huge_active(struct page *page)
+void set_page_huge_active(struct page *page)
{
VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
SetPagePrivate(&page[1]);
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page.patch
mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page.patch
mm-hugetlb-fix-a-race-between-isolating-and-freeing-page.patch
mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active.patch
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-migrate-do-not-migrate-hugetlb-page-whose-refcount-is-one.patch
mm-hugetlb-add-return-eagain-for-dissolve_free_huge_page.patch
The patch titled
Subject: mm: fix initialization of struct page for holes in memory layout
has been removed from the -mm tree. Its filename was
mm-fix-initialization-of-struct-page-for-holes-in-memory-layout.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Mike Rapoport <rppt(a)linux.ibm.com>
Subject: mm: fix initialization of struct page for holes in memory layout
There could be struct pages that are not backed by actual physical memory.
This can happen when the actual memory bank is not a multiple of
SECTION_SIZE or when an architecture does not register memory holes
reserved by the firmware as memblock.memory.
Such pages are currently initialized using init_unavailable_mem() function
that iterated through PFNs in holes in memblock.memory and if there is a
struct page corresponding to a PFN, the fields if this page are set to
default values and it is marked as Reserved.
init_unavailable_mem() does not take into account zone and node the page
belongs to and sets both zone and node links in struct page to zero.
On a system that has firmware reserved holes in a zone above ZONE_DMA, for
instance in a configuration below:
# grep -A1 E820 /proc/iomem
7a17b000-7a216fff : Unknown E820 type
7a217000-7bffffff : System RAM
unset zone link in struct page will trigger
VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
because there are pages in both ZONE_DMA32 and ZONE_DMA (unset zone link in
struct page) in the same pageblock.
Interleave initialization of pages that correspond to holes with the
initialization of memory map, so that zone and node information will be
properly set on such pages.
[akpm(a)linux-foundation.org: coding style fixes]
Link: https://lkml.kernel.org/r/20201209214304.6812-3-rppt@kernel.org
Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather
that check each PFN")
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reported-by: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 152 +++++++++++++++++++---------------------------
1 file changed, 65 insertions(+), 87 deletions(-)
--- a/mm/page_alloc.c~mm-fix-initialization-of-struct-page-for-holes-in-memory-layout
+++ a/mm/page_alloc.c
@@ -6254,24 +6254,85 @@ static void __meminit zone_init_free_lis
}
}
-void __meminit __weak memmap_init(unsigned long size, int nid,
- unsigned long zone,
- unsigned long range_start_pfn)
+#if !defined(CONFIG_FLAT_NODE_MEM_MAP)
+/*
+ * Only struct pages that are backed by physical memory available to the
+ * kernel are zeroed and initialized by memmap_init_zone().
+ * But, there are some struct pages that are either reserved by firmware or
+ * do not correspond to physical page frames because the actual memory bank
+ * is not a multiple of SECTION_SIZE.
+ * Fields of those struct pages may be accessed (for example page_to_pfn()
+ * on some configuration accesses page flags) so we must explicitly
+ * initialize those struct pages.
+ */
+static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn,
+ int zone, int node)
{
- unsigned long start_pfn, end_pfn;
+ unsigned long pfn;
+ u64 pgcnt = 0;
+
+ for (pfn = spfn; pfn < epfn; pfn++) {
+ if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
+ pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)
+ + pageblock_nr_pages - 1;
+ continue;
+ }
+ __init_single_page(pfn_to_page(pfn), pfn, zone, node);
+ __SetPageReserved(pfn_to_page(pfn));
+ pgcnt++;
+ }
+
+ return pgcnt;
+}
+#else
+static inline u64 init_unavailable_range(unsigned long spfn, unsigned long epfn,
+ int zone, int node)
+{
+ return 0;
+}
+#endif
+
+void __init __weak memmap_init(unsigned long size, int nid,
+ unsigned long zone,
+ unsigned long range_start_pfn)
+{
+ unsigned long start_pfn, end_pfn, hole_start_pfn = 0;
unsigned long range_end_pfn = range_start_pfn + size;
+ u64 pgcnt = 0;
int i;
for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
+ hole_start_pfn = clamp(hole_start_pfn, range_start_pfn,
+ range_end_pfn);
if (end_pfn > start_pfn) {
size = end_pfn - start_pfn;
memmap_init_zone(size, nid, zone, start_pfn, range_end_pfn,
MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
}
+
+ if (hole_start_pfn < start_pfn)
+ pgcnt += init_unavailable_range(hole_start_pfn,
+ start_pfn, zone, nid);
+ hole_start_pfn = end_pfn;
}
+
+ /*
+ * Early sections always have a fully populated memmap for the whole
+ * section - see pfn_valid(). If the last section has holes at the
+ * end and that section is marked "online", the memmap will be
+ * considered initialized. Make sure that memmap has a well defined
+ * state.
+ */
+ if (hole_start_pfn < range_end_pfn)
+ pgcnt += init_unavailable_range(hole_start_pfn, range_end_pfn,
+ zone, nid);
+
+ if (pgcnt)
+ pr_info("%s: Zeroed struct page in unavailable ranges: %lld\n",
+ zone_names[zone], pgcnt);
}
static int zone_batchsize(struct zone *zone)
@@ -7072,88 +7133,6 @@ void __init free_area_init_memoryless_no
free_area_init_node(nid);
}
-#if !defined(CONFIG_FLAT_NODE_MEM_MAP)
-/*
- * Initialize all valid struct pages in the range [spfn, epfn) and mark them
- * PageReserved(). Return the number of struct pages that were initialized.
- */
-static u64 __init init_unavailable_range(unsigned long spfn, unsigned long epfn)
-{
- unsigned long pfn;
- u64 pgcnt = 0;
-
- for (pfn = spfn; pfn < epfn; pfn++) {
- if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages))) {
- pfn = ALIGN_DOWN(pfn, pageblock_nr_pages)
- + pageblock_nr_pages - 1;
- continue;
- }
- /*
- * Use a fake node/zone (0) for now. Some of these pages
- * (in memblock.reserved but not in memblock.memory) will
- * get re-initialized via reserve_bootmem_region() later.
- */
- __init_single_page(pfn_to_page(pfn), pfn, 0, 0);
- __SetPageReserved(pfn_to_page(pfn));
- pgcnt++;
- }
-
- return pgcnt;
-}
-
-/*
- * Only struct pages that are backed by physical memory are zeroed and
- * initialized by going through __init_single_page(). But, there are some
- * struct pages which are reserved in memblock allocator and their fields
- * may be accessed (for example page_to_pfn() on some configuration accesses
- * flags). We must explicitly initialize those struct pages.
- *
- * This function also addresses a similar issue where struct pages are left
- * uninitialized because the physical address range is not covered by
- * memblock.memory or memblock.reserved. That could happen when memblock
- * layout is manually configured via memmap=, or when the highest physical
- * address (max_pfn) does not end on a section boundary.
- */
-static void __init init_unavailable_mem(void)
-{
- phys_addr_t start, end;
- u64 i, pgcnt;
- phys_addr_t next = 0;
-
- /*
- * Loop through unavailable ranges not covered by memblock.memory.
- */
- pgcnt = 0;
- for_each_mem_range(i, &start, &end) {
- if (next < start)
- pgcnt += init_unavailable_range(PFN_DOWN(next),
- PFN_UP(start));
- next = end;
- }
-
- /*
- * Early sections always have a fully populated memmap for the whole
- * section - see pfn_valid(). If the last section has holes at the
- * end and that section is marked "online", the memmap will be
- * considered initialized. Make sure that memmap has a well defined
- * state.
- */
- pgcnt += init_unavailable_range(PFN_DOWN(next),
- round_up(max_pfn, PAGES_PER_SECTION));
-
- /*
- * Struct pages that do not have backing memory. This could be because
- * firmware is using some of this memory, or for some other reasons.
- */
- if (pgcnt)
- pr_info("Zeroed struct page in unavailable ranges: %lld pages", pgcnt);
-}
-#else
-static inline void __init init_unavailable_mem(void)
-{
-}
-#endif /* !CONFIG_FLAT_NODE_MEM_MAP */
-
#if MAX_NUMNODES > 1
/*
* Figure out the number of possible node ids.
@@ -7584,7 +7563,6 @@ void __init free_area_init(unsigned long
/* Initialise every node */
mminit_verify_pageflags_layout();
setup_nr_node_ids();
- init_unavailable_mem();
for_each_online_node(nid) {
pg_data_t *pgdat = NODE_DATA(nid);
free_area_init_node(nid);
_
Patches currently in -mm which might be from rppt(a)linux.ibm.com are
mm-add-definition-of-pmd_page_order.patch
mmap-make-mlock_future_check-global.patch
set_memory-allow-set_direct_map__noflush-for-multiple-pages.patch
set_memory-allow-querying-whether-set_direct_map_-is-actually-enabled.patch
mm-introduce-memfd_secret-system-call-to-create-secret-memory-areas.patch
mm-introduce-memfd_secret-system-call-to-create-secret-memory-areas-fix.patch
secretmem-use-pmd-size-pages-to-amortize-direct-map-fragmentation.patch
secretmem-add-memcg-accounting.patch
pm-hibernate-disable-when-there-are-active-secretmem-users.patch
arch-mm-wire-up-memfd_secret-system-call-were-relevant.patch
secretmem-test-add-basic-selftest-for-memfd_secret2.patch
The patch titled
Subject: mm: memblock: enforce overlap of memory.memblock and memory.reserved
has been removed from the -mm tree. Its filename was
mm-memblock-enforce-overlap-of-memorymemblock-and-memoryreserved.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Mike Rapoport <rppt(a)linux.ibm.com>
Subject: mm: memblock: enforce overlap of memory.memblock and memory.reserved
Patch series "mm: fix initialization of struct page for holes in memory layout", v2.
Commit 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions
rather that check each PFN") exposed several issues with the memory map
initialization and these patches fix those issues.
Initially there were crashes during compaction that Qian Cai reported back
in April [1]. It seemed back then that the probelm was fixed, but a few
weeks ago Andrea Arcangeli hit the same bug [2] and after a long
discussion between us [3] I think these patches are the proper fix.
[1] https://lore.kernel.org/lkml/8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw
[2] https://lore.kernel.org/lkml/20201121194506.13464-1-aarcange@redhat.com
[3] https://lore.kernel.org/mm-commits/20201206005401.qKuAVgOXr%akpm@linux-foun…
This patch (of 2):
memblock does not require that the reserved memory ranges will be a subset
of memblock.memory.
As a result there may be reserved pages that are not in the range of any
zone or node because zone and node boundaries are detected based on
memblock.memory and pages that only present in memblock.reserved are not
taken into account during zone/node size detection.
Make sure that all ranges in memblock.reserved are added to
memblock.memory before calculating node and zone boundaries.
Link: https://lkml.kernel.org/r/20201209214304.6812-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20201209214304.6812-2-rppt@kernel.org
Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
Signed-off-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reported-by: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Qian Cai <cai(a)lca.pw>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/memblock.h | 1 +
mm/memblock.c | 24 ++++++++++++++++++++++++
mm/page_alloc.c | 7 +++++++
3 files changed, 32 insertions(+)
--- a/include/linux/memblock.h~mm-memblock-enforce-overlap-of-memorymemblock-and-memoryreserved
+++ a/include/linux/memblock.h
@@ -120,6 +120,7 @@ int memblock_clear_nomap(phys_addr_t bas
unsigned long memblock_free_all(void);
void reset_node_managed_pages(pg_data_t *pgdat);
void reset_all_zones_managed_pages(void);
+void memblock_enforce_memory_reserved_overlap(void);
/* Low level functions */
void __next_mem_range(u64 *idx, int nid, enum memblock_flags flags,
--- a/mm/memblock.c~mm-memblock-enforce-overlap-of-memorymemblock-and-memoryreserved
+++ a/mm/memblock.c
@@ -1860,6 +1860,30 @@ void __init_memblock memblock_trim_memor
}
}
+/**
+ * memblock_enforce_memory_reserved_overlap - make sure every range in
+ * @memblock.reserved is covered by @memblock.memory
+ *
+ * The data in @memblock.memory is used to detect zone and node boundaries
+ * during initialization of the memory map and the page allocator. Make
+ * sure that every memory range present in @memblock.reserved is also added
+ * to @memblock.memory even if the architecture specific memory
+ * initialization failed to do so
+ */
+void __init memblock_enforce_memory_reserved_overlap(void)
+{
+ phys_addr_t start, end;
+ int nid;
+ u64 i;
+
+ __for_each_mem_range(i, &memblock.reserved, &memblock.memory,
+ NUMA_NO_NODE, MEMBLOCK_NONE, &start, &end, &nid) {
+ pr_warn("memblock: reserved range [%pa-%pa] is not in memory\n",
+ &start, &end);
+ memblock_add_node(start, (end - start), nid);
+ }
+}
+
void __init_memblock memblock_set_current_limit(phys_addr_t limit)
{
memblock.current_limit = limit;
--- a/mm/page_alloc.c~mm-memblock-enforce-overlap-of-memorymemblock-and-memoryreserved
+++ a/mm/page_alloc.c
@@ -7513,6 +7513,13 @@ void __init free_area_init(unsigned long
memset(arch_zone_highest_possible_pfn, 0,
sizeof(arch_zone_highest_possible_pfn));
+ /*
+ * Some architectures (e.g. x86) have reserved pages outside of
+ * memblock.memory. Make sure these pages are taken into account
+ * when detecting zone and node boundaries
+ */
+ memblock_enforce_memory_reserved_overlap();
+
start_pfn = find_min_pfn_with_active_regions();
descending = arch_has_descending_max_zone_pfns();
_
Patches currently in -mm which might be from rppt(a)linux.ibm.com are
mm-fix-initialization-of-struct-page-for-holes-in-memory-layout.patch
mm-add-definition-of-pmd_page_order.patch
mmap-make-mlock_future_check-global.patch
set_memory-allow-set_direct_map__noflush-for-multiple-pages.patch
set_memory-allow-querying-whether-set_direct_map_-is-actually-enabled.patch
mm-introduce-memfd_secret-system-call-to-create-secret-memory-areas.patch
mm-introduce-memfd_secret-system-call-to-create-secret-memory-areas-fix.patch
secretmem-use-pmd-size-pages-to-amortize-direct-map-fragmentation.patch
secretmem-add-memcg-accounting.patch
pm-hibernate-disable-when-there-are-active-secretmem-users.patch
arch-mm-wire-up-memfd_secret-system-call-were-relevant.patch
secretmem-test-add-basic-selftest-for-memfd_secret2.patch