The patch titled
Subject: fat: fix uninit-memory access for partially initialized inode
has been added to the -mm tree. Its filename is
fat-fix-uninit-memory-access-for-partial-initialized-inode.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fat-fix-uninit-memory-access-for-p…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fat-fix-uninit-memory-access-for-p…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
Subject: fat: fix uninit-memory access for partially initialized inode
When we get an error in the middle of reading an inode, some fields in the
inode might still be uninitialized, and the evict_inode path may then
access those fields via iput().
To fix this, zero the relevant fields in fat_alloc_inode() so that iput()
is safe even on a partially initialized inode.
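The same defensive pattern, as a minimal userspace C sketch (illustrative
only; struct obj and its fields are made-up names, not the FAT inode
fields): zero everything the teardown path may read, so tearing down a
partially initialized object is safe.

#include <stdlib.h>

struct obj {
	char *buf;	/* teardown frees this */
	size_t len;
};

struct obj *obj_alloc(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	/* Zero every field obj_free() may look at, so teardown is safe
	 * even if initialization stops partway (cf. fat_alloc_inode()). */
	o->buf = NULL;
	o->len = 0;
	return o;
}

void obj_free(struct obj *o)
{
	if (!o)
		return;
	free(o->buf);	/* still NULL if initialization never got this far */
	free(o);
}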
Link: http://lkml.kernel.org/r/871rqnreqx.fsf@mail.parknet.co.jp
Reported-by: syzbot+9d82b8de2992579da5d0(a)syzkaller.appspotmail.com
Signed-off-by: OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/fat/inode.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
--- a/fs/fat/inode.c~fat-fix-uninit-memory-access-for-partial-initialized-inode
+++ a/fs/fat/inode.c
@@ -750,6 +750,13 @@ static struct inode *fat_alloc_inode(str
return NULL;
init_rwsem(&ei->truncate_lock);
+ /* Zeroing to allow iput() even if partial initialized inode. */
+ ei->mmu_private = 0;
+ ei->i_start = 0;
+ ei->i_logstart = 0;
+ ei->i_attrs = 0;
+ ei->i_pos = 0;
+
return &ei->vfs_inode;
}
@@ -1374,16 +1381,6 @@ out:
return 0;
}
-static void fat_dummy_inode_init(struct inode *inode)
-{
- /* Initialize this dummy inode to work as no-op. */
- MSDOS_I(inode)->mmu_private = 0;
- MSDOS_I(inode)->i_start = 0;
- MSDOS_I(inode)->i_logstart = 0;
- MSDOS_I(inode)->i_attrs = 0;
- MSDOS_I(inode)->i_pos = 0;
-}
-
static int fat_read_root(struct inode *inode)
{
struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb);
@@ -1844,13 +1841,11 @@ int fat_fill_super(struct super_block *s
fat_inode = new_inode(sb);
if (!fat_inode)
goto out_fail;
- fat_dummy_inode_init(fat_inode);
sbi->fat_inode = fat_inode;
fsinfo_inode = new_inode(sb);
if (!fsinfo_inode)
goto out_fail;
- fat_dummy_inode_init(fsinfo_inode);
fsinfo_inode->i_ino = MSDOS_FSINFO_INO;
sbi->fsinfo_inode = fsinfo_inode;
insert_inode_hash(fsinfo_inode);
_
Patches currently in -mm which might be from hirofumi(a)mail.parknet.co.jp are
fat-fix-uninit-memory-access-for-partial-initialized-inode.patch
The patch titled
Subject: mm/hugetlb.c: fix an addressing exception caused by huge_pte_offset()
has been added to the -mm tree. Its filename is
mm-hugetlb-fix-a-addressing-exception-caused-by-huge_pte_offset.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-fix-a-addressing-except…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-fix-a-addressing-except…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Longpeng <longpeng2(a)huawei.com>
Subject: mm/hugetlb.c: fix an addressing exception caused by huge_pte_offset()
Our machine encountered a panic (addressing exception) after running for a
long time. The call trace is:
RIP: 0010:[<ffffffff9dff0587>] [<ffffffff9dff0587>] hugetlb_fault+0x307/0xbe0
RSP: 0018:ffff9567fc27f808 EFLAGS: 00010286
RAX: e800c03ff1258d48 RBX: ffffd3bb003b69c0 RCX: e800c03ff1258d48
RDX: 17ff3fc00eda72b7 RSI: 00003ffffffff000 RDI: e800c03ff1258d48
RBP: ffff9567fc27f8c8 R08: e800c03ff1258d48 R09: 0000000000000080
R10: ffffaba0704c22a8 R11: 0000000000000001 R12: ffff95c87b4b60d8
R13: 00005fff00000000 R14: 0000000000000000 R15: ffff9567face8074
FS: 00007fe2d9ffb700(0000) GS:ffff956900e40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffd3bb003b69c0 CR3: 000000be67374000 CR4: 00000000003627e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
[<ffffffff9df9b71b>] ? unlock_page+0x2b/0x30
[<ffffffff9dff04a2>] ? hugetlb_fault+0x222/0xbe0
[<ffffffff9dff1405>] follow_hugetlb_page+0x175/0x540
[<ffffffff9e15b825>] ? cpumask_next_and+0x35/0x50
[<ffffffff9dfc7230>] __get_user_pages+0x2a0/0x7e0
[<ffffffff9dfc648d>] __get_user_pages_unlocked+0x15d/0x210
[<ffffffffc068cfc5>] __gfn_to_pfn_memslot+0x3c5/0x460 [kvm]
[<ffffffffc06b28be>] try_async_pf+0x6e/0x2a0 [kvm]
[<ffffffffc06b4b41>] tdp_page_fault+0x151/0x2d0 [kvm]
[<ffffffffc075731c>] ? vmx_vcpu_run+0x2ec/0xc80 [kvm_intel]
[<ffffffffc0757328>] ? vmx_vcpu_run+0x2f8/0xc80 [kvm_intel]
[<ffffffffc06abc11>] kvm_mmu_page_fault+0x31/0x140 [kvm]
[<ffffffffc074d1ae>] handle_ept_violation+0x9e/0x170 [kvm_intel]
[<ffffffffc075579c>] vmx_handle_exit+0x2bc/0xc70 [kvm_intel]
[<ffffffffc074f1a0>] ? __vmx_complete_interrupts.part.73+0x80/0xd0 [kvm_intel]
[<ffffffffc07574c0>] ? vmx_vcpu_run+0x490/0xc80 [kvm_intel]
[<ffffffffc069f3be>] vcpu_enter_guest+0x7be/0x13a0 [kvm]
[<ffffffffc06cf53e>] ? kvm_check_async_pf_completion+0x8e/0xb0 [kvm]
[<ffffffffc06a6f90>] kvm_arch_vcpu_ioctl_run+0x330/0x490 [kvm]
[<ffffffffc068d919>] kvm_vcpu_ioctl+0x309/0x6d0 [kvm]
[<ffffffff9deaa8c2>] ? dequeue_signal+0x32/0x180
[<ffffffff9deae34d>] ? do_sigtimedwait+0xcd/0x230
[<ffffffff9e03aed0>] do_vfs_ioctl+0x3f0/0x540
[<ffffffff9e03b0c1>] SyS_ioctl+0xa1/0xc0
[<ffffffff9e53879b>] system_call_fastpath+0x22/0x27
The kernel we used is older, but after digging into this problem we
believe the latest kernel also has this bug.
For 1G hugepages, huge_pte_offset() wants to return NULL or pudp, but it
may return a wrong 'pmdp' if there is a race. Please look at the
following code snippet:
...
pud = pud_offset(p4d, addr);
if (sz != PUD_SIZE && pud_none(*pud))
return NULL;
/* hugepage or swap? */
if (pud_huge(*pud) || !pud_present(*pud))
return (pte_t *)pud;
pmd = pmd_offset(pud, addr);
if (sz != PMD_SIZE && pmd_none(*pmd))
return NULL;
/* hugepage or swap? */
if (pmd_huge(*pmd) || !pmd_present(*pmd))
return (pte_t *)pmd;
...
The following sequence would trigger this bug:
1. CPU0: sz = PUD_SIZE and *pud = 0, continue
2. CPU0: "pud_huge(*pud)" is false
3. CPU1: calls hugetlb_no_page() and sets *pud to xxxx8e7 (PRESENT)
4. CPU0: "!pud_present(*pud)" is false, continue
5. CPU0: pmd = pmd_offset(pud, addr) and may return a wrong pmdp
However, we want CPU0 to return NULL or pudp in that case.
We can avoid this race by reading the pud only once. What's more, we also
use READ_ONCE() to access the entries for safety (i.e. to avoid compiler
mischief).
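As a minimal sketch of the snapshot idea (a userspace analogue, not the
kernel code): testing one local copy means both checks see the same value,
while two separate dereferences can observe two different entries.

#include <stdatomic.h>

/* Racy: the two loads of *p may observe different values if another
 * thread updates the entry between them. */
int checks_racy(_Atomic int *p)
{
	return *p != 0 && *p != -1;
}

/* Safe: take one snapshot and test only that snapshot; this is what
 * READ_ONCE(*pud) buys huge_pte_offset(). */
int checks_snapshot(_Atomic int *p)
{
	int v = atomic_load_explicit(p, memory_order_relaxed);

	return v != 0 && v != -1;
}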
Link: http://lkml.kernel.org/r/1582342427-230392-1-git-send-email-longpeng2@huawe…
Signed-off-by: Longpeng <longpeng2(a)huawei.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Sean Christopherson <sean.j.christopherson(a)intel.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-a-addressing-exception-caused-by-huge_pte_offset
+++ a/mm/hugetlb.c
@@ -4910,28 +4910,30 @@ pte_t *huge_pte_offset(struct mm_struct
{
pgd_t *pgd;
p4d_t *p4d;
- pud_t *pud;
- pmd_t *pmd;
+ pud_t *pud, pud_entry;
+ pmd_t *pmd, pmd_entry;
pgd = pgd_offset(mm, addr);
- if (!pgd_present(*pgd))
+ if (!pgd_present(READ_ONCE(*pgd)))
return NULL;
p4d = p4d_offset(pgd, addr);
- if (!p4d_present(*p4d))
+ if (!p4d_present(READ_ONCE(*p4d)))
return NULL;
pud = pud_offset(p4d, addr);
- if (sz != PUD_SIZE && pud_none(*pud))
+ pud_entry = READ_ONCE(*pud);
+ if (sz != PUD_SIZE && pud_none(pud_entry))
return NULL;
/* hugepage or swap? */
- if (pud_huge(*pud) || !pud_present(*pud))
+ if (pud_huge(pud_entry) || !pud_present(pud_entry))
return (pte_t *)pud;
pmd = pmd_offset(pud, addr);
- if (sz != PMD_SIZE && pmd_none(*pmd))
+ pmd_entry = READ_ONCE(*pmd);
+ if (sz != PMD_SIZE && pmd_none(pmd_entry))
return NULL;
/* hugepage or swap? */
- if (pmd_huge(*pmd) || !pmd_present(*pmd))
+ if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry))
return (pte_t *)pmd;
return NULL;
_
Patches currently in -mm which might be from longpeng2(a)huawei.com are
mm-hugetlb-fix-a-addressing-exception-caused-by-huge_pte_offset.patch
This reverts commit 4585fbcb5331fc910b7e553ad3efd0dd7b320d14.
Renaming the device to devfreq(X) breaks some user-space applications,
such as the Android HAL from Unisoc and HiKey [1].
The device name changes unexpectedly after every boot, depending on the
module init sequence, which makes it troublesome to set up system
configuration such as SELinux for Android.
So we'd like to revert back to the old naming rule until a better way is
found.
[1] https://lkml.org/lkml/2018/5/8/1042
Cc: John Stultz <john.stultz(a)linaro.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Orson Zhai <orson.unisoc(a)gmail.com>
---
drivers/devfreq/devfreq.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index cceee8b..7dcf209 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -738,7 +738,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
{
struct devfreq *devfreq;
struct devfreq_governor *governor;
- static atomic_t devfreq_no = ATOMIC_INIT(-1);
int err = 0;
if (!dev || !profile || !governor_name) {
@@ -800,8 +799,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
atomic_set(&devfreq->suspend_count, 0);
- dev_set_name(&devfreq->dev, "devfreq%d",
- atomic_inc_return(&devfreq_no));
+ dev_set_name(&devfreq->dev, "%s", dev_name(dev));
err = device_register(&devfreq->dev);
if (err) {
mutex_unlock(&devfreq->lock);
--
2.7.4
The WMI method to set the charge threshold does not provide a way to
specify a battery, so we assume it is the first/primary battery (by
checking whether its name is BAT0).
On some newer ASUS laptops (Zenbook UM431DA) though, the
primary/first battery isn't named BAT0 but BATT, so we need
to support that case.
Signed-off-by: Kristian Klausen <kristian(a)klausen.dk>
Cc: stable(a)vger.kernel.org
---
I'm not sure if this is a candidate for -stable. It fixes a real bug
(the charge threshold doesn't work on newer ASUS laptops) which has been
reported by a user [1], but is that enough?
I had a quick look at [2]; can this be considered "something critical"?
It "bothers people" [1]. My point: I'm not sure.
I'm unsure if there is a better way to fix this. Maybe a counter would be
better (+1 for every new battery)? It would probably need to be atomic to
prevent a race condition (I'm not sure how this code is run), but this
"fix" is way simpler.
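For illustration only, a rough and untested sketch of that counter idea
(everything below is hypothetical, not part of this patch):

/* Hypothetical alternative: accept only the first battery that shows
 * up, whatever its name.  battery_no would also need to be reset when
 * the driver is unbound, which is part of why the name check in the
 * patch is simpler. */
static atomic_t battery_no = ATOMIC_INIT(-1);

static int asus_wmi_battery_add(struct power_supply *battery)
{
	if (atomic_inc_return(&battery_no) != 0)
		return -ENODEV;
	/* ... create the charge-threshold sysfs file as before ... */
	return 0;
}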
Please do not accept this patch just yet, I'm waiting for the tester
to either confirm or deny credit[3].
[1] https://gist.github.com/klausenbusk/643f15320ae8997427155c38be13e445#gistco…
[2] https://www.kernel.org/doc/html/v5.5/process/stable-kernel-rules.html
[3] https://gist.github.com/klausenbusk/643f15320ae8997427155c38be13e445#gistco…
drivers/platform/x86/asus-wmi.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/asus-wmi.c b/drivers/platform/x86/asus-wmi.c
index 612ef5526226..4c690cebdd55 100644
--- a/drivers/platform/x86/asus-wmi.c
+++ b/drivers/platform/x86/asus-wmi.c
@@ -426,8 +426,11 @@ static int asus_wmi_battery_add(struct power_supply *battery)
{
/* The WMI method does not provide a way to specific a battery, so we
* just assume it is the first battery.
+ * Note: On some newer ASUS laptops (Zenbook UM431DA), the primary/first
+ * battery is named BATT.
*/
- if (strcmp(battery->desc->name, "BAT0") != 0)
+ if (strcmp(battery->desc->name, "BAT0") != 0
+ && (strcmp(battery->desc->name, "BATT") != 0))
return -ENODEV;
if (device_create_file(&battery->dev,
--
2.25.1
The patch titled
Subject: lib/stackdepot.c: fix global out-of-bounds in stack_slabs
has been removed from the -mm tree. Its filename was
lib-stackdepot-fix-global-out-of-bounds-in-stack_slabs.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Alexander Potapenko <glider(a)google.com>
Subject: lib/stackdepot.c: fix global out-of-bounds in stack_slabs
Walter Wu has reported a potential case in which init_stack_slab() is
called after stack_slabs[STACK_ALLOC_MAX_SLABS - 1] has already been
initialized. In that case init_stack_slab() will overwrite
stack_slabs[STACK_ALLOC_MAX_SLABS], which may result in memory
corruption.
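In other words (a generic sketch, simplified from the stackdepot code):
when the index already points at the last slot, writing to index + 1
lands one element past the end of the array, so the fix guards that
write with a bounds check.

#define MAX_SLABS 8192		/* stands in for STACK_ALLOC_MAX_SLABS */
static void *slabs[MAX_SLABS];

static void fill_next_slab(int idx, void **prealloc)
{
	/* Without this check, idx == MAX_SLABS - 1 would write to
	 * slabs[MAX_SLABS], one element out of bounds. */
	if (idx + 1 < MAX_SLABS) {
		slabs[idx + 1] = *prealloc;
		*prealloc = NULL;
	}
}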
Link: http://lkml.kernel.org/r/20200218102950.260263-1-glider@google.com
Fixes: cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
Signed-off-by: Alexander Potapenko <glider(a)google.com>
Reported-by: Walter Wu <walter-zh.wu(a)mediatek.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Matthias Brugger <matthias.bgg(a)gmail.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Kate Stewart <kstewart(a)linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/stackdepot.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/lib/stackdepot.c~lib-stackdepot-fix-global-out-of-bounds-in-stack_slabs
+++ a/lib/stackdepot.c
@@ -83,15 +83,19 @@ static bool init_stack_slab(void **preal
return true;
if (stack_slabs[depot_index] == NULL) {
stack_slabs[depot_index] = *prealloc;
+ *prealloc = NULL;
} else {
- stack_slabs[depot_index + 1] = *prealloc;
+ /* If this is the last depot slab, do not touch the next one. */
+ if (depot_index + 1 < STACK_ALLOC_MAX_SLABS) {
+ stack_slabs[depot_index + 1] = *prealloc;
+ *prealloc = NULL;
+ }
/*
* This smp_store_release pairs with smp_load_acquire() from
* |next_slab_inited| above and in stack_depot_save().
*/
smp_store_release(&next_slab_inited, 1);
}
- *prealloc = NULL;
return true;
}
_
Patches currently in -mm which might be from glider(a)google.com are
stackdepot-check-depot_index-before-accessing-the-stack-slab.patch
stackdepot-build-with-fno-builtin.patch
kasan-stackdepot-move-filter_irq_stacks-to-stackdepotc.patch
The patch titled
Subject: mm/vmscan.c: don't round up scan size for online memory cgroup
has been removed from the -mm tree. Its filename was
mm-vmscan-dont-round-up-scan-size-for-online-memory-cgroup.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Gavin Shan <gshan(a)redhat.com>
Subject: mm/vmscan.c: don't round up scan size for online memory cgroup
commit 68600f623d69 ("mm: don't miss the last page because of round-off
error") makes the scan size round up to @denominator regardless of the
memory cgroup's state, online or offline. This affects the overall
reclaiming behavior: with the old formula, the corresponding LRU list is
eligible for reclaiming only when its size, logically right shifted by
@sc->priority, is bigger than zero. For example, the inactive anonymous
LRU list needs at least 0x4000 pages to be eligible for reclaiming when we
have 60/12 for swappiness/priority, without taking the scan/rotation ratio
into account. After the roundup is applied, the inactive anonymous LRU
list becomes eligible for reclaiming when its size is bigger than or equal
to 0x1000 under the same conditions.
(0x4000 >> 12) * 60 / (60 + 140 + 1) = 1
((0x1000 >> 12) * 60 + 200) / (60 + 140 + 1) = 1
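A small userspace sketch of that arithmetic (assuming
DIV64_U64_ROUND_UP(x, d) == (x + d - 1) / d): floor division skips the
small list entirely, while round-up division always scans at least one
page of it.

#include <stdint.h>
#include <stdio.h>

static uint64_t div_floor(uint64_t x, uint64_t d)    { return x / d; }
static uint64_t div_round_up(uint64_t x, uint64_t d) { return (x + d - 1) / d; }

int main(void)
{
	uint64_t fraction = 60;			/* anon scan fraction */
	uint64_t denominator = 60 + 140 + 1;
	uint64_t scan = 0x1000 >> 12;		/* list size >> sc->priority = 1 */

	/* floor: 60 / 201 = 0 -> the small list is not scanned */
	printf("%llu\n", (unsigned long long)div_floor(scan * fraction, denominator));
	/* round-up: (60 + 200) / 201 = 1 -> the small list is scanned */
	printf("%llu\n", (unsigned long long)div_round_up(scan * fraction, denominator));
	return 0;
}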
aarch64 has a 512MB huge page size when the base page size is 64KB. A
memory cgroup that has a huge page is always eligible for reclaiming in
that case. The reclaiming is likely to stop after the huge page is
reclaimed, meaning the further iteration on @sc->priority and the sibling
and child memory cgroups will be skipped. The overall behaviour has been
changed. This fixes the issue by applying the roundup to offlined memory
cgroups only, to give more preference to reclaiming memory from offlined
memory cgroups. That sounds reasonable, as that memory is unlikely to be
used by anyone.
The issue was found by starting up 8 VMs on an Ampere Mustang machine,
which has 8 CPUs and 16 GB memory. Each VM is given 2 vCPUs and 2GB
memory. It took 264 seconds for all VMs to be completely up, and 784MB of
swap was consumed after that. With this patch applied, it took 236
seconds and 60MB of swap to do the same thing. So there is a 10%
performance improvement in my case. Note that KSM is disabled while THP
is enabled in the testing.
Without the patch:
              total        used        free      shared  buff/cache   available
Mem:          16196       10065        2049          16        4081        3749
Swap:          8175         784        7391

With the patch:
              total        used        free      shared  buff/cache   available
Mem:          16196       11324        3656          24        1215        2936
Swap:          8175          60        8115
Link: http://lkml.kernel.org/r/20200211024514.8730-1-gshan@redhat.com
Fixes: 68600f623d69 ("mm: don't miss the last page because of round-off error")
Signed-off-by: Gavin Shan <gshan(a)redhat.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Cc: <stable(a)vger.kernel.org> [4.20+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--- a/mm/vmscan.c~mm-vmscan-dont-round-up-scan-size-for-online-memory-cgroup
+++ a/mm/vmscan.c
@@ -2415,10 +2415,13 @@ out:
/*
* Scan types proportional to swappiness and
* their relative recent reclaim efficiency.
- * Make sure we don't miss the last page
- * because of a round-off error.
+ * Make sure we don't miss the last page on
+ * the offlined memory cgroups because of a
+ * round-off error.
*/
- scan = DIV64_U64_ROUND_UP(scan * fraction[file],
+ scan = mem_cgroup_online(memcg) ?
+ div64_u64(scan * fraction[file], denominator) :
+ DIV64_U64_ROUND_UP(scan * fraction[file],
denominator);
break;
case SCAN_FILE:
_
Patches currently in -mm which might be from gshan(a)redhat.com are