From: Minchan Kim <minchan(a)kernel.org>
Subject: zram: fix broken page writeback
commit 0d8359620d9b ("zram: support page writeback") introduced two
problems. It overwrites writeback_store's return value as kstrtol's
return value, which makes return value zero so user could see zero as
return value of write syscall even though it wrote data successfully.
It also breaks index value in the loop in that it doesn't increase the
index any longer. It means it can write only first starting block index
so user couldn't write all idle pages in the zram so lose memory saving
chance.
This patch fixes those issues.
Link: https://lkml.kernel.org/r/20210312173949.2197662-2-minchan@kernel.org
Fixes: 0d8359620d9b("zram: support page writeback")
Signed-off-by: Minchan Kim <minchan(a)kernel.org>
Reported-by: Amos Bianchi <amosbianchi(a)google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Cc: John Dias <joaodias(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/block/zram/zram_drv.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/block/zram/zram_drv.c~zram-fix-broken-page-writeback
+++ a/drivers/block/zram/zram_drv.c
@@ -638,8 +638,8 @@ static ssize_t writeback_store(struct de
if (strncmp(buf, PAGE_WB_SIG, sizeof(PAGE_WB_SIG) - 1))
return -EINVAL;
- ret = kstrtol(buf + sizeof(PAGE_WB_SIG) - 1, 10, &index);
- if (ret || index >= nr_pages)
+ if (kstrtol(buf + sizeof(PAGE_WB_SIG) - 1, 10, &index) ||
+ index >= nr_pages)
return -EINVAL;
nr_pages = 1;
@@ -663,7 +663,7 @@ static ssize_t writeback_store(struct de
goto release_init_lock;
}
- while (nr_pages--) {
+ for (; nr_pages != 0; index++, nr_pages--) {
struct bio_vec bvec;
bvec.bv_page = page;
_
From: Minchan Kim <minchan(a)kernel.org>
Subject: zram: fix return value on writeback_store
writeback_store's return value is overwritten by submit_bio_wait's return
value. Thus, writeback_store will return zero since there was no IO
error. In the end, write syscall from userspace will see the zero as
return value, which could make the process stall to keep trying the write
until it will succeed.
Link: https://lkml.kernel.org/r/20210312173949.2197662-1-minchan@kernel.org
Fixes: 3b82a051c101("drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store")
Signed-off-by: Minchan Kim <minchan(a)kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Cc: Colin Ian King <colin.king(a)canonical.com>
Cc: John Dias <joaodias(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/block/zram/zram_drv.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--- a/drivers/block/zram/zram_drv.c~zram-fix-return-value-on-writeback_store
+++ a/drivers/block/zram/zram_drv.c
@@ -627,7 +627,7 @@ static ssize_t writeback_store(struct de
struct bio_vec bio_vec;
struct page *page;
ssize_t ret = len;
- int mode;
+ int mode, err;
unsigned long blk_idx = 0;
if (sysfs_streq(buf, "idle"))
@@ -728,12 +728,17 @@ static ssize_t writeback_store(struct de
* XXX: A single page IO would be inefficient for write
* but it would be not bad as starter.
*/
- ret = submit_bio_wait(&bio);
- if (ret) {
+ err = submit_bio_wait(&bio);
+ if (err) {
zram_slot_lock(zram, index);
zram_clear_flag(zram, index, ZRAM_UNDER_WB);
zram_clear_flag(zram, index, ZRAM_IDLE);
zram_slot_unlock(zram, index);
+ /*
+ * Return last IO error unless every IO were
+ * not suceeded.
+ */
+ ret = err;
continue;
}
_
From: Zhou Guanghui <zhouguanghui1(a)huawei.com>
Subject: mm/memcg: set memcg when splitting page
As described in the split_page() comment, for the non-compound high order
page, the sub-pages must be freed individually. If the memcg of the first
page is valid, the tail pages cannot be uncharged when be freed.
For example, when alloc_pages_exact is used to allocate 1MB continuous
physical memory, 2MB is charged(kmemcg is enabled and __GFP_ACCOUNT is
set). When make_alloc_exact free the unused 1MB and free_pages_exact free
the applied 1MB, actually, only 4KB(one page) is uncharged.
Therefore, the memcg of the tail page needs to be set when splitting a
page.
Michel:
There are at least two explicit users of __GFP_ACCOUNT with
alloc_exact_pages added recently. See 7efe8ef274024 ("KVM: arm64:
Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT") and c419621873713
("KVM: s390: Add memcg accounting to KVM allocations"), so this is not
just a theoretical issue.
Link: https://lkml.kernel.org/r/20210304074053.65527-3-zhouguanghui1@huawei.com
Signed-off-by: Zhou Guanghui <zhouguanghui1(a)huawei.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Reviewed-by: Zi Yan <ziy(a)nvidia.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Hanjun Guo <guohanjun(a)huawei.com>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Rui Xiang <rui.xiang(a)huawei.com>
Cc: Tianhong Ding <dingtianhong(a)huawei.com>
Cc: Weilong Chen <chenweilong(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/page_alloc.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/page_alloc.c~mm-memcg-set-memcg-when-split-page
+++ a/mm/page_alloc.c
@@ -3314,6 +3314,7 @@ void split_page(struct page *page, unsig
for (i = 1; i < (1 << order); i++)
set_page_refcounted(page + i);
split_page_owner(page, 1 << order);
+ split_page_memcg(page, 1 << order);
}
EXPORT_SYMBOL_GPL(split_page);
_