On 2/11/19 8:27 PM, Andrew Morton wrote:
> On Mon, 11 Feb 2019 10:02:45 -0800 <rcampbell(a)nvidia.com> wrote:
>
>> From: Ralph Campbell <rcampbell(a)nvidia.com>
>>
>> The system call, get_mempolicy() [1], passes an unsigned long *nodemask
>> pointer and an unsigned long maxnode argument which specifies the
>> length of the user's nodemask array in bits (which is rounded up).
>> The manual page says that if the maxnode value is too small,
>> get_mempolicy will return EINVAL but there is no system call to return
>> this minimum value. To determine this value, some programs search
>> /proc/<pid>/status for a line starting with "Mems_allowed:" and use
>> the number of digits in the mask to determine the minimum value.
>> A recent change to the way this line is formatted [2] causes these
>> programs to compute a value less than MAX_NUMNODES so get_mempolicy()
>> returns EINVAL.
>>
>> Change get_mempolicy(), the older compat version of get_mempolicy(), and
>> the copy_nodes_to_user() function to use nr_node_ids instead of
>> MAX_NUMNODES, thus preserving the de facto method of computing the
>> minimum size for the nodemask array and the maxnode argument.
>>
>> [1] http://man7.org/linux/man-pages/man2/get_mempolicy.2.html
>> [2] https://lore.kernel.org/lkml/1545405631-6808-1-git-send-email-longman@redha…
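For reference, the de facto probing described in the quoted changelog boils
down to something like the following (a minimal userspace sketch, not taken
from any particular program; error handling kept to a minimum):

	#include <ctype.h>
	#include <stdio.h>
	#include <string.h>

	/* Count the hex digits in the "Mems_allowed:" line of /proc/self/status;
	 * each hex digit covers 4 node bits, so digits * 4 is the smallest
	 * maxnode value that get_mempolicy() will accept. */
	static unsigned long mems_allowed_bits(void)
	{
		char line[4096];
		unsigned long digits = 0;
		FILE *f = fopen("/proc/self/status", "r");

		if (!f)
			return 0;
		while (fgets(line, sizeof(line), f)) {
			if (strncmp(line, "Mems_allowed:", 13) == 0) {
				for (char *p = line + 13; *p; p++)
					if (isxdigit((unsigned char)*p))
						digits++;
				break;
			}
		}
		fclose(f);
		return digits * 4;
	}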
Please, next time include linux-api and the people involved in the previous
thread [1] in the CC list. There should likely have been a Suggested-by: for
Alexander as well.
>>
>
> Ugh, what a mess.
I'm afraid it's an even worse mess now.
> For a start, that's a crazy interface. I wish that had been brought to
> our attention so we could have provided a sane way for userspace to
> determine MAX_NUMNODES.
>
> Secondly, 4fb8e5b89bcbbb ("include/linux/nodemask.h: use nr_node_ids
> (not MAX_NUMNODES) in __nodemask_pr_numnodes()") introduced a
There's no such commit; that sha was probably from linux-next. The patch is
still in mmotm [2]. Luckily, I would say. Maybe Linus or some automation could
run a script to check for bogus Fixes: tags before accepting patches?
> regression. The proposed get_mempolicy() change appears to be a good
> one, but is a strange way of addressing the regression. I suppose it's
> acceptable, as long as this change is backported into kernels which
> have 4fb8e5b89bcbbb.
Based on the non-existent sha, hopefully it wasn't backported anywhere, but
maybe some AI did anyway. Ah, it seems it indeed made it as far as 4.9, as a
fix for a non-existent commit and without proper linux-api consideration :(
I guess it's too late to revert it for 5.0. Hopefully the change is really safe
and won't break anything, i.e. hopefully nobody was determining MAX_NUMNODES by
increasing the buffer size until get_mempolicy() stopped returning EINVAL, and
there is no other problem in, e.g., a CRIU context.
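(For illustration, a hypothetical sketch of that kind of probing, not taken
from any real program; it uses the libnuma numaif.h wrapper for
get_mempolicy(), so link with -lnuma. With the change, such a loop would now
stop at a value derived from nr_node_ids rather than MAX_NUMNODES.)

	#include <errno.h>
	#include <numaif.h>
	#include <stdlib.h>

	#ifndef MPOL_F_MEMS_ALLOWED
	#define MPOL_F_MEMS_ALLOWED (1 << 2)	/* from linux/mempolicy.h */
	#endif

	static unsigned long probe_max_numnodes(void)
	{
		unsigned long bits_per_ul = 8 * sizeof(unsigned long);
		unsigned long maxnode = 64;

		for (;;) {
			unsigned long *mask = calloc((maxnode + bits_per_ul - 1) / bits_per_ul,
						     sizeof(unsigned long));
			long ret;

			if (!mask)
				return 0;
			ret = get_mempolicy(NULL, mask, maxnode, NULL,
					    MPOL_F_MEMS_ALLOWED);
			free(mask);
			if (ret == 0)
				return maxnode;	/* accepted: assumed "MAX_NUMNODES" */
			if (errno != EINVAL)
				return 0;	/* some unrelated failure */
			maxnode *= 2;		/* too small, keep growing */
		}
	}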
What about the manpage? It says "The value specified by maxnode is less than
the number of node IDs supported by the system.", which could perhaps be read
as either nr_node_ids or MAX_NUMNODES. Or should we update it?
[1]
https://lore.kernel.org/linux-mm/631c44cc-df2d-40d4-a537-d24864df0679@nvidi…
[2]
https://www.ozlabs.org/~akpm/mmotm/broken-out/include-linux-nodemaskh-use-n…
Quoting Kenneth Graunke (2018-01-05 06:06:34)
> On Thursday, January 4, 2018 4:41:35 PM PST Rodrigo Vivi wrote:
> > On Thu, Jan 04, 2018 at 11:39:23PM +0000, Kenneth Graunke wrote:
> > > On Thursday, January 4, 2018 1:23:06 PM PST Chris Wilson wrote:
> > > > Quoting Kenneth Graunke (2018-01-04 19:38:05)
> > > > > Geminilake requires the 3D driver to select whether barriers are
> > > > > intended for compute shaders, or tessellation control shaders, by
> > > > > whacking a "Barrier Mode" bit in SLICE_COMMON_ECO_CHICKEN1 when
> > > > > switching pipelines. Failure to do this properly can result in GPU
> > > > > hangs.
> > > > >
> > > > > Unfortunately, this means it needs to switch mid-batch, so only
> > > > > userspace can properly set it. To facilitate this, the kernel needs
> > > > > to whitelist the register.
> > > > >
> > > > > Signed-off-by: Kenneth Graunke <kenneth(a)whitecape.org>
> > > > > Cc: stable(a)vger.kernel.org
> > > > > ---
> > > > > drivers/gpu/drm/i915/i915_reg.h | 2 ++
> > > > > drivers/gpu/drm/i915/intel_engine_cs.c | 5 +++++
> > > > > 2 files changed, 7 insertions(+)
> > > > >
> > > > > Hello,
> > > > >
> > > > > We unfortunately need to whitelist an extra register for a GPU hang
> > > > > fix on Geminilake. Here's the corresponding Mesa patch:
> > > >
> > > > Thankfully it appears to be context saved. Has a w/a name been assigned
> > > > for this?
> > > > -Chris
> > >
> > > There doesn't appear to be one. The workaround page lists it, but there
> > > is no name. The register description has a note saying that you need to
> > > set this, but doesn't call it out as a workaround.
> >
> > It mentions only BXT:ALL, with no mention of GLK.
> >
> > Should we add it to both then?
>
> Well, that's irritating. On the workarounds page, it does indeed say
> "BXT" with no mention of GLK. But the workaround text says to set
> "SLICE_COMMON_CHICKEN_ECO1 Barrier Mode [...] (bit 7 of MMIO 0x731C)."
>
> Looking at the register definition for SLICE_COMMON_ECO_CHICKEN1, bit 7
> is "Barrier Mode" on [GLK] only, with no mention of BXT. It's marked
> reserved PBC on [SKL+, not GLK, not KBL]. On KBL it's something else.
>
> I believe Mark saw tessellation control shader hangs on
> Geminilake only, and never saw this issue on Broxton. So, my guess is
> that the workaround really is new on Geminilake, and the BXT tag on the
> workarounds page is incorrect. (Mark, does that sound right to you?)
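For context, the whitelist change under discussion is tiny. A rough sketch
follows (the register name, the whitelist helper and its placement are
assumptions based on the diffstat and the 0x731C offset mentioned above, not
the actual patch):

	/* i915_reg.h */
	#define GEN9_SLICE_COMMON_ECO_CHICKEN1	_MMIO(0x731c)

	/* intel_engine_cs.c, in the Geminilake workaround init; no official
	 * w/a name has been assigned yet, see the discussion above */
	ret = wa_ring_whitelist_reg(engine, GEN9_SLICE_COMMON_ECO_CHICKEN1);
	if (ret)
		return ret;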
Hi, I'm back!
This fails a selftest on glk as we can't even write to the register
0x731c, or at least can't read from the register.
Did bspec ever get updated to include this register & wa?
-Chris
Daniel Verkamp reported that the backport of 0d640732dbeb ("arm64: KVM: Skip
MMIO insn after emulation") to 4.4-stable has broken KVM on arm/arm64.
It turns out that the guest cannot make forward progress as soon as it hits
a device emulated by the host kernel, like the interrupt controller. The
reason for this is a set of missing dependencies from the 4.7 era. With
these patches added to 4.4.175, I'm able to boot guests normally.
Tested with both kvmtool and crosvm.
Christoffer Dall (1):
KVM: arm/arm64: Fix MMIO emulation data handling
Marc Zyngier (1):
arm/arm64: KVM: Feed initialized memory to MMIO accesses
arch/arm/kvm/mmio.c | 10 ++++++----
virt/kvm/arm/vgic.c | 7 -------
2 files changed, 6 insertions(+), 11 deletions(-)
--
2.20.1
If userspace doesn't enable universal planes, then we automatically
add the primary and cursor planes. But for universal userspace there's
no such check (and maybe we only want to give the lessee one plane,
maybe not even the primary one), hence we need to check for the
implied plane.
v2: don't forget setcrtc ioctl.
v3: Still allow disabling of the crtc in SETCRTC.
Cc: stable(a)vger.kernel.org
Cc: Keith Packard <keithp(a)keithp.com>
Signed-off-by: Daniel Vetter <daniel.vetter(a)intel.com>
---
drivers/gpu/drm/drm_crtc.c | 4 ++++
drivers/gpu/drm/drm_plane.c | 8 ++++++++
2 files changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c
index 7dabbaf033a1..790ba5941954 100644
--- a/drivers/gpu/drm/drm_crtc.c
+++ b/drivers/gpu/drm/drm_crtc.c
@@ -559,6 +559,10 @@ int drm_mode_setcrtc(struct drm_device *dev, void *data,
plane = crtc->primary;
+ /* allow disabling with the primary plane leased */
+ if (crtc_req->mode_valid && !drm_lease_held(file_priv, plane->base.id))
+ return -EACCES;
+
mutex_lock(&crtc->dev->mode_config.mutex);
DRM_MODESET_LOCK_ALL_BEGIN(dev, ctx,
DRM_MODESET_ACQUIRE_INTERRUPTIBLE, ret);
diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 4cfb56893b7f..d6ad60ab0d38 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -960,6 +960,11 @@ static int drm_mode_cursor_common(struct drm_device *dev,
if (ret)
goto out;
+ if (!drm_lease_held(file_priv, crtc->cursor->base.id)) {
+ ret = -EACCES;
+ goto out;
+ }
+
ret = drm_mode_cursor_universal(crtc, req, file_priv, &ctx);
goto out;
}
@@ -1062,6 +1067,9 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
plane = crtc->primary;
+ if (!drm_lease_held(file_priv, plane->base.id))
+ return -EACCES;
+
if (crtc->funcs->page_flip_target) {
u32 current_vblank;
int r;
--
2.14.4
If a PCA953x gpio was used as an interrupt and then released,
the shutdown function was trying to extract the pca953x_chip
pointer directly from the irq_data, but in reality was getting
the gpio_chip structure.
The net effect was that subsequent writes through that pointer corrupted
data in the gpio_chip structure, which wasn't immediately obvious until the
GPIO was used again later, at which point the kernel panicked.
This fix correctly extracts the pca953x_chip structure via the
gpio_chip structure, as is correctly done in the other irq
functions.
Fixes: 0a70fe00efea ("gpio: pca953x: Clear irq trigger type on irq shutdown")
Signed-off-by: Mark Walton <mark.walton(a)serialtek.com>
---
drivers/gpio/gpio-pca953x.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
index caf7dd1..6bd55a4 100644
--- a/drivers/gpio/gpio-pca953x.c
+++ b/drivers/gpio/gpio-pca953x.c
@@ -659,7 +659,8 @@ static int pca953x_irq_set_type(struct irq_data *d, unsigned int type)
static void pca953x_irq_shutdown(struct irq_data *d)
{
- struct pca953x_chip *chip = irq_data_get_irq_chip_data(d);
+ struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+ struct pca953x_chip *chip = gpiochip_get_data(gc);
u8 mask = 1 << (d->hwirq % BANK_SZ);
chip->irq_trig_raise[d->hwirq / BANK_SZ] &= ~mask;
--
2.7.4
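For context on why irq_data_get_irq_chip_data() yields a struct gpio_chip in
the pca953x fix above: the gpiolib irqchip helpers install the gpio_chip
itself as the per-IRQ chip data when mapping the interrupt. Roughly (a
paraphrased sketch, not the exact gpiolib code):

	static int gpiochip_irq_map(struct irq_domain *d, unsigned int irq,
				    irq_hw_number_t hwirq)
	{
		struct gpio_chip *chip = d->host_data;

		/* the gpio_chip, not the driver's private struct, becomes
		 * the irq chip data */
		irq_set_chip_data(irq, chip);
		/* ... irq chip, handler and flags setup elided ... */
		return 0;
	}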
Tree/Branch: v4.20.13
Git describe: v4.20.13
Commit: 0f7c162c1d Linux 4.20.13
Build Time: 130 min 1 sec
Passed: 11 / 11 (100.00 %)
Failed: 0 / 11 ( 0.00 %)
Errors: 0
Warnings: 5
Section Mismatches: 0
-------------------------------------------------------------------------------
defconfigs with issues (other than build errors):
1 warnings 0 mismatches : x86_64-allmodconfig
2 warnings 0 mismatches : arm64-allmodconfig
1 warnings 0 mismatches : arm-multi_v7_defconfig
4 warnings 0 mismatches : arm-allmodconfig
1 warnings 0 mismatches : arm64-defconfig
-------------------------------------------------------------------------------
Warnings Summary: 5
5 ../include/linux/spinlock.h:279:3: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
1 ../drivers/staging/erofs/unzip_vle.c:185:29: warning: array subscript is above array bounds [-Warray-bounds]
1 ../drivers/scsi/myrs.c:821:24: warning: 'sshdr.sense_key' may be used uninitialized in this function [-Wmaybe-uninitialized]
1 ../drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:216:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
1 ../drivers/isdn/hardware/eicon/message.c:5985:1: warning: the frame size of 2064 bytes is larger than 2048 bytes [-Wframe-larger-than=]
===============================================================================
Detailed per-defconfig build reports below:
-------------------------------------------------------------------------------
x86_64-allmodconfig : PASS, 0 errors, 1 warnings, 0 section mismatches
Warnings:
../include/linux/spinlock.h:279:3: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
-------------------------------------------------------------------------------
arm64-allmodconfig : PASS, 0 errors, 2 warnings, 0 section mismatches
Warnings:
../drivers/isdn/hardware/eicon/message.c:5985:1: warning: the frame size of 2064 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../include/linux/spinlock.h:279:3: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
-------------------------------------------------------------------------------
arm-multi_v7_defconfig : PASS, 0 errors, 1 warnings, 0 section mismatches
Warnings:
../include/linux/spinlock.h:279:3: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
-------------------------------------------------------------------------------
arm-allmodconfig : PASS, 0 errors, 4 warnings, 0 section mismatches
Warnings:
../drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:216:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
../drivers/scsi/myrs.c:821:24: warning: 'sshdr.sense_key' may be used uninitialized in this function [-Wmaybe-uninitialized]
../drivers/staging/erofs/unzip_vle.c:185:29: warning: array subscript is above array bounds [-Warray-bounds]
../include/linux/spinlock.h:279:3: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
-------------------------------------------------------------------------------
arm64-defconfig : PASS, 0 errors, 1 warnings, 0 section mismatches
Warnings:
../include/linux/spinlock.h:279:3: warning: 'flags' may be used uninitialized in this function [-Wmaybe-uninitialized]
-------------------------------------------------------------------------------
Passed with no errors, warnings or mismatches:
arm64-allnoconfig
arm-multi_v5_defconfig
x86_64-defconfig
arm-allnoconfig
x86_64-allnoconfig
arm-multi_v4t_defconfig
Hello,
In my testing of crosvm[1] with Linux 4.4.175, I am observing failures
on a 'kevin' Chromebook (RK3399) device - the guest kernel does not
even get to the point of printing its first messages, and the host
seems to be spinning at 100% CPU in KVM_RUN.
I narrowed this down to the 4.4 stable backport of "arm64: KVM: Skip
MMIO insn after emulation" - with this patch reverted, I can boot the
guest kernel as normal again.
Unfortunately, I am unable to easily test with a newer upstream kernel
(this board is using the Chrome OS kernel with many additional patches
applied on top of 4.4), so I'm not sure if this issue was introduced
in the mainline commit or only in the stable branch. Is it possible
that this patch has other dependencies that were missed in the
backport? It looks like it was part of a larger series, but only this
patch got pulled for 4.4 stable.
Thanks,
-- Daniel
[1]: https://chromium.googlesource.com/chromiumos/platform/crosvm/
This is a note to let you know that I've just added the patch titled
staging: erofs: fix mis-acted TAIL merging behavior
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From a112152f6f3a2a88caa6f414d540bd49e406af60 Mon Sep 17 00:00:00 2001
From: Gao Xiang <gaoxiang25(a)huawei.com>
Date: Wed, 27 Feb 2019 13:33:32 +0800
Subject: staging: erofs: fix mis-acted TAIL merging behavior
EROFS has an optimized path called TAIL merging, which is designed
to merge multiple reads and the corresponding decompressions into
one if these requests read contiguous pages at almost the same time.
In general, it behaves as follows:
________________________________________________________________
... | TAIL . HEAD | PAGE | PAGE | TAIL . HEAD | ...
_____|_combined page A_|________|________|_combined page B_|____
1 ] -> [ 2 ] -> [ 3
If the above three reads are requested in the order 1-2-3, it will
generate one large work chain rather than 3 individual work chains,
reducing scheduling overhead and boosting sequential read performance.
However, if Read 2 is processed slightly earlier than Read 1, it currently
still generates 2 individual work chains (chains 1 and 2) but does in-place
decompression for combined page A; if chain 2 then decompresses ahead of
chain 1, this races and leads to a corrupted decompressed page. This patch
fixes it.
Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
Cc: <stable(a)vger.kernel.org> # 4.19+
Signed-off-by: Gao Xiang <gaoxiang25(a)huawei.com>
Reviewed-by: Chao Yu <yuchao0(a)huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/erofs/unzip_vle.c | 70 ++++++++++++++++++++-----------
1 file changed, 45 insertions(+), 25 deletions(-)
diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 74c13b0a3d33..02f34a83147d 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -107,15 +107,30 @@ enum z_erofs_vle_work_role {
Z_EROFS_VLE_WORK_SECONDARY,
Z_EROFS_VLE_WORK_PRIMARY,
/*
- * The current work has at least been linked with the following
- * processed chained works, which means if the processing page
- * is the tail partial page of the work, the current work can
- * safely use the whole page, as illustrated below:
- * +--------------+-------------------------------------------+
- * | tail page | head page (of the previous work) |
- * +--------------+-------------------------------------------+
- * /\ which belongs to the current work
- * [ (*) this page can be used for the current work itself. ]
+ * The current work was the tail of an exist chain, and the previous
+ * processed chained works are all decided to be hooked up to it.
+ * A new chain should be created for the remaining unprocessed works,
+ * therefore different from Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED,
+ * the next work cannot reuse the whole page in the following scenario:
+ * ________________________________________________________________
+ * | tail (partial) page | head (partial) page |
+ * | (belongs to the next work) | (belongs to the current work) |
+ * |_______PRIMARY_FOLLOWED_______|________PRIMARY_HOOKED___________|
+ */
+ Z_EROFS_VLE_WORK_PRIMARY_HOOKED,
+ /*
+ * The current work has been linked with the processed chained works,
+ * and could be also linked with the potential remaining works, which
+ * means if the processing page is the tail partial page of the work,
+ * the current work can safely use the whole page (since the next work
+ * is under control) for in-place decompression, as illustrated below:
+ * ________________________________________________________________
+ * | tail (partial) page | head (partial) page |
+ * | (of the current work) | (of the previous work) |
+ * | PRIMARY_FOLLOWED or | |
+ * |_____PRIMARY_HOOKED____|____________PRIMARY_FOLLOWED____________|
+ *
+ * [ (*) the above page can be used for the current work itself. ]
*/
Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED,
Z_EROFS_VLE_WORK_MAX
@@ -309,10 +324,10 @@ static int z_erofs_vle_work_add_page(
return ret ? 0 : -EAGAIN;
}
-static inline bool try_to_claim_workgroup(
- struct z_erofs_vle_workgroup *grp,
- z_erofs_vle_owned_workgrp_t *owned_head,
- bool *hosted)
+static enum z_erofs_vle_work_role
+try_to_claim_workgroup(struct z_erofs_vle_workgroup *grp,
+ z_erofs_vle_owned_workgrp_t *owned_head,
+ bool *hosted)
{
DBG_BUGON(*hosted == true);
@@ -326,6 +341,9 @@ static inline bool try_to_claim_workgroup(
*owned_head = &grp->next;
*hosted = true;
+ /* lucky, I am the followee :) */
+ return Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED;
+
} else if (grp->next == Z_EROFS_VLE_WORKGRP_TAIL) {
/*
* type 2, link to the end of a existing open chain,
@@ -335,12 +353,11 @@ static inline bool try_to_claim_workgroup(
if (cmpxchg(&grp->next, Z_EROFS_VLE_WORKGRP_TAIL,
*owned_head) != Z_EROFS_VLE_WORKGRP_TAIL)
goto retry;
-
*owned_head = Z_EROFS_VLE_WORKGRP_TAIL;
- } else
- return false; /* :( better luck next time */
+ return Z_EROFS_VLE_WORK_PRIMARY_HOOKED;
+ }
- return true; /* lucky, I am the followee :) */
+ return Z_EROFS_VLE_WORK_PRIMARY; /* :( better luck next time */
}
struct z_erofs_vle_work_finder {
@@ -418,12 +435,9 @@ z_erofs_vle_work_lookup(const struct z_erofs_vle_work_finder *f)
*f->hosted = false;
if (!primary)
*f->role = Z_EROFS_VLE_WORK_SECONDARY;
- /* claim the workgroup if possible */
- else if (try_to_claim_workgroup(grp, f->owned_head, f->hosted))
- *f->role = Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED;
- else
- *f->role = Z_EROFS_VLE_WORK_PRIMARY;
-
+ else /* claim the workgroup if possible */
+ *f->role = try_to_claim_workgroup(grp, f->owned_head,
+ f->hosted);
return work;
}
@@ -487,6 +501,9 @@ z_erofs_vle_work_register(const struct z_erofs_vle_work_finder *f,
return work;
}
+#define builder_is_hooked(builder) \
+ ((builder)->role >= Z_EROFS_VLE_WORK_PRIMARY_HOOKED)
+
#define builder_is_followed(builder) \
((builder)->role >= Z_EROFS_VLE_WORK_PRIMARY_FOLLOWED)
@@ -680,7 +697,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
struct z_erofs_vle_work_builder *const builder = &fe->builder;
const loff_t offset = page_offset(page);
- bool tight = builder_is_followed(builder);
+ bool tight = builder_is_hooked(builder);
struct z_erofs_vle_work *work = builder->work;
enum z_erofs_cache_alloctype cache_strategy;
@@ -739,7 +756,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
map->m_plen / PAGE_SIZE,
cache_strategy, page_pool, GFP_KERNEL);
- tight &= builder_is_followed(builder);
+ tight &= builder_is_hooked(builder);
work = builder->work;
hitted:
cur = end - min_t(unsigned int, offset + end - map->m_la, end);
@@ -754,6 +771,9 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
(tight ? Z_EROFS_PAGE_TYPE_EXCLUSIVE :
Z_EROFS_VLE_PAGE_TYPE_TAIL_SHARED));
+ if (cur)
+ tight &= builder_is_followed(builder);
+
retry:
err = z_erofs_vle_work_add_page(builder, page, page_type);
/* should allocate an additional staging page for pagevec */
--
2.21.0
This is a note to let you know that I've just added the patch titled
staging: erofs: fix illegal address access under memory pressure
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From 1e5ceeab6929585512c63d05911d6657064abf7b Mon Sep 17 00:00:00 2001
From: Gao Xiang <gaoxiang25(a)huawei.com>
Date: Wed, 27 Feb 2019 13:33:31 +0800
Subject: staging: erofs: fix illegal address access under memory pressure
Consider a read request spanning two decompressed file pages: if a
decompression work cannot be started on the previous page due to memory
pressure, but the in-memory LTP map lookup has been done, builder->work
will still be NULL. Moreover, if the current page also belongs to the same
map, it won't try to start the decompression work again and then runs into
trouble.
This patch aims to solve the above issue with as few changes as possible
in order to make the fix easier to backport.
The kernel message is:
<4>[1051408.015930s]SLUB: Unable to allocate memory on node -1, gfp=0x2408040(GFP_NOFS|__GFP_ZERO)
<4>[1051408.015930s] cache: erofs_compress, object size: 144, buffer size: 144, default order: 0, min order: 0
<4>[1051408.015930s] node 0: slabs: 98, objs: 2744, free: 0
* Cannot allocate the decompression work
<3>[1051408.015960s]erofs: z_erofs_vle_normalaccess_readpages, readahead error at page 1008 of nid 5391488
* Note that the previous page failed to be read
<0>[1051408.015960s]Internal error: Accessing user space memory outside uaccess.h routines: 96000005 [#1] PREEMPT SMP
...
<4>[1051408.015991s]Hardware name: kirin710 (DT)
...
<4>[1051408.016021s]PC is at z_erofs_vle_work_add_page+0xa0/0x17c
<4>[1051408.016021s]LR is at z_erofs_do_read_page+0x12c/0xcf0
...
<4>[1051408.018096s][<ffffff80c6fb0fd4>] z_erofs_vle_work_add_page+0xa0/0x17c
<4>[1051408.018096s][<ffffff80c6fb3814>] z_erofs_vle_normalaccess_readpages+0x1a0/0x37c
<4>[1051408.018096s][<ffffff80c6d670b8>] read_pages+0x70/0x190
<4>[1051408.018127s][<ffffff80c6d6736c>] __do_page_cache_readahead+0x194/0x1a8
<4>[1051408.018127s][<ffffff80c6d59318>] filemap_fault+0x398/0x684
<4>[1051408.018127s][<ffffff80c6d8a9e0>] __do_fault+0x8c/0x138
<4>[1051408.018127s][<ffffff80c6d8f90c>] handle_pte_fault+0x730/0xb7c
<4>[1051408.018127s][<ffffff80c6d8fe04>] __handle_mm_fault+0xac/0xf4
<4>[1051408.018157s][<ffffff80c6d8fec8>] handle_mm_fault+0x7c/0x118
<4>[1051408.018157s][<ffffff80c8c52998>] do_page_fault+0x354/0x474
<4>[1051408.018157s][<ffffff80c8c52af8>] do_translation_fault+0x40/0x48
<4>[1051408.018157s][<ffffff80c6c002f4>] do_mem_abort+0x80/0x100
<4>[1051408.018310s]---[ end trace 9f4009a3283bd78b ]---
Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
Cc: <stable(a)vger.kernel.org> # 4.19+
Signed-off-by: Gao Xiang <gaoxiang25(a)huawei.com>
Reviewed-by: Chao Yu <yuchao0(a)huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/erofs/unzip_vle.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index 416dde4e8ea1..74c13b0a3d33 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -698,8 +698,12 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
/* lucky, within the range of the current map_blocks */
if (offset + cur >= map->m_la &&
- offset + cur < map->m_la + map->m_llen)
+ offset + cur < map->m_la + map->m_llen) {
+ /* didn't get a valid unzip work previously (very rare) */
+ if (!builder->work)
+ goto restart_now;
goto hitted;
+ }
/* go ahead the next map_blocks */
debugln("%s: [out-of-range] pos %llu", __func__, offset + cur);
@@ -713,6 +717,7 @@ static int z_erofs_do_read_page(struct z_erofs_vle_frontend *fe,
if (unlikely(err))
goto err_out;
+restart_now:
if (unlikely(!(map->m_flags & EROFS_MAP_MAPPED)))
goto hitted;
--
2.21.0
This is a note to let you know that I've just added the patch titled
staging: erofs: compressed_pages should not be accessed again after freed
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
From af692e117cb8cd9d3d844d413095775abc1217f9 Mon Sep 17 00:00:00 2001
From: Gao Xiang <gaoxiang25(a)huawei.com>
Date: Wed, 27 Feb 2019 13:33:30 +0800
Subject: staging: erofs: compressed_pages should not be accessed again after
freed
This patch resolves the following page use-after-free issue,
z_erofs_vle_unzip:
    ...
    for (i = 0; i < nr_pages; ++i) {
            ...
            z_erofs_onlinepage_endio(page);                        (1)
    }

    for (i = 0; i < clusterpages; ++i) {
            page = compressed_pages[i];

            if (page->mapping == mngda)                            (2)
                    continue;

            /* recycle all individual staging pages */
            (void)z_erofs_gather_if_stagingpage(page_pool, page);  (3)

            WRITE_ONCE(compressed_pages[i], NULL);
    }
    ...
After (1) is executed, the page is freed and could then be reused; if
compressed_pages is scanned after that, it could mistakenly fall into (2) or
(3) and things end up in a mess.
This patch aims to solve the above issue with as few changes as possible
in order to make the fix easier to backport.
Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
Cc: <stable(a)vger.kernel.org> # 4.19+
Signed-off-by: Gao Xiang <gaoxiang25(a)huawei.com>
Reviewed-by: Chao Yu <yuchao0(a)huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/erofs/unzip_vle.c | 38 ++++++++++++++-------------
drivers/staging/erofs/unzip_vle.h | 3 +--
drivers/staging/erofs/unzip_vle_lz4.c | 19 ++++++--------
3 files changed, 29 insertions(+), 31 deletions(-)
diff --git a/drivers/staging/erofs/unzip_vle.c b/drivers/staging/erofs/unzip_vle.c
index a127d8db76d8..416dde4e8ea1 100644
--- a/drivers/staging/erofs/unzip_vle.c
+++ b/drivers/staging/erofs/unzip_vle.c
@@ -986,11 +986,10 @@ static int z_erofs_vle_unzip(struct super_block *sb,
if (llen > grp->llen)
llen = grp->llen;
- err = z_erofs_vle_unzip_fast_percpu(compressed_pages,
- clusterpages, pages, llen, work->pageofs,
- z_erofs_onlinepage_endio);
+ err = z_erofs_vle_unzip_fast_percpu(compressed_pages, clusterpages,
+ pages, llen, work->pageofs);
if (err != -ENOTSUPP)
- goto out_percpu;
+ goto out;
if (sparsemem_pages >= nr_pages)
goto skip_allocpage;
@@ -1011,8 +1010,25 @@ static int z_erofs_vle_unzip(struct super_block *sb,
erofs_vunmap(vout, nr_pages);
out:
+ /* must handle all compressed pages before endding pages */
+ for (i = 0; i < clusterpages; ++i) {
+ page = compressed_pages[i];
+
+#ifdef EROFS_FS_HAS_MANAGED_CACHE
+ if (page->mapping == MNGD_MAPPING(sbi))
+ continue;
+#endif
+ /* recycle all individual staging pages */
+ (void)z_erofs_gather_if_stagingpage(page_pool, page);
+
+ WRITE_ONCE(compressed_pages[i], NULL);
+ }
+
for (i = 0; i < nr_pages; ++i) {
page = pages[i];
+ if (!page)
+ continue;
+
DBG_BUGON(!page->mapping);
/* recycle all individual staging pages */
@@ -1025,20 +1041,6 @@ static int z_erofs_vle_unzip(struct super_block *sb,
z_erofs_onlinepage_endio(page);
}
-out_percpu:
- for (i = 0; i < clusterpages; ++i) {
- page = compressed_pages[i];
-
-#ifdef EROFS_FS_HAS_MANAGED_CACHE
- if (page->mapping == MNGD_MAPPING(sbi))
- continue;
-#endif
- /* recycle all individual staging pages */
- (void)z_erofs_gather_if_stagingpage(page_pool, page);
-
- WRITE_ONCE(compressed_pages[i], NULL);
- }
-
if (pages == z_pagemap_global)
mutex_unlock(&z_pagemap_global_lock);
else if (unlikely(pages != pages_onstack))
diff --git a/drivers/staging/erofs/unzip_vle.h b/drivers/staging/erofs/unzip_vle.h
index 9dafa8dc712c..517e5ce8c5e9 100644
--- a/drivers/staging/erofs/unzip_vle.h
+++ b/drivers/staging/erofs/unzip_vle.h
@@ -218,8 +218,7 @@ int z_erofs_vle_plain_copy(struct page **compressed_pages,
int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages,
unsigned int clusterpages,
struct page **pages, unsigned int outlen,
- unsigned short pageofs,
- void (*endio)(struct page *));
+ unsigned short pageofs);
int z_erofs_vle_unzip_vmap(struct page **compressed_pages,
unsigned int clusterpages,
void *vaddr, unsigned int llen,
diff --git a/drivers/staging/erofs/unzip_vle_lz4.c b/drivers/staging/erofs/unzip_vle_lz4.c
index 8e8d705a6861..48b263a2731a 100644
--- a/drivers/staging/erofs/unzip_vle_lz4.c
+++ b/drivers/staging/erofs/unzip_vle_lz4.c
@@ -125,8 +125,7 @@ int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages,
unsigned int clusterpages,
struct page **pages,
unsigned int outlen,
- unsigned short pageofs,
- void (*endio)(struct page *))
+ unsigned short pageofs)
{
void *vin, *vout;
unsigned int nr_pages, i, j;
@@ -148,19 +147,16 @@ int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages,
ret = z_erofs_unzip_lz4(vin, vout + pageofs,
clusterpages * PAGE_SIZE, outlen);
- if (ret >= 0) {
- outlen = ret;
- ret = 0;
- }
+ if (ret < 0)
+ goto out;
+ ret = 0;
for (i = 0; i < nr_pages; ++i) {
j = min((unsigned int)PAGE_SIZE - pageofs, outlen);
if (pages[i]) {
- if (ret < 0) {
- SetPageError(pages[i]);
- } else if (clusterpages == 1 &&
- pages[i] == compressed_pages[0]) {
+ if (clusterpages == 1 &&
+ pages[i] == compressed_pages[0]) {
memcpy(vin + pageofs, vout + pageofs, j);
} else {
void *dst = kmap_atomic(pages[i]);
@@ -168,12 +164,13 @@ int z_erofs_vle_unzip_fast_percpu(struct page **compressed_pages,
memcpy(dst + pageofs, vout + pageofs, j);
kunmap_atomic(dst);
}
- endio(pages[i]);
}
vout += PAGE_SIZE;
outlen -= j;
pageofs = 0;
}
+
+out:
preempt_enable();
if (clusterpages == 1)
--
2.21.0
Commit 5ad7346b4ae2 ("cpufreq: kryo: Add module remove and exit") made
it possible to build the kryo cpufreq driver as a module, but it failed
to release all the resources, i.e. OPP tables, when the module is
unloaded.
This patch fixes it by releasing the OPP tables, calling
dev_pm_opp_put_supported_hw() for each of them from the
qcom_cpufreq_kryo_remove() routine. The array of pointers to the OPP
tables is also allocated dynamically now in qcom_cpufreq_kryo_probe(),
as the pointers will be required while releasing the resources.
Compile tested only.
Cc: 4.18+ <stable(a)vger.kernel.org> # v4.18+
Fixes: 5ad7346b4ae2 ("cpufreq: kryo: Add module remove and exit")
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
V1->V2:
- The previous version targeted a compile-time frame size issue, but this
  one fixes a bug in the driver and needs to be applied to 4.18+ kernels.
- This may go into the 5.0 release.
- Haven't included any Reviewed-by tags as there were many changes in
this version.
drivers/cpufreq/qcom-cpufreq-kryo.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/qcom-cpufreq-kryo.c b/drivers/cpufreq/qcom-cpufreq-kryo.c
index 2a3675c24032..65edbfd4d4a5 100644
--- a/drivers/cpufreq/qcom-cpufreq-kryo.c
+++ b/drivers/cpufreq/qcom-cpufreq-kryo.c
@@ -75,7 +75,7 @@ static enum _msm8996_version qcom_cpufreq_kryo_get_msm_id(void)
static int qcom_cpufreq_kryo_probe(struct platform_device *pdev)
{
- struct opp_table *opp_tables[NR_CPUS] = {0};
+ struct opp_table **opp_tables;
enum _msm8996_version msm8996_version;
struct nvmem_cell *speedbin_nvmem;
struct device_node *np;
@@ -133,6 +133,10 @@ static int qcom_cpufreq_kryo_probe(struct platform_device *pdev)
}
kfree(speedbin);
+ opp_tables = kcalloc(num_possible_cpus(), sizeof(*opp_tables), GFP_KERNEL);
+ if (!opp_tables)
+ return -ENOMEM;
+
for_each_possible_cpu(cpu) {
cpu_dev = get_cpu_device(cpu);
if (NULL == cpu_dev) {
@@ -151,8 +155,10 @@ static int qcom_cpufreq_kryo_probe(struct platform_device *pdev)
cpufreq_dt_pdev = platform_device_register_simple("cpufreq-dt", -1,
NULL, 0);
- if (!IS_ERR(cpufreq_dt_pdev))
+ if (!IS_ERR(cpufreq_dt_pdev)) {
+ platform_set_drvdata(pdev, opp_tables);
return 0;
+ }
ret = PTR_ERR(cpufreq_dt_pdev);
dev_err(cpu_dev, "Failed to register platform device\n");
@@ -163,13 +169,23 @@ static int qcom_cpufreq_kryo_probe(struct platform_device *pdev)
break;
dev_pm_opp_put_supported_hw(opp_tables[cpu]);
}
+ kfree(opp_tables);
return ret;
}
static int qcom_cpufreq_kryo_remove(struct platform_device *pdev)
{
+ struct opp_table **opp_tables = platform_get_drvdata(pdev);
+ unsigned cpu;
+
platform_device_unregister(cpufreq_dt_pdev);
+
+ for_each_possible_cpu(cpu)
+ dev_pm_opp_put_supported_hw(opp_tables[cpu]);
+
+ kfree(opp_tables);
+
return 0;
}
--
2.21.0.rc0.269.g1a574e7a288b
Hello,
I would like to request:
8cad443eacf6 ("net: stmmac: Fix reception of Broadcom switches tags")
(was first included in v4.16)
and
565020aaeebf ("net: stmmac: Disable ACS Feature for GMAC >= 4")
(was first included in v4.17)
to be backported to linux-stable/linux-4.14.y
Without these, packets will get stripped twice (corrupted)
when using stmmac with switches that use DSA tags.
Kind regards,
Niklas