From: Mark Zhang <markz(a)nvidia.com>
commit 7151449fe7fa5962c6153355f9779d6be99e8e97 upstream.
If the client has not provided a mask base register, do not write to
the mask register.
Signed-off-by: Laxman Dewangan <ldewangan(a)nvidia.com>
Signed-off-by: Jinyoung Park <jinyoungp(a)nvidia.com>
Signed-off-by: Venkat Reddy Talla <vreddytalla(a)nvidia.com>
Signed-off-by: Mark Zhang <markz(a)nvidia.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
---
This commit was found in an NVIDIA product tree based on
v4.19, and looks like definite stable material to me.
As far as I can tell, it should go into v4.19 only.
---
drivers/base/regmap/regmap-irq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/base/regmap/regmap-irq.c b/drivers/base/regmap/regmap-irq.c
index 429ca8ed7e51..982c7ac311b8 100644
--- a/drivers/base/regmap/regmap-irq.c
+++ b/drivers/base/regmap/regmap-irq.c
@@ -91,6 +91,9 @@ static void regmap_irq_sync_unlock(struct irq_data *data)
* suppress pointless writes.
*/
for (i = 0; i < d->chip->num_regs; i++) {
+ if (!d->chip->mask_base)
+ continue;
+
reg = d->chip->mask_base +
(i * map->reg_stride * d->irq_reg_stride);
if (d->chip->mask_invert) {
@@ -526,6 +529,9 @@ int regmap_add_irq_chip(struct regmap *map, int irq, int irq_flags,
/* Mask all the interrupts by default */
for (i = 0; i < chip->num_regs; i++) {
d->mask_buf[i] = d->mask_buf_def[i];
+ if (!chip->mask_base)
+ continue;
+
reg = chip->mask_base +
(i * map->reg_stride * d->irq_reg_stride);
if (chip->mask_invert)
--
2.20.1
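The guard added by the regmap-irq patch above can be illustrated outside
the kernel. The following is only a minimal user-space sketch: struct
fake_irq_chip and sync_mask_writes() are hypothetical names, not part of
regmap, and the sketch assumes (as the patch does) that a mask_base of 0
means "no mask register provided".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two regmap_irq_chip fields used here. */
struct fake_irq_chip {
	unsigned int mask_base;	/* 0 is treated as "not provided" */
	int num_regs;
};

/*
 * Walk the registers the way the sync loop does and count how many
 * mask writes would be issued. *last_reg receives the address of the
 * last register that would have been written.
 */
static int sync_mask_writes(const struct fake_irq_chip *chip,
			    unsigned int reg_stride,
			    unsigned int *last_reg)
{
	int writes = 0;

	for (int i = 0; i < chip->num_regs; i++) {
		if (!chip->mask_base)
			continue;	/* the guard added by the patch */

		*last_reg = chip->mask_base + (unsigned int)i * reg_stride;
		writes++;
	}
	return writes;
}
```

With mask_base left at 0 the loop issues no writes at all, where the
unguarded loop would compute register addresses starting at 0.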
The Raspberry Pi Compute Module 3 gathers components that are already
supported in 4.19.y kernels except there's no DTB for it. This small
series of patches backports:
1. the DTB addition on the arm platform
2. the extension of this addition to the arm64 platform
3. the correction of this extension.
I chose to backport patches 2 and 3 separately instead of squashing them
together, but I can resubmit with the two merged if that's
desirable.
This was successfully tested on bare metal, in 64-bit mode, with an
extra patch to the raspi3-firmware packages to get the DTB installed
in the right place, under the right name (bcm2710-rpi-cm3.dtb) so that
the bootloader finds it. The base kernel was 4.19.37-5 as packaged in
Debian.
The summary of changes follows.
Liviu Dudau (1):
arm64: dts: broadcom: Use the .dtb name in the rule, rather than .dts
Stefan Wahren (2):
ARM: dts: add Raspberry Pi Compute Module 3 and IO board
arm64: dts: broadcom: Add reference to Compute Module IO Board V3
arch/arm/boot/dts/Makefile | 1 +
arch/arm/boot/dts/bcm2837-rpi-cm3-io3.dts | 87 ++++++++++++++++++++++
arch/arm/boot/dts/bcm2837-rpi-cm3.dtsi | 52 +++++++++++++
arch/arm64/boot/dts/broadcom/Makefile | 3 +-
.../boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts | 2 +
5 files changed, 144 insertions(+), 1 deletion(-)
create mode 100644 arch/arm/boot/dts/bcm2837-rpi-cm3-io3.dts
create mode 100644 arch/arm/boot/dts/bcm2837-rpi-cm3.dtsi
create mode 100644 arch/arm64/boot/dts/broadcom/bcm2837-rpi-cm3-io3.dts
----- Original Message -----
> From: "Sasha Levin" <sashal(a)kernel.org>
> To: "Sasha Levin" <sashal(a)kernel.org>, "Ronnie Sahlberg" <lsahlber(a)redhat.com>, "linux-cifs"
> <linux-cifs(a)vger.kernel.org>
> Cc: "Steve French" <smfrench(a)gmail.com>, "Stable" <stable(a)vger.kernel.org>, stable(a)vger.kernel.org
> Sent: Tuesday, 16 July, 2019 11:27:10 AM
> Subject: Re: [PATCH] cifs: fix crash in smb2_compound_op()/smb2_set_next_command()
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a -stable tag.
> The stable tag indicates that it's relevant for the following trees: all
>
> The bot has tested the following trees: v5.2.1, v5.1.18, v4.19.59, v4.14.133,
> v4.9.185, v4.4.185.
>
> v5.2.1: Build OK!
> v5.1.18: Build OK!
> v4.19.59: Failed to apply! Possible dependencies:
> 271b9c0c8007 ("smb3: Fix rmdir compounding regression to strict servers")
> c2e0fe3f5aae ("cifs: make rmdir() use compounding")
> c5a5f38f075c ("cifs: add a smb2_compound_op and change QUERY_INFO to use
> it")
> dcbf91035709 ("cifs: change SMB2_OP_SET_INFO to use compounding")
> e77fe73c7e38 ("cifs: we can not use small padding iovs together with
> encryption")
> f5b05d622a3e ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
> f733e3936da4 ("cifs: change mkdir to use a compound")
> f7bfe04bf0db ("cifs: change SMB2_OP_SET_EOF to use compounding")
>
> v4.14.133: Failed to apply! Possible dependencies:
> 2e96467d9eb1 ("cifs: add pdu_size to the TCP_Server_Info structure")
> 3d4ef9a15343 ("smb3: fix redundant opens on root")
> 730928c8f4be ("cifs: update smb2_queryfs() to use compounding")
> 74dcf418fe34 ("CIFS: SMBD: Read correct returned data length for RDMA
> write (SMB read) I/O")
> 8ce79ec359ad ("cifs: update multiplex loop to handle compounded
> responses")
> 91cb74f5142c ("cifs: Change SMB2_open to return an iov for the error
> parameter")
> 93012bf98416 ("cifs: add server->vals->header_preamble_size")
> 9d874c36552a ("cifs: fix a buffer leak in smb2_query_symlink")
> c5a5f38f075c ("cifs: add a smb2_compound_op and change QUERY_INFO to use
> it")
> f5b05d622a3e ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
>
> v4.9.185: Failed to apply! Possible dependencies:
> 31473fc4f965 ("CIFS: Separate SMB2 header structure")
> 7fb8986e7449 ("CIFS: Add capability to transform requests before
> sending")
> 8ce79ec359ad ("cifs: update multiplex loop to handle compounded
> responses")
> 9bb17e0916a0 ("CIFS: Add transform header handling callbacks")
> b8f57ee8aad4 ("CIFS: Separate RFC1001 length processing for SMB2 read")
> da502f7df03d ("CIFS: Make SendReceive2() takes resp iov")
> ef65aaede23f ("smb2: Enforce sec= mount option")
> f5b05d622a3e ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
>
> v4.4.185: Failed to apply! Possible dependencies:
> 141891f4727c ("SMB3: Add mount parameter to allow user to override max
> credits")
> 166cea4dc3a4 ("SMB2: Separate RawNTLMSSP authentication from
> SMB2_sess_setup")
> 16c568efff82 ("cifs: merge the hash calculation helpers")
> 275516cdcfa4 ("Print IP address of unresponsive server")
> 31473fc4f965 ("CIFS: Separate SMB2 header structure")
> 373512ec5c10 ("Prepare for encryption support (first part). Add
> decryption and encryption key generation. Thanks to Metze for helping
> with this.")
> 3baf1a7b9215 ("SMB2: Separate Kerberos authentication from
> SMB2_sess_setup")
> 7fb8986e7449 ("CIFS: Add capability to transform requests before
> sending")
> 834170c85978 ("Enable previous version support")
> 8ce79ec359ad ("cifs: update multiplex loop to handle compounded
> responses")
> 9bb17e0916a0 ("CIFS: Add transform header handling callbacks")
> adfeb3e00e8e ("cifs: Make echo interval tunable")
> da502f7df03d ("CIFS: Make SendReceive2() takes resp iov")
> ef65aaede23f ("smb2: Enforce sec= mount option")
> f5b05d622a3e ("cifs: add IOCTL for QUERY_INFO passthrough to userspace")
>
>
> NOTE: The patch will not be queued to stable trees until it is upstream.
>
> How should we proceed with this patch?
So it applies cleanly in v5.2.1 and v5.1.18.
I think it is sufficient to get into those two versions then. It is very hard to trigger this issue.
> --
> Thanks,
> Sasha
>
When scsi_init_sense_cache(host) is called concurrently for different
hosts, each code path may see that the cache has not been created yet and
try to create a new one, so a previously created sense cache may be
overwritten and leaked.
Fix the issue by moving 'mutex_lock(&scsi_sense_cache_mutex)' before
scsi_select_sense_cache().
Fixes: 0a6ac4ee7c21 ("scsi: respect unchecked_isa_dma for blk-mq")
Cc: Stable <stable(a)vger.kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.com>
Cc: Ewan D. Milne <emilne(a)redhat.com>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
---
drivers/scsi/scsi_lib.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index e07a376a8c38..7493680ec104 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -72,11 +72,11 @@ int scsi_init_sense_cache(struct Scsi_Host *shost)
struct kmem_cache *cache;
int ret = 0;
+ mutex_lock(&scsi_sense_cache_mutex);
cache = scsi_select_sense_cache(shost->unchecked_isa_dma);
if (cache)
- return 0;
+ goto exit;
- mutex_lock(&scsi_sense_cache_mutex);
if (shost->unchecked_isa_dma) {
scsi_sense_isadma_cache =
kmem_cache_create("scsi_sense_cache(DMA)",
@@ -92,7 +92,7 @@ int scsi_init_sense_cache(struct Scsi_Host *shost)
if (!scsi_sense_cache)
ret = -ENOMEM;
}
-
+ exit:
mutex_unlock(&scsi_sense_cache_mutex);
return ret;
}
--
2.20.1
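The locking change in the scsi_lib.c patch above amounts to doing the
"does the cache exist?" check under the same mutex that protects its
creation. A minimal user-space sketch with POSIX threads follows; the
names init_sense_cache(), sense_cache and caches_created are
hypothetical stand-ins, and malloc() merely takes the place of
kmem_cache_create().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t cache_mutex = PTHREAD_MUTEX_INITIALIZER;
static void *sense_cache;	/* stand-in for scsi_sense_cache */
static int caches_created;	/* demonstration counter only */

/* Create the shared cache at most once, however many callers race. */
static int init_sense_cache(void)
{
	int ret = 0;

	/* Take the lock before looking at the pointer. */
	pthread_mutex_lock(&cache_mutex);
	if (sense_cache)
		goto exit;

	sense_cache = malloc(96);
	if (!sense_cache)
		ret = -1;
	else
		caches_created++;
exit:
	pthread_mutex_unlock(&cache_mutex);
	return ret;
}
```

Had the check been made before taking the lock, two callers could both
observe a NULL pointer and both allocate, with the second assignment
leaking the first allocation.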
When migrating an anonymous private page to a ZONE_DEVICE private page,
the source page->mapping and page->index fields are copied to the
destination ZONE_DEVICE struct page and the page_mapcount() is increased.
This is so rmap_walk() can be used to unmap and migrate the page back to
system memory. However, try_to_unmap_one() computes the subpage pointer
from a swap pte, which yields an invalid page pointer and results in a
kernel panic such as:
BUG: unable to handle page fault for address: ffffea1fffffffc8
Currently, only single pages can be migrated to device private memory so
no subpage computation is needed and it can be set to "page".
Fixes: a5430dda8a3a1c ("mm/migrate: support un-addressable ZONE_DEVICE page in migration")
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: "Jérôme Glisse" <jglisse(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/rmap.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/rmap.c b/mm/rmap.c
index e5dfe2ae6b0d..ec1af8b60423 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1476,6 +1476,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
* No need to invalidate here it will synchronize on
* against the special swap migration pte.
*/
+ subpage = page;
goto discard;
}
--
2.20.1
When a ZONE_DEVICE private page is freed, the page->mapping field can be
set. If this page is reused as an anonymous page, the previous value can
prevent the page from being inserted into the CPU's anon rmap table.
For example, when migrating a pte_none() page to device memory:
migrate_vma(ops, vma, start, end, src, dst, private)
migrate_vma_collect()
src[] = MIGRATE_PFN_MIGRATE
migrate_vma_prepare()
/* no page to lock or isolate so OK */
migrate_vma_unmap()
/* no page to unmap so OK */
ops->alloc_and_copy()
/* driver allocates ZONE_DEVICE page for dst[] */
migrate_vma_pages()
migrate_vma_insert_page()
page_add_new_anon_rmap()
__page_set_anon_rmap()
/* This check sees the page's stale mapping field */
if (PageAnon(page))
return
/* page->mapping is not updated */
The result is that the migration appears to succeed but a subsequent CPU
fault will be unable to migrate the page back to system memory or worse.
Clear the page->mapping field when freeing the ZONE_DEVICE page so stale
pointer data doesn't affect future page use.
Fixes: b7a523109fb5c9d2d6dd ("mm: don't clear ->mapping in hmm_devmem_free")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Jan Kara <jack(a)suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: "Jérôme Glisse" <jglisse(a)redhat.com>
---
kernel/memremap.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/memremap.c b/kernel/memremap.c
index bea6f887adad..238ae5d0ae8a 100644
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -408,6 +408,10 @@ void __put_devmap_managed_page(struct page *page)
mem_cgroup_uncharge(page);
+ /* Clear anonymous page mapping to prevent stale pointers */
+ if (is_device_private_page(page))
+ page->mapping = NULL;
+
page->pgmap->ops->page_free(page);
} else if (!count)
__put_page(page);
--
2.20.1
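The effect of a stale page->mapping described in the memremap.c patch
above can be modelled in a few lines. This is only a sketch under
simplifying assumptions: struct fake_page, set_anon_rmap() and
free_device_page() are hypothetical, and "mapping is non-NULL" stands in
for the kernel's PageAnon() test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for struct page. */
struct fake_page {
	void *mapping;
	bool device_private;
};

/* Mirrors the early return in __page_set_anon_rmap(). */
static bool set_anon_rmap(struct fake_page *page, void *anon_vma)
{
	if (page->mapping)	/* stale value looks like "already anon" */
		return false;	/* mapping is NOT updated */
	page->mapping = anon_vma;
	return true;
}

/* Free path with the fix: clear mapping for device private pages. */
static void free_device_page(struct fake_page *page)
{
	if (page->device_private)
		page->mapping = NULL;
}
```

A page reused without the clearing step keeps pointing at its previous
owner, so the new rmap entry is silently skipped.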
From: Fei Yang <fei.yang(a)intel.com>
If scatter-gather operation is allowed, a large USB request is split into
multiple TRBs. These TRBs are chained by setting the DWC3_TRB_CTRL_CHN bit
on every TRB except the last one, which has the DWC3_TRB_CTRL_IOC bit set
instead.
Since only the last TRB has IOC set for the whole USB request,
dwc3_gadget_ep_reclaim_completed_trb() gets called only once for the
request and all TRBs are supposed to be reclaimed. However, that is not
what happens with the current code.
This patch addresses the issue by checking req->num_pending_sgs. If the
number of pending sgs is not zero, update trb_dequeue and req->num_trbs
accordingly.
Signed-off-by: Fei Yang <fei.yang(a)intel.com>
Cc: stable <stable(a)vger.kernel.org>
---
drivers/usb/dwc3/gadget.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 173f532..4d5b4eb 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2394,8 +2394,14 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
if (event->status & DEPEVT_STATUS_SHORT && !chain)
return 1;
- if (event->status & DEPEVT_STATUS_IOC)
+ if (event->status & DEPEVT_STATUS_IOC) {
+ for (count = 0; count < req->num_pending_sgs; count++) {
+ dwc3_ep_inc_deq(dep);
+ req->num_trbs--;
+ }
+ req->num_pending_sgs = 0;
return 1;
+ }
return 0;
}
@@ -2404,7 +2410,7 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct dwc3_ep *dep,
struct dwc3_request *req, const struct dwc3_event_depevt *event,
int status)
{
- struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue];
+ struct dwc3_trb *trb;
struct scatterlist *sg = req->sg;
struct scatterlist *s;
unsigned int pending = req->num_pending_sgs;
--
2.7.4
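The bookkeeping that the dwc3 patch above adds on the IOC path can be
sketched in user space. Everything here is hypothetical and simplified:
struct fake_ep, struct fake_req and the 8-entry ring with plain modulo
wrap-around are illustration only and do not reproduce the driver's
real ring size or link-TRB handling.

```c
#include <assert.h>

#define FAKE_TRB_NUM 8u	/* assumed ring size for the sketch */

struct fake_ep {
	unsigned int trb_dequeue;
};

struct fake_req {
	unsigned int num_trbs;
	unsigned int num_pending_sgs;
};

/* Advance the dequeue index by one slot, wrapping at the ring end. */
static void fake_ep_inc_deq(struct fake_ep *dep)
{
	dep->trb_dequeue = (dep->trb_dequeue + 1) % FAKE_TRB_NUM;
}

/* On IOC, account for every TRB of the request that is still pending. */
static int reclaim_on_ioc(struct fake_ep *dep, struct fake_req *req)
{
	unsigned int count;

	for (count = 0; count < req->num_pending_sgs; count++) {
		fake_ep_inc_deq(dep);
		req->num_trbs--;
	}
	req->num_pending_sgs = 0;
	return 1;
}
```

Starting from dequeue index 6 with three pending sgs, the index wraps to
1 and the request's TRB count drops by three.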
From: Jan Harkes <jaharkes(a)cs.cmu.edu>
Subject: coda: pass the host file in vma->vm_file on mmap
Patch series "Coda updates".
The following patch series is a collection of various fixes for Coda, most
of which were collected from linux-fsdevel or linux-kernel but which have
as yet not found their way upstream.
This patch (of 22):
Various file systems expect that vma->vm_file points at their own file
handle; several use file_inode(vma->vm_file) to get at their inode or use
vma->vm_file->private_data. However, the way Coda wrapped mmap on a host
file broke this assumption: vm_file was still pointing at the Coda file,
and the host file systems would scribble over Coda's inode and private
file data.
This patch fixes the incorrect expectation and wraps vm_ops->open and
vm_ops->close to allow Coda to track when the vm_area_struct is destroyed
so we still release the reference on the Coda file handle at the right
time.
Link: http://lkml.kernel.org/r/0e850c6e59c0b147dc2dcd51a3af004c948c3697.155811738…
Signed-off-by: Jan Harkes <jaharkes(a)cs.cmu.edu>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Colin Ian King <colin.king(a)canonical.com>
Cc: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Fabian Frederick <fabf(a)skynet.be>
Cc: Mikko Rapeli <mikko.rapeli(a)iki.fi>
Cc: Sam Protsenko <semen.protsenko(a)linaro.org>
Cc: Yann Droneaud <ydroneaud(a)opteya.com>
Cc: Zhouyang Jia <jiazhouyang09(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/coda/file.c | 70 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 68 insertions(+), 2 deletions(-)
--- a/fs/coda/file.c~coda-pass-the-host-file-in-vma-vm_file-on-mmap
+++ a/fs/coda/file.c
@@ -27,6 +27,13 @@
#include "coda_linux.h"
#include "coda_int.h"
+struct coda_vm_ops {
+ atomic_t refcnt;
+ struct file *coda_file;
+ const struct vm_operations_struct *host_vm_ops;
+ struct vm_operations_struct vm_ops;
+};
+
static ssize_t
coda_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
@@ -61,6 +68,34 @@ coda_file_write_iter(struct kiocb *iocb,
return ret;
}
+static void
+coda_vm_open(struct vm_area_struct *vma)
+{
+ struct coda_vm_ops *cvm_ops =
+ container_of(vma->vm_ops, struct coda_vm_ops, vm_ops);
+
+ atomic_inc(&cvm_ops->refcnt);
+
+ if (cvm_ops->host_vm_ops && cvm_ops->host_vm_ops->open)
+ cvm_ops->host_vm_ops->open(vma);
+}
+
+static void
+coda_vm_close(struct vm_area_struct *vma)
+{
+ struct coda_vm_ops *cvm_ops =
+ container_of(vma->vm_ops, struct coda_vm_ops, vm_ops);
+
+ if (cvm_ops->host_vm_ops && cvm_ops->host_vm_ops->close)
+ cvm_ops->host_vm_ops->close(vma);
+
+ if (atomic_dec_and_test(&cvm_ops->refcnt)) {
+ vma->vm_ops = cvm_ops->host_vm_ops;
+ fput(cvm_ops->coda_file);
+ kfree(cvm_ops);
+ }
+}
+
static int
coda_file_mmap(struct file *coda_file, struct vm_area_struct *vma)
{
@@ -68,6 +103,8 @@ coda_file_mmap(struct file *coda_file, s
struct coda_inode_info *cii;
struct file *host_file;
struct inode *coda_inode, *host_inode;
+ struct coda_vm_ops *cvm_ops;
+ int ret;
cfi = CODA_FTOC(coda_file);
BUG_ON(!cfi || cfi->cfi_magic != CODA_MAGIC);
@@ -76,6 +113,13 @@ coda_file_mmap(struct file *coda_file, s
if (!host_file->f_op->mmap)
return -ENODEV;
+ if (WARN_ON(coda_file != vma->vm_file))
+ return -EIO;
+
+ cvm_ops = kmalloc(sizeof(struct coda_vm_ops), GFP_KERNEL);
+ if (!cvm_ops)
+ return -ENOMEM;
+
coda_inode = file_inode(coda_file);
host_inode = file_inode(host_file);
@@ -89,6 +133,7 @@ coda_file_mmap(struct file *coda_file, s
* the container file on us! */
else if (coda_inode->i_mapping != host_inode->i_mapping) {
spin_unlock(&cii->c_lock);
+ kfree(cvm_ops);
return -EBUSY;
}
@@ -97,7 +142,29 @@ coda_file_mmap(struct file *coda_file, s
cfi->cfi_mapcount++;
spin_unlock(&cii->c_lock);
- return call_mmap(host_file, vma);
+ vma->vm_file = get_file(host_file);
+ ret = call_mmap(vma->vm_file, vma);
+
+ if (ret) {
+ /* if call_mmap fails, our caller will put coda_file so we
+ * should drop the reference to the host_file that we got.
+ */
+ fput(host_file);
+ kfree(cvm_ops);
+ } else {
+ /* here we add redirects for the open/close vm_operations */
+ cvm_ops->host_vm_ops = vma->vm_ops;
+ if (vma->vm_ops)
+ cvm_ops->vm_ops = *vma->vm_ops;
+
+ cvm_ops->vm_ops.open = coda_vm_open;
+ cvm_ops->vm_ops.close = coda_vm_close;
+ cvm_ops->coda_file = coda_file;
+ atomic_set(&cvm_ops->refcnt, 1);
+
+ vma->vm_ops = &cvm_ops->vm_ops;
+ }
+ return ret;
}
int coda_open(struct inode *coda_inode, struct file *coda_file)
@@ -207,4 +274,3 @@ const struct file_operations coda_file_o
.fsync = coda_fsync,
.splice_read = generic_file_splice_read,
};
-
_
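The wrapping technique in the Coda patch above (embed a copy of the ops
table in a larger object, recover that object with container_of(), and
reference-count it through open/close) can be tried out in user space.
The sketch below is not the Coda code: all fake_* and wrap_* names are
hypothetical, a plain int replaces atomic_t, and setting *released
stands in for the fput() on the Coda file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct fake_vma;

struct fake_vm_ops {
	void (*open)(struct fake_vma *vma);
	void (*close)(struct fake_vma *vma);
};

struct fake_vma {
	const struct fake_vm_ops *vm_ops;
};

struct wrapped_ops {
	int refcnt;		/* not atomic: single-threaded sketch */
	int *released;		/* set to 1 when the last user goes away */
	struct fake_vm_ops ops;	/* table actually installed in the vma */
};

static void wrapped_open(struct fake_vma *vma)
{
	struct wrapped_ops *w =
		container_of(vma->vm_ops, struct wrapped_ops, ops);

	w->refcnt++;
}

static void wrapped_close(struct fake_vma *vma)
{
	struct wrapped_ops *w =
		container_of(vma->vm_ops, struct wrapped_ops, ops);

	if (--w->refcnt == 0) {
		vma->vm_ops = NULL;
		*w->released = 1;
		free(w);
	}
}

/* Install the wrapper on a vma; returns 0 on success, -1 on failure. */
static int wrap_install(struct fake_vma *vma, int *released)
{
	struct wrapped_ops *w = malloc(sizeof(*w));

	if (!w)
		return -1;
	w->refcnt = 1;
	w->released = released;
	w->ops.open = wrapped_open;
	w->ops.close = wrapped_close;
	vma->vm_ops = &w->ops;
	return 0;
}
```

Copying a vma and calling open on the copy bumps the count, so the
tracked resource is released only when the last copy is closed.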
From: Radoslaw Burny <rburny(a)google.com>
Subject: fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.
Normally, the inode's i_uid/i_gid are translated relative to s_user_ns,
but this is not correct behavior for proc. Since the sysctl permission
check in test_perm is done against GLOBAL_ROOT_[UG]ID, it makes more sense
to use these values in i_[ug]id of proc inodes. In other words: although
uid/gid in the inode is not read during test_perm, the inode logically
belongs to the root of the namespace. I have confirmed this with Eric
Biederman at LPC and in this thread:
https://lore.kernel.org/lkml/87k1kzjdff.fsf@xmission.com
Consequences
============
Since the i_[ug]id values of proc nodes are not used for permissions
checks, this change usually makes no functional difference. However, it
causes an issue in a setup where:
* a namespace container is created without a root user in the container,
hence the i_[ug]id of proc nodes are set to INVALID_[UG]ID
* container creator tries to configure it by writing /proc/sys files,
e.g. writing /proc/sys/kernel/shmmax to configure shared memory limit
The kernel does not allow an inode to be opened for writing if its
i_[ug]id are invalid, making it impossible to write shmmax and thus to
configure the container.
Using a container with no root mapping is apparently rare, but we do use
this configuration at Google. Also, we use a generic tool to configure
the container limits, and the inability to write any of them causes a
failure.
History
=======
The invalid uids/gids in inodes first appeared due to 81754357770e (fs:
Update i_[ug]id_(read|write) to translate relative to s_user_ns).
However, AFAIK, this did not immediately cause any issues. The inability
to write to these "invalid" inodes was only caused by a later commit
0bd23d09b874 (vfs: Don't modify inodes with a uid or gid unknown to the
vfs).
Tested: Used a repro program that creates a user namespace without any
mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside.
Before the change, it shows the overflow uid, with the change it's 0. The
overflow uid indicates that the uid in the inode is not correct and thus
it is not possible to open the file for writing.
Link: http://lkml.kernel.org/r/20190708115130.250149-1-rburny@google.com
Fixes: 0bd23d09b874 ("vfs: Don't modify inodes with a uid or gid unknown to the vfs")
Signed-off-by: Radoslaw Burny <rburny(a)google.com>
Acked-by: Luis Chamberlain <mcgrof(a)kernel.org>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: "Eric W . Biederman" <ebiederm(a)xmission.com>
Cc: Seth Forshee <seth.forshee(a)canonical.com>
Cc: John Sperbeck <jsperbeck(a)google.com>
Cc: Alexey Dobriyan <adobriyan(a)gmail.com>
Cc: <stable(a)vger.kernel.org> [4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/proc_sysctl.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/proc/proc_sysctl.c~fs-fix-the-default-values-of-i_uid-i_gid-on-proc-sys-inodes
+++ a/fs/proc/proc_sysctl.c
@@ -499,6 +499,10 @@ static struct inode *proc_sys_make_inode
if (root->set_ownership)
root->set_ownership(head, table, &inode->i_uid, &inode->i_gid);
+ else {
+ inode->i_uid = GLOBAL_ROOT_UID;
+ inode->i_gid = GLOBAL_ROOT_GID;
+ }
return inode;
}
_
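The fallback introduced by the proc_sysctl.c patch above follows a
common pattern: let an optional callback choose the owner, otherwise
apply a fixed default. A minimal user-space sketch follows; struct
fake_inode, init_owner(), sample_set_ownership() and the FAKE_* values
are all hypothetical and only mimic the shape of the kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_INVALID_ID	((unsigned int)-1)	/* stand-in for INVALID_[UG]ID */
#define FAKE_ROOT_ID	0u			/* stand-in for GLOBAL_ROOT_[UG]ID */

struct fake_inode {
	unsigned int uid;
	unsigned int gid;
};

typedef void (*set_ownership_fn)(unsigned int *uid, unsigned int *gid);

/* Example callback, as a sysctl root might provide one. */
static void sample_set_ownership(unsigned int *uid, unsigned int *gid)
{
	*uid = 1000;
	*gid = 1000;
}

/* Use the callback when present, else default to global root. */
static void init_owner(struct fake_inode *inode, set_ownership_fn fn)
{
	if (fn) {
		fn(&inode->uid, &inode->gid);
	} else {
		inode->uid = FAKE_ROOT_ID;
		inode->gid = FAKE_ROOT_ID;
	}
}
```

Without the else branch, an inode that starts out with invalid ids and
has no callback would keep them, which is the situation the patch
describes.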