This is the start of the stable review cycle for the 4.4.225 release. There are 65 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 28 May 2020 18:36:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.225-rc1...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.225-rc1
R. Parameswaran parameswaran.r7@gmail.com l2tp: device MTU setup, tunnel socket needs a lock
Christophe JAILLET christophe.jaillet@wanadoo.fr iio: sca3000: Remove an erroneous 'get_device()'
Alexander Usyskin alexander.usyskin@intel.com mei: release me_cl object reference
Dragos Bogdan dragos.bogdan@analog.com staging: iio: ad2s1210: Fix SPI reading
Bob Peterson rpeterso@redhat.com Revert "gfs2: Don't demote a glock until its revokes are written"
Guillaume Nault g.nault@alphalink.fr l2tp: initialise PPP sessions before registering them
Guillaume Nault g.nault@alphalink.fr l2tp: protect sock pointer of struct pppol2tp_session with RCU
Guillaume Nault g.nault@alphalink.fr l2tp: initialise l2tp_eth sessions before registering them
Guillaume Nault g.nault@alphalink.fr l2tp: don't register sessions in l2tp_session_create()
Guillaume Nault g.nault@alphalink.fr l2tp: fix l2tp_eth module loading
Guillaume Nault g.nault@alphalink.fr l2tp: pass tunnel pointer to ->session_create()
Guillaume Nault g.nault@alphalink.fr l2tp: prevent creation of sessions on terminated tunnels
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel used while creating sessions with netlink
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while handling genl TUNNEL_GET commands
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while handling genl tunnel updates
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while processing genl delete command
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while looking up sessions in l2tp_netlink
Guillaume Nault g.nault@alphalink.fr l2tp: initialise session's refcount before making it reachable
Guillaume Nault g.nault@alphalink.fr l2tp: define parameters of l2tp_tunnel_find*() as "const"
Guillaume Nault g.nault@alphalink.fr l2tp: define parameters of l2tp_session_get*() as "const"
Guillaume Nault g.nault@alphalink.fr l2tp: remove l2tp_session_find()
Guillaume Nault g.nault@alphalink.fr l2tp: remove useless duplicate session detection in l2tp_netlink
R. Parameswaran parameswaran.r7@gmail.com L2TP:Adjust intf MTU, add underlay L3, L2 hdrs.
R. Parameswaran parameswaran.r7@gmail.com New kernel function to get IP overhead on a socket.
Asbjørn Sloth Tønnesen asbjorn@asbjorn.st net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
Asbjørn Sloth Tønnesen asbjorn@asbjorn.st net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
Asbjørn Sloth Tønnesen asbjorn@asbjorn.st net: l2tp: export debug flags to UAPI
Guillaume Nault g.nault@alphalink.fr l2tp: don't use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6
Guillaume Nault g.nault@alphalink.fr l2tp: take a reference on sessions used in genetlink handlers
Guillaume Nault g.nault@alphalink.fr l2tp: hold session while sending creation notifications
Guillaume Nault g.nault@alphalink.fr l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind()
Guillaume Nault g.nault@alphalink.fr l2tp: lock socket before checking flags in connect()
Vishal Verma vishal.l.verma@intel.com libnvdimm/btt: Remove unnecessary code in btt_freelist_init
Colin Ian King colin.king@canonical.com platform/x86: alienware-wmi: fix kfree on potentially uninitialized pointer
Theodore Ts'o tytso@mit.edu ext4: lock the xattr block before checksuming it
Brent Lu brent.lu@intel.com ALSA: pcm: fix incorrect hw_base increase
Daniel Jordan daniel.m.jordan@oracle.com padata: purge get_cpu and reorder_via_wq from padata_do_serial
Daniel Jordan daniel.m.jordan@oracle.com padata: initialize pd->cpu with effective cpumask
Herbert Xu herbert@gondor.apana.org.au padata: Replace delayed timer with immediate workqueue in padata_reorder
Peter Zijlstra peterz@infradead.org sched/fair, cpumask: Export for_each_cpu_wrap()
Mathias Krause minipli@googlemail.com padata: set cpu_index of unused CPUs to -1
Kevin Hao haokexin@gmail.com i2c: dev: Fix the race between the release of i2c_dev and cdev
viresh kumar viresh.kumar@linaro.org i2c-dev: don't get i2c adapter via i2c_dev
Dan Carpenter dan.carpenter@oracle.com i2c: dev: use after free in detach
Wolfram Sang wsa@the-dreams.de i2c: dev: don't start function name with 'return'
Erico Nunes erico.nunes@datacom.ind.br i2c: dev: switch from register_chrdev to cdev API
Shuah Khan shuahkh@osg.samsung.com media: fix media devnode ioctl/syscall and unregister race
Shuah Khan shuahkh@osg.samsung.com media: fix use-after-free in cdev_put() when app exits after driver unbind
Mauro Carvalho Chehab mchehab@osg.samsung.com media-device: dynamically allocate struct media_devnode
Mauro Carvalho Chehab mchehab@osg.samsung.com media-devnode: fix namespace mess
Max Kellermann max@duempel.org media-devnode: add missing mutex lock in error handler
Max Kellermann max@duempel.org drivers/media/media-devnode: clear private_data before put_device()
Shuah Khan shuahkh@osg.samsung.com media: Fix media_open() to clear filp->private_data in error leg
Thomas Gleixner tglx@linutronix.de ARM: futex: Address build warning
Hans de Goede hdegoede@redhat.com platform/x86: asus-nb-wmi: Do not load on Asus T100TA and T200TA
Alan Stern stern@rowland.harvard.edu USB: core: Fix misleading driver bug report
Wu Bo wubo40@huawei.com ceph: fix double unlock in handle_cap_export()
Sebastian Reichel sebastian.reichel@collabora.com HID: multitouch: add eGalaxTouch P80H84 support
Al Viro viro@zeniv.linux.org.uk fix multiplication overflow in copy_fdtable()
Roberto Sassu roberto.sassu@huawei.com evm: Check also if *tfm is an error pointer in init_desc()
Mathias Krause minipli@googlemail.com padata: ensure padata_do_serial() runs on the correct CPU
Mathias Krause minipli@googlemail.com padata: ensure the reorder timer callback runs on the correct CPU
Jason A. Donenfeld Jason@zx2c4.com padata: get_next is never NULL
Tobias Klauser tklauser@distanz.ch padata: Remove unused but set variables
Cao jin caoj.fnst@cn.fujitsu.com igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr
-------------
Diffstat:
 Documentation/networking/l2tp.txt | 8 +-
 Makefile | 4 +-
 arch/arm/include/asm/futex.h | 9 +-
 drivers/hid/hid-ids.h | 1 +
 drivers/hid/hid-multitouch.c | 3 +
 drivers/i2c/i2c-dev.c | 60 +++---
 drivers/media/media-device.c | 43 +++--
 drivers/media/media-devnode.c | 168 +++++++++-------
 drivers/media/usb/uvc/uvc_driver.c | 2 +-
 drivers/misc/mei/client.c | 2 +
 drivers/net/ethernet/intel/igb/igb_main.c | 4 +-
 drivers/nvdimm/btt.c | 8 +-
 drivers/platform/x86/alienware-wmi.c | 17 +-
 drivers/platform/x86/asus-nb-wmi.c | 24 +++
 drivers/staging/iio/accel/sca3000_ring.c | 2 +-
 drivers/staging/iio/resolver/ad2s1210.c | 17 +-
 drivers/usb/core/message.c | 4 +-
 fs/ceph/caps.c | 1 +
 fs/ext4/xattr.c | 66 ++++---
 fs/file.c | 2 +-
 fs/gfs2/glock.c | 3 -
 include/linux/cpumask.h | 17 ++
 include/linux/net.h | 3 +
 include/linux/padata.h | 13 +-
 include/media/media-device.h | 5 +-
 include/media/media-devnode.h | 32 +++-
 include/net/ipv6.h | 2 +
 include/uapi/linux/if_pppol2tp.h | 13 +-
 include/uapi/linux/l2tp.h | 17 +-
 kernel/padata.c | 88 ++++-----
 lib/cpumask.c | 32 ++++
 net/ipv6/datagram.c | 4 +-
 net/l2tp/l2tp_core.c | 181 ++++++-----------
 net/l2tp/l2tp_core.h | 47 +++--
 net/l2tp/l2tp_eth.c | 216 +++++++++++++--------
 net/l2tp/l2tp_ip.c | 68 ++++---
 net/l2tp/l2tp_ip6.c | 82 ++++----
 net/l2tp/l2tp_netlink.c | 124 +++++++-----
 net/l2tp/l2tp_ppp.c | 309 ++++++++++++++++++------------
 net/socket.c | 46 +++++
 security/integrity/evm/evm_crypto.c | 2 +-
 sound/core/pcm_lib.c | 1 +
 42 files changed, 1014 insertions(+), 736 deletions(-)
From: Cao jin caoj.fnst@cn.fujitsu.com
commit 629823b872402451b42462414da08dddd0e2c93d upstream.
When running as a guest, under certain conditions, the kernel oopses as shown below. The writel() in igb_configure_tx_ring() causes the oops because hw->hw_addr is NULL. Other register accesses do not oops the kernel because they use wr32/rd32, which guard against a NULL pointer.
[ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal) error received: id=0101
[ 141.225523] igb 0000:01:00.1: PCIe Bus Error: severity=Uncorrected (Fatal), type=Unaccessible, id=0101(Unregistered Agent ID)
[ 141.299442] igb 0000:01:00.1: broadcast error_detected message
[ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now detached
[ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now detached
[ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
[ 143.465994] igb 0000:01:00.1: broadcast slot_reset message
[ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
[ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
[ 145.312078] igb 0000:01:00.1: broadcast resume message
[ 145.322211] BUG: unable to handle kernel paging request at 0000000000003818
[ 145.361275] IP: [<ffffffffa02fd38d>] igb_configure_tx_ring+0x14d/0x280 [igb]
[ 145.400048] PGD 0
[ 145.438007] Oops: 0002 [#1] SMP
A similar issue & solution could be found at: http://patchwork.ozlabs.org/patch/689592/
Signed-off-by: Cao jin caoj.fnst@cn.fujitsu.com
Acked-by: Alexander Duyck alexander.h.duyck@intel.com
Tested-by: Aaron Brown aaron.f.brown@intel.com
Signed-off-by: Jeff Kirsher jeffrey.t.kirsher@intel.com
Cc: Guenter Roeck linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3296,7 +3296,7 @@ void igb_configure_tx_ring(struct igb_ad
 	     tdba & 0x00000000ffffffffULL);
 	wr32(E1000_TDBAH(reg_idx), tdba >> 32);
 
-	ring->tail = hw->hw_addr + E1000_TDT(reg_idx);
+	ring->tail = adapter->io_addr + E1000_TDT(reg_idx);
 	wr32(E1000_TDH(reg_idx), 0);
 	writel(0, ring->tail);
 
@@ -3652,7 +3652,7 @@ void igb_configure_rx_ring(struct igb_ad
 	     ring->count * sizeof(union e1000_adv_rx_desc));
 
 	/* initialize head and tail */
-	ring->tail = hw->hw_addr + E1000_RDT(reg_idx);
+	ring->tail = adapter->io_addr + E1000_RDT(reg_idx);
 	wr32(E1000_RDH(reg_idx), 0);
 	writel(0, ring->tail);
 
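The faulting address in the oops, 0x3818, is what one gets by adding a register offset to a NULL base. A minimal userspace sketch of that arithmetic and of a wr32()-style guard follows. The names fake_hw, tail_address, guarded_write32 and TDT0_OFFSET are hypothetical, and the offset value is simply taken from the faulting address in the log; this is an illustration, not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offset, chosen to match the faulting address in the log. */
#define TDT0_OFFSET 0x3818u

/* Hypothetical stand-in for the driver's hardware struct. */
struct fake_hw {
	uint8_t *hw_addr;	/* assumed to be NULL once the device is detached */
};

/* Address a raw tail pointer would hold if it were computed from hw_addr. */
static uintptr_t tail_address(const struct fake_hw *hw, uint32_t offset)
{
	return (uintptr_t)hw->hw_addr + offset;
}

/*
 * Guarded write in the spirit of wr32(): skip the access when the base
 * is NULL. Returns 1 if the write happened, 0 if it was skipped.
 */
static int guarded_write32(struct fake_hw *hw, uint32_t offset, uint32_t val)
{
	uint8_t *base = hw->hw_addr;

	if (!base)
		return 0;
	*(volatile uint32_t *)(base + offset) = val;
	return 1;
}
```

With hw_addr == NULL, tail_address() yields 0x3818 and guarded_write32() skips the access, whereas an unguarded write through a tail pointer computed from that base would fault.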
From: Tobias Klauser tklauser@distanz.ch
commit 119a0798dc42ed4c4f96d39b8b676efcea73aec6 upstream.
Remove the unused but set variable pinst in padata_parallel_worker to fix the following warning when building with 'W=1':
kernel/padata.c: In function ‘padata_parallel_worker’:
kernel/padata.c:68:26: warning: variable ‘pinst’ set but not used [-Wunused-but-set-variable]
Also remove the now unused variable pd which is only used to set pinst.
Signed-off-by: Tobias Klauser tklauser@distanz.ch
Acked-by: Steffen Klassert steffen.klassert@secunet.com
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
Cc: Ben Hutchings ben@decadent.org.uk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 kernel/padata.c | 4 ----
 1 file changed, 4 deletions(-)

--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -65,15 +65,11 @@ static int padata_cpu_hash(struct parall
 static void padata_parallel_worker(struct work_struct *parallel_work)
 {
 	struct padata_parallel_queue *pqueue;
-	struct parallel_data *pd;
-	struct padata_instance *pinst;
 	LIST_HEAD(local_list);
 
 	local_bh_disable();
 	pqueue = container_of(parallel_work,
 			      struct padata_parallel_queue, work);
-	pd = pqueue->pd;
-	pinst = pd->pinst;
 
 	spin_lock(&pqueue->parallel.lock);
 	list_replace_init(&pqueue->parallel.list, &local_list);
From: Jason A. Donenfeld Jason@zx2c4.com
commit 69b348449bda0f9588737539cfe135774c9939a7 upstream.
Per Dan's static checker warning, the code that returns NULL was removed in 2010, so this patch updates the comments and fixes the code assumptions.
Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com
Reported-by: Dan Carpenter dan.carpenter@oracle.com
Acked-by: Steffen Klassert steffen.klassert@secunet.com
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
Cc: Ben Hutchings ben@decadent.org.uk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 kernel/padata.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -155,8 +155,6 @@ EXPORT_SYMBOL(padata_do_parallel);
  * A pointer to the control struct of the next object that needs
  * serialization, if present in one of the percpu reorder queues.
  *
- * NULL, if all percpu reorder queues are empty.
- *
  * -EINPROGRESS, if the next object that needs serialization will
  *  be parallel processed by another cpu and is not yet present in
  *  the cpu's reorder queue.
@@ -183,8 +181,6 @@ static struct padata_priv *padata_get_ne
 	cpu = padata_index_to_cpu(pd, next_index);
 	next_queue = per_cpu_ptr(pd->pqueue, cpu);
 
-	padata = NULL;
-
 	reorder = &next_queue->reorder;
 
 	spin_lock(&reorder->lock);
@@ -236,12 +232,11 @@ static void padata_reorder(struct parall
 		padata = padata_get_next(pd);
 
 		/*
-		 * All reorder queues are empty, or the next object that needs
-		 * serialization is parallel processed by another cpu and is
-		 * still on it's way to the cpu's reorder queue, nothing to
-		 * do for now.
+		 * If the next object that needs serialization is parallel
+		 * processed by another cpu and is still on it's way to the
+		 * cpu's reorder queue, nothing to do for now.
 		 */
-		if (!padata || PTR_ERR(padata) == -EINPROGRESS)
+		if (PTR_ERR(padata) == -EINPROGRESS)
 			break;
 
 		/*
From: Mathias Krause minipli@googlemail.com
commit cf5868c8a22dc2854b96e9569064bb92365549ca upstream.
The reorder timer function runs on the CPU where the timer interrupt was handled which is not necessarily one of the CPUs of the 'pcpu' CPU mask set.
Ensure the padata_reorder() callback runs on the correct CPU, which is one in the 'pcpu' CPU mask set and, preferably, the next expected one. Do so by comparing the current CPU with the expected target CPU. If they match, call padata_reorder() right away. If they differ, schedule a work item on the target CPU that does the padata_reorder() call for us.
Signed-off-by: Mathias Krause minipli@googlemail.com
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
Cc: Ben Hutchings ben@decadent.org.uk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/padata.h | 2 ++
 kernel/padata.c | 43 ++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 44 insertions(+), 1 deletion(-)

--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -85,6 +85,7 @@ struct padata_serial_queue {
  * @swork: work struct for serialization.
  * @pd: Backpointer to the internal control structure.
  * @work: work struct for parallelization.
+ * @reorder_work: work struct for reordering.
  * @num_obj: Number of objects that are processed by this cpu.
  * @cpu_index: Index of the cpu.
  */
@@ -93,6 +94,7 @@ struct padata_parallel_queue {
 	struct padata_list reorder;
 	struct parallel_data *pd;
 	struct work_struct work;
+	struct work_struct reorder_work;
 	atomic_t num_obj;
 	int cpu_index;
 };
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -281,11 +281,51 @@ static void padata_reorder(struct parall
 	return;
 }
 
+static void invoke_padata_reorder(struct work_struct *work)
+{
+	struct padata_parallel_queue *pqueue;
+	struct parallel_data *pd;
+
+	local_bh_disable();
+	pqueue = container_of(work, struct padata_parallel_queue, reorder_work);
+	pd = pqueue->pd;
+	padata_reorder(pd);
+	local_bh_enable();
+}
+
 static void padata_reorder_timer(unsigned long arg)
 {
 	struct parallel_data *pd = (struct parallel_data *)arg;
+	unsigned int weight;
+	int target_cpu, cpu;
 
-	padata_reorder(pd);
+	cpu = get_cpu();
+
+	/* We don't lock pd here to not interfere with parallel processing
+	 * padata_reorder() calls on other CPUs. We just need any CPU out of
+	 * the cpumask.pcpu set. It would be nice if it's the right one but
+	 * it doesn't matter if we're off to the next one by using an outdated
+	 * pd->processed value.
+	 */
+	weight = cpumask_weight(pd->cpumask.pcpu);
+	target_cpu = padata_index_to_cpu(pd, pd->processed % weight);
+
+	/* ensure to call the reorder callback on the correct CPU */
+	if (cpu != target_cpu) {
+		struct padata_parallel_queue *pqueue;
+		struct padata_instance *pinst;
+
+		/* The timer function is serialized wrt itself -- no locking
+		 * needed.
+		 */
+		pinst = pd->pinst;
+		pqueue = per_cpu_ptr(pd->pqueue, target_cpu);
+		queue_work_on(target_cpu, pinst->wq, &pqueue->reorder_work);
+	} else {
+		padata_reorder(pd);
+	}
+
+	put_cpu();
 }
 
 static void padata_serial_worker(struct work_struct *serial_work)
@@ -412,6 +452,7 @@ static void padata_init_pqueues(struct p
 		__padata_list_init(&pqueue->reorder);
 		__padata_list_init(&pqueue->parallel);
 		INIT_WORK(&pqueue->work, padata_parallel_worker);
+		INIT_WORK(&pqueue->reorder_work, invoke_padata_reorder);
 		atomic_set(&pqueue->num_obj, 0);
 	}
 }
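The target CPU selection in the timer callback (take the processed counter modulo the number of CPUs in the mask, then map that index to a CPU number) can be sketched in userspace as follows. This assumes a toy 32-bit mask in place of a real cpumask; toy_weight, toy_index_to_cpu and toy_reorder_target are hypothetical names, not kernel functions.

```c
#include <stdint.h>

/* Number of CPUs present in a toy 32-bit cpumask. */
static unsigned int toy_weight(uint32_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Return the CPU number of the index-th set bit, or -1 if there is none. */
static int toy_index_to_cpu(uint32_t mask, unsigned int index)
{
	int cpu;

	for (cpu = 0; cpu < 32; cpu++) {
		if (!(mask & (UINT32_C(1) << cpu)))
			continue;
		if (index-- == 0)
			return cpu;
	}
	return -1;
}

/* CPU expected to handle the next object, given the processed counter. */
static int toy_reorder_target(uint32_t pcpu_mask, unsigned int processed)
{
	unsigned int weight = toy_weight(pcpu_mask);

	if (!weight)
		return -1;	/* empty mask: nothing to pick */
	return toy_index_to_cpu(pcpu_mask, processed % weight);
}
```

For a mask containing CPUs 2, 3 and 5, a processed count of 4 maps to index 1 and therefore to CPU 3. A timer firing on any other CPU would hand the reorder work to that CPU instead of running it locally.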
From: Mathias Krause minipli@googlemail.com
commit 350ef88e7e922354f82a931897ad4a4ce6c686ff upstream.
If the algorithm we're parallelizing is asynchronous we might change CPUs between padata_do_parallel() and padata_do_serial(). However, we don't expect this to happen as we need to enqueue the padata object into the per-cpu reorder queue we took it from, i.e. the same-cpu's parallel queue.
Ensure we're not switching CPUs for a given padata object by tracking the CPU within the padata object. If the serial callback gets called on the wrong CPU, defer invoking padata_reorder() via a kernel worker on the CPU we're expected to run on.
Signed-off-by: Mathias Krause minipli@googlemail.com
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
Cc: Ben Hutchings ben@decadent.org.uk
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 include/linux/padata.h | 2 ++
 kernel/padata.c | 20 +++++++++++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -37,6 +37,7 @@
  * @list: List entry, to attach to the padata lists.
  * @pd: Pointer to the internal control structure.
  * @cb_cpu: Callback cpu for serializatioon.
+ * @cpu: Cpu for parallelization.
  * @seq_nr: Sequence number of the parallelized data object.
  * @info: Used to pass information from the parallel to the serial function.
  * @parallel: Parallel execution function.
@@ -46,6 +47,7 @@ struct padata_priv {
 	struct list_head list;
 	struct parallel_data *pd;
 	int cb_cpu;
+	int cpu;
 	int info;
 	void (*parallel)(struct padata_priv *padata);
 	void (*serial)(struct padata_priv *padata);
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -132,6 +132,7 @@ int padata_do_parallel(struct padata_ins
 	padata->cb_cpu = cb_cpu;
 
 	target_cpu = padata_cpu_hash(pd);
+	padata->cpu = target_cpu;
 	queue = per_cpu_ptr(pd->pqueue, target_cpu);
 
 	spin_lock(&queue->parallel.lock);
@@ -375,10 +376,21 @@ void padata_do_serial(struct padata_priv
 	int cpu;
 	struct padata_parallel_queue *pqueue;
 	struct parallel_data *pd;
+	int reorder_via_wq = 0;
 
 	pd = padata->pd;
 
 	cpu = get_cpu();
+
+	/* We need to run on the same CPU padata_do_parallel(.., padata, ..)
+	 * was called on -- or, at least, enqueue the padata object into the
+	 * correct per-cpu queue.
+	 */
+	if (cpu != padata->cpu) {
+		reorder_via_wq = 1;
+		cpu = padata->cpu;
+	}
+
 	pqueue = per_cpu_ptr(pd->pqueue, cpu);
 
 	spin_lock(&pqueue->reorder.lock);
@@ -395,7 +407,13 @@ void padata_do_serial(struct padata_priv
 
 	put_cpu();
 
-	padata_reorder(pd);
+	/* If we're running on the wrong CPU, call padata_reorder() via a
+	 * kernel worker.
+	 */
+	if (reorder_via_wq)
+		queue_work_on(cpu, pd->pinst->wq, &pqueue->reorder_work);
+	else
+		padata_reorder(pd);
 }
 EXPORT_SYMBOL(padata_do_serial);
 
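The decision this patch adds to padata_do_serial() boils down to two outputs: which per-cpu queue receives the object, and whether reordering is deferred to a worker. A rough userspace sketch, with hypothetical types toy_padata and toy_decision standing in for the real structures:

```c
/* Hypothetical, trimmed-down stand-in for struct padata_priv. */
struct toy_padata {
	int cpu;	/* CPU recorded when the object was queued for parallel work */
};

struct toy_decision {
	int queue_cpu;		/* per-cpu reorder queue the object goes into */
	int reorder_via_wq;	/* 1: defer reordering to a worker on queue_cpu */
};

/* Mirror of the patch's logic: always enqueue on the recorded CPU. */
static struct toy_decision toy_do_serial(const struct toy_padata *p,
					 int current_cpu)
{
	struct toy_decision d;

	d.queue_cpu = p->cpu;
	d.reorder_via_wq = (current_cpu != p->cpu);
	return d;
}
```

The object always lands in the queue of the CPU it was parallelized on; only the place where padata_reorder() runs depends on the CPU the serial callback happened to be invoked on.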
From: Roberto Sassu roberto.sassu@huawei.com
[ Upstream commit 53de3b080d5eae31d0de219617155dcc34e7d698 ]
This patch avoids a kernel panic due to accessing an error pointer set by crypto_alloc_shash(). It occurs especially when there are many files that require an unsupported algorithm, as it would increase the likelihood of the following race condition:
Task A: *tfm = crypto_alloc_shash() <= error pointer
Task B: if (*tfm == NULL) <= *tfm is not NULL, use it
Task B: rc = crypto_shash_init(desc) <= panic
Task A: *tfm = NULL
This patch uses the IS_ERR_OR_NULL macro to determine whether or not a new crypto context must be created.
Cc: stable@vger.kernel.org
Fixes: d46eb3699502b ("evm: crypto hash replaced by shash")
Co-developed-by: Krzysztof Struczynski krzysztof.struczynski@huawei.com
Signed-off-by: Krzysztof Struczynski krzysztof.struczynski@huawei.com
Signed-off-by: Roberto Sassu roberto.sassu@huawei.com
Signed-off-by: Mimi Zohar zohar@linux.ibm.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 security/integrity/evm/evm_crypto.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/integrity/evm/evm_crypto.c b/security/integrity/evm/evm_crypto.c
index 461f8d891579..44352b0b7510 100644
--- a/security/integrity/evm/evm_crypto.c
+++ b/security/integrity/evm/evm_crypto.c
@@ -47,7 +47,7 @@ static struct shash_desc *init_desc(char type)
 		algo = evm_hash;
 	}
 
-	if (*tfm == NULL) {
+	if (IS_ERR_OR_NULL(*tfm)) {
 		mutex_lock(&mutex);
 		if (*tfm)
 			goto out;
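To see why the NULL check alone is not enough, here is a small userspace sketch of the error-pointer convention. The helpers are re-created locally for illustration (the kernel's own live in include/linux/err.h), and needs_alloc_old / needs_alloc_new are hypothetical names for the old and new conditions.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel helpers, for illustration only. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error pointers occupy the top MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Old check: only a NULL pointer triggers (re)allocation. */
static int needs_alloc_old(const void *tfm)
{
	return tfm == NULL;
}

/* New check: an error pointer left behind by another task also does. */
static int needs_alloc_new(const void *tfm)
{
	return IS_ERR_OR_NULL(tfm);
}
```

An error pointer such as ERR_PTR(-2) is non-NULL, so the old condition treats it as a usable crypto context, while the new one does not.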
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 4e89b7210403fa4a8acafe7c602b6212b7af6c3b ]
cpy and set really should be size_t; we won't get an overflow on that, since sysctl_nr_open can't be set above ~(size_t)0 / sizeof(void *), so nr that would've managed to overflow size_t on that multiplication won't get anywhere near copy_fdtable() - we'll fail with EMFILE before that.
Cc: stable@kernel.org # v2.6.25+
Fixes: 9cfe015aa424 (get rid of NR_OPEN and introduce a sysctl_nr_open)
Reported-by: Thiago Macieira thiago.macieira@intel.com
Signed-off-by: Al Viro viro@zeniv.linux.org.uk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index 7e9eb65a2912..090015401c55 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -88,7 +88,7 @@ static void copy_fd_bitmaps(struct fdtable *nfdt, struct fdtable *ofdt,
  */
 static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
 {
-	unsigned int cpy, set;
+	size_t cpy, set;
 
 	BUG_ON(nfdt->max_fds < ofdt->max_fds);
 
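The effect of holding a byte count in a 32-bit variable can be shown with plain arithmetic. The sketch below assumes a 64-bit system with 8-byte pointers and uses fixed-width types so the result does not depend on the build host; bytes_as_u32 and bytes_as_u64 are hypothetical helper names.

```c
#include <stdint.h>

/* Byte count as a 32-bit variable would hold it: silently truncated. */
static uint32_t bytes_as_u32(uint64_t nr_fds, uint64_t entry_size)
{
	return (uint32_t)(nr_fds * entry_size);
}

/* Byte count as a 64-bit size_t would hold it. */
static uint64_t bytes_as_u64(uint64_t nr_fds, uint64_t entry_size)
{
	return nr_fds * entry_size;
}
```

With 2^29 descriptors and 8-byte entries the product is 2^32: the 32-bit version wraps to 0, so a copy sized by it would move nothing, while the 64-bit version keeps the full 4 GiB value.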
From: Sebastian Reichel sebastian.reichel@collabora.com
[ Upstream commit f9e82295eec141a0569649d400d249333d74aa91 ]
Add support for P80H84 touchscreen from eGalaxy:
idVendor 0x0eef D-WAV Scientific Co., Ltd
idProduct 0xc002
iManufacturer 1 eGalax Inc.
iProduct 2 eGalaxTouch P80H84 2019 vDIVA_1204_T01 k4.02.146

Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com
Signed-off-by: Jiri Kosina jkosina@suse.cz
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/hid/hid-ids.h | 1 +
 drivers/hid/hid-multitouch.c | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index e1807296a1a0..33d2b5948d7f 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -319,6 +319,7 @@
 #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_7349	0x7349
 #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_73F7	0x73f7
 #define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_A001	0xa001
+#define USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_C002	0xc002
 
 #define USB_VENDOR_ID_ELAN		0x04f3
 
diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
index 9de379c1b3fd..56c4a81d3ea2 100644
--- a/drivers/hid/hid-multitouch.c
+++ b/drivers/hid/hid-multitouch.c
@@ -1300,6 +1300,9 @@ static const struct hid_device_id mt_devices[] = {
 	{ .driver_data = MT_CLS_EGALAX_SERIAL,
 		MT_USB_DEVICE(USB_VENDOR_ID_DWAV,
 			USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_A001) },
+	{ .driver_data = MT_CLS_EGALAX,
+		MT_USB_DEVICE(USB_VENDOR_ID_DWAV,
+			USB_DEVICE_ID_DWAV_EGALAX_MULTITOUCH_C002) },
 
 	/* Elitegroup panel */
 	{ .driver_data = MT_CLS_SERIAL,
From: Wu Bo wubo40@huawei.com
[ Upstream commit 4d8e28ff3106b093d98bfd2eceb9b430c70a8758 ]
If ceph_mdsc_open_export_target_session() fails, the code does a "goto retry", but the session mutex has already been unlocked. Re-lock the mutex in that case to ensure that we don't unlock it twice.
Signed-off-by: Wu Bo wubo40@huawei.com
Reviewed-by: "Yan, Zheng" zyan@redhat.com
Signed-off-by: Ilya Dryomov idryomov@gmail.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/ceph/caps.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index efdf81ea3b5f..3d0497421e62 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3293,6 +3293,7 @@ retry:
 			WARN_ON(1);
 			tsession = NULL;
 			target = -1;
+			mutex_lock(&session->s_mutex);
 		}
 		goto retry;
 
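The double unlock can be modelled with a toy lock that counts unlocks performed while it is not held. This is a heavily simplified model of the retry loop under the assumption that each pass drops the mutex once before trying to open the target session; toy_mutex and bad_unlocks_after_failed_open are hypothetical and do not reflect the real control flow of handle_cap_export() in detail.

```c
/* Toy mutex that records unlocks performed while it is not held. */
struct toy_mutex {
	int held;
	int bad_unlocks;
};

static void toy_lock(struct toy_mutex *m)
{
	m->held = 1;
}

static void toy_unlock(struct toy_mutex *m)
{
	if (!m->held)
		m->bad_unlocks++;
	m->held = 0;
}

/*
 * First open attempt fails and the loop retries once. Returns the number
 * of unlocks that hit an already unlocked mutex.
 */
static int bad_unlocks_after_failed_open(int relock_on_failure)
{
	struct toy_mutex m = { 0, 0 };
	int open_failed = 1;

	toy_lock(&m);			/* entered with the session mutex held */
	for (;;) {			/* "retry:" */
		toy_unlock(&m);		/* dropped before opening the target */
		if (!open_failed)
			break;
		open_failed = 0;	/* succeed on the second pass */
		if (relock_on_failure)
			toy_lock(&m);	/* the one-line fix */
	}
	return m.bad_unlocks;
}
```

Without the re-lock the second pass unlocks a mutex that is no longer held; with it, every unlock is paired with a lock.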
From: Alan Stern stern@rowland.harvard.edu
[ Upstream commit ac854131d9844f79e2fdcef67a7707227538d78a ]
The syzbot fuzzer found a race between URB submission to endpoint 0 and device reset. Namely, during the reset we call usb_ep0_reinit() because the characteristics of ep0 may have changed (if the reset follows a firmware update, for example). While usb_ep0_reinit() is running there is a brief period during which the pointers stored in udev->ep_in[0] and udev->ep_out[0] are set to NULL, and if an URB is submitted to ep0 during that period, usb_urb_ep_type_check() will report it as a driver bug. In the absence of those pointers, the routine thinks that the endpoint doesn't exist. The log message looks like this:
------------[ cut here ]------------
usb 2-1: BOGUS urb xfer, pipe 2 != type 2
WARNING: CPU: 0 PID: 9241 at drivers/usb/core/urb.c:478 usb_submit_urb+0x1188/0x1460 drivers/usb/core/urb.c:478
Now, although submitting an URB while the device is being reset is a questionable thing to do, it shouldn't count as a driver bug as severe as submitting an URB for an endpoint that doesn't exist. Indeed, endpoint 0 always exists, even while the device is in its unconfigured state.
To prevent these misleading driver bug reports, this patch updates usb_disable_endpoint() to avoid clearing the ep_in[] and ep_out[] pointers when the endpoint being disabled is ep0. There's no danger of leaving a stale pointer in place, because the usb_host_endpoint structure being pointed to is stored permanently in udev->ep0; it doesn't get deallocated until the entire usb_device structure does.
Reported-and-tested-by: syzbot+db339689b2101f6f6071@syzkaller.appspotmail.com
Signed-off-by: Alan Stern stern@rowland.harvard.edu

Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.2005011558590.903-100000@netrider....
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/usb/core/message.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/core/message.c b/drivers/usb/core/message.c
index 747343c61398..f083ecfddd1b 100644
--- a/drivers/usb/core/message.c
+++ b/drivers/usb/core/message.c
@@ -1080,11 +1080,11 @@ void usb_disable_endpoint(struct usb_device *dev, unsigned int epaddr,
 
 	if (usb_endpoint_out(epaddr)) {
 		ep = dev->ep_out[epnum];
-		if (reset_hardware)
+		if (reset_hardware && epnum != 0)
 			dev->ep_out[epnum] = NULL;
 	} else {
 		ep = dev->ep_in[epnum];
-		if (reset_hardware)
+		if (reset_hardware && epnum != 0)
 			dev->ep_in[epnum] = NULL;
 	}
 	if (ep) {
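A stripped-down model of the changed condition, assuming hypothetical toy_device and toy_endpoint types with 16 endpoint slots per direction and a caller that passes an endpoint number below 16:

```c
#include <stddef.h>

struct toy_endpoint {
	int number;
};

struct toy_device {
	struct toy_endpoint ep0;		/* lives as long as the device */
	struct toy_endpoint *ep_in[16];
	struct toy_endpoint *ep_out[16];
};

/*
 * Returns the endpoint that was in the slot. The slot is cleared on a
 * hardware reset for every endpoint except ep0, whose pointer stays valid.
 */
static struct toy_endpoint *toy_disable_endpoint(struct toy_device *dev,
						 unsigned int epnum,
						 int is_out,
						 int reset_hardware)
{
	struct toy_endpoint **slot;
	struct toy_endpoint *ep;

	slot = is_out ? &dev->ep_out[epnum] : &dev->ep_in[epnum];
	ep = *slot;
	if (reset_hardware && epnum != 0)
		*slot = NULL;
	return ep;
}
```

Disabling endpoint 0 leaves both slots pointing at dev->ep0, so a concurrent lookup never sees NULL there, while any other endpoint number is cleared as before.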
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 3bd12da7f50b8bc191fcb3bab1f55c582234df59 ]
asus-nb-wmi does not add any extra functionality on these Asus Transformer books. They have detachable keyboards, so the hotkeys are sent through a HID device (and handled by the hid-asus driver) and also the rfkill functionality is not used on these devices.
Besides not adding any extra functionality, initializing the WMI interface on these devices actually has a negative side-effect. For some reason the _SB.ATKD.INIT() function which asus_wmi_platform_init() calls drives GPO2 (INT33FC:02) pin 8, which is connected to the front facing webcam LED, high and there is no (WMI or other) interface to drive this low again causing the LED to be permanently on, even during suspend.
This commit adds a blacklist of DMI system_ids on which not to load the asus-nb-wmi and adds these Transformer books to this list. This fixes the webcam LED being permanently on under Linux.
Signed-off-by: Hans de Goede hdegoede@redhat.com
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/platform/x86/asus-nb-wmi.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c
index cccf250cd1e3..ee64c9512a3a 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -551,9 +551,33 @@ static struct asus_wmi_driver asus_nb_wmi_driver = {
 	.detect_quirks = asus_nb_wmi_quirks,
 };
 
+static const struct dmi_system_id asus_nb_wmi_blacklist[] __initconst = {
+	{
+		/*
+		 * asus-nb-wm adds no functionality. The T100TA has a detachable
+		 * USB kbd, so no hotkeys and it has no WMI rfkill; and loading
+		 * asus-nb-wm causes the camera LED to turn and _stay_ on.
+		 */
+		.matches = {
+			DMI_EXACT_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T100TA"),
+		},
+	},
+	{
+		/* The Asus T200TA has the same issue as the T100TA */
+		.matches = {
+			DMI_EXACT_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "T200TA"),
+		},
+	},
+	{} /* Terminating entry */
+};
 
 static int __init asus_nb_wmi_init(void)
 {
+	if (dmi_check_system(asus_nb_wmi_blacklist))
+		return -ENODEV;
+
 	return asus_wmi_register_driver(&asus_nb_wmi_driver);
 }
 
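The table-with-terminator pattern used by the blacklist can be sketched in userspace as below. toy_system_id and toy_check_system are hypothetical simplifications: the real DMI matching works on several fields and slots, whereas this sketch only compares two exact strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct dmi_system_id. */
struct toy_system_id {
	const char *vendor;
	const char *product;
};

static const struct toy_system_id toy_blacklist[] = {
	{ "ASUSTeK COMPUTER INC.", "T100TA" },
	{ "ASUSTeK COMPUTER INC.", "T200TA" },
	{ NULL, NULL }	/* terminating entry */
};

/* Walk the table up to the terminator and count exact matches. */
static int toy_check_system(const struct toy_system_id *list,
			    const char *vendor, const char *product)
{
	int count = 0;

	for (; list->vendor; list++) {
		if (!strcmp(list->vendor, vendor) &&
		    !strcmp(list->product, product))
			count++;
	}
	return count;
}
```

A non-zero return plays the role of the dmi_check_system() result in the patch: the init function bails out with -ENODEV and the driver is never registered on the listed machines.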
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit 8101b5a1531f3390b3a69fa7934c70a8fd6566ad ]
Stephen reported the following build warning on an ARM multi_v7_defconfig build with GCC 9.2.1:
kernel/futex.c: In function 'do_futex':
kernel/futex.c:1676:17: warning: 'oldval' may be used uninitialized in this function [-Wmaybe-uninitialized]
 1676 |   return oldval == cmparg;
      |          ~~~~~~~^~~~~~~~~
kernel/futex.c:1652:6: note: 'oldval' was declared here
 1652 |  int oldval, ret;
      |      ^~~~~~
introduced by commit a08971e9488d ("futex: arch_futex_atomic_op_inuser() calling conventions change").
While that change should not make any difference it confuses GCC which fails to work out that oldval is not referenced when the return value is not zero.
GCC fails to properly analyze arch_futex_atomic_op_inuser(). It's not the early return, the issue is with the assembly macros. GCC fails to detect that those either set 'ret' to 0 and set oldval or set 'ret' to -EFAULT which makes oldval uninteresting. The store to the callsite supplied oldval pointer is conditional on ret == 0.
The straightforward way to solve this is to make the store unconditional.
Aside from addressing the build warning this makes sense anyway because it removes the conditional from the fastpath. In the error case the stored value is uninteresting and the extra store does not matter at all.
Reported-by: Stephen Rothwell sfr@canb.auug.org.au
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Link: https://lkml.kernel.org/r/87pncao2ph.fsf@nanos.tec.linutronix.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 arch/arm/include/asm/futex.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/futex.h b/arch/arm/include/asm/futex.h
index cc414382dab4..561b2ba6bc28 100644
--- a/arch/arm/include/asm/futex.h
+++ b/arch/arm/include/asm/futex.h
@@ -162,8 +162,13 @@ arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
 	preempt_enable();
 #endif
 
-	if (!ret)
-		*oval = oldval;
+	/*
+	 * Store unconditionally. If ret != 0 the extra store is the least
+	 * of the worries but GCC cannot figure out that __futex_atomic_op()
+	 * is either setting ret to -EFAULT or storing the old value in
+	 * oldval which results in a uninitialized warning at the call site.
+	 */
+	*oval = oldval;
 
 	return ret;
 }
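For readers who want to see the shape of the change outside the kernel, the following is a minimal user-space sketch of the unconditional-store pattern. It is not the kernel code: fake_atomic_op() and op_inuser() are hypothetical stand-ins for __futex_atomic_op() and arch_futex_atomic_op_inuser(), and initialising oldval to 0 is a choice made here so that the sketch is well defined on the failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for __futex_atomic_op(): on success it writes the
 * previous value to *oldval and returns 0; on failure it returns -EFAULT
 * and leaves *oldval untouched.
 */
static int fake_atomic_op(int *word, int newval, int fail, int *oldval)
{
	if (fail)
		return -EFAULT;
	*oldval = *word;
	*word = newval;
	return 0;
}

/* Mirrors the patched shape: the store to *oval no longer depends on ret. */
static int op_inuser(int *word, int newval, int fail, int *oval)
{
	int oldval = 0;	/* initialised only so this sketch is well defined */
	int ret;

	assert(oval != NULL);
	ret = fake_atomic_op(word, newval, fail, &oldval);

	/* Stored even when ret != 0; callers ignore *oval in that case. */
	*oval = oldval;
	return ret;
}
```

A caller is expected to look at *oval only when op_inuser() returns 0, which is the same contract the commit message describes for the real function.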
From: Shuah Khan shuahkh@osg.samsung.com
commit d40ec6fdb0b03b7be4c7923a3da0e46bf943740a upstream.
Fix media_open() to clear filp->private_data when file open fails.
Signed-off-by: Shuah Khan shuahkh@osg.samsung.com
Acked-by: Sakari Ailus sakari.ailus@linux.intel.com
Signed-off-by: Mauro Carvalho Chehab mchehab@osg.samsung.com
Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/media/media-devnode.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/media/media-devnode.c b/drivers/media/media-devnode.c
index ebf9626e5ae5..a8cb52dc8c4f 100644
--- a/drivers/media/media-devnode.c
+++ b/drivers/media/media-devnode.c
@@ -181,6 +181,7 @@ static int media_open(struct inode *inode, struct file *filp)
 		ret = mdev->fops->open(filp);
 		if (ret) {
 			put_device(&mdev->dev);
+			filp->private_data = NULL;
 			return ret;
 		}
 	}
From: Max Kellermann max@duempel.org
commit bf244f665d76d20312c80524689b32a752888838 upstream.
Callbacks invoked from put_device() may free the struct media_devnode pointer, so any cleanup needs to be done before put_device().
Signed-off-by: Max Kellermann max@duempel.org
Signed-off-by: Mauro Carvalho Chehab mchehab@osg.samsung.com
Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/media/media-devnode.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/media/media-devnode.c b/drivers/media/media-devnode.c
index a8cb52dc8c4f..6c56aebd8db0 100644
--- a/drivers/media/media-devnode.c
+++ b/drivers/media/media-devnode.c
@@ -197,10 +197,11 @@ static int media_release(struct inode *inode, struct file *filp)
 	if (mdev->fops->release)
 		mdev->fops->release(filp);
 
+	filp->private_data = NULL;
+
 	/* decrease the refcount unconditionally since the release()
 	   return value is ignored. */
 	put_device(&mdev->dev);
-	filp->private_data = NULL;
 	return 0;
 }
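The ordering rule in the patch above (finish all cleanup, then drop the reference) can be illustrated with a small user-space sketch. Everything here is hypothetical and only models the idea: struct node stands in for the refcounted media_devnode, node_put() for put_device(), and ctx_release() for media_release().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object; stands in for struct media_devnode. */
struct node {
	int refcount;
};

/* Hypothetical per-open state; stands in for struct file. */
struct file_ctx {
	struct node *private_data;
};

static int nodes_freed;	/* bookkeeping for the demonstration only */

/* Drop one reference; the object is freed when the count reaches zero. */
static void node_put(struct node *n)
{
	assert(n != NULL && n->refcount > 0);
	if (--n->refcount == 0) {
		free(n);
		nodes_freed++;
	}
}

/* Release path: do every piece of cleanup before the final put. */
static void ctx_release(struct file_ctx *f)
{
	struct node *n = f->private_data;

	f->private_data = NULL;	/* cleanup first */
	node_put(n);		/* n may be freed here, so it is not used again */
}
```

After ctx_release() returns, nothing dereferences the node again, which is the property the commit message asks for.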
From: Max Kellermann max@duempel.org
commit 88336e174645948da269e1812f138f727cd2896b upstream.
We should protect the device unregister path too, at the error condition.
Signed-off-by: Max Kellermann max@duempel.org
Signed-off-by: Mauro Carvalho Chehab mchehab@osg.samsung.com
Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/media/media-devnode.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/media/media-devnode.c b/drivers/media/media-devnode.c
index 6c56aebd8db0..86c7c3732c84 100644
--- a/drivers/media/media-devnode.c
+++ b/drivers/media/media-devnode.c
@@ -282,8 +282,11 @@ int __must_check media_devnode_register(struct media_devnode *mdev,
 	return 0;
 
 error:
+	mutex_lock(&media_devnode_lock);
 	cdev_del(&mdev->cdev);
 	clear_bit(mdev->minor, media_devnode_nums);
+	mutex_unlock(&media_devnode_lock);
+
 	return ret;
 }
From: Mauro Carvalho Chehab mchehab@osg.samsung.com
commit 163f1e93e995048b894c5fc86a6034d16beed740 upstream.
Throughout the media controller code, "mdev" is used to represent a pointer to struct media_device, and "devnode" a pointer to struct media_devnode.
However, inside media-devnode.[ch], "mdev" is used to represent a pointer to struct media_devnode.
This is very confusing and may lead to development errors.
So, let's change all occurrences in media-devnode.[ch] to also use "devnode" for such pointers.
This patch doesn't make any functional changes.
Signed-off-by: Mauro Carvalho Chehab mchehab@osg.samsung.com Signed-off-by: Mauro Carvalho Chehab mchehab@s-opensource.com [bwh: Backported to 4.4: adjust filename, context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/media-devnode.c | 114 +++++++++++++++++----------------- include/media/media-devnode.h | 10 +-- 2 files changed, 62 insertions(+), 62 deletions(-)
diff --git a/drivers/media/media-devnode.c b/drivers/media/media-devnode.c index 86c7c3732c84..98211c570e11 100644 --- a/drivers/media/media-devnode.c +++ b/drivers/media/media-devnode.c @@ -59,21 +59,21 @@ static DECLARE_BITMAP(media_devnode_nums, MEDIA_NUM_DEVICES); /* Called when the last user of the media device exits. */ static void media_devnode_release(struct device *cd) { - struct media_devnode *mdev = to_media_devnode(cd); + struct media_devnode *devnode = to_media_devnode(cd);
mutex_lock(&media_devnode_lock);
/* Delete the cdev on this minor as well */ - cdev_del(&mdev->cdev); + cdev_del(&devnode->cdev);
/* Mark device node number as free */ - clear_bit(mdev->minor, media_devnode_nums); + clear_bit(devnode->minor, media_devnode_nums);
mutex_unlock(&media_devnode_lock);
/* Release media_devnode and perform other cleanups as needed. */ - if (mdev->release) - mdev->release(mdev); + if (devnode->release) + devnode->release(devnode); }
static struct bus_type media_bus_type = { @@ -83,37 +83,37 @@ static struct bus_type media_bus_type = { static ssize_t media_read(struct file *filp, char __user *buf, size_t sz, loff_t *off) { - struct media_devnode *mdev = media_devnode_data(filp); + struct media_devnode *devnode = media_devnode_data(filp);
- if (!mdev->fops->read) + if (!devnode->fops->read) return -EINVAL; - if (!media_devnode_is_registered(mdev)) + if (!media_devnode_is_registered(devnode)) return -EIO; - return mdev->fops->read(filp, buf, sz, off); + return devnode->fops->read(filp, buf, sz, off); }
static ssize_t media_write(struct file *filp, const char __user *buf, size_t sz, loff_t *off) { - struct media_devnode *mdev = media_devnode_data(filp); + struct media_devnode *devnode = media_devnode_data(filp);
- if (!mdev->fops->write) + if (!devnode->fops->write) return -EINVAL; - if (!media_devnode_is_registered(mdev)) + if (!media_devnode_is_registered(devnode)) return -EIO; - return mdev->fops->write(filp, buf, sz, off); + return devnode->fops->write(filp, buf, sz, off); }
static unsigned int media_poll(struct file *filp, struct poll_table_struct *poll) { - struct media_devnode *mdev = media_devnode_data(filp); + struct media_devnode *devnode = media_devnode_data(filp);
- if (!media_devnode_is_registered(mdev)) + if (!media_devnode_is_registered(devnode)) return POLLERR | POLLHUP; - if (!mdev->fops->poll) + if (!devnode->fops->poll) return DEFAULT_POLLMASK; - return mdev->fops->poll(filp, poll); + return devnode->fops->poll(filp, poll); }
static long @@ -121,12 +121,12 @@ __media_ioctl(struct file *filp, unsigned int cmd, unsigned long arg, long (*ioctl_func)(struct file *filp, unsigned int cmd, unsigned long arg)) { - struct media_devnode *mdev = media_devnode_data(filp); + struct media_devnode *devnode = media_devnode_data(filp);
if (!ioctl_func) return -ENOTTY;
- if (!media_devnode_is_registered(mdev)) + if (!media_devnode_is_registered(devnode)) return -EIO;
return ioctl_func(filp, cmd, arg); @@ -134,9 +134,9 @@ __media_ioctl(struct file *filp, unsigned int cmd, unsigned long arg,
static long media_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { - struct media_devnode *mdev = media_devnode_data(filp); + struct media_devnode *devnode = media_devnode_data(filp);
- return __media_ioctl(filp, cmd, arg, mdev->fops->ioctl); + return __media_ioctl(filp, cmd, arg, devnode->fops->ioctl); }
#ifdef CONFIG_COMPAT @@ -144,9 +144,9 @@ static long media_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) static long media_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { - struct media_devnode *mdev = media_devnode_data(filp); + struct media_devnode *devnode = media_devnode_data(filp);
- return __media_ioctl(filp, cmd, arg, mdev->fops->compat_ioctl); + return __media_ioctl(filp, cmd, arg, devnode->fops->compat_ioctl); }
#endif /* CONFIG_COMPAT */ @@ -154,7 +154,7 @@ static long media_compat_ioctl(struct file *filp, unsigned int cmd, /* Override for the open function */ static int media_open(struct inode *inode, struct file *filp) { - struct media_devnode *mdev; + struct media_devnode *devnode; int ret;
/* Check if the media device is available. This needs to be done with @@ -164,23 +164,23 @@ static int media_open(struct inode *inode, struct file *filp) * a crash. */ mutex_lock(&media_devnode_lock); - mdev = container_of(inode->i_cdev, struct media_devnode, cdev); + devnode = container_of(inode->i_cdev, struct media_devnode, cdev); /* return ENXIO if the media device has been removed already or if it is not registered anymore. */ - if (!media_devnode_is_registered(mdev)) { + if (!media_devnode_is_registered(devnode)) { mutex_unlock(&media_devnode_lock); return -ENXIO; } /* and increase the device refcount */ - get_device(&mdev->dev); + get_device(&devnode->dev); mutex_unlock(&media_devnode_lock);
- filp->private_data = mdev; + filp->private_data = devnode;
- if (mdev->fops->open) { - ret = mdev->fops->open(filp); + if (devnode->fops->open) { + ret = devnode->fops->open(filp); if (ret) { - put_device(&mdev->dev); + put_device(&devnode->dev); filp->private_data = NULL; return ret; } @@ -192,16 +192,16 @@ static int media_open(struct inode *inode, struct file *filp) /* Override for the release function */ static int media_release(struct inode *inode, struct file *filp) { - struct media_devnode *mdev = media_devnode_data(filp); + struct media_devnode *devnode = media_devnode_data(filp);
- if (mdev->fops->release) - mdev->fops->release(filp); + if (devnode->fops->release) + devnode->fops->release(filp);
filp->private_data = NULL;
/* decrease the refcount unconditionally since the release() return value is ignored. */ - put_device(&mdev->dev); + put_device(&devnode->dev); return 0; }
@@ -221,7 +221,7 @@ static const struct file_operations media_devnode_fops = {
/** * media_devnode_register - register a media device node - * @mdev: media device node structure we want to register + * @devnode: media device node structure we want to register * * The registration code assigns minor numbers and registers the new device node * with the kernel. An error is returned if no free minor number can be found, @@ -233,7 +233,7 @@ static const struct file_operations media_devnode_fops = { * the media_devnode structure is *not* called, so the caller is responsible for * freeing any data. */ -int __must_check media_devnode_register(struct media_devnode *mdev, +int __must_check media_devnode_register(struct media_devnode *devnode, struct module *owner) { int minor; @@ -251,40 +251,40 @@ int __must_check media_devnode_register(struct media_devnode *mdev, set_bit(minor, media_devnode_nums); mutex_unlock(&media_devnode_lock);
- mdev->minor = minor; + devnode->minor = minor;
/* Part 2: Initialize and register the character device */ - cdev_init(&mdev->cdev, &media_devnode_fops); - mdev->cdev.owner = owner; + cdev_init(&devnode->cdev, &media_devnode_fops); + devnode->cdev.owner = owner;
- ret = cdev_add(&mdev->cdev, MKDEV(MAJOR(media_dev_t), mdev->minor), 1); + ret = cdev_add(&devnode->cdev, MKDEV(MAJOR(media_dev_t), devnode->minor), 1); if (ret < 0) { pr_err("%s: cdev_add failed\n", __func__); goto error; }
/* Part 3: Register the media device */ - mdev->dev.bus = &media_bus_type; - mdev->dev.devt = MKDEV(MAJOR(media_dev_t), mdev->minor); - mdev->dev.release = media_devnode_release; - if (mdev->parent) - mdev->dev.parent = mdev->parent; - dev_set_name(&mdev->dev, "media%d", mdev->minor); - ret = device_register(&mdev->dev); + devnode->dev.bus = &media_bus_type; + devnode->dev.devt = MKDEV(MAJOR(media_dev_t), devnode->minor); + devnode->dev.release = media_devnode_release; + if (devnode->parent) + devnode->dev.parent = devnode->parent; + dev_set_name(&devnode->dev, "media%d", devnode->minor); + ret = device_register(&devnode->dev); if (ret < 0) { pr_err("%s: device_register failed\n", __func__); goto error; }
/* Part 4: Activate this minor. The char device can now be used. */ - set_bit(MEDIA_FLAG_REGISTERED, &mdev->flags); + set_bit(MEDIA_FLAG_REGISTERED, &devnode->flags);
return 0;
error: mutex_lock(&media_devnode_lock); - cdev_del(&mdev->cdev); - clear_bit(mdev->minor, media_devnode_nums); + cdev_del(&devnode->cdev); + clear_bit(devnode->minor, media_devnode_nums); mutex_unlock(&media_devnode_lock);
return ret; @@ -292,7 +292,7 @@ int __must_check media_devnode_register(struct media_devnode *mdev,
/** * media_devnode_unregister - unregister a media device node - * @mdev: the device node to unregister + * @devnode: the device node to unregister * * This unregisters the passed device. Future open calls will be met with * errors. @@ -300,16 +300,16 @@ int __must_check media_devnode_register(struct media_devnode *mdev, * This function can safely be called if the device node has never been * registered or has already been unregistered. */ -void media_devnode_unregister(struct media_devnode *mdev) +void media_devnode_unregister(struct media_devnode *devnode) { - /* Check if mdev was ever registered at all */ - if (!media_devnode_is_registered(mdev)) + /* Check if devnode was ever registered at all */ + if (!media_devnode_is_registered(devnode)) return;
mutex_lock(&media_devnode_lock); - clear_bit(MEDIA_FLAG_REGISTERED, &mdev->flags); + clear_bit(MEDIA_FLAG_REGISTERED, &devnode->flags); mutex_unlock(&media_devnode_lock); - device_unregister(&mdev->dev); + device_unregister(&devnode->dev); }
/* diff --git a/include/media/media-devnode.h b/include/media/media-devnode.h index 17ddae32060d..79f702d26d1f 100644 --- a/include/media/media-devnode.h +++ b/include/media/media-devnode.h @@ -80,24 +80,24 @@ struct media_devnode { unsigned long flags; /* Use bitops to access flags */
/* callbacks */ - void (*release)(struct media_devnode *mdev); + void (*release)(struct media_devnode *devnode); };
/* dev to media_devnode */ #define to_media_devnode(cd) container_of(cd, struct media_devnode, dev)
-int __must_check media_devnode_register(struct media_devnode *mdev, +int __must_check media_devnode_register(struct media_devnode *devnode, struct module *owner); -void media_devnode_unregister(struct media_devnode *mdev); +void media_devnode_unregister(struct media_devnode *devnode);
static inline struct media_devnode *media_devnode_data(struct file *filp) { return filp->private_data; }
-static inline int media_devnode_is_registered(struct media_devnode *mdev) +static inline int media_devnode_is_registered(struct media_devnode *devnode) { - return test_bit(MEDIA_FLAG_REGISTERED, &mdev->flags); + return test_bit(MEDIA_FLAG_REGISTERED, &devnode->flags); }
#endif /* _MEDIA_DEVNODE_H */
From: Mauro Carvalho Chehab mchehab@osg.samsung.com
commit a087ce704b802becbb4b0f2a20f2cb3f6911802e upstream.
struct media_devnode is currently embedded in struct media_device.
While this works fine during normal usage, it leads to a race condition during devnode unregister. The problem is that drivers assume that, after calling media_device_unregister(), the struct that contains media_device can be freed. This is not true, as it can't be freed until userspace closes all opened /dev/media devnodes.
In other words, if the media devnode is still open and media_device gets freed, any call to an ioctl will make the core try to access struct media_device, which will cause a use-after-free and even a GPF.
Fix this by dynamically allocating the struct media_devnode and only freeing it when it is safe.
Signed-off-by: Mauro Carvalho Chehab mchehab@osg.samsung.com Signed-off-by: Mauro Carvalho Chehab mchehab@s-opensource.com [bwh: Backported to 4.4: - Drop change in au0828 - Include <linux/slab.h> in media-device.c - Adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/media-device.c | 40 +++++++++++++++++++++--------- drivers/media/media-devnode.c | 8 +++++- drivers/media/usb/uvc/uvc_driver.c | 2 +- include/media/media-device.h | 5 +--- include/media/media-devnode.h | 10 +++++++- 5 files changed, 46 insertions(+), 19 deletions(-)
diff --git a/drivers/media/media-device.c b/drivers/media/media-device.c index 7b39440192d6..fb018fe1a8f7 100644 --- a/drivers/media/media-device.c +++ b/drivers/media/media-device.c @@ -24,6 +24,7 @@ #include <linux/export.h> #include <linux/ioctl.h> #include <linux/media.h> +#include <linux/slab.h> #include <linux/types.h>
#include <media/media-device.h> @@ -234,7 +235,7 @@ static long media_device_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { struct media_devnode *devnode = media_devnode_data(filp); - struct media_device *dev = to_media_device(devnode); + struct media_device *dev = devnode->media_dev; long ret;
switch (cmd) { @@ -303,7 +304,7 @@ static long media_device_compat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { struct media_devnode *devnode = media_devnode_data(filp); - struct media_device *dev = to_media_device(devnode); + struct media_device *dev = devnode->media_dev; long ret;
switch (cmd) { @@ -344,7 +345,8 @@ static const struct media_file_operations media_device_fops = { static ssize_t show_model(struct device *cd, struct device_attribute *attr, char *buf) { - struct media_device *mdev = to_media_device(to_media_devnode(cd)); + struct media_devnode *devnode = to_media_devnode(cd); + struct media_device *mdev = devnode->media_dev;
return sprintf(buf, "%.*s\n", (int)sizeof(mdev->model), mdev->model); } @@ -372,6 +374,7 @@ static void media_device_release(struct media_devnode *mdev) int __must_check __media_device_register(struct media_device *mdev, struct module *owner) { + struct media_devnode *devnode; int ret;
if (WARN_ON(mdev->dev == NULL || mdev->model[0] == 0)) @@ -382,17 +385,27 @@ int __must_check __media_device_register(struct media_device *mdev, spin_lock_init(&mdev->lock); mutex_init(&mdev->graph_mutex);
+ devnode = kzalloc(sizeof(*devnode), GFP_KERNEL); + if (!devnode) + return -ENOMEM; + /* Register the device node. */ - mdev->devnode.fops = &media_device_fops; - mdev->devnode.parent = mdev->dev; - mdev->devnode.release = media_device_release; - ret = media_devnode_register(&mdev->devnode, owner); - if (ret < 0) + mdev->devnode = devnode; + devnode->fops = &media_device_fops; + devnode->parent = mdev->dev; + devnode->release = media_device_release; + ret = media_devnode_register(mdev, devnode, owner); + if (ret < 0) { + mdev->devnode = NULL; + kfree(devnode); return ret; + }
- ret = device_create_file(&mdev->devnode.dev, &dev_attr_model); + ret = device_create_file(&devnode->dev, &dev_attr_model); if (ret < 0) { - media_devnode_unregister(&mdev->devnode); + mdev->devnode = NULL; + media_devnode_unregister(devnode); + kfree(devnode); return ret; }
@@ -413,8 +426,11 @@ void media_device_unregister(struct media_device *mdev) list_for_each_entry_safe(entity, next, &mdev->entities, list) media_device_unregister_entity(entity);
- device_remove_file(&mdev->devnode.dev, &dev_attr_model); - media_devnode_unregister(&mdev->devnode); + /* Check if mdev devnode was registered */ + if (media_devnode_is_registered(mdev->devnode)) { + device_remove_file(&mdev->devnode->dev, &dev_attr_model); + media_devnode_unregister(mdev->devnode); + } } EXPORT_SYMBOL_GPL(media_device_unregister);
diff --git a/drivers/media/media-devnode.c b/drivers/media/media-devnode.c index 98211c570e11..000efb17b95b 100644 --- a/drivers/media/media-devnode.c +++ b/drivers/media/media-devnode.c @@ -44,6 +44,7 @@ #include <linux/uaccess.h>
#include <media/media-devnode.h> +#include <media/media-device.h>
#define MEDIA_NUM_DEVICES 256 #define MEDIA_NAME "media" @@ -74,6 +75,8 @@ static void media_devnode_release(struct device *cd) /* Release media_devnode and perform other cleanups as needed. */ if (devnode->release) devnode->release(devnode); + + kfree(devnode); }
static struct bus_type media_bus_type = { @@ -221,6 +224,7 @@ static const struct file_operations media_devnode_fops = {
/** * media_devnode_register - register a media device node + * @media_dev: struct media_device we want to register a device node * @devnode: media device node structure we want to register * * The registration code assigns minor numbers and registers the new device node @@ -233,7 +237,8 @@ static const struct file_operations media_devnode_fops = { * the media_devnode structure is *not* called, so the caller is responsible for * freeing any data. */ -int __must_check media_devnode_register(struct media_devnode *devnode, +int __must_check media_devnode_register(struct media_device *mdev, + struct media_devnode *devnode, struct module *owner) { int minor; @@ -252,6 +257,7 @@ int __must_check media_devnode_register(struct media_devnode *devnode, mutex_unlock(&media_devnode_lock);
devnode->minor = minor; + devnode->media_dev = mdev;
/* Part 2: Initialize and register the character device */ cdev_init(&devnode->cdev, &media_devnode_fops); diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c index 9cd0268b2767..f353ab569b8e 100644 --- a/drivers/media/usb/uvc/uvc_driver.c +++ b/drivers/media/usb/uvc/uvc_driver.c @@ -1800,7 +1800,7 @@ static void uvc_delete(struct uvc_device *dev) if (dev->vdev.dev) v4l2_device_unregister(&dev->vdev); #ifdef CONFIG_MEDIA_CONTROLLER - if (media_devnode_is_registered(&dev->mdev.devnode)) + if (media_devnode_is_registered(dev->mdev.devnode)) media_device_unregister(&dev->mdev); #endif
diff --git a/include/media/media-device.h b/include/media/media-device.h index 6e6db78f1ee2..00bbd679864a 100644 --- a/include/media/media-device.h +++ b/include/media/media-device.h @@ -60,7 +60,7 @@ struct device; struct media_device { /* dev->driver_data points to this struct. */ struct device *dev; - struct media_devnode devnode; + struct media_devnode *devnode;
char model[32]; char serial[40]; @@ -84,9 +84,6 @@ struct media_device { #define MEDIA_DEV_NOTIFY_PRE_LINK_CH 0 #define MEDIA_DEV_NOTIFY_POST_LINK_CH 1
-/* media_devnode to media_device */ -#define to_media_device(node) container_of(node, struct media_device, devnode) - int __must_check __media_device_register(struct media_device *mdev, struct module *owner); #define media_device_register(mdev) __media_device_register(mdev, THIS_MODULE) diff --git a/include/media/media-devnode.h b/include/media/media-devnode.h index 79f702d26d1f..8b854c044032 100644 --- a/include/media/media-devnode.h +++ b/include/media/media-devnode.h @@ -33,6 +33,8 @@ #include <linux/device.h> #include <linux/cdev.h>
+struct media_device; + /* * Flag to mark the media_devnode struct as registered. Drivers must not touch * this flag directly, it will be set and cleared by media_devnode_register and @@ -67,6 +69,8 @@ struct media_file_operations { * before registering the node. */ struct media_devnode { + struct media_device *media_dev; + /* device ops */ const struct media_file_operations *fops;
@@ -86,7 +90,8 @@ struct media_devnode { /* dev to media_devnode */ #define to_media_devnode(cd) container_of(cd, struct media_devnode, dev)
-int __must_check media_devnode_register(struct media_devnode *devnode, +int __must_check media_devnode_register(struct media_device *mdev, + struct media_devnode *devnode, struct module *owner); void media_devnode_unregister(struct media_devnode *devnode);
@@ -97,6 +102,9 @@ static inline struct media_devnode *media_devnode_data(struct file *filp)
static inline int media_devnode_is_registered(struct media_devnode *devnode) { + if (!devnode) + return false; + return test_bit(MEDIA_FLAG_REGISTERED, &devnode->flags); }
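The structural change in the patch above (an embedded devnode becomes a separately allocated one with a back-pointer, plus a NULL-tolerant registration check) can be sketched in user-space C. The names below (struct owner, struct node, owner_register(), owner_unregister()) are hypothetical and do not exist in the kernel; the sketch only models lifetimes, with the caller of owner_unregister() playing the role of the last reference holder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct owner;

/* Stands in for struct media_devnode: allocated on its own. */
struct node {
	struct owner *owner_dev;	/* back-pointer instead of container_of() */
	bool registered;
};

/* Stands in for struct media_device: holds a pointer, not the object. */
struct owner {
	struct node *node;
};

/* NULL-tolerant check, like the patched media_devnode_is_registered(). */
static bool node_is_registered(const struct node *n)
{
	return n != NULL && n->registered;
}

/* Allocate the node separately so it can outlive the owner. */
static int owner_register(struct owner *o)
{
	struct node *n;

	assert(o != NULL);
	n = calloc(1, sizeof(*n));
	if (!n)
		return -1;
	n->owner_dev = o;
	n->registered = true;
	o->node = n;
	return 0;
}

/*
 * Detach the node from the owner. The node is returned rather than freed:
 * in this sketch whoever holds the last reference frees it later.
 */
static struct node *owner_unregister(struct owner *o)
{
	struct node *n = o->node;

	if (!node_is_registered(n))
		return NULL;
	n->registered = false;
	o->node = NULL;
	return n;
}
```

Because the owner only holds a pointer, freeing the owner's containing structure after owner_unregister() does not free memory that an open file may still reference.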
From: Shuah Khan shuahkh@osg.samsung.com
commit 5b28dde51d0ccc54cee70756e1800d70bed7114a upstream.
When the driver unbinds while media_ioctl is in progress, cdev_put() hits a use-after-free when the app exits after the driver unbinds.
Add devnode struct device kobj as the cdev parent kobject. cdev_add() gets a reference to it and releases it in cdev_del() ensuring that the devnode is not deallocated as long as the application has the device file open.
media_devnode_register() initializes the struct device kobj before calling cdev_add(). media_devnode_unregister() does cdev_del() and then deletes the device. devnode is released when the last reference to the struct device is gone.
This problem was found with the uvcvideo, em28xx, and au0828 drivers, and the fix has been tested on all three.
kernel: [ 193.599736] BUG: KASAN: use-after-free in cdev_put+0x4e/0x50
kernel: [ 193.599745] Read of size 8 by task media_device_te/1851
kernel: [ 193.599792] INFO: Allocated in __media_device_register+0x54
kernel: [ 193.599951] INFO: Freed in media_devnode_release+0xa4/0xc0

kernel: [ 193.601083] Call Trace:
kernel: [ 193.601093] [<ffffffff81aecac3>] dump_stack+0x67/0x94
kernel: [ 193.601102] [<ffffffff815359b2>] print_trailer+0x112/0x1a0
kernel: [ 193.601111] [<ffffffff8153b5e4>] object_err+0x34/0x40
kernel: [ 193.601119] [<ffffffff8153d9d4>] kasan_report_error+0x224/0x530
kernel: [ 193.601128] [<ffffffff814a2c3d>] ? kzfree+0x2d/0x40
kernel: [ 193.601137] [<ffffffff81539d72>] ? kfree+0x1d2/0x1f0
kernel: [ 193.601154] [<ffffffff8157ca7e>] ? cdev_put+0x4e/0x50
kernel: [ 193.601162] [<ffffffff8157ca7e>] cdev_put+0x4e/0x50
kernel: [ 193.601170] [<ffffffff815767eb>] __fput+0x52b/0x6c0
kernel: [ 193.601179] [<ffffffff8117743a>] ? switch_task_namespaces+0x2a
kernel: [ 193.601188] [<ffffffff815769ee>] ____fput+0xe/0x10
kernel: [ 193.601196] [<ffffffff81170023>] task_work_run+0x133/0x1f0
kernel: [ 193.601204] [<ffffffff8117746e>] ? switch_task_namespaces+0x5e
kernel: [ 193.601213] [<ffffffff8111b50c>] do_exit+0x72c/0x2c20
kernel: [ 193.601224] [<ffffffff8111ade0>] ? release_task+0x1250/0x1250
- - -
kernel: [ 193.601360] [<ffffffff81003587>] ? exit_to_usermode_loop+0xe7
kernel: [ 193.601368] [<ffffffff810035c0>] exit_to_usermode_loop+0x120
kernel: [ 193.601376] [<ffffffff810061da>] syscall_return_slowpath+0x16a
kernel: [ 193.601386] [<ffffffff82848b33>] entry_SYSCALL_64_fastpath+0xa6
Signed-off-by: Shuah Khan shuahkh@osg.samsung.com Tested-by: Mauro Carvalho Chehab mchehab@osg.samsung.com Signed-off-by: Mauro Carvalho Chehab mchehab@s-opensource.com Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/media-device.c | 6 +++-- drivers/media/media-devnode.c | 48 +++++++++++++++++++++-------------- 2 files changed, 33 insertions(+), 21 deletions(-)
diff --git a/drivers/media/media-device.c b/drivers/media/media-device.c index fb018fe1a8f7..5d79cd481730 100644 --- a/drivers/media/media-device.c +++ b/drivers/media/media-device.c @@ -396,16 +396,16 @@ int __must_check __media_device_register(struct media_device *mdev, devnode->release = media_device_release; ret = media_devnode_register(mdev, devnode, owner); if (ret < 0) { + /* devnode free is handled in media_devnode_*() */ mdev->devnode = NULL; - kfree(devnode); return ret; }
ret = device_create_file(&devnode->dev, &dev_attr_model); if (ret < 0) { + /* devnode free is handled in media_devnode_*() */ mdev->devnode = NULL; media_devnode_unregister(devnode); - kfree(devnode); return ret; }
@@ -430,6 +430,8 @@ void media_device_unregister(struct media_device *mdev) if (media_devnode_is_registered(mdev->devnode)) { device_remove_file(&mdev->devnode->dev, &dev_attr_model); media_devnode_unregister(mdev->devnode); + /* devnode free is handled in media_devnode_*() */ + mdev->devnode = NULL; } } EXPORT_SYMBOL_GPL(media_device_unregister); diff --git a/drivers/media/media-devnode.c b/drivers/media/media-devnode.c index 000efb17b95b..45bb70d27224 100644 --- a/drivers/media/media-devnode.c +++ b/drivers/media/media-devnode.c @@ -63,13 +63,8 @@ static void media_devnode_release(struct device *cd) struct media_devnode *devnode = to_media_devnode(cd);
mutex_lock(&media_devnode_lock); - - /* Delete the cdev on this minor as well */ - cdev_del(&devnode->cdev); - /* Mark device node number as free */ clear_bit(devnode->minor, media_devnode_nums); - mutex_unlock(&media_devnode_lock);
/* Release media_devnode and perform other cleanups as needed. */ @@ -77,6 +72,7 @@ static void media_devnode_release(struct device *cd) devnode->release(devnode);
kfree(devnode); + pr_debug("%s: Media Devnode Deallocated\n", __func__); }
static struct bus_type media_bus_type = { @@ -205,6 +201,8 @@ static int media_release(struct inode *inode, struct file *filp) /* decrease the refcount unconditionally since the release() return value is ignored. */ put_device(&devnode->dev); + + pr_debug("%s: Media Release\n", __func__); return 0; }
@@ -250,6 +248,7 @@ int __must_check media_devnode_register(struct media_device *mdev, if (minor == MEDIA_NUM_DEVICES) { mutex_unlock(&media_devnode_lock); pr_err("could not get a free minor\n"); + kfree(devnode); return -ENFILE; }
@@ -259,27 +258,31 @@ int __must_check media_devnode_register(struct media_device *mdev, devnode->minor = minor; devnode->media_dev = mdev;
+ /* Part 1: Initialize dev now to use dev.kobj for cdev.kobj.parent */ + devnode->dev.bus = &media_bus_type; + devnode->dev.devt = MKDEV(MAJOR(media_dev_t), devnode->minor); + devnode->dev.release = media_devnode_release; + if (devnode->parent) + devnode->dev.parent = devnode->parent; + dev_set_name(&devnode->dev, "media%d", devnode->minor); + device_initialize(&devnode->dev); + /* Part 2: Initialize and register the character device */ cdev_init(&devnode->cdev, &media_devnode_fops); devnode->cdev.owner = owner; + devnode->cdev.kobj.parent = &devnode->dev.kobj;
ret = cdev_add(&devnode->cdev, MKDEV(MAJOR(media_dev_t), devnode->minor), 1); if (ret < 0) { pr_err("%s: cdev_add failed\n", __func__); - goto error; + goto cdev_add_error; }
- /* Part 3: Register the media device */ - devnode->dev.bus = &media_bus_type; - devnode->dev.devt = MKDEV(MAJOR(media_dev_t), devnode->minor); - devnode->dev.release = media_devnode_release; - if (devnode->parent) - devnode->dev.parent = devnode->parent; - dev_set_name(&devnode->dev, "media%d", devnode->minor); - ret = device_register(&devnode->dev); + /* Part 3: Add the media device */ + ret = device_add(&devnode->dev); if (ret < 0) { - pr_err("%s: device_register failed\n", __func__); - goto error; + pr_err("%s: device_add failed\n", __func__); + goto device_add_error; }
/* Part 4: Activate this minor. The char device can now be used. */ @@ -287,12 +290,15 @@ int __must_check media_devnode_register(struct media_device *mdev,
return 0;
-error: - mutex_lock(&media_devnode_lock); +device_add_error: cdev_del(&devnode->cdev); +cdev_add_error: + mutex_lock(&media_devnode_lock); clear_bit(devnode->minor, media_devnode_nums); + devnode->media_dev = NULL; mutex_unlock(&media_devnode_lock);
+ put_device(&devnode->dev); return ret; }
@@ -314,8 +320,12 @@ void media_devnode_unregister(struct media_devnode *devnode)
mutex_lock(&media_devnode_lock); clear_bit(MEDIA_FLAG_REGISTERED, &devnode->flags); + /* Delete the cdev on this minor as well */ + cdev_del(&devnode->cdev); mutex_unlock(&media_devnode_lock); - device_unregister(&devnode->dev); + device_del(&devnode->dev); + devnode->media_dev = NULL; + put_device(&devnode->dev); }
/*
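The lifetime rule this patch relies on, as its commit message states it, is that the cdev holds a reference on its parent kobject from cdev_add() until cdev_del(), so the devnode cannot go away underneath it. A toy user-space model of that rule follows. The types and functions (struct obj, struct child, child_add(), child_del()) are hypothetical, and the model deliberately simplifies the kernel's kobject machinery to a bare counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted parent; stands in for the devnode's struct device. */
struct obj {
	int refs;
	int released;	/* set once the last reference is dropped */
};

static void obj_get(struct obj *o)
{
	o->refs++;
}

static void obj_put(struct obj *o)
{
	assert(o->refs > 0);
	if (--o->refs == 0)
		o->released = 1;
}

/* Hypothetical child; stands in for the cdev with kobj.parent set. */
struct child {
	struct obj *parent;
};

/* Adding the child pins the parent. */
static void child_add(struct child *c, struct obj *parent)
{
	obj_get(parent);
	c->parent = parent;
}

/* Deleting the child unpins the parent. */
static void child_del(struct child *c)
{
	obj_put(c->parent);
	c->parent = NULL;
}
```

In this model, dropping the owner's own reference while the child is still added leaves the parent alive, and the parent is released only once child_del() has run.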
From: Shuah Khan shuahkh@osg.samsung.com
commit 6f0dd24a084a17f9984dd49dffbf7055bf123993 upstream.
Media devnode open/ioctl could be in progress when media device unregister is initiated. System calls and ioctls check the media device registered status at the beginning; however, there is a window where unregister could be in progress without changing the media devnode status to unregistered.
process 1                               process 2
fd = open(/dev/media0)
media_devnode_is_registered()
        (returns true here)

                                        media_device_unregister()
                                                (unregister is in progress
                                                 and devnode isn't
                                                 unregistered yet)
                                        ...
ioctl(fd, ...)
__media_ioctl()
media_devnode_is_registered()
        (returns true here)
                                        ...
                                        media_devnode_unregister()
                                        ...
                                        (driver releases the media device
                                         memory)

media_device_ioctl()
        (By this point
         devnode->media_dev does not
         point to allocated memory.
         use-after-free in mutex_lock_nested)
BUG: KASAN: use-after-free in mutex_lock_nested+0x79c/0x800 at addr ffff8801ebe914f0
Fix it by clearing the register bit when unregister starts, to avoid the race.
process 1                               process 2
fd = open(/dev/media0)
media_devnode_is_registered()
        (could return true here)

                                        media_device_unregister()
                                                (clear the register bit,
                                                 then start unregister.)
                                        ...
ioctl(fd, ...)
__media_ioctl()
media_devnode_is_registered()
        (return false here, ioctl
         returns I/O error, and
         will not access media
         device memory)
                                        ...
                                        media_devnode_unregister()
                                        ...
                                        (driver releases the media device
                                         memory)
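The fixed ordering above can be sketched in user-space C as follows. This is a single-threaded illustration of the ordering only, not a proof that every interleaving is safe; struct node, node_unregister_prepare(), and node_ioctl() are hypothetical names, and the payload pointer stands in for devnode->media_dev.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device node; payload models memory owned by the driver. */
struct node {
	atomic_bool registered;
	int *payload;
};

/* First step of unregister: make every later entry-point check fail. */
static void node_unregister_prepare(struct node *n)
{
	assert(n != NULL);
	atomic_store(&n->registered, false);
}

/* Entry point: check the flag before touching driver-owned memory. */
static int node_ioctl(struct node *n)
{
	if (!atomic_load(&n->registered))
		return -EIO;	/* same error the commit message describes */
	return *n->payload;
}
```

Once node_unregister_prepare() has run, node_ioctl() returns -EIO without dereferencing payload, so the driver can release that memory afterwards.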
Signed-off-by: Shuah Khan shuahkh@osg.samsung.com
Suggested-by: Sakari Ailus sakari.ailus@linux.intel.com
Reported-by: Mauro Carvalho Chehab mchehab@osg.samsung.com
Tested-by: Mauro Carvalho Chehab mchehab@osg.samsung.com
Signed-off-by: Mauro Carvalho Chehab mchehab@s-opensource.com
[bwh: Backported to 4.4: adjust filename, context]
Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/media/media-device.c  | 15 ++++++++-------
 drivers/media/media-devnode.c | 19 ++++++++++++-------
 include/media/media-devnode.h | 14 ++++++++++++++
 3 files changed, 34 insertions(+), 14 deletions(-)
diff --git a/drivers/media/media-device.c b/drivers/media/media-device.c index 5d79cd481730..0ca9506f4654 100644 --- a/drivers/media/media-device.c +++ b/drivers/media/media-device.c @@ -405,6 +405,7 @@ int __must_check __media_device_register(struct media_device *mdev, if (ret < 0) { /* devnode free is handled in media_devnode_*() */ mdev->devnode = NULL; + media_devnode_unregister_prepare(devnode); media_devnode_unregister(devnode); return ret; } @@ -423,16 +424,16 @@ void media_device_unregister(struct media_device *mdev) struct media_entity *entity; struct media_entity *next;
+ /* Clear the devnode register bit to avoid races with media dev open */ + media_devnode_unregister_prepare(mdev->devnode); + list_for_each_entry_safe(entity, next, &mdev->entities, list) media_device_unregister_entity(entity);
- /* Check if mdev devnode was registered */ - if (media_devnode_is_registered(mdev->devnode)) { - device_remove_file(&mdev->devnode->dev, &dev_attr_model); - media_devnode_unregister(mdev->devnode); - /* devnode free is handled in media_devnode_*() */ - mdev->devnode = NULL; - } + device_remove_file(&mdev->devnode->dev, &dev_attr_model); + media_devnode_unregister(mdev->devnode); + /* devnode free is handled in media_devnode_*() */ + mdev->devnode = NULL; } EXPORT_SYMBOL_GPL(media_device_unregister);
diff --git a/drivers/media/media-devnode.c b/drivers/media/media-devnode.c index 45bb70d27224..e887120d19aa 100644 --- a/drivers/media/media-devnode.c +++ b/drivers/media/media-devnode.c @@ -302,6 +302,17 @@ int __must_check media_devnode_register(struct media_device *mdev, return ret; }
+void media_devnode_unregister_prepare(struct media_devnode *devnode) +{ + /* Check if devnode was ever registered at all */ + if (!media_devnode_is_registered(devnode)) + return; + + mutex_lock(&media_devnode_lock); + clear_bit(MEDIA_FLAG_REGISTERED, &devnode->flags); + mutex_unlock(&media_devnode_lock); +} + /** * media_devnode_unregister - unregister a media device node * @devnode: the device node to unregister @@ -309,17 +320,11 @@ int __must_check media_devnode_register(struct media_device *mdev, * This unregisters the passed device. Future open calls will be met with * errors. * - * This function can safely be called if the device node has never been - * registered or has already been unregistered. + * Should be called after media_devnode_unregister_prepare() */ void media_devnode_unregister(struct media_devnode *devnode) { - /* Check if devnode was ever registered at all */ - if (!media_devnode_is_registered(devnode)) - return; - mutex_lock(&media_devnode_lock); - clear_bit(MEDIA_FLAG_REGISTERED, &devnode->flags); /* Delete the cdev on this minor as well */ cdev_del(&devnode->cdev); mutex_unlock(&media_devnode_lock); diff --git a/include/media/media-devnode.h b/include/media/media-devnode.h index 8b854c044032..d5ff95bf2d4b 100644 --- a/include/media/media-devnode.h +++ b/include/media/media-devnode.h @@ -93,6 +93,20 @@ struct media_devnode { int __must_check media_devnode_register(struct media_device *mdev, struct media_devnode *devnode, struct module *owner); + +/** + * media_devnode_unregister_prepare - clear the media device node register bit + * @devnode: the device node to prepare for unregister + * + * This clears the passed device register bit. Future open calls will be met + * with errors. Should be called before media_devnode_unregister() to avoid + * races with unregister and device file open calls. + * + * This function can safely be called if the device node has never been + * registered or has already been unregistered. 
+ */ +void media_devnode_unregister_prepare(struct media_devnode *devnode); + void media_devnode_unregister(struct media_devnode *devnode);
static inline struct media_devnode *media_devnode_data(struct file *filp)
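Editorial aside on the media devnode patch above: the ordering it introduces (clear the registered bit first, tear down afterwards) can be shown in isolation with a small userspace sketch. This is not the kernel code. Every model_* name is made up for illustration, a C11 atomic_bool stands in for MEDIA_FLAG_REGISTERED, and the sketch only demonstrates the sequence of steps, not real concurrency.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_media_dev {
	int model_id;
};

struct model_devnode {
	atomic_bool registered;            /* plays the role of MEDIA_FLAG_REGISTERED */
	struct model_media_dev *media_dev; /* memory the driver may release */
};

/* Teardown step 1: only clear the flag, so new callers are rejected early. */
static void model_unregister_prepare(struct model_devnode *node)
{
	atomic_store(&node->registered, false);
}

/* Teardown step 2: drop the pointer to memory that is about to go away. */
static void model_unregister(struct model_devnode *node)
{
	node->media_dev = NULL;
}

/* Entry check, modelled on the test done at the start of open/ioctl. */
static int model_ioctl(struct model_devnode *node)
{
	if (!atomic_load(&node->registered))
		return -EIO; /* the commit message says ioctl returns an I/O error */

	/* Only reached while the node is still registered. */
	assert(node->media_dev != NULL);
	return node->media_dev->model_id;
}
```

In this model, once model_unregister_prepare() has run, model_ioctl() fails with -EIO both before and after model_unregister(), so the released memory is never dereferenced.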
From: Erico Nunes erico.nunes@datacom.ind.br
commit d6760b14d4a1243f918d983bba1e35c5a5cd5a6d upstream.
i2c-dev had never moved away from the older register_chrdev interface to implement its char device registration. The register_chrdev API has the limitation of enabling only up to 256 i2c-dev busses to exist.
Large platforms with lots of i2c devices (e.g. pluggable transceivers) with dedicated busses may have to exceed that limit. In particular, there are also platforms making use of the i2c bus multiplexing API, which instantiates a virtual bus for each possible multiplexed selection.
This patch removes the register_chrdev usage and replaces it with the less old cdev API, which takes away the 256 i2c-dev bus limitation. It should not have any other impact for i2c bus drivers or user space.
This patch has been tested on qemu x86 and qemu powerpc platforms with the aid of a module which adds and removes 5000 virtual i2c busses, as well as validated on an existing powerpc hardware platform which makes use of the i2c bus multiplexing API. i2c-dev busses with device minor numbers larger than 256 have also been validated to work with the existing i2c-tools.
Signed-off-by: Erico Nunes erico.nunes@datacom.ind.br [wsa: kept includes sorted] Signed-off-by: Wolfram Sang wsa@the-dreams.de [bwh: Backported to 4.4: adjust context] Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index e56b774e7cf9..5fecc1d9e0a1 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -22,6 +22,7 @@
/* The I2C_RDWR ioctl code is written by Kolja Waschk waschk@telos.de */
+#include <linux/cdev.h> #include <linux/kernel.h> #include <linux/module.h> #include <linux/device.h> @@ -47,9 +48,10 @@ struct i2c_dev { struct list_head list; struct i2c_adapter *adap; struct device *dev; + struct cdev cdev; };
-#define I2C_MINORS 256 +#define I2C_MINORS MINORMASK static LIST_HEAD(i2c_dev_list); static DEFINE_SPINLOCK(i2c_dev_list_lock);
@@ -559,6 +561,12 @@ static int i2cdev_attach_adapter(struct device *dev, void *dummy) if (IS_ERR(i2c_dev)) return PTR_ERR(i2c_dev);
+ cdev_init(&i2c_dev->cdev, &i2cdev_fops); + i2c_dev->cdev.owner = THIS_MODULE; + res = cdev_add(&i2c_dev->cdev, MKDEV(I2C_MAJOR, adap->nr), 1); + if (res) + goto error_cdev; + /* register this i2c device with the driver core */ i2c_dev->dev = device_create(i2c_dev_class, &adap->dev, MKDEV(I2C_MAJOR, adap->nr), NULL, @@ -572,6 +580,8 @@ static int i2cdev_attach_adapter(struct device *dev, void *dummy) adap->name, adap->nr); return 0; error: + cdev_del(&i2c_dev->cdev); +error_cdev: return_i2c_dev(i2c_dev); return res; } @@ -591,6 +601,7 @@ static int i2cdev_detach_adapter(struct device *dev, void *dummy)
return_i2c_dev(i2c_dev); device_destroy(i2c_dev_class, MKDEV(I2C_MAJOR, adap->nr)); + cdev_del(&i2c_dev->cdev);
pr_debug("i2c-dev: adapter [%s] unregistered\n", adap->name); return 0; @@ -627,7 +638,7 @@ static int __init i2c_dev_init(void)
printk(KERN_INFO "i2c /dev entries driver\n");
- res = register_chrdev(I2C_MAJOR, "i2c", &i2cdev_fops); + res = register_chrdev_region(MKDEV(I2C_MAJOR, 0), I2C_MINORS, "i2c"); if (res) goto out;
@@ -651,7 +662,7 @@ static int __init i2c_dev_init(void) out_unreg_class: class_destroy(i2c_dev_class); out_unreg_chrdev: - unregister_chrdev(I2C_MAJOR, "i2c"); + unregister_chrdev_region(MKDEV(I2C_MAJOR, 0), I2C_MINORS); out: printk(KERN_ERR "%s: Driver Initialisation failed\n", __FILE__); return res; @@ -662,7 +673,7 @@ static void __exit i2c_dev_exit(void) bus_unregister_notifier(&i2c_bus_type, &i2cdev_notifier); i2c_for_each_dev(NULL, i2cdev_detach_adapter); class_destroy(i2c_dev_class); - unregister_chrdev(I2C_MAJOR, "i2c"); + unregister_chrdev_region(MKDEV(I2C_MAJOR, 0), I2C_MINORS); }
MODULE_AUTHOR("Frodo Looijaard frodol@dds.nl and "
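Editorial aside on the i2c-dev cdev conversion above: the jump from 256 to MINORMASK minors is easier to see with the dev_t arithmetic written out. The sketch below is plain userspace C. The macros are reproduced from memory of include/linux/kdev_t.h and I2C_MAJOR is taken to be 89, so treat both as assumptions to check against the tree; the two helper functions and the OLD_/NEW_ names are hypothetical.

```c
#include <assert.h>

/* In-kernel dev_t layout, as recalled from include/linux/kdev_t.h. */
#define MINORBITS	20
#define MINORMASK	((1U << MINORBITS) - 1)
#define MAJOR(dev)	((unsigned int)((dev) >> MINORBITS))
#define MINOR(dev)	((unsigned int)((dev) & MINORMASK))
#define MKDEV(ma, mi)	(((ma) << MINORBITS) | (mi))

#define I2C_MAJOR	89		/* assumed char major of /dev/i2c-N */
#define OLD_I2C_MINORS	256		/* limit before the patch */
#define NEW_I2C_MINORS	MINORMASK	/* limit after the patch */

/* Hypothetical helper: does adapter number nr fit in a region of count minors? */
static int i2c_minor_fits(unsigned int nr, unsigned int count)
{
	return nr < count;
}

/* Hypothetical helper: device number for adapter nr, as passed to cdev_add(). */
static unsigned int i2c_devt(unsigned int nr)
{
	assert(nr <= MINORMASK);
	return MKDEV(I2C_MAJOR, nr);
}
```

With these definitions, adapter 300 does not fit in the old 256-minor region but does fit in the new one, and MAJOR()/MINOR() recover 89 and the adapter number from the packed value.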
From: Wolfram Sang wsa@the-dreams.de
commit 72a71f869c95dc11b73f09fe18c593d4a0618c3f upstream.
I stumbled multiple times over 'return_i2c_dev', especially before the actual 'return res'. It makes the code hard to read, so rename the function to 'put_i2c_dev' which also better matches 'get_free_i2c_dev'.
Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c
index 5fecc1d9e0a1..382c66d5a470 100644
--- a/drivers/i2c/i2c-dev.c
+++ b/drivers/i2c/i2c-dev.c
@@ -91,7 +91,7 @@ static struct i2c_dev *get_free_i2c_dev(struct i2c_adapter *adap)
 	return i2c_dev;
 }

-static void return_i2c_dev(struct i2c_dev *i2c_dev)
+static void put_i2c_dev(struct i2c_dev *i2c_dev)
 {
 	spin_lock(&i2c_dev_list_lock);
 	list_del(&i2c_dev->list);
@@ -582,7 +582,7 @@ static int i2cdev_attach_adapter(struct device *dev, void *dummy)
 error:
 	cdev_del(&i2c_dev->cdev);
 error_cdev:
-	return_i2c_dev(i2c_dev);
+	put_i2c_dev(i2c_dev);
 	return res;
 }

@@ -599,7 +599,7 @@ static int i2cdev_detach_adapter(struct device *dev, void *dummy)
 	if (!i2c_dev) /* attach_adapter must have failed */
 		return 0;

-	return_i2c_dev(i2c_dev);
+	put_i2c_dev(i2c_dev);
 	device_destroy(i2c_dev_class, MKDEV(I2C_MAJOR, adap->nr));
 	cdev_del(&i2c_dev->cdev);
From: Dan Carpenter dan.carpenter@oracle.com
commit e6be18f6d62c1d3b331ae020b76a29c2ccf6b0bf upstream.
The call to put_i2c_dev() frees "i2c_dev" so there is a use after free when we call cdev_del(&i2c_dev->cdev).
Fixes: d6760b14d4a1 ('i2c: dev: switch from register_chrdev to cdev API') Signed-off-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c
index 382c66d5a470..e5cd307ebfc9 100644
--- a/drivers/i2c/i2c-dev.c
+++ b/drivers/i2c/i2c-dev.c
@@ -599,9 +599,9 @@ static int i2cdev_detach_adapter(struct device *dev, void *dummy)
 	if (!i2c_dev) /* attach_adapter must have failed */
 		return 0;

+	cdev_del(&i2c_dev->cdev);
 	put_i2c_dev(i2c_dev);
 	device_destroy(i2c_dev_class, MKDEV(I2C_MAJOR, adap->nr));
-	cdev_del(&i2c_dev->cdev);

 	pr_debug("i2c-dev: adapter [%s] unregistered\n", adap->name);
 	return 0;
From: viresh kumar viresh.kumar@linaro.org
commit 5136ed4fcb05cd4981cc6034a11e66370ed84789 upstream.
Nothing protects i2c_dev from being freed after it is returned from i2c_dev_get_by_minor(), so using it to look up a value we already have (minor) isn't really safe.
Avoid using it and get the adapter directly from 'minor'.
Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Reviewed-by: Jean Delvare jdelvare@suse.de Tested-by: Jean Delvare jdelvare@suse.de Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c
index e5cd307ebfc9..5543b49e2e05 100644
--- a/drivers/i2c/i2c-dev.c
+++ b/drivers/i2c/i2c-dev.c
@@ -492,13 +492,8 @@ static int i2cdev_open(struct inode *inode, struct file *file)
 	unsigned int minor = iminor(inode);
 	struct i2c_client *client;
 	struct i2c_adapter *adap;
-	struct i2c_dev *i2c_dev;
-
-	i2c_dev = i2c_dev_get_by_minor(minor);
-	if (!i2c_dev)
-		return -ENODEV;

-	adap = i2c_get_adapter(i2c_dev->adap->nr);
+	adap = i2c_get_adapter(minor);
 	if (!adap)
 		return -ENODEV;
From: Kevin Hao haokexin@gmail.com
commit 1413ef638abae4ab5621901cf4d8ef08a4a48ba6 upstream.
The struct cdev is embedded in the struct i2c_dev. In the current code, we would free the i2c_dev struct directly in put_i2c_dev(), but the cdev is managed by a kobject, and the release of it is not predictable. So it is very possible that the i2c_dev is freed before the cdev is entirely released. We can easily get the following call trace with CONFIG_DEBUG_KOBJECT_RELEASE and CONFIG_DEBUG_OBJECTS_TIMERS enabled.
  ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x38
  WARNING: CPU: 19 PID: 1 at lib/debugobjects.c:325 debug_print_object+0xb0/0xf0
  Modules linked in:
  CPU: 19 PID: 1 Comm: swapper/0 Tainted: G W 5.2.20-yocto-standard+ #120
  Hardware name: Marvell OcteonTX CN96XX board (DT)
  pstate: 80c00089 (Nzcv daIf +PAN +UAO)
  pc : debug_print_object+0xb0/0xf0
  lr : debug_print_object+0xb0/0xf0
  sp : ffff00001292f7d0
  x29: ffff00001292f7d0 x28: ffff800b82151788
  x27: 0000000000000001 x26: ffff800b892c0000
  x25: ffff0000124a2558 x24: 0000000000000000
  x23: ffff00001107a1d8 x22: ffff0000116b5088
  x21: ffff800bdc6afca8 x20: ffff000012471ae8
  x19: ffff00001168f2c8 x18: 0000000000000010
  x17: 00000000fd6f304b x16: 00000000ee79de43
  x15: ffff800bc0e80568 x14: 79616c6564203a74
  x13: 6e6968207473696c x12: 5f72656d6974203a
  x11: ffff0000113f0018 x10: 0000000000000000
  x9 : 000000000000001f x8 : 0000000000000000
  x7 : ffff0000101294cc x6 : 0000000000000000
  x5 : 0000000000000000 x4 : 0000000000000001
  x3 : 00000000ffffffff x2 : 0000000000000000
  x1 : 387fc15c8ec0f200 x0 : 0000000000000000
  Call trace:
   debug_print_object+0xb0/0xf0
   __debug_check_no_obj_freed+0x19c/0x228
   debug_check_no_obj_freed+0x1c/0x28
   kfree+0x250/0x440
   put_i2c_dev+0x68/0x78
   i2cdev_detach_adapter+0x60/0xc8
   i2cdev_notifier_call+0x3c/0x70
   notifier_call_chain+0x8c/0xe8
   blocking_notifier_call_chain+0x64/0x88
   device_del+0x74/0x380
   device_unregister+0x54/0x78
   i2c_del_adapter+0x278/0x2d0
   unittest_i2c_bus_remove+0x3c/0x80
   platform_drv_remove+0x30/0x50
   device_release_driver_internal+0xf4/0x1c0
   driver_detach+0x58/0xa0
   bus_remove_driver+0x84/0xd8
   driver_unregister+0x34/0x60
   platform_driver_unregister+0x20/0x30
   of_unittest_overlay+0x8d4/0xbe0
   of_unittest+0xae8/0xb3c
   do_one_initcall+0xac/0x450
   do_initcall_level+0x208/0x224
   kernel_init_freeable+0x2d8/0x36c
   kernel_init+0x18/0x108
   ret_from_fork+0x10/0x1c
  irq event stamp: 3934661
  hardirqs last enabled at (3934661): [<ffff00001009fa04>] debug_exception_exit+0x4c/0x58
  hardirqs last disabled at (3934660): [<ffff00001009fb14>] debug_exception_enter+0xa4/0xe0
  softirqs last enabled at (3934654): [<ffff000010081d94>] __do_softirq+0x46c/0x628
  softirqs last disabled at (3934649): [<ffff0000100b4a1c>] irq_exit+0x104/0x118
This is a common issue when using cdev embedded in a struct. Fortunately, we already have a mechanism to solve this kind of issue. Please see commit 233ed09d7fda ("chardev: add helper function to register char devs with a struct device") for more detail.
In this patch, we choose to embed the struct device into the i2c_dev, and use the API provided by the commit 233ed09d7fda to make sure that the release of i2c_dev and cdev are in sequence.
Signed-off-by: Kevin Hao haokexin@gmail.com Signed-off-by: Wolfram Sang wsa@the-dreams.de Signed-off-by: Ben Hutchings ben.hutchings@codethink.co.uk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 48 +++++++++++++++++++++++-------------------- 1 file changed, 26 insertions(+), 22 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index 5543b49e2e05..7584f292e2fd 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -47,7 +47,7 @@ struct i2c_dev { struct list_head list; struct i2c_adapter *adap; - struct device *dev; + struct device dev; struct cdev cdev; };
@@ -91,12 +91,14 @@ static struct i2c_dev *get_free_i2c_dev(struct i2c_adapter *adap) return i2c_dev; }
-static void put_i2c_dev(struct i2c_dev *i2c_dev) +static void put_i2c_dev(struct i2c_dev *i2c_dev, bool del_cdev) { spin_lock(&i2c_dev_list_lock); list_del(&i2c_dev->list); spin_unlock(&i2c_dev_list_lock); - kfree(i2c_dev); + if (del_cdev) + cdev_device_del(&i2c_dev->cdev, &i2c_dev->dev); + put_device(&i2c_dev->dev); }
static ssize_t name_show(struct device *dev, @@ -542,6 +544,14 @@ static const struct file_operations i2cdev_fops = {
static struct class *i2c_dev_class;
+static void i2cdev_dev_release(struct device *dev) +{ + struct i2c_dev *i2c_dev; + + i2c_dev = container_of(dev, struct i2c_dev, dev); + kfree(i2c_dev); +} + static int i2cdev_attach_adapter(struct device *dev, void *dummy) { struct i2c_adapter *adap; @@ -558,27 +568,23 @@ static int i2cdev_attach_adapter(struct device *dev, void *dummy)
cdev_init(&i2c_dev->cdev, &i2cdev_fops); i2c_dev->cdev.owner = THIS_MODULE; - res = cdev_add(&i2c_dev->cdev, MKDEV(I2C_MAJOR, adap->nr), 1); - if (res) - goto error_cdev; - - /* register this i2c device with the driver core */ - i2c_dev->dev = device_create(i2c_dev_class, &adap->dev, - MKDEV(I2C_MAJOR, adap->nr), NULL, - "i2c-%d", adap->nr); - if (IS_ERR(i2c_dev->dev)) { - res = PTR_ERR(i2c_dev->dev); - goto error; + + device_initialize(&i2c_dev->dev); + i2c_dev->dev.devt = MKDEV(I2C_MAJOR, adap->nr); + i2c_dev->dev.class = i2c_dev_class; + i2c_dev->dev.parent = &adap->dev; + i2c_dev->dev.release = i2cdev_dev_release; + dev_set_name(&i2c_dev->dev, "i2c-%d", adap->nr); + + res = cdev_device_add(&i2c_dev->cdev, &i2c_dev->dev); + if (res) { + put_i2c_dev(i2c_dev, false); + return res; }
pr_debug("i2c-dev: adapter [%s] registered as minor %d\n", adap->name, adap->nr); return 0; -error: - cdev_del(&i2c_dev->cdev); -error_cdev: - put_i2c_dev(i2c_dev); - return res; }
static int i2cdev_detach_adapter(struct device *dev, void *dummy) @@ -594,9 +600,7 @@ static int i2cdev_detach_adapter(struct device *dev, void *dummy) if (!i2c_dev) /* attach_adapter must have failed */ return 0;
- cdev_del(&i2c_dev->cdev); - put_i2c_dev(i2c_dev); - device_destroy(i2c_dev_class, MKDEV(I2C_MAJOR, adap->nr)); + put_i2c_dev(i2c_dev, true);
pr_debug("i2c-dev: adapter [%s] unregistered\n", adap->name); return 0;
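Editorial aside on the patch above: the lifetime rule it relies on (the containing structure is freed only from the embedded device's release callback, once the last reference is dropped) can be modelled in a few lines of userspace C. This is a sketch under loud assumptions: struct model_device is a made-up, single-threaded stand-in for struct device with a plain int refcount, and none of the model_* functions exist in the kernel. Only container_of() mirrors the kernel macro in spirit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace rendition of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, much-simplified stand-in for struct device. */
struct model_device {
	int refcount; /* not atomic: the sketch is single-threaded */
	void (*release)(struct model_device *dev);
};

struct model_i2c_dev {
	int adapter_nr;
	struct model_device dev; /* embedded, as in the patch */
};

static int model_freed_count; /* lets callers observe when release ran */

/* Counterpart of i2cdev_dev_release(): the only place that frees. */
static void model_i2cdev_release(struct model_device *dev)
{
	struct model_i2c_dev *i2c_dev =
		container_of(dev, struct model_i2c_dev, dev);

	free(i2c_dev);
	model_freed_count++;
}

static void model_get_device(struct model_device *dev)
{
	dev->refcount++;
}

static void model_put_device(struct model_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}

static struct model_i2c_dev *model_i2c_dev_alloc(int nr)
{
	struct model_i2c_dev *i2c_dev = calloc(1, sizeof(*i2c_dev));

	if (!i2c_dev)
		return NULL;
	i2c_dev->adapter_nr = nr;
	i2c_dev->dev.refcount = 1; /* initial reference held by the creator */
	i2c_dev->dev.release = model_i2cdev_release;
	return i2c_dev;
}
```

If some other holder (in the real driver, the still-live cdev) takes a reference with model_get_device(), the creator's model_put_device() no longer frees the structure; the memory goes away only when the last holder drops its reference.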
From: Mathias Krause minipli@googlemail.com
[ Upstream commit 1bd845bcb41d5b7f83745e0cb99273eb376f2ec5 ]
The parallel queue per-cpu data structure gets initialized only for CPUs in the 'pcpu' CPU mask set. This is not sufficient as the reorder timer may run on a different CPU and might wrongly decide it's the target CPU for the next reorder item as per-cpu memory gets memset(0) and we might be waiting for the first CPU in cpumask.pcpu, i.e. cpu_index 0.
Make the '__this_cpu_read(pd->pqueue->cpu_index) == next_queue->cpu_index' compare in padata_get_next() fail in this case by initializing the cpu_index member of all per-cpu parallel queues. Use -1 for unused ones.
Signed-off-by: Mathias Krause minipli@googlemail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/padata.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/padata.c b/kernel/padata.c
index 8aef48c3267b..4f860043a8e5 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -461,8 +461,14 @@ static void padata_init_pqueues(struct parallel_data *pd)
 	struct padata_parallel_queue *pqueue;

 	cpu_index = 0;
-	for_each_cpu(cpu, pd->cpumask.pcpu) {
+	for_each_possible_cpu(cpu) {
 		pqueue = per_cpu_ptr(pd->pqueue, cpu);
+
+		if (!cpumask_test_cpu(cpu, pd->cpumask.pcpu)) {
+			pqueue->cpu_index = -1;
+			continue;
+		}
+
 		pqueue->pd = pd;
 		pqueue->cpu_index = cpu_index;
 		cpu_index++;
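Editorial aside on the padata patch above: the effect of giving unused queues a cpu_index of -1 can be checked with a small userspace model. The sketch assumes eight possible CPUs and represents the parallel cpumask as a plain bitmask; the model_* names and MODEL_NR_CPUS are hypothetical and only mimic the shape of padata_init_pqueues().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8 /* hypothetical number of possible CPUs */

struct model_pqueue {
	int cpu_index;
};

/* pcpu_mask: bit n set means CPU n takes part in parallel processing. */
static void model_init_pqueues(struct model_pqueue *queues,
			       unsigned int pcpu_mask)
{
	int cpu_index = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!(pcpu_mask & (1U << cpu))) {
			/* Unused queue: make every index comparison fail. */
			queues[cpu].cpu_index = -1;
			continue;
		}
		queues[cpu].cpu_index = cpu_index++;
	}
}

/* Models the compare from the commit message: does this CPU own the next queue? */
static bool model_is_target_cpu(const struct model_pqueue *queues,
				int this_cpu, int next_cpu_index)
{
	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);
	return queues[this_cpu].cpu_index == next_cpu_index;
}
```

With only CPUs 2 and 3 in the mask, a zero-filled entry for CPU 0 would compare equal to cpu_index 0. After model_init_pqueues() it holds -1, so CPU 0 no longer mistakes itself for the target.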
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit c743f0a5c50f2fcbc628526279cfa24f3dabe182 ]
More users for for_each_cpu_wrap() have appeared. Promote the construct to generic cpumask interface.
The implementation is slightly modified to reduce arguments.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Lauro Ramos Venancio lvenanci@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Mike Galbraith efault@gmx.de Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: lwang@redhat.com Link: http://lkml.kernel.org/r/20170414122005.o35me2h5nowqkxbv@hirez.programming.k... Signed-off-by: Ingo Molnar mingo@kernel.org [dj: include only what's added to the cpumask interface, 4.4 doesn't have them in the scheduler] Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/cpumask.h | 17 +++++++++++++++++ lib/cpumask.c | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+)
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index bb3a4bb35183..1322883e7b46 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -232,6 +232,23 @@ unsigned int cpumask_local_spread(unsigned int i, int node);
 		(cpu) = cpumask_next_zero((cpu), (mask)),	\
 		(cpu) < nr_cpu_ids;)

+extern int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap);
+
+/**
+ * for_each_cpu_wrap - iterate over every cpu in a mask, starting at a specified location
+ * @cpu: the (optionally unsigned) integer iterator
+ * @mask: the cpumask pointer
+ * @start: the start location
+ *
+ * The implementation does not assume any bit in @mask is set (including @start).
+ *
+ * After the loop, cpu is >= nr_cpu_ids.
+ */
+#define for_each_cpu_wrap(cpu, mask, start)					\
+	for ((cpu) = cpumask_next_wrap((start)-1, (mask), (start), false);	\
+	     (cpu) < nr_cpumask_bits;						\
+	     (cpu) = cpumask_next_wrap((cpu), (mask), (start), true))
+
 /**
  * for_each_cpu_and - iterate over every cpu in both masks
  * @cpu: the (optionally unsigned) integer iterator
diff --git a/lib/cpumask.c b/lib/cpumask.c
index 5a70f6196f57..24f06e7abf92 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -42,6 +42,38 @@ int cpumask_any_but(const struct cpumask *mask, unsigned int cpu)
 	return i;
 }

+/**
+ * cpumask_next_wrap - helper to implement for_each_cpu_wrap
+ * @n: the cpu prior to the place to search
+ * @mask: the cpumask pointer
+ * @start: the start point of the iteration
+ * @wrap: assume @n crossing @start terminates the iteration
+ *
+ * Returns >= nr_cpu_ids on completion
+ *
+ * Note: the @wrap argument is required for the start condition when
+ * we cannot assume @start is set in @mask.
+ */
+int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
+{
+	int next;
+
+again:
+	next = cpumask_next(n, mask);
+
+	if (wrap && n < start && next >= start) {
+		return nr_cpumask_bits;
+
+	} else if (next >= nr_cpumask_bits) {
+		wrap = true;
+		n = -1;
+		goto again;
+	}
+
+	return next;
+}
+EXPORT_SYMBOL(cpumask_next_wrap);
+
 /* These are not inline because of header tangles. */
 #ifdef CONFIG_CPUMASK_OFFSTACK
 /**
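Editorial aside on the cpumask patch above: the wrap-around logic of cpumask_next_wrap() is easier to follow when it can be run. The sketch below transcribes the same control flow into userspace C over an 8-bit mask. MODEL_NR_BITS stands in for nr_cpumask_bits, model_next() is a simplified stand-in for cpumask_next(), and model_visit_order() is a hypothetical helper that records the iteration order; none of these names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_BITS 8 /* hypothetical stand-in for nr_cpumask_bits */

/* Like cpumask_next(): first set bit strictly after n, or MODEL_NR_BITS. */
static int model_next(int n, unsigned int mask)
{
	for (int cpu = n + 1; cpu < MODEL_NR_BITS; cpu++)
		if (mask & (1U << cpu))
			return cpu;
	return MODEL_NR_BITS;
}

/* Same control flow as cpumask_next_wrap() in the patch. */
static int model_next_wrap(int n, unsigned int mask, int start, bool wrap)
{
	int next;

again:
	next = model_next(n, mask);

	if (wrap && n < start && next >= start) {
		/* Crossed @start after wrapping: iteration is complete. */
		return MODEL_NR_BITS;
	} else if (next >= MODEL_NR_BITS) {
		/* Ran off the end: restart from bit 0, now in wrapped mode. */
		wrap = true;
		n = -1;
		goto again;
	}

	return next;
}

/* Records the order in which the wrap loop visits set bits; returns the count. */
static int model_visit_order(unsigned int mask, int start, int *out)
{
	int count = 0;

	assert(start >= 0 && start < MODEL_NR_BITS);
	for (int cpu = model_next_wrap(start - 1, mask, start, false);
	     cpu < MODEL_NR_BITS;
	     cpu = model_next_wrap(cpu, mask, start, true))
		out[count++] = cpu;
	return count;
}
```

For a mask with bits 0, 2, 3 and 5 set (0x2d) and start 3, the loop visits 3, 5, 0, 2. With start 4, which is not set in the mask, it visits 5, 0, 2, 3. An empty mask yields no iterations.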
Hi,
-----Original Message----- From: stable-owner@vger.kernel.org [mailto:stable-owner@vger.kernel.org] On Behalf Of Greg Kroah-Hartman Sent: Wednesday, May 27, 2020 3:53 AM To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org; stable@vger.kernel.org; Peter Zijlstra (Intel) peterz@infradead.org; Lauro Ramos Venancio lvenanci@redhat.com; Linus Torvalds torvalds@linux-foundation.org; Mike Galbraith efault@gmx.de; Rik van Riel riel@redhat.com; Thomas Gleixner tglx@linutronix.de; lwang@redhat.com; Ingo Molnar mingo@kernel.org; Daniel Jordan daniel.m.jordan@oracle.com; Sasha Levin sashal@kernel.org Subject: [PATCH 4.4 26/65] sched/fair, cpumask: Export for_each_cpu_wrap()
This commit also needs the following commits:
commit d207af2eab3f8668b95ad02b21930481c42806fd Author: Michael Kelley mhkelley@outlook.com Date: Wed Feb 14 02:54:03 2018 +0000
cpumask: Make for_each_cpu_wrap() available on UP as well
for_each_cpu_wrap() was originally added in the #else half of a large "#if NR_CPUS == 1" statement, but was omitted in the #if half. This patch adds the missing #if half to prevent compile errors when NR_CPUS is 1.
Reported-by: kbuild test robot fengguang.wu@intel.com Signed-off-by: Michael Kelley mhkelley@outlook.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: kys@microsoft.com Cc: martin.petersen@oracle.com Cc: mikelley@microsoft.com Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()") Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901M... Signed-off-by: Ingo Molnar mingo@kernel.org
Please apply this commit.
Best regards, Nobuhiro
On Wed, May 27, 2020 at 07:50:56AM +0000, nobuhiro1.iwamatsu@toshiba.co.jp wrote:
Hi,
This commit also needs the following commits:
commit d207af2eab3f8668b95ad02b21930481c42806fd Author: Michael Kelley mhkelley@outlook.com Date: Wed Feb 14 02:54:03 2018 +0000
cpumask: Make for_each_cpu_wrap() available on UP as well
for_each_cpu_wrap() was originally added in the #else half of a large "#if NR_CPUS == 1" statement, but was omitted in the #if half. This patch adds the missing #if half to prevent compile errors when NR_CPUS is 1. Reported-by: kbuild test robot fengguang.wu@intel.com Signed-off-by: Michael Kelley mhkelley@outlook.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: kys@microsoft.com Cc: martin.petersen@oracle.com Cc: mikelley@microsoft.com Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()") Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901M... Signed-off-by: Ingo Molnar mingo@kernel.org
Please apply this commit.
Good catch, now queued up, thanks.
greg k-h
On 5/27/20 4:09 AM, Greg KH wrote:
On Wed, May 27, 2020 at 07:50:56AM +0000, nobuhiro1.iwamatsu@toshiba.co.jp wrote:
Subject: [PATCH 4.4 26/65] sched/fair, cpumask: Export for_each_cpu_wrap()
...
This commit also needs the following commits:
commit d207af2eab3f8668b95ad02b21930481c42806fd Author: Michael Kelley mhkelley@outlook.com Date: Wed Feb 14 02:54:03 2018 +0000
cpumask: Make for_each_cpu_wrap() available on UP as well
for_each_cpu_wrap() was originally added in the #else half of a large "#if NR_CPUS == 1" statement, but was omitted in the #if half. This patch adds the missing #if half to prevent compile errors when NR_CPUS is 1. Reported-by: kbuild test robot fengguang.wu@intel.com Signed-off-by: Michael Kelley mhkelley@outlook.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: kys@microsoft.com Cc: martin.petersen@oracle.com Cc: mikelley@microsoft.com Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()") Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901M... Signed-off-by: Ingo Molnar mingo@kernel.org
Please apply this commit.
Good catch, now queued up, thanks.
I left this commit out because the 4.4 kernel only uses cpumask_next_wrap in padata, which is only enabled for SMP kernels, but it's probably best to be safe.
Daniel
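For anyone following the exchange above, this is a minimal userspace sketch of why the UP variant is cheap to carry: with NR_CPUS == 1 a wrapping CPU iterator can collapse to a single pass over CPU 0. The macro body is modelled on the description in the quoted commit message and is an assumption, not a copy of include/linux/cpumask.h; count_iterations() is a hypothetical helper.

```c
#include <assert.h>

/* Userspace sketch only: NR_CPUS and the macro body below are
 * assumptions for illustration, not the kernel's definitions. */
#define NR_CPUS 1

#if NR_CPUS == 1
/* Only CPU 0 exists: visit it once, ignoring mask and start. */
#define for_each_cpu_wrap(cpu, mask, start) \
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)(mask), (void)(start))
#else
#error "SMP variant intentionally omitted from this sketch"
#endif

/* Hypothetical helper: count how many CPUs the iterator visits. */
static int count_iterations(unsigned long mask, int start)
{
	int cpu, n = 0;

	for_each_cpu_wrap(cpu, mask, start) {
		assert(cpu == 0);	/* the only CPU a UP build can visit */
		n++;
	}
	return n;
}
```

Without a definition in the "#if NR_CPUS == 1" half, any caller of the iterator would fail to build on UP, which is the compile error the quoted commit describes.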
From: Herbert Xu herbert@gondor.apana.org.au
[ Upstream commit 6fc4dbcf0276279d488c5fbbfabe94734134f4fa ]
The function padata_reorder will use a timer when it cannot progress while completed jobs are outstanding (pd->reorder_objects > 0). This is suboptimal: whenever the timer does end up being used, it introduces a gratuitous delay of one second.
In fact we can easily distinguish between whether completed jobs are outstanding and whether we can make progress. All we have to do is look at the next pqueue list.
This patch does that by replacing pd->processed with pd->cpu so that the next pqueue is more accessible.
A work queue is used instead of the original try_again to avoid hogging the CPU.
Note that we don't bother removing the work queue in padata_flush_queues because the whole premise is broken. You cannot flush async crypto requests so it makes no sense to even try. A subsequent patch will fix it by replacing it with a ref counting scheme.
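The scheme described in this commit message can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions, not the kernel code: struct pd_sketch, next_cpu_wrap(), get_next() and should_requeue() are hypothetical names, the per-CPU reorder lists are reduced to counters, and all locking is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4	/* hypothetical number of possible CPUs */

/* Simplified stand-in for struct parallel_data: a counter per CPU
 * replaces the per-CPU reorder list. */
struct pd_sketch {
	unsigned int cpumask;	/* bit n set: CPU n takes part */
	int pending[NCPUS];	/* completed jobs queued per CPU */
	int cpu;		/* next CPU to be processed */
};

/* Next set bit after cpu, wrapping around to the lowest set bit. */
static int next_cpu_wrap(int cpu, unsigned int mask)
{
	int i;

	assert(mask & ((1u << NCPUS) - 1));
	for (i = 1; i <= NCPUS; i++) {
		int candidate = (cpu + i) % NCPUS;

		if (mask & (1u << candidate))
			return candidate;
	}
	return cpu;	/* not reached: the assert guarantees a set bit */
}

/* Take the next in-order object if its queue already holds one.
 * Returns false when the next queue is still empty (no progress). */
static bool get_next(struct pd_sketch *pd)
{
	if (pd->pending[pd->cpu] == 0)
		return false;

	pd->pending[pd->cpu]--;
	pd->cpu = next_cpu_wrap(pd->cpu, pd->cpumask);
	return true;
}

/* After a reorder pass, another pass is only worthwhile if the queue
 * the cursor now points at has work; no timer is needed to find out. */
static bool should_requeue(const struct pd_sketch *pd)
{
	return pd->pending[pd->cpu] > 0;
}
```

The point of keeping a CPU cursor instead of a processed count is that "can we make progress?" becomes a single emptiness check on one queue.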
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au [dj: - adjust context - corrected setup_timer -> timer_setup to delete hunk - skip padata_flush_queues() hunk, function already removed in 4.4] Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/padata.h | 13 ++---- kernel/padata.c | 95 ++++++++---------------------------------- 2 files changed, 22 insertions(+), 86 deletions(-)
diff --git a/include/linux/padata.h b/include/linux/padata.h index e74d61fa50fe..547a8d1e4a3b 100644 --- a/include/linux/padata.h +++ b/include/linux/padata.h @@ -24,7 +24,6 @@ #include <linux/workqueue.h> #include <linux/spinlock.h> #include <linux/list.h> -#include <linux/timer.h> #include <linux/notifier.h> #include <linux/kobject.h>
@@ -85,18 +84,14 @@ struct padata_serial_queue { * @serial: List to wait for serialization after reordering. * @pwork: work struct for parallelization. * @swork: work struct for serialization. - * @pd: Backpointer to the internal control structure. * @work: work struct for parallelization. - * @reorder_work: work struct for reordering. * @num_obj: Number of objects that are processed by this cpu. * @cpu_index: Index of the cpu. */ struct padata_parallel_queue { struct padata_list parallel; struct padata_list reorder; - struct parallel_data *pd; struct work_struct work; - struct work_struct reorder_work; atomic_t num_obj; int cpu_index; }; @@ -122,10 +117,10 @@ struct padata_cpumask { * @reorder_objects: Number of objects waiting in the reorder queues. * @refcnt: Number of objects holding a reference on this parallel_data. * @max_seq_nr: Maximal used sequence number. + * @cpu: Next CPU to be processed. * @cpumask: The cpumasks in use for parallel and serial workers. + * @reorder_work: work struct for reordering. * @lock: Reorder lock. - * @processed: Number of already processed objects. - * @timer: Reorder timer. */ struct parallel_data { struct padata_instance *pinst; @@ -134,10 +129,10 @@ struct parallel_data { atomic_t reorder_objects; atomic_t refcnt; atomic_t seq_nr; + int cpu; struct padata_cpumask cpumask; + struct work_struct reorder_work; spinlock_t lock ____cacheline_aligned; - unsigned int processed; - struct timer_list timer; };
/** diff --git a/kernel/padata.c b/kernel/padata.c index 4f860043a8e5..e5966eedfa36 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -165,23 +165,12 @@ EXPORT_SYMBOL(padata_do_parallel); */ static struct padata_priv *padata_get_next(struct parallel_data *pd) { - int cpu, num_cpus; - unsigned int next_nr, next_index; struct padata_parallel_queue *next_queue; struct padata_priv *padata; struct padata_list *reorder; + int cpu = pd->cpu;
- num_cpus = cpumask_weight(pd->cpumask.pcpu); - - /* - * Calculate the percpu reorder queue and the sequence - * number of the next object. - */ - next_nr = pd->processed; - next_index = next_nr % num_cpus; - cpu = padata_index_to_cpu(pd, next_index); next_queue = per_cpu_ptr(pd->pqueue, cpu); - reorder = &next_queue->reorder;
spin_lock(&reorder->lock); @@ -192,7 +181,8 @@ static struct padata_priv *padata_get_next(struct parallel_data *pd) list_del_init(&padata->list); atomic_dec(&pd->reorder_objects);
- pd->processed++; + pd->cpu = cpumask_next_wrap(cpu, pd->cpumask.pcpu, -1, + false);
spin_unlock(&reorder->lock); goto out; @@ -215,6 +205,7 @@ static void padata_reorder(struct parallel_data *pd) struct padata_priv *padata; struct padata_serial_queue *squeue; struct padata_instance *pinst = pd->pinst; + struct padata_parallel_queue *next_queue;
/* * We need to ensure that only one cpu can work on dequeueing of @@ -246,7 +237,6 @@ static void padata_reorder(struct parallel_data *pd) * so exit immediately. */ if (PTR_ERR(padata) == -ENODATA) { - del_timer(&pd->timer); spin_unlock_bh(&pd->lock); return; } @@ -265,70 +255,29 @@ static void padata_reorder(struct parallel_data *pd)
/* * The next object that needs serialization might have arrived to - * the reorder queues in the meantime, we will be called again - * from the timer function if no one else cares for it. + * the reorder queues in the meantime. * - * Ensure reorder_objects is read after pd->lock is dropped so we see - * an increment from another task in padata_do_serial. Pairs with + * Ensure reorder queue is read after pd->lock is dropped so we see + * new objects from another task in padata_do_serial. Pairs with * smp_mb__after_atomic in padata_do_serial. */ smp_mb(); - if (atomic_read(&pd->reorder_objects) - && !(pinst->flags & PADATA_RESET)) - mod_timer(&pd->timer, jiffies + HZ); - else - del_timer(&pd->timer);
- return; + next_queue = per_cpu_ptr(pd->pqueue, pd->cpu); + if (!list_empty(&next_queue->reorder.list)) + queue_work(pinst->wq, &pd->reorder_work); }
static void invoke_padata_reorder(struct work_struct *work) { - struct padata_parallel_queue *pqueue; struct parallel_data *pd;
local_bh_disable(); - pqueue = container_of(work, struct padata_parallel_queue, reorder_work); - pd = pqueue->pd; + pd = container_of(work, struct parallel_data, reorder_work); padata_reorder(pd); local_bh_enable(); }
-static void padata_reorder_timer(unsigned long arg) -{ - struct parallel_data *pd = (struct parallel_data *)arg; - unsigned int weight; - int target_cpu, cpu; - - cpu = get_cpu(); - - /* We don't lock pd here to not interfere with parallel processing - * padata_reorder() calls on other CPUs. We just need any CPU out of - * the cpumask.pcpu set. It would be nice if it's the right one but - * it doesn't matter if we're off to the next one by using an outdated - * pd->processed value. - */ - weight = cpumask_weight(pd->cpumask.pcpu); - target_cpu = padata_index_to_cpu(pd, pd->processed % weight); - - /* ensure to call the reorder callback on the correct CPU */ - if (cpu != target_cpu) { - struct padata_parallel_queue *pqueue; - struct padata_instance *pinst; - - /* The timer function is serialized wrt itself -- no locking - * needed. - */ - pinst = pd->pinst; - pqueue = per_cpu_ptr(pd->pqueue, target_cpu); - queue_work_on(target_cpu, pinst->wq, &pqueue->reorder_work); - } else { - padata_reorder(pd); - } - - put_cpu(); -} - static void padata_serial_worker(struct work_struct *serial_work) { struct padata_serial_queue *squeue; @@ -382,9 +331,8 @@ void padata_do_serial(struct padata_priv *padata)
cpu = get_cpu();
- /* We need to run on the same CPU padata_do_parallel(.., padata, ..) - * was called on -- or, at least, enqueue the padata object into the - * correct per-cpu queue. + /* We need to enqueue the padata object into the correct + * per-cpu queue. */ if (cpu != padata->cpu) { reorder_via_wq = 1; @@ -394,12 +342,12 @@ void padata_do_serial(struct padata_priv *padata) pqueue = per_cpu_ptr(pd->pqueue, cpu);
spin_lock(&pqueue->reorder.lock); - atomic_inc(&pd->reorder_objects); list_add_tail(&padata->list, &pqueue->reorder.list); + atomic_inc(&pd->reorder_objects); spin_unlock(&pqueue->reorder.lock);
/* - * Ensure the atomic_inc of reorder_objects above is ordered correctly + * Ensure the addition to the reorder list is ordered correctly * with the trylock of pd->lock in padata_reorder. Pairs with smp_mb * in padata_reorder. */ @@ -407,13 +355,7 @@ void padata_do_serial(struct padata_priv *padata)
put_cpu();
- /* If we're running on the wrong CPU, call padata_reorder() via a - * kernel worker. - */ - if (reorder_via_wq) - queue_work_on(cpu, pd->pinst->wq, &pqueue->reorder_work); - else - padata_reorder(pd); + padata_reorder(pd); } EXPORT_SYMBOL(padata_do_serial);
@@ -469,14 +411,12 @@ static void padata_init_pqueues(struct parallel_data *pd) continue; }
- pqueue->pd = pd; pqueue->cpu_index = cpu_index; cpu_index++;
__padata_list_init(&pqueue->reorder); __padata_list_init(&pqueue->parallel); INIT_WORK(&pqueue->work, padata_parallel_worker); - INIT_WORK(&pqueue->reorder_work, invoke_padata_reorder); atomic_set(&pqueue->num_obj, 0); } } @@ -504,12 +444,13 @@ static struct parallel_data *padata_alloc_pd(struct padata_instance *pinst,
padata_init_pqueues(pd); padata_init_squeues(pd); - setup_timer(&pd->timer, padata_reorder_timer, (unsigned long)pd); atomic_set(&pd->seq_nr, -1); atomic_set(&pd->reorder_objects, 0); atomic_set(&pd->refcnt, 1); pd->pinst = pinst; spin_lock_init(&pd->lock); + pd->cpu = cpumask_first(pcpumask); + INIT_WORK(&pd->reorder_work, invoke_padata_reorder);
return pd;
From: Daniel Jordan daniel.m.jordan@oracle.com
[ Upstream commit ec9c7d19336ee98ecba8de80128aa405c45feebb ]
Exercising CPU hotplug on a 5.2 kernel with recent padata fixes from cryptodev-2.6.git in an 8-CPU kvm guest...
  # modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3
  # echo 0 > /sys/devices/system/cpu/cpu1/online
  # echo c > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
  # modprobe tcrypt mode=215
...caused the following crash:
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 2 PID: 134 Comm: kworker/2:2 Not tainted 5.2.0-padata-base+ #7
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-<snip>
  Workqueue: pencrypt padata_parallel_worker
  RIP: 0010:padata_reorder+0xcb/0x180
  ...
  Call Trace:
   padata_do_serial+0x57/0x60
   pcrypt_aead_enc+0x3a/0x50 [pcrypt]
   padata_parallel_worker+0x9b/0xe0
   process_one_work+0x1b5/0x3f0
   worker_thread+0x4a/0x3c0
   ...
In padata_alloc_pd, pd->cpu is set using the user-supplied cpumask instead of the effective cpumask, and in this case cpumask_first picked an offline CPU.
The offline CPU's reorder->list.next is NULL in padata_reorder because the list wasn't initialized in padata_init_pqueues, which only operates on CPUs in the effective mask.
Fix by using the effective mask in padata_alloc_pd.
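To illustrate the difference between the two masks, here is a small userspace sketch. It assumes the effective mask is the user-supplied mask restricted to online CPUs; mask_first() and effective_mask() are hypothetical stand-ins for the kernel's cpumask helpers, and the mask values in the usage note are made up.

```c
#include <assert.h>

/* Lowest set bit of a non-empty mask; stand-in for cpumask_first(). */
static int mask_first(unsigned int mask)
{
	int cpu = 0;

	assert(mask != 0);
	while (!(mask & 1u)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/* Assumed definition: CPUs that are both requested and online. */
static unsigned int effective_mask(unsigned int user, unsigned int online)
{
	return user & online;
}
```

With a user mask of 0x3 and CPU 0 offline (online mask 0xfe), mask_first(0x3) picks CPU 0, whose queue was never initialised, while mask_first(effective_mask(0x3, 0xfe)) picks CPU 1.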
Fixes: 6fc4dbcf0276 ("padata: Replace delayed timer with immediate workqueue in padata_reorder")
Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com
Cc: Herbert Xu herbert@gondor.apana.org.au
Cc: Steffen Klassert steffen.klassert@secunet.com
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu herbert@gondor.apana.org.au
Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/padata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/padata.c b/kernel/padata.c
index e5966eedfa36..43b72f5dfe07 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -449,7 +449,7 @@ static struct parallel_data *padata_alloc_pd(struct padata_instance *pinst,
 	atomic_set(&pd->refcnt, 1);
 	pd->pinst = pinst;
 	spin_lock_init(&pd->lock);
-	pd->cpu = cpumask_first(pcpumask);
+	pd->cpu = cpumask_first(pd->cpumask.pcpu);
 	INIT_WORK(&pd->reorder_work, invoke_padata_reorder);
return pd;
From: Daniel Jordan daniel.m.jordan@oracle.com
[ Upstream commit 065cf577135a4977931c7a1e1edf442bfd9773dd ]
With the removal of the padata timer, padata_do_serial no longer needs special CPU handling, so remove it.
Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Cc: Herbert Xu herbert@gondor.apana.org.au Cc: Steffen Klassert steffen.klassert@secunet.com Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Daniel Jordan daniel.m.jordan@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/padata.c | 23 +++-------------------- 1 file changed, 3 insertions(+), 20 deletions(-)
diff --git a/kernel/padata.c b/kernel/padata.c index 43b72f5dfe07..c50975f43b34 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -322,24 +322,9 @@ static void padata_serial_worker(struct work_struct *serial_work) */ void padata_do_serial(struct padata_priv *padata) { - int cpu; - struct padata_parallel_queue *pqueue; - struct parallel_data *pd; - int reorder_via_wq = 0; - - pd = padata->pd; - - cpu = get_cpu(); - - /* We need to enqueue the padata object into the correct - * per-cpu queue. - */ - if (cpu != padata->cpu) { - reorder_via_wq = 1; - cpu = padata->cpu; - } - - pqueue = per_cpu_ptr(pd->pqueue, cpu); + struct parallel_data *pd = padata->pd; + struct padata_parallel_queue *pqueue = per_cpu_ptr(pd->pqueue, + padata->cpu);
spin_lock(&pqueue->reorder.lock); list_add_tail(&padata->list, &pqueue->reorder.list); @@ -353,8 +338,6 @@ void padata_do_serial(struct padata_priv *padata) */ smp_mb__after_atomic();
- put_cpu(); - padata_reorder(pd); } EXPORT_SYMBOL(padata_do_serial);
From: Brent Lu brent.lu@intel.com
commit e7513c5786f8b33f0c107b3759e433bc6cbb2efa upstream.
There is a corner case in which ALSA keeps increasing the hw_ptr even though DMA stopped working/updating the position a long time ago.
In the following log we can see that the position returned from the DMA driver does not move at all, but the hw_ptr got increased at some point in time, so snd_pcm_avail() returns a large number, which looks like a buffer underrun event from the user space program's point of view. The program thinks there is space in the buffer and fills in more data.
[ 418.510086] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 4096 avail 12368
[ 418.510149] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 6910 avail 9554
...
[ 418.681052] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 15102 avail 1362
[ 418.681130] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
[ 418.726515] sound pcmC0D5p: pos 96 hw_ptr 16464 appl_ptr 16464 avail 16368
This is because the hw_base is increased by runtime->buffer_size frames unconditionally if the hw_ptr has not been updated for over half of the buffer time. As the hw_base increases, the hw_ptr increases by the same amount.
The avail value returned from snd_pcm_avail() can easily exceed the limit (buffer_size) because the hw_ptr itself is increased by the same buffer_size samples when the corner case happens. In the following log, the buffer_size is 16368 samples but the avail is 21810 samples, so the CRAS server complains about it.
[ 418.851755] sound pcmC0D5p: pos 96 hw_ptr 16464 appl_ptr 27390 avail 5442
[ 418.926491] sound pcmC0D5p: pos 96 hw_ptr 32832 appl_ptr 27390 avail 21810
cras_server[1907]: pcm_avail returned frames larger than buf_size: sof-glkda7219max: :0,5: 21810 > 16368
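As a worked check of the numbers in these logs, the sketch below reproduces the playback arithmetic, assuming avail = hw_ptr + buffer_size - appl_ptr with wrap-around at a boundary. The function name playback_avail() and the boundary value used in the usage note are illustrative assumptions, not the kernel's code.

```c
#include <assert.h>

/* Frames the application may still write for playback. Both pointers
 * are free-running counters that wrap at `boundary`. */
static long playback_avail(long hw_ptr, long appl_ptr,
			   long buffer_size, long boundary)
{
	long avail = hw_ptr + buffer_size - appl_ptr;

	if (avail < 0)
		avail += boundary;
	else if (avail >= boundary)
		avail -= boundary;
	return avail;
}
```

With buffer_size = 16368 and an arbitrary large boundary, hw_ptr 96 and appl_ptr 4096 give 12368, matching the first log line. Once hw_ptr has jumped to 32832 while appl_ptr is 27390, the result is 21810, which is larger than the buffer itself and is exactly what cras_server rejects.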
By updating runtime->hw_ptr_jiffies each time HWSYNC is called, the hw_base stays the same when a buffer stall happens, as long as the interval between HWSYNC calls is shorter than half of the buffer time.
The following log was captured on a patched kernel. The hw_base/hw_ptr value stays fixed in this corner case, and the user space program should be aware of the buffer stall and handle it.
[ 293.525543] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 4096 avail 12368
[ 293.525606] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 6880 avail 9584
[ 293.525975] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 10976 avail 5488
[ 293.611178] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 15072 avail 1392
[ 293.696429] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
...
[ 381.139517] sound pcmC0D5p: pos 96 hw_ptr 96 appl_ptr 16464 avail 0
Signed-off-by: Brent Lu brent.lu@intel.com
Reviewed-by: Jaroslav Kysela perex@perex.cz
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1589776238-23877-1-git-send-email-brent.lu@intel.c...
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- sound/core/pcm_lib.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -456,6 +456,7 @@ static int snd_pcm_update_hw_ptr0(struct
 
  no_delta_check:
 	if (runtime->status->hw_ptr == new_hw_ptr) {
+		runtime->hw_ptr_jiffies = curr_jiffies;
 		update_audio_tstamp(substream, &curr_tstamp, &audio_tstamp);
 		return 0;
 	}
From: Theodore Ts'o tytso@mit.edu
commit dac7a4b4b1f664934e8b713f529b629f67db313c upstream.
We must lock the xattr block before calculating or verifying the checksum in order to avoid spurious checksum failures.
https://bugzilla.kernel.org/show_bug.cgi?id=193661
Reported-by: Colin Ian King colin.king@canonical.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Cc: stable@vger.kernel.org
Signed-off-by: Sultan Alsawaf sultan@kerneltoast.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- fs/ext4/xattr.c | 66 +++++++++++++++++++++++++++----------------------------- 1 file changed, 32 insertions(+), 34 deletions(-)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -139,31 +139,26 @@ static __le32 ext4_xattr_block_csum(stru }
static int ext4_xattr_block_csum_verify(struct inode *inode, - sector_t block_nr, - struct ext4_xattr_header *hdr) + struct buffer_head *bh) { - if (ext4_has_metadata_csum(inode->i_sb) && - (hdr->h_checksum != ext4_xattr_block_csum(inode, block_nr, hdr))) - return 0; - return 1; -} - -static void ext4_xattr_block_csum_set(struct inode *inode, - sector_t block_nr, - struct ext4_xattr_header *hdr) -{ - if (!ext4_has_metadata_csum(inode->i_sb)) - return; + struct ext4_xattr_header *hdr = BHDR(bh); + int ret = 1;
- hdr->h_checksum = ext4_xattr_block_csum(inode, block_nr, hdr); + if (ext4_has_metadata_csum(inode->i_sb)) { + lock_buffer(bh); + ret = (hdr->h_checksum == ext4_xattr_block_csum(inode, + bh->b_blocknr, hdr)); + unlock_buffer(bh); + } + return ret; }
-static inline int ext4_handle_dirty_xattr_block(handle_t *handle, - struct inode *inode, - struct buffer_head *bh) +static void ext4_xattr_block_csum_set(struct inode *inode, + struct buffer_head *bh) { - ext4_xattr_block_csum_set(inode, bh->b_blocknr, BHDR(bh)); - return ext4_handle_dirty_metadata(handle, inode, bh); + if (ext4_has_metadata_csum(inode->i_sb)) + BHDR(bh)->h_checksum = ext4_xattr_block_csum(inode, + bh->b_blocknr, BHDR(bh)); }
static inline const struct xattr_handler * @@ -226,7 +221,7 @@ ext4_xattr_check_block(struct inode *ino if (buffer_verified(bh)) return 0;
- if (!ext4_xattr_block_csum_verify(inode, bh->b_blocknr, BHDR(bh))) + if (!ext4_xattr_block_csum_verify(inode, bh)) return -EFSBADCRC; error = ext4_xattr_check_names(BFIRST(bh), bh->b_data + bh->b_size, bh->b_data); @@ -590,23 +585,23 @@ ext4_xattr_release_block(handle_t *handl le32_add_cpu(&BHDR(bh)->h_refcount, -1); if (ce) mb_cache_entry_release(ce); + + ext4_xattr_block_csum_set(inode, bh); /* * Beware of this ugliness: Releasing of xattr block references * from different inodes can race and so we have to protect * from a race where someone else frees the block (and releases * its journal_head) before we are done dirtying the buffer. In * nojournal mode this race is harmless and we actually cannot - * call ext4_handle_dirty_xattr_block() with locked buffer as + * call ext4_handle_dirty_metadata() with locked buffer as * that function can call sync_dirty_buffer() so for that case * we handle the dirtying after unlocking the buffer. */ if (ext4_handle_valid(handle)) - error = ext4_handle_dirty_xattr_block(handle, inode, - bh); + error = ext4_handle_dirty_metadata(handle, inode, bh); unlock_buffer(bh); if (!ext4_handle_valid(handle)) - error = ext4_handle_dirty_xattr_block(handle, inode, - bh); + error = ext4_handle_dirty_metadata(handle, inode, bh); if (IS_SYNC(inode)) ext4_handle_sync(handle); dquot_free_block(inode, EXT4_C2B(EXT4_SB(inode->i_sb), 1)); @@ -837,13 +832,14 @@ ext4_xattr_block_set(handle_t *handle, s ext4_xattr_rehash(header(s->base), s->here); } + ext4_xattr_block_csum_set(inode, bs->bh); unlock_buffer(bs->bh); if (error == -EFSCORRUPTED) goto bad_block; if (!error) - error = ext4_handle_dirty_xattr_block(handle, - inode, - bs->bh); + error = ext4_handle_dirty_metadata(handle, + inode, + bs->bh); if (error) goto cleanup; goto inserted; @@ -912,10 +908,11 @@ inserted: le32_add_cpu(&BHDR(new_bh)->h_refcount, 1); ea_bdebug(new_bh, "reusing; refcount now=%d", le32_to_cpu(BHDR(new_bh)->h_refcount)); + ext4_xattr_block_csum_set(inode, new_bh); 
unlock_buffer(new_bh); - error = ext4_handle_dirty_xattr_block(handle, - inode, - new_bh); + error = ext4_handle_dirty_metadata(handle, + inode, + new_bh); if (error) goto cleanup_dquot; } @@ -965,11 +962,12 @@ getblk_failed: goto getblk_failed; } memcpy(new_bh->b_data, s->base, new_bh->b_size); + ext4_xattr_block_csum_set(inode, new_bh); set_buffer_uptodate(new_bh); unlock_buffer(new_bh); ext4_xattr_cache_insert(ext4_mb_cache, new_bh); - error = ext4_handle_dirty_xattr_block(handle, - inode, new_bh); + error = ext4_handle_dirty_metadata(handle, inode, + new_bh); if (error) goto cleanup; }
From: Colin Ian King colin.king@canonical.com
commit 98e2630284ab741804bd0713e932e725466f2f84 upstream.
Currently the kfree of output.pointer can potentially free an uninitialized pointer in the case where out_data is NULL. Fix this by reworking the case where out_data is non-NULL to perform the ACPI status check and the kfree of output.pointer in one block, ensuring the pointer is only freed when it has been used.
Also replace the if (ptr != NULL) idiom with just if (ptr).
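The shape of the fix can be shown with a small userspace sketch. It is an analogy under stated assumptions, not the driver code: struct buffer, evaluate() and command() are hypothetical stand-ins for the ACPI buffer, wmi_evaluate_method() and the driver function, and malloc()/free() replace the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

struct buffer {
	size_t length;
	void *pointer;
};

/* Hypothetical stand-in for a method call that allocates a result
 * only when the caller supplies an output buffer. */
static int evaluate(struct buffer *out)
{
	if (out) {
		unsigned int *val = malloc(sizeof(*val));

		if (!val)
			return -1;
		*val = 42;
		out->pointer = val;
		out->length = sizeof(*val);
	}
	return 0;
}

static int command(unsigned int *out_data)
{
	struct buffer output;	/* deliberately left uninitialised here */
	int status;

	if (out_data) {
		output.length = 0;
		output.pointer = NULL;
		status = evaluate(&output);
		if (status == 0 && output.pointer)
			*out_data = *(unsigned int *)output.pointer;
		free(output.pointer);	/* always initialised on this path */
	} else {
		status = evaluate(NULL);	/* output is never touched */
	}
	return status;
}
```

Keeping the free in the same branch that initialises the buffer means the NULL-argument path never reads output at all.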
Fixes: ff0e9f26288d ("platform/x86: alienware-wmi: Correct a memory leak")
Signed-off-by: Colin Ian King colin.king@canonical.com
Signed-off-by: Darren Hart (VMware) dvhart@infradead.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- drivers/platform/x86/alienware-wmi.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-)
--- a/drivers/platform/x86/alienware-wmi.c +++ b/drivers/platform/x86/alienware-wmi.c @@ -449,23 +449,22 @@ static acpi_status alienware_hdmi_comman
input.length = (acpi_size) sizeof(*in_args); input.pointer = in_args; - if (out_data != NULL) { + if (out_data) { output.length = ACPI_ALLOCATE_BUFFER; output.pointer = NULL; status = wmi_evaluate_method(WMAX_CONTROL_GUID, 1, command, &input, &output); - } else + if (ACPI_SUCCESS(status)) { + obj = (union acpi_object *)output.pointer; + if (obj && obj->type == ACPI_TYPE_INTEGER) + *out_data = (u32)obj->integer.value; + } + kfree(output.pointer); + } else { status = wmi_evaluate_method(WMAX_CONTROL_GUID, 1, command, &input, NULL); - - if (ACPI_SUCCESS(status) && out_data != NULL) { - obj = (union acpi_object *)output.pointer; - if (obj && obj->type == ACPI_TYPE_INTEGER) - *out_data = (u32) obj->integer.value; } - kfree(output.pointer); return status; - }
static ssize_t show_hdmi_cable(struct device *dev,
From: Vishal Verma vishal.l.verma@intel.com
[ Upstream commit 2f8c9011151337d0bc106693f272f9bddbccfab2 ]
We call btt_log_read() twice, once to get the 'old' log entry, and again to get the 'new' entry. However, we have no use for the 'old' entry, so remove it.
Cc: Dan Williams dan.j.williams@intel.com Signed-off-by: Vishal Verma vishal.l.verma@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvdimm/btt.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c index 957234272ef7..727eaf203463 100644 --- a/drivers/nvdimm/btt.c +++ b/drivers/nvdimm/btt.c @@ -443,9 +443,9 @@ static int btt_log_init(struct arena_info *arena)
static int btt_freelist_init(struct arena_info *arena) { - int old, new, ret; + int new, ret; u32 i, map_entry; - struct log_entry log_new, log_old; + struct log_entry log_new;
arena->freelist = kcalloc(arena->nfree, sizeof(struct free_entry), GFP_KERNEL); @@ -453,10 +453,6 @@ static int btt_freelist_init(struct arena_info *arena) return -ENOMEM;
for (i = 0; i < arena->nfree; i++) { - old = btt_log_read(arena, i, &log_old, LOG_OLD_ENT); - if (old < 0) - return old; - new = btt_log_read(arena, i, &log_new, LOG_NEW_ENT); if (new < 0) return new;
From: Guillaume Nault g.nault@alphalink.fr
commit 0382a25af3c771a8e4d5e417d1834cbe28c2aaac upstream.
Socket flags aren't updated atomically, so the socket must be locked while reading the SOCK_ZAPPED flag.
This issue exists for both l2tp_ip and l2tp_ip6. For IPv6, this patch also brings error handling for __ip6_datagram_connect() failures.
Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/ipv6.h | 2 ++ net/ipv6/datagram.c | 4 +++- net/l2tp/l2tp_ip.c | 19 ++++++++++++------- net/l2tp/l2tp_ip6.c | 16 +++++++++++----- 4 files changed, 28 insertions(+), 13 deletions(-)
--- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -915,6 +915,8 @@ int compat_ipv6_setsockopt(struct sock * int compat_ipv6_getsockopt(struct sock *sk, int level, int optname, char __user *optval, int __user *optlen);
+int __ip6_datagram_connect(struct sock *sk, struct sockaddr *addr, + int addr_len); int ip6_datagram_connect(struct sock *sk, struct sockaddr *addr, int addr_len); int ip6_datagram_connect_v6_only(struct sock *sk, struct sockaddr *addr, int addr_len); --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -40,7 +40,8 @@ static bool ipv6_mapped_addr_any(const s return ipv6_addr_v4mapped(a) && (a->s6_addr32[3] == 0); }
-static int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) +int __ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, + int addr_len) { struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; struct inet_sock *inet = inet_sk(sk); @@ -213,6 +214,7 @@ out: fl6_sock_release(flowlabel); return err; } +EXPORT_SYMBOL_GPL(__ip6_datagram_connect);
int ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { --- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -321,21 +321,24 @@ static int l2tp_ip_connect(struct sock * struct sockaddr_l2tpip *lsa = (struct sockaddr_l2tpip *) uaddr; int rc;
- if (sock_flag(sk, SOCK_ZAPPED)) /* Must bind first - autobinding does not work */ - return -EINVAL; - if (addr_len < sizeof(*lsa)) return -EINVAL;
if (ipv4_is_multicast(lsa->l2tp_addr.s_addr)) return -EINVAL;
- rc = ip4_datagram_connect(sk, uaddr, addr_len); - if (rc < 0) - return rc; - lock_sock(sk);
+ /* Must bind first - autobinding does not work */ + if (sock_flag(sk, SOCK_ZAPPED)) { + rc = -EINVAL; + goto out_sk; + } + + rc = __ip4_datagram_connect(sk, uaddr, addr_len); + if (rc < 0) + goto out_sk; + l2tp_ip_sk(sk)->peer_conn_id = lsa->l2tp_conn_id;
write_lock_bh(&l2tp_ip_lock); @@ -343,7 +346,9 @@ static int l2tp_ip_connect(struct sock * sk_add_bind_node(sk, &l2tp_ip_bind_table); write_unlock_bh(&l2tp_ip_lock);
+out_sk: release_sock(sk); + return rc; }
--- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -383,9 +383,6 @@ static int l2tp_ip6_connect(struct sock int addr_type; int rc;
- if (sock_flag(sk, SOCK_ZAPPED)) /* Must bind first - autobinding does not work */ - return -EINVAL; - if (addr_len < sizeof(*lsa)) return -EINVAL;
@@ -402,10 +399,18 @@ static int l2tp_ip6_connect(struct sock return -EINVAL; }
- rc = ip6_datagram_connect(sk, uaddr, addr_len); - lock_sock(sk);
+ /* Must bind first - autobinding does not work */ + if (sock_flag(sk, SOCK_ZAPPED)) { + rc = -EINVAL; + goto out_sk; + } + + rc = __ip6_datagram_connect(sk, uaddr, addr_len); + if (rc < 0) + goto out_sk; + l2tp_ip6_sk(sk)->peer_conn_id = lsa->l2tp_conn_id;
write_lock_bh(&l2tp_ip6_lock); @@ -413,6 +418,7 @@ static int l2tp_ip6_connect(struct sock sk_add_bind_node(sk, &l2tp_ip6_bind_table); write_unlock_bh(&l2tp_ip6_lock);
+out_sk: release_sock(sk);
return rc;
From: Guillaume Nault g.nault@alphalink.fr
commit d5e3a190937a1e386671266202c62565741f0f1a upstream.
It's not enough to check for sockets bound to same address at the beginning of l2tp_ip{,6}_bind(): even if no socket is found at that time, a socket with the same address could be bound before we take the l2tp lock again.
This patch moves the lookup right before inserting the new socket, so that no change can ever happen to the list between address lookup and socket insertion.
Care is taken to avoid side effects on the socket in case of failure. That is, modifications of the socket are done after the lookup, when binding is guaranteed to succeed, and before releasing the l2tp lock, so that concurrent lookups will always see fully initialised sockets.
For l2tp_ip, 'ret' is set to -EINVAL before checking the SOCK_ZAPPED bit. Error code was mistakenly set to -EADDRINUSE on error by commit 32c231164b76 ("l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()"). Using -EINVAL restores original behaviour.
For l2tp_ip6, the lookup is now always done with the correct bound device. Before this patch, when binding to a link-local address, the lookup was done with the original sk->sk_bound_dev_if, which was later overwritten with addr->l2tp_scope_id. Lookup is now performed with the final sk->sk_bound_dev_if value.
Finally, the (addr_len >= sizeof(struct sockaddr_in6)) check has been dropped: addr is a sockaddr_l2tpip6 not sockaddr_in6 and addr_len has already been checked at this point (this part of the code seems to have been copy-pasted from net/ipv6/raw.c).
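The core idea, doing the lookup and the insertion inside one critical section, can be sketched in userspace as below. This is an illustration under stated assumptions, not the l2tp code: the table, bind_conn() and the atomic_flag spinlock standing in for write_lock_bh() are all hypothetical, and the table capacity is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

#define TABLE_SIZE 8	/* hypothetical capacity */

static atomic_flag bind_lock = ATOMIC_FLAG_INIT;
static unsigned int bind_table[TABLE_SIZE];	/* bound connection IDs */
static int bind_count;

/* Minimal spinlock used as a stand-in for the kernel's rwlock. */
static void table_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&bind_lock,
						 memory_order_acquire))
		;
}

static void table_unlock(void)
{
	atomic_flag_clear_explicit(&bind_lock, memory_order_release);
}

/* Caller must hold bind_lock. */
static int lookup_locked(unsigned int conn_id)
{
	int i;

	for (i = 0; i < bind_count; i++)
		if (bind_table[i] == conn_id)
			return 1;
	return 0;
}

/* Check and insert under the same lock hold, so no other binder can
 * claim the ID between the lookup and the insertion. */
static int bind_conn(unsigned int conn_id)
{
	int ret = 0;

	table_lock();
	if (lookup_locked(conn_id))
		ret = -EADDRINUSE;
	else if (bind_count == TABLE_SIZE)
		ret = -ENOMEM;
	else
		bind_table[bind_count++] = conn_id;
	table_unlock();
	return ret;
}
```

Had the lookup been done under one lock hold and the insertion under a later one, two callers could both pass the check for the same ID before either inserted it, which is the race the commit message describes.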
Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_ip.c | 27 ++++++++++++--------------- net/l2tp/l2tp_ip6.c | 43 ++++++++++++++++++++----------------------- 2 files changed, 32 insertions(+), 38 deletions(-)
--- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -269,15 +269,9 @@ static int l2tp_ip_bind(struct sock *sk, if (addr->l2tp_family != AF_INET) return -EINVAL;
- ret = -EADDRINUSE; - read_lock_bh(&l2tp_ip_lock); - if (__l2tp_ip_bind_lookup(net, addr->l2tp_addr.s_addr, - sk->sk_bound_dev_if, addr->l2tp_conn_id)) - goto out_in_use; - - read_unlock_bh(&l2tp_ip_lock); - lock_sock(sk); + + ret = -EINVAL; if (!sock_flag(sk, SOCK_ZAPPED)) goto out;
@@ -294,14 +288,22 @@ static int l2tp_ip_bind(struct sock *sk, inet->inet_rcv_saddr = inet->inet_saddr = addr->l2tp_addr.s_addr; if (chk_addr_ret == RTN_MULTICAST || chk_addr_ret == RTN_BROADCAST) inet->inet_saddr = 0; /* Use device */ - sk_dst_reset(sk);
+ write_lock_bh(&l2tp_ip_lock); + if (__l2tp_ip_bind_lookup(net, addr->l2tp_addr.s_addr, + sk->sk_bound_dev_if, addr->l2tp_conn_id)) { + write_unlock_bh(&l2tp_ip_lock); + ret = -EADDRINUSE; + goto out; + } + + sk_dst_reset(sk); l2tp_ip_sk(sk)->conn_id = addr->l2tp_conn_id;
- write_lock_bh(&l2tp_ip_lock); sk_add_bind_node(sk, &l2tp_ip_bind_table); sk_del_node_init(sk); write_unlock_bh(&l2tp_ip_lock); + ret = 0; sock_reset_flag(sk, SOCK_ZAPPED);
@@ -309,11 +311,6 @@ out: release_sock(sk);
return ret; - -out_in_use: - read_unlock_bh(&l2tp_ip_lock); - - return ret; }
static int l2tp_ip_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -278,6 +278,7 @@ static int l2tp_ip6_bind(struct sock *sk struct sockaddr_l2tpip6 *addr = (struct sockaddr_l2tpip6 *) uaddr; struct net *net = sock_net(sk); __be32 v4addr = 0; + int bound_dev_if; int addr_type; int err;
@@ -296,13 +297,6 @@ static int l2tp_ip6_bind(struct sock *sk if (addr_type & IPV6_ADDR_MULTICAST) return -EADDRNOTAVAIL;
- err = -EADDRINUSE; - read_lock_bh(&l2tp_ip6_lock); - if (__l2tp_ip6_bind_lookup(net, &addr->l2tp_addr, - sk->sk_bound_dev_if, addr->l2tp_conn_id)) - goto out_in_use; - read_unlock_bh(&l2tp_ip6_lock); - lock_sock(sk);
err = -EINVAL; @@ -312,28 +306,25 @@ static int l2tp_ip6_bind(struct sock *sk if (sk->sk_state != TCP_CLOSE) goto out_unlock;
+ bound_dev_if = sk->sk_bound_dev_if; + /* Check if the address belongs to the host. */ rcu_read_lock(); if (addr_type != IPV6_ADDR_ANY) { struct net_device *dev = NULL;
if (addr_type & IPV6_ADDR_LINKLOCAL) { - if (addr_len >= sizeof(struct sockaddr_in6) && - addr->l2tp_scope_id) { - /* Override any existing binding, if another - * one is supplied by user. - */ - sk->sk_bound_dev_if = addr->l2tp_scope_id; - } + if (addr->l2tp_scope_id) + bound_dev_if = addr->l2tp_scope_id;
/* Binding to link-local address requires an - interface */ - if (!sk->sk_bound_dev_if) + * interface. + */ + if (!bound_dev_if) goto out_unlock_rcu;
err = -ENODEV; - dev = dev_get_by_index_rcu(sock_net(sk), - sk->sk_bound_dev_if); + dev = dev_get_by_index_rcu(sock_net(sk), bound_dev_if); if (!dev) goto out_unlock_rcu; } @@ -348,13 +339,22 @@ static int l2tp_ip6_bind(struct sock *sk } rcu_read_unlock();
- inet->inet_rcv_saddr = inet->inet_saddr = v4addr; + write_lock_bh(&l2tp_ip6_lock); + if (__l2tp_ip6_bind_lookup(net, &addr->l2tp_addr, bound_dev_if, + addr->l2tp_conn_id)) { + write_unlock_bh(&l2tp_ip6_lock); + err = -EADDRINUSE; + goto out_unlock; + } + + inet->inet_saddr = v4addr; + inet->inet_rcv_saddr = v4addr; + sk->sk_bound_dev_if = bound_dev_if; sk->sk_v6_rcv_saddr = addr->l2tp_addr; np->saddr = addr->l2tp_addr;
l2tp_ip6_sk(sk)->conn_id = addr->l2tp_conn_id;
- write_lock_bh(&l2tp_ip6_lock); sk_add_bind_node(sk, &l2tp_ip6_bind_table); sk_del_node_init(sk); write_unlock_bh(&l2tp_ip6_lock); @@ -367,10 +367,7 @@ out_unlock_rcu: rcu_read_unlock(); out_unlock: release_sock(sk); - return err;
-out_in_use: - read_unlock_bh(&l2tp_ip6_lock); return err; }
From: Guillaume Nault g.nault@alphalink.fr
commit 5e6a9e5a3554a5b3db09cdc22253af1849c65dff upstream.
l2tp_session_find() doesn't take any reference on the returned session. Therefore, the session may disappear while sending the notification.
Use l2tp_session_get() instead and decrement session's refcount once the notification is sent.
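The get/use/put pattern described above can be sketched in user space as follows. This is only an illustration: struct demo_session and the demo_* helpers are hypothetical stand-ins, not the kernel's l2tp types, and a plain int replaces the kernel's atomic reference counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct l2tp_session; not the kernel type. */
struct demo_session {
	unsigned int id;
	int refcount;	/* the kernel uses an atomic counter; plain int here */
};

/* Analogue of l2tp_session_get(): the match is returned with a
 * reference already taken, so it cannot be freed under the caller.
 */
static struct demo_session *demo_session_get(struct demo_session *tbl,
					     size_t n, unsigned int id)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].id == id) {
			tbl[i].refcount++;
			return &tbl[i];
		}
	}
	return NULL;
}

/* Analogue of l2tp_session_dec_refcount(): returns 1 when the last
 * reference was dropped, i.e. when the session could now be freed.
 */
static int demo_session_put(struct demo_session *s)
{
	assert(s->refcount > 0);
	return --s->refcount == 0;
}

/* Notification path in the shape used by the patch: get, use, put. */
static int demo_notify(struct demo_session *tbl, size_t n, unsigned int id)
{
	struct demo_session *s = demo_session_get(tbl, n, id);

	if (!s)
		return -1;
	/* ... the notification would be sent here, session pinned ... */
	demo_session_put(s);
	return 0;
}
```

After demo_notify() returns, the reference count is back at its previous value, which mirrors the balanced l2tp_session_get() / l2tp_session_dec_refcount() pair in the diff below.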
Backporting Notes
This is a backport of a backport.
Fixes: 33f72e6f0c67 ("l2tp : multicast notification to the registered listeners") Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_netlink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/l2tp/l2tp_netlink.c +++ b/net/l2tp/l2tp_netlink.c @@ -626,10 +626,12 @@ static int l2tp_nl_cmd_session_create(st session_id, peer_session_id, &cfg);
if (ret >= 0) { - session = l2tp_session_find(net, tunnel, session_id); - if (session) + session = l2tp_session_get(net, tunnel, session_id, false); + if (session) { ret = l2tp_session_notify(&l2tp_nl_family, info, session, L2TP_CMD_SESSION_CREATE); + l2tp_session_dec_refcount(session); + } }
out:
From: Guillaume Nault g.nault@alphalink.fr
commit 2777e2ab5a9cf2b4524486c6db1517a6ded25261 upstream.
Callers of l2tp_nl_session_find() need to hold a reference on the returned session since there's no guarantee that it isn't going to disappear from under them.
Relying on the fact that no l2tp netlink message may be processed concurrently isn't enough: sessions can be deleted by other means (e.g. by closing the PPPOL2TP socket of a ppp pseudowire).
l2tp_nl_cmd_session_delete() is a bit special: it runs a callback function that may require a previous call to session->ref(). In particular, for ppp pseudowires, the callback is l2tp_session_delete(), which then calls pppol2tp_session_close() and dereferences the PPPOL2TP socket. The socket might already be gone at the moment l2tp_session_delete() calls session->ref(), so we need to take a reference during the session lookup. To do that, pass the do_ref variable down to l2tp_session_get() and l2tp_session_get_by_ifname().
Since all callers have to be updated, l2tp_session_find_by_ifname() and l2tp_nl_session_find() are renamed to reflect their new behaviour.
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Amit Pundir amit.pundir@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_core.c | 9 +++++++-- net/l2tp/l2tp_core.h | 3 ++- net/l2tp/l2tp_netlink.c | 39 ++++++++++++++++++++++++++------------- 3 files changed, 35 insertions(+), 16 deletions(-)
--- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -355,7 +355,8 @@ EXPORT_SYMBOL_GPL(l2tp_session_get_nth); /* Lookup a session by interface name. * This is very inefficient but is only used by management interfaces. */ -struct l2tp_session *l2tp_session_find_by_ifname(struct net *net, char *ifname) +struct l2tp_session *l2tp_session_get_by_ifname(struct net *net, char *ifname, + bool do_ref) { struct l2tp_net *pn = l2tp_pernet(net); int hash; @@ -365,7 +366,11 @@ struct l2tp_session *l2tp_session_find_b for (hash = 0; hash < L2TP_HASH_SIZE_2; hash++) { hlist_for_each_entry_rcu(session, &pn->l2tp_session_hlist[hash], global_hlist) { if (!strcmp(session->ifname, ifname)) { + l2tp_session_inc_refcount(session); + if (do_ref && session->ref) + session->ref(session); rcu_read_unlock_bh(); + return session; } } @@ -375,7 +380,7 @@ struct l2tp_session *l2tp_session_find_b
return NULL; } -EXPORT_SYMBOL_GPL(l2tp_session_find_by_ifname); +EXPORT_SYMBOL_GPL(l2tp_session_get_by_ifname);
static int l2tp_session_add_to_tunnel(struct l2tp_tunnel *tunnel, struct l2tp_session *session) --- a/net/l2tp/l2tp_core.h +++ b/net/l2tp/l2tp_core.h @@ -252,7 +252,8 @@ struct l2tp_session *l2tp_session_find(s u32 session_id); struct l2tp_session *l2tp_session_get_nth(struct l2tp_tunnel *tunnel, int nth, bool do_ref); -struct l2tp_session *l2tp_session_find_by_ifname(struct net *net, char *ifname); +struct l2tp_session *l2tp_session_get_by_ifname(struct net *net, char *ifname, + bool do_ref); struct l2tp_tunnel *l2tp_tunnel_find(struct net *net, u32 tunnel_id); struct l2tp_tunnel *l2tp_tunnel_find_nth(struct net *net, int nth);
--- a/net/l2tp/l2tp_netlink.c +++ b/net/l2tp/l2tp_netlink.c @@ -55,7 +55,8 @@ static int l2tp_nl_session_send(struct s /* Accessed under genl lock */ static const struct l2tp_nl_cmd_ops *l2tp_nl_cmd_ops[__L2TP_PWTYPE_MAX];
-static struct l2tp_session *l2tp_nl_session_find(struct genl_info *info) +static struct l2tp_session *l2tp_nl_session_get(struct genl_info *info, + bool do_ref) { u32 tunnel_id; u32 session_id; @@ -66,14 +67,15 @@ static struct l2tp_session *l2tp_nl_sess
if (info->attrs[L2TP_ATTR_IFNAME]) { ifname = nla_data(info->attrs[L2TP_ATTR_IFNAME]); - session = l2tp_session_find_by_ifname(net, ifname); + session = l2tp_session_get_by_ifname(net, ifname, do_ref); } else if ((info->attrs[L2TP_ATTR_SESSION_ID]) && (info->attrs[L2TP_ATTR_CONN_ID])) { tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]); session_id = nla_get_u32(info->attrs[L2TP_ATTR_SESSION_ID]); tunnel = l2tp_tunnel_find(net, tunnel_id); if (tunnel) - session = l2tp_session_find(net, tunnel, session_id); + session = l2tp_session_get(net, tunnel, session_id, + do_ref); }
return session; @@ -644,7 +646,7 @@ static int l2tp_nl_cmd_session_delete(st struct l2tp_session *session; u16 pw_type;
- session = l2tp_nl_session_find(info); + session = l2tp_nl_session_get(info, true); if (session == NULL) { ret = -ENODEV; goto out; @@ -658,6 +660,10 @@ static int l2tp_nl_cmd_session_delete(st if (l2tp_nl_cmd_ops[pw_type] && l2tp_nl_cmd_ops[pw_type]->session_delete) ret = (*l2tp_nl_cmd_ops[pw_type]->session_delete)(session);
+ if (session->deref) + session->deref(session); + l2tp_session_dec_refcount(session); + out: return ret; } @@ -667,7 +673,7 @@ static int l2tp_nl_cmd_session_modify(st int ret = 0; struct l2tp_session *session;
- session = l2tp_nl_session_find(info); + session = l2tp_nl_session_get(info, false); if (session == NULL) { ret = -ENODEV; goto out; @@ -702,6 +708,8 @@ static int l2tp_nl_cmd_session_modify(st ret = l2tp_session_notify(&l2tp_nl_family, info, session, L2TP_CMD_SESSION_MODIFY);
+ l2tp_session_dec_refcount(session); + out: return ret; } @@ -788,29 +796,34 @@ static int l2tp_nl_cmd_session_get(struc struct sk_buff *msg; int ret;
- session = l2tp_nl_session_find(info); + session = l2tp_nl_session_get(info, false); if (session == NULL) { ret = -ENODEV; - goto out; + goto err; }
msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); if (!msg) { ret = -ENOMEM; - goto out; + goto err_ref; }
ret = l2tp_nl_session_send(msg, info->snd_portid, info->snd_seq, 0, session, L2TP_CMD_SESSION_GET); if (ret < 0) - goto err_out; + goto err_ref_msg;
- return genlmsg_unicast(genl_info_net(info), msg, info->snd_portid); + ret = genlmsg_unicast(genl_info_net(info), msg, info->snd_portid);
-err_out: - nlmsg_free(msg); + l2tp_session_dec_refcount(session);
-out: + return ret; + +err_ref_msg: + nlmsg_free(msg); +err_ref: + l2tp_session_dec_refcount(session); +err: return ret; }
From: Guillaume Nault g.nault@alphalink.fr
commit 8f7dc9ae4a7aece9fbc3e6637bdfa38b36bcdf09 upstream.
Using l2tp_tunnel_find() in l2tp_ip_recv() is wrong for two reasons:
* It doesn't take a reference on the returned tunnel, which makes the call racy wrt. concurrent tunnel deletion.
* The lookup is only based on the tunnel identifier, so it can return a tunnel that doesn't match the packet's addresses or protocol.
For example, a packet sent to an L2TPv3 over IPv6 tunnel can be delivered to an L2TPv2 over UDPv4 tunnel. This is worse than a simple cross-talk: when delivering the packet to an L2TP over UDP tunnel, the corresponding socket is UDP, where ->sk_backlog_rcv() is NULL. Calling sk_receive_skb() will then crash the kernel by trying to execute this callback.
And l2tp_tunnel_find() isn't even needed here. __l2tp_ip_bind_lookup() properly checks the socket binding and connection settings. It was used as a fallback mechanism for finding tunnels that didn't have their data path registered yet. But it's not limited to this case and can be used to replace l2tp_tunnel_find() in the general case.
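To make the difference between the two lookups concrete, here is a simplified user-space sketch. It is an assumption-laden model: struct demo_binding and both demo_* functions are hypothetical, and the real __l2tp_ip_bind_lookup() also considers the network namespace and the bound device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record; not the kernel's struct sock. */
struct demo_binding {
	uint32_t local_addr;	/* address the tunnel socket is bound to */
	uint32_t conn_id;	/* tunnel (connection) identifier */
	int is_l2tp_ip;		/* 1: L2TP/IP socket, 0: e.g. a UDP socket */
};

/* Identifier-only lookup: may hand back a tunnel whose transport does
 * not match the packet at all.
 */
static const struct demo_binding *
demo_find_by_id(const struct demo_binding *tbl, size_t n, uint32_t conn_id)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tbl[i].conn_id == conn_id)
			return &tbl[i];
	return NULL;
}

/* Binding lookup: only an L2TP/IP socket bound to the packet's
 * destination address with the same connection id matches.
 */
static const struct demo_binding *
demo_bind_lookup(const struct demo_binding *tbl, size_t n,
		 uint32_t daddr, uint32_t conn_id)
{
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tbl[i].is_l2tp_ip && tbl[i].local_addr == daddr &&
		    tbl[i].conn_id == conn_id)
			return &tbl[i];
	return NULL;
}
```

With two entries sharing the same conn_id, the identifier-only lookup returns whichever comes first, while the binding lookup either finds the matching L2TP/IP socket or returns NULL so the packet is discarded.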
Fix l2tp_ip6 in the same way.
Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support") Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6") Signed-off-by: Guillaume Nault g.nault@alphalink.fr Signed-off-by: David S. Miller davem@davemloft.net Cc: Nicolas Schier n.schier@avm.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_ip.c | 22 ++++++++-------------- net/l2tp/l2tp_ip6.c | 23 ++++++++--------------- 2 files changed, 16 insertions(+), 29 deletions(-)
--- a/net/l2tp/l2tp_ip.c +++ b/net/l2tp/l2tp_ip.c @@ -122,6 +122,7 @@ static int l2tp_ip_recv(struct sk_buff * unsigned char *ptr, *optr; struct l2tp_session *session; struct l2tp_tunnel *tunnel = NULL; + struct iphdr *iph; int length;
if (!pskb_may_pull(skb, 4)) @@ -180,23 +181,16 @@ pass_up: goto discard;
tunnel_id = ntohl(*(__be32 *) &skb->data[4]); - tunnel = l2tp_tunnel_find(net, tunnel_id); - if (tunnel) { - sk = tunnel->sock; - sock_hold(sk); - } else { - struct iphdr *iph = (struct iphdr *) skb_network_header(skb); - - read_lock_bh(&l2tp_ip_lock); - sk = __l2tp_ip_bind_lookup(net, iph->daddr, 0, tunnel_id); - if (!sk) { - read_unlock_bh(&l2tp_ip_lock); - goto discard; - } + iph = (struct iphdr *)skb_network_header(skb);
- sock_hold(sk); + read_lock_bh(&l2tp_ip_lock); + sk = __l2tp_ip_bind_lookup(net, iph->daddr, 0, tunnel_id); + if (!sk) { read_unlock_bh(&l2tp_ip_lock); + goto discard; } + sock_hold(sk); + read_unlock_bh(&l2tp_ip_lock);
if (!xfrm4_policy_check(sk, XFRM_POLICY_IN, skb)) goto discard_put; --- a/net/l2tp/l2tp_ip6.c +++ b/net/l2tp/l2tp_ip6.c @@ -134,6 +134,7 @@ static int l2tp_ip6_recv(struct sk_buff unsigned char *ptr, *optr; struct l2tp_session *session; struct l2tp_tunnel *tunnel = NULL; + struct ipv6hdr *iph; int length;
if (!pskb_may_pull(skb, 4)) @@ -193,24 +194,16 @@ pass_up: goto discard;
tunnel_id = ntohl(*(__be32 *) &skb->data[4]); - tunnel = l2tp_tunnel_find(net, tunnel_id); - if (tunnel) { - sk = tunnel->sock; - sock_hold(sk); - } else { - struct ipv6hdr *iph = ipv6_hdr(skb); - - read_lock_bh(&l2tp_ip6_lock); - sk = __l2tp_ip6_bind_lookup(net, &iph->daddr, - 0, tunnel_id); - if (!sk) { - read_unlock_bh(&l2tp_ip6_lock); - goto discard; - } + iph = ipv6_hdr(skb);
- sock_hold(sk); + read_lock_bh(&l2tp_ip6_lock); + sk = __l2tp_ip6_bind_lookup(net, &iph->daddr, 0, tunnel_id); + if (!sk) { read_unlock_bh(&l2tp_ip6_lock); + goto discard; } + sock_hold(sk); + read_unlock_bh(&l2tp_ip6_lock);
if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) goto discard_put;
From: Asbjørn Sloth Tønnesen asbjorn@asbjorn.st
commit 41c43fbee68f4f9a2a9675d83bca91c77862d7f0 upstream.
Move the L2TP_MSG_* definitions to UAPI, as it is part of the netlink API.
Signed-off-by: Asbjoern Sloth Toennesen asbjorn@asbjorn.st Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/l2tp.h | 17 ++++++++++++++++- net/l2tp/l2tp_core.h | 10 ---------- 2 files changed, 16 insertions(+), 11 deletions(-)
--- a/include/uapi/linux/l2tp.h +++ b/include/uapi/linux/l2tp.h @@ -108,7 +108,7 @@ enum { L2TP_ATTR_VLAN_ID, /* u16 */ L2TP_ATTR_COOKIE, /* 0, 4 or 8 bytes */ L2TP_ATTR_PEER_COOKIE, /* 0, 4 or 8 bytes */ - L2TP_ATTR_DEBUG, /* u32 */ + L2TP_ATTR_DEBUG, /* u32, enum l2tp_debug_flags */ L2TP_ATTR_RECV_SEQ, /* u8 */ L2TP_ATTR_SEND_SEQ, /* u8 */ L2TP_ATTR_LNS_MODE, /* u8 */ @@ -173,6 +173,21 @@ enum l2tp_seqmode { L2TP_SEQ_ALL = 2, };
+/** + * enum l2tp_debug_flags - debug message categories for L2TP tunnels/sessions + * + * @L2TP_MSG_DEBUG: verbose debug (if compiled in) + * @L2TP_MSG_CONTROL: userspace - kernel interface + * @L2TP_MSG_SEQ: sequence numbers + * @L2TP_MSG_DATA: data packets + */ +enum l2tp_debug_flags { + L2TP_MSG_DEBUG = (1 << 0), + L2TP_MSG_CONTROL = (1 << 1), + L2TP_MSG_SEQ = (1 << 2), + L2TP_MSG_DATA = (1 << 3), +}; + /* * NETLINK_GENERIC related info */ --- a/net/l2tp/l2tp_core.h +++ b/net/l2tp/l2tp_core.h @@ -23,16 +23,6 @@ #define L2TP_HASH_BITS_2 8 #define L2TP_HASH_SIZE_2 (1 << L2TP_HASH_BITS_2)
-/* Debug message categories for the DEBUG socket option */ -enum { - L2TP_MSG_DEBUG = (1 << 0), /* verbose debug (if - * compiled in) */ - L2TP_MSG_CONTROL = (1 << 1), /* userspace - kernel - * interface */ - L2TP_MSG_SEQ = (1 << 2), /* sequence numbers */ - L2TP_MSG_DATA = (1 << 3), /* data packets */ -}; - struct sk_buff;
struct l2tp_stats {
From: Asbjørn Sloth Tønnesen asbjorn@asbjorn.st
commit 47c3e7783be4e142b861d34b5c2e223330b05d8a upstream.
PPPOL2TP_MSG_* and L2TP_MSG_* are duplicates, and are being used interchangeably in the kernel, so let's standardize on L2TP_MSG_* internally, and keep PPPOL2TP_MSG_* defined in UAPI for compatibility.
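The aliasing can be illustrated with a small standalone sketch. The two enums below are local copies of the values shown in this series so that the example builds without kernel headers (they would clash if linux/l2tp.h were also included), and demo_debug_enabled() is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* Local copy of the values from the patch, for illustration only. */
enum l2tp_debug_flags {
	L2TP_MSG_DEBUG   = (1 << 0),	/* verbose debug */
	L2TP_MSG_CONTROL = (1 << 1),	/* userspace - kernel interface */
	L2TP_MSG_SEQ     = (1 << 2),	/* sequence numbers */
	L2TP_MSG_DATA    = (1 << 3),	/* data packets */
};

/* Compatibility names defined in terms of the L2TP_MSG_* values. */
enum {
	PPPOL2TP_MSG_DEBUG   = L2TP_MSG_DEBUG,
	PPPOL2TP_MSG_CONTROL = L2TP_MSG_CONTROL,
	PPPOL2TP_MSG_SEQ     = L2TP_MSG_SEQ,
	PPPOL2TP_MSG_DATA    = L2TP_MSG_DATA,
};

/* Both spellings must stay bit-for-bit identical. */
static_assert(PPPOL2TP_MSG_DATA == L2TP_MSG_DATA, "alias must match");

/* Hypothetical helper: is this message category enabled in the mask? */
static int demo_debug_enabled(unsigned int mask, unsigned int category)
{
	return (mask & category) != 0;
}
```

A mask built from either set of names therefore selects the same categories, which is why existing userspace using PPPOL2TP_MSG_* keeps working.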
Signed-off-by: Asbjoern Sloth Toennesen asbjorn@asbjorn.st Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/networking/l2tp.txt | 8 ++++---- include/uapi/linux/if_pppol2tp.h | 13 ++++++------- 2 files changed, 10 insertions(+), 11 deletions(-)
--- a/Documentation/networking/l2tp.txt +++ b/Documentation/networking/l2tp.txt @@ -177,10 +177,10 @@ setsockopt on the PPPoX socket to set a
The following debug mask bits are available:
-PPPOL2TP_MSG_DEBUG verbose debug (if compiled in) -PPPOL2TP_MSG_CONTROL userspace - kernel interface -PPPOL2TP_MSG_SEQ sequence numbers handling -PPPOL2TP_MSG_DATA data packets +L2TP_MSG_DEBUG verbose debug (if compiled in) +L2TP_MSG_CONTROL userspace - kernel interface +L2TP_MSG_SEQ sequence numbers handling +L2TP_MSG_DATA data packets
If enabled, files under a l2tp debugfs directory can be used to dump kernel state about L2TP tunnels and sessions. To access it, the --- a/include/uapi/linux/if_pppol2tp.h +++ b/include/uapi/linux/if_pppol2tp.h @@ -17,6 +17,7 @@
#include <linux/types.h>
+#include <linux/l2tp.h>
/* Structure used to connect() the socket to a particular tunnel UDP * socket over IPv4. @@ -89,14 +90,12 @@ enum { PPPOL2TP_SO_REORDERTO = 5, };
-/* Debug message categories for the DEBUG socket option */ +/* Debug message categories for the DEBUG socket option (deprecated) */ enum { - PPPOL2TP_MSG_DEBUG = (1 << 0), /* verbose debug (if - * compiled in) */ - PPPOL2TP_MSG_CONTROL = (1 << 1), /* userspace - kernel - * interface */ - PPPOL2TP_MSG_SEQ = (1 << 2), /* sequence numbers */ - PPPOL2TP_MSG_DATA = (1 << 3), /* data packets */ + PPPOL2TP_MSG_DEBUG = L2TP_MSG_DEBUG, + PPPOL2TP_MSG_CONTROL = L2TP_MSG_CONTROL, + PPPOL2TP_MSG_SEQ = L2TP_MSG_SEQ, + PPPOL2TP_MSG_DATA = L2TP_MSG_DATA, };
From: Asbjørn Sloth Tønnesen asbjorn@asbjorn.st
commit fba40c632c6473fa89660e870a6042c0fe733f8c upstream.
Signed-off-by: Asbjoern Sloth Toennesen asbjorn@asbjorn.st Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_ppp.c | 54 ++++++++++++++++++++++++++-------------------------- 1 file changed, 27 insertions(+), 27 deletions(-)
--- a/net/l2tp/l2tp_ppp.c +++ b/net/l2tp/l2tp_ppp.c @@ -230,7 +230,7 @@ static void pppol2tp_recv(struct l2tp_se
if (sk->sk_state & PPPOX_BOUND) { struct pppox_sock *po; - l2tp_dbg(session, PPPOL2TP_MSG_DATA, + l2tp_dbg(session, L2TP_MSG_DATA, "%s: recv %d byte data frame, passing to ppp\n", session->name, data_len);
@@ -253,7 +253,7 @@ static void pppol2tp_recv(struct l2tp_se po = pppox_sk(sk); ppp_input(&po->chan, skb); } else { - l2tp_dbg(session, PPPOL2TP_MSG_DATA, + l2tp_dbg(session, L2TP_MSG_DATA, "%s: recv %d byte data frame, passing to L2TP socket\n", session->name, data_len);
@@ -266,7 +266,7 @@ static void pppol2tp_recv(struct l2tp_se return;
no_sock: - l2tp_info(session, PPPOL2TP_MSG_DATA, "%s: no socket\n", session->name); + l2tp_info(session, L2TP_MSG_DATA, "%s: no socket\n", session->name); kfree_skb(skb); }
@@ -797,7 +797,7 @@ out_no_ppp: /* This is how we get the session context from the socket. */ sk->sk_user_data = session; sk->sk_state = PPPOX_CONNECTED; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: created\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: created\n", session->name);
end: @@ -848,7 +848,7 @@ static int pppol2tp_session_create(struc ps = l2tp_session_priv(session); ps->tunnel_sock = tunnel->sock;
- l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: created\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: created\n", session->name);
error = 0; @@ -1010,7 +1010,7 @@ static int pppol2tp_session_ioctl(struct struct l2tp_tunnel *tunnel = session->tunnel; struct pppol2tp_ioc_stats stats;
- l2tp_dbg(session, PPPOL2TP_MSG_CONTROL, + l2tp_dbg(session, L2TP_MSG_CONTROL, "%s: pppol2tp_session_ioctl(cmd=%#x, arg=%#lx)\n", session->name, cmd, arg);
@@ -1033,7 +1033,7 @@ static int pppol2tp_session_ioctl(struct if (copy_to_user((void __user *) arg, &ifr, sizeof(struct ifreq))) break;
- l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: get mtu=%d\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get mtu=%d\n", session->name, session->mtu); err = 0; break; @@ -1049,7 +1049,7 @@ static int pppol2tp_session_ioctl(struct
session->mtu = ifr.ifr_mtu;
- l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: set mtu=%d\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set mtu=%d\n", session->name, session->mtu); err = 0; break; @@ -1063,7 +1063,7 @@ static int pppol2tp_session_ioctl(struct if (put_user(session->mru, (int __user *) arg)) break;
- l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: get mru=%d\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get mru=%d\n", session->name, session->mru); err = 0; break; @@ -1078,7 +1078,7 @@ static int pppol2tp_session_ioctl(struct break;
session->mru = val; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: set mru=%d\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set mru=%d\n", session->name, session->mru); err = 0; break; @@ -1088,7 +1088,7 @@ static int pppol2tp_session_ioctl(struct if (put_user(ps->flags, (int __user *) arg)) break;
- l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: get flags=%d\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get flags=%d\n", session->name, ps->flags); err = 0; break; @@ -1098,7 +1098,7 @@ static int pppol2tp_session_ioctl(struct if (get_user(val, (int __user *) arg)) break; ps->flags = val; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: set flags=%d\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set flags=%d\n", session->name, ps->flags); err = 0; break; @@ -1115,7 +1115,7 @@ static int pppol2tp_session_ioctl(struct if (copy_to_user((void __user *) arg, &stats, sizeof(stats))) break; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: get L2TP stats\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get L2TP stats\n", session->name); err = 0; break; @@ -1143,7 +1143,7 @@ static int pppol2tp_tunnel_ioctl(struct struct sock *sk; struct pppol2tp_ioc_stats stats;
- l2tp_dbg(tunnel, PPPOL2TP_MSG_CONTROL, + l2tp_dbg(tunnel, L2TP_MSG_CONTROL, "%s: pppol2tp_tunnel_ioctl(cmd=%#x, arg=%#lx)\n", tunnel->name, cmd, arg);
@@ -1186,7 +1186,7 @@ static int pppol2tp_tunnel_ioctl(struct err = -EFAULT; break; } - l2tp_info(tunnel, PPPOL2TP_MSG_CONTROL, "%s: get L2TP stats\n", + l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: get L2TP stats\n", tunnel->name); err = 0; break; @@ -1276,7 +1276,7 @@ static int pppol2tp_tunnel_setsockopt(st switch (optname) { case PPPOL2TP_SO_DEBUG: tunnel->debug = val; - l2tp_info(tunnel, PPPOL2TP_MSG_CONTROL, "%s: set debug=%x\n", + l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: set debug=%x\n", tunnel->name, tunnel->debug); break;
@@ -1304,7 +1304,7 @@ static int pppol2tp_session_setsockopt(s break; } session->recv_seq = val ? -1 : 0; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set recv_seq=%d\n", session->name, session->recv_seq); break; @@ -1322,7 +1322,7 @@ static int pppol2tp_session_setsockopt(s PPPOL2TP_L2TP_HDR_SIZE_NOSEQ; } l2tp_session_set_header_len(session, session->tunnel->version); - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set send_seq=%d\n", session->name, session->send_seq); break; @@ -1333,20 +1333,20 @@ static int pppol2tp_session_setsockopt(s break; } session->lns_mode = val ? -1 : 0; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set lns_mode=%d\n", session->name, session->lns_mode); break;
case PPPOL2TP_SO_DEBUG: session->debug = val; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: set debug=%x\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set debug=%x\n", session->name, session->debug); break;
case PPPOL2TP_SO_REORDERTO: session->reorder_timeout = msecs_to_jiffies(val); - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: set reorder_timeout=%d\n", session->name, session->reorder_timeout); break; @@ -1427,7 +1427,7 @@ static int pppol2tp_tunnel_getsockopt(st switch (optname) { case PPPOL2TP_SO_DEBUG: *val = tunnel->debug; - l2tp_info(tunnel, PPPOL2TP_MSG_CONTROL, "%s: get debug=%x\n", + l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: get debug=%x\n", tunnel->name, tunnel->debug); break;
@@ -1450,31 +1450,31 @@ static int pppol2tp_session_getsockopt(s switch (optname) { case PPPOL2TP_SO_RECVSEQ: *val = session->recv_seq; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get recv_seq=%d\n", session->name, *val); break;
case PPPOL2TP_SO_SENDSEQ: *val = session->send_seq; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get send_seq=%d\n", session->name, *val); break;
case PPPOL2TP_SO_LNSMODE: *val = session->lns_mode; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get lns_mode=%d\n", session->name, *val); break;
case PPPOL2TP_SO_DEBUG: *val = session->debug; - l2tp_info(session, PPPOL2TP_MSG_CONTROL, "%s: get debug=%d\n", + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get debug=%d\n", session->name, *val); break;
case PPPOL2TP_SO_REORDERTO: *val = (int) jiffies_to_msecs(session->reorder_timeout); - l2tp_info(session, PPPOL2TP_MSG_CONTROL, + l2tp_info(session, L2TP_MSG_CONTROL, "%s: get reorder_timeout=%d\n", session->name, *val); break;
From: "R. Parameswaran" parameswaran.r7@gmail.com
commit 113c3075931a334f899008f6c753abe70a3a9323 upstream.
A new function, kernel_sock_ip_overhead(), is provided to calculate the cumulative overhead imposed by the IP Header and IP options, if any, on a socket's payload. The new function returns an overhead of zero for sockets that do not belong to the IPv4 or IPv6 address families. This is used in the L2TP code path to compute the total outer IP overhead on the L2TP tunnel socket when calculating the default MTU for Ethernet pseudowires.
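The calculation can be modelled in user space roughly as below. This is a sketch under assumptions: demo_ip_overhead() is a hypothetical helper that takes the address family and option length as plain arguments instead of reading them from a struct sock, and the fixed header sizes (20 bytes for IPv4, 40 bytes for IPv6) stand in for sizeof(struct iphdr) and sizeof(struct ipv6hdr).

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

#define DEMO_IPV4_HDR_LEN 20u	/* fixed IPv4 header, no options */
#define DEMO_IPV6_HDR_LEN 40u	/* fixed IPv6 header */

/* Hypothetical model: header length plus option length for IPv4 and
 * IPv6, zero for any other address family.
 */
static uint32_t demo_ip_overhead(int family, uint32_t opt_len)
{
	switch (family) {
	case AF_INET:
		/* IPv4 options cannot exceed 40 bytes (60-byte header max). */
		assert(opt_len <= 40);
		return DEMO_IPV4_HDR_LEN + opt_len;
	case AF_INET6:
		return DEMO_IPV6_HDR_LEN + opt_len;
	default:
		return 0;
	}
}
```

A caller can treat a return value of zero as "overhead unknown", which is how the L2TP Ethernet code later in this series uses the real function.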
Signed-off-by: R. Parameswaran rparames@brocade.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/net.h | 3 +++ net/socket.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+)
--- a/include/linux/net.h +++ b/include/linux/net.h @@ -291,6 +291,9 @@ int kernel_sendpage(struct socket *sock, int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg); int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);
+/* Following routine returns the IP overhead imposed by a socket. */ +u32 kernel_sock_ip_overhead(struct sock *sk); + #define MODULE_ALIAS_NETPROTO(proto) \ MODULE_ALIAS("net-pf-" __stringify(proto))
--- a/net/socket.c +++ b/net/socket.c @@ -3304,3 +3304,49 @@ int kernel_sock_shutdown(struct socket * return sock->ops->shutdown(sock, how); } EXPORT_SYMBOL(kernel_sock_shutdown); + +/* This routine returns the IP overhead imposed by a socket i.e. + * the length of the underlying IP header, depending on whether + * this is an IPv4 or IPv6 socket and the length from IP options turned + * on at the socket. + */ +u32 kernel_sock_ip_overhead(struct sock *sk) +{ + struct inet_sock *inet; + struct ip_options_rcu *opt; + u32 overhead = 0; + bool owned_by_user; +#if IS_ENABLED(CONFIG_IPV6) + struct ipv6_pinfo *np; + struct ipv6_txoptions *optv6 = NULL; +#endif /* IS_ENABLED(CONFIG_IPV6) */ + + if (!sk) + return overhead; + + owned_by_user = sock_owned_by_user(sk); + switch (sk->sk_family) { + case AF_INET: + inet = inet_sk(sk); + overhead += sizeof(struct iphdr); + opt = rcu_dereference_protected(inet->inet_opt, + owned_by_user); + if (opt) + overhead += opt->opt.optlen; + return overhead; +#if IS_ENABLED(CONFIG_IPV6) + case AF_INET6: + np = inet6_sk(sk); + overhead += sizeof(struct ipv6hdr); + if (np) + optv6 = rcu_dereference_protected(np->opt, + owned_by_user); + if (optv6) + overhead += (optv6->opt_flen + optv6->opt_nflen); + return overhead; +#endif /* IS_ENABLED(CONFIG_IPV6) */ + default: /* Returns 0 overhead if the socket is not ipv4 or ipv6 */ + return overhead; + } +} +EXPORT_SYMBOL(kernel_sock_ip_overhead);
From: "R. Parameswaran" parameswaran.r7@gmail.com
commit b784e7ebfce8cfb16c6f95e14e8532d0768ab7ff upstream.
Existing L2TP kernel code does not derive the optimal MTU for Ethernet pseudowires and instead leaves this to a userspace L2TP daemon or operator. If an MTU is not specified, the existing kernel code chooses an MTU that does not take account of all tunnel header overheads, which can lead to unwanted IP fragmentation. When L2TP is used without a control plane (userspace daemon), we would prefer that the kernel does a better job of choosing a default pseudowire MTU, taking account of all tunnel header overheads, including IP header options, if any. This patch addresses this.
This change set uses the new kernel function kernel_sock_ip_overhead() to factor in the outer IP overhead on the L2TP tunnel socket (including IP options, if any) when calculating the default MTU for an Ethernet pseudowire, along with the inner Ethernet header.
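As a worked example of the arithmetic, the following user-space sketch mirrors the order of checks in l2tp_eth_adjust_mtu() from the diff below. demo_eth_pw_mtu() and its parameters are hypothetical, and the L2TP header length passed in is purely illustrative since it depends on the session configuration.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_UDP_HDR_LEN 8u	/* sizeof(struct udphdr) */
#define DEMO_ETH_HLEN    14u	/* ETH_HLEN, inner Ethernet header */

/* Hypothetical model of the default-MTU choice:
 *  - an explicitly configured MTU always wins;
 *  - if the IP overhead is unknown (0), the MTU is left untouched;
 *  - otherwise all header overheads are subtracted from the path MTU.
 */
static uint32_t demo_eth_pw_mtu(uint32_t path_mtu, uint32_t configured_mtu,
				int udp_encap, uint32_t l2tp_hdr_len,
				uint32_t l3_overhead)
{
	uint32_t overhead;

	if (configured_mtu != 0)
		return configured_mtu;
	if (l3_overhead == 0)
		return path_mtu;

	overhead = l2tp_hdr_len + DEMO_ETH_HLEN + l3_overhead;
	if (udp_encap)
		overhead += DEMO_UDP_HDR_LEN;

	assert(overhead < path_mtu);
	return path_mtu - overhead;
}
```

For a 1500-byte path, UDP encapsulation, an assumed 8-byte L2TP header and a 20-byte IPv4 header with no options, the overhead is 8 + 14 + 20 + 8 = 50 bytes, giving a pseudowire MTU of 1450.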
Signed-off-by: R. Parameswaran rparames@brocade.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Giuliano Procida gprocida@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/l2tp/l2tp_eth.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 51 insertions(+), 4 deletions(-)
--- a/net/l2tp/l2tp_eth.c +++ b/net/l2tp/l2tp_eth.c @@ -30,6 +30,9 @@ #include <net/xfrm.h> #include <net/net_namespace.h> #include <net/netns/generic.h> +#include <linux/ip.h> +#include <linux/ipv6.h> +#include <linux/udp.h>
#include "l2tp_core.h"
@@ -206,6 +209,53 @@ static void l2tp_eth_show(struct seq_fil } #endif
+static void l2tp_eth_adjust_mtu(struct l2tp_tunnel *tunnel, + struct l2tp_session *session, + struct net_device *dev) +{ + unsigned int overhead = 0; + struct dst_entry *dst; + u32 l3_overhead = 0; + + /* if the encap is UDP, account for UDP header size */ + if (tunnel->encap == L2TP_ENCAPTYPE_UDP) { + overhead += sizeof(struct udphdr); + dev->needed_headroom += sizeof(struct udphdr); + } + if (session->mtu != 0) { + dev->mtu = session->mtu; + dev->needed_headroom += session->hdr_len; + return; + } + l3_overhead = kernel_sock_ip_overhead(tunnel->sock); + if (l3_overhead == 0) { + /* L3 Overhead couldn't be identified, this could be + * because tunnel->sock was NULL or the socket's + * address family was not IPv4 or IPv6, + * dev mtu stays at 1500. + */ + return; + } + /* Adjust MTU, factor overhead - underlay L3, overlay L2 hdr + * UDP overhead, if any, was already factored in above. + */ + overhead += session->hdr_len + ETH_HLEN + l3_overhead; + + /* If PMTU discovery was enabled, use discovered MTU on L2TP device */ + dst = sk_dst_get(tunnel->sock); + if (dst) { + /* dst_mtu will use PMTU if found, else fallback to intf MTU */ + u32 pmtu = dst_mtu(dst); + + if (pmtu != 0) + dev->mtu = pmtu; + dst_release(dst); + } + session->mtu = dev->mtu - overhead; + dev->mtu = session->mtu; + dev->needed_headroom += session->hdr_len; +} + static int l2tp_eth_create(struct net *net, u32 tunnel_id, u32 session_id, u32 peer_session_id, struct l2tp_session_cfg *cfg) { struct net_device *dev; @@ -249,10 +299,7 @@ static int l2tp_eth_create(struct net *n }
dev_net_set(dev, net); - if (session->mtu == 0) - session->mtu = dev->mtu - session->hdr_len; - dev->mtu = session->mtu; - dev->needed_headroom += session->hdr_len; + l2tp_eth_adjust_mtu(tunnel, session, dev);
priv = netdev_priv(dev); priv->dev = dev;
From: Guillaume Nault g.nault@alphalink.fr
commit af87ae465abdc070de0dc35d6c6a9e7a8cd82987 upstream.
There's no point in checking for duplicate sessions at the beginning of l2tp_nl_cmd_session_create(); the ->session_create() callbacks already return -EEXIST when the session already exists.
Furthermore, even if l2tp_session_find() returns NULL, a new session might be created right after the test. So relying on ->session_create() to avoid duplicate sessions is the only sane behaviour.
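The reasoning above can be sketched in user space: a duplicate check is only reliable when it happens under the same lock as the insertion. Everything here is hypothetical (the demo_* names, the fixed-size table, the pthread mutex standing in for the kernel's locking); it is not how l2tp stores sessions.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_MAX_SESSIONS 8

static unsigned int demo_ids[DEMO_MAX_SESSIONS];
static size_t demo_count;
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical create routine: the duplicate check and the insertion
 * form one critical section, so no other creator can slip in between.
 * A separate "find, then create" done by the caller would leave
 * exactly that window open.
 */
static int demo_session_create(unsigned int id)
{
	int ret = 0;
	size_t i;

	pthread_mutex_lock(&demo_lock);
	assert(demo_count <= DEMO_MAX_SESSIONS);

	for (i = 0; i < demo_count; i++) {
		if (demo_ids[i] == id) {
			ret = -EEXIST;
			goto out;
		}
	}
	if (demo_count == DEMO_MAX_SESSIONS) {
		ret = -ENOSPC;
		goto out;
	}
	demo_ids[demo_count++] = id;
out:
	pthread_mutex_unlock(&demo_lock);
	return ret;
}
```

Callers simply propagate the -EEXIST result, which is the behaviour the netlink handler relies on once the early l2tp_session_find() test is removed.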
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_netlink.c |    5 -----
 1 file changed, 5 deletions(-)

--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -505,11 +505,6 @@ static int l2tp_nl_cmd_session_create(st
 		goto out;
 	}
 	session_id = nla_get_u32(info->attrs[L2TP_ATTR_SESSION_ID]);
-	session = l2tp_session_find(net, tunnel, session_id);
-	if (session) {
-		ret = -EEXIST;
-		goto out;
-	}

 	if (!info->attrs[L2TP_ATTR_PEER_SESSION_ID]) {
 		ret = -EINVAL;
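The reasoning in this commit message (a separate "find" followed by "create" is racy, so the duplicate check belongs inside the insertion itself) can be illustrated with a small userspace sketch. Everything here is hypothetical: the dup_ names, the fixed-size table and the C11 atomic_flag spinlock merely stand in for the kernel's session hash lists and their locks.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define DUP_MAX_SESSIONS 8

static atomic_flag dup_lock = ATOMIC_FLAG_INIT;	/* stand-in for the list lock */
static uint32_t dup_ids[DUP_MAX_SESSIONS];
static int dup_count;

static void dup_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&dup_lock,
						 memory_order_acquire))
		;	/* spin until the flag is released */
}

static void dup_lock_release(void)
{
	atomic_flag_clear_explicit(&dup_lock, memory_order_release);
}

/*
 * The duplicate check and the insertion happen under one lock hold, so no
 * other caller can slip in between the two steps.
 */
static int dup_session_add(uint32_t session_id)
{
	int err = 0;
	int i;

	dup_lock_acquire();
	for (i = 0; i < dup_count; i++) {
		if (dup_ids[i] == session_id) {
			err = -EEXIST;
			goto out;
		}
	}
	if (dup_count == DUP_MAX_SESSIONS) {
		err = -ENOSPC;
		goto out;
	}
	dup_ids[dup_count++] = session_id;
out:
	dup_lock_release();
	return err;
}
```

A caller simply attempts the insertion and handles -EEXIST, rather than checking for the id first and inserting afterwards.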
From: Guillaume Nault g.nault@alphalink.fr
commit 55a3ce3b9d98f752df9e2cfb1cba7e715522428a upstream.
This function isn't used anymore.
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.c |   51 +--------------------------------------------------
 net/l2tp/l2tp_core.h |    3 ---
 2 files changed, 1 insertion(+), 53 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -216,27 +216,6 @@ static void l2tp_tunnel_sock_put(struct
 	sock_put(sk);
 }

-/* Lookup a session by id in the global session list
- */
-static struct l2tp_session *l2tp_session_find_2(struct net *net, u32 session_id)
-{
-	struct l2tp_net *pn = l2tp_pernet(net);
-	struct hlist_head *session_list =
-		l2tp_session_id_hash_2(pn, session_id);
-	struct l2tp_session *session;
-
-	rcu_read_lock_bh();
-	hlist_for_each_entry_rcu(session, session_list, global_hlist) {
-		if (session->session_id == session_id) {
-			rcu_read_unlock_bh();
-			return session;
-		}
-	}
-	rcu_read_unlock_bh();
-
-	return NULL;
-}
-
 /* Session hash list.
  * The session_id SHOULD be random according to RFC2661, but several
  * L2TP implementations (Cisco and Microsoft) use incrementing
@@ -249,35 +228,7 @@ l2tp_session_id_hash(struct l2tp_tunnel
 	return &tunnel->session_hlist[hash_32(session_id, L2TP_HASH_BITS)];
 }

-/* Lookup a session by id
- */
-struct l2tp_session *l2tp_session_find(struct net *net, struct l2tp_tunnel *tunnel, u32 session_id)
-{
-	struct hlist_head *session_list;
-	struct l2tp_session *session;
-
-	/* In L2TPv3, session_ids are unique over all tunnels and we
-	 * sometimes need to look them up before we know the
-	 * tunnel.
-	 */
-	if (tunnel == NULL)
-		return l2tp_session_find_2(net, session_id);
-
-	session_list = l2tp_session_id_hash(tunnel, session_id);
-	read_lock_bh(&tunnel->hlist_lock);
-	hlist_for_each_entry(session, session_list, hlist) {
-		if (session->session_id == session_id) {
-			read_unlock_bh(&tunnel->hlist_lock);
-			return session;
-		}
-	}
-	read_unlock_bh(&tunnel->hlist_lock);
-
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(l2tp_session_find);
-
-/* Like l2tp_session_find() but takes a reference on the returned session.
+/* Lookup a session. A new reference is held on the returned session.
  * Optionally calls session->ref() too if do_ref is true.
  */
 struct l2tp_session *l2tp_session_get(struct net *net,
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -237,9 +237,6 @@ out:
 struct l2tp_session *l2tp_session_get(struct net *net,
 				      struct l2tp_tunnel *tunnel,
 				      u32 session_id, bool do_ref);
-struct l2tp_session *l2tp_session_find(struct net *net,
-				       struct l2tp_tunnel *tunnel,
-				       u32 session_id);
 struct l2tp_session *l2tp_session_get_nth(struct l2tp_tunnel *tunnel, int nth,
 					  bool do_ref);
 struct l2tp_session *l2tp_session_get_by_ifname(struct net *net, char *ifname,
From: Guillaume Nault g.nault@alphalink.fr
commit 9aaef50c44f132e040dcd7686c8e78a3390037c5 upstream.
Make l2tp_pernet()'s parameter constant, so that l2tp_session_get*() can declare their "net" variable as "const". Also constify "ifname" in l2tp_session_get_by_ifname().
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.c |    7 ++++---
 net/l2tp/l2tp_core.h |    5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -119,7 +119,7 @@ static inline struct l2tp_tunnel *l2tp_t
 	return sk->sk_user_data;
 }

-static inline struct l2tp_net *l2tp_pernet(struct net *net)
+static inline struct l2tp_net *l2tp_pernet(const struct net *net)
 {
 	BUG_ON(!net);

@@ -231,7 +231,7 @@ l2tp_session_id_hash(struct l2tp_tunnel
 /* Lookup a session. A new reference is held on the returned session.
  * Optionally calls session->ref() too if do_ref is true.
  */
-struct l2tp_session *l2tp_session_get(struct net *net,
+struct l2tp_session *l2tp_session_get(const struct net *net,
 				      struct l2tp_tunnel *tunnel,
 				      u32 session_id, bool do_ref)
 {
@@ -306,7 +306,8 @@ EXPORT_SYMBOL_GPL(l2tp_session_get_nth);
 /* Lookup a session by interface name.
  * This is very inefficient but is only used by management interfaces.
  */
-struct l2tp_session *l2tp_session_get_by_ifname(struct net *net, char *ifname,
+struct l2tp_session *l2tp_session_get_by_ifname(const struct net *net,
+						const char *ifname,
 						bool do_ref)
 {
 	struct l2tp_net *pn = l2tp_pernet(net);
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -234,12 +234,13 @@ out:
 	return tunnel;
 }

-struct l2tp_session *l2tp_session_get(struct net *net,
+struct l2tp_session *l2tp_session_get(const struct net *net,
 				      struct l2tp_tunnel *tunnel,
 				      u32 session_id, bool do_ref);
 struct l2tp_session *l2tp_session_get_nth(struct l2tp_tunnel *tunnel, int nth,
 					  bool do_ref);
-struct l2tp_session *l2tp_session_get_by_ifname(struct net *net, char *ifname,
+struct l2tp_session *l2tp_session_get_by_ifname(const struct net *net,
+						const char *ifname,
 						bool do_ref);
 struct l2tp_tunnel *l2tp_tunnel_find(struct net *net, u32 tunnel_id);
 struct l2tp_tunnel *l2tp_tunnel_find_nth(struct net *net, int nth);
From: Guillaume Nault g.nault@alphalink.fr
commit 2f858b928bf5a8174911aaec76b8b72a9ca0533d upstream.
l2tp_tunnel_find() and l2tp_tunnel_find_nth() don't modify "net".
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.c |    4 ++--
 net/l2tp/l2tp_core.h |    4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -378,7 +378,7 @@ exist:

 /* Lookup a tunnel by id
  */
-struct l2tp_tunnel *l2tp_tunnel_find(struct net *net, u32 tunnel_id)
+struct l2tp_tunnel *l2tp_tunnel_find(const struct net *net, u32 tunnel_id)
 {
 	struct l2tp_tunnel *tunnel;
 	struct l2tp_net *pn = l2tp_pernet(net);
@@ -396,7 +396,7 @@ struct l2tp_tunnel *l2tp_tunnel_find(str
 }
 EXPORT_SYMBOL_GPL(l2tp_tunnel_find);

-struct l2tp_tunnel *l2tp_tunnel_find_nth(struct net *net, int nth)
+struct l2tp_tunnel *l2tp_tunnel_find_nth(const struct net *net, int nth)
 {
 	struct l2tp_net *pn = l2tp_pernet(net);
 	struct l2tp_tunnel *tunnel;
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -242,8 +242,8 @@ struct l2tp_session *l2tp_session_get_nt
 struct l2tp_session *l2tp_session_get_by_ifname(const struct net *net,
 						const char *ifname,
 						bool do_ref);
-struct l2tp_tunnel *l2tp_tunnel_find(struct net *net, u32 tunnel_id);
-struct l2tp_tunnel *l2tp_tunnel_find_nth(struct net *net, int nth);
+struct l2tp_tunnel *l2tp_tunnel_find(const struct net *net, u32 tunnel_id);
+struct l2tp_tunnel *l2tp_tunnel_find_nth(const struct net *net, int nth);

 int l2tp_tunnel_create(struct net *net, int fd, int version, u32 tunnel_id,
 		       u32 peer_tunnel_id, struct l2tp_tunnel_cfg *cfg,
From: Guillaume Nault g.nault@alphalink.fr
commit 9ee369a405c57613d7c83a3967780c3e30c52ecc upstream.
Sessions must be fully initialised before calling l2tp_session_add_to_tunnel(). Otherwise, there's a short time frame where partially initialised sessions can be accessed by external users.
Backporting Notes
l2tp_core.c: moving code that had been converted from atomic to refcount_t by an earlier change (which isn't being included in this patch series).
Fixes: dbdbc73b4478 ("l2tp: fix duplicate session creation")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -1853,6 +1853,8 @@ struct l2tp_session *l2tp_session_create

 		l2tp_session_set_header_len(session, tunnel->version);

+		l2tp_session_inc_refcount(session);
+
 		err = l2tp_session_add_to_tunnel(tunnel, session);
 		if (err) {
 			kfree(session);
@@ -1860,10 +1862,6 @@ struct l2tp_session *l2tp_session_create
 			return ERR_PTR(err);
 		}

-		/* Bump the reference count. The session context is deleted
-		 * only when this drops to zero.
-		 */
-		l2tp_session_inc_refcount(session);
 		l2tp_tunnel_inc_refcount(tunnel);

 		/* Ensure tunnel socket isn't deleted */
From: Guillaume Nault g.nault@alphalink.fr
commit 54652eb12c1b72e9602d09cb2821d5760939190f upstream.
l2tp_tunnel_find() doesn't take a reference on the returned tunnel. Therefore, it's unsafe to use it because the returned tunnel can go away on us anytime.
Fix this by defining l2tp_tunnel_get(), which works like l2tp_tunnel_find(), but takes a reference on the returned tunnel. Caller then has to drop this reference using l2tp_tunnel_dec_refcount().
As l2tp_tunnel_dec_refcount() needs to be moved to l2tp_core.h, let's simplify the patch and not move the L2TP_REFCNT_DEBUG part. This code has been broken (not even compiling) in May 2012 by commit a4ca44fa578c ("net: l2tp: Standardize logging styles") and fixed more than two years later by commit 29abe2fda54f ("l2tp: fix missing line continuation"). So it doesn't appear to be used by anyone.
Same thing for l2tp_tunnel_free(); instead of moving it to l2tp_core.h, let's just simplify things and call kfree_rcu() directly in l2tp_tunnel_dec_refcount(). Extra assertions and debugging code provided by l2tp_tunnel_free() didn't help catching any of the reference counting and socket handling issues found while working on this series.
Backporting Notes
l2tp_core.c: This patch deletes some code / moves some code to l2tp_core.h and follows the patch (not included in this series) that switched from atomic to refcount_t. Moved code changed back to atomic.
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.c    |   66 +++++++++++++++---------------------------------
 net/l2tp/l2tp_core.h    |   13 +++++++++
 net/l2tp/l2tp_netlink.c |    6 ++--
 3 files changed, 38 insertions(+), 47 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -112,7 +112,6 @@ struct l2tp_net {
 	spinlock_t l2tp_session_hlist_lock;
 };

-static void l2tp_tunnel_free(struct l2tp_tunnel *tunnel);

 static inline struct l2tp_tunnel *l2tp_tunnel(struct sock *sk)
 {
@@ -126,39 +125,6 @@ static inline struct l2tp_net *l2tp_pern
 	return net_generic(net, l2tp_net_id);
 }

-/* Tunnel reference counts. Incremented per session that is added to
- * the tunnel.
- */
-static inline void l2tp_tunnel_inc_refcount_1(struct l2tp_tunnel *tunnel)
-{
-	atomic_inc(&tunnel->ref_count);
-}
-
-static inline void l2tp_tunnel_dec_refcount_1(struct l2tp_tunnel *tunnel)
-{
-	if (atomic_dec_and_test(&tunnel->ref_count))
-		l2tp_tunnel_free(tunnel);
-}
-#ifdef L2TP_REFCNT_DEBUG
-#define l2tp_tunnel_inc_refcount(_t)					\
-do {									\
-	pr_debug("l2tp_tunnel_inc_refcount: %s:%d %s: cnt=%d\n",	\
-		 __func__, __LINE__, (_t)->name,			\
-		 atomic_read(&_t->ref_count));				\
-	l2tp_tunnel_inc_refcount_1(_t);					\
-} while (0)
-#define l2tp_tunnel_dec_refcount(_t)					\
-do {									\
-	pr_debug("l2tp_tunnel_dec_refcount: %s:%d %s: cnt=%d\n",	\
-		 __func__, __LINE__, (_t)->name,			\
-		 atomic_read(&_t->ref_count));				\
-	l2tp_tunnel_dec_refcount_1(_t);					\
-} while (0)
-#else
-#define l2tp_tunnel_inc_refcount(t) l2tp_tunnel_inc_refcount_1(t)
-#define l2tp_tunnel_dec_refcount(t) l2tp_tunnel_dec_refcount_1(t)
-#endif
-
 /* Session hash global list for L2TPv3.
  * The session_id SHOULD be random according to RFC3931, but several
  * L2TP implementations use incrementing session_ids.  So we do a real
@@ -228,6 +194,27 @@ l2tp_session_id_hash(struct l2tp_tunnel
 	return &tunnel->session_hlist[hash_32(session_id, L2TP_HASH_BITS)];
 }

+/* Lookup a tunnel. A new reference is held on the returned tunnel. */
+struct l2tp_tunnel *l2tp_tunnel_get(const struct net *net, u32 tunnel_id)
+{
+	const struct l2tp_net *pn = l2tp_pernet(net);
+	struct l2tp_tunnel *tunnel;
+
+	rcu_read_lock_bh();
+	list_for_each_entry_rcu(tunnel, &pn->l2tp_tunnel_list, list) {
+		if (tunnel->tunnel_id == tunnel_id) {
+			l2tp_tunnel_inc_refcount(tunnel);
+			rcu_read_unlock_bh();
+
+			return tunnel;
+		}
+	}
+	rcu_read_unlock_bh();
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(l2tp_tunnel_get);
+
 /* Lookup a session. A new reference is held on the returned session.
  * Optionally calls session->ref() too if do_ref is true.
  */
@@ -1351,17 +1338,6 @@ static void l2tp_udp_encap_destroy(struc
 	}
 }

-/* Really kill the tunnel.
- * Come here only when all sessions have been cleared from the tunnel.
- */
-static void l2tp_tunnel_free(struct l2tp_tunnel *tunnel)
-{
-	BUG_ON(atomic_read(&tunnel->ref_count) != 0);
-	BUG_ON(tunnel->sock != NULL);
-	l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: free...\n", tunnel->name);
-	kfree_rcu(tunnel, rcu);
-}
-
 /* Workqueue tunnel deletion function */
 static void l2tp_tunnel_del_work(struct work_struct *work)
 {
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -234,6 +234,8 @@ out:
 	return tunnel;
 }

+struct l2tp_tunnel *l2tp_tunnel_get(const struct net *net, u32 tunnel_id);
+
 struct l2tp_session *l2tp_session_get(const struct net *net,
 				      struct l2tp_tunnel *tunnel,
 				      u32 session_id, bool do_ref);
@@ -272,6 +274,17 @@ int l2tp_nl_register_ops(enum l2tp_pwtyp
 void l2tp_nl_unregister_ops(enum l2tp_pwtype pw_type);
 int l2tp_ioctl(struct sock *sk, int cmd, unsigned long arg);

+static inline void l2tp_tunnel_inc_refcount(struct l2tp_tunnel *tunnel)
+{
+	atomic_inc(&tunnel->ref_count);
+}
+
+static inline void l2tp_tunnel_dec_refcount(struct l2tp_tunnel *tunnel)
+{
+	if (atomic_dec_and_test(&tunnel->ref_count))
+		kfree_rcu(tunnel, rcu);
+}
+
 /* Session reference counts. Incremented when code obtains a reference
  * to a session.
  */
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -72,10 +72,12 @@ static struct l2tp_session *l2tp_nl_sess
 		   (info->attrs[L2TP_ATTR_CONN_ID])) {
 		tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);
 		session_id = nla_get_u32(info->attrs[L2TP_ATTR_SESSION_ID]);
-		tunnel = l2tp_tunnel_find(net, tunnel_id);
-		if (tunnel)
+		tunnel = l2tp_tunnel_get(net, tunnel_id);
+		if (tunnel) {
 			session = l2tp_session_get(net, tunnel, session_id,
 						   do_ref);
+			l2tp_tunnel_dec_refcount(tunnel);
+		}
 	}

 	return session;
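The get/put discipline introduced by l2tp_tunnel_get() can be sketched in userspace as follows. This is only an illustration under stated assumptions: the ref_ names, the fixed-size table and the ref_freed counter are hypothetical, C11 atomics stand in for the kernel's atomic_t, and the RCU protection that makes the lookup safe against concurrent removal in the kernel is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define REF_MAX_TUNNELS 4

struct ref_tunnel {
	uint32_t tunnel_id;
	atomic_int ref_count;
};

static struct ref_tunnel *ref_table[REF_MAX_TUNNELS];
static int ref_freed;	/* counts frees, for demonstration only */

/* Create a tunnel; the table owns the initial reference. */
static struct ref_tunnel *ref_tunnel_create(uint32_t id, int slot)
{
	struct ref_tunnel *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->tunnel_id = id;
	atomic_init(&t->ref_count, 1);
	ref_table[slot] = t;
	return t;
}

/* Drop one reference; the last put frees the object. */
static void ref_tunnel_put(struct ref_tunnel *t)
{
	if (atomic_fetch_sub(&t->ref_count, 1) == 1) {
		ref_freed++;
		free(t);
	}
}

/* Lookup by id; a new reference is held on the returned tunnel. */
static struct ref_tunnel *ref_tunnel_get(uint32_t id)
{
	int i;

	for (i = 0; i < REF_MAX_TUNNELS; i++) {
		struct ref_tunnel *t = ref_table[i];

		if (t && t->tunnel_id == id) {
			atomic_fetch_add(&t->ref_count, 1);
			return t;
		}
	}
	return NULL;
}

/* Remove the tunnel from the table and drop the table's reference. */
static void ref_tunnel_unlink(int slot)
{
	struct ref_tunnel *t = ref_table[slot];

	ref_table[slot] = NULL;
	if (t)
		ref_tunnel_put(t);
}
```

A caller that obtained the tunnel through ref_tunnel_get() can keep using it after ref_tunnel_unlink(); the memory is released only when that caller's own ref_tunnel_put() drops the last reference.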
From: Guillaume Nault g.nault@alphalink.fr
commit bb0a32ce4389e17e47e198d2cddaf141561581ad upstream.
l2tp_nl_cmd_tunnel_delete() needs to take a reference on the tunnel, to prevent it from being concurrently freed by l2tp_tunnel_destruct().
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_netlink.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -280,8 +280,8 @@ static int l2tp_nl_cmd_tunnel_delete(str
 	}
 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);

-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-	if (tunnel == NULL) {
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
+	if (!tunnel) {
 		ret = -ENODEV;
 		goto out;
 	}
@@ -291,6 +291,8 @@ static int l2tp_nl_cmd_tunnel_delete(str

 	l2tp_tunnel_delete(tunnel);

+	l2tp_tunnel_dec_refcount(tunnel);
+
 out:
 	return ret;
 }
From: Guillaume Nault g.nault@alphalink.fr
commit 8c0e421525c9eb50d68e8f633f703ca31680b746 upstream.
We need to make sure the tunnel is not going to be destroyed by l2tp_tunnel_destruct() concurrently.
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_netlink.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -310,8 +310,8 @@ static int l2tp_nl_cmd_tunnel_modify(str
 	}
 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);

-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-	if (tunnel == NULL) {
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
+	if (!tunnel) {
 		ret = -ENODEV;
 		goto out;
 	}
@@ -322,6 +322,8 @@ static int l2tp_nl_cmd_tunnel_modify(str
 	ret = l2tp_tunnel_notify(&l2tp_nl_family, info,
 				 tunnel, L2TP_CMD_TUNNEL_MODIFY);

+	l2tp_tunnel_dec_refcount(tunnel);
+
 out:
 	return ret;
 }
From: Guillaume Nault g.nault@alphalink.fr
commit 4e4b21da3acc68a7ea55f850cacc13706b7480e9 upstream.
Use l2tp_tunnel_get() instead of l2tp_tunnel_find() so that we get a reference on the tunnel, preventing l2tp_tunnel_destruct() from freeing it from under us.
Also move l2tp_tunnel_get() below nlmsg_new() so that we only take the reference when needed.
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_netlink.c |   27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -428,34 +428,37 @@ static int l2tp_nl_cmd_tunnel_get(struct

 	if (!info->attrs[L2TP_ATTR_CONN_ID]) {
 		ret = -EINVAL;
-		goto out;
+		goto err;
 	}

 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);

-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-	if (tunnel == NULL) {
-		ret = -ENODEV;
-		goto out;
-	}
-
 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
 	if (!msg) {
 		ret = -ENOMEM;
-		goto out;
+		goto err;
+	}
+
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
+	if (!tunnel) {
+		ret = -ENODEV;
+		goto err_nlmsg;
 	}

 	ret = l2tp_nl_tunnel_send(msg, info->snd_portid, info->snd_seq,
 				  NLM_F_ACK, tunnel, L2TP_CMD_TUNNEL_GET);
 	if (ret < 0)
-		goto err_out;
+		goto err_nlmsg_tunnel;
+
+	l2tp_tunnel_dec_refcount(tunnel);

 	return genlmsg_unicast(net, msg, info->snd_portid);

-err_out:
+err_nlmsg_tunnel:
+	l2tp_tunnel_dec_refcount(tunnel);
+err_nlmsg:
 	nlmsg_free(msg);
-
-out:
+err:
 	return ret;
 }

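The label ladder used in this patch (acquire the message buffer, then the tunnel reference, and release them in reverse order on failure) can be sketched in userspace like this. The unw_ helpers, the boolean switches that force each failure, and the choice of -EMSGSIZE for the send failure are all assumptions made for illustration; only the ordering of the labels mirrors the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static int unw_msgs_live;	/* message buffers currently allocated */
static int unw_refs_live;	/* tunnel references currently held */

static bool unw_msg_alloc(bool ok)
{
	if (ok)
		unw_msgs_live++;
	return ok;
}

static void unw_msg_release(void)
{
	unw_msgs_live--;
}

static bool unw_tunnel_get(bool exists)
{
	if (exists)
		unw_refs_live++;
	return exists;
}

static void unw_tunnel_put(void)
{
	unw_refs_live--;
}

/* Each label undoes exactly what was acquired before the failing step. */
static int unw_tunnel_get_cmd(bool alloc_ok, bool tunnel_exists, bool send_ok)
{
	int ret;

	if (!unw_msg_alloc(alloc_ok)) {
		ret = -ENOMEM;
		goto err;
	}

	if (!unw_tunnel_get(tunnel_exists)) {
		ret = -ENODEV;
		goto err_nlmsg;
	}

	if (!send_ok) {
		ret = -EMSGSIZE;
		goto err_nlmsg_tunnel;
	}

	unw_tunnel_put();
	unw_msg_release();	/* stands in for handing the message to the receiver */
	return 0;

err_nlmsg_tunnel:
	unw_tunnel_put();
err_nlmsg:
	unw_msg_release();
err:
	return ret;
}
```

Whichever step fails, both counters return to zero, which is the property the reordered labels in the patch are meant to guarantee.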
From: Guillaume Nault g.nault@alphalink.fr
commit e702c1204eb57788ef189c839c8c779368267d70 upstream.
Use l2tp_tunnel_get() to retrieve tunnel, so that it can't go away on us. Otherwise l2tp_tunnel_destruct() might release the last reference count concurrently, thus freeing the tunnel while we're using it.
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_netlink.c |   21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -502,8 +502,9 @@ static int l2tp_nl_cmd_session_create(st
 		ret = -EINVAL;
 		goto out;
 	}
+
 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);
-	tunnel = l2tp_tunnel_find(net, tunnel_id);
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
 	if (!tunnel) {
 		ret = -ENODEV;
 		goto out;
@@ -511,24 +512,24 @@ static int l2tp_nl_cmd_session_create(st

 	if (!info->attrs[L2TP_ATTR_SESSION_ID]) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}
 	session_id = nla_get_u32(info->attrs[L2TP_ATTR_SESSION_ID]);

 	if (!info->attrs[L2TP_ATTR_PEER_SESSION_ID]) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}
 	peer_session_id = nla_get_u32(info->attrs[L2TP_ATTR_PEER_SESSION_ID]);

 	if (!info->attrs[L2TP_ATTR_PW_TYPE]) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}
 	cfg.pw_type = nla_get_u16(info->attrs[L2TP_ATTR_PW_TYPE]);
 	if (cfg.pw_type >= __L2TP_PWTYPE_MAX) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}

 	if (tunnel->version > 2) {
@@ -550,7 +551,7 @@ static int l2tp_nl_cmd_session_create(st
 			u16 len = nla_len(info->attrs[L2TP_ATTR_COOKIE]);
 			if (len > 8) {
 				ret = -EINVAL;
-				goto out;
+				goto out_tunnel;
 			}
 			cfg.cookie_len = len;
 			memcpy(&cfg.cookie[0], nla_data(info->attrs[L2TP_ATTR_COOKIE]), len);
@@ -559,7 +560,7 @@ static int l2tp_nl_cmd_session_create(st
 			u16 len = nla_len(info->attrs[L2TP_ATTR_PEER_COOKIE]);
 			if (len > 8) {
 				ret = -EINVAL;
-				goto out;
+				goto out_tunnel;
 			}
 			cfg.peer_cookie_len = len;
 			memcpy(&cfg.peer_cookie[0], nla_data(info->attrs[L2TP_ATTR_PEER_COOKIE]), len);
@@ -602,7 +603,7 @@ static int l2tp_nl_cmd_session_create(st
 	if ((l2tp_nl_cmd_ops[cfg.pw_type] == NULL) ||
 	    (l2tp_nl_cmd_ops[cfg.pw_type]->session_create == NULL)) {
 		ret = -EPROTONOSUPPORT;
-		goto out;
+		goto out_tunnel;
 	}

 	/* Check that pseudowire-specific params are present */
@@ -612,7 +613,7 @@ static int l2tp_nl_cmd_session_create(st
 	case L2TP_PWTYPE_ETH_VLAN:
 		if (!info->attrs[L2TP_ATTR_VLAN_ID]) {
 			ret = -EINVAL;
-			goto out;
+			goto out_tunnel;
 		}
 		break;
 	case L2TP_PWTYPE_ETH:
@@ -640,6 +641,8 @@ static int l2tp_nl_cmd_session_create(st
 		}
 	}

+out_tunnel:
+	l2tp_tunnel_dec_refcount(tunnel);
 out:
 	return ret;
 }
From: Guillaume Nault g.nault@alphalink.fr
commit f3c66d4e144a0904ea9b95d23ed9f8eb38c11bfb upstream.
l2tp_tunnel_destruct() sets tunnel->sock to NULL, then removes the tunnel from the pernet list and finally closes all its sessions. Therefore, it's possible to add a session to a tunnel that is still reachable, but for which tunnel->sock has already been reset. This can make l2tp_session_create() dereference a NULL pointer when calling sock_hold(tunnel->sock).
This patch adds the .acpt_newsess field to struct l2tp_tunnel, which is used by l2tp_tunnel_closeall() to prevent addition of new sessions to tunnels. Resetting tunnel->sock is done after l2tp_tunnel_closeall() returned, so that l2tp_session_add_to_tunnel() can safely take a reference on it when .acpt_newsess is true.
The .acpt_newsess field is modified in l2tp_tunnel_closeall(), rather than in l2tp_tunnel_destruct(), so that it benefits all tunnel removal mechanisms. E.g. on UDP tunnels, a session could be added to a tunnel after l2tp_udp_encap_destroy() proceeded. This would prevent the tunnel from being removed because of the references held by this new session on the tunnel and its socket. Even though the session could be removed manually later on, this defeats the purpose of commit 9980d001cec8 ("l2tp: add udp encap socket destroy handler").
Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.c |   41 ++++++++++++++++++++++++++++-------------
 net/l2tp/l2tp_core.h |    4 ++++
 2 files changed, 32 insertions(+), 13 deletions(-)

--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -328,13 +328,21 @@ static int l2tp_session_add_to_tunnel(st
 	struct hlist_head *g_head;
 	struct hlist_head *head;
 	struct l2tp_net *pn;
+	int err;

 	head = l2tp_session_id_hash(tunnel, session->session_id);

 	write_lock_bh(&tunnel->hlist_lock);
+	if (!tunnel->acpt_newsess) {
+		err = -ENODEV;
+		goto err_tlock;
+	}
+
 	hlist_for_each_entry(session_walk, head, hlist)
-		if (session_walk->session_id == session->session_id)
-			goto exist;
+		if (session_walk->session_id == session->session_id) {
+			err = -EEXIST;
+			goto err_tlock;
+		}

 	if (tunnel->version == L2TP_HDR_VER_3) {
 		pn = l2tp_pernet(tunnel->l2tp_net);
@@ -342,12 +350,21 @@ static int l2tp_session_add_to_tunnel(st
 						session->session_id);

 		spin_lock_bh(&pn->l2tp_session_hlist_lock);
+
 		hlist_for_each_entry(session_walk, g_head, global_hlist)
-			if (session_walk->session_id == session->session_id)
-				goto exist_glob;
+			if (session_walk->session_id == session->session_id) {
+				err = -EEXIST;
+				goto err_tlock_pnlock;
+			}

+		l2tp_tunnel_inc_refcount(tunnel);
+		sock_hold(tunnel->sock);
 		hlist_add_head_rcu(&session->global_hlist, g_head);
+
 		spin_unlock_bh(&pn->l2tp_session_hlist_lock);
+	} else {
+		l2tp_tunnel_inc_refcount(tunnel);
+		sock_hold(tunnel->sock);
 	}

 	hlist_add_head(&session->hlist, head);
@@ -355,12 +372,12 @@ static int l2tp_session_add_to_tunnel(st

 	return 0;

-exist_glob:
+err_tlock_pnlock:
 	spin_unlock_bh(&pn->l2tp_session_hlist_lock);
-exist:
+err_tlock:
 	write_unlock_bh(&tunnel->hlist_lock);

-	return -EEXIST;
+	return err;
 }

 /* Lookup a tunnel by id
@@ -1251,7 +1268,6 @@ static void l2tp_tunnel_destruct(struct
 	/* Remove hooks into tunnel socket */
 	sk->sk_destruct = tunnel->old_sk_destruct;
 	sk->sk_user_data = NULL;
-	tunnel->sock = NULL;

 	/* Remove the tunnel struct from the tunnel list */
 	pn = l2tp_pernet(tunnel->l2tp_net);
@@ -1261,6 +1277,8 @@ static void l2tp_tunnel_destruct(struct
 	atomic_dec(&l2tp_tunnel_count);

 	l2tp_tunnel_closeall(tunnel);
+
+	tunnel->sock = NULL;
 	l2tp_tunnel_dec_refcount(tunnel);

 	/* Call the original destructor */
@@ -1285,6 +1303,7 @@ void l2tp_tunnel_closeall(struct l2tp_tu
 		  tunnel->name);

 	write_lock_bh(&tunnel->hlist_lock);
+	tunnel->acpt_newsess = false;
 	for (hash = 0; hash < L2TP_HASH_SIZE; hash++) {
 again:
 		hlist_for_each_safe(walk, tmp, &tunnel->session_hlist[hash]) {
@@ -1588,6 +1607,7 @@ int l2tp_tunnel_create(struct net *net,
 	tunnel->magic = L2TP_TUNNEL_MAGIC;
 	sprintf(&tunnel->name[0], "tunl %u", tunnel_id);
 	rwlock_init(&tunnel->hlist_lock);
+	tunnel->acpt_newsess = true;

 	/* The net we belong to */
 	tunnel->l2tp_net = net;
@@ -1838,11 +1858,6 @@ struct l2tp_session *l2tp_session_create
 			return ERR_PTR(err);
 		}

-		l2tp_tunnel_inc_refcount(tunnel);
-
-		/* Ensure tunnel socket isn't deleted */
-		sock_hold(tunnel->sock);
-
 		/* Ignore management session in session count value */
 		if (session->session_id != 0)
 			atomic_inc(&l2tp_session_count);
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -165,6 +165,10 @@ struct l2tp_tunnel {

 	struct rcu_head rcu;
 	rwlock_t		hlist_lock;	/* protect session_hlist */
+	bool			acpt_newsess;	/* Indicates whether this
+						 * tunnel accepts new sessions.
+						 * Protected by hlist_lock.
+						 */
 	struct hlist_head	session_hlist[L2TP_HASH_SIZE];
 						/* hashed list of sessions,
 						 * hashed by id */
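The role of .acpt_newsess (a flag, protected by the same lock as the session list, that closeall clears before sessions are torn down) can be sketched in userspace as below. The acpt_ names, the session counter and the atomic_flag spinlock are hypothetical stand-ins for the tunnel, its session hash list and hlist_lock; this is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

struct acpt_tunnel {
	atomic_flag lock;	/* stands in for hlist_lock */
	bool accepts_new;	/* stands in for acpt_newsess */
	int nr_sessions;	/* stands in for the session hash list */
};

#define ACPT_TUNNEL_INIT { ATOMIC_FLAG_INIT, true, 0 }

static void acpt_lock(struct acpt_tunnel *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock,
						 memory_order_acquire))
		;	/* spin until the lock is released */
}

static void acpt_unlock(struct acpt_tunnel *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* The flag is tested under the lock that also guards insertion. */
static int acpt_session_add(struct acpt_tunnel *t)
{
	int err = 0;

	acpt_lock(t);
	if (!t->accepts_new)
		err = -ENODEV;
	else
		t->nr_sessions++;
	acpt_unlock(t);

	return err;
}

/* Refuse new sessions first, then drop the existing ones. */
static void acpt_closeall(struct acpt_tunnel *t)
{
	acpt_lock(t);
	t->accepts_new = false;
	t->nr_sessions = 0;
	acpt_unlock(t);
}
```

Because the flag and the list share one lock, an add either completes before closeall starts (and is then torn down with the rest) or observes the cleared flag and fails with -ENODEV; it cannot land on a tunnel that has already been emptied.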
From: Guillaume Nault g.nault@alphalink.fr
commit f026bc29a8e093edfbb2a77700454b285c97e8ad upstream.
Using l2tp_tunnel_find() in pppol2tp_session_create() and l2tp_eth_create() is racy, because no reference is held on the returned tunnel. These functions are only used to implement the ->session_create callback which is run by l2tp_nl_cmd_session_create(). Therefore searching for the parent tunnel isn't necessary because l2tp_nl_cmd_session_create() already has a pointer to it and holds a reference.
This patch modifies ->session_create()'s prototype to directly pass the parent tunnel as parameter, thus avoiding searching for it in pppol2tp_session_create() and l2tp_eth_create().
Since we have to touch the ->session_create() call in l2tp_nl_cmd_session_create(), let's also remove the useless conditional: we know that ->session_create isn't NULL at this point because it's already been checked earlier in this same function.
Finally, one might be tempted to think that the removed l2tp_tunnel_find() calls were harmless because they would return the same tunnel as the one held by l2tp_nl_cmd_session_create() anyway. But that tunnel might be removed and a new one created with same tunnel Id before the l2tp_tunnel_find() call. In this case l2tp_tunnel_find() would return the new tunnel which wouldn't be protected by the reference held by l2tp_nl_cmd_session_create().
Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP")
Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.h    |    4 +++-
 net/l2tp/l2tp_eth.c     |   11 +++--------
 net/l2tp/l2tp_netlink.c |    8 ++++----
 net/l2tp/l2tp_ppp.c     |   19 +++++++------------
 4 files changed, 17 insertions(+), 25 deletions(-)

--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -204,7 +204,9 @@ struct l2tp_tunnel {
 };

 struct l2tp_nl_cmd_ops {
-	int (*session_create)(struct net *net, u32 tunnel_id, u32 session_id, u32 peer_session_id, struct l2tp_session_cfg *cfg);
+	int (*session_create)(struct net *net, struct l2tp_tunnel *tunnel,
+			      u32 session_id, u32 peer_session_id,
+			      struct l2tp_session_cfg *cfg);
 	int (*session_delete)(struct l2tp_session *session);
 };

--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -256,23 +256,18 @@ static void l2tp_eth_adjust_mtu(struct l
 	dev->needed_headroom += session->hdr_len;
 }

-static int l2tp_eth_create(struct net *net, u32 tunnel_id, u32 session_id, u32 peer_session_id, struct l2tp_session_cfg *cfg)
+static int l2tp_eth_create(struct net *net, struct l2tp_tunnel *tunnel,
+			   u32 session_id, u32 peer_session_id,
+			   struct l2tp_session_cfg *cfg)
 {
 	struct net_device *dev;
 	char name[IFNAMSIZ];
-	struct l2tp_tunnel *tunnel;
 	struct l2tp_session *session;
 	struct l2tp_eth *priv;
 	struct l2tp_eth_sess *spriv;
 	int rc;
 	struct l2tp_eth_net *pn;

-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-	if (!tunnel) {
-		rc = -ENODEV;
-		goto out;
-	}
-
 	if (cfg->ifname) {
 		dev = dev_get_by_name(net, cfg->ifname);
 		if (dev) {
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -627,10 +627,10 @@ static int l2tp_nl_cmd_session_create(st
 		break;
 	}

-	ret = -EPROTONOSUPPORT;
-	if (l2tp_nl_cmd_ops[cfg.pw_type]->session_create)
-		ret = (*l2tp_nl_cmd_ops[cfg.pw_type]->session_create)(net, tunnel_id,
-			session_id, peer_session_id, &cfg);
+	ret = l2tp_nl_cmd_ops[cfg.pw_type]->session_create(net, tunnel,
+							   session_id,
+							   peer_session_id,
+							   &cfg);

 	if (ret >= 0) {
 		session = l2tp_session_get(net, tunnel, session_id, false);
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -810,25 +810,20 @@ end:

 #ifdef CONFIG_L2TP_V3

-/* Called when creating sessions via the netlink interface.
- */
-static int pppol2tp_session_create(struct net *net, u32 tunnel_id, u32 session_id, u32 peer_session_id, struct l2tp_session_cfg *cfg)
+/* Called when creating sessions via the netlink interface. */
+static int pppol2tp_session_create(struct net *net, struct l2tp_tunnel *tunnel,
+				   u32 session_id, u32 peer_session_id,
+				   struct l2tp_session_cfg *cfg)
 {
 	int error;
-	struct l2tp_tunnel *tunnel;
 	struct l2tp_session *session;
 	struct pppol2tp_session *ps;

-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-
-	/* Error if we can't find the tunnel */
-	error = -ENOENT;
-	if (tunnel == NULL)
-		goto out;
-
 	/* Error if tunnel socket is not prepped */
-	if (tunnel->sock == NULL)
+	if (!tunnel->sock) {
+		error = -ENOENT;
 		goto out;
+	}

 	/* Default MTU values. */
 	if (cfg->mtu == 0)
From: Guillaume Nault g.nault@alphalink.fr
commit 9f775ead5e570e7e19015b9e4e2f3dd6e71a5935 upstream.
The l2tp_eth module crashes if its netlink callbacks are run when the pernet data aren't initialised.
We should normally register_pernet_device() before the genl callbacks. However, the pernet data only maintain a list of l2tpeth interfaces, and this list is never used. So let's just drop pernet handling instead.
Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_eth.c |   51 ++-------------------------------------------------
 1 file changed, 2 insertions(+), 49 deletions(-)

--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -44,7 +44,6 @@ struct l2tp_eth {
 	struct net_device	*dev;
 	struct sock		*tunnel_sock;
 	struct l2tp_session	*session;
-	struct list_head	list;
 	atomic_long_t		tx_bytes;
 	atomic_long_t		tx_packets;
 	atomic_long_t		tx_dropped;
@@ -58,17 +57,6 @@ struct l2tp_eth_sess {
 	struct net_device	*dev;
 };

-/* per-net private data for this module */
-static unsigned int l2tp_eth_net_id;
-struct l2tp_eth_net {
-	struct list_head l2tp_eth_dev_list;
-	spinlock_t l2tp_eth_lock;
-};
-
-static inline struct l2tp_eth_net *l2tp_eth_pernet(struct net *net)
-{
-	return net_generic(net, l2tp_eth_net_id);
-}

 static struct lock_class_key l2tp_eth_tx_busylock;
 static int l2tp_eth_dev_init(struct net_device *dev)
@@ -84,12 +72,6 @@ static int l2tp_eth_dev_init(struct net_

 static void l2tp_eth_dev_uninit(struct net_device *dev)
 {
-	struct l2tp_eth *priv = netdev_priv(dev);
-	struct l2tp_eth_net *pn = l2tp_eth_pernet(dev_net(dev));
-
-	spin_lock(&pn->l2tp_eth_lock);
-	list_del_init(&priv->list);
-	spin_unlock(&pn->l2tp_eth_lock);
 	dev_put(dev);
 }

@@ -266,7 +248,6 @@ static int l2tp_eth_create(struct net *n
 	struct l2tp_eth *priv;
 	struct l2tp_eth_sess *spriv;
 	int rc;
-	struct l2tp_eth_net *pn;

 	if (cfg->ifname) {
 		dev = dev_get_by_name(net, cfg->ifname);
@@ -299,7 +280,6 @@ static int l2tp_eth_create(struct net *n
 	priv = netdev_priv(dev);
 	priv->dev = dev;
 	priv->session = session;
-	INIT_LIST_HEAD(&priv->list);

 	priv->tunnel_sock = tunnel->sock;
 	session->recv_skb = l2tp_eth_dev_recv;
@@ -320,10 +300,6 @@ static int l2tp_eth_create(struct net *n
 	strlcpy(session->ifname, dev->name, IFNAMSIZ);

 	dev_hold(dev);
-	pn = l2tp_eth_pernet(dev_net(dev));
-	spin_lock(&pn->l2tp_eth_lock);
-	list_add(&priv->list, &pn->l2tp_eth_dev_list);
-	spin_unlock(&pn->l2tp_eth_lock);

 	return 0;

@@ -336,22 +312,6 @@ out:
 	return rc;
 }

-static __net_init int l2tp_eth_init_net(struct net *net)
-{
-	struct l2tp_eth_net *pn = net_generic(net, l2tp_eth_net_id);
-
-	INIT_LIST_HEAD(&pn->l2tp_eth_dev_list);
-	spin_lock_init(&pn->l2tp_eth_lock);
-
-	return 0;
-}
-
-static struct pernet_operations l2tp_eth_net_ops = {
-	.init = l2tp_eth_init_net,
-	.id   = &l2tp_eth_net_id,
-	.size = sizeof(struct l2tp_eth_net),
-};
-

 static const struct l2tp_nl_cmd_ops l2tp_eth_nl_cmd_ops = {
 	.session_create	= l2tp_eth_create,
@@ -365,25 +325,18 @@ static int __init l2tp_eth_init(void)
err = l2tp_nl_register_ops(L2TP_PWTYPE_ETH, &l2tp_eth_nl_cmd_ops); if (err) - goto out; - - err = register_pernet_device(&l2tp_eth_net_ops); - if (err) - goto out_unreg; + goto err;
pr_info("L2TP ethernet pseudowire support (L2TPv3)\n");
return 0;
-out_unreg: - l2tp_nl_unregister_ops(L2TP_PWTYPE_ETH); -out: +err: return err; }
static void __exit l2tp_eth_exit(void) { - unregister_pernet_device(&l2tp_eth_net_ops); l2tp_nl_unregister_ops(L2TP_PWTYPE_ETH); }
From: Guillaume Nault g.nault@alphalink.fr
commit 3953ae7b218df4d1e544b98a393666f9ae58a78c upstream.
Sessions created by l2tp_session_create() aren't fully initialised: some pseudo-wire specific operations need to be done before making the session usable. Therefore the PPP and Ethernet pseudo-wires continue working on the returned l2tp session while it's already been exposed to the rest of the system. This can lead to various issues. In particular, the session may enter the deletion process before having been fully initialised, which will confuse the session removal code.
This patch moves session registration out of l2tp_session_create(), so that callers can control when the session is exposed to the rest of the system. This is done by the new l2tp_session_register() function.
Only pppol2tp_session_create() can be easily converted to avoid modifying its session after registration (the debug message is dropped in order to avoid the need for holding a reference on the session).
For pppol2tp_connect() and l2tp_eth_create(), more work is needed. That'll be done in followup patches. For now, let's just register the session right after its creation, like it was done before. The only difference is that we can easily take a reference on the session before registering it, so, at least, we're sure it's not going to be freed while we're working on it.
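The ordering described above (create, finish initialising, then register) can be sketched in plain userspace C. This is only an illustration: struct session, struct tunnel, session_create(), session_register() and session_setup() are hypothetical stand-ins, not the l2tp API, and all locking is left out.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the l2tp API. */
struct session {
	unsigned int id;
	int refcount;
	int initialised;
	struct session *next;
};

struct tunnel {
	struct session *sessions;	/* sessions visible to lookups */
};

/* Allocate a session without exposing it to anyone. */
static struct session *session_create(unsigned int id)
{
	struct session *s = calloc(1, sizeof(*s));

	if (!s)
		return NULL;
	s->id = id;
	s->refcount = 1;	/* reference the session holds on itself */
	return s;
}

/* Make the session visible; fails if the id is already in use. */
static int session_register(struct session *s, struct tunnel *t)
{
	struct session *walk;

	for (walk = t->sessions; walk; walk = walk->next)
		if (walk->id == s->id)
			return -EEXIST;
	s->next = t->sessions;
	t->sessions = s;
	return 0;
}

/* Caller-side ordering: initialise first, register last. */
static struct session *session_setup(struct tunnel *t, unsigned int id)
{
	struct session *s = session_create(id);

	if (!s)
		return NULL;
	s->initialised = 1;	/* pseudo-wire specific setup goes here */
	if (session_register(s, t) < 0) {
		free(s);	/* never became visible, so a plain free is enough */
		return NULL;
	}
	return s;
}
```

Because the session only becomes reachable in session_register(), a failure before that point can be undone with a plain free, which mirrors the kfree(session) error paths in the diff below.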
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_core.c | 21 +++++++--------------
 net/l2tp/l2tp_core.h | 3 +++
 net/l2tp/l2tp_eth.c | 9 +++++++++
 net/l2tp/l2tp_ppp.c | 27 +++++++++++++++++++--------
 4 files changed, 38 insertions(+), 22 deletions(-)
--- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -321,8 +321,8 @@ struct l2tp_session *l2tp_session_get_by } EXPORT_SYMBOL_GPL(l2tp_session_get_by_ifname);
-static int l2tp_session_add_to_tunnel(struct l2tp_tunnel *tunnel, - struct l2tp_session *session) +int l2tp_session_register(struct l2tp_session *session, + struct l2tp_tunnel *tunnel) { struct l2tp_session *session_walk; struct hlist_head *g_head; @@ -370,6 +370,10 @@ static int l2tp_session_add_to_tunnel(st hlist_add_head(&session->hlist, head); write_unlock_bh(&tunnel->hlist_lock);
+ /* Ignore management session in session count value */ + if (session->session_id != 0) + atomic_inc(&l2tp_session_count); + return 0;
err_tlock_pnlock: @@ -379,6 +383,7 @@ err_tlock:
return err; } +EXPORT_SYMBOL_GPL(l2tp_session_register);
/* Lookup a tunnel by id */ @@ -1793,7 +1798,6 @@ EXPORT_SYMBOL_GPL(l2tp_session_set_heade struct l2tp_session *l2tp_session_create(int priv_size, struct l2tp_tunnel *tunnel, u32 session_id, u32 peer_session_id, struct l2tp_session_cfg *cfg) { struct l2tp_session *session; - int err;
session = kzalloc(sizeof(struct l2tp_session) + priv_size, GFP_KERNEL); if (session != NULL) { @@ -1851,17 +1855,6 @@ struct l2tp_session *l2tp_session_create
l2tp_session_inc_refcount(session);
- err = l2tp_session_add_to_tunnel(tunnel, session); - if (err) { - kfree(session); - - return ERR_PTR(err); - } - - /* Ignore management session in session count value */ - if (session->session_id != 0) - atomic_inc(&l2tp_session_count); - return session; }
--- a/net/l2tp/l2tp_core.h +++ b/net/l2tp/l2tp_core.h @@ -262,6 +262,9 @@ struct l2tp_session *l2tp_session_create struct l2tp_tunnel *tunnel, u32 session_id, u32 peer_session_id, struct l2tp_session_cfg *cfg); +int l2tp_session_register(struct l2tp_session *session, + struct l2tp_tunnel *tunnel); + void __l2tp_session_unhash(struct l2tp_session *session); int l2tp_session_delete(struct l2tp_session *session); void l2tp_session_free(struct l2tp_session *session); --- a/net/l2tp/l2tp_eth.c +++ b/net/l2tp/l2tp_eth.c @@ -267,6 +267,13 @@ static int l2tp_eth_create(struct net *n goto out; }
+ l2tp_session_inc_refcount(session); + rc = l2tp_session_register(session, tunnel); + if (rc < 0) { + kfree(session); + goto out; + } + dev = alloc_netdev(sizeof(*priv), name, NET_NAME_UNKNOWN, l2tp_eth_dev_setup); if (!dev) { @@ -298,6 +305,7 @@ static int l2tp_eth_create(struct net *n __module_get(THIS_MODULE); /* Must be done after register_netdev() */ strlcpy(session->ifname, dev->name, IFNAMSIZ); + l2tp_session_dec_refcount(session);
dev_hold(dev);
@@ -308,6 +316,7 @@ out_del_dev: spriv->dev = NULL; out_del_session: l2tp_session_delete(session); + l2tp_session_dec_refcount(session); out: return rc; } --- a/net/l2tp/l2tp_ppp.c +++ b/net/l2tp/l2tp_ppp.c @@ -737,6 +737,14 @@ static int pppol2tp_connect(struct socke error = PTR_ERR(session); goto end; } + + l2tp_session_inc_refcount(session); + error = l2tp_session_register(session, tunnel); + if (error < 0) { + kfree(session); + goto end; + } + drop_refcnt = true; }
/* Associate session with its PPPoL2TP socket */ @@ -822,7 +830,7 @@ static int pppol2tp_session_create(struc /* Error if tunnel socket is not prepped */ if (!tunnel->sock) { error = -ENOENT; - goto out; + goto err; }
/* Default MTU values. */ @@ -837,18 +845,21 @@ static int pppol2tp_session_create(struc peer_session_id, cfg); if (IS_ERR(session)) { error = PTR_ERR(session); - goto out; + goto err; }
ps = l2tp_session_priv(session); ps->tunnel_sock = tunnel->sock;
- l2tp_info(session, L2TP_MSG_CONTROL, "%s: created\n", - session->name); - - error = 0; - -out: + error = l2tp_session_register(session, tunnel); + if (error < 0) + goto err_sess; + + return 0; + +err_sess: + kfree(session); +err: return error; }
From: Guillaume Nault g.nault@alphalink.fr
commit ee28de6bbd78c2e18111a0aef43ea746f28d2073 upstream.
Sessions must be initialised before being made externally visible by l2tp_session_register(). Otherwise the session may be concurrently deleted before being initialised, which can confuse the deletion path and eventually lead to kernel oops.
Therefore, we need to move l2tp_session_register() down in l2tp_eth_create(), but also handle the intermediate step where only the session or the netdevice has been registered.
We can't just call l2tp_session_register() in ->ndo_init() because we'd have no way to properly undo this operation in ->ndo_uninit(). Instead, let's register the session and the netdevice in two different steps and protect the session's device pointer with RCU.
And now that we allow the session's .dev field to be NULL, we don't need to prevent the netdevice from being removed anymore. So we can drop the dev_hold() and dev_put() calls in l2tp_eth_create() and l2tp_eth_dev_uninit().
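As a rough userspace analogy of a device pointer that may be NULL and is published only after registration, the sketch below uses C11 atomics. It is not RCU: there is no grace period and object lifetime is not modelled. struct device, struct session_priv, publish_dev() and recv_packet() are hypothetical names chosen for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the net device and its counters. */
struct device {
	long rx_packets;
};

struct session_priv {
	_Atomic(struct device *) dev;	/* NULL until the device is registered */
};

/* Writer side: publish (or clear) the device pointer. */
static void publish_dev(struct session_priv *sp, struct device *dev)
{
	atomic_store_explicit(&sp->dev, dev, memory_order_release);
}

/* Reader side: load the pointer once and cope with NULL. */
static int recv_packet(struct session_priv *sp)
{
	struct device *dev;

	dev = atomic_load_explicit(&sp->dev, memory_order_acquire);
	if (!dev)
		return -1;	/* no device yet (or anymore): drop the packet */
	dev->rx_packets++;
	return 0;
}
```

The point of the sketch is only the reader's shape: load the pointer once, check it, and treat NULL as "drop", which is what l2tp_eth_dev_recv() does after this patch.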
Backporting Notes
l2tp_eth.c: In l2tp_eth_create the "out" label was renamed to "err". There was one extra occurrence of "goto out" to update.
Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_eth.c | 108 ++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 76 insertions(+), 32 deletions(-)
--- a/net/l2tp/l2tp_eth.c +++ b/net/l2tp/l2tp_eth.c @@ -54,7 +54,7 @@ struct l2tp_eth {
/* via l2tp_session_priv() */ struct l2tp_eth_sess { - struct net_device *dev; + struct net_device __rcu *dev; };
@@ -72,7 +72,14 @@ static int l2tp_eth_dev_init(struct net_
static void l2tp_eth_dev_uninit(struct net_device *dev) { - dev_put(dev); + struct l2tp_eth *priv = netdev_priv(dev); + struct l2tp_eth_sess *spriv; + + spriv = l2tp_session_priv(priv->session); + RCU_INIT_POINTER(spriv->dev, NULL); + /* No need for synchronize_net() here. We're called by + * unregister_netdev*(), which does the synchronisation for us. + */ }
static int l2tp_eth_dev_xmit(struct sk_buff *skb, struct net_device *dev) @@ -126,8 +133,8 @@ static void l2tp_eth_dev_setup(struct ne static void l2tp_eth_dev_recv(struct l2tp_session *session, struct sk_buff *skb, int data_len) { struct l2tp_eth_sess *spriv = l2tp_session_priv(session); - struct net_device *dev = spriv->dev; - struct l2tp_eth *priv = netdev_priv(dev); + struct net_device *dev; + struct l2tp_eth *priv;
if (session->debug & L2TP_MSG_DATA) { unsigned int length; @@ -151,16 +158,25 @@ static void l2tp_eth_dev_recv(struct l2t skb_dst_drop(skb); nf_reset(skb);
+ rcu_read_lock(); + dev = rcu_dereference(spriv->dev); + if (!dev) + goto error_rcu; + + priv = netdev_priv(dev); if (dev_forward_skb(dev, skb) == NET_RX_SUCCESS) { atomic_long_inc(&priv->rx_packets); atomic_long_add(data_len, &priv->rx_bytes); } else { atomic_long_inc(&priv->rx_errors); } + rcu_read_unlock(); + return;
+error_rcu: + rcu_read_unlock(); error: - atomic_long_inc(&priv->rx_errors); kfree_skb(skb); }
@@ -171,11 +187,15 @@ static void l2tp_eth_delete(struct l2tp_
if (session) { spriv = l2tp_session_priv(session); - dev = spriv->dev; + + rtnl_lock(); + dev = rtnl_dereference(spriv->dev); if (dev) { - unregister_netdev(dev); - spriv->dev = NULL; + unregister_netdevice(dev); + rtnl_unlock(); module_put(THIS_MODULE); + } else { + rtnl_unlock(); } } } @@ -185,9 +205,20 @@ static void l2tp_eth_show(struct seq_fil { struct l2tp_session *session = arg; struct l2tp_eth_sess *spriv = l2tp_session_priv(session); - struct net_device *dev = spriv->dev; + struct net_device *dev; + + rcu_read_lock(); + dev = rcu_dereference(spriv->dev); + if (!dev) { + rcu_read_unlock(); + return; + } + dev_hold(dev); + rcu_read_unlock();
seq_printf(m, " interface %s\n", dev->name); + + dev_put(dev); } #endif
@@ -254,7 +285,7 @@ static int l2tp_eth_create(struct net *n if (dev) { dev_put(dev); rc = -EEXIST; - goto out; + goto err; } strlcpy(name, cfg->ifname, IFNAMSIZ); } else @@ -264,21 +295,14 @@ static int l2tp_eth_create(struct net *n peer_session_id, cfg); if (IS_ERR(session)) { rc = PTR_ERR(session); - goto out; - } - - l2tp_session_inc_refcount(session); - rc = l2tp_session_register(session, tunnel); - if (rc < 0) { - kfree(session); - goto out; + goto err; }
dev = alloc_netdev(sizeof(*priv), name, NET_NAME_UNKNOWN, l2tp_eth_dev_setup); if (!dev) { rc = -ENOMEM; - goto out_del_session; + goto err_sess; }
dev_net_set(dev, net); @@ -296,28 +320,48 @@ static int l2tp_eth_create(struct net *n #endif
spriv = l2tp_session_priv(session); - spriv->dev = dev;
- rc = register_netdev(dev); - if (rc < 0) - goto out_del_dev; + l2tp_session_inc_refcount(session); + + rtnl_lock(); + + /* Register both device and session while holding the rtnl lock. This + * ensures that l2tp_eth_delete() will see that there's a device to + * unregister, even if it happened to run before we assign spriv->dev. + */ + rc = l2tp_session_register(session, tunnel); + if (rc < 0) { + rtnl_unlock(); + goto err_sess_dev; + } + + rc = register_netdevice(dev); + if (rc < 0) { + rtnl_unlock(); + l2tp_session_delete(session); + l2tp_session_dec_refcount(session); + free_netdev(dev); + + return rc; + }
- __module_get(THIS_MODULE); - /* Must be done after register_netdev() */ strlcpy(session->ifname, dev->name, IFNAMSIZ); + rcu_assign_pointer(spriv->dev, dev); + + rtnl_unlock(); + l2tp_session_dec_refcount(session);
- dev_hold(dev); + __module_get(THIS_MODULE);
return 0;
-out_del_dev: - free_netdev(dev); - spriv->dev = NULL; -out_del_session: - l2tp_session_delete(session); +err_sess_dev: l2tp_session_dec_refcount(session); -out: + free_netdev(dev); +err_sess: + kfree(session); +err: return rc; }
From: Guillaume Nault g.nault@alphalink.fr
commit ee40fb2e1eb5bc0ddd3f2f83c6e39a454ef5a741 upstream.
pppol2tp_session_create() registers sessions that can't have their corresponding socket initialised. This socket has to be created by userspace, then connected to the session by pppol2tp_connect(). Therefore, we need to protect the pppol2tp socket pointer of L2TP sessions, so that it can safely be updated when userspace is connecting or closing the socket. This will eventually allow pppol2tp_connect() to avoid generating transient states while initialising its parts of the session.
To this end, this patch protects the pppol2tp socket pointer using RCU.
The pppol2tp socket pointer is still set in pppol2tp_connect(), but only once we know the function isn't going to fail. It's eventually reset by pppol2tp_release(), which now has to wait for a grace period to elapse before it can drop the last reference on the socket. This ensures that pppol2tp_session_get_sock() can safely grab a reference on the socket, even after ps->sk is reset to NULL but before this operation actually gets visible from pppol2tp_session_get_sock().
The rest is standard RCU conversion: pppol2tp_recv(), which already runs in atomic context, is simply enclosed by rcu_read_lock() and rcu_read_unlock(), while other functions are converted to use pppol2tp_session_get_sock() followed by sock_put(). pppol2tp_session_setsockopt() is a special case. It used to retrieve the pppol2tp socket from the L2TP session, which itself was retrieved from the pppol2tp socket. Therefore we can just avoid dereferencing ps->sk and directly use the original socket pointer instead.
With all users of ps->sk now handling NULL and concurrent updates, the L2TP ->ref() and ->deref() callbacks aren't needed anymore. Therefore, rather than converting pppol2tp_session_sock_hold() and pppol2tp_session_sock_put(), we can just drop them.
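The calling convention around pppol2tp_session_get_sock() (a NULL return is normal, a non-NULL return must be paired with a put) can be sketched in userspace C as follows. This models only the get/put pairing with C11 atomics; the protection against a concurrent release, which the patch gets from RCU and a deferred sock_put(), is not reproduced. All names ending in _model, and session_has_socket(), are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct sock_model {
	atomic_int refcnt;
};

struct ppp_session_model {
	_Atomic(struct sock_model *) sk;	/* NULL while no socket is connected */
};

/* Return the session's socket with an extra reference held, or NULL. */
static struct sock_model *session_get_sock(struct ppp_session_model *ps)
{
	struct sock_model *sk;

	sk = atomic_load_explicit(&ps->sk, memory_order_acquire);
	if (sk)
		atomic_fetch_add(&sk->refcnt, 1);
	return sk;
}

/* Drop a reference obtained from session_get_sock(). */
static void sock_model_put(struct sock_model *sk)
{
	int old = atomic_fetch_sub(&sk->refcnt, 1);

	assert(old > 0);
	(void)old;
}

/* Typical caller: tolerate a missing socket, always pair get with put. */
static int session_has_socket(struct ppp_session_model *ps)
{
	struct sock_model *sk = session_get_sock(ps);

	if (!sk)
		return 0;
	/* ... use sk here ... */
	sock_model_put(sk);
	return 1;
}
```

Callers such as pppol2tp_show() and pppol2tp_session_ioctl() in the diff below follow the same shape: bail out early on NULL, otherwise use the socket and drop the reference before returning.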
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_ppp.c | 154 ++++++++++++++++++++++++++++++++++------------------
 1 file changed, 101 insertions(+), 53 deletions(-)
--- a/net/l2tp/l2tp_ppp.c +++ b/net/l2tp/l2tp_ppp.c @@ -122,8 +122,11 @@ struct pppol2tp_session { int owner; /* pid that opened the socket */
- struct sock *sock; /* Pointer to the session + struct mutex sk_lock; /* Protects .sk */ + struct sock __rcu *sk; /* Pointer to the session * PPPoX socket */ + struct sock *__sk; /* Copy of .sk, for cleanup */ + struct rcu_head rcu; /* For asynchronous release */ struct sock *tunnel_sock; /* Pointer to the tunnel UDP * socket */ int flags; /* accessed by PPPIOCGFLAGS. @@ -138,6 +141,24 @@ static const struct ppp_channel_ops pppo
static const struct proto_ops pppol2tp_ops;
+/* Retrieves the pppol2tp socket associated to a session. + * A reference is held on the returned socket, so this function must be paired + * with sock_put(). + */ +static struct sock *pppol2tp_session_get_sock(struct l2tp_session *session) +{ + struct pppol2tp_session *ps = l2tp_session_priv(session); + struct sock *sk; + + rcu_read_lock(); + sk = rcu_dereference(ps->sk); + if (sk) + sock_hold(sk); + rcu_read_unlock(); + + return sk; +} + /* Helpers to obtain tunnel/session contexts from sockets. */ static inline struct l2tp_session *pppol2tp_sock_to_session(struct sock *sk) @@ -224,7 +245,8 @@ static void pppol2tp_recv(struct l2tp_se /* If the socket is bound, send it in to PPP's input queue. Otherwise * queue it on the session socket. */ - sk = ps->sock; + rcu_read_lock(); + sk = rcu_dereference(ps->sk); if (sk == NULL) goto no_sock;
@@ -262,30 +284,16 @@ static void pppol2tp_recv(struct l2tp_se kfree_skb(skb); } } + rcu_read_unlock();
return;
no_sock: + rcu_read_unlock(); l2tp_info(session, L2TP_MSG_DATA, "%s: no socket\n", session->name); kfree_skb(skb); }
-static void pppol2tp_session_sock_hold(struct l2tp_session *session) -{ - struct pppol2tp_session *ps = l2tp_session_priv(session); - - if (ps->sock) - sock_hold(ps->sock); -} - -static void pppol2tp_session_sock_put(struct l2tp_session *session) -{ - struct pppol2tp_session *ps = l2tp_session_priv(session); - - if (ps->sock) - sock_put(ps->sock); -} - /************************************************************************ * Transmit handling ***********************************************************************/ @@ -446,14 +454,16 @@ abort: */ static void pppol2tp_session_close(struct l2tp_session *session) { - struct pppol2tp_session *ps = l2tp_session_priv(session); - struct sock *sk = ps->sock; - struct socket *sock = sk->sk_socket; + struct sock *sk;
BUG_ON(session->magic != L2TP_SESSION_MAGIC);
- if (sock) - inet_shutdown(sock, SEND_SHUTDOWN); + sk = pppol2tp_session_get_sock(session); + if (sk) { + if (sk->sk_socket) + inet_shutdown(sk->sk_socket, SEND_SHUTDOWN); + sock_put(sk); + }
/* Don't let the session go away before our socket does */ l2tp_session_inc_refcount(session); @@ -476,6 +486,14 @@ static void pppol2tp_session_destruct(st } }
+static void pppol2tp_put_sk(struct rcu_head *head) +{ + struct pppol2tp_session *ps; + + ps = container_of(head, typeof(*ps), rcu); + sock_put(ps->__sk); +} + /* Called when the PPPoX socket (session) is closed. */ static int pppol2tp_release(struct socket *sock) @@ -501,11 +519,24 @@ static int pppol2tp_release(struct socke
session = pppol2tp_sock_to_session(sk);
- /* Purge any queued data */ if (session != NULL) { + struct pppol2tp_session *ps; + __l2tp_session_unhash(session); l2tp_session_queue_purge(session); - sock_put(sk); + + ps = l2tp_session_priv(session); + mutex_lock(&ps->sk_lock); + ps->__sk = rcu_dereference_protected(ps->sk, + lockdep_is_held(&ps->sk_lock)); + RCU_INIT_POINTER(ps->sk, NULL); + mutex_unlock(&ps->sk_lock); + call_rcu(&ps->rcu, pppol2tp_put_sk); + + /* Rely on the sock_put() call at the end of the function for + * dropping the reference held by pppol2tp_sock_to_session(). + * The last reference will be dropped by pppol2tp_put_sk(). + */ } release_sock(sk);
@@ -572,12 +603,14 @@ out: static void pppol2tp_show(struct seq_file *m, void *arg) { struct l2tp_session *session = arg; - struct pppol2tp_session *ps = l2tp_session_priv(session); + struct sock *sk; + + sk = pppol2tp_session_get_sock(session); + if (sk) { + struct pppox_sock *po = pppox_sk(sk);
- if (ps) { - struct pppox_sock *po = pppox_sk(ps->sock); - if (po) - seq_printf(m, " interface %s\n", ppp_dev_name(&po->chan)); + seq_printf(m, " interface %s\n", ppp_dev_name(&po->chan)); + sock_put(sk); } } #endif @@ -715,13 +748,17 @@ static int pppol2tp_connect(struct socke /* Using a pre-existing session is fine as long as it hasn't * been connected yet. */ - if (ps->sock) { + mutex_lock(&ps->sk_lock); + if (rcu_dereference_protected(ps->sk, + lockdep_is_held(&ps->sk_lock))) { + mutex_unlock(&ps->sk_lock); error = -EEXIST; goto end; }
/* consistency checks */ if (ps->tunnel_sock != tunnel->sock) { + mutex_unlock(&ps->sk_lock); error = -EEXIST; goto end; } @@ -738,19 +775,21 @@ static int pppol2tp_connect(struct socke goto end; }
+ ps = l2tp_session_priv(session); + mutex_init(&ps->sk_lock); l2tp_session_inc_refcount(session); + + mutex_lock(&ps->sk_lock); error = l2tp_session_register(session, tunnel); if (error < 0) { + mutex_unlock(&ps->sk_lock); kfree(session); goto end; } drop_refcnt = true; }
- /* Associate session with its PPPoL2TP socket */ - ps = l2tp_session_priv(session); ps->owner = current->pid; - ps->sock = sk; ps->tunnel_sock = tunnel->sock;
session->recv_skb = pppol2tp_recv; @@ -759,12 +798,6 @@ static int pppol2tp_connect(struct socke session->show = pppol2tp_show; #endif
- /* We need to know each time a skb is dropped from the reorder - * queue. - */ - session->ref = pppol2tp_session_sock_hold; - session->deref = pppol2tp_session_sock_put; - /* If PMTU discovery was enabled, use the MTU that was discovered */ dst = sk_dst_get(tunnel->sock); if (dst != NULL) { @@ -798,12 +831,17 @@ static int pppol2tp_connect(struct socke po->chan.mtu = session->mtu;
error = ppp_register_net_channel(sock_net(sk), &po->chan); - if (error) + if (error) { + mutex_unlock(&ps->sk_lock); goto end; + }
out_no_ppp: /* This is how we get the session context from the socket. */ sk->sk_user_data = session; + rcu_assign_pointer(ps->sk, sk); + mutex_unlock(&ps->sk_lock); + sk->sk_state = PPPOX_CONNECTED; l2tp_info(session, L2TP_MSG_CONTROL, "%s: created\n", session->name); @@ -849,6 +887,7 @@ static int pppol2tp_session_create(struc }
ps = l2tp_session_priv(session); + mutex_init(&ps->sk_lock); ps->tunnel_sock = tunnel->sock;
error = l2tp_session_register(session, tunnel); @@ -1020,12 +1059,10 @@ static int pppol2tp_session_ioctl(struct "%s: pppol2tp_session_ioctl(cmd=%#x, arg=%#lx)\n", session->name, cmd, arg);
- sk = ps->sock; + sk = pppol2tp_session_get_sock(session); if (!sk) return -EBADR;
- sock_hold(sk); - switch (cmd) { case SIOCGIFMTU: err = -ENXIO; @@ -1301,7 +1338,6 @@ static int pppol2tp_session_setsockopt(s int optname, int val) { int err = 0; - struct pppol2tp_session *ps = l2tp_session_priv(session);
switch (optname) { case PPPOL2TP_SO_RECVSEQ: @@ -1322,8 +1358,8 @@ static int pppol2tp_session_setsockopt(s } session->send_seq = val ? -1 : 0; { - struct sock *ssk = ps->sock; - struct pppox_sock *po = pppox_sk(ssk); + struct pppox_sock *po = pppox_sk(sk); + po->chan.hdrlen = val ? PPPOL2TP_L2TP_HDR_SIZE_SEQ : PPPOL2TP_L2TP_HDR_SIZE_NOSEQ; } @@ -1659,8 +1695,9 @@ static void pppol2tp_seq_session_show(st { struct l2tp_session *session = v; struct l2tp_tunnel *tunnel = session->tunnel; - struct pppol2tp_session *ps = l2tp_session_priv(session); - struct pppox_sock *po = pppox_sk(ps->sock); + unsigned char state; + char user_data_ok; + struct sock *sk; u32 ip = 0; u16 port = 0;
@@ -1670,6 +1707,15 @@ static void pppol2tp_seq_session_show(st port = ntohs(inet->inet_sport); }
+ sk = pppol2tp_session_get_sock(session); + if (sk) { + state = sk->sk_state; + user_data_ok = (session == sk->sk_user_data) ? 'Y' : 'N'; + } else { + state = 0; + user_data_ok = 'N'; + } + seq_printf(m, " SESSION '%s' %08X/%d %04X/%04X -> " "%04X/%04X %d %c\n", session->name, ip, port, @@ -1677,9 +1723,7 @@ static void pppol2tp_seq_session_show(st session->session_id, tunnel->peer_tunnel_id, session->peer_session_id, - ps->sock->sk_state, - (session == ps->sock->sk_user_data) ? - 'Y' : 'N'); + state, user_data_ok); seq_printf(m, " %d/%d/%c/%c/%s %08x %u\n", session->mtu, session->mru, session->recv_seq ? 'R' : '-', @@ -1696,8 +1740,12 @@ static void pppol2tp_seq_session_show(st atomic_long_read(&session->stats.rx_bytes), atomic_long_read(&session->stats.rx_errors));
- if (po) + if (sk) { + struct pppox_sock *po = pppox_sk(sk); + seq_printf(m, " interface %s\n", ppp_dev_name(&po->chan)); + sock_put(sk); + } }
static int pppol2tp_seq_show(struct seq_file *m, void *v)
From: Guillaume Nault g.nault@alphalink.fr
commit f98be6c6359e7e4a61aaefb9964c1db31cb9ec0c upstream.
pppol2tp_connect() initialises L2TP sessions after they've been exposed to the rest of the system by l2tp_session_register(). This puts sessions into transient states that are the source of several races, in particular with session's deletion path.
This patch centralises the initialisation code into pppol2tp_session_init(), which is called before the registration phase. The only field that can't be set before session registration is the pppol2tp socket pointer, which has already been converted to RCU. So pppol2tp_connect() should now be race-free.
The session's .session_close() callback is now set before registration. Therefore, it's always called when l2tp_core deletes the session, even if it was created by pppol2tp_session_create() and hasn't been plugged to a pppol2tp socket yet. That'd prevent session free because the extra reference taken by pppol2tp_session_close() wouldn't be dropped by the socket's ->sk_destruct() callback (pppol2tp_session_destruct()). We could set .session_close() only while connecting a session to its pppol2tp socket, or teach pppol2tp_session_close() to avoid grabbing a reference when the session isn't connected, but that'd require adding some form of synchronisation to be race free.
Instead of that, we can just let the pppol2tp socket hold a reference on the session as soon as it starts depending on it (that is, in pppol2tp_connect()). Then we don't need to utilise pppol2tp_session_close() to hold a reference at the last moment to prevent l2tp_core from dropping it.
When releasing the socket, pppol2tp_release() now deletes the session using the standard l2tp_session_delete() function, instead of merely removing it from hash tables. l2tp_session_delete() drops the reference the session holds on itself, but also makes sure it doesn't remove a session twice. So it can safely be called, even if l2tp_core already tried, or is concurrently trying, to remove the session. Finally, pppol2tp_session_destruct() drops the reference held by the socket.
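One common way to make a delete function safe to call more than once is an atomic test-and-set on a "dead" flag. The sketch below shows that idea in userspace C; it is an assumption about the general technique, not a copy of what l2tp_session_delete() does in this tree, and struct session_model and session_delete() are hypothetical names.

```c
#include <stdatomic.h>

struct session_model {
	atomic_flag dead;	/* set by the first caller of session_delete() */
	int refcount;
	int unhash_calls;	/* counts how often teardown actually ran */
};

/* Returns 1 if this call performed the teardown, 0 if it was already done. */
static int session_delete(struct session_model *s)
{
	if (atomic_flag_test_and_set(&s->dead))
		return 0;	/* another caller already deleted the session */

	s->unhash_calls++;	/* unhash, purge queues, close: runs only once */
	s->refcount--;		/* drop the reference the session holds on itself */
	return 1;
}
```

With this shape it does not matter whether the socket release path or the core deletion path gets there first: only the first caller performs the teardown and drops the self-reference.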
Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: David S. Miller davem@davemloft.net
Signed-off-by: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/l2tp/l2tp_ppp.c | 69 ++++++++++++++++++++++++++++------------------------
 1 file changed, 38 insertions(+), 31 deletions(-)
--- a/net/l2tp/l2tp_ppp.c +++ b/net/l2tp/l2tp_ppp.c @@ -464,9 +464,6 @@ static void pppol2tp_session_close(struc inet_shutdown(sk->sk_socket, SEND_SHUTDOWN); sock_put(sk); } - - /* Don't let the session go away before our socket does */ - l2tp_session_inc_refcount(session); }
/* Really kill the session socket. (Called from sock_put() if @@ -522,8 +519,7 @@ static int pppol2tp_release(struct socke if (session != NULL) { struct pppol2tp_session *ps;
- __l2tp_session_unhash(session); - l2tp_session_queue_purge(session); + l2tp_session_delete(session);
ps = l2tp_session_priv(session); mutex_lock(&ps->sk_lock); @@ -615,6 +611,35 @@ static void pppol2tp_show(struct seq_fil } #endif
+static void pppol2tp_session_init(struct l2tp_session *session) +{ + struct pppol2tp_session *ps; + struct dst_entry *dst; + + session->recv_skb = pppol2tp_recv; + session->session_close = pppol2tp_session_close; +#if defined(CONFIG_L2TP_DEBUGFS) || defined(CONFIG_L2TP_DEBUGFS_MODULE) + session->show = pppol2tp_show; +#endif + + ps = l2tp_session_priv(session); + mutex_init(&ps->sk_lock); + ps->tunnel_sock = session->tunnel->sock; + ps->owner = current->pid; + + /* If PMTU discovery was enabled, use the MTU that was discovered */ + dst = sk_dst_get(session->tunnel->sock); + if (dst) { + u32 pmtu = dst_mtu(dst); + + if (pmtu) { + session->mtu = pmtu - PPPOL2TP_HEADER_OVERHEAD; + session->mru = pmtu - PPPOL2TP_HEADER_OVERHEAD; + } + dst_release(dst); + } +} + /* connect() handler. Attach a PPPoX socket to a tunnel UDP socket */ static int pppol2tp_connect(struct socket *sock, struct sockaddr *uservaddr, @@ -626,7 +651,6 @@ static int pppol2tp_connect(struct socke struct l2tp_session *session = NULL; struct l2tp_tunnel *tunnel; struct pppol2tp_session *ps; - struct dst_entry *dst; struct l2tp_session_cfg cfg = { 0, }; int error = 0; u32 tunnel_id, peer_tunnel_id; @@ -775,8 +799,8 @@ static int pppol2tp_connect(struct socke goto end; }
+ pppol2tp_session_init(session); ps = l2tp_session_priv(session); - mutex_init(&ps->sk_lock); l2tp_session_inc_refcount(session);
mutex_lock(&ps->sk_lock); @@ -789,26 +813,6 @@ static int pppol2tp_connect(struct socke drop_refcnt = true; }
- ps->owner = current->pid; - ps->tunnel_sock = tunnel->sock; - - session->recv_skb = pppol2tp_recv; - session->session_close = pppol2tp_session_close; -#if defined(CONFIG_L2TP_DEBUGFS) || defined(CONFIG_L2TP_DEBUGFS_MODULE) - session->show = pppol2tp_show; -#endif - - /* If PMTU discovery was enabled, use the MTU that was discovered */ - dst = sk_dst_get(tunnel->sock); - if (dst != NULL) { - u32 pmtu = dst_mtu(dst); - - if (pmtu != 0) - session->mtu = session->mru = pmtu - - PPPOL2TP_HEADER_OVERHEAD; - dst_release(dst); - } - /* Special case: if source & dest session_id == 0x0000, this * socket is being created to manage the tunnel. Just set up * the internal context for use by ioctl() and sockopt() @@ -842,6 +846,12 @@ out_no_ppp: rcu_assign_pointer(ps->sk, sk); mutex_unlock(&ps->sk_lock);
+ /* Keep the reference we've grabbed on the session: sk doesn't expect + * the session to disappear. pppol2tp_session_destruct() is responsible + * for dropping it. + */ + drop_refcnt = false; + sk->sk_state = PPPOX_CONNECTED; l2tp_info(session, L2TP_MSG_CONTROL, "%s: created\n", session->name); @@ -863,7 +873,6 @@ static int pppol2tp_session_create(struc { int error; struct l2tp_session *session; - struct pppol2tp_session *ps;
/* Error if tunnel socket is not prepped */ if (!tunnel->sock) { @@ -886,9 +895,7 @@ static int pppol2tp_session_create(struc goto err; }
- ps = l2tp_session_priv(session); - mutex_init(&ps->sk_lock); - ps->tunnel_sock = tunnel->sock; + pppol2tp_session_init(session);
error = l2tp_session_register(session, tunnel); if (error < 0)
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit b14c94908b1b884276a6608dea3d0b1b510338b7 ]
This reverts commit df5db5f9ee112e76b5202fbc331f990a0fc316d6.
This patch fixes a regression: patch df5db5f9ee112 allowed function run_queue() to bypass its call to do_xmote() if revokes were queued for the glock. That's wrong because its call to do_xmote() is what is responsible for calling the go_sync() glops functions to sync both the ail list and any revokes queued for it. By bypassing the call, gfs2 could get into a stand-off where the glock could not be demoted until its revokes are written back, but the revokes would not be written back because do_xmote() was never called.
It "sort of" works, however, because there are other mechanisms like the log flush daemon (logd) that can sync the ail items and revokes, if it deems it necessary. The problem is: without file system pressure, it might never deem it necessary.
Signed-off-by: Bob Peterson rpeterso@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/gfs2/glock.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index f80ffccb0316..1eb737c466dd 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -541,9 +541,6 @@ __acquires(&gl->gl_lockref.lock)
 			goto out_unlock;
 		if (nonblock)
 			goto out_sched;
-		smp_mb();
-		if (atomic_read(&gl->gl_revokes) != 0)
-			goto out_sched;
 		set_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
 		GLOCK_BUG_ON(gl, gl->gl_demote_state == LM_ST_EXCLUSIVE);
 		gl->gl_target = gl->gl_demote_state;
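The stand-off described in the changelog can be modelled in a few lines of user-space C. This is only a sketch under simplifying assumptions: the toy_* names are hypothetical, there is no logd, and the model has a single glock whose revokes can only be written back by the demote path itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not gfs2 code: one glock with a count of queued revokes. */
struct toy_glock {
	int revokes;	/* revokes queued but not yet written back */
	bool demoted;	/* set once the demote has gone through */
};

/* Stands in for do_xmote(): sync the revokes, then demote. */
static void toy_do_xmote(struct toy_glock *gl)
{
	gl->revokes = 0;	/* the go_sync() step writes everything back */
	gl->demoted = true;
}

/* Stands in for run_queue(). With skip_if_revokes set it mimics the
 * reverted check and bails out while revokes are pending.
 */
static void toy_run_queue(struct toy_glock *gl, bool skip_if_revokes)
{
	if (skip_if_revokes && gl->revokes != 0)
		return;		/* "goto out_sched": try again later */
	toy_do_xmote(gl);
}

/* Retries the demote a bounded number of times; true on success. */
bool toy_demote(struct toy_glock *gl, bool skip_if_revokes, int attempts)
{
	assert(gl);
	for (int i = 0; i < attempts && !gl->demoted; i++)
		toy_run_queue(gl, skip_if_revokes);
	return gl->demoted;
}
```

With the check in place, the model never makes progress no matter how often it retries, because nothing else writes the revokes back. Without it, the first attempt syncs and demotes.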
From: Dragos Bogdan dragos.bogdan@analog.com
commit 5e4f99a6b788047b0b8a7496c2e0c8f372f6edf2 upstream.
If the serial interface is used, the 8-bit address should be latched using the rising edge of the WR/FSYNC signal.
This basically means that a CS change is required between the first byte sent, and the second one. This change splits the single-transfer transfer of 2 bytes into 2 transfers with a single byte, and CS change in-between.
Note fixes tag is not accurate, but reflects a point beyond which there are too many refactors to make backporting straight forward.
Fixes: b19e9ad5e2cb ("staging:iio:resolver:ad2s1210 general driver cleanup.")
Signed-off-by: Dragos Bogdan dragos.bogdan@analog.com
Signed-off-by: Alexandru Ardelean alexandru.ardelean@analog.com
Cc: Stable@vger.kernel.org
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/staging/iio/resolver/ad2s1210.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

--- a/drivers/staging/iio/resolver/ad2s1210.c
+++ b/drivers/staging/iio/resolver/ad2s1210.c
@@ -125,17 +125,24 @@ static int ad2s1210_config_write(struct
 static int ad2s1210_config_read(struct ad2s1210_state *st,
 				unsigned char address)
 {
-	struct spi_transfer xfer = {
-		.len = 2,
-		.rx_buf = st->rx,
-		.tx_buf = st->tx,
+	struct spi_transfer xfers[] = {
+		{
+			.len = 1,
+			.rx_buf = &st->rx[0],
+			.tx_buf = &st->tx[0],
+			.cs_change = 1,
+		}, {
+			.len = 1,
+			.rx_buf = &st->rx[1],
+			.tx_buf = &st->tx[1],
+		},
 	};
 	int ret = 0;

 	ad2s1210_set_mode(MOD_CONFIG, st);
 	st->tx[0] = address | AD2S1210_MSB_IS_HIGH;
 	st->tx[1] = AD2S1210_REG_FAULT;
-	ret = spi_sync_transfer(st->sdev, &xfer, 1);
+	ret = spi_sync_transfer(st->sdev, xfers, 2);
 	if (ret < 0)
 		return ret;
 	st->old_data = true;
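To make the effect of the split concrete, here is a small user-space sketch that counts chip-select toggles between the bytes of a message. struct toy_xfer is a hypothetical, stripped-down stand-in for struct spi_transfer, and the model deliberately ignores what cs_change means on the last transfer of a message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct spi_transfer. */
struct toy_xfer {
	size_t len;	/* bytes clocked in this transfer */
	int cs_change;	/* non-zero: toggle chip select after this transfer */
};

/* Counts chip-select toggles that fall between two transfers of one
 * message, i.e. the edges the device could use to latch the address.
 */
size_t toy_cs_toggles_between(const struct toy_xfer *x, size_t n)
{
	size_t toggles = 0;

	assert(x || n == 0);
	for (size_t i = 0; i + 1 < n; i++)
		if (x[i].cs_change)
			toggles++;
	return toggles;
}

/* Total bytes clocked, to show both layouts move the same data. */
size_t toy_total_len(const struct toy_xfer *x, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += x[i].len;
	return total;
}
```

Under this model the old single 2-byte transfer gives the device no edge between the address byte and the second byte, while the new two-transfer layout gives exactly one, with the same two bytes on the wire.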
From: Alexander Usyskin alexander.usyskin@intel.com
commit fc9c03ce30f79b71807961bfcb42be191af79873 upstream.
Allow me_cl object to be freed by releasing the reference that was acquired by one of the search functions: __mei_me_cl_by_uuid_id() or __mei_me_cl_by_uuid()
Cc: stable@vger.kernel.org
Reported-by: 亿一 teroincn@gmail.com
Signed-off-by: Alexander Usyskin alexander.usyskin@intel.com
Signed-off-by: Tomas Winkler tomas.winkler@intel.com
Link: https://lore.kernel.org/r/20200512223140.32186-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 drivers/misc/mei/client.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/misc/mei/client.c
+++ b/drivers/misc/mei/client.c
@@ -276,6 +276,7 @@ void mei_me_cl_rm_by_uuid(struct mei_dev
 	down_write(&dev->me_clients_rwsem);
 	me_cl = __mei_me_cl_by_uuid(dev, uuid);
 	__mei_me_cl_del(dev, me_cl);
+	mei_me_cl_put(me_cl);
 	up_write(&dev->me_clients_rwsem);
 }

@@ -297,6 +298,7 @@ void mei_me_cl_rm_by_uuid_id(struct mei_
 	down_write(&dev->me_clients_rwsem);
 	me_cl = __mei_me_cl_by_uuid_id(dev, uuid, id);
 	__mei_me_cl_del(dev, me_cl);
+	mei_me_cl_put(me_cl);
 	up_write(&dev->me_clients_rwsem);
 }
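The reference accounting behind this fix can be sketched in user space as follows. The sketch assumes, as the changelog implies, that the search helper returns the client with an extra reference held and that deletion only drops the reference owned by the list. The toy_* names are hypothetical and a plain int replaces the kernel's kref.

```c
#include <assert.h>

/* Toy stand-in for struct mei_me_client. */
struct toy_client {
	int refcnt;	/* plain counter instead of a kref */
	int on_list;	/* non-zero while linked into the device list */
};

/* Search helper: hands the client back with an extra reference. */
static struct toy_client *toy_lookup(struct toy_client *cl)
{
	cl->refcnt++;
	return cl;
}

/* Unlinks the client and drops the reference the list was holding. */
static void toy_del(struct toy_client *cl)
{
	if (cl->on_list) {
		cl->on_list = 0;
		cl->refcnt--;
	}
}

static void toy_put(struct toy_client *cl)
{
	cl->refcnt--;
}

/* Removal path; put_lookup_ref selects the patched behaviour.
 * Returns the remaining count: 0 means the object could be freed.
 */
int toy_rm(struct toy_client *cl, int put_lookup_ref)
{
	struct toy_client *found;

	assert(cl);
	found = toy_lookup(cl);
	toy_del(found);
	if (put_lookup_ref)
		toy_put(found);
	return cl->refcnt;
}
```

Starting from a client that only the list references, the unpatched path leaves one stray reference behind, while the patched path brings the count down to zero.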
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 928edefbc18cd8433f7df235c6e09a9306e7d580 ]
This looks really unusual to have a 'get_device()' hidden in a 'dev_err()' call. Remove it.
While at it add a missing \n at the end of the message.
Fixes: 574fb258d636 ("Staging: IIO: VTI sca3000 series accelerometer driver (spi)")
Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr
Cc: Stable@vger.kernel.org
Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/staging/iio/accel/sca3000_ring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/iio/accel/sca3000_ring.c b/drivers/staging/iio/accel/sca3000_ring.c
index 20b878d35ea2..fc8b6f179ec6 100644
--- a/drivers/staging/iio/accel/sca3000_ring.c
+++ b/drivers/staging/iio/accel/sca3000_ring.c
@@ -56,7 +56,7 @@ static int sca3000_read_data(struct sca3000_state *st,
 	st->tx[0] = SCA3000_READ_REG(reg_address_high);
 	ret = spi_sync_transfer(st->us, xfer, ARRAY_SIZE(xfer));
 	if (ret) {
-		dev_err(get_device(&st->us->dev), "problem reading register");
+		dev_err(&st->us->dev, "problem reading register");
 		goto error_free_rx;
 	}
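get_device() takes a reference on the device and returns the same pointer, so nesting it inside a logging call takes a reference that nothing ever drops. The sketch below shows the effect with hypothetical toy_* helpers; it is an illustration, not driver code.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for struct device with a bare reference counter. */
struct toy_device {
	const char *name;
	int refcnt;
};

/* Like get_device(): bump the count and return the same pointer. */
static struct toy_device *toy_get_device(struct toy_device *dev)
{
	dev->refcnt++;
	return dev;
}

/* Like dev_err(): only reads the device in order to print a message. */
static void toy_dev_err(struct toy_device *dev, const char *msg)
{
	fprintf(stderr, "%s: %s", dev->name, msg);
}

/* Old error path: each failure leaks one reference. */
int toy_report_old(struct toy_device *dev)
{
	assert(dev);
	toy_dev_err(toy_get_device(dev), "problem reading register\n");
	return dev->refcnt;
}

/* New error path: the count is left alone. */
int toy_report_new(struct toy_device *dev)
{
	assert(dev);
	toy_dev_err(dev, "problem reading register\n");
	return dev->refcnt;
}
```

Each call to toy_report_old() raises the count by one, so a device that keeps failing reads could never be released. toy_report_new() leaves the count unchanged.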
From: R. Parameswaran parameswaran.r7@gmail.com
commit 57240d007816486131bee88cd474c2a71f0fe224 upstream.
The MTU overhead calculation in L2TP device set-up merged via commit b784e7ebfce8cfb16c6f95e14e8532d0768ab7ff needs to be adjusted to lock the tunnel socket while referencing the sub-data structures to derive the socket's IP overhead.
Reported-by: Guillaume Nault g.nault@alphalink.fr
Tested-by: Guillaume Nault g.nault@alphalink.fr
Signed-off-by: R. Parameswaran rparames@brocade.com
Signed-off-by: David S. Miller davem@davemloft.net
Cc: Giuliano Procida gprocida@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

---
 include/linux/net.h | 2 +-
 net/l2tp/l2tp_eth.c | 2 ++
 net/socket.c        | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -291,7 +291,7 @@ int kernel_sendpage(struct socket *sock,
 int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg);
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);

-/* Following routine returns the IP overhead imposed by a socket. */
+/* Routine returns the IP overhead imposed by a (caller-protected) socket. */
 u32 kernel_sock_ip_overhead(struct sock *sk);

 #define MODULE_ALIAS_NETPROTO(proto) \
--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -240,7 +240,9 @@ static void l2tp_eth_adjust_mtu(struct l
 		dev->needed_headroom += session->hdr_len;
 		return;
 	}
+	lock_sock(tunnel->sock);
 	l3_overhead = kernel_sock_ip_overhead(tunnel->sock);
+	release_sock(tunnel->sock);
 	if (l3_overhead == 0) {
 		/* L3 Overhead couldn't be identified, this could be
 		 * because tunnel->sock was NULL or the socket's
--- a/net/socket.c
+++ b/net/socket.c
@@ -3308,7 +3308,7 @@ EXPORT_SYMBOL(kernel_sock_shutdown);
 /* This routine returns the IP overhead imposed by a socket i.e.
  * the length of the underlying IP header, depending on whether
  * this is an IPv4 or IPv6 socket and the length from IP options turned
- * on at the socket.
+ * on at the socket. Assumes that the caller has a lock on the socket.
  */
 u32 kernel_sock_ip_overhead(struct sock *sk)
 {
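The calling convention this patch establishes (take the socket lock, read the overhead, release the lock) can be sketched in user space as below. The toy_* names are hypothetical, a boolean stands in for the real socket lock, and the overhead is reduced to the fixed header size (20 bytes for IPv4, 40 for IPv6) plus an option length; the kernel routine derives its value from the socket's own structures.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_family { TOY_INET, TOY_INET6 };

/* Toy stand-in for struct sock. */
struct toy_sock {
	bool locked;		/* stands in for the socket lock */
	enum toy_family family;
	unsigned int optlen;	/* bytes of IP options or extension headers */
};

static void toy_lock_sock(struct toy_sock *sk)
{
	assert(!sk->locked);
	sk->locked = true;
}

static void toy_release_sock(struct toy_sock *sk)
{
	assert(sk->locked);
	sk->locked = false;
}

/* Caller must hold the lock, mirroring the updated kernel comment. */
static unsigned int toy_ip_overhead(const struct toy_sock *sk)
{
	assert(sk->locked);
	return (sk->family == TOY_INET ? 20u : 40u) + sk->optlen;
}

/* Mirrors the patched sequence in l2tp_eth_adjust_mtu(). */
unsigned int toy_l3_overhead(struct toy_sock *sk)
{
	unsigned int overhead;

	toy_lock_sock(sk);
	overhead = toy_ip_overhead(sk);
	toy_release_sock(sk);
	return overhead;
}
```

The assertion inside toy_ip_overhead() plays the part of the "caller-protected" contract: calling it without the lock trips immediately in this model, whereas in the kernel the same mistake would be a silent race.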
On 26/05/2020 19:52, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.225 release. There are 65 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Thu, 28 May 2020 18:36:22 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.225-rc1...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
All tests are passing for Tegra ...
Test results for stable-v4.4:
    6 builds:	6 pass, 0 fail
    12 boots:	12 pass, 0 fail
    19 tests:	19 pass, 0 fail

Linux version:	4.4.225-rc1-g1f47601a4296
Boards tested:	tegra124-jetson-tk1, tegra20-ventana,
		tegra30-cardhu-a04

Cheers
Jon
On Wed, 27 May 2020 at 00:24, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
> This is the start of the stable review cycle for the 4.4.225 release. There are 65 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Thu, 28 May 2020 18:36:22 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.225-rc1...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Summary
------------------------------------------------------------------------

kernel: 4.4.225-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.4.y
git commit: 147ece171c0dc02b417f35088182a61e6dac368a
git describe: v4.4.224-66-g147ece171c0d
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.4-oe/build/v4.4.224-66-...
No regressions (compared to build v4.4.224)
No fixes (compared to build v4.4.224)
Ran 15012 total tests in the following environments and test suites.
Environments
--------------
- i386
- juno-r2 - arm64
- juno-r2-compat
- juno-r2-kasan
- x15 - arm
- x86_64
- x86-kasan

Test Suites
-----------
* build
* kselftest
* kselftest/drivers
* kselftest/filesystems
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-open-posix-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* network-basic-tests
* perf
* v4l2-compliance
* kvm-unit-tests
* install-android-platform-tools-r2600
* install-android-platform-tools-r2800
* kselftest/net
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-native/drivers
* kselftest-vsyscall-mode-native/filesystems
Summary
------------------------------------------------------------------------

kernel: 4.4.225-rc1
git repo: https://git.linaro.org/lkft/arm64-stable-rc.git
git branch: 4.4.225-rc1-hikey-20200526-731
git commit: f578d6e82f6756e9b9385131e4bef87a9fe5483f
git describe: 4.4.225-rc1-hikey-20200526-731
Test details: https://qa-reports.linaro.org/lkft/linaro-hikey-stable-rc-4.4-oe/build/4.4.2...
No regressions (compared to build 4.4.225-rc1-hikey-20200525-730)
No fixes (compared to build 4.4.225-rc1-hikey-20200525-730)
Ran 305 total tests in the following environments and test suites.
Environments
--------------
- hi6220-hikey - arm64

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-cpuhotplug-tests
* ltp-cve-tests
* ltp-fcntl-locktests-tests
* ltp-ipc-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-securebits-tests
* spectre-meltdown-checker-test
Good morning Greg,
From: stable-owner@vger.kernel.org stable-owner@vger.kernel.org On Behalf Of Greg Kroah-Hartman Sent: 26 May 2020 19:52
This is the start of the stable review cycle for the 4.4.225 release. There are 65 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
No build/boot issues seen for CIP configs for Linux 4.4.225-rc1 (147ece171c0d).
Build/test pipeline/logs: https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/pipelines/1498...
GitLab CI pipeline: https://gitlab.com/cip-project/cip-testing/linux-cip-pipelines/-/blob/master...
Relevant LAVA jobs: https://lava.ciplatform.org/scheduler/alljobs?length=25&search=147ece#ta...

Kind regards,
Chris
Responses should be made by Thu, 28 May 2020 18:36:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.225-rc1.gz
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 4.4.225-rc1
R. Parameswaran parameswaran.r7@gmail.com l2tp: device MTU setup, tunnel socket needs a lock
Christophe JAILLET christophe.jaillet@wanadoo.fr iio: sca3000: Remove an erroneous 'get_device()'
Alexander Usyskin alexander.usyskin@intel.com mei: release me_cl object reference
Dragos Bogdan dragos.bogdan@analog.com staging: iio: ad2s1210: Fix SPI reading
Bob Peterson rpeterso@redhat.com Revert "gfs2: Don't demote a glock until its revokes are written"
Guillaume Nault g.nault@alphalink.fr l2tp: initialise PPP sessions before registering them
Guillaume Nault g.nault@alphalink.fr l2tp: protect sock pointer of struct pppol2tp_session with RCU
Guillaume Nault g.nault@alphalink.fr l2tp: initialise l2tp_eth sessions before registering them
Guillaume Nault g.nault@alphalink.fr l2tp: don't register sessions in l2tp_session_create()
Guillaume Nault g.nault@alphalink.fr l2tp: fix l2tp_eth module loading
Guillaume Nault g.nault@alphalink.fr l2tp: pass tunnel pointer to ->session_create()
Guillaume Nault g.nault@alphalink.fr l2tp: prevent creation of sessions on terminated tunnels
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel used while creating sessions with netlink
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while handling genl TUNNEL_GET commands
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while handling genl tunnel updates
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while processing genl delete command
Guillaume Nault g.nault@alphalink.fr l2tp: hold tunnel while looking up sessions in l2tp_netlink
Guillaume Nault g.nault@alphalink.fr l2tp: initialise session's refcount before making it reachable
Guillaume Nault g.nault@alphalink.fr l2tp: define parameters of l2tp_tunnel_find*() as "const"
Guillaume Nault g.nault@alphalink.fr l2tp: define parameters of l2tp_session_get*() as "const"
Guillaume Nault g.nault@alphalink.fr l2tp: remove l2tp_session_find()
Guillaume Nault g.nault@alphalink.fr l2tp: remove useless duplicate session detection in l2tp_netlink
R. Parameswaran parameswaran.r7@gmail.com L2TP:Adjust intf MTU, add underlay L3, L2 hdrs.
R. Parameswaran parameswaran.r7@gmail.com New kernel function to get IP overhead on a socket.
Asbjørn Sloth Tønnesen asbjorn@asbjorn.st net: l2tp: ppp: change PPPOL2TP_MSG_* => L2TP_MSG_*
Asbjørn Sloth Tønnesen asbjorn@asbjorn.st net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*
Asbjørn Sloth Tønnesen asbjorn@asbjorn.st net: l2tp: export debug flags to UAPI
Guillaume Nault g.nault@alphalink.fr l2tp: don't use l2tp_tunnel_find() in l2tp_ip and l2tp_ip6
Guillaume Nault g.nault@alphalink.fr l2tp: take a reference on sessions used in genetlink handlers
Guillaume Nault g.nault@alphalink.fr l2tp: hold session while sending creation notifications
Guillaume Nault g.nault@alphalink.fr l2tp: fix racy socket lookup in l2tp_ip and l2tp_ip6 bind()
Guillaume Nault g.nault@alphalink.fr l2tp: lock socket before checking flags in connect()
Vishal Verma vishal.l.verma@intel.com libnvdimm/btt: Remove unnecessary code in btt_freelist_init
Colin Ian King colin.king@canonical.com platform/x86: alienware-wmi: fix kfree on potentially uninitialized pointer
Theodore Ts'o tytso@mit.edu ext4: lock the xattr block before checksuming it
Brent Lu brent.lu@intel.com ALSA: pcm: fix incorrect hw_base increase
Daniel Jordan daniel.m.jordan@oracle.com padata: purge get_cpu and reorder_via_wq from padata_do_serial
Daniel Jordan daniel.m.jordan@oracle.com padata: initialize pd->cpu with effective cpumask
Herbert Xu herbert@gondor.apana.org.au padata: Replace delayed timer with immediate workqueue in padata_reorder
Peter Zijlstra peterz@infradead.org sched/fair, cpumask: Export for_each_cpu_wrap()
Mathias Krause minipli@googlemail.com padata: set cpu_index of unused CPUs to -1
Kevin Hao haokexin@gmail.com i2c: dev: Fix the race between the release of i2c_dev and cdev
viresh kumar viresh.kumar@linaro.org i2c-dev: don't get i2c adapter via i2c_dev
Dan Carpenter dan.carpenter@oracle.com i2c: dev: use after free in detach
Wolfram Sang wsa@the-dreams.de i2c: dev: don't start function name with 'return'
Erico Nunes erico.nunes@datacom.ind.br i2c: dev: switch from register_chrdev to cdev API
Shuah Khan shuahkh@osg.samsung.com media: fix media devnode ioctl/syscall and unregister race
Shuah Khan shuahkh@osg.samsung.com media: fix use-after-free in cdev_put() when app exits after driver unbind
Mauro Carvalho Chehab mchehab@osg.samsung.com media-device: dynamically allocate struct media_devnode
Mauro Carvalho Chehab mchehab@osg.samsung.com media-devnode: fix namespace mess
Max Kellermann max@duempel.org media-devnode: add missing mutex lock in error handler
Max Kellermann max@duempel.org drivers/media/media-devnode: clear private_data before put_device()
Shuah Khan shuahkh@osg.samsung.com media: Fix media_open() to clear filp->private_data in error leg
Thomas Gleixner tglx@linutronix.de ARM: futex: Address build warning
Hans de Goede hdegoede@redhat.com platform/x86: asus-nb-wmi: Do not load on Asus T100TA and T200TA
Alan Stern stern@rowland.harvard.edu USB: core: Fix misleading driver bug report
Wu Bo wubo40@huawei.com ceph: fix double unlock in handle_cap_export()
Sebastian Reichel sebastian.reichel@collabora.com HID: multitouch: add eGalaxTouch P80H84 support
Al Viro viro@zeniv.linux.org.uk fix multiplication overflow in copy_fdtable()
Roberto Sassu roberto.sassu@huawei.com evm: Check also if *tfm is an error pointer in init_desc()
Mathias Krause minipli@googlemail.com padata: ensure padata_do_serial() runs on the correct CPU
Mathias Krause minipli@googlemail.com padata: ensure the reorder timer callback runs on the correct CPU
Jason A. Donenfeld Jason@zx2c4.com padata: get_next is never NULL
Tobias Klauser tklauser@distanz.ch padata: Remove unused but set variables
Cao jin caoj.fnst@cn.fujitsu.com igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr
Diffstat:
 Documentation/networking/l2tp.txt         |   8 +-
 Makefile                                  |   4 +-
 arch/arm/include/asm/futex.h              |   9 +-
 drivers/hid/hid-ids.h                     |   1 +
 drivers/hid/hid-multitouch.c              |   3 +
 drivers/i2c/i2c-dev.c                     |  60 +++---
 drivers/media/media-device.c              |  43 +++--
 drivers/media/media-devnode.c             | 168 +++++++++-------
 drivers/media/usb/uvc/uvc_driver.c        |   2 +-
 drivers/misc/mei/client.c                 |   2 +
 drivers/net/ethernet/intel/igb/igb_main.c |   4 +-
 drivers/nvdimm/btt.c                      |   8 +-
 drivers/platform/x86/alienware-wmi.c      |  17 +-
 drivers/platform/x86/asus-nb-wmi.c        |  24 +++
 drivers/staging/iio/accel/sca3000_ring.c  |   2 +-
 drivers/staging/iio/resolver/ad2s1210.c   |  17 +-
 drivers/usb/core/message.c                |   4 +-
 fs/ceph/caps.c                            |   1 +
 fs/ext4/xattr.c                           |  66 ++++---
 fs/file.c                                 |   2 +-
 fs/gfs2/glock.c                           |   3 -
 include/linux/cpumask.h                   |  17 ++
 include/linux/net.h                       |   3 +
 include/linux/padata.h                    |  13 +-
 include/media/media-device.h              |   5 +-
 include/media/media-devnode.h             |  32 +++-
 include/net/ipv6.h                        |   2 +
 include/uapi/linux/if_pppol2tp.h          |  13 +-
 include/uapi/linux/l2tp.h                 |  17 +-
 kernel/padata.c                           |  88 ++++-----
 lib/cpumask.c                             |  32 ++++
 net/ipv6/datagram.c                       |   4 +-
 net/l2tp/l2tp_core.c                      | 181 ++++++-----------
 net/l2tp/l2tp_core.h                      |  47 +++--
 net/l2tp/l2tp_eth.c                       | 216 +++++++++++++--------
 net/l2tp/l2tp_ip.c                        |  68 ++++---
 net/l2tp/l2tp_ip6.c                       |  82 ++++----
 net/l2tp/l2tp_netlink.c                   | 124 +++++++-----
 net/l2tp/l2tp_ppp.c                       | 309 ++++++++++++++++++------------
 net/socket.c                              |  46 +++++
 security/integrity/evm/evm_crypto.c       |   2 +-
 sound/core/pcm_lib.c                      |   1 +
 42 files changed, 1014 insertions(+), 736 deletions(-)
On 5/26/20 11:52 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.225 release. There are 65 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Thu, 28 May 2020 18:36:22 +0000. Anything received after that time might be too late.
Build results:
	total: 169 pass: 169 fail: 0
Qemu test results:
	total: 332 pass: 332 fail: 0
Guenter
On 5/26/20 12:52 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.225 release. There are 65 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
>
> Responses should be made by Thu, 28 May 2020 18:36:22 +0000. Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.225-rc1...
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Compiled and booted on my test system. No dmesg regressions.
thanks,
-- Shuah