On 02/03/18 15:58, Simon Gaiser wrote:
> Juergen Gross:
>> On 20/02/18 05:56, Simon Gaiser wrote:
>>> Juergen Gross:
>>>> On 07/02/18 23:22, Simon Gaiser wrote:
>>>>> Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple
>>>>> concurrent xenstore accesses") made a subtle change to the semantic of
>>>>> xenbus_dev_request_and_reply() and xenbus_transaction_end().
>>>>>
>>>>> Before on an error response to XS_TRANSACTION_END
>>>>> xenbus_dev_request_and_reply() would not decrement the active
>>>>> transaction counter. But xenbus_transaction_end() has always counted the
>>>>> transaction as finished regardless of the response.
>>>>
>>>> Which is correct now. Xenstore will free all transaction related
>>>> data regardless of the response. A once failed transaction can't
>>>> be repaired, it has to be repeated completely.
>>>
>>> So if xenstore frees the transaction why should we keep it in the list
>>> with pending transaction in xenbus_dev_frontend? That's exactly what
>>> this patch fixes by always removing it from the list, not only on a
>>> successful response (See below for the EINVAL case).
>>
>> Aah, sorry, I seem to have misread my own coding. :-(
>>
>> Yes, you are right. Sorry for not seeing it before.
>>
>>>
>>> [...]
>>>>> But xenbus_dev_frontend tries to end a transaction on closing of the
>>>>> device if the XS_TRANSACTION_END failed before. Trying to close the
>>>>> transaction twice corrupts the reference count. So fix this by also
>>>>> considering a transaction closed if we have sent XS_TRANSACTION_END once
>>>>> regardless of the return code.
>>>>
>>>> A transaction in the list of transactions should not considered to be
>>>> finished. Either it is not on the list or it is still pending.
>>>
>>> With "considering a transaction closed" I mean "take the code path which
>>> removes the transaction from the list with pending transactions".
>>>
>>> From the follow-up mail:
>>>>>> The new behavior is that xenbus_dev_request_and_reply() and
>>>>>> xenbus_transaction_end() will always count the transaction as finished
>>>>>> regardless the response code (handled in xs_request_exit()).
>>>>>
>>>>> ENOENT should not decrement the transaction counter, while all
>>>>> other responses to XS_TRANSACTION_END should still do so.
>>>>
>>>> Sorry, I stand corrected: the ENOENT case should never happen, as this
>>>> case is tested in xenbus_write_transaction(). It doesn't hurt to test
>>>> for ENOENT, though.
>>>>
>>>> What should be handled is EINVAL: this would happen if a user specified
>>>> a string different from "T" and "F".
>>>
>>> Ok, I will handle those cases in xs_request_exit(). Although I don't
>>> like that this depends on the internals of xenstore (At least to me it's
>>> not obvious why it should only return ENOENT or EINVAL in this case).
>>>
>>> In the xenbus_write_transaction() case checking the string before
>>> sending the transaction (like the transaction itself is verified) would
>>> avoid this problem.
>>
>> Right. I'd prefer this solution.
>>
>> Remains the only problem you tried to tackle with your second patch: a
>> kernel driver going crazy and ending transactions it never started (or
>> ending them multiple times). The EINVAL case can't happen here, but
>> ENOENT can. Either ENOENT has to be handled in xs_request_exit() or you
>> need to keep track of the transactions like in the user interface and
>> refuse ending an unknown transaction. Or you trust the kernel users.
>> Trying to fix the usage counter seems to be the wrong approach IMO.
>
> The point of the second patch was to detect such bugs. This would have
> saved quite some time to find this bug. I added the "fix" of the counter
> I just because it was trivial after having the if there.
>
> Adding tracking seems to be a quite complex solution for a _potential_
> problem.
I agree.
> So I would go with checking ENOENT in xs_request_exit(). Should this be
> WARN_ON_ONCE()? Since this normally should not happen I would say yes.
Yes, having a WARN_ON_ONCE here will help.
> Should I keep the reference counter sanity check? And if yes, with the
> "fix" to the counter?
I'd drop it. This really should not happen and blowing up kernel size
with checks of impossible situations isn't the way to go.
In case you really want to do something here you can add something like
ASSERT(xs_state_users) before decrementing the counter.
Juergen
On 20/02/18 05:56, Simon Gaiser wrote:
> Juergen Gross:
>> On 07/02/18 23:22, Simon Gaiser wrote:
>>> Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple
>>> concurrent xenstore accesses") made a subtle change to the semantic of
>>> xenbus_dev_request_and_reply() and xenbus_transaction_end().
>>>
>>> Before on an error response to XS_TRANSACTION_END
>>> xenbus_dev_request_and_reply() would not decrement the active
>>> transaction counter. But xenbus_transaction_end() has always counted the
>>> transaction as finished regardless of the response.
>>
>> Which is correct now. Xenstore will free all transaction related
>> data regardless of the response. A once failed transaction can't
>> be repaired, it has to be repeated completely.
>
> So if xenstore frees the transaction why should we keep it in the list
> with pending transaction in xenbus_dev_frontend? That's exactly what
> this patch fixes by always removing it from the list, not only on a
> successful response (See below for the EINVAL case).
Aah, sorry, I seem to have misread my own coding. :-(
Yes, you are right. Sorry for not seeing it before.
>
> [...]
>>> But xenbus_dev_frontend tries to end a transaction on closing of the
>>> device if the XS_TRANSACTION_END failed before. Trying to close the
>>> transaction twice corrupts the reference count. So fix this by also
>>> considering a transaction closed if we have sent XS_TRANSACTION_END once
>>> regardless of the return code.
>>
>> A transaction in the list of transactions should not considered to be
>> finished. Either it is not on the list or it is still pending.
>
> With "considering a transaction closed" I mean "take the code path which
> removes the transaction from the list with pending transactions".
>
> From the follow-up mail:
>>>> The new behavior is that xenbus_dev_request_and_reply() and
>>>> xenbus_transaction_end() will always count the transaction as finished
>>>> regardless the response code (handled in xs_request_exit()).
>>>
>>> ENOENT should not decrement the transaction counter, while all
>>> other responses to XS_TRANSACTION_END should still do so.
>>
>> Sorry, I stand corrected: the ENOENT case should never happen, as this
>> case is tested in xenbus_write_transaction(). It doesn't hurt to test
>> for ENOENT, though.
>>
>> What should be handled is EINVAL: this would happen if a user specified
>> a string different from "T" and "F".
>
> Ok, I will handle those cases in xs_request_exit(). Although I don't
> like that this depends on the internals of xenstore (At least to me it's
> not obvious why it should only return ENOENT or EINVAL in this case).
>
> In the xenbus_write_transaction() case checking the string before
> sending the transaction (like the transaction itself is verified) would
> avoid this problem.
Right. I'd prefer this solution.
Remains the only problem you tried to tackle with your second patch: a
kernel driver going crazy and ending transactions it never started (or
ending them multiple times). The EINVAL case can't happen here, but
ENOENT can. Either ENOENT has to be handled in xs_request_exit() or you
need to keep track of the transactions like in the user interface and
refuse ending an unknown transaction. Or you trust the kernel users.
Trying to fix the usage counter seems to be the wrong approach IMO.
Juergen
On Fri, Mar 02, 2018 at 01:46:47PM +0100, Wolfram Sang wrote:
>
> So, maybe some words why I accepted this patch.
>
> On Fri, Mar 02, 2018 at 11:19:31AM +0000, Russell King - ARM Linux wrote:
> > Note that there have been patches proposed to make platform_get_irq()
> > return an error rather than returning a value of zero, so changing
> > the driver in this way is not a good idea.
>
> I'd much agree to such an approach, yet I didn't see it coming along so
> far for years(?) now.
It needs platform maintainers to be motivated to fix it, and one way to
provide that motivation is for subsystem maintainers to say no to patches
like this. If patches like this get accepted, then the "problem" gets
solved, and there is very little motivation to fix the platform itself.
If you look back at the history of this, the times when platforms have
been fixed is when they have a problem exactly like this, and they're
told to fix their platforms IRQ numbering instead of the patch to "fix"
the driver being accepted.
Why fix the interrupt numbering if patches to "fix" drivers get
accepted?
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
Interrupt number 0 (returned by platform_get_irq()) might be a valid IRQ
so do not treat it as an error. If interrupt 0 was configured, the driver
would exit the probe early, before finishing initialization, but with
0-exit status.
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Cc: stable(a)vger.kernel.org
Fixes: e0d1ec97853f ("i2c-s3c2410: Change IRQ to be plain integer.")
Signed-off-by: Krzysztof Kozlowski <krzk(a)kernel.org>
---
drivers/i2c/busses/i2c-s3c2410.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-s3c2410.c b/drivers/i2c/busses/i2c-s3c2410.c
index 5d97510ee48b..783a93404f47 100644
--- a/drivers/i2c/busses/i2c-s3c2410.c
+++ b/drivers/i2c/busses/i2c-s3c2410.c
@@ -1178,7 +1178,7 @@ static int s3c24xx_i2c_probe(struct platform_device *pdev)
*/
if (!(i2c->quirks & QUIRK_POLL)) {
i2c->irq = ret = platform_get_irq(pdev, 0);
- if (ret <= 0) {
+ if (ret < 0) {
dev_err(&pdev->dev, "cannot find IRQ\n");
clk_unprepare(i2c->clk);
return ret;
--
2.7.4
On Fri, Mar 02, 2018 at 11:29:26AM +0100, Wolfram Sang wrote:
> On Thu, Mar 01, 2018 at 09:34:44PM +0100, Krzysztof Kozlowski wrote:
> > Interrupt number 0 (returned by platform_get_irq()) might be a valid IRQ
> > so do not treat it as an error. If interrupt 0 was configured, the driver
> > would exit the probe early, before finishing initialization, but with
> > 0-exit status.
> >
> > Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
> > Cc: stable(a)vger.kernel.org
> > Fixes: e0d1ec97853f ("i2c-s3c2410: Change IRQ to be plain integer.")
>
> Please configure git to use 14 digits here.
Wait, when did we decide that 12 wasn't enough?
I just did a `git log | grep Fixes | tee baz | head -n 200` and only on
my git tree tehre were only 2 which used exactly 14 digits. The
standard is 12.
regards,
dan carpenter
The Highpoint RocketRAID 644L uses a Marvel 88SE9235 controller, as with
other Marvel controllers this needs a function 1 DMA alias quirk.
Note the RocketRAID 642L uses the same Marvel 88SE9235 controller and
already is listed with a function 1 DMA alias quirk.
Cc: stable(a)vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1534106
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/pci/quirks.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 8b14bd326d4a..46d47bd6ca1f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3908,6 +3908,8 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MARVELL_EXT, 0x9230,
quirk_dma_func1_alias);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0642,
quirk_dma_func1_alias);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_TTI, 0x0645,
+ quirk_dma_func1_alias);
/* https://bugs.gentoo.org/show_bug.cgi?id=497630 */
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_JMICRON,
PCI_DEVICE_ID_JMICRON_JMB388_ESD,
--
2.14.3
Like the Highpoint Rocketraid 642L and cards using a Marvel 88SE9235
controller in general, this RAID card also supports AHCI mode and short
of a custom driver, this is the only way to make it work under Linux.
Note that even though the card is called to 644L, it has a product-id
of 0x0645.
Cc: stable(a)vger.kernel.org
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1534106
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
---
drivers/ata/ahci.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index 355a95a83a34..1ff17799769d 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
@@ -550,7 +550,9 @@ static const struct pci_device_id ahci_pci_tbl[] = {
.driver_data = board_ahci_yes_fbs },
{ PCI_DEVICE(PCI_VENDOR_ID_MARVELL_EXT, 0x9230),
.driver_data = board_ahci_yes_fbs },
- { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0642),
+ { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0642), /* highpoint rocketraid 642L */
+ .driver_data = board_ahci_yes_fbs },
+ { PCI_DEVICE(PCI_VENDOR_ID_TTI, 0x0645), /* highpoint rocketraid 644L */
.driver_data = board_ahci_yes_fbs },
/* Promise */
--
2.14.3
This is a note to let you know that I've just added the patch titled
net_sched: get rid of rcu_barrier() in tcf_block_put_ext()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From efbf78973978b0d25af59bc26c8013a942af6e64 Mon Sep 17 00:00:00 2001
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Mon, 4 Dec 2017 10:48:18 -0800
Subject: net_sched: get rid of rcu_barrier() in tcf_block_put_ext()
From: Cong Wang <xiyou.wangcong(a)gmail.com>
commit efbf78973978b0d25af59bc26c8013a942af6e64 upstream.
Both Eric and Paolo noticed the rcu_barrier() we use in
tcf_block_put_ext() could be a performance bottleneck when
we have a lot of tc classes.
Paolo provided the following to demonstrate the issue:
tc qdisc add dev lo root htb
for I in `seq 1 1000`; do
tc class add dev lo parent 1: classid 1:$I htb rate 100kbit
tc qdisc add dev lo parent 1:$I handle $((I + 1)): htb
for J in `seq 1 10`; do
tc filter add dev lo parent $((I + 1)): u32 match ip src 1.1.1.$J
done
done
time tc qdisc del dev root
real 0m54.764s
user 0m0.023s
sys 0m0.000s
The rcu_barrier() there is to ensure we free the block after all chains
are gone, that is, to queue tcf_block_put_final() at the tail of workqueue.
We can achieve this ordering requirement by refcnt'ing tcf block instead,
that is, the tcf block is freed only when the last chain in this block is
gone. This also simplifies the code.
Paolo reported after this patch we get:
real 0m0.017s
user 0m0.000s
sys 0m0.017s
Tested-by: Paolo Abeni <pabeni(a)redhat.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Jiri Pirko <jiri(a)mellanox.com>
Cc: Jamal Hadi Salim <jhs(a)mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/sch_generic.h | 1 -
net/sched/cls_api.c | 29 +++++++++--------------------
2 files changed, 9 insertions(+), 21 deletions(-)
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -273,7 +273,6 @@ struct tcf_chain {
struct tcf_block {
struct list_head chain_list;
- struct work_struct work;
};
static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -211,8 +211,12 @@ static void tcf_chain_flush(struct tcf_c
static void tcf_chain_destroy(struct tcf_chain *chain)
{
+ struct tcf_block *block = chain->block;
+
list_del(&chain->list);
kfree(chain);
+ if (list_empty(&block->chain_list))
+ kfree(block);
}
static void tcf_chain_hold(struct tcf_chain *chain)
@@ -276,26 +280,12 @@ err_chain_create:
}
EXPORT_SYMBOL(tcf_block_get);
-static void tcf_block_put_final(struct work_struct *work)
-{
- struct tcf_block *block = container_of(work, struct tcf_block, work);
- struct tcf_chain *chain, *tmp;
-
- rtnl_lock();
-
- /* At this point, all the chains should have refcnt == 1. */
- list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
- tcf_chain_put(chain);
- rtnl_unlock();
- kfree(block);
-}
-
/* XXX: Standalone actions are not allowed to jump to any chain, and bound
* actions should be all removed after flushing.
*/
void tcf_block_put(struct tcf_block *block)
{
- struct tcf_chain *chain;
+ struct tcf_chain *chain, *tmp;
if (!block)
return;
@@ -310,12 +300,11 @@ void tcf_block_put(struct tcf_block *blo
list_for_each_entry(chain, &block->chain_list, list)
tcf_chain_flush(chain);
- INIT_WORK(&block->work, tcf_block_put_final);
- /* Wait for RCU callbacks to release the reference count and make
- * sure their works have been queued before this.
+ /* At this point, all the chains should have refcnt >= 1. Block will be
+ * freed after all chains are gone.
*/
- rcu_barrier();
- tcf_queue_work(&block->work);
+ list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+ tcf_chain_put(chain);
}
EXPORT_SYMBOL(tcf_block_put);
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-4.14/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.14/net-sched-crash-on-blocks-with-goto-chain-action.patch
queue-4.14/net-sched-fix-use-after-free-in-tcf_block_put_ext.patch
queue-4.14/net-sched-fix-crash-when-deleting-secondary-chains.patch
This is a note to let you know that I've just added the patch titled
net: sched: fix use-after-free in tcf_block_put_ext
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-sched-fix-use-after-free-in-tcf_block_put_ext.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From df45bf84e4f5a48f23d4b1a07d21d566e8b587b2 Mon Sep 17 00:00:00 2001
From: Jiri Pirko <jiri(a)mellanox.com>
Date: Fri, 8 Dec 2017 19:27:27 +0100
Subject: net: sched: fix use-after-free in tcf_block_put_ext
From: Jiri Pirko <jiri(a)mellanox.com>
commit df45bf84e4f5a48f23d4b1a07d21d566e8b587b2 upstream.
Since the block is freed with last chain being put, once we reach the
end of iteration of list_for_each_entry_safe, the block may be
already freed. I'm hitting this only by creating and deleting clsact:
[ 202.171952] ==================================================================
[ 202.180182] BUG: KASAN: use-after-free in tcf_block_put_ext+0x240/0x390
[ 202.187590] Read of size 8 at addr ffff880225539a80 by task tc/796
[ 202.194508]
[ 202.196185] CPU: 0 PID: 796 Comm: tc Not tainted 4.15.0-rc2jiri+ #5
[ 202.203200] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016
[ 202.213613] Call Trace:
[ 202.216369] dump_stack+0xda/0x169
[ 202.220192] ? dma_virt_map_sg+0x147/0x147
[ 202.224790] ? show_regs_print_info+0x54/0x54
[ 202.229691] ? tcf_chain_destroy+0x1dc/0x250
[ 202.234494] print_address_description+0x83/0x3d0
[ 202.239781] ? tcf_block_put_ext+0x240/0x390
[ 202.244575] kasan_report+0x1ba/0x460
[ 202.248707] ? tcf_block_put_ext+0x240/0x390
[ 202.253518] tcf_block_put_ext+0x240/0x390
[ 202.258117] ? tcf_chain_flush+0x290/0x290
[ 202.262708] ? qdisc_hash_del+0x82/0x1a0
[ 202.267111] ? qdisc_hash_add+0x50/0x50
[ 202.271411] ? __lock_is_held+0x5f/0x1a0
[ 202.275843] clsact_destroy+0x3d/0x80 [sch_ingress]
[ 202.281323] qdisc_destroy+0xcb/0x240
[ 202.285445] qdisc_graft+0x216/0x7b0
[ 202.289497] tc_get_qdisc+0x260/0x560
Fix this by holding the block also by chain 0 and put chain 0
explicitly, out of the list_for_each_entry_safe loop at the very
end of tcf_block_put_ext.
Fixes: efbf78973978 ("net_sched: get rid of rcu_barrier() in tcf_block_put_ext()")
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Cc: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sched/cls_api.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -290,21 +290,22 @@ void tcf_block_put(struct tcf_block *blo
if (!block)
return;
- /* Hold a refcnt for all chains, except 0, so that they don't disappear
+ /* Hold a refcnt for all chains, so that they don't disappear
* while we are iterating.
*/
list_for_each_entry(chain, &block->chain_list, list)
- if (chain->index)
- tcf_chain_hold(chain);
+ tcf_chain_hold(chain);
list_for_each_entry(chain, &block->chain_list, list)
tcf_chain_flush(chain);
- /* At this point, all the chains should have refcnt >= 1. Block will be
- * freed after all chains are gone.
- */
+ /* At this point, all the chains should have refcnt >= 1. */
list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
tcf_chain_put(chain);
+
+ /* Finally, put chain 0 and allow block to be freed. */
+ chain = list_first_entry(&block->chain_list, struct tcf_chain, list);
+ tcf_chain_put(chain);
}
EXPORT_SYMBOL(tcf_block_put);
Patches currently in stable-queue which might be from jiri(a)mellanox.com are
queue-4.14/net_sched-get-rid-of-rcu_barrier-in-tcf_block_put_ext.patch
queue-4.14/mlxsw-pci-wait-after-reset-before-accessing-hw.patch
queue-4.14/net-sched-crash-on-blocks-with-goto-chain-action.patch
queue-4.14/net-sched-fix-use-after-free-in-tcf_block_put_ext.patch
queue-4.14/net-sched-fix-crash-when-deleting-secondary-chains.patch