net/sctp/diag.c for instance is built into its own separate module (sctp_diag.ko) and requires the use of sctp_endpoint_{hold,put}() in order to prevent a recently found use-after-free issue.
In order to prevent data corruption of the pointer used to take a reference on a specific endpoint, between the time of calling sctp_endpoint_hold() and it returning, the API now returns a pointer to the exact endpoint that was incremented.
For example, in sctp_sock_dump(), we could have the following hunk:
	sctp_endpoint_hold(tsp->asoc->ep);
	ep = tsp->asoc->ep;
	sk = ep->base.sk;
	lock_sock(ep->base.sk);
This task could be preempted immediately after the call to sctp_endpoint_hold(), and tsp->asoc->ep could be changed in the meantime to point to a completely different endpoint. A reference would then have been taken on the old endpoint, while the new one is processed without a reference. The new endpoint could therefore be freed while it is still being processed, causing a use-after-free.
If we return the exact pointer that was held, we ensure this task processes only the endpoint we have taken a reference to. The resultant hunk now looks like this:
	ep = sctp_endpoint_hold(tsp->asoc->ep);
	sk = ep->base.sk;
	lock_sock(sk);
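As a rough illustration of the pattern described above, here is a minimal userspace sketch. It assumes C11 atomics in place of the kernel's refcount_t, and the names endpoint_new(), endpoint_hold(), endpoint_put() and held_id_after_migration() are hypothetical stand-ins, not the SCTP API. The "migration" is simulated sequentially, so the sketch only shows that the caller keeps working on the pointer it was handed back; it says nothing about whether the pointer load itself is safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct endpoint {
	atomic_int refcnt;
	int id;
};

struct association {
	struct endpoint *ep;
};

/* Allocate an endpoint that starts with one (creation) reference. */
static struct endpoint *endpoint_new(int id)
{
	struct endpoint *ep = malloc(sizeof(*ep));

	if (!ep)
		return NULL;
	atomic_init(&ep->refcnt, 1);
	ep->id = id;
	return ep;
}

/* Take a reference and hand back the pointer that was actually held. */
static struct endpoint *endpoint_hold(struct endpoint *ep)
{
	assert(atomic_load(&ep->refcnt) > 0);	/* only live objects may be held */
	atomic_fetch_add(&ep->refcnt, 1);
	return ep;
}

/* Drop a reference and free the endpoint when the last one goes away. */
static void endpoint_put(struct endpoint *ep)
{
	if (atomic_fetch_sub(&ep->refcnt, 1) == 1)
		free(ep);
}

/*
 * Hold the endpoint, then simulate a migration that repoints asoc.ep.
 * Returns the id of the endpoint the caller ends up processing,
 * or -1 if allocation failed.
 */
int held_id_after_migration(void)
{
	struct endpoint *old_ep = endpoint_new(1);
	struct endpoint *new_ep = endpoint_new(2);
	struct association asoc;
	struct endpoint *ep;
	int id;

	if (!old_ep || !new_ep) {
		free(old_ep);
		free(new_ep);
		return -1;
	}

	asoc.ep = old_ep;
	ep = endpoint_hold(asoc.ep);	/* reference taken on old_ep */
	asoc.ep = new_ep;		/* simulated migration */
	id = ep->id;			/* still the endpoint we hold */

	endpoint_put(ep);		/* drop our reference */
	endpoint_put(old_ep);		/* drop the creation reference */
	endpoint_put(new_ep);
	return id;
}
```

Calling held_id_after_migration() exercises the whole sequence; the caller processes the endpoint with id 1 even though asoc.ep has moved on.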
Cc: Vlad Yasevich vyasevich@gmail.com
Cc: Neil Horman nhorman@tuxdriver.com
Cc: Marcelo Ricardo Leitner marcelo.leitner@gmail.com
Cc: "David S. Miller" davem@davemloft.net
Cc: Jakub Kicinski kuba@kernel.org
Cc: lksctp developers linux-sctp@vger.kernel.org
Cc: "H.P. Yarroll" piggy@acm.org
Cc: Karl Knutson karl@athena.chicago.il.us
Cc: Jon Grimm jgrimm@us.ibm.com
Cc: Xingang Guo xingang.guo@intel.com
Cc: Hui Huang hui.huang@nokia.com
Cc: Sridhar Samudrala sri@us.ibm.com
Cc: Daisy Chang daisyc@us.ibm.com
Cc: Ryan Layer rmlayer@us.ibm.com
Cc: Kevin Gao kevin.gao@intel.com
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 8f840e47f190c ("sctp: add the sctp_diag.c file")
Signed-off-by: Lee Jones lee.jones@linaro.org
---
 include/net/sctp/structs.h | 2 +-
 net/sctp/endpointola.c     | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 651bba654d77d..78d71ca56452b 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1380,7 +1380,7 @@ static inline struct sctp_endpoint *sctp_ep(struct sctp_ep_common *base)
 struct sctp_endpoint *sctp_endpoint_new(struct sock *, gfp_t);
 void sctp_endpoint_free(struct sctp_endpoint *);
 void sctp_endpoint_put(struct sctp_endpoint *);
-void sctp_endpoint_hold(struct sctp_endpoint *);
+struct sctp_endpoint *sctp_endpoint_hold(struct sctp_endpoint *);
 void sctp_endpoint_add_asoc(struct sctp_endpoint *, struct sctp_association *);
 struct sctp_association *sctp_endpoint_lookup_assoc(
 	const struct sctp_endpoint *ep,
diff --git a/net/sctp/endpointola.c b/net/sctp/endpointola.c
index 48c9c2c7602f7..bdbf74fc7eb4c 100644
--- a/net/sctp/endpointola.c
+++ b/net/sctp/endpointola.c
@@ -222,10 +222,12 @@ static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
 }
 
 /* Hold a reference to an endpoint. */
-void sctp_endpoint_hold(struct sctp_endpoint *ep)
+struct sctp_endpoint *sctp_endpoint_hold(struct sctp_endpoint *ep)
 {
 	refcount_inc(&ep->base.refcnt);
+	return ep;
 }
+EXPORT_SYMBOL_GPL(sctp_endpoint_hold);
 
 /* Release a reference to an endpoint and clean up if there are
  * no more references.
@@ -235,6 +237,7 @@ void sctp_endpoint_put(struct sctp_endpoint *ep)
 	if (refcount_dec_and_test(&ep->base.refcnt))
 		sctp_endpoint_destroy(ep);
 }
+EXPORT_SYMBOL_GPL(sctp_endpoint_put);
 
 /* Is this the endpoint we are looking for? */
 struct sctp_endpoint *sctp_endpoint_is_match(struct sctp_endpoint *ep,
The cause of the resultant dump_stack() reported below is a dereference of a freed pointer to 'struct sctp_endpoint' in sctp_sock_dump().
This race condition occurs when a transport is cached into its associated hash table, followed by an endpoint/sock migration to a new association in sctp_assoc_migrate(), before they are used in sctp_diag_dump(). sctp_diag_dump() uses sctp_for_each_transport() to walk the hash table, calling into sctp_sock_dump(), where the dereference occurs.
  BUG: KASAN: use-after-free in sctp_sock_dump+0xa8/0x438 [sctp_diag]
  Call trace:
   dump_backtrace+0x0/0x2dc
   show_stack+0x20/0x2c
   dump_stack+0x120/0x144
   print_address_description+0x80/0x2f4
   __kasan_report+0x174/0x194
   kasan_report+0x10/0x18
   __asan_load8+0x84/0x8c
   sctp_sock_dump+0xa8/0x438 [sctp_diag]
   sctp_for_each_transport+0x1e0/0x26c [sctp]
   sctp_diag_dump+0x180/0x1f0 [sctp_diag]
   inet_diag_dump+0x12c/0x168
   netlink_dump+0x24c/0x5b8
   __netlink_dump_start+0x274/0x2a8
   inet_diag_handler_cmd+0x224/0x274
   sock_diag_rcv_msg+0x21c/0x230
   netlink_rcv_skb+0xe0/0x1bc
   sock_diag_rcv+0x34/0x48
   netlink_unicast+0x3b4/0x430
   netlink_sendmsg+0x4f0/0x574
   sock_write_iter+0x18c/0x1f0
   do_iter_readv_writev+0x230/0x2a8
   do_iter_write+0xc8/0x2b4
   vfs_writev+0xf8/0x184
   do_writev+0xb0/0x1a8
   __arm64_sys_writev+0x4c/0x5c
   el0_svc_common+0x118/0x250
   el0_svc_handler+0x3c/0x9c
   el0_svc+0x8/0xc
To prevent this from happening we need to take a reference to the to-be-used/dereferenced 'struct sctp_endpoint' (which inherently holds a reference to the problematic 'struct sock') until such a time when we know they can be safely released.
When KASAN is not enabled, a similar but slightly different NULL pointer dereference crash occurs later along the thread of execution, this time in inet_sctp_diag_fill().
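The lifetime rule being relied on here (hold before use, put when done) can be sketched in userspace as follows. This is only an illustration under stated assumptions: C11 atomics stand in for refcount_t, a 'destroyed' flag stands in for the real teardown so the outcome can be observed, and all names are hypothetical. The owner dropping its reference is simulated sequentially rather than from another thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for 'struct sctp_endpoint'. */
struct endpoint {
	atomic_int refcnt;
	bool destroyed;		/* set instead of freeing, so it can be observed */
};

/* Take a reference and return the pointer that was held. */
static struct endpoint *endpoint_hold(struct endpoint *ep)
{
	assert(atomic_load(&ep->refcnt) > 0);	/* only live objects may be held */
	atomic_fetch_add(&ep->refcnt, 1);
	return ep;
}

/* Drop a reference; the last put tears the endpoint down. */
static void endpoint_put(struct endpoint *ep)
{
	if (atomic_fetch_sub(&ep->refcnt, 1) == 1)
		ep->destroyed = true;
}

/*
 * The dump path holds a reference, the owner then lets go of its own,
 * and the endpoint must stay alive until the dump path puts it again.
 * Returns true if the endpoint was alive during the dump and torn
 * down only after the final put.
 */
bool dump_survives_owner_release(void)
{
	struct endpoint e;
	struct endpoint *ep;
	bool alive_during_dump;

	atomic_init(&e.refcnt, 1);	/* owner's reference */
	e.destroyed = false;

	ep = endpoint_hold(&e);		/* dump path takes its reference */
	endpoint_put(&e);		/* owner releases (e.g. after a migration) */
	alive_during_dump = !ep->destroyed;
	endpoint_put(ep);		/* dump path is done; last reference */

	return alive_during_dump && e.destroyed;
}
```

The ordering matters: the put in the dump path has to come after the last dereference, which is why the patch places sctp_endpoint_put() after release_sock().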
Cc: Vlad Yasevich vyasevich@gmail.com
Cc: Neil Horman nhorman@tuxdriver.com
Cc: Marcelo Ricardo Leitner marcelo.leitner@gmail.com
Cc: "David S. Miller" davem@davemloft.net
Cc: Jakub Kicinski kuba@kernel.org
Cc: lksctp developers linux-sctp@vger.kernel.org
Cc: "H.P. Yarroll" piggy@acm.org
Cc: Karl Knutson karl@athena.chicago.il.us
Cc: Jon Grimm jgrimm@us.ibm.com
Cc: Xingang Guo xingang.guo@intel.com
Cc: Hui Huang hui.huang@nokia.com
Cc: Sridhar Samudrala sri@us.ibm.com
Cc: Daisy Chang daisyc@us.ibm.com
Cc: Ryan Layer rmlayer@us.ibm.com
Cc: Kevin Gao kevin.gao@intel.com
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: 8f840e47f190c ("sctp: add the sctp_diag.c file")
Signed-off-by: Lee Jones lee.jones@linaro.org
---
 net/sctp/diag.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/net/sctp/diag.c b/net/sctp/diag.c
index 760b367644c12..998488a56ce2b 100644
--- a/net/sctp/diag.c
+++ b/net/sctp/diag.c
@@ -292,15 +292,17 @@ static int sctp_tsp_dump_one(struct sctp_transport *tsp, void *p)
 
 static int sctp_sock_dump(struct sctp_transport *tsp, void *p)
 {
-	struct sctp_endpoint *ep = tsp->asoc->ep;
+	struct sctp_endpoint *ep;
 	struct sctp_comm_param *commp = p;
-	struct sock *sk = ep->base.sk;
+	struct sock *sk;
 	struct sk_buff *skb = commp->skb;
 	struct netlink_callback *cb = commp->cb;
 	const struct inet_diag_req_v2 *r = commp->r;
 	struct sctp_association *assoc;
 	int err = 0;
 
+	ep = sctp_endpoint_hold(tsp->asoc->ep);
+	sk = ep->base.sk;
 	lock_sock(sk);
 	list_for_each_entry(assoc, &ep->asocs, asocs) {
 		if (cb->args[4] < cb->args[1])
@@ -341,6 +343,7 @@ static int sctp_sock_dump(struct sctp_transport *tsp, void *p)
 	cb->args[4] = 0;
 release:
 	release_sock(sk);
+	sctp_endpoint_put(ep);
 	return err;
 }
From: Lee Jones
Sent: 17 December 2021 13:46
> net/sctp/diag.c for instance is built into its own separate module (sctp_diag.ko) and requires the use of sctp_endpoint_{hold,put}() in order to prevent a recently found use-after-free issue.
> In order to prevent data corruption of the pointer used to take a reference on a specific endpoint, between the time of calling sctp_endpoint_hold() and it returning, the API now returns a pointer to the exact endpoint that was incremented.
> For example, in sctp_sock_dump(), we could have the following hunk:
> 	sctp_endpoint_hold(tsp->asoc->ep);
> 	ep = tsp->asoc->ep;
> 	sk = ep->base.sk;
> 	lock_sock(ep->base.sk);
> This task could be preempted immediately after the call to sctp_endpoint_hold(), and tsp->asoc->ep could be changed in the meantime to point to a completely different endpoint. A reference would then have been taken on the old endpoint, while the new one is processed without a reference. The new endpoint could therefore be freed while it is still being processed, causing a use-after-free.
> If we return the exact pointer that was held, we ensure this task processes only the endpoint we have taken a reference to. The resultant hunk now looks like this:
> 	ep = sctp_endpoint_hold(tsp->asoc->ep);
> 	sk = ep->base.sk;
> 	lock_sock(sk);
Isn't that just the same as doing things in the other order?
	ep = tsp->asoc->ep;
	sctp_endpoint_hold(ep);
But if tsp->asoc->ep is allowed to change, can't it also change to something invalid? So I'd have thought you should be holding some kind of lock that stops the data being changed before being 'allowed' to follow the pointers. In which case the current code is just a missing optimisation.
David
- Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, 17 Dec 2021, David Laight wrote:
> From: Lee Jones
> Sent: 17 December 2021 13:46
> > net/sctp/diag.c for instance is built into its own separate module (sctp_diag.ko) and requires the use of sctp_endpoint_{hold,put}() in order to prevent a recently found use-after-free issue.
> > In order to prevent data corruption of the pointer used to take a reference on a specific endpoint, between the time of calling sctp_endpoint_hold() and it returning, the API now returns a pointer to the exact endpoint that was incremented.
> > For example, in sctp_sock_dump(), we could have the following hunk:
> > 	sctp_endpoint_hold(tsp->asoc->ep);
> > 	ep = tsp->asoc->ep;
> > 	sk = ep->base.sk;
> > 	lock_sock(ep->base.sk);
> > This task could be preempted immediately after the call to sctp_endpoint_hold(), and tsp->asoc->ep could be changed in the meantime to point to a completely different endpoint. A reference would then have been taken on the old endpoint, while the new one is processed without a reference. The new endpoint could therefore be freed while it is still being processed, causing a use-after-free.
> > If we return the exact pointer that was held, we ensure this task processes only the endpoint we have taken a reference to. The resultant hunk now looks like this:
> > 	ep = sctp_endpoint_hold(tsp->asoc->ep);
> > 	sk = ep->base.sk;
> > 	lock_sock(sk);
> Isn't that just the same as doing things in the other order?
> 	ep = tsp->asoc->ep;
> 	sctp_endpoint_hold(ep);
Sleep for a few milliseconds between those lines and see what happens.
'ep' could still be freed between the assignment and the call.
> But if tsp->asoc->ep is allowed to change, can't it also change to something invalid?
Not sure I follow.
> So I'd have thought you should be holding some kind of lock that stops the data being changed before being 'allowed' to follow the pointers. In which case the current code is just a missing optimisation.
Locking would be another potential solution.
The current code already tries to lock.
	lock_sock(sk);
The difficulty here is that we don't know whether 'sk' is still valid at this point. I've seen the current code panic here. Xin Long suggested something similar using the RCU infrastructure, but this code can sleep, so it wasn't suitable.
If we were to use locking, we'd need to figure out a) what to apply the lock to and b) where to apply the lock.
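For illustration only, one shape a lock-based variant could take is sketched below in userspace C. Nothing here was proposed in the thread: the per-association mutex, asoc_get_endpoint() and asoc_migrate() are all hypothetical, a pthread mutex stands in for whatever kernel primitive would actually fit, and the sketch assumes the association itself stays valid for the duration, which is exactly the open question raised above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; not the SCTP structures. */
struct endpoint {
	atomic_int refcnt;
};

struct association {
	pthread_mutex_t lock;	/* hypothetical: guards the ep pointer */
	struct endpoint *ep;
};

/* Load asoc->ep and take a reference as one step with respect to migration. */
static struct endpoint *asoc_get_endpoint(struct association *asoc)
{
	struct endpoint *ep;

	pthread_mutex_lock(&asoc->lock);
	ep = asoc->ep;
	assert(atomic_load(&ep->refcnt) > 0);
	atomic_fetch_add(&ep->refcnt, 1);
	pthread_mutex_unlock(&asoc->lock);
	return ep;
}

/*
 * Repoint the association under the same lock. The new endpoint gains
 * the association's reference; the old endpoint is returned so the
 * caller can drop the reference the association used to own.
 */
static struct endpoint *asoc_migrate(struct association *asoc,
				     struct endpoint *new_ep)
{
	struct endpoint *old;

	pthread_mutex_lock(&asoc->lock);
	old = asoc->ep;
	atomic_fetch_add(&new_ep->refcnt, 1);
	asoc->ep = new_ep;
	pthread_mutex_unlock(&asoc->lock);
	return old;
}

/* Returns 1 if a reader's reference keeps the old endpoint alive across a migration. */
int locked_lookup_demo(void)
{
	struct endpoint a, b;
	struct association asoc;
	struct endpoint *held, *old;
	int ok;

	atomic_init(&a.refcnt, 1);	/* the association's reference */
	atomic_init(&b.refcnt, 1);	/* the creator's reference */
	pthread_mutex_init(&asoc.lock, NULL);
	asoc.ep = &a;

	held = asoc_get_endpoint(&asoc);	/* a: 2 */
	old = asoc_migrate(&asoc, &b);		/* b: 2 */
	atomic_fetch_sub(&old->refcnt, 1);	/* a: 1, still held by the reader */

	ok = held == &a && old == &a && asoc.ep == &b &&
	     atomic_load(&a.refcnt) == 1 && atomic_load(&b.refcnt) == 2;

	pthread_mutex_destroy(&asoc.lock);
	return ok;
}
```

The sketch answers (a) and (b) only for the ep pointer; it does nothing for the lifetime of tsp->asoc or 'sk', which would need their own guarantees.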
From: Lee Jones
Sent: 17 December 2021 14:35
> On Fri, 17 Dec 2021, David Laight wrote:
> > From: Lee Jones
> > Sent: 17 December 2021 13:46
> > > net/sctp/diag.c for instance is built into its own separate module (sctp_diag.ko) and requires the use of sctp_endpoint_{hold,put}() in order to prevent a recently found use-after-free issue.
> > > In order to prevent data corruption of the pointer used to take a reference on a specific endpoint, between the time of calling sctp_endpoint_hold() and it returning, the API now returns a pointer to the exact endpoint that was incremented.
> > > For example, in sctp_sock_dump(), we could have the following hunk:
> > > 	sctp_endpoint_hold(tsp->asoc->ep);
> > > 	ep = tsp->asoc->ep;
> > > 	sk = ep->base.sk;
> > > 	lock_sock(ep->base.sk);
> > > This task could be preempted immediately after the call to sctp_endpoint_hold(), and tsp->asoc->ep could be changed in the meantime to point to a completely different endpoint. A reference would then have been taken on the old endpoint, while the new one is processed without a reference. The new endpoint could therefore be freed while it is still being processed, causing a use-after-free.
> > > If we return the exact pointer that was held, we ensure this task processes only the endpoint we have taken a reference to. The resultant hunk now looks like this:
> > > 	ep = sctp_endpoint_hold(tsp->asoc->ep);
> > > 	sk = ep->base.sk;
> > > 	lock_sock(sk);
> > Isn't that just the same as doing things in the other order?
> > 	ep = tsp->asoc->ep;
> > 	sctp_endpoint_hold(ep);
> Sleep for a few milliseconds between those lines and see what happens.
> 'ep' could still be freed between the assignment and the call.
It can also be freed half way through setting up the arguments to the call. So any call:
	xxx(tsp->asoc->ep);
is only really valid if both tsp->asoc and asoc->ep are stable. So it is exactly the same as doing:
	ep = tsp->asoc->ep;
	xxx(ep);
Returning the value of the argument doesn't help if any of the pointed-to items can get freed.
David
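The equivalence David describes follows from C evaluating a function's arguments before the call begins. A small single-threaded sketch, with hypothetical names throughout, makes this visible by having the hold function itself repoint the association, standing in for a migration that lands while the call is in progress:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; a plain int is enough in a single thread. */
struct endpoint {
	int refcnt;
};

struct association {
	struct endpoint *ep;
};

/* Set by the caller to simulate a migration racing with the hold. */
static struct association *migrate_asoc;
static struct endpoint *migrate_to;

/* Take a reference and return the argument, as in the patch. */
static struct endpoint *hold_returning(struct endpoint *ep)
{
	/* Simulated migration landing while we are inside the call. */
	if (migrate_asoc)
		migrate_asoc->ep = migrate_to;

	assert(ep->refcnt > 0);
	ep->refcnt++;
	return ep;
}

/* Returns 1 if both call forms end up holding the same (old) endpoint. */
int forms_are_equivalent(void)
{
	struct endpoint old_ep = { .refcnt = 1 };
	struct endpoint new_ep = { .refcnt = 1 };
	struct association asoc;
	struct endpoint *a, *b;

	migrate_asoc = &asoc;
	migrate_to = &new_ep;

	/* Form 1: pass the expression and use the return value. */
	asoc.ep = &old_ep;
	a = hold_returning(asoc.ep);

	/* Form 2: load into a local first, then hold the local. */
	asoc.ep = &old_ep;
	b = asoc.ep;
	hold_returning(b);

	/* The pointer was read exactly once, before the call, in both forms. */
	return a == &old_ep && b == &old_ep && asoc.ep == &new_ep &&
	       old_ep.refcnt == 3 && new_ep.refcnt == 1;
}
```

In both forms the window that matters is between reading tsp->asoc->ep and incrementing the count, and neither form closes it.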
On Fri, 17 Dec 2021 13:46:06 +0000 Lee Jones wrote:
> For example, in sctp_sock_dump(), we could have the following hunk:
> 	sctp_endpoint_hold(tsp->asoc->ep);
> 	ep = tsp->asoc->ep;
> 	sk = ep->base.sk;
> 	lock_sock(ep->base.sk);
> This task could be preempted immediately after the call to sctp_endpoint_hold(), and tsp->asoc->ep could be changed in the meantime to point to a completely different endpoint. A reference would then have been taken on the old endpoint, while the new one is processed without a reference. The new endpoint could therefore be freed while it is still being processed, causing a use-after-free.
> If we return the exact pointer that was held, we ensure this task processes only the endpoint we have taken a reference to. The resultant hunk now looks like this:
> 	ep = sctp_endpoint_hold(tsp->asoc->ep);
> 	sk = ep->base.sk;
> 	lock_sock(sk);
If you have to explain what the next patch will do to make sense of this one it really is better to merge the two patches. Exporting something is not a functional change, nor does it make the changes easier to review, in fact the opposite is true.
> Fixes: 8f840e47f190c ("sctp: add the sctp_diag.c file")
This patch in itself fixes exactly nothing.
linux-stable-mirror@lists.linaro.org