On 9/7/22 20:04, Eric Dumazet wrote:
On Wed, Sep 7, 2022 at 9:53 AM Leonard Crestez cdleonard@gmail.com wrote:
On 9/7/22 02:11, Eric Dumazet wrote:
On Mon, Sep 5, 2022 at 12:06 AM Leonard Crestez cdleonard@gmail.com wrote:
This is mainly intended to protect against local privilege escalations through a rarely used feature so it is deliberately not namespaced.
Enforcement is only at the setsockopt level, this should be enough to ensure that the tcp_authopt_needed static key never turns on.
No effort is made to handle disabling when the feature is already in use.
Signed-off-by: Leonard Crestez cdleonard@gmail.com
Documentation/networking/ip-sysctl.rst | 6 ++++ include/net/tcp_authopt.h | 1 + net/ipv4/sysctl_net_ipv4.c | 39 ++++++++++++++++++++++++++ net/ipv4/tcp_authopt.c | 25 +++++++++++++++++ 4 files changed, 71 insertions(+)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index a759872a2883..41be0e69d767 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -1038,10 +1038,16 @@ tcp_challenge_ack_limit - INTEGER Note that this per netns rate limit can allow some side channel attacks and probably should not be enabled. TCP stack implements per TCP socket limits anyway. Default: INT_MAX (unlimited)
+tcp_authopt - BOOLEAN
Enable the TCP Authentication Option (RFC5925), a replacement for TCP
MD5 Signatures (RFC2835).
Default: 0
...
+#ifdef CONFIG_TCP_AUTHOPT +static int proc_tcp_authopt(struct ctl_table *ctl,
int write, void *buffer, size_t *lenp,
loff_t *ppos)
+{
int val = sysctl_tcp_authopt;
val = READ_ONCE(sysctl_tcp_authopt);
struct ctl_table tmp = {
.data = &val,
.mode = ctl->mode,
.maxlen = sizeof(val),
.extra1 = SYSCTL_ZERO,
.extra2 = SYSCTL_ONE,
};
int err;
err = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
if (err)
return err;
if (sysctl_tcp_authopt && !val) {
READ_ONCE(sysctl_tcp_authopt)
Note that this test would still be racy, because another cpu might change sysctl_tcp_authopt right after the read.
What meaningful races are possible here? This is a variable that changes from 0 to 1 at most once.
Two cpus can issue writes of 0 and 1 values at the same time.
Depending on scheduling writing the 0 can 'win' the race and overwrite the value back to 0.
This is in clear violation of the claim you are making (that the sysctl can only go once from 0 to 1)
Not clear why anyone would attempt to write 0, maybe to ensure that it's still disabled?
But you're right that userspace CAN do that and the kernel CAN misbehave in this scenario so it would be better to just make the changes you suggested.
In theory if two processes attempt to assign "non-zero" at the same time then one will "win" and the other will get an error but races between userspace writing different values are possible for any sysctl. The solution seems to be "write sysctls from a single place".
All the checks are in sockopts - in theory if the sysctl is written on one CPU then a sockopt can still fail on another CPU until caches are flushed. Is this what you're worried about?
In theory doing READ_ONCE might incur a slight penalty on sockopt but not noticeable.
Not at all. There is _no_ penalty using READ_ONCE(). Unless it is done in a loop and this prevents some compiler optimization.
Please use WRITE_ONCE() and READ_ONCE() for all sysctl values used in TCP stack (and elsewhere)
See all the silly patches we had recently.
OK