Because pids->limit can be changed concurrently (but we don't want to take a lock because it would be needlessly expensive), use the appropriate memory barriers.
Fixes: 49b786ea146f ("cgroup: implement the PIDs subsystem")
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 kernel/cgroup/pids.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/cgroup/pids.c b/kernel/cgroup/pids.c
index 8e513a573fe9..a726e4a20177 100644
--- a/kernel/cgroup/pids.c
+++ b/kernel/cgroup/pids.c
@@ -152,7 +152,7 @@ static int pids_try_charge(struct pids_cgroup *pids, int num)
 		 * p->limit is %PIDS_MAX then we know that this test will never
 		 * fail.
 		 */
-		if (new > p->limit)
+		if (new > READ_ONCE(p->limit))
 			goto revert;
 	}
@@ -277,7 +277,7 @@ static ssize_t pids_max_write(struct kernfs_open_file *of, char *buf,
 	 * Limit updates don't need to be mutex'd, since it isn't
 	 * critical that any racing fork()s follow the new limit.
 	 */
-	pids->limit = limit;
+	WRITE_ONCE(pids->limit, limit);
 
 	return nbytes;
 }
@@ -285,7 +285,7 @@ static int pids_max_show(struct seq_file *sf, void *v)
 {
 	struct cgroup_subsys_state *css = seq_css(sf);
 	struct pids_cgroup *pids = css_pids(css);
-	int64_t limit = pids->limit;
+	int64_t limit = READ_ONCE(pids->limit);
 
 	if (limit >= PIDS_MAX)
 		seq_printf(sf, "%s\n", PIDS_MAX_STR);
On Sat, Oct 12, 2019 at 12:05:39PM +1100, Aleksa Sarai wrote:
> Because pids->limit can be changed concurrently (but we don't want to take a lock because it would be needlessly expensive), use the appropriate memory barriers.
I can't quite tell what problem it's fixing. Can you elaborate a scenario where the current code would break that your patch fixes?
Thanks.
On 2019-10-14, Tejun Heo <tj@kernel.org> wrote:
> On Sat, Oct 12, 2019 at 12:05:39PM +1100, Aleksa Sarai wrote:
> > Because pids->limit can be changed concurrently (but we don't want to take a lock because it would be needlessly expensive), use the appropriate memory barriers.
> I can't quite tell what problem it's fixing. Can you elaborate a scenario where the current code would break that your patch fixes?
As far as I can tell, not using *_ONCE() here means that if you had a process changing pids->limit from A to B, a process might be able to temporarily exceed pids->limit -- because pids->limit accesses are not protected by mutexes and the C compiler can produce confusing intermediate values for pids->limit[1].
But this is more of a correctness fix than one fixing an actually exploitable bug -- given the kernel memory model work, it seems like a good idea to just use READ_ONCE() and WRITE_ONCE() for shared memory access.
[1]: https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE
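(Illustrative aside, not part of the original mail: the class of miscompilation [1] describes can be sketched in plain C. Every name below -- EXAMPLE_READ_ONCE(), struct example_pids, the two headroom functions -- is invented for this example; the kernel's real READ_ONCE() is more elaborate, but its core is the same volatile cast.)

    #include <stdint.h>

    /* Toy stand-in for the kernel's READ_ONCE(): one volatile-qualified
     * load that the compiler may neither repeat nor re-derive from the
     * variable later.
     */
    #define EXAMPLE_READ_ONCE(x) (*(const volatile typeof(x) *)&(x))

    struct example_pids {
        int64_t limit;  /* written concurrently, no lock held */
    };

    /* Racy: the compiler is allowed to load p->limit twice, so the bounds
     * check and the subtraction can see different values if another
     * thread updates the limit in between.
     */
    int64_t headroom_racy(struct example_pids *p, int64_t counter)
    {
        if (counter > p->limit)
            return 0;
        return p->limit - counter;
    }

    /* With a *_ONCE-style accessor the value is loaded exactly once. */
    int64_t headroom_once(struct example_pids *p, int64_t counter)
    {
        int64_t limit = EXAMPLE_READ_ONCE(p->limit);

        if (counter > limit)
            return 0;
        return limit - counter;
    }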
Hello, Aleksa.
On Tue, Oct 15, 2019 at 02:59:31AM +1100, Aleksa Sarai wrote:
> On 2019-10-14, Tejun Heo <tj@kernel.org> wrote:
> > On Sat, Oct 12, 2019 at 12:05:39PM +1100, Aleksa Sarai wrote:
> > > Because pids->limit can be changed concurrently (but we don't want to take a lock because it would be needlessly expensive), use the appropriate memory barriers.
> > I can't quite tell what problem it's fixing. Can you elaborate a scenario where the current code would break that your patch fixes?
> As far as I can tell, not using *_ONCE() here means that if you had a process changing pids->limit from A to B, a process might be able to temporarily exceed pids->limit -- because pids->limit accesses are not protected by mutexes and the C compiler can produce confusing intermediate values for pids->limit[1].
> But this is more of a correctness fix than one fixing an actually exploitable bug -- given the kernel memory model work, it seems like a good idea to just use READ_ONCE() and WRITE_ONCE() for shared memory access.
READ/WRITE_ONCE provides protection against the compiler generating multiple accesses for a single operation. It won't prevent split writes / reads of 64bit variables on 32bit machines. For that, you'd have to switch them to atomic64_t's.
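(An untested, abbreviated sketch of that atomic64_t conversion against the original code -- atomic64_read()/atomic64_set() are the stock <linux/atomic.h> accessors; hunk contexts are elided:)

    (struct pids_cgroup)
    -	int64_t			limit;
    +	atomic64_t		limit;

    (pids_css_alloc)
    -	pids->limit = PIDS_MAX;
    +	atomic64_set(&pids->limit, PIDS_MAX);

    (pids_try_charge)
    -	if (new > p->limit)
    +	if (new > atomic64_read(&p->limit))

    (pids_max_write)
    -	pids->limit = limit;
    +	atomic64_set(&pids->limit, limit);

    (pids_max_show)
    -	int64_t limit = pids->limit;
    +	int64_t limit = atomic64_read(&pids->limit);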
Thanks.
On 2019-10-14, Tejun Heo <tj@kernel.org> wrote:
> Hello, Aleksa.
> On Tue, Oct 15, 2019 at 02:59:31AM +1100, Aleksa Sarai wrote:
> > On 2019-10-14, Tejun Heo <tj@kernel.org> wrote:
> > > On Sat, Oct 12, 2019 at 12:05:39PM +1100, Aleksa Sarai wrote:
> > > > Because pids->limit can be changed concurrently (but we don't want to take a lock because it would be needlessly expensive), use the appropriate memory barriers.
> > > I can't quite tell what problem it's fixing. Can you elaborate a scenario where the current code would break that your patch fixes?
> > As far as I can tell, not using *_ONCE() here means that if you had a process changing pids->limit from A to B, a process might be able to temporarily exceed pids->limit -- because pids->limit accesses are not protected by mutexes and the C compiler can produce confusing intermediate values for pids->limit[1].
> > But this is more of a correctness fix than one fixing an actually exploitable bug -- given the kernel memory model work, it seems like a good idea to just use READ_ONCE() and WRITE_ONCE() for shared memory access.
> READ/WRITE_ONCE provides protection against the compiler generating multiple accesses for a single operation. It won't prevent split writes / reads of 64bit variables on 32bit machines. For that, you'd have to switch them to atomic64_t's.
Maybe I'm misunderstanding Documentation/atomic_t.txt, but it looks to me like it's explicitly saying that I shouldn't use atomic64_t if I'm just using it for fetching and assignment.
    The non-RMW ops are (typically) regular LOADs and STOREs and are canonically implemented using READ_ONCE(), WRITE_ONCE(), smp_load_acquire() and smp_store_release() respectively. Therefore, if you find yourself only using the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all and are doing it wrong.
As for 64-bit on 32-bit machines -- that is a separate issue, but from [1] it seems to me like there are more problems that *_ONCE() fixes than just split reads and writes.
[1]: https://lwn.net/Articles/793253/
Hello, Aleksa.
On Wed, Oct 16, 2019 at 07:32:19PM +1100, Aleksa Sarai wrote:
> Maybe I'm misunderstanding Documentation/atomic_t.txt, but it looks to me like it's explicitly saying that I shouldn't use atomic64_t if I'm just using it for fetching and assignment.
Hah, where is it saying that? The alternative would be seqlock or u64_stats or straight-up locking, but idk, for this atomic64_t should be fine.
>     The non-RMW ops are (typically) regular LOADs and STOREs and are canonically implemented using READ_ONCE(), WRITE_ONCE(), smp_load_acquire() and smp_store_release() respectively. Therefore, if you find yourself only using the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all and are doing it wrong.
> As for 64-bit on 32-bit machines -- that is a separate issue, but from [1] it seems to me like there are more problems that *_ONCE() fixes than just split reads and writes.
Your explanations are too wishy washy. If you wanna fix it, please do it correctly. R/W ONCE isn't the right solution here.
Thanks.
On 2019-10-16, Tejun Heo <tj@kernel.org> wrote:
> Hello, Aleksa.
> On Wed, Oct 16, 2019 at 07:32:19PM +1100, Aleksa Sarai wrote:
> > Maybe I'm misunderstanding Documentation/atomic_t.txt, but it looks to me like it's explicitly saying that I shouldn't use atomic64_t if I'm just using it for fetching and assignment.
> Hah, where is it saying that?
Isn't that what this says:
    Therefore, if you find yourself only using the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all and are doing it wrong.
Doesn't using just atomic64_read() and atomic64_set() fall under "only using the non-RMW operations of atomic_t"? But yes, I agree that any locking is overkill.
> > As for 64-bit on 32-bit machines -- that is a separate issue, but from [1] it seems to me like there are more problems that *_ONCE() fixes than just split reads and writes.
> Your explanations are too wishy washy. If you wanna fix it, please do it correctly. R/W ONCE isn't the right solution here.
Sure, I will switch it to use atomic64_read() and atomic64_set() instead if that's what you'd prefer. Though I will mention that on quite a few architectures atomic64_read() is defined as:
    #define atomic64_read(v) READ_ONCE((v)->counter)
Hello,
On Thu, Oct 17, 2019 at 02:29:46AM +1100, Aleksa Sarai wrote:
> > Hah, where is it saying that?
> Isn't that what this says:
>     Therefore, if you find yourself only using the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all and are doing it wrong.
> Doesn't using just atomic64_read() and atomic64_set() fall under "only using the non-RMW operations of atomic_t"? But yes, I agree that any locking is overkill.
Yeah, I mean, it's overkill. We could use seqlock or u64_stats here but it doesn't matter that much.
> > > As for 64-bit on 32-bit machines -- that is a separate issue, but from [1] it seems to me like there are more problems that *_ONCE() fixes than just split reads and writes.
> > Your explanations are too wishy washy. If you wanna fix it, please do it correctly. R/W ONCE isn't the right solution here.
> Sure, I will switch it to use atomic64_read() and atomic64_set() instead if that's what you'd prefer. Though I will mention that on quite a few architectures atomic64_read() is defined as:
>     #define atomic64_read(v) READ_ONCE((v)->counter)
Yeah, on archs which don't have split access on 64bits. On the ones which do, it does something else. The generic implementation is straight-up locking, I think.
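(A condensed sketch of that generic fallback, modeled on lib/atomic64.c: every atomic64_t operation takes a spinlock picked by hashing the variable's address, which is what makes the 64-bit access untearable on such machines.)

    static union {
        raw_spinlock_t lock;
        char pad[L1_CACHE_BYTES];
    } atomic64_lock[NR_LOCKS] __cacheline_aligned_in_smp;

    static raw_spinlock_t *lock_addr(const atomic64_t *v)
    {
        unsigned long addr = (unsigned long)v;

        /* hash the address so unrelated atomic64_t's rarely share a lock */
        addr >>= L1_CACHE_SHIFT;
        addr ^= (addr >> 8) ^ (addr >> 16);
        return &atomic64_lock[addr & (NR_LOCKS - 1)].lock;
    }

    s64 atomic64_read(const atomic64_t *v)
    {
        unsigned long flags;
        raw_spinlock_t *lock = lock_addr(v);
        s64 val;

        raw_spin_lock_irqsave(lock, flags);
        val = v->counter;   /* plain load, serialized by the lock */
        raw_spin_unlock_irqrestore(lock, flags);
        return val;
    }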
Thanks.
On 2019-10-17, Aleksa Sarai <cyphar@cyphar.com> wrote:
> On 2019-10-16, Tejun Heo <tj@kernel.org> wrote:
> > Hello, Aleksa.
> > On Wed, Oct 16, 2019 at 07:32:19PM +1100, Aleksa Sarai wrote:
> > > Maybe I'm misunderstanding Documentation/atomic_t.txt, but it looks to me like it's explicitly saying that I shouldn't use atomic64_t if I'm just using it for fetching and assignment.
> > Hah, where is it saying that?
> Isn't that what this says:
>     Therefore, if you find yourself only using the Non-RMW operations of atomic_t, you do not in fact need atomic_t at all and are doing it wrong.
> Doesn't using just atomic64_read() and atomic64_set() fall under "only using the non-RMW operations of atomic_t"? But yes, I agree that any locking is overkill.
> > > As for 64-bit on 32-bit machines -- that is a separate issue, but from [1] it seems to me like there are more problems that *_ONCE() fixes than just split reads and writes.
> > Your explanations are too wishy washy. If you wanna fix it, please do it correctly. R/W ONCE isn't the right solution here.
> Sure, I will switch it to use atomic64_read() and atomic64_set() instead if that's what you'd prefer. Though I will mention that on quite a few architectures atomic64_read() is defined as:
>     #define atomic64_read(v) READ_ONCE((v)->counter)
Though I guess that's because on those architectures it turns out that READ_ONCE is properly atomic?
On Thu, Oct 17, 2019 at 02:35:20AM +1100, Aleksa Sarai wrote:
> > Sure, I will switch it to use atomic64_read() and atomic64_set() instead if that's what you'd prefer. Though I will mention that on quite a few architectures atomic64_read() is defined as:
> >     #define atomic64_read(v) READ_ONCE((v)->counter)
> Though I guess that's because on those architectures it turns out that READ_ONCE is properly atomic?
Oh yeah, on archs where 64bit accesses are atomic, READ_ONCE() / WRITE_ONCE() would work here. If the limit variable were ulong instead of an explicit 64bit variable, RW ONCE would work too as ulong accesses are atomic on all archs IIRC.
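(A hypothetical sketch of that ulong variant -- nobody proposed it in the thread, and whether unsigned long leaves enough range for the limit on 32-bit is a separate question:)

    (struct pids_cgroup)
    -	int64_t			limit;
    +	unsigned long		limit;

    (pids_try_charge)
    -	if (new > p->limit)
    +	if (new > READ_ONCE(p->limit))

    (pids_max_write)
    -	pids->limit = limit;
    +	WRITE_ONCE(pids->limit, limit);

Word-sized loads and stores are single-copy atomic on every architecture Linux supports, so with unsigned long the *_ONCE annotations alone would be sufficient.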
Thanks.