On Wed, Aug 14, 2024 at 1:21 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
On Tue, Aug 13, 2024 at 1:59 PM Jann Horn <jannh@google.com> wrote:
On Tue, Aug 13, 2024 at 2:29 AM Andrii Nakryiko <andrii@kernel.org> wrote:
Harden build ID parsing logic, adding explicit READ_ONCE() where it's important to have a consistent value read and validated just once.
Also, as pointed out by Andi Kleen, we need to make sure that entire ELF note is within a page bounds, so move the overflow check up and add an extra note_size boundaries validation.
Fixes tag below points to the code that moved this code into lib/buildid.c, and then subsequently was used in perf subsystem, making this code exposed to perf_event_open() users in v5.12+.
Sorry, I missed some things in previous review rounds:
[...]
@@ -18,31 +18,37 @@ static int parse_build_id_buf(unsigned char *build_id,
[...]
-       if (nhdr->n_type == BUILD_ID &&
-           nhdr->n_namesz == sizeof("GNU") &&
-           !strcmp((char *)(nhdr + 1), "GNU") &&
-           nhdr->n_descsz > 0 &&
-           nhdr->n_descsz <= BUILD_ID_SIZE_MAX) {
-               memcpy(build_id,
-                      note_start + note_offs +
-                      ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
-                      nhdr->n_descsz);
-               memset(build_id + nhdr->n_descsz, 0,
-                      BUILD_ID_SIZE_MAX - nhdr->n_descsz);
+           name_sz == note_name_sz &&
+           strcmp((char *)(nhdr + 1), note_name) == 0 &&
Please change this to something like "memcmp((char *)(nhdr + 1), note_name, note_name_sz) == 0" to ensure that we can't run off the end of the page if there are no null bytes in the rest of the page.
I did switch this to strncmp() at some earlier point, but then realized that there is no point because note_name is controlled by us and will ensure there is a zero at byte (note_name_sz - 1). So I don't think memcmp() buys us anything.
There are two reasons why using strcmp() here makes me uneasy.
First: We're still operating on shared memory that can concurrently change.
Let's say strcmp() is implemented like this (this is the generic C implementation in the kernel, which I think is the one used on x86-64):
    int strcmp(const char *cs, const char *ct)
    {
            unsigned char c1, c2;

            while (1) {
                    c1 = *cs++;
                    c2 = *ct++;
                    if (c1 != c2)
                            return c1 < c2 ? -1 : 1;
                    if (!c1)
                            break;
            }
            return 0;
    }
No READ_ONCE() or anything like that - it's not designed for being used on concurrently changing memory.
And let's say you call it like strcmp(<shared memory>, "GNU"), and we're now in the fourth iteration. If the compiler decides to re-fetch the value of "c1" from memory for each of the two conditions, then it could be that the "if (c1 != c2)" sees c1='\0' and c2='\0', so the condition evaluates as false; but then at the "if (!c1)", the value in memory changed, and we see c1='A'. So now in the next round, we'll be accessing out-of-bounds memory behind the 4-byte string constant "GNU".
So I don't think strcmp() on memory that can concurrently change is allowed.
(It actually seems like the generic memcmp() is also implemented without READ_ONCE(), maybe we should change that...)
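To illustrate the kind of comparison that would sidestep the re-fetch problem, here is a minimal userspace sketch. It is not the patch and not kernel code: READ_BYTE_ONCE() is a stand-in for the kernel's READ_ONCE() built on a volatile access, and note_name_matches() and the 16-byte snapshot buffer are hypothetical names and sizes chosen only for this example. The idea is to load each shared byte exactly once into private memory and only then compare.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for the kernel's READ_ONCE() on a single byte
 * (an assumption for this sketch): the volatile access makes the
 * compiler emit exactly one load per use.
 */
#define READ_BYTE_ONCE(p) (*(const volatile unsigned char *)(p))

/*
 * Hypothetical helper: compare name_sz bytes at 'shared' against 'name'.
 * Every shared byte is read once into a private snapshot, so a concurrent
 * writer cannot make two checks disagree about the same byte.
 * Returns 1 on match, 0 otherwise.
 */
static int note_name_matches(const void *shared, const char *name,
			     size_t name_sz)
{
	unsigned char snap[16];	/* arbitrary cap chosen for the sketch */
	size_t i;

	if (name_sz == 0 || name_sz > sizeof(snap))
		return 0;

	for (i = 0; i < name_sz; i++)
		snap[i] = READ_BYTE_ONCE((const unsigned char *)shared + i);

	/* The comparison now runs on memory nobody else can modify. */
	return memcmp(snap, name, name_sz) == 0;
}
```

The loop bound comes from name_sz alone, so the number of bytes read never depends on what the shared memory happens to contain at the time.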
Second: You are assuming that if one side of the strcmp() is at most four bytes long (including null terminator), then strcmp() also won't access more than 4 bytes of the other string, even if that string does not have a null terminator at index 4. I don't think that's part of the normal strcmp() API contract.
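As a rough sketch of what an explicitly length-bounded comparison could look like (userspace C, with the hypothetical helper name name_matches_bounded() and made-up parameter names; this is not the code from the patch): the caller states how many bytes are valid, and the helper refuses to compare if the name would extend past that limit, so nothing relies on where a null byte happens to be. It does not address the concurrent-modification concern by itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: check whether the name_sz bytes at buf + offs
 * equal 'name', without ever reading past buf + buf_len.
 * Returns 1 on match, 0 on mismatch or if the range does not fit.
 */
static int name_matches_bounded(const unsigned char *buf, size_t buf_len,
				size_t offs, const char *name,
				size_t name_sz)
{
	/* Written to avoid overflow in offs + name_sz. */
	if (offs > buf_len || name_sz > buf_len - offs)
		return 0;

	/* memcmp() reads exactly name_sz bytes from each side. */
	return memcmp(buf + offs, name, name_sz) == 0;
}
```

With a call such as name_matches_bounded(page, page_len, name_offs, "GNU", sizeof("GNU")), the terminating null of the constant is part of the compared bytes, so a name without a terminator at that position simply fails to match.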