On Wed, Feb 03, 2021 at 04:50:07PM +0000, Richard Fitzgerald wrote:
The existing code attempted to handle numbers by doing a strto[u]l(), ignoring the field width, and then repeatedly dividing to extract the field out of the full converted value. If the string contains a run of valid digits longer than will fit in a long or long long, this would overflow and no amount of dividing can recover the correct value.
This patch fixes vsscanf to obey number field widths when parsing
vsscanf()
the number.
A new _parse_integer_limit() is added that takes a limit for the number of characters to parse. The number field conversion in vsscanf is changed to use this new function.
If a number starts with a radix prefix, the field width must be long enough for at last one digit after the prefix. If not, it will be handled like this:
sscanf("0x4", "%1i", &i): i=0, scanning continues with the 'x' sscanf("0x4", "%2i", &i): i=0, scanning continues with the '4'
This is consistent with the observed behaviour of userland sscanf.
Note that this patch does NOT fix the problem of a single field value overflowing the target type. So for example:
sscanf("123456789abcdef", "%x", &i);
Will not produce the correct result because the value obviously overflows INT_MAX. But sscanf will report a successful conversion.
...
- for (; max_chars > 0; max_chars--) {
Less fragile is to write
while (max_chars--)
This allows max_char to be an unsigned type.
Moreover...
- return _parse_integer_limit(s, base, p, INT_MAX);
You have inconsistency with INT_MAX vs, size_t above.
...
+unsigned int _parse_integer_limit(const char *s, unsigned int base,
unsigned long long *res, size_t max_chars);
Also, can you leave res on previous line, so it will be easier to see what's the difference with the original one?
unsigned int _parse_integer(const char *s, unsigned int base, unsigned long long *res);
...
- unsigned long long result;
- const char *cp;
- unsigned long long result = 0ULL; unsigned int rv;
- cp = _parse_integer_fixup_radix(cp, &base);
- rv = _parse_integer(cp, base, &result);
- if (max_chars == 0) {
cp = startp;
goto out;
- }
It's redundant if I'm not mistaken.
- cp = _parse_integer_fixup_radix(startp, &base);
- if ((cp - startp) >= max_chars) {
cp = startp + max_chars;
goto out;
- }
This will be exactly the same, no?
Moreover you will have while (max_chars--) in the _limit() variant which is also a no-op.
...
Unrelated change.
+out: if (endp) *endp = (char *)cp;