RE: Problems with kernel support for hardware watchpoints

14 Feb 2011

Hi Ulrich,
...
I've now got it working reliably on Versatile Express, after fixing
a couple of bugs on the GDB side (both in the HW-watchpoint patch, and
in common GDB code).  The testsuite now passes with no regressions when
enabling HW watchpoints, except for two tests that require more than
one watchpoint to be supported.

That's good to hear, thanks.
...
This raises another couple of issues/questions, however:

In testing on Versatile Express, I noticed what appear to be
SMP-related bugs in handling regular software breakpoints: occasionally,
software breakpoints simply are not hit and execution continues as if
the underlying code had not been changed at all.  This symptom
completely goes away if GDB and the debugged process are forced to
the same CPU using the affinity feature (e.g. with schedtool).
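
The same-CPU workaround described above can also be applied from code
rather than with schedtool. The following is a minimal sketch, assuming
Linux with glibc; single_cpu_set and pin_to_cpu are hypothetical helper
names, not part of GDB or the kernel.

```c
#define _GNU_SOURCE           /* needed for cpu_set_t and the CPU_* macros */
#include <sched.h>
#include <sys/types.h>

/* Build a CPU set that contains only the given CPU. */
void single_cpu_set(int cpu, cpu_set_t *set)
{
    CPU_ZERO(set);
    CPU_SET(cpu, set);
}

/* Restrict pid (0 means the calling thread) to a single CPU.
 * Returns 0 on success, -1 with errno set on failure. */
int pin_to_cpu(pid_t pid, int cpu)
{
    cpu_set_t set;

    single_cpu_set(cpu, &set);
    return sched_setaffinity(pid, sizeof(set), &set);
}
```

Since the affinity mask is inherited across fork() and preserved across
execve(), pinning the debugger before it starts the inferior should keep
both on the same CPU.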

I've seen this issue in the past but I thought I'd fixed it. What kernel are
you using and do you have CONFIG_ARM_ERRATA_720789 enabled?
...
My guess, just from seeing those symptoms, would be that when inserting
a software breakpoint via ptrace, not all i-caches on all CPUs are
reliably flushed ...   Any thoughts on this?

There was an I-cache aliasing problem in the kernel coupled with a TLB
invalidation hardware bug on the Versatile Express. I fixed these though
and haven't seen any problems since.
...

As mentioned above, the kernel currently supports only one active
watchpoint at a time, even though the hardware might support several.
The reason seems to be that when a watchpoint triggers, the kernel
cannot figure out which one it was (if there's more than one choice).
This is a bit unfortunate, given that GDB will attempt to insert two
or more watchpoints in many interesting cases (e.g. a "watch *p"
command will insert *two* low-level watchpoints, one at the address
of p, and one at the address where p (currently) points to).
In addition, for regular (write) watchpoints, GDB does not actually
*require* the underlying hardware/kernel to specify which watchpoint
was hit; GDB is able to find out by itself by checking whether the
values at any of the currently active locations actually changed.
(For read/access type watchpoints, GDB does require that underlying
support -- but those are much more rarely used anyway.)
Do you see any chance of improving upon the current behaviour?
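
The value-comparison idea described above for write watchpoints can be
sketched roughly as follows. This is an illustration only, not GDB's
actual code: struct watch_slot and both functions are hypothetical, and
the sketch reads the current process's memory directly, whereas a
debugger would read the inferior's memory (for example via ptrace).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one active write watchpoint. */
struct watch_slot {
    const void *addr;        /* watched location */
    size_t len;              /* watched length in bytes, at most 8 */
    unsigned char old[8];    /* contents seen at the last check */
};

/* Remember the current contents so a later stop can be compared. */
void watch_snapshot(struct watch_slot *w)
{
    memcpy(w->old, w->addr, w->len);
}

/* After a stop, return the index of the first slot whose contents
 * changed since its snapshot and refresh that snapshot; return -1 if
 * no watched location changed. */
int watch_find_changed(struct watch_slot *slots, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (memcmp(slots[i].old, slots[i].addr, slots[i].len) != 0) {
            watch_snapshot(&slots[i]);
            return (int)i;
        }
    }
    return -1;
}
```

For a "watch *p" command there would be two slots, one covering p itself
and one covering the location p currently points to. A write that stores
the same value again goes unnoticed by this comparison, and it cannot
work at all for read/access watchpoints, which is why those need the
kernel to report which watchpoint was hit.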

Hmmm, I'll need to have a think about this. What does GDB do if it receives
a SIGTRAP with si_addr set to (potentially) complete nonsense? As an aside,
Cortex-A15 reports the faulting address for a watchpoint correctly, so we
will be able to use multiple watchpoints there.
...

Finally, I noticed when reading kernel code that under some
circumstances, the kernel will automatically do a single step to
get off a watchpoint that was just hit.  However, this does not
happen for user-space watchpoints installed via ptrace, right?
(Just wanting to confirm; since GDB currently does that single
step itself -- we don't want *both* kernel and GDB to issue a
single step each ...)

If the {break,watch}point has been inserted via ptrace, the kernel will
send a SIGTRAP instead of stepping the instruction.
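
From the debugger's side, this means a ptrace-inserted breakpoint or
watchpoint shows up as an ordinary SIGTRAP stop, and getting past the
instruction is left to the debugger. A minimal sketch of recognising
that stop from a waitpid() status, assuming Linux; stopped_by_sigtrap is
a hypothetical helper name, not a GDB or kernel function.

```c
#include <signal.h>
#include <sys/wait.h>

/* Return 1 if a waitpid() status reports a tracee stopped by SIGTRAP,
 * 0 otherwise. */
int stopped_by_sigtrap(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP;
}
```

A tracer would typically call this on the status returned by
waitpid(pid, &status, 0) before deciding how to resume the tracee.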
...
I haven't gotten to looking further into other hardware (IGEP,
Panda) -- that's next on the list.

Good stuff, keep me posted if you see any further problems!

Thanks,
Will
