Quoting Chang, Bruce (2020-08-15 03:16:58)
On 8/14/2020 5:36 PM, Chang, Bruce wrote:
@@ -2498,9 +2498,22 @@ invalidate_csb_entries(const u64 *first, const u64 *last) */ static inline bool gen12_csb_parse(const u64 *csb) { - u64 entry = READ_ONCE(*csb); - bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry)); - bool new_queue = + bool ctx_away_valid; + bool new_queue; + u64 entry;
+ /* XXX HSD */ + entry = READ_ONCE(*csb); + if (unlikely(entry == -1)) { + preempt_disable(); + if (wait_for_atomic_us((entry = READ_ONCE(*csb)) != -1, 50)) + GEM_WARN_ON("50us CSB timeout");
Out tests showed that 10us is not long enough, but 20us worked well. So 50us should be good enough.
Just realized this may not fully work, as one of the common issue we run into is that higher 32bit is updated from the HW, but lower 32bit update at a later time: meaning the csb will read like 0xFFFFFFFF:xxxxxxxx (low:high) . So this check (!= -1) can still pass but with a partial invalid csb status. So, we may need to check each 32bit separately.
After tested, with the new 64bit read, the above issue never happened so far. So, it seems this only applicable to 32bit read (CSB updated between the two lower and high 32bit reads). Assuming the HW 64bit CSB update is also atomic, the above code should be fine.
Fortunately for all the platforms we care about here, READ_ONCE(u64) will be a single 64b read and so both lower/upper dwords will be pulled from the same bus transfer. We really need a compiler warning for when READ_ONCE() is not a singular atomic operation. atomic64_t has too much connotation with cross-core atomicity for my liking when dealing with [cacheable] mmio semantics. -Chris