@@ -2498,9 +2498,22 @@ invalidate_csb_entries(const u64 *first, const u64 *last)
*/
static inline bool gen12_csb_parse(const u64 *csb)
{
u64 entry = READ_ONCE(*csb);
bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry));
bool new_queue =
bool ctx_away_valid;
bool new_queue;
u64 entry;
/* XXX HSD */
entry = READ_ONCE(*csb);
if (unlikely(entry == -1)) {
preempt_disable();
if (wait_for_atomic_us((entry = READ_ONCE(*csb)) != -1, 50))
GEM_WARN_ON("50us CSB timeout");
Our tests showed that 10us is not long enough, but 20us worked well, so 50us should be good enough.
Just realized this may not fully work. One common issue we run into is that the HW updates the upper 32 bits first and the lower 32 bits at a later time, meaning the CSB will read like 0xFFFFFFFF:xxxxxxxx (low:high). So this check (!= -1) can still pass with a partially invalid CSB status. We may need to check each 32-bit half separately.
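To make the per-half check concrete, here is a minimal userspace sketch, not a proposed final fix. The names `csb_entry_complete` and `CSB_HALF_INVALID` are hypothetical, and the sketch assumes that a valid CSB entry never carries 0xFFFFFFFF in either 32-bit half, which would need to be confirmed against the HW behavior. In the driver the halves would presumably come from lower_32_bits()/upper_32_bits(), and the predicate would replace the `!= -1` condition inside the wait.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pattern left in each 32-bit half by the invalidation write of -1. */
#define CSB_HALF_INVALID UINT32_C(0xFFFFFFFF)

/*
 * Hypothetical helper: treat an entry as complete only when neither
 * half still holds the invalidation pattern.
 * ASSUMPTION: the HW never writes 0xFFFFFFFF as a legitimate half.
 */
static bool csb_entry_complete(uint64_t entry)
{
	uint32_t lo = (uint32_t)entry;         /* lower 32 bits */
	uint32_t hi = (uint32_t)(entry >> 32); /* upper 32 bits */

	return lo != CSB_HALF_INVALID && hi != CSB_HALF_INVALID;
}
```

With this predicate a half-written entry such as 0x00000001FFFFFFFF is still rejected, whereas the whole-word `!= -1` comparison would accept it.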
preempt_enable();
}
WRITE_ONCE(*(u64 *)csb, -1);
A wmb() is probably needed here. It should be OK if the CSB is in SMEM, but if the CSB is allocated in LMEM, the memory type will be WC, so the write (WRITE_ONCE) may still be sitting in the write-combine buffer rather than in any cache, i.e., not yet visible to the HW.
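The ordering I have in mind can be sketched as below. This is a userspace stand-in so that it compiles outside the kernel: `csb_invalidate` is a hypothetical helper name, and the `wmb()` macro here is only a placeholder built on a C11 fence. It does not model write-combine buffer behavior; in the driver the real kernel wmb() would be used after the WRITE_ONCE.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Placeholder for the kernel's wmb(). ASSUMPTION: a C11 fence is used
 * purely so the sketch builds in userspace; it is not a WC flush.
 */
#define wmb() atomic_thread_fence(memory_order_seq_cst)

/* Hypothetical helper: invalidate one CSB entry, then order the write. */
static void csb_invalidate(uint64_t *csb)
{
	/* Volatile store, standing in for WRITE_ONCE(*(u64 *)csb, -1). */
	*(volatile uint64_t *)csb = UINT64_MAX;

	/* Barrier placed after the invalidation write, as suggested above. */
	wmb();
}
```

The point is only the placement: the barrier follows the invalidation write, so the -1 pattern is pushed out before the HW can write the next entry into that slot.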