The length of Physical Address in General Media Event Record/DRAM Event Record is 64-bit, so the field mask should be defined as such length. Otherwise, this causes cxl_general_media and cxl_dram tracepoints to mask off the upper-32-bits of DPA addresses. The cxl_poison event is unaffected.
If userspace was doing its own DPA-to-HPA translation this could lead to incorrect page retirement decisions, but there is no known consumer (like rasdaemon) of this event today.
Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record") Cc: stable@vger.kernel.org Cc: Dan Williams dan.j.williams@intel.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: Ira Weiny ira.weiny@intel.com Signed-off-by: Shiyang Ruan ruansy.fnst@fujitsu.com --- drivers/cxl/core/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event, * DRAM Event Record * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44 */ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
#define CXL_DPA_VOLATILE BIT(0)
Shiyang Ruan wrote:
The length of Physical Address in General Media Event Record/DRAM Event Record is 64-bit, so the field mask should be defined as such length. Otherwise, this causes cxl_general_media and cxl_dram tracepoints to mask off the upper-32-bits of DPA addresses. The cxl_poison event is unaffected.
If userspace was doing its own DPA-to-HPA translation this could lead to incorrect page retirement decisions, but there is no known consumer (like rasdaemon) of this event today.
Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record") Cc: stable@vger.kernel.org Cc: Dan Williams dan.j.williams@intel.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: Ira Weiny ira.weiny@intel.com
Apologies I thought I saw this go in before. But perhaps it was a different mask.
Reviewed-by: Ira Weiny ira.weiny@intel.com
Signed-off-by: Shiyang Ruan ruansy.fnst@fujitsu.com
drivers/cxl/core/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
- DRAM Event Record
- CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
#define CXL_DPA_VOLATILE BIT(0)
2.34.1
Shiyang Ruan wrote:
The length of Physical Address in General Media Event Record/DRAM Event Record is 64-bit, so the field mask should be defined as such length. Otherwise, this causes cxl_general_media and cxl_dram tracepoints to mask off the upper-32-bits of DPA addresses. The cxl_poison event is unaffected.
If userspace was doing its own DPA-to-HPA translation this could lead to incorrect page retirement decisions, but there is no known consumer (like rasdaemon) of this event today.
Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record") Cc: stable@vger.kernel.org Cc: Dan Williams dan.j.williams@intel.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: Ira Weiny ira.weiny@intel.com Signed-off-by: Shiyang Ruan ruansy.fnst@fujitsu.com
drivers/cxl/core/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
- DRAM Event Record
- CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) #define CXL_DPA_VOLATILE BIT(0)
Looks good,
Reviewed-by: Dan Williams dan.j.williams@intel.com
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
The length of Physical Address in General Media Event Record/DRAM Event Record is 64-bit, so the field mask should be defined as such length. Otherwise, this causes cxl_general_media and cxl_dram tracepoints to mask off the upper-32-bits of DPA addresses. The cxl_poison event is unaffected.
If userspace was doing its own DPA-to-HPA translation this could lead to incorrect page retirement decisions, but there is no known consumer (like rasdaemon) of this event today.
So, an invalid DPA is emitted in the trace event log and that could lead to 'incorrect page retirement decisions...'
Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record") Cc: stable@vger.kernel.org Cc: Dan Williams dan.j.williams@intel.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: Ira Weiny ira.weiny@intel.com Signed-off-by: Shiyang Ruan ruansy.fnst@fujitsu.com
drivers/cxl/core/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
- DRAM Event Record
- CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) #define CXL_DPA_VOLATILE BIT(0)
This works but I'm thinking this is the time to convene on one CXL_EVENT_DPA_MASK for both all CXL events, rather than having cxl_poison event be different.
I prefer how poison defines it:
cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6)
Can we rename that CXL_EVENT_DPA_MASK and use for all events?
--Alison
-- 2.34.1
Alison Schofield wrote:
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
[snip]
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
- DRAM Event Record
- CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) #define CXL_DPA_VOLATILE BIT(0)
This works but I'm thinking this is the time to convene on one CXL_EVENT_DPA_MASK for both all CXL events, rather than having cxl_poison event be different.
I prefer how poison defines it:
cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6)
Can we rename that CXL_EVENT_DPA_MASK and use for all events?
Ah! Great catch. I dont' know why I only masked off the 2 used bits. That was short sighted of me.
Yes we should consolidate these. Ira
在 2024/4/24 5:04, Ira Weiny 写道:
Alison Schofield wrote:
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
[snip]
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
- DRAM Event Record
- CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) #define CXL_DPA_VOLATILE BIT(0)
This works but I'm thinking this is the time to convene on one CXL_EVENT_DPA_MASK for both all CXL events, rather than having cxl_poison event be different.
I prefer how poison defines it:
cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6)
Can we rename that CXL_EVENT_DPA_MASK and use for all events?
cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here is for events. They have different meaning though their values are same. So, I don't think we should consolidate them.
Ah! Great catch. I dont' know why I only masked off the 2 used bits.
Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile or not" and "not repairable". So there is no mistake here.
That was short sighted of me.
Yes we should consolidate these. Ira
-- Thanks, Ruan.
Shiyang Ruan wrote:
在 2024/4/24 5:04, Ira Weiny 写道:
Alison Schofield wrote:
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
[snip]
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
- DRAM Event Record
- CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) #define CXL_DPA_VOLATILE BIT(0)
This works but I'm thinking this is the time to convene on one CXL_EVENT_DPA_MASK for both all CXL events, rather than having cxl_poison event be different.
I prefer how poison defines it:
cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6)
Can we rename that CXL_EVENT_DPA_MASK and use for all events?
cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here is for events. They have different meaning though their values are same. So, I don't think we should consolidate them.
By definition the DPA in all these payloads can't use the lower 6 bits. We are defining a mask to get the DPA. This has nothing to do with what may be stored in the other 6 bits.
Defining a common DPA mask is correct per both sections of the spec.
Ah! Great catch. I dont' know why I only masked off the 2 used bits.
Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile or not" and "not repairable". So there is no mistake here.
I appreciate your not calling out my code as a bug. :-D
However, bits [5:2] are also Reserved. So it is incorrect to mask off only the lower 2 bits. Even though the reserved bits must be 0 per the spec, it is still better to properly mask all reserved bits because they are by definition not part of the DPA.
Could you create a common macro for the next version of the patch?
Thanks, Ira
That was short sighted of me.
Yes we should consolidate these. Ira
-- Thanks, Ruan.
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
The length of Physical Address in General Media Event Record/DRAM Event Record is 64-bit, so the field mask should be defined as such length. Otherwise, this causes cxl_general_media and cxl_dram tracepoints to mask off the upper-32-bits of DPA addresses. The cxl_poison event is unaffected.
If userspace was doing its own DPA-to-HPA translation this could lead to incorrect page retirement decisions, but there is no known consumer (like rasdaemon) of this event today.
Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record") Cc: stable@vger.kernel.org Cc: Dan Williams dan.j.williams@intel.com Cc: Davidlohr Bueso dave@stgolabs.net Cc: Jonathan Cameron Jonathan.Cameron@huawei.com Cc: Ira Weiny ira.weiny@intel.com Signed-off-by: Shiyang Ruan ruansy.fnst@fujitsu.com
Hi Ruan,
This fixup is important for the Event DPA->HPA translation work, so I grabbed it, updated it with most* of the review comments, and posted with that set. I expect you saw that in your mailbox.
DaveJ queued it in a topic branch for 6.10 here: https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/...
*I did not create a common mask for events and poison because I wanted to limit the changes. If you'd like to make that change it would be welcomed.
-- Alison
drivers/cxl/core/trace.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index e5f13260fc52..cdfce932d5b1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
- DRAM Event Record
- CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/ -#define CXL_DPA_FLAGS_MASK 0x3F +#define CXL_DPA_FLAGS_MASK 0x3FULL #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
#define CXL_DPA_VOLATILE BIT(0)
2.34.1
linux-stable-mirror@lists.linaro.org