On Tue, May 19, 2020 at 03:12:42PM -0700, Dan Williams wrote:
The original copy_mc_fragile() implementation had negative performance implications since it did not use the fast-string instruction sequence to perform copies. For this reason copy_mc_to_kernel() fell back to plain memcpy() to preserve performance on platforms that did not indicate the capability to recover from machine check exceptions. However, that capability detection was not architectural, and now that some platforms can recover from fast-string consumption of memory errors, the memcpy() fallback causes these more capable platforms to fail.
Introduce copy_mc_generic() as the fast default implementation of copy_mc_to_kernel() and finalize the transition of copy_mc_fragile() to be a platform quirk to indicate 'fragility'. With this in place copy_mc_to_kernel() is fast and recovery-ready by default regardless of hardware capability.
Thanks to Vivek for identifying that copy_user_generic() is not suitable as the copy_mc_to_user() backend since the #MC handler explicitly checks ex_has_fault_handler().
/me is curious to know why #MC handler mandates use of _ASM_EXTABLE_FAULT().
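For readers following along, here is a rough userspace model of the kind of check being discussed. It is only a sketch under assumptions: the names model_extable_entry, model_search_extable() and model_ex_has_fault_handler() are hypothetical, the handler is modelled as an enum rather than whatever the kernel actually stores, and it assumes the check amounts to "look up the exception table entry for the faulting instruction and accept it only if it was registered with the fault-type handler". It does not answer why the #MC handler requires that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handler kinds, used here only to tell entries apart. */
enum model_handler { MODEL_EX_DEFAULT, MODEL_EX_FAULT, MODEL_EX_UACCESS };

/* Hypothetical, simplified exception table entry. */
struct model_extable_entry {
	unsigned long insn;		/* address of the instruction that may fault */
	unsigned long fixup;		/* address to resume at after a fault */
	enum model_handler handler;	/* how the entry was registered */
};

/* Linear search for the entry covering 'ip'; returns NULL if there is none. */
static const struct model_extable_entry *
model_search_extable(const struct model_extable_entry *tbl, size_t n,
		     unsigned long ip)
{
	assert(tbl != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == ip)
			return &tbl[i];
	return NULL;
}

/* True only if 'ip' has an entry registered with the fault-type handler. */
static bool model_ex_has_fault_handler(const struct model_extable_entry *tbl,
				       size_t n, unsigned long ip)
{
	const struct model_extable_entry *e = model_search_extable(tbl, n, ip);

	return e != NULL && e->handler == MODEL_EX_FAULT;
}
```

In this model, an instruction whose entry was registered with any other handler kind is treated the same as an instruction with no entry at all, which matches the observation that a copy routine not annotated with _ASM_EXTABLE_FAULT() would not be considered recoverable.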
[..]
+/*
+ * copy_mc_generic - memory copy with exception handling
+ * Fast string copy + fault / exception handling. If the CPU does
+ * support machine check exception recovery, but does not support
+ * recovering from fast-string exceptions then this CPU needs to be
+ * added to the copy_mc_fragile_key set of quirks. Otherwise, absent any
+ * machine check recovery support this version should be no slower than
+ * standard memcpy.
+ */
+SYM_FUNC_START(copy_mc_generic)
+	ALTERNATIVE "jmp copy_mc_fragile", "", X86_FEATURE_ERMS
+	movq %rdi, %rax
+	movq %rdx, %rcx
+.L_copy:
+	rep movsb
+	/* Copy successful. Return zero */
+	xorl %eax, %eax
+	ret
+SYM_FUNC_END(copy_mc_generic)
+EXPORT_SYMBOL_GPL(copy_mc_generic)
+	.section .fixup, "ax"
+.E_copy:
+	/*
+	 * On fault %rcx is updated such that the copy instruction could
+	 * optionally be restarted at the fault position, i.e. it
+	 * contains 'bytes remaining'. A non-zero return indicates error
+	 * to copy_safe() users, or indicate short transfers to

Is copy_safe() a vestige of the terminology from previous patches?

+	 * user-copy routines.
+	 */
+	movq %rcx, %rax
+	ret
+	.previous
+	_ASM_EXTABLE_FAULT(.L_copy, .E_copy)
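
To make the return convention in the fixup comment concrete, here is a small userspace sketch. It is not kernel code: model_copy_mc(), model_bytes_copied() and NO_POISON are hypothetical names, and the "fault" is simulated by passing the source offset at which the copy is assumed to stop. The only thing it tries to mirror is what the comment states, namely that the routine returns zero on success and 'bytes remaining' on a fault.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical marker meaning "no simulated fault in this copy". */
#define NO_POISON ((size_t)-1)

/*
 * Hypothetical model of the return convention: copy 'len' bytes, but
 * pretend the copy faults when it reaches source offset 'poison_off'.
 * Returns 0 on success, otherwise the number of bytes NOT copied.
 */
static size_t model_copy_mc(void *dst, const void *src, size_t len,
			    size_t poison_off)
{
	size_t done = (poison_off < len) ? poison_off : len;

	memcpy(dst, src, done);
	return len - done;
}

/* How a caller could turn 'bytes remaining' into 'bytes copied'. */
static size_t model_bytes_copied(size_t len, size_t remaining)
{
	assert(remaining <= len);
	return len - remaining;
}
```

A kernel-style caller would treat any non-zero return as an error, while a user-copy style caller would report len minus the return value as a short transfer.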
A question for my own education.
So can copy_mc_generic() handle an MCE on both the source and the destination addresses (assuming some device can generate an MCE on stores too)? copy_mc_fragile(), on the other hand, handles MCE recovery only on the source and only non-MCE faults on the destination.
Thanks
Vivek