On Thursday 24 September 2015 04:02:09 Drokin, Oleg wrote:
The lustre tracefile has a timestamp defined as
	__u32 ph_sec;
	__u64 ph_usec;
which seems completely backwards, as the microsecond portion of a time stamp will always fit into a __u32 value, while the second portion will overflow in 2038 or 2106 (in case of unsigned seconds).
This rectifies the situation by swapping out the types to have 64-bit seconds like everything else.
While this constitutes an ABI change, it seems to be reasonable for a debugging interface to change and is likely what was originally intended.
This is going to wreak some havoc, as the old tools would obviously misrepresent this. But the new tools also cannot blindly assume this change is in place, since people tend to stick to old lustre modules for a long time in production for various reasons, while the tools might get upgraded. So I wonder if we should include some sort of a hint somewhere that lctl could read to see which format it's going to convert from. Either that, or we'd need to play with some heuristic in the tools to observe where the leading zeros are (in little endian) in one case and the other (if the year is not quite 2038 yet) and make a decision based on that.
Ok, I see.
If you can prove that the user space tools interpret this value as an unsigned 32-bit number, that would work until 2106 and we could document it as a restriction that way.
Another option would be to change the code storing the times there to do:
	header->ph_sec = (u32)ts.tv_sec;
	header->ph_usec = (ts.tv_sec & 0xffffffff00000000ull) | (ts.tv_nsec / NSEC_PER_USEC);
and do the reverse on the user space side. This would be both endian-safe and backwards compatible, although rather ugly.
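
To make "the reverse" concrete, here is a rough, untested sketch of both directions in plain C. The struct and function names (trace_stamp, stamp_encode, stamp_decode) are made up for illustration and are not the actual lustre definitions; only the two field names and their widths come from the existing header. It assumes the microsecond value always fits in the low 32 bits of ph_usec, which holds because it is below 1000000.

```c
#include <stdint.h>

#define SEC_HI_MASK 0xffffffff00000000ULL  /* upper half of ph_usec carries seconds */
#define USEC_MASK   0x00000000ffffffffULL  /* lower half of ph_usec carries usecs */

/* Hypothetical stand-in for the two timestamp fields of the trace header. */
struct trace_stamp {
	uint32_t ph_sec;   /* low 32 bits of the seconds */
	uint64_t ph_usec;  /* high 32 bits of the seconds | microseconds */
};

/* Kernel side: split 64-bit seconds across the two existing fields. */
static void stamp_encode(struct trace_stamp *h, int64_t sec, int64_t nsec)
{
	h->ph_sec = (uint32_t)sec;
	h->ph_usec = ((uint64_t)sec & SEC_HI_MASK) | (uint64_t)(nsec / 1000);
}

/* User space side: reassemble the 64-bit seconds and the microseconds. */
static void stamp_decode(const struct trace_stamp *h, int64_t *sec, uint32_t *usec)
{
	*sec = (int64_t)((h->ph_usec & SEC_HI_MASK) | h->ph_sec);
	*usec = (uint32_t)(h->ph_usec & USEC_MASK);
}
```

For any time before 2106 the upper half of ph_usec stays zero, so an old tool that reads ph_sec as unsigned sees exactly the values it sees today. After that, an old tool would show a bogus microsecond value but a new tool decodes correctly.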
Arnd