On Mon, 20 Jan 2025 09:30:48 -0800 Breno Leitao wrote:
Not sure I followed. The data ({userdata,extradata}_complete) was always inside nt field, which belongs to target_list.
I mean the buffer we use for formatting. Today it's this:
static char buf[MAX_PRINT_CHUNK]; /* protected by target_list_lock */ int header_len, msgbody_len; const char *msgbody;
right? I missed that "static" actually so it's not on the stack, it's in the .bss section.
Since you raised this topic, I don't think buf needs to be static for a functional perspective, since `buf` is completely overwritten every time send_msg functions are called.
It may be because it's relatively big and stack space used to be very limited.
My thinking was to handle it like the release. Print it at the send_msg_no_fragmentation() stage directly into the static buffer. Does that get hairy coding-wise?
I suppose the advantage of doing this approach is to reduce a memcpy/strcpy, right?
Not really, my main motivation is to try to find a common way of how various pieces of the output are protected and handled.
If this is what your motivation, I think we cannot remove it from the fragmented case. Let me share my thought process:
- sysdata needs to be appended to both send_msg_fragmented() and
send_msg_no_fragmentation(). The fragmented case is the problem.
It is trivially done in send_msg_fragmented() case.
For the send_msg_no_fragmentation() case, there is no trivial way to
get it done without using a secondary buffer and then memcpy to `buf`.
Let's suppose sysdata has "cpu=42", and original `buf` has only 5 available chars, thus it needs to have 2 msgs to accommodate the full message.
Then the it needs to track that `cpu=4` will be sent in a msg and create another message with the missing `2`.
The only way to do it properly is having a extra buffer where we have `cpu=42` and copy 5 bytes from there, and then copy the last one in the next iteration. I am not sure we can do it in one shot.
FWIW to simplify reasoning about the length I thought we could take the worst case, assume we'll need len(cpu=) + log10(nr_cpu_ids) of space.
On top of that, I am planning to increase other features in sysdata (such as current task name, modules and even consolidate the release as sysdata), which has two implications:
- Average messages size will become bigger. Thus, memcpy will be needed
one way or another.
- Unless we can come up with a smart solution, this solution will be
harder to reason about.
If you want to invest more time in this direction, I am more than happy to create a PoC, so we can discuss more concretely.
I don't feel super strongly about this. But hacking around is always good to get a sense of how hairy the implementation ends up being.
To rephrase my concern is that we have some data as static on the stack, some dynamically appended at the send_*() stage, now we're adding a third way of handling things. Perhaps the simplest way to make me happy would be to move the bufs which are currently static into nt.