On Fri, Jan 17, 2025 at 06:35:20PM -0800, Jakub Kicinski wrote:
On Fri, 17 Jan 2025 03:02:40 -0800 Breno Leitao wrote:
Looks like previously all the data was on the stack, now we have a mix.
Not sure I followed. The data ({userdata,extradata}_complete) was always inside nt field, which belongs to target_list.
I mean the buffer we use for formatting. Today it's this:
static char buf[MAX_PRINT_CHUNK]; /* protected by target_list_lock */ int header_len, msgbody_len; const char *msgbody;
right? I missed that "static" actually so it's not on the stack, it's in the .bss section.
Since you raised this topic, I don't think buf needs to be static for a functional perspective, since `buf` is completely overwritten every time send_msg functions are called.
Maybe we can pack all the bits of state into a struct for easier passing around, but still put it on the stack?
It depends on what state you need here. We can certainly pass runtime (aka sysdata in this patchset) data in the stack, but doing the same for userdata would require extra computation in runtime. In other words, the userdata_complete and length are calculated at configfs update time today, and only read during runtime, and there is no connection between configfs and runtime (write_ext_msg()) except through the stack.
On the other side, if we want to have extradata_complete in the stack, I still think that userdata will need to be in the stack, and create a buffer in runtime's frame and copy userdata + sysdata at run time, doing an extra copy.
Trying to put this in code, this is what I thought:
/* Copy to the stack (buf) the userdata string + sysdata */ static void append_runtime_sysdata(struct netconsole_target *nt, char *buf) { if (!(nt->sysdata_fields & CPU_NR)) return;
return scnprintf(buf, MAX_EXTRADATA_ENTRY_LEN * MAX_EXTRADATA_ITEMS, "%s cpu=%u\n", nt->userdata_complete, raw_smp_processor_id());
}
/* Move complete string in the stack and send from there */ static void send_ext_msg_udp(struct netconsole_target *nt, const char *msg, int msg_len) { ... #ifdef CONFIG_NETCONSOLE_DYNAMIC struct char buf[MAX_EXTRADATA_ENTRY_LEN * MAX_EXTRADATA_ITEMS]; extradata_len = append_runtime_sysdata(nt, buf); #endif
send_msg_{no}_fragmentation(nt, msg, buf, extradata_len, release_len) ...
}
My thinking was to handle it like the release. Print it at the send_msg_no_fragmentation() stage directly into the static buffer. Does that get hairy coding-wise?
I suppose the advantage of doing this approach is to reduce a memcpy/strcpy, right?
If this is what your motivation, I think we cannot remove it from the fragmented case. Let me share my thought process:
1) sysdata needs to be appended to both send_msg_fragmented() and send_msg_no_fragmentation(). The fragmented case is the problem.
2) It is trivially done in send_msg_fragmented() case.
3) For the send_msg_no_fragmentation() case, there is no trivial way to get it done without using a secondary buffer and then memcpy to `buf`.
Let's suppose sysdata has "cpu=42", and original `buf` has only 5 available chars, thus it needs to have 2 msgs to accommodate the full message.
Then the it needs to track that `cpu=4` will be sent in a msg and create another message with the missing `2`.
The only way to do it properly is having a extra buffer where we have `cpu=42` and copy 5 bytes from there, and then copy the last one in the next iteration. I am not sure we can do it in one shot.
On top of that, I am planning to increase other features in sysdata (such as current task name, modules and even consolidate the release as sysdata), which has two implications:
1) Average messages size will become bigger. Thus, memcpy will be needed one way or another.
2) Unless we can come up with a smart solution, this solution will be harder to reason about.
If you want to invest more time in this direction, I am more than happy to create a PoC, so we can discuss more concretely.
Thanks, --breno