On Tue, Nov 19, 2024 at 2:36 PM Yosry Ahmed yosryahmed@google.com wrote:
On Tue, Nov 19, 2024 at 11:30 AM Pasha Tatashin pasha.tatashin@soleen.com wrote:
On Tue, Nov 19, 2024 at 1:23 PM Roman Gushchin roman.gushchin@linux.dev wrote:
On Tue, Nov 19, 2024 at 10:08:36AM -0500, Pasha Tatashin wrote:
On Mon, Nov 18, 2024 at 8:09 PM Greg KH gregkh@linuxfoundation.org wrote:
On Mon, Nov 18, 2024 at 05:08:42PM -0500, Pasha Tatashin wrote:
Additionally, using crash/drgn is not feasible for us at this time: it requires keeping external tools on our hosts, and it requires approval and a security review for each script before deployment in our fleet.
So it's ok to add a totally insecure kernel feature to your fleet instead? You might want to reconsider that policy decision :)
Hi Greg,
While some risk is inherent, we believe the potential for abuse here is limited, especially given the existing CAP_SYS_ADMIN requirement. But, even with root access compromised, this tool presents a smaller attack surface than alternatives like crash/drgn. It exposes less sensitive information, unlike crash/drgn, which could potentially allow reading all of kernel memory.
The problem here is with using dmesg for output. No security-sensitive information should go there. Even exposing raw kernel pointers is not considered safe.
I am OK with writing the output to a debugfs file in the next version. My only concern is that this implies dump_page() would need to be basically duplicated, as it currently outputs everything via printk().
Perhaps you can refactor the code in dump_page() to use a seq_buf, then have dump_page() printk that seq_buf using seq_buf_do_printk(), and have page detective output that seq_buf to the debugfs file?
Good idea, I will look into modifying it this way.
We do something very similar with memory_stat_format(). We use the same function to generate the memcg stats in a seq_buf, then we use that seq_buf to output the stats to memory.stat as well as the OOM log.

	void mem_cgroup_print_oom_meminfo(struct mem_cgroup *memcg)
	{
		/* Use static buffer, for the caller is holding oom_lock. */
		static char buf[PAGE_SIZE];
		....
		seq_buf_init(&s, buf, sizeof(buf));
		memory_stat_format(memcg, &s);
		seq_buf_do_printk(&s, KERN_INFO);
	}

This is a colossal stack allocation, given that our fleet only has 8K stacks. :-)
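
For readers following the thread, the "format once into a buffer, then pick the sink" idea can be sketched in plain userspace C. This is only an analogue under stated assumptions: struct out_buf, out_buf_init(), out_buf_printf(), struct fake_page, format_page(), sink_log() and sink_file() are hypothetical names made up for illustration, not the kernel's seq_buf API and not the actual dump_page() refactoring. The log sink stands in for a printk()-style path and the file sink for a debugfs-style read.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-in for a bounded output buffer. */
struct out_buf {
	char *data;
	size_t size;	/* capacity, including the terminating NUL */
	size_t len;	/* bytes used, excluding the NUL */
	int overflowed;	/* set once output has been truncated */
};

void out_buf_init(struct out_buf *b, char *data, size_t size)
{
	assert(size > 0);
	b->data = data;
	b->size = size;
	b->len = 0;
	b->overflowed = 0;
	data[0] = '\0';
}

/* Append formatted text; truncate and mark overflow if it does not fit. */
void out_buf_printf(struct out_buf *b, const char *fmt, ...)
{
	size_t room = b->size - b->len;	/* always >= 1 */
	va_list ap;
	int n;

	if (b->overflowed)
		return;
	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, ap);
	va_end(ap);
	if (n < 0) {
		b->overflowed = 1;
		b->data[b->len] = '\0';
	} else if ((size_t)n >= room) {
		b->overflowed = 1;
		b->len = b->size - 1;	/* vsnprintf already NUL-terminated */
	} else {
		b->len += (size_t)n;
	}
}

/* Hypothetical, heavily simplified "page" with a few fields to dump. */
struct fake_page {
	unsigned long pfn;
	int refcount;
	int mapcount;
};

/* The single formatter: it knows nothing about where the text ends up. */
void format_page(struct out_buf *b, const struct fake_page *p)
{
	out_buf_printf(b, "page: pfn:%#lx refcount:%d mapcount:%d\n",
		       p->pfn, p->refcount, p->mapcount);
}

/* Log-style sink: emit the buffer line by line, each with a prefix. */
void sink_log(const struct out_buf *b, FILE *log, const char *prefix)
{
	const char *line = b->data;

	while (*line) {
		size_t n = strcspn(line, "\n");

		fprintf(log, "%s%.*s\n", prefix, (int)n, line);
		line += n;
		if (*line == '\n')
			line++;
	}
}

/* File-style sink: write the buffer verbatim, return bytes written. */
size_t sink_file(const struct out_buf *b, FILE *f)
{
	return fwrite(b->data, 1, b->len, f);
}
```

A caller would initialize an out_buf over whatever storage it owns, call format_page() once, and then hand the same buffer to sink_log() or sink_file(). Where the storage lives (static, heap, or stack) is the caller's decision and is independent of the formatter.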