From: Wenjing Liu <wenjing.liu(a)amd.com>
[why]
When populating dml pipes, odm combine policy should be assigned based
on the pipe topology of the context passed in. DML pipes could be
repopulated multiple times during single validate bandwidth attempt. We
need to make sure that whenever we repopulate the dml pipes it is always
aligned with the updated context. There is a case where DML pipes get
repopulated during FPO optimization after ODM combine policy is changed.
Since in the current code we reinitlaize ODM combine policy, even though
the current context has ODM combine enabled, we overwrite it despite the
pipes are already split. This causes DML to think that MPC combine is
used so we mistakenly enable MPC combine because we apply pipe split
with ODM combine policy reset. This issue doesn't impact non windowed
MPO with ODM case because the legacy policy has restricted use cases. We
don't encounter the case where both ODM and FPO optimizations are
enabled together. So we decide to leave it as is because it is about to
be replaced anyway.
Cc: stable(a)vger.kernel.org # 6.6+
Reviewed-by: Chaitanya Dhere <chaitanya.dhere(a)amd.com>
Reviewed-by: Alvin Lee <alvin.lee2(a)amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu(a)amd.com>
---
.../drm/amd/display/dc/dml/dcn32/dcn32_fpu.c | 15 ++++++++++----
drivers/gpu/drm/amd/display/dc/inc/resource.h | 20 ++++++++-----------
.../dc/resource/dcn32/dcn32_resource.c | 16 ++++++++++++++-
3 files changed, 34 insertions(+), 17 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
index a7981a0c4158..4edf7df4c6aa 100644
--- a/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
+++ b/drivers/gpu/drm/amd/display/dc/dml/dcn32/dcn32_fpu.c
@@ -1289,7 +1289,7 @@ static bool update_pipes_with_split_flags(struct dc *dc, struct dc_state *contex
return updated;
}
-static bool should_allow_odm_power_optimization(struct dc *dc,
+static bool should_apply_odm_power_optimization(struct dc *dc,
struct dc_state *context, struct vba_vars_st *v, int *split,
bool *merge)
{
@@ -1393,9 +1393,12 @@ static void try_odm_power_optimization_and_revalidate(
{
int i;
unsigned int new_vlevel;
+ unsigned int cur_policy[MAX_PIPES];
- for (i = 0; i < pipe_cnt; i++)
+ for (i = 0; i < pipe_cnt; i++) {
+ cur_policy[i] = pipes[i].pipe.dest.odm_combine_policy;
pipes[i].pipe.dest.odm_combine_policy = dm_odm_combine_policy_2to1;
+ }
new_vlevel = dml_get_voltage_level(&context->bw_ctx.dml, pipes, pipe_cnt);
@@ -1404,6 +1407,9 @@ static void try_odm_power_optimization_and_revalidate(
memset(merge, 0, MAX_PIPES * sizeof(bool));
*vlevel = dcn20_validate_apply_pipe_split_flags(dc, context, new_vlevel, split, merge);
context->bw_ctx.dml.vba.VoltageLevel = *vlevel;
+ } else {
+ for (i = 0; i < pipe_cnt; i++)
+ pipes[i].pipe.dest.odm_combine_policy = cur_policy[i];
}
}
@@ -1581,7 +1587,7 @@ static void dcn32_full_validate_bw_helper(struct dc *dc,
}
}
- if (should_allow_odm_power_optimization(dc, context, vba, split, merge))
+ if (should_apply_odm_power_optimization(dc, context, vba, split, merge))
try_odm_power_optimization_and_revalidate(
dc, context, pipes, split, merge, vlevel, *pipe_cnt);
@@ -2210,7 +2216,8 @@ bool dcn32_internal_validate_bw(struct dc *dc,
int i;
pipe_cnt = dc->res_pool->funcs->populate_dml_pipes(dc, context, pipes, fast_validate);
- dcn32_update_dml_pipes_odm_policy_based_on_context(dc, context, pipes);
+ if (!dc->config.enable_windowed_mpo_odm)
+ dcn32_update_dml_pipes_odm_policy_based_on_context(dc, context, pipes);
/* repopulate_pipes = 1 means the pipes were either split or merged. In this case
* we have to re-calculate the DET allocation and run through DML once more to
diff --git a/drivers/gpu/drm/amd/display/dc/inc/resource.h b/drivers/gpu/drm/amd/display/dc/inc/resource.h
index 1d51fed12e20..2eae2f3e846d 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/resource.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/resource.h
@@ -427,22 +427,18 @@ struct pipe_ctx *resource_get_primary_dpp_pipe(const struct pipe_ctx *dpp_pipe);
int resource_get_mpc_slice_index(const struct pipe_ctx *dpp_pipe);
/*
- * Get number of MPC "cuts" of the plane associated with the pipe. MPC slice
- * count is equal to MPC splits + 1. For example if a plane is cut 3 times, it
- * will have 4 pieces of slice.
- * return - 0 if pipe is not used for a plane with MPCC combine. otherwise
- * the number of MPC "cuts" for the plane.
+ * Get the number of MPC slices associated with the pipe.
+ * The function returns 0 if the pipe is not associated with an MPC combine
+ * pipe topology.
*/
-int resource_get_mpc_slice_count(const struct pipe_ctx *opp_head);
+int resource_get_mpc_slice_count(const struct pipe_ctx *pipe);
/*
- * Get number of ODM "cuts" of the timing associated with the pipe. ODM slice
- * count is equal to ODM splits + 1. For example if a timing is cut 3 times, it
- * will have 4 pieces of slice.
- * return - 0 if pipe is not used for ODM combine. otherwise
- * the number of ODM "cuts" for the timing.
+ * Get the number of ODM slices associated with the pipe.
+ * The function returns 0 if the pipe is not associated with an ODM combine
+ * pipe topology.
*/
-int resource_get_odm_slice_count(const struct pipe_ctx *otg_master);
+int resource_get_odm_slice_count(const struct pipe_ctx *pipe);
/* Get the ODM slice index counting from 0 from left most slice */
int resource_get_odm_slice_index(const struct pipe_ctx *opp_head);
diff --git a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
index ac04a9c9a3d8..71cd20618bfe 100644
--- a/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/resource/dcn32/dcn32_resource.c
@@ -1829,7 +1829,21 @@ int dcn32_populate_dml_pipes_from_context(
dcn32_zero_pipe_dcc_fraction(pipes, pipe_cnt);
DC_FP_END();
pipes[pipe_cnt].pipe.dest.vfront_porch = timing->v_front_porch;
- pipes[pipe_cnt].pipe.dest.odm_combine_policy = dm_odm_combine_policy_dal;
+ if (dc->config.enable_windowed_mpo_odm &&
+ dc->debug.enable_single_display_2to1_odm_policy) {
+ switch (resource_get_odm_slice_count(pipe)) {
+ case 2:
+ pipes[pipe_cnt].pipe.dest.odm_combine_policy = dm_odm_combine_policy_2to1;
+ break;
+ case 4:
+ pipes[pipe_cnt].pipe.dest.odm_combine_policy = dm_odm_combine_policy_4to1;
+ break;
+ default:
+ pipes[pipe_cnt].pipe.dest.odm_combine_policy = dm_odm_combine_policy_dal;
+ }
+ } else {
+ pipes[pipe_cnt].pipe.dest.odm_combine_policy = dm_odm_combine_policy_dal;
+ }
pipes[pipe_cnt].pipe.src.gpuvm_min_page_size_kbytes = 256; // according to spreadsheet
pipes[pipe_cnt].pipe.src.unbounded_req_mode = false;
pipes[pipe_cnt].pipe.scale_ratio_depth.lb_depth = dm_lb_19;
--
2.43.0
From: Linus Torvalds <torvalds(a)linux-foundation.org>
The eventfs inode had pointers to dentries (and child dentries) without
actually holding a refcount on said pointer. That is fundamentally
broken, and while eventfs tried to then maintain coherence with dentries
going away by hooking into the '.d_iput' callback, that doesn't actually
work since it's not ordered wrt lookups.
There were two reasonms why eventfs tried to keep a pointer to a dentry:
- the creation of a 'events' directory would actually have a stable
dentry pointer that it created with tracefs_start_creating().
And it needed that dentry when tearing it all down again in
eventfs_remove_events_dir().
This use is actually ok, because the special top-level events
directory dentries are actually stable, not just a temporary cache of
the eventfs data structures.
- the 'eventfs_inode' (aka ei) needs to stay around as long as there
are dentries that refer to it.
It then used these dentry pointers as a replacement for doing
reference counting: it would try to make sure that there was only
ever one dentry associated with an event_inode, and keep a child
dentry array around to see which dentries might still refer to the
parent ei.
This gets rid of the invalid dentry pointer use, and renames the one
valid case to a different name to make it clear that it's not just any
random dentry.
The magic child dentry array that is kind of a "reverse reference list"
is simply replaced by having child dentries take a ref to the ei. As
does the directory dentries. That makes the broken use case go away.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.san…
Cc: stable(a)vger.kernel.org
Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions")
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v1: https://lore.kernel.org/linux-trace-kernel/20240130190355.11486-5-torvalds@…
- Put back the kstrdup_const()
- use kfree_rcu(ei, rcu);
- Replace simple_recursive_removal() with d_invalidate().
fs/tracefs/event_inode.c | 247 ++++++++++++---------------------------
fs/tracefs/internal.h | 7 +-
2 files changed, 77 insertions(+), 177 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 0213a3375d53..31cbe38739fa 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -62,6 +62,35 @@ enum {
#define EVENTFS_MODE_MASK (EVENTFS_SAVE_MODE - 1)
+/*
+ * eventfs_inode reference count management.
+ *
+ * NOTE! We count only references from dentries, in the
+ * form 'dentry->d_fsdata'. There are also references from
+ * directory inodes ('ti->private'), but the dentry reference
+ * count is always a superset of the inode reference count.
+ */
+static void release_ei(struct kref *ref)
+{
+ struct eventfs_inode *ei = container_of(ref, struct eventfs_inode, kref);
+ kfree(ei->entry_attrs);
+ kfree_const(ei->name);
+ kfree_rcu(ei, rcu);
+}
+
+static inline void put_ei(struct eventfs_inode *ei)
+{
+ if (ei)
+ kref_put(&ei->kref, release_ei);
+}
+
+static inline struct eventfs_inode *get_ei(struct eventfs_inode *ei)
+{
+ if (ei)
+ kref_get(&ei->kref);
+ return ei;
+}
+
static struct dentry *eventfs_root_lookup(struct inode *dir,
struct dentry *dentry,
unsigned int flags);
@@ -289,7 +318,8 @@ static void update_inode_attr(struct dentry *dentry, struct inode *inode,
* directory. The inode.i_private pointer will point to @data in the open()
* call.
*/
-static struct dentry *lookup_file(struct dentry *dentry,
+static struct dentry *lookup_file(struct eventfs_inode *parent_ei,
+ struct dentry *dentry,
umode_t mode,
struct eventfs_attr *attr,
void *data,
@@ -302,7 +332,7 @@ static struct dentry *lookup_file(struct dentry *dentry,
mode |= S_IFREG;
if (WARN_ON_ONCE(!S_ISREG(mode)))
- return NULL;
+ return ERR_PTR(-EIO);
inode = tracefs_get_inode(dentry->d_sb);
if (unlikely(!inode))
@@ -321,9 +351,12 @@ static struct dentry *lookup_file(struct dentry *dentry,
ti = get_tracefs(inode);
ti->flags |= TRACEFS_EVENT_INODE;
+ // Files have their parent's ei as their fsdata
+ dentry->d_fsdata = get_ei(parent_ei);
+
d_add(dentry, inode);
fsnotify_create(dentry->d_parent->d_inode, dentry);
- return dentry;
+ return NULL;
};
/**
@@ -359,22 +392,29 @@ static struct dentry *lookup_dir_entry(struct dentry *dentry,
/* Only directories have ti->private set to an ei, not files */
ti->private = ei;
- dentry->d_fsdata = ei;
- ei->dentry = dentry; // Remove me!
+ dentry->d_fsdata = get_ei(ei);
inc_nlink(inode);
d_add(dentry, inode);
inc_nlink(dentry->d_parent->d_inode);
fsnotify_mkdir(dentry->d_parent->d_inode, dentry);
- return dentry;
+ return NULL;
}
-static void free_ei(struct eventfs_inode *ei)
+static inline struct eventfs_inode *alloc_ei(const char *name)
{
- kfree_const(ei->name);
- kfree(ei->d_children);
- kfree(ei->entry_attrs);
- kfree(ei);
+ struct eventfs_inode *ei = kzalloc(sizeof(*ei), GFP_KERNEL);
+
+ if (!ei)
+ return NULL;
+
+ ei->name = kstrdup_const(name, GFP_KERNEL);
+ if (!ei->name) {
+ kfree(ei);
+ return NULL;
+ }
+ kref_init(&ei->kref);
+ return ei;
}
/**
@@ -385,39 +425,13 @@ static void free_ei(struct eventfs_inode *ei)
*/
void eventfs_d_release(struct dentry *dentry)
{
- struct eventfs_inode *ei;
- int i;
-
- mutex_lock(&eventfs_mutex);
-
- ei = dentry->d_fsdata;
- if (!ei)
- goto out;
-
- /* This could belong to one of the files of the ei */
- if (ei->dentry != dentry) {
- for (i = 0; i < ei->nr_entries; i++) {
- if (ei->d_children[i] == dentry)
- break;
- }
- if (WARN_ON_ONCE(i == ei->nr_entries))
- goto out;
- ei->d_children[i] = NULL;
- } else if (ei->is_freed) {
- free_ei(ei);
- } else {
- ei->dentry = NULL;
- }
-
- dentry->d_fsdata = NULL;
- out:
- mutex_unlock(&eventfs_mutex);
+ put_ei(dentry->d_fsdata);
}
/**
* lookup_file_dentry - create a dentry for a file of an eventfs_inode
* @ei: the eventfs_inode that the file will be created under
- * @idx: the index into the d_children[] of the @ei
+ * @idx: the index into the entry_attrs[] of the @ei
* @parent: The parent dentry of the created file.
* @name: The name of the file to create
* @mode: The mode of the file.
@@ -434,17 +448,11 @@ lookup_file_dentry(struct dentry *dentry,
const struct file_operations *fops)
{
struct eventfs_attr *attr = NULL;
- struct dentry **e_dentry = &ei->d_children[idx];
if (ei->entry_attrs)
attr = &ei->entry_attrs[idx];
- dentry->d_fsdata = ei; // NOTE: ei of _parent_
- lookup_file(dentry, mode, attr, data, fops);
-
- *e_dentry = dentry; // Remove me
-
- return dentry;
+ return lookup_file(ei, dentry, mode, attr, data, fops);
}
/**
@@ -465,7 +473,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
struct tracefs_inode *ti;
struct eventfs_inode *ei;
const char *name = dentry->d_name.name;
- struct dentry *result = NULL;
+ struct dentry *result;
ti = get_tracefs(dir);
if (!(ti->flags & TRACEFS_EVENT_INODE))
@@ -482,7 +490,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
continue;
if (ei_child->is_freed)
goto enoent;
- lookup_dir_entry(dentry, ei, ei_child);
+ result = lookup_dir_entry(dentry, ei, ei_child);
goto out;
}
@@ -499,7 +507,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
if (entry->callback(name, &mode, &data, &fops) <= 0)
goto enoent;
- lookup_file_dentry(dentry, ei, i, mode, data, fops);
+ result = lookup_file_dentry(dentry, ei, i, mode, data, fops);
goto out;
}
@@ -659,25 +667,10 @@ struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode
if (!parent)
return ERR_PTR(-EINVAL);
- ei = kzalloc(sizeof(*ei), GFP_KERNEL);
+ ei = alloc_ei(name);
if (!ei)
return ERR_PTR(-ENOMEM);
- ei->name = kstrdup_const(name, GFP_KERNEL);
- if (!ei->name) {
- kfree(ei);
- return ERR_PTR(-ENOMEM);
- }
-
- if (size) {
- ei->d_children = kcalloc(size, sizeof(*ei->d_children), GFP_KERNEL);
- if (!ei->d_children) {
- kfree_const(ei->name);
- kfree(ei);
- return ERR_PTR(-ENOMEM);
- }
- }
-
ei->entries = entries;
ei->nr_entries = size;
ei->data = data;
@@ -691,7 +684,7 @@ struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode
/* Was the parent freed? */
if (list_empty(&ei->list)) {
- free_ei(ei);
+ put_ei(ei);
ei = NULL;
}
return ei;
@@ -726,28 +719,20 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
if (IS_ERR(dentry))
return ERR_CAST(dentry);
- ei = kzalloc(sizeof(*ei), GFP_KERNEL);
+ ei = alloc_ei(name);
if (!ei)
- goto fail_ei;
+ goto fail;
inode = tracefs_get_inode(dentry->d_sb);
if (unlikely(!inode))
goto fail;
- if (size) {
- ei->d_children = kcalloc(size, sizeof(*ei->d_children), GFP_KERNEL);
- if (!ei->d_children)
- goto fail;
- }
-
- ei->dentry = dentry;
+ // Note: we have a ref to the dentry from tracefs_start_creating()
+ ei->events_dir = dentry;
ei->entries = entries;
ei->nr_entries = size;
ei->is_events = 1;
ei->data = data;
- ei->name = kstrdup_const(name, GFP_KERNEL);
- if (!ei->name)
- goto fail;
/* Save the ownership of this directory */
uid = d_inode(dentry->d_parent)->i_uid;
@@ -778,7 +763,7 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
inode->i_op = &eventfs_root_dir_inode_operations;
inode->i_fop = &eventfs_file_operations;
- dentry->d_fsdata = ei;
+ dentry->d_fsdata = get_ei(ei);
/* directory inodes start off with i_nlink == 2 (for "." entry) */
inc_nlink(inode);
@@ -790,72 +775,11 @@ struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry
return ei;
fail:
- kfree(ei->d_children);
- kfree(ei);
- fail_ei:
+ put_ei(ei);
tracefs_failed_creating(dentry);
return ERR_PTR(-ENOMEM);
}
-static LLIST_HEAD(free_list);
-
-static void eventfs_workfn(struct work_struct *work)
-{
- struct eventfs_inode *ei, *tmp;
- struct llist_node *llnode;
-
- llnode = llist_del_all(&free_list);
- llist_for_each_entry_safe(ei, tmp, llnode, llist) {
- /* This dput() matches the dget() from unhook_dentry() */
- for (int i = 0; i < ei->nr_entries; i++) {
- if (ei->d_children[i])
- dput(ei->d_children[i]);
- }
- /* This should only get here if it had a dentry */
- if (!WARN_ON_ONCE(!ei->dentry))
- dput(ei->dentry);
- }
-}
-
-static DECLARE_WORK(eventfs_work, eventfs_workfn);
-
-static void free_rcu_ei(struct rcu_head *head)
-{
- struct eventfs_inode *ei = container_of(head, struct eventfs_inode, rcu);
-
- if (ei->dentry) {
- /* Do not free the ei until all references of dentry are gone */
- if (llist_add(&ei->llist, &free_list))
- queue_work(system_unbound_wq, &eventfs_work);
- return;
- }
-
- /* If the ei doesn't have a dentry, neither should its children */
- for (int i = 0; i < ei->nr_entries; i++) {
- WARN_ON_ONCE(ei->d_children[i]);
- }
-
- free_ei(ei);
-}
-
-static void unhook_dentry(struct dentry *dentry)
-{
- if (!dentry)
- return;
- /*
- * Need to add a reference to the dentry that is expected by
- * simple_recursive_removal(), which will include a dput().
- */
- dget(dentry);
-
- /*
- * Also add a reference for the dput() in eventfs_workfn().
- * That is required as that dput() will free the ei after
- * the SRCU grace period is over.
- */
- dget(dentry);
-}
-
/**
* eventfs_remove_rec - remove eventfs dir or file from list
* @ei: eventfs_inode to be removed.
@@ -868,8 +792,6 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, int level)
{
struct eventfs_inode *ei_child;
- if (!ei)
- return;
/*
* Check recursion depth. It should never be greater than 3:
* 0 - events/
@@ -881,28 +803,12 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, int level)
return;
/* search for nested folders or files */
- list_for_each_entry_srcu(ei_child, &ei->children, list,
- lockdep_is_held(&eventfs_mutex)) {
- /* Children only have dentry if parent does */
- WARN_ON_ONCE(ei_child->dentry && !ei->dentry);
+ list_for_each_entry(ei_child, &ei->children, list)
eventfs_remove_rec(ei_child, level + 1);
- }
-
ei->is_freed = 1;
-
- for (int i = 0; i < ei->nr_entries; i++) {
- if (ei->d_children[i]) {
- /* Children only have dentry if parent does */
- WARN_ON_ONCE(!ei->dentry);
- unhook_dentry(ei->d_children[i]);
- }
- }
-
- unhook_dentry(ei->dentry);
-
- list_del_rcu(&ei->list);
- call_srcu(&eventfs_srcu, &ei->rcu, free_rcu_ei);
+ list_del(&ei->list);
+ put_ei(ei);
}
/**
@@ -913,22 +819,12 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, int level)
*/
void eventfs_remove_dir(struct eventfs_inode *ei)
{
- struct dentry *dentry;
-
if (!ei)
return;
mutex_lock(&eventfs_mutex);
- dentry = ei->dentry;
eventfs_remove_rec(ei, 0);
mutex_unlock(&eventfs_mutex);
-
- /*
- * If any of the ei children has a dentry, then the ei itself
- * must have a dentry.
- */
- if (dentry)
- simple_recursive_removal(dentry, NULL);
}
/**
@@ -941,7 +837,11 @@ void eventfs_remove_events_dir(struct eventfs_inode *ei)
{
struct dentry *dentry;
- dentry = ei->dentry;
+ dentry = ei->events_dir;
+ if (!dentry)
+ return;
+
+ ei->events_dir = NULL;
eventfs_remove_dir(ei);
/*
@@ -951,5 +851,6 @@ void eventfs_remove_events_dir(struct eventfs_inode *ei)
* sticks around while the other ei->dentry are created
* and destroyed dynamically.
*/
+ d_invalidate(dentry);
dput(dentry);
}
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 4b50a0668055..1886f1826cd8 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -35,8 +35,7 @@ struct eventfs_attr {
* @entries: the array of entries representing the files in the directory
* @name: the name of the directory to create
* @children: link list into the child eventfs_inode
- * @dentry: the dentry of the directory
- * @d_children: The array of dentries to represent the files when created
+ * @events_dir: the dentry of the events directory
* @entry_attrs: Saved mode and ownership of the @d_children
* @attr: Saved mode and ownership of eventfs_inode itself
* @data: The private data to pass to the callbacks
@@ -45,12 +44,12 @@ struct eventfs_attr {
* @nr_entries: The number of items in @entries
*/
struct eventfs_inode {
+ struct kref kref;
struct list_head list;
const struct eventfs_entry *entries;
const char *name;
struct list_head children;
- struct dentry *dentry; /* Check is_freed to access */
- struct dentry **d_children;
+ struct dentry *events_dir;
struct eventfs_attr *entry_attrs;
struct eventfs_attr attr;
void *data;
--
2.43.0
From: Linus Torvalds <torvalds(a)linux-foundation.org>
In order for the dentries to stay up-to-date with the eventfs changes,
just add a 'd_revalidate' function that checks the 'is_freed' bit.
Also, clean up the dentry release to actually use d_release() rather
than the slightly odd d_iput() function. We don't care about the inode,
all we want to do is to get rid of the refcount to the eventfs data
added by dentry->d_fsdata.
It would probably be cleaner to make eventfs its own filesystem, or at
least set its own dentry ops when looking up eventfs files. But as it
is, only eventfs dentries use d_fsdata, so we don't really need to split
these things up by use.
Another thing that might be worth doing is to make all eventfs lookups
mark their dentries as not worth caching. We could do that with
d_delete(), but the DCACHE_DONTCACHE flag would likely be even better.
As it is, the dentries are all freeable, but they only tend to get freed
at memory pressure rather than more proactively. But that's a separate
issue.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.san…
Cc: stable(a)vger.kernel.org
Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions")
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Original-patch: https://lore.kernel.org/linux-trace-kernel/20240130190355.11486-6-torvalds@…
fs/tracefs/event_inode.c | 5 ++---
fs/tracefs/inode.c | 27 ++++++++++++++++++---------
fs/tracefs/internal.h | 3 ++-
3 files changed, 22 insertions(+), 13 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 0289ec787367..0213a3375d53 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -378,13 +378,12 @@ static void free_ei(struct eventfs_inode *ei)
}
/**
- * eventfs_set_ei_status_free - remove the dentry reference from an eventfs_inode
- * @ti: the tracefs_inode of the dentry
+ * eventfs_d_release - dentry is going away
* @dentry: dentry which has the reference to remove.
*
* Remove the association between a dentry from an eventfs_inode.
*/
-void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry)
+void eventfs_d_release(struct dentry *dentry)
{
struct eventfs_inode *ei;
int i;
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 5c84460feeeb..d65ffad4c327 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -377,21 +377,30 @@ static const struct super_operations tracefs_super_operations = {
.show_options = tracefs_show_options,
};
-static void tracefs_dentry_iput(struct dentry *dentry, struct inode *inode)
+/*
+ * It would be cleaner if eventfs had its own dentry ops.
+ *
+ * Note that d_revalidate is called potentially under RCU,
+ * so it can't take the eventfs mutex etc. It's fine - if
+ * we open a file just as it's marked dead, things will
+ * still work just fine, and just see the old stale case.
+ */
+static void tracefs_d_release(struct dentry *dentry)
{
- struct tracefs_inode *ti;
+ if (dentry->d_fsdata)
+ eventfs_d_release(dentry);
+}
- if (!dentry || !inode)
- return;
+static int tracefs_d_revalidate(struct dentry *dentry, unsigned int flags)
+{
+ struct eventfs_inode *ei = dentry->d_fsdata;
- ti = get_tracefs(inode);
- if (ti && ti->flags & TRACEFS_EVENT_INODE)
- eventfs_set_ei_status_free(ti, dentry);
- iput(inode);
+ return !(ei && ei->is_freed);
}
static const struct dentry_operations tracefs_dentry_operations = {
- .d_iput = tracefs_dentry_iput,
+ .d_revalidate = tracefs_d_revalidate,
+ .d_release = tracefs_d_release,
};
static int trace_fill_super(struct super_block *sb, void *data, int silent)
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 932733a2696a..4b50a0668055 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -78,6 +78,7 @@ struct dentry *tracefs_start_creating(const char *name, struct dentry *parent);
struct dentry *tracefs_end_creating(struct dentry *dentry);
struct dentry *tracefs_failed_creating(struct dentry *dentry);
struct inode *tracefs_get_inode(struct super_block *sb);
-void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry *dentry);
+
+void eventfs_d_release(struct dentry *dentry);
#endif /* _TRACEFS_INTERNAL_H */
--
2.43.0
From: Linus Torvalds <torvalds(a)linux-foundation.org>
The eventfs_find_events() code tries to walk up the tree to find the
event directory that a dentry belongs to, in order to then find the
eventfs inode that is associated with that event directory.
However, it uses an odd combination of walking the dentry parent,
looking up the eventfs inode associated with that, and then looking up
the dentry from there. Repeat.
But the code shouldn't have back-pointers to dentries in the first
place, and it should just walk the dentry parenthood chain directly.
Similarly, 'set_top_events_ownership()' looks up the dentry from the
eventfs inode, but the only reason it wants a dentry is to look up the
superblock in order to look up the root dentry.
But it already has the real filesystem inode, which has that same
superblock pointer. So just pass in the superblock pointer using the
information that's already there, instead of looking up extraneous data
that is irrelevant.
Link: https://lore.kernel.org/linux-trace-kernel/202401291043.e62e89dc-oliver.san…
Cc: stable(a)vger.kernel.org
Fixes: c1504e510238 ("eventfs: Implement eventfs dir creation functions")
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Original patch: https://lore.kernel.org/linux-trace-kernel/20240130190355.11486-1-torvalds@…
fs/tracefs/event_inode.c | 26 ++++++++++++--------------
1 file changed, 12 insertions(+), 14 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 824b1811e342..e9819d719d2a 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -156,33 +156,30 @@ static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
return ret;
}
-static void update_top_events_attr(struct eventfs_inode *ei, struct dentry *dentry)
+static void update_top_events_attr(struct eventfs_inode *ei, struct super_block *sb)
{
- struct inode *inode;
+ struct inode *root;
/* Only update if the "events" was on the top level */
if (!ei || !(ei->attr.mode & EVENTFS_TOPLEVEL))
return;
/* Get the tracefs root inode. */
- inode = d_inode(dentry->d_sb->s_root);
- ei->attr.uid = inode->i_uid;
- ei->attr.gid = inode->i_gid;
+ root = d_inode(sb->s_root);
+ ei->attr.uid = root->i_uid;
+ ei->attr.gid = root->i_gid;
}
static void set_top_events_ownership(struct inode *inode)
{
struct tracefs_inode *ti = get_tracefs(inode);
struct eventfs_inode *ei = ti->private;
- struct dentry *dentry;
/* The top events directory doesn't get automatically updated */
if (!ei || !ei->is_events || !(ei->attr.mode & EVENTFS_TOPLEVEL))
return;
- dentry = ei->dentry;
-
- update_top_events_attr(ei, dentry);
+ update_top_events_attr(ei, inode->i_sb);
if (!(ei->attr.mode & EVENTFS_SAVE_UID))
inode->i_uid = ei->attr.uid;
@@ -235,8 +232,10 @@ static struct eventfs_inode *eventfs_find_events(struct dentry *dentry)
mutex_lock(&eventfs_mutex);
do {
- /* The parent always has an ei, except for events itself */
- ei = dentry->d_parent->d_fsdata;
+ // The parent is stable because we do not do renames
+ dentry = dentry->d_parent;
+ // ... and directories always have d_fsdata
+ ei = dentry->d_fsdata;
/*
* If the ei is being freed, the ownership of the children
@@ -246,12 +245,11 @@ static struct eventfs_inode *eventfs_find_events(struct dentry *dentry)
ei = NULL;
break;
}
-
- dentry = ei->dentry;
+ // Walk upwards until you find the events inode
} while (!ei->is_events);
mutex_unlock(&eventfs_mutex);
- update_top_events_attr(ei, dentry);
+ update_top_events_attr(ei, dentry->d_sb);
return ei;
}
--
2.43.0
From: Linus Torvalds <torvalds(a)linux-foundation.org>
The tracefs-specific fields in the inode were not initialized before the
inode was exposed to others through the dentry with 'd_instantiate()'.
Move the field initializations up to before the d_instantiate.
Cc: stable(a)vger.kernel.org
Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202401291043.e62e89dc-oliver.sang@intel.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since v1: https://lore.kernel.org/linux-trace-kernel/20240130190355.11486-2-torvalds@…
- Since another patch zeroed out the entire tracefs_inode, there's no need
to initialize any of its fields to NULL.
fs/tracefs/event_inode.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 1c3dd0ad4660..824b1811e342 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -370,6 +370,8 @@ static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry *parent
ti = get_tracefs(inode);
ti->flags |= TRACEFS_EVENT_INODE;
+ /* Only directories have ti->private set to an ei, not files */
+ ti->private = ei;
inc_nlink(inode);
d_instantiate(dentry, inode);
@@ -515,7 +517,6 @@ create_file_dentry(struct eventfs_inode *ei, int idx,
static void eventfs_post_create_dir(struct eventfs_inode *ei)
{
struct eventfs_inode *ei_child;
- struct tracefs_inode *ti;
lockdep_assert_held(&eventfs_mutex);
@@ -525,9 +526,6 @@ static void eventfs_post_create_dir(struct eventfs_inode *ei)
srcu_read_lock_held(&eventfs_srcu)) {
ei_child->d_parent = ei->dentry;
}
-
- ti = get_tracefs(ei->dentry->d_inode);
- ti->private = ei;
}
/**
--
2.43.0
From: "Steven Rostedt (Google)" <rostedt(a)goodmis.org>
eventfs uses the tracefs_inode and assumes that it's already initialized
to zero. That is, it doesn't set fields to zero (like ti->private) after
getting its tracefs_inode. This causes bugs due to stale values.
Just initialize the entire structure to zero on allocation so there isn't
any more surprises.
This is a partial fix to access to ti->private. The assignment still needs
to be made before the dentry is instantiated.
Cc: stable(a)vger.kernel.org
Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202401291043.e62e89dc-oliver.sang@intel.com
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt(a)goodmis.org>
---
Changes since last version: https://lore.kernel.org/all/20240130230612.377a1933@gandalf.local.home/
- Moved vfs_inode to top of tracefs_inode structure so that the rest can
be initialized with memset_after() as the vfs_inode portion is already
cleared with a memset() itself in inode_init_once().
fs/tracefs/inode.c | 6 ++++--
fs/tracefs/internal.h | 3 ++-
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index e1b172c0e091..888e42087847 100644
--- a/fs/tracefs/inode.c
+++ b/fs/tracefs/inode.c
@@ -38,8 +38,6 @@ static struct inode *tracefs_alloc_inode(struct super_block *sb)
if (!ti)
return NULL;
- ti->flags = 0;
-
return &ti->vfs_inode;
}
@@ -779,7 +777,11 @@ static void init_once(void *foo)
{
struct tracefs_inode *ti = (struct tracefs_inode *) foo;
+ /* inode_init_once() calls memset() on the vfs_inode portion */
inode_init_once(&ti->vfs_inode);
+
+ /* Zero out the rest */
+ memset_after(ti, 0, vfs_inode);
}
static int __init tracefs_init(void)
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 91c2bf0b91d9..7d84349ade87 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -11,9 +11,10 @@ enum {
};
struct tracefs_inode {
+ struct inode vfs_inode;
+ /* The below gets initialized with memset_after(ti, 0, vfs_inode) */
unsigned long flags;
void *private;
- struct inode vfs_inode;
};
/*
--
2.43.0
Hi Ard,
I have an ESPRESSObin Ultra (aarch64) that uses U-Boot as its bootloader. It shipped from the manufacturer with with v5.10, and I've been trying to upgrade. U-Boot supports booting Image directly via EFI (https://u-boot.readthedocs.io/en/latest/usage/cmd/bootefi.html), and I have been using it that way to successfully boot the system up to and including v6.0.19. However, v6.1 and v6.5 kernels fail to boot.
When booting successfully, the following messages are displayed:
EFI stub: Booting Linux Kernel...EFI stub: ERROR: FIRMWARE BUG: efi_loaded_image_t::image_base has bogus value
EFI stub: ERROR: FIRMWARE BUG: kernel image not aligned on 64k boundary
EFI stub: Using DTB from configuration table
EFI stub: ERROR: Failed to install memreserve config table!
EFI stub: Exiting boot services...
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
I suspect many of the above error messages are simply attributable to using U-Boot to load an EFI stub and can be safely ignored given that the system boots and runs fine.
When boot fails (v6.5), the following messages are displayed:
EFI stub: Booting Linux Kernel...
EFI stub: ERROR: FIRMWARE BUG: efi_loaded_image_t::image_base has bogus value
EFI stub: ERROR: FIRMWARE BUG: kernel image not aligned on 64k boundary
EFI stub: ERROR: Failed to install memreserve config table!
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services...
EFI stub: ERROR: Unable to construct new device tree.
EFI stub: ERROR: Failed to update FDT and exit boot services
In case it's relevant, the device tree for this device is arch/arm64/boot/marvell/armada-3720-espressobin-ultra.dts
Hopefully I've reported this in the correct place or that the information provided is sufficient to get it where it needs to be. Let me know if there is additional information I can provide. I am also able to use the device to test.
Sincerely,
Ben Schneider
#regzbot introduced v6.0..v6.1