Re: [Linaro-mm-sig] [PATCH v2 3/3] gpu: ion: oom killer

5 Sep 2012


On Tue, Sep 4, 2012 at 6:47 PM, Nishanth Peethambaran <nishanth.peethu@gmail.com> wrote:
...
On Tue, Sep 4, 2012 at 3:24 PM, zhangfei gao <zhangfei.gao@gmail.com> wrote:
...
...
...
+retry:
 	mutex_lock(&dev->lock);
 	for (n = rb_first(&dev->heaps); n != NULL; n = rb_next(n)) {
-		struct ion_heap *heap = rb_entry(n, struct ion_heap, node);
+		heap = rb_entry(n, struct ion_heap, node);
 		/* if the client doesn't support this heap type */
 		if (!((1 << heap->type) & client->heap_mask))
 			continue;
@@ -404,6 +410,11 @@ struct ion_handle *ion_alloc(struct ion_client *client, size_t len,
 	}
 	mutex_unlock(&dev->lock);
 
+	if (buffer == ERR_PTR(-ENOMEM)) {
+		if (!ion_shrink(heap, 0))
+			goto retry;
+	}
+




The heap which is attempted for shrink in this patch would be the last
heap registered by the platform.

The heap is searched out according to id and type.
...
It should be the last heap the user asked to try as per heap_mask.
Otherwise, it could also go into an infinite loop.
The last cma/carveout heap attempted for ion_buffer_create() would be
a better candidate for ion_shrink().
Also, exiting the retry loop after a few iterations and returning an
allocation failure is a safe choice.
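
A bounded retry of the kind suggested here could look roughly like the following user-space model. It is only a sketch under stated assumptions: `model_pool`, `model_try_alloc`, `model_shrink`, `model_alloc_bounded` and the `max_retries` parameter are hypothetical names that do not appear in the patch, and the simple pool arithmetic stands in for ion_buffer_create() and ion_shrink(). The one convention taken from the quoted hunk is that the shrink step returns 0 when it managed to free something.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of a heap: not an ion data structure. */
struct model_pool {
	int free_units;          /* units currently free */
	int reclaim_per_shrink;  /* units one successful shrink frees */
	int shrinks_left;        /* how many shrinks can still succeed */
	int shrink_calls;        /* bookkeeping for the example */
};

/* Stand-in for ion_buffer_create(): 0 on success, -ENOMEM otherwise. */
static int model_try_alloc(struct model_pool *p, int units)
{
	if (p->free_units < units)
		return -ENOMEM;
	p->free_units -= units;
	return 0;
}

/* Stand-in for ion_shrink(): 0 if something was reclaimed. */
static int model_shrink(struct model_pool *p)
{
	p->shrink_calls++;
	if (p->shrinks_left <= 0)
		return -1;
	p->shrinks_left--;
	p->free_units += p->reclaim_per_shrink;
	return 0;
}

/* Retry the allocation at most max_retries times, then give up. */
static int model_alloc_bounded(struct model_pool *p, int units,
			       int max_retries)
{
	int attempt;

	assert(max_retries >= 0);
	for (attempt = 0; ; attempt++) {
		int ret = model_try_alloc(p, units);

		if (ret != -ENOMEM)
			return ret;
		if (attempt >= max_retries)
			return -ENOMEM;	/* bounded: no infinite loop */
		if (model_shrink(p) != 0)
			return -ENOMEM;	/* nothing could be reclaimed */
	}
}
```

With a retry limit, a shrink that keeps "succeeding" without freeing enough memory ends in an allocation failure instead of looping forever.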

ion_shrink is called only when buffer == ERR_PTR(-ENOMEM), which
happens when the heap allocation fails.

Take the case where heap ids 0-3 are registered with the ion device and
the user asks for heap 2. Heap 2 will be tried for ion_buffer_create();
assume it fails. The rbtree loop will also iterate over heap 3, but heap 3
will not be attempted for ion_buffer_create() because of the heap_mask check.
ion_shrink() will then try shrinking heap 3 instead of heap 2.
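
The heap 0-3 example can be reproduced with a small user-space model that records both where the loop variable ends up and which heap was actually attempted. This is a sketch only: `model_heap`, `alloc_result` and `model_alloc` are hypothetical names, an array replaces the rbtree, and the mask is tested against the heap id for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a registered heap. */
struct model_heap {
	unsigned int id;  /* bit position tested against heap_mask */
	int can_alloc;    /* 1 if a simulated allocation would succeed */
};

struct alloc_result {
	int allocated_from;  /* index that satisfied the request, or -1 */
	int last_iterated;   /* where the loop variable ends up */
	int last_attempted;  /* last heap that passed the mask check */
};

/* Walk the heaps in registration order, as the rbtree loop does. */
static struct alloc_result model_alloc(const struct model_heap *heaps,
				       size_t nheaps,
				       unsigned int heap_mask)
{
	struct alloc_result r = { -1, -1, -1 };
	size_t i;

	assert(heaps != NULL || nheaps == 0);
	for (i = 0; i < nheaps; i++) {
		r.last_iterated = (int)i;
		/* skip heaps the caller did not ask for */
		if (!((1u << heaps[i].id) & heap_mask))
			continue;
		r.last_attempted = (int)i;
		if (heaps[i].can_alloc) {
			r.allocated_from = (int)i;
			break;
		}
	}
	return r;
}
```

For four failing heaps and a mask selecting only heap 2, `last_iterated` ends at 3 while `last_attempted` is 2, so a shrink keyed on the loop variable targets the wrong heap.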

Thanks, Nishanth.
There really is an issue if several ids share the same type, which was
not considered before, since ion_debug_heap_total() works on type
instead of id.
It can be easily solved by changing ion_debug_heap_total(type) to
ion_debug_heap_total(id); ion_debug_heap_total() then returns 0 for
other ids.
I see you have already submitted a patch to replace heap->type with heap->id.
This patch could be based on that one, or your patch based on this one :)
Besides, another mistake is that heap_found should be used.
...
...

The retry happens only if ion_shrink succeeds; if no process could be
killed, ion_alloc returns failure without retrying.
For example, if all processes have adj=0, then ion_shrink selects no
process to kill.

Got it.
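
The selection rule behind the adj=0 example can be modelled in user space as follows. This is a sketch that mirrors the comparisons in the quoted ion_shrink() loop; `model_client` and `model_select_victim` are hypothetical names, and the `heap_total` field stands in for the value returned by ion_debug_heap_total().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ion client and its owning task. */
struct model_client {
	int has_task;       /* 0 models client->task == NULL */
	int oom_score_adj;  /* models p->signal->oom_score_adj */
	size_t heap_total;  /* models ion_debug_heap_total(client, ...) */
};

/* Returns the index of the selected client, or -1 if none qualifies. */
static int model_select_victim(const struct model_client *clients,
			       size_t nclients, int kill_adj)
{
	int selected = -1;
	size_t selected_size = 0;
	int selected_adj = 0;
	size_t i;

	assert(clients != NULL || nclients == 0);
	for (i = 0; i < nclients; i++) {
		const struct model_client *c = &clients[i];

		if (!c->has_task)
			continue;
		/* never pick tasks at or below the kill threshold */
		if (c->oom_score_adj <= kill_adj ||
		    c->oom_score_adj < selected_adj)
			continue;
		/* skip clients holding nothing, or less than the pick */
		if (!c->heap_total)
			continue;
		if (c->heap_total < selected_size)
			continue;

		selected = (int)i;
		selected_size = c->heap_total;
		selected_adj = c->oom_score_adj;
	}
	return selected;
}
```

With kill_adj set to 0, a client list in which every task has adj=0 yields -1, so nothing is killed and the allocation fails without a retry.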
...
...
...
+/*
+ * ion_shrink
+ * kill all tasks that refer to the buffers of the selected task
+ */
+static int ion_shrink(struct ion_heap *heap, int kill_adj)
+{
+	struct rb_node *n;
+	struct ion_client *client = NULL;
+	struct ion_device *dev = heap->dev;
+	struct task_struct *selected = NULL;
+	int selected_size = 0;
+	int selected_oom_score_adj = 0;
+
+	for (n = rb_first(&dev->clients); n; n = rb_next(n)) {
+		size_t size;
+		struct task_struct *p;
+
+		client = rb_entry(n, struct ion_client, node);
+		if (!client->task)
+			continue;
+
+		p = client->task;
+
+		if ((p->signal->oom_score_adj <= kill_adj) ||
+			(p->signal->oom_score_adj < selected_oom_score_adj))
+			continue;
+
+		size = ion_debug_heap_total(client, heap->type);
+		if (!size)
+			continue;
+		if (size < selected_size)
+			continue;
+
+		selected = p;
+		selected_size = size;
+		selected_oom_score_adj = p->signal->oom_score_adj;
+	}
+
+	if (selected) {
+		/* kill all processes that share buffers with this client */
+		mutex_lock(&client->lock);
+		for (n = rb_first(&client->handles); n; n = rb_next(n)) {
+			struct rb_node *r;
+			struct ion_client *c;
+			struct ion_handle *handle = rb_entry(n,
+					struct ion_handle,
+					node);
+
+			for (r = rb_first(&dev->clients); r; r = rb_next(r)) {
+				struct ion_handle *h;
+
+				c = rb_entry(r, struct ion_client, node);
+				h = ion_handle_lookup(c, handle->buffer);
+				if (!IS_ERR_OR_NULL(h)) {
+					send_sig(SIGKILL, c->task, 0);
+					pr_info("SIGKILL pid: %u\n",
+							c->task->pid);
+				}
+
+			}
+		}
+		mutex_unlock(&client->lock);
+
+		send_sig(SIGKILL, selected, 0);
+		set_tsk_thread_flag(selected, TIF_MEMDIE);
+		pr_info("SIGKILL pid: %u size: %u adj: %u\n",
+				selected->pid, selected_size,
+				selected_oom_score_adj);
+		msleep(20);



The msleep() might not be the right way to do it. Signalling from
client_destroy/buffer_free may be better.

The msleep is there to let SIGKILL and the scheduler take effect as soon
as possible.
The release function of the killed process will be called immediately.

Correct me if I am wrong. It can't be assumed that the release function of
the killed process will be called immediately, or within 20ms, if system
load is high. I am using msleep() as of now because of its simplicity and
ease of implementation.
A safer option is to do a wait_event instead of msleep(), with the wake_up
happening when buffers get released. It is a bit harder to do. Before
send_sig() in ion_shrink(), the process that is going to wait should add
itself to a waitqueue list. When the client gets released (ideally when
the buffer gets released), the wake_up should be sent to the waitqueues.
This is what I was thinking.
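
The difference between a fixed sleep and waiting for a release notification can be illustrated with a user-space analogue. This is a sketch, not kernel code: it uses a POSIX condition variable in place of a kernel waitqueue, and `release_waiter`, `waiter_init`, `waiter_notify_release` and `waiter_wait_release` are hypothetical names. The timeout keeps the allocating side from blocking indefinitely if no release is ever reported.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical analogue of a waitqueue plus its wake-up condition. */
struct release_waiter {
	pthread_mutex_t lock;
	pthread_cond_t released;
	int buffers_released;  /* predicate, guarded by lock */
};

static void waiter_init(struct release_waiter *w)
{
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->released, NULL);
	w->buffers_released = 0;
}

/* Release path: the analogue of wake_up() when a buffer is freed. */
static void waiter_notify_release(struct release_waiter *w)
{
	pthread_mutex_lock(&w->lock);
	w->buffers_released = 1;
	pthread_cond_broadcast(&w->released);
	pthread_mutex_unlock(&w->lock);
}

/* Allocating path: returns 1 if a release was seen, 0 on timeout. */
static int waiter_wait_release(struct release_waiter *w, long timeout_ms)
{
	struct timespec deadline;
	int released;

	assert(timeout_ms >= 0);
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&w->lock);
	while (!w->buffers_released) {
		/* wakes early on notify, otherwise stops at the deadline */
		if (pthread_cond_timedwait(&w->released, &w->lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	released = w->buffers_released;
	pthread_mutex_unlock(&w->lock);
	return released;
}
```

Unlike a fixed sleep, the waiter returns as soon as the release is reported and can tell a real release apart from a timeout.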

I don't think this is a good suggestion.
msleep is used to release the cpu and let other tasks run.
If system load is high, there is no difference between msleep and wait_event.
The killed process and the allocating process are different; why should
they wait for each other?
Also, there is no wait in lowmemorykiller.c.
And the chance of ion_shrink being called is supposed to be limited.
Thanks

    
