From: Steven Rostedt
Date: Fri, 25 Jan 2008 20:08:12 +0000 (+0100)
Subject: sched: RT-balance, avoid overloading
X-Git-Tag: v2.6.25-rc1~1237^2~66
X-Git-Url: https://err.no/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=e1f47d891c0f00769d6d40ac5740f943e998d089;p=linux-2.6

sched: RT-balance, avoid overloading

This patch changes the search for a runqueue by a waking RT task so that
it tries to pick another runqueue if the currently running task is an
RT task.

The reason is that RT tasks behave differently from normal tasks.
Preempting a normal task to run an RT task to keep its cache hot is fine,
because the preempted non-RT task may wait on that same runqueue to run
again unless the migration thread comes along and pulls it off.

RT tasks behave differently. If one is preempted, it makes an active
effort to continue to run. So when a high priority task preempts a
lower priority RT task, that lower RT task will quickly try to run on
another runqueue. This causes the lower RT task to replace its nice hot
cache (and TLB) with a completely cold one, all in the hope that the new
high priority RT task will keep its cache hot.

Remember that this high priority RT task was just woken up. It has
likely been sleeping for several milliseconds and will end up with a
cold cache anyway. RT tasks run until they voluntarily stop or are
preempted by a higher priority task. This means that it is unlikely
that the woken RT task will have a hot cache to wake up to. So pushing
off a lower RT task is just killing its cache for no good reason.

Signed-off-by: Steven Rostedt
Signed-off-by: Ingo Molnar
---

diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 87d7b3ff38..9becc3710b 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -160,11 +160,23 @@ static int select_task_rq_rt(struct task_struct *p, int sync)
 	struct rq *rq = task_rq(p);
 
 	/*
-	 * If the task will not preempt the RQ, try to find a better RQ
-	 * before we even activate the task
+	 * If the current task is an RT task, then
+	 * try to see if we can wake this RT task up on another
+	 * runqueue. Otherwise simply start this RT task
+	 * on its current runqueue.
+	 *
+	 * We want to avoid overloading runqueues. Even if
+	 * the RT task is of higher priority than the current RT task.
+	 * RT tasks behave differently than other tasks. If
+	 * one gets preempted, we try to push it off to another queue.
+	 * So trying to keep a preempting RT task on the same
+	 * cache hot CPU will force the running RT task to
+	 * a cold CPU. So we waste all the cache for the lower
+	 * RT task in hopes of saving some of a RT task
+	 * that is just being woken and probably will have
+	 * cold cache anyway.
 	 */
-	if ((p->prio >= rq->rt.highest_prio)
-	    && (p->nr_cpus_allowed > 1)) {
+	if (unlikely(rt_task(rq->curr))) {
 		int cpu = find_lowest_rq(p);
 
 		return (cpu == -1) ? task_cpu(p) : cpu;
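
Illustration (not part of the patch): the change in the wake-up decision can be modelled in user space. This is a minimal sketch under stated assumptions, not kernel code. `struct toy_task`, `struct toy_rq`, `toy_select()` and the `TOY_MAX_RT_PRIO` constant are hypothetical stand-ins invented for the example; the `lowest_cpu` argument plays the role of the value returned by `find_lowest_rq()`, whose real implementation is not shown here. The sketch assumes the usual kernel convention that a numerically lower `prio` means higher priority and that values below 100 are RT priorities.

```c
#include <stdbool.h>

/* Assumption: priorities below this value are RT, as in the kernel convention. */
#define TOY_MAX_RT_PRIO 100

/* Hypothetical stand-in for the fields of task_struct used by the patch. */
struct toy_task {
	int prio;            /* lower number = higher priority */
	int nr_cpus_allowed; /* size of the task's CPU affinity mask */
	int cpu;             /* CPU the task last ran on */
};

/* Hypothetical stand-in for the fields of struct rq used by the patch. */
struct toy_rq {
	int curr_prio;       /* priority of the task currently running */
	int highest_rt_prio; /* best queued RT prio, TOY_MAX_RT_PRIO if none */
};

/* Old rule: search elsewhere only if the waking task would NOT preempt. */
static bool old_should_search(const struct toy_task *p, const struct toy_rq *rq)
{
	return p->prio >= rq->highest_rt_prio && p->nr_cpus_allowed > 1;
}

/* New rule: search elsewhere whenever the running task is an RT task. */
static bool new_should_search(const struct toy_rq *rq)
{
	return rq->curr_prio < TOY_MAX_RT_PRIO;
}

/*
 * Pick a CPU for the waking task. lowest_cpu models the result of the
 * search for a better runqueue: a CPU number, or -1 if none was found.
 */
static int toy_select(bool search, const struct toy_task *p, int lowest_cpu)
{
	if (search)
		return lowest_cpu == -1 ? p->cpu : lowest_cpu;
	return p->cpu; /* otherwise start on the current runqueue */
}
```

With a waking task of priority 10 on CPU 0 and a priority 50 RT task already running there, the old rule keeps the waking task on CPU 0, where it preempts the running task. The new rule sends it to the candidate CPU if one exists, and falls back to CPU 0 otherwise.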