From: Nishanth Aravamudan
Date: Fri, 1 Sep 2006 04:27:53 +0000 (-0700)
Subject: [PATCH] fix NUMA interleaving for huge pages
X-Git-Tag: v2.6.18-rc6~8
X-Git-Url: https://err.no/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=3b98b087fc2daab67518d2baa8aef19a6ad82723;p=linux-2.6

[PATCH] fix NUMA interleaving for huge pages

Since vma->vm_pgoff is in units of small pages, VMAs for huge pages have the
lower HPAGE_SHIFT - PAGE_SHIFT bits always cleared, which results in bad
offsets being passed to the interleave functions.  Take this difference
between huge and small pages into account when calculating the offset.  This
does add a 0-bit shift into the small-page path (via alloc_page_vma()), but I
think that is negligible.  Also add a BUG_ON to prevent the offset from
growing due to a negative right-shift, which probably shouldn't be allowed
anyway.

Tested on an 8-memory-node ppc64 NUMA box and got the interleaving I
expected.

Signed-off-by: Nishanth Aravamudan
Signed-off-by: Adam Litke
Cc: Andi Kleen
Acked-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
---

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e07e27e846..a9963ceddd 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1176,7 +1176,15 @@ static inline unsigned interleave_nid(struct mempolicy *pol,
 	if (vma) {
 		unsigned long off;
 
-		off = vma->vm_pgoff;
+		/*
+		 * for small pages, there is no difference between
+		 * shift and PAGE_SHIFT, so the bit-shift is safe.
+		 * for huge pages, since vm_pgoff is in units of small
+		 * pages, we need to shift off the always 0 bits to get
+		 * a useful offset.
+		 */
+		BUG_ON(shift < PAGE_SHIFT);
+		off = vma->vm_pgoff >> (shift - PAGE_SHIFT);
 		off += (addr - vma->vm_start) >> shift;
 		return offset_il_node(pol, vma, off);
 	} else
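
For readers working through the shift arithmetic, here is a minimal userspace
sketch; it is not part of the patch and not kernel code.  It illustrates why
adding vm_pgoff (kept in small-page units) to a huge-page-granular address
term gives an offset in mixed units, and how the (shift - PAGE_SHIFT)
right-shift makes the whole offset huge-page-granular.  The PAGE_SHIFT and
HPAGE_SHIFT values, the file offset, the eight-node count, and the plain
modulo standing in for offset_il_node()'s walk over the allowed nodemask are
all illustrative assumptions.

/* Userspace illustration only -- not kernel code.  Assumed values:
 * 4 KB base pages, 16 MB huge pages, an 8-node interleave set. */
#include <stdio.h>

#define PAGE_SHIFT   12
#define HPAGE_SHIFT  24
#define NR_NODES      8

int main(void)
{
	/* A hugetlbfs VMA mapping its file starting at huge page 5:
	 * vm_pgoff is kept in small-page units, so its lower
	 * HPAGE_SHIFT - PAGE_SHIFT bits are always zero. */
	unsigned long file_hpage = 5;
	unsigned long vm_pgoff = file_hpage << (HPAGE_SHIFT - PAGE_SHIFT);
	unsigned long k;

	for (k = 0; k < 4; k++) {	/* huge page index within the VMA */
		/* old code: small-page units mixed with huge-page units */
		unsigned long off_old = vm_pgoff + k;
		/* patched code: shift off the always-zero bits first, so
		 * the whole offset is consistently in huge-page units */
		unsigned long off_new =
			(vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT)) + k;

		printf("huge page %lu: old off=%lu -> node %lu, "
		       "new off=%lu -> node %lu\n",
		       k, off_old, off_old % NR_NODES,
		       off_new, off_new % NR_NODES);
	}
	return 0;
}

With eight nodes, as on the box the patch was tested on, the old offset's
large multiple of 4096 vanishes under the modulo, so the chosen node depends
only on the position within the VMA and ignores the file offset that vm_pgoff
was meant to contribute.  The patched offset advances from the file's
huge-page index, so the interleave pattern tracks the file offset as intended.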