Instead of allocating pages for the buffer one by one, take advantage of the
buddy alloc system and request them 2^order at a time. This increases the chance
for bigger buffer parts to be contigious and reduces loop iteration count. While
at it, rename function __idetape_kmalloc_stage() to ide_tape_kmalloc_buffer().
[bart: fold with "ide-tape: fix mem leak" patch to preserve bisectability]