The linux bitset operators (test_bit, set_bit etc) work on arrays of "unsigned
long". dm-log uses such bitsets but treats them as arrays of uint32_t, only
allocating and zeroing a multiple of 4 bytes (as 'clean_bits' is a uint32_t).
The patch below fixes this problem.
The problem is specific to 64-bit big endian machines such as s390x or ppc-64
and can prevent pvmove terminating.
In the simplest case, if "region_count" were (say) 30, then
bitset_size (below) would be 4 and bitset_uint32_count would be 1.
Thus the memory for this butset, after allocation and zeroing would
be
0 0 0 0 X X X X
On a bigendian 64bit machine, bit 0 for this bitset is in the 8th
byte! (and every bit that dm-log would use would be in the X area).
0 0 0 0 X X X X
^
here
which hasn't been cleared properly.
As the dm-raid1 code only syncs and counts regions which have a 0 in the
'sync_bits' bitset, and only finishes when it has counted high enough, a large
number of 1's among those 'X's will cause the sync to not complete.
It is worth noting that the code uses the same bitsets for in-memory and
on-disk logs. As these bitsets are host-endian and host-sized, this means
that they cannot safely be moved between computers with
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
lc->sync = sync;
/*
- * Work out how many words we need to hold the bitset.
+ * Work out how many "unsigned long"s we need to hold the bitset.
*/
bitset_size = dm_round_up(region_count,
- sizeof(*lc->clean_bits) << BYTE_SHIFT);
+ sizeof(unsigned long) << BYTE_SHIFT);
bitset_size >>= BYTE_SHIFT;
lc->bitset_uint32_count = bitset_size / 4;