In a multibyte locale such as en_GB.UTF-8, the pg command cannot handle files
containing a form feed character (ASCII 0x0c) at the start of a line. The
program enters an infinite loop.

I've traced the problem to the function endline_for_mb in file pg.c. The code
assumes that the libc function wcwidth will return a nonnegative value, which
is not true for a form feed character. wcwidth returns -1, so the unsigned
variable "pos" wraps around to a huge value when it is still zero, as it is at
the start of a line.
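
To illustrate the wraparound, here is a minimal sketch; it is not code from
pg.c. The helper names advance_unguarded and wraps_at_line_start are
hypothetical, and the sketch assumes "pos" has type unsigned int; the actual
declaration in pg.c may differ.

```c
#include <limits.h>

/*
 * Hypothetical helper, not from pg.c: mirrors the unguarded
 * "pos += wcwidth(*p)" step for a given width.
 * Assumption: pos is an unsigned int.
 */
static unsigned int advance_unguarded(unsigned int pos, int width)
{
	/*
	 * The int is converted to unsigned int before the addition,
	 * so a width of -1 is added as UINT_MAX (modulo arithmetic).
	 */
	pos += (unsigned int)width;
	return pos;
}

/* Returns 1 if a width of -1 at column 0 wraps around to UINT_MAX. */
static int wraps_at_line_start(void)
{
	return advance_unguarded(0, -1) == UINT_MAX;
}
```

With pos == 0, a width of -1 yields UINT_MAX, so a later test such as
"pos > col" is true for any realistic col. With pos == 5 the same addition
merely yields 4, which would explain why only a form feed at the start of a
line triggers the problem.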

I'll attach a patch which tests whether the character is printable before
calling wcwidth. If it is not, the patch uses the width of the constant L'?'
instead, since L'?' is what later replaces nonprintable characters. I trust
that we can assume printability of this constant :-)

Steps to Reproduce:
1. Select a multibyte locale (tested with en_GB.UTF-8)
2. Create a file with a form feed character (0x0c) at the start of a line.
3. Try to display this file using the pg command.
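
Step 2 can be scripted. A minimal sketch in C follows; the helper name
write_formfeed_file and the file contents are assumptions for illustration,
not part of the original report.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes a two-line text file whose second line
 * starts with a form feed character (0x0c).
 * Returns 0 on success, -1 on any I/O error.
 */
static int write_formfeed_file(const char *path)
{
	FILE *fp = fopen(path, "w");

	if (fp == NULL)
		return -1;

	/* "\f" is the C escape sequence for form feed, 0x0c. */
	if (fputs("first line\n\fsecond line\n", fp) == EOF) {
		fclose(fp);
		return -1;
	}

	return fclose(fp) == 0 ? 0 : -1;
}
```

Calling write_formfeed_file("ff.txt") (the file name is arbitrary) and then
running pg on ff.txt under a multibyte locale should exercise the code path
described above on an unpatched build.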
Reported-by: Mark Calderbank <m.calderbank@iname.com>
Reported-by: Mike Frysinger <vapier@gentoo.org>
Addresses: https://bugs.gentoo.org/297717
Signed-off-by: Karel Zak <kzak@redhat.com>

 			pos += TABSIZE - (pos % TABSIZE);
 			break;
 		default:
-			pos += wcwidth(*p);
+			if (iswprint(*p))
+				pos += wcwidth(*p);
+			else
+				pos += wcwidth(L'?');
 		}
 		if (pos > col) {
 			if (*p == L'\t')