Rather than have build_mlx_header() return a negative value on failure
and the length of the segments it builds on success, add a pointer
parameter to return the length and return 0 on success. This matches
the calling convention used for build_lso_seg() and generates slightly
smaller code -- e.g., on 64-bit x86:

    add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-22 (-22)
    function                                     old     new   delta
    mlx4_ib_post_send                           2023    2001     -22
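
For illustration only, the two calling conventions can be sketched as
below. This is not the driver code: build_hdr_old(), build_hdr_new(),
the wqe/payload_len arguments, and the fixed 16-byte header size are
all hypothetical stand-ins, chosen just to show the shape of the change.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical fixed header size, used only for this sketch. */
#define SKETCH_HDR_SIZE 16

/*
 * Old style: one return value carries both meanings.
 * Negative errno on failure, segment length on success.
 */
static int build_hdr_old(void *wqe, size_t payload_len)
{
	if (!wqe)
		return -EINVAL;
	return (int)(SKETCH_HDR_SIZE + payload_len);
}

/*
 * New style: return 0 on success or a negative errno on failure,
 * and hand the segment length back through *seglen.
 */
static int build_hdr_new(void *wqe, size_t payload_len, unsigned int *seglen)
{
	if (!wqe)
		return -EINVAL;
	*seglen = SKETCH_HDR_SIZE + (unsigned int)payload_len;
	return 0;
}
```

With the second form the caller can test the return value with a plain
"if (err)" and read the length separately, rather than checking the
sign of a combined value and then reusing it as a size.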