Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cast to unsigned type when interpreting HID descriptor length bytes (libusb 0.1) #1320

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions drivers/libusb0.c
Original file line number Diff line number Diff line change
Expand Up @@ -377,7 +377,7 @@ static int libusb_open(usb_dev_handle **udevp,

upsdebug_hex(3, "HID descriptor, method 1", buf, 9);

rdlen1 = buf[7] | (buf[8] << 8);
rdlen1 = (uint8_t)buf[7] | ((uint8_t)buf[8] << 8);
Copy link
Member

@jimklimov jimklimov Mar 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wondering in hind-sight: is a char << 8 safe on all platforms (to produce the high bits of a two-byte word here, I suppose) or would it result in a zero (all bits rotated from sight) or a same value (rotated around, ending where we started), or should there be more casting or similar magic here, to be portable?

Like ((uint16_t)buf[7] & 0x00FF) | (((uint16_t)buf[8] & 0x00FF) << 8)? Or plain multiplication like 256 * (uint8_t)buf[8] + (uint8_t)buf[7] (where 256 implicitly dictates an int for the interim manipulated value, I believe)?

Or do the bit maths imply int anyway? (Might be so, according to many warnings I "fought" in the past year)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C integer type-promotion rules are not exactly obvious, but I think that this StackOverflow article is an easier read than the actual spec, and quotes the spec.

To start, left shift, << is never a circular shift. char << 8 (not unsigned char) is not going to produce the expected answer on any system that treats characters as signed. For char sc = 0x9a;, sc << 8 is 0xffff9a00.

Both (uint16_t) and (uint8_t) will be promoted to a larger unsigned type before the arithmetic is done -- but note that (uint16_t) (char) 0x9a is 0xff9a -- so you would be introducing trouble and then trying to compensate for it with the & 0x00ff instead of just saying what you mean with (uint8_t).

The type of an integer constant (with no suffix, e.g. 256) is the first type into which the value will fit, see the list in https://en.cppreference.com/w/c/language/integer_constant -- so 256 is an int (not an unsigned int), but that won't matter in the case of 256 * (uint8_t)... (but you'd get burned by 256 * (uint16_t)(char), see previous paragraph). I believe it is going to do that arithmetic in an int sized container, so you'd have to think carefully if you were shifting left by 3 bytes with multiplication instead of 1 byte.

The net result is that I believe that the suggested change was both portable and quite clear about the programmers intent, without introducing new conversions which would require the reader to go back to the type promotion document.

Comments welcome...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, to actually answer your final question:

Or do the bit maths imply int anyway?

Yes, they do -- see this copy of the C99 standard at the section "6.5.7 Bitwise shift operators" --

Constraints
Each of the operands shall have integer type.
Semantics
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed reply, guess I was bitten a long time ago, maybe before C99 behavior was standardized across the board.

I followed up on this idea in #1325 so simplified it back (push pending); would extend the make check instead to make sure NUT builds, with whatever tools and CPUs build-systems use, do behave as we expect bit-wise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and looked at the original K&R C book -- and it said

The shift operators << and >> group left-to-righ. Both perform the usual arithmetic conversions on their operands, each of which must be integral. Then the right operand is converted to int; the type of the result is that of the left operand.

It goes on to say, of the results:

The value of E1<<E2 is E1 (interpreted as a bit pattern) left-shifted E2 bits; vacated bits are 0-filled. The value of E1 >>E2 is E1 right-shifted E2 bit positions. The right shift is guaranteed to be logical (0-fill) if E1 is unsigned; otherwise it may be (and is, on the PDP-11) arithmetic (fill by a copy of the sign bit).

The usual arithmetic conversions are defined this way (section 6.6):

A great many operators cause conversions and yield result types in a similar way. This pattern will be called the "usual arithmetic conversions."
First, any operands of type char or short are converted to int, and any of type float are converted to double.
Then, if either operand is double, the other is converted to double and that is the type of the result.
Otherwise, if either operand is long, the other is converted to long and that is the type of the result.
Otherwise, if either operand is unsigned, the other is converted to unsigned and that is the type of the result.
Otherwise, both operands must be int, and that is the type of the result.

About characters and integers it says (section 6.1):

A character or or a short integer may be used wherever an integer may be used. In all cases the value is converted to an integer. Conversion of a shorter integer to a longer always involves sign extension; integers are signed quantities. Whether or not sign-extension occurs for characters is machine dependent, but it is guaranteed that a member of the standard character set is non-negative. Of the machines treated by this manual, only the PDP-11 sign-extends. On the PDP-11, character variables range in value from -128 to 127; the characters of the ASCII alphabet are all positive. A character constant specified with an ocatl escape suffers sign extension and may appear negative; for example, '\377' has the value -1.
When a longer integer is converted to a shorter or to a char, it is truncated on the left; excess bits are simply discarded.

If one were to read the section on type specifiers (section 8.2) it's not even clear that unsigned char was a legal declaration. They finish up with Portability considerations (section 16)

[...] Purely hardware issues like word size and the properties of floating point arithmetic and integer division have proven in practice to be not much of a problem. Other facets of the hardware are reflected in differing implementations. Some of these, particularly sign extensions (converting a negative character into a negative integer) and the order in which bytes are placed in a word, are a nuisance that must be carefully watched.

When I was first writing code in C, for portability, we had to check the specific compiler & hardware combination to see how to define our own types in terms of the compiler/hardware supported types to get the behavior that we wanted -- at least now the standards give us <stdint.h> with int8_t and uint8_t types and it's the compiler writer's job to make them work correctly!

}

if (rdlen1 < -1) {
Expand Down Expand Up @@ -407,7 +407,7 @@ static int libusb_open(usb_dev_handle **udevp,
) {
p = (usb_ctrl_char *)&iface->extra[i];
upsdebug_hex(3, "HID descriptor, method 2", p, 9);
rdlen2 = p[7] | (p[8] << 8);
rdlen2 = (uint8_t)p[7] | ((uint8_t)p[8] << 8);
break;
}
}
Expand Down