Binary
Goals
- Be able to convert base 2 and base 16 to base 10.
- Know how to work with bit operators.
- Understand two's complement representation of negative values.
Concepts
- binary
- bit
- bitwise operator
- byte
- complement operator
- least-significant
- mask
- most-significant
- overflow
- sign bit
- sign extension
- truth table
- two's complement
Language
&
^
|
<<
>>
>>>
~
Library
Lesson
In one of the first lessons of this course, you discovered that computers use these base 2 or binary numbers as a convenient way to represent values using electrical signals. Computers refer to individual binary digits as bits, and typically group them in sequences of eight bits called bytes. High-level language such as Java groups these bytes into various primitive types
such as int
.
123
(0b01111011
) stored as a 32-bit Java int
.123 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Byte 3 | Byte 2 | Byte 1 | Byte 0 |
For the most part Java hides the details of binary storage, allowing you to manipulate values in terms of their primitive types. However understanding more about binary representation and binary operations will allow you to accomplish more advanced tasks in Java, as well as avoid certain pitfalls presented by certain idiosyncrasies of the Java language.
Conversion to Decimal
In elementary school you probably learned that the right-hand digit of a number was the ones-column
, the next column was the tens-column
, followed by the hundreds-column
, etc. progressing from right to left. You might have been taught that the number 456
is equivalent to (4 × 100) + (5 × 10) + (6 × 1)
.
456
from its decimal digits.4 × 100 | |
Sum | = 456 |
---|
Keeping in mind that the number 456 is represented in base 10 (decimal), you might notice that the values 100
, 10
, and 1
are all powers of 10: 102 = 100, 101 = 10, and 100 = 1. Any number to the power of 0 is equal to 1.
456
(base 10).Power of 10 | 102 | 101 | 100 | Result |
---|---|---|---|---|
Multiplier | 100 | 10 | 1 | |
Input | 4 | 5 | 6 | |
Output | 400 | + 50 | + 6 | = 456 |
You can therefore use the following rule to convert any number in any base to decimal: the multiplier in each column is equal to the number base to the power of the zero-based column from right to left. Let's examine how this would work to convert the base 2 (binary) number 0b01111011
to decimal:
b01111011
(base 2).Power of 2 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | Result |
---|---|---|---|---|---|---|---|---|---|
Multiplier | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 | |
Input | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | |
Output | 0 | + 64 | + 32 | + 16 | + 8 | + 0 | + 2 | + 1 | = 123 |
Bit Operators
Bit Shift Operators
There are three operators for shifting all bits to the left or the right a certain number of positions: <<
(left shift), >>
(right shift), and >>>
(unsigned right shift). The difference between the two right shift operators is that unsigned right shift operator will always fill the empty space on the left with binary 0
, while the (signed) right shift operator will replicate the first bit that was originally on the left. See Negative Values below for more information on how this relates to signed and unsigned values.
int
value 0xABCD
(1010101111001101
).0xABCD | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0xABCD << 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 |
0xABCD << 2 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 |
0xABCD >> 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 |
0xABCD >> 2 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
0xABCD >>> 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 |
0xABCD >>> 2 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
Bitwise Boolean Operators
By now working with the logical operators ||
(OR), &&
(AND), and !
(NOT) has become second nature to you. Java provides a complete set of bitwise operators that parallel the logical operators, except that instead of working on the values false
and true
, they work on individual bit values of 0
and 1
. That is to say the bitwise operators work like the logical if the bit value 0
represented false
and the bit value 1
represented true
.
The bitwise operators are &&
(AND), |
(OR), and ^ (XOR). The complement operator ~
performs an analogous function to a logical NOT. You recall that it is possible to create a truth table containing all possible outcomes involving each logical operator. The same can be done the binary operators; you'll note that the results are parallel.
Input Values | AND | OR | XOR | NOT | |
---|---|---|---|---|---|
boolean foo | boolean bar | foo && bar | foo || bar | foo ^ bar | !foo |
false | false | false | false | false | true |
false | true | false | true | true | true |
true | false | false | true | true | false |
true | true | true | true | false | false |
Input Values | AND | OR | XOR | NOT (complement) | |
---|---|---|---|---|---|
int foo | int bar | foo & bar | foo | bar | foo ^ bar | ~foo |
0 | 0 | 0 | 0 | 0 | 1 |
0 | 1 | 0 | 1 | 1 | 1 |
1 | 0 | 0 | 1 | 1 | 0 |
1 | 1 | 1 | 1 | 0 | 0 |
The following figure shows the result of performing each of the bitwise operators on two values.
short
values 123
(0b01111011
) and 456
(0b111001000
).123 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
456 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | |
123 & 456 | 72 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
123 | 456 | 507 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
123 ^ 456 | 435 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
~123 | -124 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
Masking
The bitwise are often used with special bit patterns to mask a value, turning certain bits on or off; or to toggle
certain bits.
Goal | Turn Bits On | Turn Bits Off | Toggle Bits |
---|---|---|---|
Operator | OR | AND | XOR |
Description | 1 turns bit on, 0 keeps existing value | 0 turns bit off, 1 keeps existing value | 1 toggles bit, 0 keeps existing value |
Value | 0b01111011 | 0b01111011 | 0b01111011 |
Mask | | 0b11110000 | & 0b11110000 | ^ 0b11110000 |
Result | 0b11111011 | 0b01110000 | 0b10001011 |
Negative Values
There's a peculiarity you may have noticed in the table above. The value resulting from ~123
(~0b01111011
) is -124
(0b10000100
). Why is the result a negative number? In fact if we fill up all eight bits in a byte to give 0b11111111
, using the base conversion calculations from the beginning of this lesson we would expect this to represent the decimal value 255
. How therefore would it even be possible to represent a negative number in binary?
The approach computers usually use is to reserve one bit to be the sign bit, indicating if the value is positive or negative. The sign bit is the highest, most-significant bit on the left. But if you take away one bit from a byte, it only leaves seven for the actual value. The highest value stored in the remaining seven bits is 0b1111111
, which is 127
decimal. Thus if one bit is considered a sign bit, the highest value a byte can hold is 127
, and the lowest value it can hold is -127
. (Why not -128
?)
Two's Complement
… | … | … | … | … | … | … | … | … |
---|---|---|---|---|---|---|---|---|
-3 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
-2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 |
-1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
3 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
… | … | … | … | … | … | … | … | … |
The way the negative bit is encoded results in 0b11111111
for -1
and 0b11111110
for -2
, etc. This encoding is called two's complement. You can visualize two's complement by thinking of all the bits as a series of rotating dials, as you did in the early lesson on values. If we start with all the bits turned off
we represent the value 0
(0b00000000
). Rotating the dials forward one position gives us 1
(0b00000001
). Therefore rotating the dials backwards one position to the setting right before -1
must give us -1
(0b11111111
).
Put another way, if you continue rotating the dials until you reach 255
(0b11111111
), rotating one more position brings the dials back to 0
(0b00000000
). The value before 0
is -1
, so 0b11111111
must have been another way to represent -1
.
So do eight bits that are turned on, 11111111
, represent 255
or -1
? Whether a value with its most-significant bit set to 1
is positive or negative depends on whether you are considering the bytes as signed or unsigned.
Converting Two's Complement
You can convert any positive binary number to its two's complement negative version (or vice-versa) by taking the complement of the bits and then adding 1
:
123
(0b0000000001111011
) to -123
(0b1111111110000101
) in two's complement stored as a 16-bit Java short
.n | 123 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
~123 | -124 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
~123 + 1 | -123 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
Sign Extension
It should come as no mystery that as you move to larger types used to store a positive integer, the leftmost bits are filled with 0
. Thus 123
would be stored as a Java byte
using 0b01111011
, as a Java short using 0b0000000001111011
, and as a Java int
using 0b00000000000000000000000001111011
. This is no different than with everyday decimal numbers. (Don't forget that Java considers representations starting with 0 as indicating octal, however.)
A close examination of the representation of the 16-bit representation of -123
will show that its leftmost bits have been instead filled with 1
. (This follows from the above formula for converting a positive to a negative value using two's complement: the leading zeros will be changed to ones.) Thus -123
would be stored as a Java byte
using 0b10000101
, as a Java short using 0b1111111110000101
, and as a Java int
using 0b11111111111111111111111110000101
. The unexpected consequence is that when widening a signed value the sign bit (eight 0
or 1
) will be extended
to fill the remaining space! This makes complete sense, as one would expect the byte
value -123
(0b10000101
) to remain -123
(0b11111111111111111111111110000101
) when widened to an int
.
Widening Unsigned Values
This sign extension can cause enormous problems when widening a signed type that is being interpreted as holding an unsigned value. As you saw above, although a Java considers a byte
to be signed, in most cases one can ignore the sign and consider the bits as representing an unsigned value. Thus the eight bits 0b10000101
(which Java interprets as -123
) can also be interpreted as holding unsigned value 133
. The problem arises when attempting to retrieve the unsigned value by casting to a wider type, because Java will extend the sign into the new space of the wider type, as discussed above.
Therefore when widening a signed type that is being used to hold an unsigned value, you must remove the extended sign bits. This is easily done performing an AND operation with a mask in which the upper bits are all 0
.
byte
133
(0b0000000001111011
) (also interpreted as -123
) to an unsigned value by widening to a 16-bit short
and removing the extended sign bits.byte b = (byte)133 | -123 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
short signed = b | -123 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
short unsigned = signed & 0b0000000011111111 | 133 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
Review
Summary
Operators | Precedence |
---|---|
additive | + - |
assignment | = += -= *= /= %= &= ^= |= <<= >>= >>>= |
bitwise AND | & |
bitwise exclusive OR | ^ |
bitwise inclusive OR | | |
equality | == != |
logical AND | && |
logical OR | || |
multiplicative | * / % |
postfix | expr++ expr-- |
relational | < > <= >= instanceof |
shift | << >> >>> |
ternary | ? : |
unary | ++expr --expr +expr -expr ~ ! |
Gotchas
- The symbols
&&
and||
are logical operators and are short-circuiting. The symbols&
and|
can be used as bitwise operators orlogical
operators, but they are not short-circuiting. - Remember that the Java integer types are signed!
- When widening a signed type you are using to store unsigned values, without taking precautions the sign will be extended and not produce the unsigned value you intended.
In the Real World
- You no longer have to do manual unsigned widening conversions, as this functionality is now available in the primitive wrapper classes in methods such as
Byte.toUnsignedInt(byte)
.
Self Evaluation
- What is the result of
9128734650
? - What is the difference between the operator
>>
and the operator>>>
? - What is the difference between the operator
||
and the operator|
? - How could you use the
&
operator to determine if anint
is positive or negative? - How could you replicate the results of the complement operator using the XOR operator?
- In two's complement representation, is the sign represented by the least-significant bit or the most-significant bit?
- What Java integer type is unsigned?
- What Java type would you use to represent the values
0
–255
, that is the full value range of eight bits?
Task
Create a method getBinaryByteValues(byte[] bytes)
to extract byte values to integers.
- The method will consider each byte to be an unsigned value.
- The method will return an
int[]
containing the same unsigned values as were stored in thebyte[]
. - Test the method extensively. Include the
[2, 3, 4, 234, 0, 0, 0, 234]
as inputs in one of the tests.
See Also
- Bitwise and Bit Shift Operators (Oracle - The Java™ Tutorials)
- Two's Complement (Thomas Finley)
- Java SE 8's New Compact Profiles and Integer APIs: Unsigned Integer API (Jeff Friesen, Informit)
References
- Elementary arithmetic (Wikipedia)
- Summary of Operators (Oracle - The Java™ Tutorials)
- Operator Precedence (Oracle - The Java™ Tutorials)
- The Java® Language Specification, Java SE 8 Edition: Chapter 15.22. Bitwise and Logical Operators (Oracle)
- Mask (computing) (Wikipedia)
- Two's complement (Wikipedia)
Acknowledgments
- Combination disc image modified from illustration of a simple combination lock by Wapcaplet and licensed under Creative Commons Attribution-Share Alike 3.0 Unported.