Chapter One: Data Representation
Probably the biggest stumbling block most beginners encounter when attempting to learn assembly language is the common use of the binary and hexadecimal numbering systems. Many programmers think that hexadecimal (or hex¹) numbers represent absolute proof that God never intended anyone to work in assembly language. While it is true that hexadecimal numbers are a little different from what you may be used to, their advantages outweigh their disadvantages by a large margin. Nevertheless, understanding these numbering systems is important because their use simplifies other complex topics including boolean algebra and logic design, signed numeric representation, character codes, and packed data.

¹ Hexadecimal is often abbreviated as hex even though, technically speaking, hex means base six, not base sixteen.
1.0 Chapter Overview
This chapter discusses several important concepts including the binary and hexadecimal numbering systems, binary data organization (bits, nibbles, bytes, words, and double words), signed and unsigned numbering systems, arithmetic, logical, shift, and rotate operations on binary values, bit fields and packed data, and the ASCII character set. This is basic material and the remainder of this text depends upon your understanding of these concepts. If you are already familiar with these terms from other courses or study, you should at least skim this material before proceeding to the next chapter. If you are unfamiliar with this material, or only vaguely familiar with it, you should study it carefully before proceeding. All of the material in this chapter is important! Do not skip over any material.
1.1 Numbering Systems
Most modern computer systems do not represent numeric values using the decimal
system. Instead, they typically use a binary or two’s complement numbering system. To
understand the limitations of computer arithmetic, you must understand how computers
represent numbers.
1.1.1 A Review of the Decimal System
You've been using the decimal (base 10) numbering system for so long that you probably take it for granted. When you see a number like "123", you don't think about the value 123; rather, you generate a mental image of how many items this value represents. In reality, however, the number 123 represents:
1*10^2 + 2*10^1 + 3*10^0

or

100 + 20 + 3
Each digit appearing to the left of the decimal point represents a value between zero
and nine times an increasing power of ten. Digits appearing to the right of the decimal
point represent a value between zero and nine times an increasing negative power of ten.
For example, the value 123.456 means:
1*10^2 + 2*10^1 + 3*10^0 + 4*10^-1 + 5*10^-2 + 6*10^-3

or

100 + 20 + 3 + 0.4 + 0.05 + 0.006
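If you would like to see this positional rule spelled out in code, the following C sketch (the function name decimal_value is ours, not something defined by this text, and it handles only the integer part for simplicity) weights each digit by the appropriate power of ten:

#include <stdio.h>
#include <string.h>

/* Evaluate a string of decimal digits using positional notation:
   the rightmost digit is multiplied by 10^0, the next by 10^1, etc.
   (Integer part only, for simplicity.) */
long decimal_value(const char *digits)
{
    long value = 0;
    long weight = 1;                        /* 10^0 for the rightmost digit */
    for (int i = (int)strlen(digits) - 1; i >= 0; i--) {
        value += (digits[i] - '0') * weight;
        weight *= 10;                       /* each digit to the left weighs ten times more */
    }
    return value;
}

int main(void)
{
    printf("%ld\n", decimal_value("123")); /* 1*100 + 2*10 + 3*1 = 123 */
    return 0;
}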
1.1.2 The Binary Numbering System
Most modern computer systems (including the IBM PC) operate using binary logic. The computer represents values using two voltage levels (usually 0v and +5v). With two such levels we can represent exactly two different values. These could be any two different values, but by convention we use the values zero and one. These two values, coincidentally, correspond to the two digits used by the binary numbering system. Since there is a correspondence between the logic levels used by the 80x86 and the two digits used in the binary numbering system, it should come as no surprise that the IBM PC employs the binary numbering system.
The binary numbering system works just like the decimal numbering system, with
two exceptions: binary only allows the digits 0 and 1 (rather than 0-9), and binary uses
powers of two rather than powers of ten. Therefore, it is very easy to convert a binary
number to decimal. For each "1" in the binary string, add in 2^n where "n" is the zero-based position of the binary digit. For example, the binary value 11001010₂ represents:

1*2^7 + 1*2^6 + 0*2^5 + 0*2^4 + 1*2^3 + 0*2^2 + 1*2^1 + 0*2^0
= 128 + 64 + 8 + 2
= 202₁₀
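This rule translates directly into a few lines of C. The sketch below (binary_value is our own illustrative name) scans a string of binary digits and adds in 2^n for every "1" it finds:

#include <stdio.h>
#include <string.h>

/* Convert a string of binary digits to its decimal value by adding
   2^n for each "1" digit, where n is the zero-based position of the
   digit counted from the right. */
unsigned long binary_value(const char *bits)
{
    unsigned long value = 0;
    size_t len = strlen(bits);
    for (size_t i = 0; i < len; i++) {
        if (bits[i] == '1')
            value += 1UL << (len - 1 - i);  /* 1UL << n is 2^n */
    }
    return value;
}

int main(void)
{
    printf("%lu\n", binary_value("11001010"));  /* prints 202 */
    return 0;
}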
To convert decimal to binary is slightly more difficult. You must find those powers of two which, when added together, produce the decimal result. The easiest method is to work from a large power of two down to 2^0 (a code sketch of this procedure follows the list below). Consider the decimal value 1359:

• 2^10 = 1024, 2^11 = 2048. So 1024 is the largest power of two less than 1359. Subtract 1024 from 1359 and begin the binary value on the left with a "1" digit. Binary = "1", Decimal result is 1359 - 1024 = 335.
• The next lower power of two (2^9 = 512) is greater than the result from above, so add a "0" to the end of the binary string. Binary = "10", Decimal result is still 335.
• The next lower power of two is 256 (2^8). Subtract this from 335 and add a "1" digit to the end of the binary number. Binary = "101", Decimal result is 79.
• 128 (2^7) is greater than 79, so tack a "0" onto the end of the binary string. Binary = "1010", Decimal result remains 79.
• The next lower power of two (2^6 = 64) is less than 79, so subtract 64 and append a "1" to the end of the binary string. Binary = "10101", Decimal result is 15.
• 15 is less than the next power of two (2^5 = 32), so simply add a "0" to the end of the binary string. Binary = "101010", Decimal result is still 15.
• 16 (2^4) is greater than the remainder so far, so append a "0" to the end of the binary string. Binary = "1010100", Decimal result is 15.
• 2^3 (eight) is less than 15, so subtract eight and stick another "1" digit on the end of the binary string. Binary = "10101001", Decimal result is 7.
• 2^2 is less than seven, so subtract four from seven and append another one to the binary string. Binary = "101010011", Decimal result is 3.
• 2^1 is less than three, so append a one to the end of the binary string and subtract two from the decimal value. Binary = "1010100111", Decimal result is now 1.
• Finally, the decimal result is one, which is 2^0, so add a final "1" to the end of the binary string. The final binary result is "10101001111".
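The same procedure is easy to express in C. The sketch below (print_binary is our own name, and the fixed 32-bit starting point is an assumption made only for illustration) works from a large power of two down to 2^0 exactly as the steps above do:

#include <stdio.h>

/* Convert a decimal value to a binary string using the procedure
   above: work from a large power of two down to 2^0, emitting a "1"
   and subtracting whenever the power fits into what is left. */
void print_binary(unsigned value)
{
    unsigned remainder = value;
    int started = 0;
    for (int n = 31; n >= 0; n--) {
        unsigned power = 1u << n;           /* 2^n */
        if (power <= remainder) {
            putchar('1');
            remainder -= power;
            started = 1;
        } else if (started || n == 0) {
            putchar('0');                   /* print zeros only after the first 1 */
        }
    }
    putchar('\n');
}

int main(void)
{
    print_binary(1359);   /* prints 10101001111 */
    return 0;
}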
Binary numbers, although they have little importance in high level languages, appear
everywhere in assembly language programs.
1.1.3 Binary Formats
In the purest sense, every binary number contains an infinite number of digits (or bits, which is short for binary digits). For example, we can represent the number five by:

101    00000101    0000000000101 ... 000000000000101

Any number of leading zero bits may precede the binary number without changing its value. We will adopt the convention of ignoring any leading zeros. For example, 101₂ represents the number five. Since the 80x86 works with groups of eight bits, we'll find it much easier to zero extend all binary numbers to some multiple of four or eight bits. Therefore, following this convention, we'd represent the number five as 0101₂ or 00000101₂.
In the United States, most people separate every three digits with a comma to make larger numbers easier to read. For example, 1,023,435,208 is much easier to read and comprehend than 1023435208. We'll adopt a similar convention in this text for binary numbers. We will separate each group of four binary bits with a space. For example, the binary value 1010111110110010 will be written 1010 1111 1011 0010.
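If you care to generate this formatting by program, the short C sketch below (print_bits16 is our own name) prints a 16-bit value with a space after every group of four bits:

#include <stdio.h>

/* Print a 16-bit value in binary, separating each group of four bits
   with a space, e.g. 1010 1111 1011 0010. */
void print_bits16(unsigned short w)
{
    for (int n = 15; n >= 0; n--) {
        putchar(((w >> n) & 1) ? '1' : '0');
        if (n % 4 == 0 && n != 0)
            putchar(' ');                  /* space after every group of four bits */
    }
    putchar('\n');
}

int main(void)
{
    print_bits16(0xAFB2);                  /* prints 1010 1111 1011 0010 */
    return 0;
}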
We often pack several values together into the same binary number. One form of the 80x86 MOV instruction (see appendix D) uses the binary encoding

1011 0rrr dddd dddd

to pack three items into 16 bits: a five-bit operation code (10110), a three-bit register field (rrr), and an eight-bit immediate value (dddd dddd); a short code sketch of this sort of packing appears after the list below. For convenience, we'll assign a numeric value to each bit position. We'll number each bit as follows:

1) The rightmost bit in a binary number is bit position zero.
2) Each bit to the left is given the next successive bit number.
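This kind of packing is nothing more than shifting and masking. The C sketch below (the pack helper is ours and is only an illustration of the field layout shown above, not an assembler) builds such a 16-bit value from the three fields and then extracts them again:

#include <stdio.h>

/* Pack a five-bit operation code, a three-bit register field, and an
   eight-bit immediate value into one 16-bit word laid out as
   1011 0rrr dddd dddd (operation code in bits 15..11, register field
   in bits 10..8, immediate in bits 7..0). */
unsigned short pack(unsigned opcode, unsigned reg, unsigned imm)
{
    return (unsigned short)((opcode & 0x1F) << 11 |   /* five bits  */
                            (reg    & 0x07) << 8  |   /* three bits */
                            (imm    & 0xFF));         /* eight bits */
}

int main(void)
{
    unsigned short instr = pack(0x16, 3, 0x42);  /* 0x16 = 10110, the operation code shown above */
    printf("%04X\n", instr);                     /* prints B342 */
    printf("reg = %d, imm = %02X\n",
           (instr >> 8) & 0x07,
           (unsigned)(instr & 0xFF));            /* unpack the fields again with shifts and masks */
    return 0;
}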
An eight-bit binary value uses bits zero through seven:

X₇ X₆ X₅ X₄ X₃ X₂ X₁ X₀
A 16-bit binary value uses bit positions zero through fifteen:

X₁₅ X₁₄ X₁₃ X₁₂ X₁₁ X₁₀ X₉ X₈ X₇ X₆ X₅ X₄ X₃ X₂ X₁ X₀
Bit zero is usually referred to as the low order (L.O.) bit. The left-most bit is typically called the high order (H.O.) bit. We'll refer to the intermediate bits by their respective bit numbers.
1.2 Data Organization
In pure mathematics a value may take an arbitrary number of bits. Computers, on the
other hand, generally work with some specific number of bits. Common collections are single bits, groups of four bits (called nibbles), groups of eight bits (called bytes), groups of 16 bits (called words), and more. The sizes are not arbitrary. There is a good reason for these particular values. This section will describe the bit groups commonly used on the Intel 80x86 chips.
1.2.1 Bits
The smallest "unit" of data on a binary computer is a single bit. Since a single bit is capable of representing only two different values (typically zero or one), you may get the impression that there are a very small number of items you can represent with a single bit. Not true! There are an infinite number of items you can represent with a single bit.
With a single bit, you can represent any two distinct items. Examples include zero or one, true or false, on or off, male or female, and right or wrong. However, you are not limited to representing binary data types (that is, those objects which have only two distinct values). You could use a single bit to represent the numbers 723 and 1,245. Or perhaps 6,254 and 5. You could also use a single bit to represent the colors red and blue. You could even represent two unrelated objects with a single bit. For example, you could represent the color red and the number 3,256 with a single bit. You can represent any two different values with a single bit. However, you can represent only two different values with a single bit.
To confuse things even more, different bits can represent different things. For example, one bit might be used to represent the values zero and one, while an adjacent bit might be used to represent the values true and false. How can you tell by looking at the bits? The answer, of course, is that you can't. But this illustrates the whole idea behind computer data structures: data is what you define it to be. If you use a bit to represent a boolean (true/false) value then that bit (by your definition) represents true or false. For the bit to have any true meaning, you must be consistent. That is, if you're using a bit to represent true or false at one point in your program, you shouldn't use the true/false value stored in that bit to represent red or blue later.
Since most items you'll be trying to model require more than two different values, single bit values aren't the most popular data type you'll use. However, since everything else consists of groups of bits, bits will play an important role in your programs. Of course, there are several data types that require two distinct values, so it would seem that bits are important by themselves. However, you will soon see that individual bits are difficult to manipulate, so we'll often use other data types to represent boolean values.
1.2.2 Nibbles
A nibble is a collection of four bits. It wouldn't be a particularly interesting data structure except for two items: BCD (binary coded decimal) numbers and hexadecimal numbers. It takes four bits to represent a single BCD or hexadecimal digit. With a nibble, we can represent up to 16 distinct values. In the case of hexadecimal numbers, the values 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F are represented with four bits (see "The Hexadecimal Numbering System" on page 17). BCD uses ten different digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) and requires four bits. In fact, any sixteen distinct values can be represented with a nibble, but hexadecimal and BCD digits are the primary items we can represent with a single nibble.
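For example, packed BCD stores one decimal digit per nibble, so two decimal digits fit in a single byte. A minimal C sketch (the helper name bcd_pack is ours) of packing and unpacking such a value:

#include <stdio.h>

/* Pack two decimal digits into one byte as BCD: one digit per nibble. */
unsigned char bcd_pack(unsigned tens, unsigned ones)
{
    return (unsigned char)((tens << 4) | ones);   /* tens in the high nibble, ones in the low nibble */
}

int main(void)
{
    unsigned char b = bcd_pack(4, 2);             /* the decimal value 42 */
    printf("%02X\n", b);                          /* prints 42 -- packed BCD reads the same in hex */
    printf("%d\n", ((b >> 4) & 0x0F) * 10 + (b & 0x0F));   /* unpack: prints 42 */
    return 0;
}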
1.2.3 Bytes
Without question, the most important data structure used by the 80x86 microprocessor is the byte. A byte consists of eight bits and is the smallest addressable datum (data item) on the 80x86 microprocessor. Main memory and I/O addresses on the 80x86 are all byte addresses. This means that the smallest item that can be individually accessed by an 80x86 program is an eight-bit value. To access anything smaller requires that you read the byte containing the data and mask out the unwanted bits. The bits in a byte are normally numbered from zero to seven using the convention in Figure 1.1. Bit 0 is the low order bit or least significant bit, bit 7 is the high order bit or most significant bit of the byte. We'll refer to all other bits by their number.
7 6 5 4 3 2 1 0
Figure 1.1: Bit Numbering in a Byte
Note that a byte also contains exactly two nibbles (see Figure 1.2 ).
7 6 5 4 | 3 2 1 0
Figure 1.2: The Two Nibbles in a Byte
Bits 0..3 comprise the low order nibble, bits 4..7 form the high order nibble. Since a byte contains exactly two nibbles, byte values require two hexadecimal digits.
Since a byte contains eight bits, it can represent 2^8, or 256, different values. Generally, we'll use a byte to represent numeric values in the range 0..255, signed numbers in the range -128..+127 (see "Signed and Unsigned Numbers" on page 23), ASCII/IBM character codes, and other special data types requiring no more than 256 different values. Many data types have fewer than 256 items so eight bits is usually sufficient.
Since the 80x86 is a byte addressable machine (see "Memory Layout and Access" on page 145), it turns out to be more efficient to manipulate a whole byte than an individual bit or nibble. For this reason, most programmers use a whole byte to represent data types that require no more than 256 items, even if fewer than eight bits would suffice. For example, we'll often represent the boolean values true and false by 00000001₂ and 00000000₂ (respectively).
Probably the most important use for a byte is holding a character code. Characters typed at the keyboard, displayed on the screen, and printed on the printer all have numeric values. To allow it to communicate with the rest of the world, the IBM PC uses a variant of the ASCII character set (see "The ASCII Character Set" on page 28). There are 128 defined codes in the ASCII character set. IBM uses the remaining 128 possible values for extended character codes including European characters, graphic symbols, Greek letters, and math symbols. See Appendix A for the character/code assignments.
1.2.4 Words
A word is a group of 16 bits. We’ll number the bits in a word starting from zero on up to
fifteen. The bit numbering appears in Figure 1.3.
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
Figure 1.3: Bit Numbers in a Word
Like the byte, bit 0 is the low order bit and bit 15 is the high order bit. When referencing the other bits in a word, use their bit position number.
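If you want to experiment with this bit numbering, the following C sketch (the helper names are ours) sets and tests individual bits of a 16-bit word by position number:

#include <stdio.h>

/* Work with individual bit positions in a 16-bit word using the
   numbering of Figure 1.3: bit 0 is the low order bit, bit 15 the
   high order bit. */
unsigned short set_bit(unsigned short w, int n)  { return (unsigned short)(w | (1u << n)); }
int            test_bit(unsigned short w, int n) { return (w >> n) & 1; }

int main(void)
{
    unsigned short w = 0;
    w = set_bit(w, 15);                     /* turn on the high order bit */
    w = set_bit(w, 0);                      /* turn on the low order bit  */
    printf("%04X %d %d\n", w, test_bit(w, 15), test_bit(w, 7));  /* prints 8001 1 0 */
    return 0;
}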