# bits 'game'

Discussion in 'Mathematics and Physics' started by electroRF, Jan 10, 2014.

Not open for further replies.
1. ### electroRFMember

Joined:
Jun 23, 2012
Messages:
689
Likes:
9
Location:
Portugal
Hi,
I'm wondering about the following question:

assume W,X,Y,Z are 8-bit registers.

Given the following expression:

Code (C):
(((W-X)>>3) + Y-Z+4)<<2
how many bits are needed to prevent overflow?

I'm wondering what would be a 'neat' way to solve it?

* it isn't mentioned in the question, but I assumed the registers are signed 8-bit numbers, because I think that gives the maximum number of bits required to avoid overflow.

Last edited: Jan 10, 2014
2. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
electroRF,

"how many bits are needed to prevent overflow?"

No matter how many bits a register has, an overflow can still happen.

"I'm wondering what would be a 'neat' way to solve it?"

You have to define the problem in a way that makes sense first.

"* it isn't mentioned in the question, but I assumed the registers are signed 8-bit numbers, because I think that gives the maximum number of bits required to avoid overflow. "

What you say above does not make sense. For addition and subtraction, signed and unsigned are the same. The results depend on how the result is intrepreted. The ADD and SUB instructions are the same for either type of number.

Ratch

3. ### electroRFMember

Joined:
Jun 23, 2012
Messages:
689
Likes:
9
Location:
Portugal
Hi Ratchit,
The question was about how many bits should the result have to avoid overflow.

it DOES matter whether the registers are signed or unsigned.

if W,X are unsigned, then W-X will never require more than 8 bits for the result.
if W,X are signed, then W-X may require 9 bits for the result:
say W = 2^7 - 1 , ans X = - 2^7
Then W-X = 2^8 - 1, which is represented in 9 bits for signed numbers.

Joined:
Jan 12, 1997
Messages:
-
Likes:
0

5. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE

electroRF,

"The question was about how many bits should the result have to avoid overflow."

And the answer is still that an overflow can occur no matter what the size of the register is.

"it DOES matter whether the registers are signed or unsigned"

The hardware of every arithmetic register is the same. There are no signed or unsigned registers. Whether a number is signed or not depends on how it is interpreted. For instance, an 8-bit register whose contents are 0FFh can be interpreted as; 255 for unsigned, -0 in 1's complement, or -1 in 2's complement. Try doing the computations on paper, and see that adding and subtracting numbers are the same regardless whether they are interpreted as signed or unsigned.

I think you have to study what overflow and carryout means to answer your next questions.

Ratch

6. ### misterTWell-Known MemberMost Helpful Member

Joined:
Apr 19, 2010
Messages:
2,697
Likes:
368
Location:
Finland
You need to apply some common sense and logical thinking when you have a question like this. The description of the problem says:
"assume W,X,Y,Z are 8-bit registers."

This means that the range of these variables are [0, 255] for unsigned values and [-128, 127] for signed values.

If you add two of these values using 32 bit registers, you will never have an overflow.

Now the question is: What is the minimum size working register so that overflow does not happen when you calculate:
(((W-X)>>3)+ Y-Z+4)<<2

You need to assume that you first move the 8-bit variables from their original 8-bit register to the working register/registers.

7. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
misterT,

"You need to apply some common sense and logical thinking when you have a question like this. The description of the problem says:
"assume W,X,Y,Z are 8-bit registers.""

What you really mean to say is that you have to intrepret what the OP meant.

"This means that the range of these variables are [0, 255] for unsigned values and [-128, 127] for signed values."

Yes, that is correct for 2's complement representation.

"If you add two of these values using 32 bit registers, you will never have an overflow"

Yes, that is correct also.

"Now the question is: What is the minimum size working register so that overflow does not happen when you calculate:
(((W-X)>>3)+ Y-Z+4)<<2"

The term "working register" is not mentioned by the OP. Why should I interpret it that way? Also, does the ">>" symbol mean arithmetic shift or logical shift?

"You need to assume that you first move the 8-bit variables from their original 8-bit register to the working register/registers."

No, if you move them to a larger register, then the overflow will not happen. You need to work the the worst case with pencil and paper.

Ratch

8. ### NorthGuyWell-Known Member

Joined:
Sep 8, 2013
Messages:
1,218
Likes:
206
Location:
Assuming that the result is going to be stored in 8-bit register, you can select values of W, X, Y and Z (e.g. Y = 127 and the rest is zero) such than an overflow occurs no matter what.

If you're asked such question on the interview ... run.

9. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
NorthGuy,

I think he wants to know how many bits he needs to do the operation (((W-X)>>3) + Y-Z+4)<<2 using the worst case maximum values for 8-bit 2's complement numbers.

One bit is used for the sign, so that leaves 7-bits for the number. The positive number that will fill all the 7-bits are 127 decimal=7F hex. And -127 decimal = 81 hex.

If W = 7F and X = 81 hex we get W-X = 1FE. Shifting 3 bits right we get 3F. Adding Y =7F gives 46 hex. Subtracting Z = 81 hex gives 1C5. Adding 4 gives 1C9 hex. Shifting 2 bits left gives 724 hex which takes 11 bits to represent plus 1 sign bit for a total of 12 bits.

Ratch

10. ### NorthGuyWell-Known Member

Joined:
Sep 8, 2013
Messages:
1,218
Likes:
206
Location:
Minumum 8-bit signed number is -128 (0x80).

11. ### misterTWell-Known MemberMost Helpful Member

Joined:
Apr 19, 2010
Messages:
2,697
Likes:
368
Location:
Finland
Yes, and I rather interpret it so that it makes sense. Not like some sleazy lawyer trying to find something he can use against others.

12. ### misterTWell-Known MemberMost Helpful Member

Joined:
Apr 19, 2010
Messages:
2,697
Likes:
368
Location:
Finland
Let me show you how you can find the solution step by step. Take this part of the equation for example:

(W-X)
Those are 8-bit variables, so the maximum of that equation is:

255-(-128) = 383
You need 9 bits to store that (10 with 2's complement).

The minimum is:
-128-255 = -383
You need 10 bits to store that in 2's complent.

So, you need at least 10 bit register to avoid overflow.

Last edited: Jan 12, 2014
13. ### NorthGuyWell-Known Member

Joined:
Sep 8, 2013
Messages:
1,218
Likes:
206
Location:
Assuming all signed variables and arithmetic shifts, we can set all variables to their maximums(127) and minimums(-128) in such a way that the end result is maximuzed. It'll give us 1163. If we set them all so that the end result is minimized, you get -1131. To represent numbers between -1131 to 1163, you'll need 12-bit (-2048 to 2047). 11 bits (-1024 to 1023) will not do it.

If you assume unsigned variables then you cannot prevent overflow (going below zero) with any amount of bits.

This all could make some sense if an intermediate result would require more bits than the end result, but this is not the case. This tells us that the person who created this task is lacking imagination. If he asks such question, he cannot offer anything but a very boring job.

14. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
misterT,

"255-(-128) = 383
You need 9 bits to store that (10 with 2's complement)."

You cannot use 255 for a 8-bit signed number. 127 is the maximum positive value for a signed 8-bit number. And you must use signed number representation to activate the overflow flag.

Ratch

15. ### misterTWell-Known MemberMost Helpful Member

Joined:
Apr 19, 2010
Messages:
2,697
Likes:
368
Location:
Finland
You must realize that my example was the "absolute worst case" where both signed and unsigned can be used. It is perfectly fine. In that case both need to be promoted to 10 bit 2's complement and then the operation is executed.
Have you ever seen a C compiler giving an error or a warning if you try to add or subtract uint8_t with int8_t?? No, because it is perfectly fine.

16. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
No, it is not perfectly fine. 255 unsigned represents a negative signed number for 8 bits. Subtracting signed -128 will guarantee that overflow will never happen. But, overflow will will happen if 127 -(-128).

Ratch

17. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
NorthGuy,

"Minumum 8-bit signed number is -128 (0x80). "

Yes, you are correct. My mistake.

I guess you agree with me that 12 bits are needed.

Ratch

18. ### misterTWell-Known MemberMost Helpful Member

Joined:
Apr 19, 2010
Messages:
2,697
Likes:
368
Location:
Finland
I know you are a smart guy and you know exactly what I mean with my previous post.. so just stop trolling.
It is perfectly fine to mix different datatypes in calculations.. There are even specific assembly instructions to "multiply signed with unsigned" etc.
You only need to make sure that the register where you store the result is proper size and type.

Last edited: Jan 12, 2014
19. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
misterT,

255 unsigned represents a negative signed number in 8-bits. Subtracting -128 is the same adding +128. Addition of two numbers with unlike signs will never be an overflow. Multiplication need different CPU instructions for signed or unsigned multiply. Integer addition and subtraction do not.

Ratch

20. ### misterTWell-Known MemberMost Helpful Member

Joined:
Apr 19, 2010
Messages:
2,697
Likes:
368
Location:
Finland
Exactly.. Signed and unsigned do not need any special CPU instructions. So there is no problem. Just place the 255 in the (bigger) result register (this is called implicit casting and promoting, which happens all the time) and then add/subtract the other variable from it. No problem.
https://www.securecoding.cert.org/c.../INT02-C.+Understand+integer+conversion+rules

21. ### RatchitWell-Known Member

Joined:
Mar 12, 2008
Messages:
1,955
Likes:
83
ONLINE
I program in assembler, not C, therefore that link does not mean anything to me. However, the C compiler uses the native word size which is usually larger than 8-bits. I still stand by what I said about 255 being a negative number in signed 8-bit words. Perhaps the compiler expands the 8-bit word to a larger size if if the value does not fit into 8-bits. If it does, that goes beyond the problem we are discussing.

Ratch