# 32-bit Multiplication Problem

Discussion in 'Arduino' started by MrAl, Apr 4, 2014.

1. ### MrAl (Well-Known Member, Most Helpful Member)

Joined: Sep 7, 2008 · Messages: 11,026 · Likes: 951 · Location: NJ

Hello there,

In the Arduino docs they specify a type "unsigned long", which is a 32-bit unsigned number. They also state the max value is 2^32-1. This is pretty standard.

What happens, though, is that when I multiply one number by another and the result comes out to more than 24 bits, the result is not correct and might even be somewhat random, or at least appear that way.

For all the following examples, we have Count declared as:
unsigned long Count;

and the analog result returned from analogRead is 512 decimal.

What works, for example:
Count=5000*4*analogRead(A0); //result fits in 24 bits, comes out correct

What does NOT work:
Count=5000*8*analogRead(A0); //result needs 25 bits, comes out wrong

I noted that the difference between these two examples is that the first result fits in 24 bits while the second requires 25 bits. That's one more than 24 bits, so I thought 24 bits might somehow be the max we can use. This would be very strange if true, so I did another test...

What else works fine:
Count=5000*4*analogRead(A0); //as before, returns a successful result
Count=Count+Count; //doubles the previous value, which is what I wanted

This achieves the desired result of 5000*8*analogRead(A0) and works just fine, even though it returns a result that requires at least 25 bits. This roundabout method works while the direct method does not.

I never had this problem in 'regular' C on a 'regular' C compiler.

2. ### Pommie (Well-Known Member, Most Helpful Member)

Joined: Mar 18, 2005 · Messages: 10,011 · Likes: 316 · Location: Brisbane, Australia

Have you checked how it handles constants? It may be using 24 bits for those and causing this strange (I would say buggy) result.

Is there any way to cast the constants to long?

Mike.

3. ### misterT (Well-Known Member, Most Helpful Member)

Joined: Apr 19, 2010 · Messages: 2,697 · Likes: 368 · Location: Finland

Yes.. that goes over the limit of int16 (5000*8 = 40000 does not fit in a signed 16-bit int). You need to write:

Count = 5000UL * 8UL * analogRead(A0);

Compilers for microcontrollers can be tricky that way: integer promotion happens just as the standard says, but it only promotes to a 16-bit int, so expressions don't widen the way you may expect from a PC compiler.
"Standard C will automatically promote their operands to an int, which is (by default) 16 bits in avr-gcc."

Last edited: Apr 4, 2014
5. ### Ian

Joined: Mar 28, 2011 · Messages: 9,148 · Likes: 907 · Location: Rochdale, UK

I almost always cast like this...

Count = (unsigned long)analogRead(A0) * 8 * 5000;

It seems to work for me... This casts the ADC result to an unsigned long first, then multiplies, so the whole expression is evaluated in 32 bits.

6. ### MrAl (Well-Known Member, Most Helpful Member)

Hello there guys,

Well, I don't know why I didn't think of casting to unsigned long; I should have known that.

All the solutions you guys presented here worked.

Using the casts, it came out to (Counta is an unsigned long returned from TakeReading(), which calls analogRead()):
Count = 5000UL * 8UL * Counta;

Putting the unsigned long as the first argument also worked, as pointed out above:
Count = Counta * 5000 * 8;

So the explicit casts work, and putting the argument that is already the right type first also works.

Thanks much

I'll be using this data type for almost everything I do, so I really need it.

/*============================================================*/

Secondary question:
-----------------------

I realize there could be a big performance penalty using doubles even if they were possible, but is it true that the compiler makes type double the same as type float, which means only 32-bit floats are possible and not 64-bit doubles as in 'regular' C?

7. ### misterT (Well-Known Member, Most Helpful Member)

Yes, from avr-libc documentation: "float and double are 32 bits (this is the only supported floating point format)"

Edit: I found this: "On the Arduino Due, doubles have 8-byte (64 bit) precision."
http://www.arduino.cc/en/Reference/Double

Last edited: Apr 4, 2014
8. ### MrAl (Well-Known Member, Most Helpful Member)

Hi misterT,

Oh yes, very interesting. I really wanted to use the ATmega chips, however; too bad for me, I guess. I don't know if I want to get involved with ARM processors yet. Maybe I'll have to look into them a little more anyway, though.
For most of my stuff I'll probably end up using pseudo floating point built on integers anyway, but it would be nice if they allowed doubles natively.
A friend wanted to create a voltmeter with a built-in calculator, with decent calculator-grade numerical precision.

[LATER]

Wow, took a look at the Due and the ARM processor. That's quite a bit more power there, with the 12-bit ADC and also a 12-bit DAC, real-time clock, etc.
The downside is that it is a one-off unit: the chip cannot easily be used on its own (the way a separate chip can with the Uno) unless we want to solder a whole bunch of tiny pins.

Last edited: Apr 4, 2014
9. ### misterT (Well-Known Member, Most Helpful Member)

Just use integers.. much better. Floats are slow, and the library that operates on floats is relatively large.

And in many cases you actually get better accuracy using integer math instead of floats, since you control the scaling and rounding yourself.
http://www.exploringbinary.com/floating-point-questions-are-endless-on-stackoverflow-com/

Last edited: Apr 4, 2014
10. ### MrAl (Well-Known Member, Most Helpful Member)

Hi misterT,

Yes, I agree. I think the float lib adds at least another full kilobyte.

Not sure if you saw the addition to my previous post. I took a look at the ARM processor and the Due. Quite a bit more power there, with the 12-bit ADC, 12-bit DAC, real-time clock, and so on. Looks nice, although it will be hard to make another board with the ARM chip because of the small pins.

Last edited: Apr 4, 2014
11. ### NorthGuy (Well-Known Member)

Joined: Sep 8, 2013 · Messages: 1,218 · Likes: 206

It's not as bad as it seems. You don't do every pin individually; you just drag your soldering iron across the pins. And if you need to manufacture many, it could be less expensive than through-hole because you do not need any holes.

Although if you have a good idea of the range of your values, as is usually the case with MCUs, fixed-point arithmetic is quite sufficient.

12. ### MrAl (Well-Known Member, Most Helpful Member)

Hi,

Yes, thanks NorthGuy. I usually use my own version of pseudo floating point built on integers anyway, but on the PC (not a uC) I have used doubles too, because the floating-point units in modern PC CPUs are very fast these days.

But I ran into a related problem when moving to the AMD eight-core PC CPU. I found out after I bought it that AMD pulled a cheapie on that chip that I had no idea anyone would be silly enough to do. On their new 'design' (the word 'design' in quotes because I have to wonder if it is a real design) they built eight integer cores but only four floating-point units. So for every two integer "cores" (again in quotes because of the difference from other CPUs) there is only one floating-point unit. Thus we have 8 integer units but only 4 floating-point units on an AMD 8-core processor!
This may not seem like a big deal, but it is. If we have an application that could benefit from 8 float units, we're beat, because we only have 4, so we're down to 4 cores for that application. The slowdown for some instructions is really nuts.

So you can see that I had to build my own pseudo-floating-point software using integer math in order to get past this ridiculous float bottleneck caused by AMD skimping on the cores. Intel does not do that, of course, but I don't think they have an 8-core CPU yet. AMD did not do that with their six-core Phenom either; they started with the FX chips.

13. ### misterT (Well-Known Member, Most Helpful Member)

This sounds interesting. Are you talking about fixed-point math, which is quite common, or do you actually have something more elaborate? For example, do you allow the decimal point to move around while you just keep track of it for every variable?

14. ### MrAl (Well-Known Member, Most Helpful Member)

Hi misterT,

Actually both, although I have used the "fixed point" integer math the most, because most of my applications worked with a known range of data, probably similar to what we find on uCs most of the time.

True floating point isn't that hard to do, though. It really just means keeping track of the exponent too. The library I created a long, long time ago was for the Z80 CPU, which had no floating point back then (and only a 4MHz clock, ha ha), but I needed that CPU at the time so I had to do something.
All the numbers are stored with the high bit set, so we always see numbers like 0x8______, and then the following bytes hold the sign and exponent. Then it's just a matter of determining the new sign and the new exponent based on the old data. Of course the sign is easy for multiplication, because like signs produce a positive result and unlike signs a negative one.
It has been quite a while since I looked at this, but now that I'm playing around with the AVR chips I might start looking into it again, and if you like I can try to post more information. I would think by now there should be something on the web about this, like the IEEE floating-point format and the like. Back when I had to do it there was no internet to speak of yet.
So it's just a matter of shifting and adding, and keeping track of sign and exponent. The exponent is what tells us where the decimal point is.

Oh yeah, there is also the mantissa truncation issue, where there is some variation in opinions on the best way to do it. Some say we just round; others say we get more stable math with the rule "every other result gets rounded up (and every other one down) regardless of the LSB"; and then there's "every result gets rounded by virtue of a random LSB added to the result". This last one is interesting because sometimes, say, 1010.0 gets rounded to 1010 and sometimes to 1011, and sometimes 1010.1 gets rounded to 1010 and sometimes to 1011, based on a random LSB rather than the true LSB.

With 8-bit integer math built into the chip it would be much faster than on the Z80, because the Z80 had no multiply instruction at all, so everything had to be done bit by bit, shifting one bit at a time. With built-in 8-bit multiply we can handle 8 bits at a time, which should speed multiplication up quite a bit.

Last edited: Apr 4, 2014
• Like x 1
15. ### misterT (Well-Known Member, Most Helpful Member)

That is interesting. It just sounds like, with the same effort, you could have implemented standard IEEE floating point in software. I have always intended to write a "fixed point" library where the decimal point is not actually fixed in a constant place, but its place is stored with the variable itself.. something like:

struct fp {
    int32_t value;
    uint8_t n;
};

Where n is the number of fractional bits.. The only reason I have not written the library is that I have no use for it.. haha
But I think it could be a good compromise between software (IEEE) floats and traditional fixed point. Well, I think there are systems that use that solution; it is not a new thing.

16. ### Ian

I hope you realise that if you need proper floating-point math, you can get a bolt-on FPU quite cheap... SPI or I2C..
https://www.sparkfun.com/products/8450

I have never used one (I think Microchip does one as well). The aforementioned chip has an Arduino library...

Just for reference.....

• Like x 1
17. ### misterT (Well-Known Member, Most Helpful Member)

That would be interesting to try out.. or to hear from somebody who has tried it. Maybe Google turns up some projects etc.

18. ### MrAl (Well-Known Member, Most Helpful Member)

Hi misterT,

Yes, it was similar to the IEEE spec, except I wanted extended exponent precision.
I might get around to doing something with this, as I might want 64-bit floats for a few apps on the uC chips.

There are tricks that make it faster, such as keeping the leading bit always equal to 1 and adjusting the exponent to compensate. It's the equivalent of scientific notation, where everything has a single digit before the decimal point:
1.34e+06
3.14e+00

but of course we are lucky to be limited to only ones and zeros, so we have:
1.000111e+04
1.1001111e-03
-1.1001e+12

For addition and subtraction there is always an adjustment of the mantissa first, and for multiplication and division there is always an adjustment of the exponent after the multiplies and adds.
Of course the multiplies are "fun", separating all the bits into two or more groups and multiplying as we do in decimal:
456*321
where we would first multiply 1*6, then 1*5, etc., then shift, then 2*6, etc. In the binary world it's a tiny bit simpler because each digit is only a 1 or 0, so we either add the first number or don't; it's that simple.
With the uC math, though, that would be 8 bits at a time, so we'd work it in eight-bit groups:
0x0456 * 0x0321
so we'd multiply hex 21 times hex 56, then hex 21 times hex 04, etc., then 03 times 56 and 03 times 04, then add the shifted partial results. For a full 64-bit mantissa we'd have:
0x12345678 * 0x87654321
so it would start with hex 21 times hex 78, etc.
That's about how it would work, I would imagine. I did the binary version, though, not the 8-bit word version.
Also interesting: in binary, taking the square root is very much simpler than in decimal.

Those are just a few examples.

I almost forgot to mention that along with the floating-point math routines themselves comes the extra weight of the conversion routines, for converting from decimal to floating point and from floating point back to decimal. That takes more code and more calculation time if the user wants to see the output on some display. The raw coding requires a base-ten conversion routine, but if we already have the Arduino math lib then maybe we can use that. It also means storing some conversion constants so we don't have to keep recalculating them.

Last edited: Apr 5, 2014
19. ### MrAl (Well-Known Member, Most Helpful Member)

Hi Ian,

Nice find there, but isn't that also 32-bit? We already have 32-bit floats on the Arduino, so I was looking for 64-bit floats or maybe better.
Or does it do 64-bit floats too?

20. ### NorthGuy (Well-Known Member)

They should have SSE instructions, which offer much more computing power than the FPU.

21. ### MrAl (Well-Known Member, Most Helpful Member)

Hi NorthGuy,

They have SSE, but they would still have to rely on the floating-point units to do the actual floating-point math.