Floating Point Helper

I asked a couple of days ago about some floating point stuff from SourceForge. I got no responses, so I fell back on the easier-to-understand Microchip routines. Microchip includes a program to do the conversions for you (I presume that's what it's for), but it won't open on my computer. So I made a spreadsheet to help me out, and I also did some testing.

It will take a Microchip-format floating point hex number and convert it to decimal, and vice versa. It will also do the following math operations on the input, and display the answers both in floating point hex and in decimal so they can be compared to the output of a routine in a PIC program (a rough C sketch of the hex-to-decimal direction follows the list):

Addition
Subtraction
Multiplication
Division
Sine
Cosine
Tangent
Arc Sine
Arc Cosine
Arc Tangent
Square Root
Cube Root
Square
Cube
Log (Base 10)
Natural Log
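
For anyone wondering what that hex-to-decimal conversion involves, here is a minimal C sketch. It assumes the AN575 32-bit layout as I read the app note (biased exponent in the first byte, then the sign bit, then a 23-bit mantissa with an implied leading 1); the function name and test values are mine, so treat it as illustrative rather than as Microchip's routine.

Code:
#include <stdio.h>
#include <math.h>

/* Decode a 32-bit Microchip/AN575 float (assumed layout: exponent byte,
   then sign bit, then 23-bit mantissa with an implied leading 1). */
double mchp_to_double(unsigned long w)
{
    int exp           = (int)((w >> 24) & 0xFF);   /* biased exponent (bias 127) */
    int sign          = (int)((w >> 23) & 0x01);   /* sign bit                   */
    unsigned long man = w & 0x7FFFFFUL;            /* 23-bit mantissa            */

    if (exp == 0)
        return 0.0;                                /* exponent 0 means zero      */

    double frac = 1.0 + (double)man / 8388608.0;   /* 1.m, where 8388608 = 2^23  */
    return ldexp(sign ? -frac : frac, exp - 127);  /* apply 2^(exp - 127)        */
}

int main(void)
{
    printf("%f\n", mchp_to_double(0x7F000000UL));  /* should print 1.000000  */
    printf("%f\n", mchp_to_double(0x80800000UL));  /* should print -2.000000 */
    return 0;
}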

I may make some other modifications to it later, but for now I think I've got all the bugs out of it.

Hopefully this will help some others get over their fear of floating point. AN575 is fairly understandable, and I think the spreadsheet may help bring it full circle.

IMO, floating point gets too bad a rap. Sure, it's slower, but who gives a crap? How many people are doing something so time-sensitive that they need to give up a lot of accuracy for the sake of saving 25 millionths of a second on a division routine?

Only enter info where the text is red. The math operation is selected by the drop-down box in that cell. Depending on your version, you may need to put a ' (single quote) before the value if the hex value you enter is completely numeric.

EDIT: I did this mostly in OpenOffice, so it may require some formatting adjustments if used in Excel (i.e. column width, text size, etc.).
 

Attachments

  • FloatPointMicrochipDMR.zip (11 KB)
 
Floating point itself is overrated; the math errors that it can introduce can be bad. Your first question when working with floating point math should be "Do I need to use floating point math?" Aside from rather esoteric mathematical and simulation areas, for the vast majority of practical uses floating point math is not required; you can do it faster, better, and with less error using integer math.
 
I agree. Scale up your data, do integer math, and scale it down again as needed, giving you exact control of rounding issues etc.

To my mind, the whole idea of a decimal point in binary is dumb.

However I do respect that you have done some work to make AN575 more accessible for everyone who wants to use floating point. :)
 
Even the best floating point routines you can find are gonna be incredibly slow compared to a well-worked-out integer solution. The only reason floating point numbers are used in many other cases is because they have dedicated floating point hardware, or it's required for the application.
 
I agree. Scale up your data, do integer math, and scale it down again as needed, giving you exact control of rounding issues etc.

To my mind, the whole idea of a decimal point in binary is dumb.

I think that those two statements are, in some ways, direct contradictions of each other. If you have a value where the resolution is smaller than one unit, and you represent it in binary, you are implying a decimal point. (I suppose it should be called a binary point, but let's not go there.)

Most temperature sensors have a resolution smaller than 1 ºC, and the binary digits represent, for example:-
-128 ºC
64 ºC
...
2 ºC
1 ºC
0.5 ºC
0.25 ºC
0.125 ºC

so 25.625 ºC is represented by 11001.101 ºC

You could say that your number is 0xCD and the units are each 1/8 ºC, but the operations that are done are exactly the same.
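
To underline that last point: the math on the 1/8 ºC representation really is just integer math. A quick sketch (the second reading is my own made-up value, just for illustration):

Code:
#include <stdio.h>

int main(void)
{
    /* both readings held in 1/8-degree units, as above */
    unsigned char a = 0xCD;      /* 205 eighths = 25.625 C */
    unsigned char b = 0x0C;      /*  12 eighths =  1.5   C */

    unsigned char sum = a + b;   /* one plain integer add; result is still in eighths */
    printf("%d.%03d C\n", sum >> 3, (sum & 7) * 125);  /* prints 27.125 C */
    return 0;
}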
 
The operation is the same; the code to execute the same operation is COMPLETELY different. In the case you're talking about with temperature you wouldn't even use floating point math; you're using fixed point math, and that's different from both floating point and integer math... On a CPU or microcontroller without a floating point unit, integer math is always faster. Fixed point math is comparable to integer math, but the routines are still slower.
 
Have you got an example of how you would use fixed point maths, and an example of integer maths, to do the same operation?
 
Lookup table - Wikipedia, the free encyclopedia

They figured out right at the start that raw calculation wasn't the answer.
Ever seen a times table in grade school? Number one in the left column, number two in the top row, result at the intersection.

If you can memorize a large enough number table, you can make calculations that take no longer than proper addressing; this is the fundamental method by which some floating point modules work. Do you remember the big Pentium bug years ago? They accidentally sent a bad die off to production, and floating point calculations would round off wrong on certain instructions. You are limited by flash or EEPROM address space, but it's all application specific. Lookup tables are as fast as you get; actually working the binary math out takes a massive number of cycles, or precise hardware designed for it.
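
As a toy example of the idea (the values are mine, computed offline, not from any Microchip library): a quarter-wave sine table that turns sin() into an array index plus a couple of compares.

Code:
#include <stdio.h>

/* Quarter-wave sine table: entry i holds sin(i * 6 degrees) scaled to 0..255,
   precomputed offline so no trig ever runs on the micro. */
static const unsigned char sine_q[16] = {
      0,  27,  53,  79, 104, 128, 150, 171,
    190, 206, 221, 233, 243, 249, 254, 255
};

/* Sine of an angle given in 6-degree steps (0..59 covers a full circle). */
int fast_sin(unsigned char a)
{
    unsigned char i = a % 30;                   /* fold into one half wave    */
    if (i > 15) i = 30 - i;                     /* mirror the second quarter  */
    return (a < 30) ? sine_q[i] : -sine_q[i];   /* negate the second half     */
}

int main(void)
{
    printf("%d %d %d\n", fast_sin(0), fast_sin(15), fast_sin(45)); /* 0 255 -255 */
    return 0;
}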


Floating-point unit - Wikipedia, the free encyclopedia
 
Diver300-
I think that those two statements are, in some ways, direct contradictions of each other. If you have a value where the resolution is smaller than one unit, and you represent it in binary, you are implying a decimal point. (I suppose it should be called a binary point, but let's not go there.)

I get your point, but (as Sceadwian said) it's all in the implementation.

Rather than use a floating point calc, I would prefer to use (and do use) something like the following:
Code:
unsigned long x;
x = ((unsigned long)in << 16); // x = in * 65536: scale up to a large integer (cast first so the shift can't overflow)
// ... math done on x here as an unsigned long integer ...
out = ((x + 32768) >> 16);     // scale back to integer; +32768 forces round-to-nearest

By scaling up the values you get absolute control of the resolution. By using <<16 for scaling you get extremely fast math; sometimes the scaling can even be just a reassignment of one variable to another.

And you get total control of the rounding when you turn it back into an integer. In this case +32768 will round equally up or down, but you can round up, or down, at will, or even in some proportion.
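
For example, here's the kind of "math done on x" step I mean (the 0.4 gain and the input value are hypothetical, just to show the pattern):

Code:
#include <stdio.h>

int main(void)
{
    unsigned int in = 1234;                        /* raw reading                  */
    /* a gain of 0.4 expressed as a 16.16 constant: 0.4 * 65536 = 26214 */
    unsigned long x = (unsigned long)in * 26214UL; /* in * 0.4, scaled up by 65536 */
    unsigned int out = (unsigned int)((x + 32768UL) >> 16); /* round to nearest    */

    printf("%u\n", out);                           /* 1234 * 0.4 = 493.6 -> 494    */
    return 0;
}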

Most temperature sensors have a resolution smaller than 1 ºC, and the binary digits represent, for example:-
-128 ºC
64 ºC
...
2 ºC
1 ºC
0.5 ºC
0.25 ºC
0.125 ºC

so 25.625 ºC is represented by 11001.101 ºC

You could say that your number is 0xCD and the units are each 1/8 ºC, but the operations that are done are exactly the same.

To me that's a perfect example of where NOT to use binary float math. I would keep the temperature variable at its native resolution (1/8 ºC) at all times, then write a fast output routine to display it as degrees and decimal places. The alternative, handling all the temperature data as type float and then converting the float to decimal for display, is horrendous.
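
One possible shape for that output routine (a sketch, handling the sign by hand so negative temperatures print correctly; no float anywhere):

Code:
#include <stdio.h>

/* Print a signed temperature held in 1/8-degree units as degrees and thousandths. */
void print_temp8(int t)
{
    if (t < 0) { putchar('-'); t = -t; }
    printf("%d.%03d C\n", t >> 3, (t & 7) * 125);   /* eighths -> thousandths */
}

int main(void)
{
    print_temp8(205);   /* prints 25.625 C */
    print_temp8(-5);    /* prints -0.625 C */
    return 0;
}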
 
I'm no computer scientist, but I think that floating point is quite useful when you have very different sizes of numbers. Consider the entire reason I started trying to learn floating point: to convert Lat/Long to UTM and then to MGRS. In this conversion there are some constants that can be stored, but a lot of math has to be performed. For instance, to calculate the meridional arc you have to use these constants:

A0: 6367449.145
B0: 16038.43
C0: 16.83261
D0: .021984
E0: .000313

These constants can all be represented with 4 bytes each in floating point format, while keeping the precision shown above. Consider if I were to scale E0 to A0. It starts to get ridiculous, and I think one would spend so much time converting from one scale to the other as each intermediate calculation is passed to the next that you may as well have just used that 400-cycle floating point routine in the first place.

Besides, as I alluded to earlier -- how often do you see a hobbyist post this:
"Hey, I'm running my uC at 8 MIPS and I am trying to do V, W, X, Y, and Z. Despite my use of interrupt control for some of the processes, I can't seem to get the uC to run to takt with my 8 million instructions per second speed. Please help me shave a few µs from each process so I can get the uC to do everything it needs to do."

Unless the guy is trying to do a bunch of graphics and whatnot, I don't think it's ever an issue. Even if it's a product being developed, does the end user ever "feel" or "see" a difference of 1 or 2 milliseconds between giving the uC an input and seeing his desired output? What he does "feel" or "see" is, with, say, a GPS, the program truncating enough places to put him 50 meters from his actual position. My guess is the user would rather wait 1 or 2 more milliseconds and actually know where he is. If you've ever cheated on a military land navigation course and used a GPS instead of your issued compass to find that little ammo box they painted a number on and stuck in the middle of the woods, you'd rather have the precision. It's widely accepted that most of the time a program is wasting its time waiting anyway.

IMO, one routine taking 50 µs longer than another is a theoretician's concern; the practitioner really shouldn't care if it makes things easier and more precise.
 
For instance, to calculate the meridional arc you have to use these constants:

A0: 6367449.145
B0: 16038.43
C0: 16.83261
D0: .021984
E0: .000313

These constants can all be represented with 4 bytes each in floating point format, while keeping the precision shown above.

A 4-byte floating point number has about 7 decimal digits of precision. Thus A0, as a 4-byte floating point value, is about 6367449, with possible error in the last digit. The D0 and E0 terms generally will not contribute at all to the final result.
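
This is easy to check on a PC (assuming ordinary IEEE-754 single precision):

Code:
#include <stdio.h>

int main(void)
{
    /* floats between 2^22 and 2^23 are spaced 0.5 apart, so the .145 is lost */
    float a0 = 6367449.145f;
    printf("%.3f\n", a0);   /* prints 6367449.000 */
    return 0;
}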
 
Yep, it's a classic problem: just thinking you can use type float, throw all the numbers at it, and get an accurate result.

If you know the min and max ranges for those numbers, you get more accuracy by handling each of them in the scaled format best suited to that number, with the binary math handled accordingly.
 
wannabeInventor, this was already stated explicitly. That's what floating point numbers are useful for: a very wide range of numbers. With something like the value from a temperature sensor, such a range does not occur.

We're not talking about something taking 100 microseconds vs 50 microseconds. A binary shift is equivalent to division or multiplication by a power of 2; those are atomic instructions on most modern processors. The same routine run using floating point math routines would take hundreds of cycles; that's orders of magnitude difference in speed on the SAME hardware.

If you want to take your 20 MHz PIC and turn it into a 2 MHz PIC by not thinking about the appropriate method to do the math, go ahead, but the DRAMATIC performance increase from working things out properly is well worth any additional time.
 