Welcome to our site!

Electro Tech is an online community (with over 170,000 members) who enjoy talking about and building electronic circuits, projects and gadgets. To participate you need to register. Registration is free. Click here to register now.

Float Vs Double - C Programming

Status
Not open for further replies.

electroRF

Member
Hi,

I'm trying to understand the difference between float and double, and when one would prefer one type over the other.

I'm reading MSDN and Wiki pages on that - but they seem very confusing and differ from each other.

I don't understand why, with float's range of
1.2E-38 to 3.4E+38

you get only 6-digit precision?


Could you please point out the differences and the usage of each type?

Thank you.
 
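"Float" is really the general word to use: a floating point number can be either single or double precision. In C, float is the single precision type and double is the double precision type.
However! We can have 24-bit, 32-bit and 64-bit floats (floating point numbers).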

The exponent (8 bits) says how many places the binary point is shifted.
So we have the sign bit, the exponent bits and the significand (mantissa) bits.

The exponent is indeed 8 bits, so see the sizes available:
2^127 = 1.7014118346046923173168730371588e+38
2^-128 = 2.9387358770557187699218413430556e-39

You'll agree are very lager numbers... You must realise that a floating point number is a represented number.
Not all numbers can be represented.

The precision is the tricky part... Play with this converter and you will soon see how they work. https://www.h-schmidt.net/FloatConverter/
 
You also have to check your compiler's documentation. For example in avr-gcc float and double are the same (32 bit). This is non-standard, but not rare for microcontroller compilers. It is up to the compiler to do all the floating point math and conversions when the processor does not have a floating point unit. Compilers are known to have bugs in their floating point routines. If you need absolute precision, use decimal math.
www.exploringbinary.com/the-four-stages-of-floating-point-competence/
 
Also, Precision and range are two different things. If you play with the converter Ian linked, you'll see that the number PI is
3.1415927410125732
You see that there are plenty of decimals, but only the first 7 digits are correct.

If you convert PI*10000, you'll get:
31415.92578125
A much larger number, and the precision is again 7 digits; after that the digits are wrong. The wrong digits are also not the same as in the previous example.

If you convert the number 0.1 the result is:
0.10000000149011612
That is accurate to 8 digits.
 
I saved this site from the last time you posted it; and read it often, plenty of good information.

Oh, now I remember that you have posted several questions about floating points and binary numbers. Surprising that you ask this question now.. after posting all those previous questions that are more advanced topics. Maybe you have studied floating points too hard :) Sometimes a break helps to understand all the information better. I think you are just confused about the difference of range and precision.

EDIT: Did not notice this was from killivolt.. my post was meant mainly for ElectroRF :)
 
I said:
You'll agree are very lager numbers

Gee, I've got beer on the brain... I meant large....
I agree with KV.. floating point on micros is insane... Fixed-point math is tons better...
 
Hi Guys,
Thanks a lot!

I thoroughly read your answers and the spec of the Float (single and Double) using the great links you provided!
 
Regarding the Term of Precision

Does 7-digit precision mean that the smallest number you can get has 6 contiguous zeros after the decimal point?

i.e. the float mantissa is 23-bit, therefore the smallest number is 2^-23 = 0.00000011920928955078125 ---> 6 zeros after the decimal point.

As in the above post, I thank you very much!
 
eRF said:
What I still don't manage to understand is: how come float has only 7-digit precision?

Not always... It very much depends on the answer..

Recurring numbers are the worst, e.g. 66.66666, as more precision bits are needed to represent them. As Mr T said, if you look at Pi it would need quadruple precision (128 bit) to get somewhere near...
 
I think you are getting confused somewhere!! Why are you getting hung up on this 7-digit precision thing?

This is solely dependent on the size of the float... If you have a 23-bit mantissa then your precision will be limited as I described... Using a double or quadruple precision you will get far more precision...

Floating point precision is governed by weighted fractions... 0.5 is easy to represent... 0.000000369 is far more difficult..
 
I can give you an electronic analogy. Look at a 22uF capacitor. It may have 10% precision (tolerance), which means only one digit is correct. Even though the capacitor is marked 22uF, the real capacitance might be 20uF or 25uF. But if you look at a 1% capacitor, it'll be more precise. On the other hand, there's a wide range of capacitor values, which has nothing to do with the precision. Regardless of precision, the range of capacitor values may be very big, from 15pF to 100F. Precision and range are not related.

It's the same thing with float numbers. If they have 7-digit precision, that's the same as a capacitor having 0.00001% precision. If they range from 1.2E-38 to 3.4E+38, it's like capacitors being in the range of 0.000000000000000000000000012 pF to 340000000000000000000000000000000000000 F, but each of these capacitors has 0.00001% precision.

If you look at the markings on small ceramic capacitors, you will see that 220 means 22pF, 221 means 220pF, 222 means 2.2nF, etc. These markings are floating point numbers. The first 2 digits of the marking are the mantissa, which has 2-digit (1%) precision. The third digit is the exponent, which allows the marking to cover a wide range from 10pF to 99mF. Exactly the same concept is used in your processor, but with higher precision and wider range.
 
I liked what I saw here. Nice work guys.

I do want to add this as reading:

https://en.wikipedia.org/wiki/IEEE_floating_point

Float and double are mostly artifacts of how we got to where we are. Float is a type, and at one time there were only float and double. The IEEE standard uses binary16, binary32, binary64 and binary128.
Float would be single precision or binary32, and double would be double precision or binary64.

The key, as was pointed out, is that not all decimal numbers can be converted exactly to binary. We also don't usually write binary fractions like 101.11b, although it's certainly possible.

Significant figures, accuracy and precision are concepts that were drilled into my head in physics classes.

Take a simple kitchen measuring cup filled with a 1/4 cup of water.

The smallest graduation is the precision. A 1/4 cup is 2 oz, so that is what I'm measuring. On the cup I'm looking at, the finest graduation is 1 oz, so that is the precision to which I can read it: I have 2 oz +- 1 oz.

In any event 0.25 and 0.250 and 0.2500 have progressively more precision.

Accuracy is more of how well it measures up to the standard volume.

In many cases we care more about repeatability.
 
The decimal, double and float variable types differ in the way they store values. Precision is the main difference: float is a single precision (32 bit) floating point data type, double is a double precision (64 bit) floating point data type, and decimal is a 128-bit floating point data type. Note that this decimal is a .NET (C#) type; it is not one of the basic types of standard C.

Float - 32 bit (7 digits)

Double - 64 bit (15-16 digits)

Decimal - 128 bit (28-29 significant digits)

More about...the difference between Decimal, Float and Double
 