
Float Vs Double - C Programming

Discussion in 'Microcontrollers' started by electroRF, Dec 30, 2013.

  1. electroRF

    electroRF Member

    Joined:
    Jun 23, 2012
    Messages:
    689
    Likes:
    9
    Location:
    Portugal
    Hi,

    I'm trying to understand the difference between float and double, and when one would prefer one type over the other.

    I'm reading MSDN and Wiki pages on that - but they seem very confusing and differ from each other.

    I don't understand why, with float's range of 1.2E-38 to 3.4E+38, you get only 6-digit precision.

    Could you please explain the differences and when to use each type?

    Thank you.
     
    Last edited: Dec 30, 2013
  2. Ian Rogers

    Ian Rogers Super Moderator Most Helpful Member

    Joined:
    Mar 28, 2011
    Messages:
    9,306
    Likes:
    914
    Location:
    Rochdale UK
    "Float" is the general word to use.... A floating point number can be either single or double precision (in C, float and double respectively)...
    However!!! We can have 24-bit, 32-bit and 64-bit floats (floating point numbers).

    The exponent (8 bits) says how many places the binary point is shifted..
    So we have the sign bit, the exponent bits and the significand bits.

    The exponent is indeed 8 bits, so look at the range of sizes available..
    2^127 = 1.7014118346046923173168730371588e+38
    2^-128 = 2.9387358770557187699218413430556e-39

    You'll agree are very lager numbers... You must realise that a floating point number is a represented number.
    Not every number can be represented exactly..

    The precision is the tricky part... Play with this converter and you will soon see how they work. http://www.h-schmidt.net/FloatConverter/
     
  3. misterT

    misterT Well-Known Member Most Helpful Member

    Joined:
    Apr 19, 2010
    Messages:
    2,697
    Likes:
    368
    Location:
    Finland
    You also have to check your compiler's documentation. For example, in avr-gcc float and double are the same (32 bit). This is non-standard, but not rare for microcontroller compilers. When the processor does not have a floating point unit, it is up to the compiler to do all the floating point math and conversions in software. Compilers are known to have bugs in their floating point routines. If you need absolute precision, use decimal math.
    www.exploringbinary.com/the-four-stages-of-floating-point-competence/
     
    Last edited: Dec 30, 2013
  5. misterT

    misterT Well-Known Member Most Helpful Member

    Joined:
    Apr 19, 2010
    Messages:
    2,697
    Likes:
    368
    Location:
    Finland

    Also, precision and range are two different things. If you play with the converter Ian linked, you'll see that the number pi, stored as a float, is
    3.1415927410125732
    You see that there are plenty of decimals, but only the first 7 digits are correct.

    If you convert pi*10000, you'll get:
    31415.92578125
    A much larger number, and the precision is again 7 digits; after that the digits are wrong. And the wrong digits are not the same as in the previous example.

    If you convert the number 0.1, the result is:
    0.10000000149011612
    That is accurate to 8 digits.
     
  6. killivolt

    killivolt Well-Known Member

    Joined:
    Mar 12, 2008
    Messages:
    3,213
    Likes:
    121
    Location:
    U.S.
  7. misterT

    misterT Well-Known Member Most Helpful Member

    Joined:
    Apr 19, 2010
    Messages:
    2,697
    Likes:
    368
    Location:
    Finland
    Oh, now I remember that you have posted several questions about floating points and binary numbers. Surprising that you ask this question now.. after posting all those previous questions that are more advanced topics. Maybe you have studied floating points too hard :) Sometimes a break helps to understand all the information better. I think you are just confused about the difference of range and precision.

    EDIT: Did not notice this was from killivolt.. my post was meant mainly for ElectroRF :)
     
    Last edited: Dec 30, 2013
  8. Ian Rogers

    Ian Rogers Super Moderator Most Helpful Member

    Joined:
    Mar 28, 2011
    Messages:
    9,306
    Likes:
    914
    Location:
    Rochdale UK
    Gee, I've got beer on the brain... I meant large....
    I agree with KV.. floating point on micros is very insane... Fixed-point math is tons better...
     
  9. electroRF

    electroRF Member

    Joined:
    Jun 23, 2012
    Messages:
    689
    Likes:
    9
    Location:
    Portugal
    Hi Guys,
    Thanks a lot!

    I thoroughly read your answers and the spec of the Float (single and Double) using the great links you provided!
     
  10. electroRF

    electroRF Member

    Joined:
    Jun 23, 2012
    Messages:
    689
    Likes:
    9
    Location:
    Portugal
    Regarding the term "precision":

    Does 7-digit precision mean that the smallest number you can get has 6 contiguous zeros after the decimal point?

    i.e. the float mantissa is 23 bits, therefore the smallest number is 2^-23 = 0.00000011920928955078125 ---> 6 zeros after the decimal point.

    As in the above post, I thank you very much!
     
  11. Ian Rogers

    Ian Rogers Super Moderator Most Helpful Member

    Joined:
    Mar 28, 2011
    Messages:
    9,306
    Likes:
    914
    Location:
    Rochdale UK
    Not always... It very much depends on the answer..

    Recurring numbers are the worst.. e.g. 66.66666, as more precision bits are needed to represent it.. As Mr T said, if you look at pi it would need quadruple precision (128-bit) to get somewhere near...
     
  12. electroRF

    electroRF Member

    Joined:
    Jun 23, 2012
    Messages:
    689
    Likes:
    9
    Location:
    Portugal
    Hi Ian,
    I edited my question (Post #9), if you could please see.
     
  13. Ian Rogers

    Ian Rogers Super Moderator Most Helpful Member

    Joined:
    Mar 28, 2011
    Messages:
    9,306
    Likes:
    914
    Location:
    Rochdale UK
    I think you are getting confused somewhere!! Why are you getting hung up on this 7-digit precision thing?

    This is solely dependent on the size of the float... If you have a 23-bit mantissa then your precision will be limited as I described... Using a double or quadruple precision you will get far more precision...

    Floating point precision is governed by weighted fractions... 0.5 is easy to represent... 0.000000369 is far more difficult..
     
  14. NorthGuy

    NorthGuy Well-Known Member

    Joined:
    Sep 8, 2013
    Messages:
    1,218
    Likes:
    206
    Location:
    Northern Canada
    I can give you an electronic analogy. Look at a 22uF capacitor. It may have 10% precision (tolerance), which means only one digit is correct. Even though the capacitor is marked 22uF, the real capacitance might be 20uF or 24uF. But if you look at a 1% capacitor, it'll be more precise. On the other hand, there's a wide range of capacitor values, which has nothing to do with the precision. Regardless of precision, the range of capacitor values may be very big - from 15pF to 100F. Precision and range are not related.

    The same thing with float numbers. If they have 7-digit precision, that's the same as a capacitor having 0.00001% precision. If they have a range of 1.2E-38 to 3.4E+38, it's like capacitors being in the range of 0.000000000000000000000000012 pF to 340000000000000000000000000000000000000 F, but each of these capacitors has 0.00001% precision.

    If you look at the markings on small ceramic capacitors, you will see that 220 means 22pF, 221 means 220pF, 222 means 2.2nF etc. These markings are floating point numbers. The first 2 digits of the marking are the mantissa, which has 2-digit (1%) precision. The third digit is the exponent, which lets the marking cover a wide range from 10pF to 99mF. Exactly the same concept is used in your processor, but with higher precision and wider range.
     
  15. electroRF

    electroRF Member

    Joined:
    Jun 23, 2012
    Messages:
    689
    Likes:
    9
    Location:
    Portugal
    NorthGuy, Ian, Mister T, KV - I thank you very much guys!

    I learned a lot from this thread
     
  16. KeepItSimpleStupid

    KeepItSimpleStupid Well-Known Member Most Helpful Member

    Joined:
    Oct 30, 2010
    Messages:
    9,966
    Likes:
    1,099
    I liked what I saw here. Nice work guys.

    I do want to add this http://support.microsoft.com/kb/42980 as reading.

    http://en.wikipedia.org/wiki/IEEE_floating_point

    Float and double are mostly artifacts of how we got to where we are. Float is a type, and at one time there were only Float and Double. The IEEE standard uses binary16, binary32, binary64 and binary128.
    Float would be single precision or binary32 and double would be double precision or binary64.

    The key, as was pointed out, is that not all decimal numbers can be converted exactly to binary, and we don't normally write 101.11b in binary, although it's certainly possible.

    Significant figures, accuracy and precision are concepts that were drilled into my head in physics classes.

    Take a simple kitchen measuring cup filled with a 1/4 cup of water.

    The smallest graduation is the precision. So, it might be 4 oz that I'm measuring. On the one I'm looking at, the finest graduation is 1 oz, so that is the precision to which I can read it. So, I have 4 oz +- 1 oz.

    In any event 0.25 and 0.250 and 0.2500 have progressively more precision.

    Accuracy is more of how well it measures up to the standard volume.

    In many cases we care more about repeatability.
     
  17. joviermark

    joviermark New Member

    Joined:
    Jul 5, 2017
    Messages:
    1
    Likes:
    0
    The Decimal, Double, and Float variable types differ in the way that they store values. Precision is the main difference: float is a single-precision (32-bit) floating point data type, double is a double-precision (64-bit) floating point data type, and decimal (a .NET type, not a C one) is a 128-bit floating point data type.

    Float - 32 bit (7 digits)

    Double - 64 bit (15-16 digits)

    Decimal - 128 bit (28-29 significant digits)

     
