
Floating Point and JavaScript

From , former About.com Guide

When you start using numbers that contain decimals in your JavaScript calculations, you may notice that the results are not always exactly what you expected. This is a result of the way computers store numbers rather than a problem specific to JavaScript, and there is not really any way that JavaScript can avoid it. It is up to us, when writing our JavaScript code (or code in any other language, for that matter), to handle this issue appropriately.

The first step in handling the fact that JavaScript doesn't always produce the exact answers we expect from floating point calculations is to understand why that is the case.

The first thing you need to be aware of is that computers do not work with decimal numbers; they work with binary numbers. In binary 2 is written as 10, 3 as 11, 4 as 100, 5 as 101, 6 as 110, 7 as 111, 8 as 1000, 9 as 1001, 10 as 1010, and so on. Every number written in binary consists simply of ones and zeros, which the computer can easily keep track of using circuits that are either on (for one) or off (for zero). These binary digits (or bits) are usually grouped together in the computer in sets of eight bits (where 11111111 represents the decimal number 255). These sets of eight bits are called bytes.
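As a small illustration of the binary notation described above, the standard toString() and parseInt() functions accept a radix of 2 and can convert between decimal and binary. The variable names here are chosen only for this example.

```javascript
// Convert decimal integers to their binary notation (as strings).
var binaryOfTen = (10).toString(2);    // "1010"
var binaryOf255 = (255).toString(2);   // "11111111", one full byte

// Convert a binary string back to a decimal number.
var tenAgain = parseInt("1010", 2);    // 10

console.log(binaryOfTen, binaryOf255, tenAgain);
```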

Since we normally need to be able to process numbers outside of the range 0 through 255, the computer puts several of these bytes together so as to allow much larger numbers to be stored. With integers the biggest number that the system can handle depends on how many bytes each number is made up of. To handle negative numbers one bit is reserved for the sign, and so the biggest number is roughly half of what it would be if only positive numbers were allowed. With integers, as long as the numbers you are working with are never greater than that biggest number, you will always get the correct answer provided that the answer is an integer.
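The relationship between the number of bytes and the biggest integer can be sketched with a little arithmetic. The two helper functions below are hypothetical, written only for this illustration, and assume the common layout in which one of the bits is given over to the sign. They give exact results only for small byte counts (up to six), since JavaScript's own numbers have limited precision.

```javascript
// Hypothetical helpers for illustration only.
// Largest value when every bit is used for the magnitude.
function maxUnsigned(bytes) {
  return Math.pow(2, 8 * bytes) - 1;
}

// Largest value when one bit is reserved for the sign.
function maxSigned(bytes) {
  return Math.pow(2, 8 * bytes - 1) - 1;
}

console.log(maxUnsigned(1)); // 255
console.log(maxSigned(1));   // 127
console.log(maxUnsigned(4)); // 4294967295
console.log(maxSigned(4));   // 2147483647
```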

Floating point numbers allow us to specify fractions as well, and it is in these fractional parts that the answers start coming out different from what is expected. This is due to the difference between the decimal numbers that we are using and the binary numbers the computer is using. Where we specify numbers with tenths, hundredths, thousandths, etc., the computer sees numbers with halves, quarters, eighths, sixteenths, etc. Some decimal fractions, such as 0.5 or 0.25, have an exact binary equivalent, but many others, such as 0.1, do not: their binary expansion repeats forever, much as one third does in decimal. Since the computer is limited as to how many bits it can allocate to each number, it rounds any such fraction off to the nearest binary fraction that it can store. This rounding off is half of the problem we are looking at. The other half of the problem is that when converting back the other way the exact match may again consist of a huge number of decimal digits, and so the computer limits the number of digits it displays, which again means that the value displayed may not be exactly the value the computer has stored.
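Both halves of the problem can be seen with a few one-line experiments. This is only a sketch using standard JavaScript; the variable names are chosen for the example.

```javascript
// 0.1 and 0.2 are each rounded to the nearest binary fraction when stored,
// so their sum is not the same stored value as 0.3.
var sumOfTenths = 0.1 + 0.2;
console.log(sumOfTenths === 0.3);   // false
console.log(sumOfTenths);           // 0.30000000000000004

// Asking for more digits reveals that the stored 0.1 is not exactly one tenth.
var storedTenth = (0.1).toFixed(20);
console.log(storedTenth);           // "0.10000000000000000555"

// Halves and quarters are exact binary fractions, so no rounding occurs.
var exactSum = 0.5 + 0.25;
console.log(exactSum === 0.75);     // true
```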

In practical terms what this means is that each of the floating point numbers you are working with may differ very slightly from what you expect, typically at around the sixteenth significant digit. Depending on just how many calculations you do with these numbers, the error in the intermediate results can gradually grow, leading to a bigger discrepancy in the final answer (a huge number of calculations may lead to a slight error in the thirteenth or perhaps even the twelfth digit of the answer).
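A short loop shows how the tiny error in each stored value can accumulate over repeated calculations. This is an illustrative sketch; the tolerance of 1e-9 in the last line is an arbitrary choice for the example, not a recommended constant.

```javascript
// Add one tenth to a running total ten times.
var runningTotal = 0;
for (var i = 0; i < 10; i++) {
  runningTotal += 0.1;
}

console.log(runningTotal);                       // 0.9999999999999999
console.log(runningTotal === 1);                 // false

// Comparing with a small tolerance instead of === sidesteps the discrepancy.
console.log(Math.abs(runningTotal - 1) < 1e-9);  // true
```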

Now these errors don't sound like much because they are, after all, extremely small compared to the size of the numbers that you will usually be dealing with. The only reason that you need to worry about them at all is that such a slight error can lead to your getting an answer such as 5.999999999999999 instead of 6, where even though the error is so small as to be insignificant, it makes a big difference to the number that is displayed as the answer.
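The same effect can be reproduced with a calculation that ought to give exactly 8. The example values are chosen purely for illustration. Note how the tiny error becomes a large one as soon as the result is truncated rather than rounded.

```javascript
// Mathematically (0.1 + 0.7) * 10 is 8, but the stored result falls just short.
var nearlyEight = (0.1 + 0.7) * 10;

console.log(nearlyEight);              // 7.999999999999999
console.log(Math.floor(nearlyEight));  // 7, truncation magnifies the error
console.log(Math.round(nearlyEight));  // 8, rounding recovers the expected value
```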

There are two ways to fix this. The first solution applies where you know that all of the numbers you are going to be dealing with have a specific number of decimal places (for example, when dealing with money). If you have a currency such as dollars where there are 100 cents to a dollar (or 100 pence to a pound), then the simple solution is to do all the calculations using cents rather than dollars. This simply means multiplying all the numbers by 100 (and then rounding to the nearest integer) before you do anything else with them. The rounding corrects completely for any difference between the decimal and binary versions of the fractions, since we know that the original number had exactly two digits after the decimal point. All subsequent calculations can then be performed using integers instead of floating point numbers, and we can simply insert a decimal point before the last two digits when displaying the answer.
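The cents approach might be sketched as follows. The function names toCents and formatCents are hypothetical helpers invented for this example, and the sketch assumes amounts with exactly two decimal places.

```javascript
// Hypothetical helper: convert a dollar amount to a whole number of cents.
// Math.round removes any tiny binary error left after multiplying by 100.
function toCents(amount) {
  return Math.round(amount * 100);
}

// Hypothetical helper: display integer cents with a decimal point
// inserted before the last two digits.
function formatCents(cents) {
  var sign = cents < 0 ? "-" : "";
  var digits = String(Math.abs(cents));
  while (digits.length < 3) {
    digits = "0" + digits;   // pad so there is always a digit before the point
  }
  return sign + digits.slice(0, -2) + "." + digits.slice(-2);
}

// All arithmetic is done on integers, so the sum is exact.
var totalCents = toCents(0.10) + toCents(0.20);
console.log(totalCents);               // 30
console.log(formatCents(totalCents));  // "0.30"
```

Without the rounding step, a value such as 4.35 multiplied by 100 would come out fractionally below 435, which is why toCents rounds before anything else is done with the number.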

The second solution applies where you don't know in advance how many decimal places the numbers are supposed to contain. Here you decide just how many decimal places you need the answers you are displaying to contain. Normally you will need far fewer than the 15 or so significant digits that the computer can reliably store. If you only need three or perhaps five decimal places, then simply use the toFixed() method on your answer to round it to that many decimal places. Note that toFixed() returns a string, which is fine for display but needs converting back to a number if you intend to calculate with it further. In most cases the answer you get when rounding the computer-calculated answer to that number of decimal places will be exactly the same as you would get if you did the calculation by hand and rounded the answer to that same number of decimal places.
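A minimal sketch of rounding for display with toFixed() follows. The variable names are illustrative, and the last line shows one of the uncommon cases where the rounded result still differs from what you would work out by hand.

```javascript
// Round the result of a calculation to three decimal places for display.
var rawAnswer = 0.1 + 0.2;                // stored as 0.30000000000000004
var shownAnswer = rawAnswer.toFixed(3);   // "0.300" (a string)

// Convert back to a number if further calculations are needed.
var numericAnswer = Number(shownAnswer);  // 0.3

console.log(shownAnswer, numericAnswer);

// An uncommon exception: 1.005 is stored as slightly less than 1.005,
// so it rounds down rather than up.
var edgeCase = (1.005).toFixed(2);
console.log(edgeCase);                    // "1.00"
```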

That will leave you with very few cases where the computer still produces the wrong answer and in those cases there is no simple solution as the calculations you want to perform exceed the precision that the computer can handle.

©2012 About.com. All rights reserved.
