Обсуждение: my float function returns NaN

Поиск
Список
Период
Сортировка

my float function returns NaN

От
Frank Bax
Дата:
I have a function to calculate distance between two points.  When asking
for distance between two points that are actually the same point; my
application (which rounds to only very few decimal places), might return
zero or NaN depending on decimal accuracy of point.

I am able to reproduce the problem with the following SQL statements.

CREATE OR REPLACE FUNCTION geo_deg2rad(float) RETURNS float AS $$ SELECT
($1 * pi()::float / 180::float) $$ LANGUAGE SQL IMMUTABLE

CREATE OR REPLACE FUNCTION geo_dist(point,point) RETURNS float AS $$
SELECT 6371.0 * acos( sin(geo_deg2rad($1[0]))*sin(geo_deg2rad($2[0])) +
cos(geo_deg2rad($1[0]))*cos(geo_deg2rad($2[0])) *
cos(geo_deg2rad($2[1])-geo_deg2rad($1[1])) ) $$ LANGUAGE SQL IMMUTABLE

create table geo (p point);
insert into geo values (point(44.85051666667,-79.5405));
insert into geo values (point(44.850516667,-79.5405));

select p,geo_dist(p,p) from geo;
              p             |       geo_dist
---------------------------+----------------------
  (44.85051666667,-79.5405) |                  NaN
  (44.850516667,-79.5405)   | 0.000134258785931453

Why does the point with more decimal accuracy result in NaN?

If its not easy to "fix" my function; is the a function that can convert
NaN to zero (like coalesce(p,0) would convert null to zero)?


PostgreSQL 9.0.4 on x86_64-unknown-openbsd5.0, compiled by GCC cc (GCC)
4.2.1 20070719 , 64-bit

Re: my float function returns NaN

От
Nathaniel Trellice
Дата:
Dear Frank,

My guess would be that due to rounding errors, the argument passed to acos() is outwith the interval [-1,1] and acos() is only defined within this interval (for reals). I.e the expression:

 sin(theta)^2 + cos(theta)^2*cos(0)

is coming out ever so slightly greater than 1, when it should, of course, be exactly 1.

You could simply insert a check prior to calling acos() to ensure it's argument is within the acceptable range. Eg, in pseudo code:

x = sin(lat1)*sin(lat2)+cos(lat1)*cos(lat2)*cos(dlon)
if x > 1 then x=1 else if x< -1 then x = -1
acos(x)

But I'd strongly suggest you investigate using the haversine formulation for spherical distances as it is much better for small distances. Your version is mathematically sound, but computationally prone to rounding errors.


You may still need to check the input to asin() is within the interval [-1,1].

Nathaniel


On 7 Apr 2012, at 03:29, Frank Bax <fbax@sympatico.ca> wrote:

I have a function to calculate distance between two points.  When asking for distance between two points that are actually the same point; my application (which rounds to only very few decimal places), might return zero or NaN depending on decimal accuracy of point.

I am able to reproduce the problem with the following SQL statements.

CREATE OR REPLACE FUNCTION geo_deg2rad(float) RETURNS float AS $$ SELECT ($1 * pi()::float / 180::float) $$ LANGUAGE SQL IMMUTABLE

CREATE OR REPLACE FUNCTION geo_dist(point,point) RETURNS float AS $$ SELECT 6371.0 * acos( sin(geo_deg2rad($1[0]))*sin(geo_deg2rad($2[0])) + cos(geo_deg2rad($1[0]))*cos(geo_deg2rad($2[0])) * cos(geo_deg2rad($2[1])-geo_deg2rad($1[1])) ) $$ LANGUAGE SQL IMMUTABLE

create table geo (p point);
insert into geo values (point(44.85051666667,-79.5405));
insert into geo values (point(44.850516667,-79.5405));

select p,geo_dist(p,p) from geo;
            p             |       geo_dist
---------------------------+----------------------
(44.85051666667,-79.5405) |                  NaN
(44.850516667,-79.5405)   | 0.000134258785931453

Why does the point with more decimal accuracy result in NaN?

If its not easy to "fix" my function; is the a function that can convert NaN to zero (like coalesce(p,0) would convert null to zero)?


PostgreSQL 9.0.4 on x86_64-unknown-openbsd5.0, compiled by GCC cc (GCC) 4.2.1 20070719 , 64-bit

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

Re: my float function returns NaN

От
Tom Lane
Дата:
Nathaniel Trellice <naptrel@yahoo.co.uk> writes:
> My guess would be that due to rounding errors, the argument passed to
> acos() is outwith the interval [-1,1] and acos() is only defined
> within this interval (for reals).

Yeah.  It's not helping any that you've defined geo_deg2rad() to do its
arithmetic in float (that is float4) precision.  Making it work in
float8 would probably improve matters.  Still, I agree that switching to
a more numerically stable distance calculation would be a good idea.

            regards, tom lane

Re: my float function returns NaN

От
Frank Bax
Дата:
On 04/07/12 10:54, Tom Lane wrote:
> Nathaniel Trellice<naptrel@yahoo.co.uk>  writes:
>> My guess would be that due to rounding errors, the argument passed to
>> acos() is outwith the interval [-1,1] and acos() is only defined
>> within this interval (for reals).
>
> Yeah.  It's not helping any that you've defined geo_deg2rad() to do its
> arithmetic in float (that is float4) precision.  Making it work in
> float8 would probably improve matters.  Still, I agree that switching to
> a more numerically stable distance calculation would be a good idea.
>


Thanks Nathaniel & Tom:

Using float8 rather than float4 for deg2rad() made no difference to
results; I'll continue to use float8 anyway.

I've read about haversine being more accurate and more expensive
(cycles), but I haven't been able to find anything that says "how" much
the accuracy is affected.  My queries are typically between points less
than 100km apart and rounding error of 1km is acceptable.  When points
are farther apart; larger error is acceptable.  The extra decimal digits
provided in my example (which seems to be causing the problems with NaN)
just happen to be what's provided to me in database.  The NaN result is
causing problems when I ORDER BY distance since NaN is treated more like
infinity rather than zero.

Checking bounds of ACOS() does resolve the NaN issue; however, now I
have a performance question.  If I code the function using "LANGUAGE
SQL" is the optimizer smart enough to know that distA (see below) should
only be calculated once?  It might be easier to read my function if I
present the PHP code used to create it...

$R = "6371.0";
$lat1="geo_deg2rad($1[0])"; $lat2="geo_deg2rad($2[0])";
$lng1="geo_deg2rad($1[1])"; $lng2="geo_deg2rad($2[1])";
$distA="sin($lat1)*sin($lat2) + cos($lat1)*cos($lat2) * cos($lng2-$lng1)";

$distB="CASE WHEN $distA > 1 THEN 1 WHEN $distA < -1 THEN -1 ELSE $distA
END";

$dist_2="$R * acos( $distB )";

Otherwise I will need to rewrite in something like plpgsql (which I have
not done before now).