Postgres has two very useful extensions - earthdistance and postgis. PostGIS is much more accurate but I found earthdistance to be very easy to use and accurate enough for my purpose (finding UK postcodes within a radius of a point).
To install it first find your Postgres version and then install the appropriate package. On my Debian Mint dev box it looks like the below snippet. My production machine is an Amazon RDS and you can skip this step in that environment.
Having done that you should launch psql and run these two commands. Make sure that you install cube first because it is a requirement of earthdistance.
Now that the extensions are installed you have access to all of the functions they provide.
If you want to check that they're working you can run SELECT earth(); as a quick way to test a function. It should return the earth's radius.
Earthdistance treats earth as a perfect sphere but Postgis is more accurate. Since I'm working in a fairly local region (less than 20 statute miles) I felt that approximating the surface as a sphere was sufficiently accurate.
I had a table of postcodes and an example query looked like this:
This took quite a long time (*cough* a minute *cough*) to run through the 1.7 million postcodes in my table. After a little searching I realized that I should index the calculation with the result that queries now take about 100ms, which is plenty fast enough on my dev box.
For reference - using the Haversine formula in a query was taking around 30 seconds to process - even after constraining my query to a box before calculating radii.
So now I can search for UK postcodes that are close to a point and I can do so sufficiently quickly that my user won't get bored of watching a loading spinner gif.
To install it first find your Postgres version and then install the appropriate package. On my Debian Mint dev box it looks like the below snippet. My production machine is an Amazon RDS and you can skip this step in that environment.
psql -V
sudo apt-get install postgresql-contrib postgresql-contrib-9.3
sudo service postgresql restart
Having done that you should launch psql and run these two commands. Make sure that you install cube first because it is a requirement of earthdistance.
CREATE EXTENSION cube; CREATE EXTENSION earthdistance;
Now that the extensions are installed you have access to all of the functions they provide.
If you want to check that they're working you can run SELECT earth(); as a quick way to test a function. It should return the earth's radius.
Earthdistance treats earth as a perfect sphere but Postgis is more accurate. Since I'm working in a fairly local region (less than 20 statute miles) I felt that approximating the surface as a sphere was sufficiently accurate.
I had a table of postcodes and an example query looked like this:
SELECT * FROM postcodes
WHERE earth_box(ll_to_earth(-0.20728725565285000, 51.48782225571700000), 100) @> ll_to_earth(lat, lng)
This took quite a long time (*cough* a minute *cough*) to run through the 1.7 million postcodes in my table. After a little searching I realized that I should index the calculation with the result that queries now take about 100ms, which is plenty fast enough on my dev box.
For reference - using the Haversine formula in a query was taking around 30 seconds to process - even after constraining my query to a box before calculating radii.
CREATE INDEX postcode_gis_index on postcodes USING gist(ll_to_earth(lat, lng));
So now I can search for UK postcodes that are close to a point and I can do so sufficiently quickly that my user won't get bored of watching a loading spinner gif.
Hey
ReplyDeleteI am just wandering what is this 100 figure in function used in query.
WHERE earth_box(ll_to_earth(-0.20728725565285000, 51.48782225571700000), 100) @> ll_to_earth(lat, lng)
Is this in M, KM, Miles?
Hi,
ReplyDeleteThe syntax for earth_box looks like this: ... WHERE earth_box(ll_to_earth({longitude_float_value}, {latitude_float_value}), {radius_in_meters}) @> ll_to_earth(some_table.latitude, some_table.longitude)
So the 100 is meters.
I'd recommend trying both PostGIS and earthdistance by the way - I found that PostGIS was faster in some cases but earthdistance is good enough for most things.