Skip to main content

Searching in a radius with Postgres

Postgres has two very useful extensions - earthdistance and postgis.  PostGIS is much more accurate but I found earthdistance to be very easy to use and accurate enough for my purpose (finding UK postcodes within a radius of a point).

To install it first find your Postgres version and then install the appropriate package.  On my Debian Mint dev box it looks like the below snippet. My production machine is an Amazon RDS and you can skip this step in that environment.

 psql -V  
 sudo apt-get install postgresql-contrib postgresql-contrib-9.3  
 sudo service postgresql restart  

Having done that you should launch psql and run these two commands.  Make sure that you install cube first because it is a requirement of earthdistance.

 CREATE EXTENSION cube; CREATE EXTENSION earthdistance;  

Now that the extensions are installed you have access to all of the functions they provide.

If you want to check that they're working you can run SELECT earth(); as a quick way to test a function. It should return the earth's radius.

Earthdistance treats earth as a perfect sphere but Postgis is more accurate. Since I'm working in a fairly local region (less than 20 statute miles) I felt that approximating the surface as a sphere was sufficiently accurate.

I had a table of postcodes and an example query looked like this:

 SELECT * FROM postcodes  
 WHERE earth_box(ll_to_earth(-0.20728725565285000, 51.48782225571700000), 100) @> ll_to_earth(lat, lng)  

This took quite a long time (*cough* a minute *cough*) to run through the 1.7 million postcodes in my table. After a little searching I realized that I should index the calculation with the result that queries now take about 100ms, which is plenty fast enough on my dev box.

 For reference - using the Haversine formula in a query was taking around 30 seconds to process - even after constraining my query to a box before calculating radii.

 CREATE INDEX postcode_gis_index on postcodes USING gist(ll_to_earth(lat, lng));  

So now I can search for UK postcodes that are close to a point and I can do so sufficiently quickly that my user won't get bored of watching a loading spinner gif.

Comments

  1. Hey

    I am just wandering what is this 100 figure in function used in query.

    WHERE earth_box(ll_to_earth(-0.20728725565285000, 51.48782225571700000), 100) @> ll_to_earth(lat, lng)

    Is this in M, KM, Miles?

    ReplyDelete
  2. Hi,

    The syntax for earth_box looks like this: ... WHERE earth_box(ll_to_earth({longitude_float_value}, {latitude_float_value}), {radius_in_meters}) @> ll_to_earth(some_table.latitude, some_table.longitude)

    So the 100 is meters.

    I'd recommend trying both PostGIS and earthdistance by the way - I found that PostGIS was faster in some cases but earthdistance is good enough for most things.

    ReplyDelete

Post a Comment

Popular posts from this blog

Separating business logic from persistence layer in Laravel

There are several reasons to separate business logic from your persistence layer.  Perhaps the biggest advantage is that the parts of your application which are unique are not coupled to how data are persisted.  This makes the code easier to port and maintain. I'm going to use Doctrine to replace the Eloquent ORM in Laravel.  A thorough comparison of the patterns is available  here . By using Doctrine I am also hoping to mitigate the risk of a major version upgrade on the underlying framework.  It can be expected for the ORM to change between major versions of a framework and upgrading to a new release can be quite costly. Another advantage to this approach is to limit the access that objects have to the database.  Unless a developer is aware of the business rules in place on an Eloquent model there is a chance they will mistakenly ignore them by calling the ActiveRecord save method directly. I'm not implementing the repository pattern in all its glory in this demo.  

Fixing puppet "Exiting; no certificate found and waitforcert is disabled" error

While debugging and setting up Puppet I am still running the agent and master from CLI in --no-daemonize mode.  I kept getting an error on my agent - ""Exiting; no certificate found and waitforcert is disabled". The fix was quite simple and a little embarrassing.  Firstly I forgot to run my puppet master with root privileges which meant that it was unable to write incoming certificate requests to disk.  That's the embarrassing part and after I looked at my shell prompt and noticed this issue fixing it was quite simple. Firstly I got the puppet ssl path by running the command   puppet agent --configprint ssldir Then I removed that directory so that my agent no longer had any certificates or requests. On my master side I cleaned the old certificate by running  puppet cert clean --all  (this would remove all my agent certificates but for now I have just the one so its quicker than tagging it). I started my agent up with the command  puppet agent --test   whi

Preventing Directory Traversal attacks in PHP

Directory traversal attacks occur when your program reads or writes a file where the name is based on some sort of input that can be maliciously tampered with.  When used in conjunction with log poisoning this can lead to an attacker gaining remote shell access to your server. At the most simple it could be to include a file like this: echo file_get_contents($_GET['sidebar']); The intention would be for you to be able to call your URL and send a parameter indicating which sidebar content you want to load... like this:  http://foo.bar/myfile.php?sidebar=adverts.html Which is really terrible practice and would not be done by any experienced developer. Another common place where directory traversal attacks can occur is in displaying content based on a database call. If you are reading from or writing to a file based on some input (like GET, POST, COOKIE, etc) then make sure that you remove paths .  The PHP function basename will strip out paths and make sure that y