Kochka's notes

Random thoughts

Geo/Distance Search From Scratch Using Rails & MySQL

With an explosive growth of the internet, data need to be localized more than ever. Many DBMS already include some GIS features but support is often incomplete. InnoDB’s MySQL engine currently supports some spacial data types but not indexes on it. Building an efficient GIS system from scratch is quite easy, let’s see how to do it using Ruby on Rails & MySQL.

The distances between two points on a sphere from their longitudes and latitudes can be calulated using the Haversine formula (half-versed-sine). Because the earth is not a perfect sphere but an oblate spheroid, this formula gives an approximated distance with a margin of error generally below 0.3% (the earth radius varies from 6356.752 km at the poles to 6378.137 km at the equator). For a greater accuracy, the Vincenty’s formula could be used.
Because we are using high-precision floating point numbers, we are able to use the simpler Spherical law of cosines formula : d = acos(sin(φ1).sin(φ2) + cos(φ1).cos(φ2).cos(Δλ)).R where φ is latitude, λ is longitude.

The first thing to do is to implement the Spherical law of cosines formula in a MySQL function to be able to use it easily into queries.
The earth radius can be chosen depending on where you want to make those calculations. Miles can also be used instead of kilometers.

1
2
3
4
5
6
7
8
9
DELIMITER //

DROP FUNCTION IF EXISTS distance//
CREATE FUNCTION distance(lat1 DOUBLE, lng1 DOUBLE, lat2 DOUBLE, lng2 DOUBLE) RETURNS DOUBLE
  LANGUAGE SQL
  DETERMINISTIC
  COMMENT 'Calculate distance in km between two points on earth'
RETURN ACOS(SIN(RADIANS(lat1)) * SIN(RADIANS(lat2)) + COS(RADIANS(lat1)) * COS(RADIANS(lat2))
       * COS(RADIANS(lng1 - lng2))) * 6371;//

So if we consider that we have a table point_of_interests including 2 fields lat (latitude) and lng (longitude), a way to find all POIs in a 50 km radius around a given point 48.852842, 2.350333 could be :

1
2
SELECT * FROM PointOfInterests
  WHERE distance(48.852842, 2.350333, lat, lng) < 50;

The big performance issue with this query is that it does a whole table scan to calculate distance between the point and all POIs. It can be ok if you have few POIs but in most cases it sucks hard.

To limit the number of records to scan, we need to find the latitude and longitude boundaries where all POIs we are looking for are included.
These boundaries are illustrated on the schema and are located at 4 differents bearings and obviously at a radius distance.

  • θ 0 : Maximum latitude
  • θ 90 : Maximum longitude
  • θ 180 : Minimum latitude
  • θ 270 : Minimum longitude

Using the Haversine formula, we are able to find geographic coordinates of a point with another point, a bearing and the distance between the two points.

Latitude φ2 = asin(sin(φ1)*cos(d/R) + cos(φ1)*sin(d/R)*cos(θ))
Longitude λ2 = λ1 + atan2(sin(θ)*sin(d/R)*cos(φ1), cos(d/R)−sin(φ1)*sin(φ2))
where φ is latitude, λ is longitude, θ is the bearing (in radians, clockwise from north), d is the distance, R is the earth’s radius (d/R is the angular distance, in radians).

To be able to add the distance search feature to any Active Record models, the search code can be included into a Concern.

app/models/concerns/localizable.rb
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
module Localizable
  extend ActiveSupport::Concern

  included do
    scope :near, -> lat, lng, radius {
      d =-> b { destination_point(lat, lng, b, radius) }
      where(["lat BETWEEN ? AND ? AND lng BETWEEN ? AND ?", d[180][:lat], d[0][:lat], d[270][:lng], d[90][:lng]])
      .where(["COALESCE(distance(?, ?, lat, lng), 0) < ?", lat, lng, radius])
    }
  end

  module ClassMethods

    # Return destination point given distance and bearing from start point
    def destination_point(lat, lng, initial_bearing, distance)
      d2r =-> x { x * Math::PI / 180 }
      r2d =-> x { x * 180 / Math::PI }
      angular_distance = distance / 6371.0
      lat1, lng1, bearing = d2r.(lat), d2r.(lng), d2r.(initial_bearing)
      lat2 = Math.asin(Math.sin(lat1) * Math.cos(angular_distance) + Math.cos(lat1) * Math.sin(angular_distance) * Math.cos(bearing))
      lng2 = lng1 + Math.atan2(Math.sin(bearing) * Math.sin(angular_distance) * Math.cos(lat1), Math.cos(angular_distance) - Math.sin(lat1) * Math.sin(lat2))
      { :lat => r2d.(lat2).round(7), :lng => r2d.(lng2).round(7) }
    end

  end
end

Just include the Localizable concern into each models you need to (off course the corresponding table must have lat & lng fields).

app/models/point_of_interest.rb
1
2
3
4
5
class PointOfInterest < ActiveRecord::Base
  include Localizable


end

You’re done !

1
2
3
PointOfInterest.near(48.852842, 2.350333, 50).count
(23.4ms)  SELECT COUNT(*) FROM `point_of_interests` WHERE (lat BETWEEN 48.4031812 AND 49.3025028 AND lng BETWEEN 1.6669714 AND 3.0336946) AND (COALESCE(distance(48.852842, 2.350333, lat, lng), 0) < 50)
=> 969

Comments