[postgis-users] Geographies, geometries, international date line

Maxime Poulin mpoulin at expedia.com
Thu Oct 11 08:42:22 PDT 2012


Quote from Paul:
---
Not if your table is using the geography type. Geography doesn't care about datelines or poles, it laughs in the faces of cartesian singularities.
---

Yes, I saw that. The thing is the way our polygons are created. They are created by geographers and historically they used a flat map to do so. So when they create their polygons on the flat map, they "think" they include such or such hotel within that polygon. However, if using geographic calculations, then the polygon might not include some edge-hotels.

So in other words, my problem right now is a "polygon-creation process problem", and the code has to mitigate it.

As of now, I have the following solution.
First, our DB, everything is Geography(4326): polygons, points etc.
All distance calculations are geographic.
All "within" calculations are geometric.

We only have 2 procs accepts user-defined polygons. Before invoking these procs, I simply and naively check if that polygon crosses the date line or not:

    boolean nearDateLineWest = false;
    boolean nearDateLineEast = false;
    Coordinate[] coordinates = p.getCoordinates();
    for (Coordinate coordinate: coordinates) {
      if (coordinate.x >= 170d) {
        nearDateLineWest = true;
      } else if (coordinate.x <= -170d) {
        nearDateLineEast = true;
      }
    }
    return nearDateLineWest & nearDateLineEast;

As you can see, this is naïve and well, it is a patch for the next 3 weeks or so. I used 170 as a reference because we filter out any poly that would be wider anyways.

Anyways, if the polygon crosses the date line, then I invoke the proc that uses geographic calculations. Otherwise, I use geometric.

If our stress tests show that geometric's ST_Within(geom,geom) (+ bounding box) is way faster than ST_Intersects(geog,geog), then my work around will be to check if a poly crosses the date line and if so, split the poly in two according to date line, create a multi-polygon to be sent to PostGIS (well, I am not sure it supports multi-polys yet...).

BTW, I absolutely love PG. What a great thing it is. Been working with in since last March.


-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net [mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of Paul Ramsey
Sent: Thursday, October 11, 2012 11:07 AM
To: PostGIS Users Discussion
Subject: Re: [postgis-users] Geographies, geometries, international date line

On Thu, Oct 11, 2012 at 12:12 AM, Nicolas Ribot <nicolas.ribot at gmail.com> wrote:
> Hi Maxime,
>
> Yes indeed, the new geography type take advantage of a very fast Gist 
> index,often leading to queries faster than their geometry equivalent 
> (see Paul Ramsey's blog about this new type).
> St_distance might also benefit from this new code.

The st_distance code for geography has been sped up, but only for 2.1, not in the 2.0 series.

> Concerning polygons crossind dateline, yes you will have to cut them 
> in 2 parts (trigger could be a good option for that)

Not if your table is using the geography type. Geography doesn't care about datelines or poles, it laughs in the faces of cartesian singularities.

> Finally, yes, this list is the right place to discuss about best 
> practices and DB optimization.
>
> Nicolas
>
> On 10 October 2012 22:27, Maxime Poulin <mpoulin at expedia.com> wrote:
>> Hi everyone.
>>
>> And so this is my first post to this mailing list, and since I 
>> registered, I haven't seen so many messages so I hope the level of 
>> this email is up to this list's standards.
>>
>>
>>
>> Here is the situation.
>>
>> At Expedia, we are working on upgrading our geo system and that 
>> implies migrating to PostGIS.
>>
>>
>>
>> During my tests, I did the following:
>>
>>
>>
>> I have two tables for regions, one, say region_geometry, keeps the 
>> polygon as a geometry(4326). The second one, say region_geography, 
>> keeps the polygon of a region as a Geography(4326).
>>
>>
>>
>> We also have other tables mapping hotels and points of interests, 
>> both storing their location as a geography point(4326).
>>
>>
>>
>> The test case is to find everything that is within a region. In other 
>> words, find all hotels and points of interests that are within a 
>> given region's polygon.
>>
>>
>>
>> The first test uses the region_geometry table, and the query uses 
>> ST_Within function (using geometries and the && operator).
>>
>> The second test uses the region_geography table, and the query uses 
>> ST_Intersects function (using geographies).
>>
>>
>>
>> I was expecting the geometry version to be much faster, but I 
>> actually find that performance is somewhat the same. Both queries use 
>> the special index, perform the index scan and so on.  So how can this 
>> be possible, how can I be sure that geometries are actually faster than geographies ?
>>
>>
>>
>> The main issue with these two tests is regarding polygons crossing 
>> the international date line. So suppose I have a polygon that maps 
>> the Fiji islands. That polygon will cross the international date 
>> line. The geography method works fine, but the polygon, when used in 
>> geometry world, is actually a ribbon around the planet. I absolutely 
>> understand the reason, but if I really want to stay in the geometry 
>> world, how can I work around this problem ? Do I really have to split 
>> my polygon to create 2 polygons not crossing the line ? Do I have to 
>> filter my requests, find if the polygon crosses the date line and if so, "revert" to geographic calculations ?
>>
>>
>>
>> I have somewhat the same questions regarding distance calculations. 
>> In geometry world, if I want all the hotels in a 200 meters radius of 
>> a given point, I would use something like ST_Distance_spheroid(p1, 
>> p2) < 200. In geography, I would use ST_DWithin(p1, p2, 200). My 
>> tests shows that STD_Within (geography, geography) is most of the 
>> time faster than st_distance(p1, p2) with or without spheroid. Again, 
>> in both cases, the spatial indexes are used.
>>
>>
>>
>> I have many quite precise and edgy questions regarding good practices 
>> and performance tuning for PostGIS.  I would like to provide samples 
>> and discuss around them, see what I / we can find. Is this list the right place to ask ?
>>
>>
>>
>> Maxime Poulin.
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users



More information about the postgis-users mailing list