[postgis-users] Using SRID with meters, but still not getting UNITS in meters with ST_DWithin

Paul Ramsey pramsey at cleverelephant.ca
Tue Apr 12 12:50:22 PDT 2016


This:

"while lat-long SRIDs (like 4326) say the units are "meters""

is not correct. Long-lat SRS's pretty clearly state their units are
degrees. More to the point, the units are evaluated in a spherical
context, not a planar one, so applying planar calculations to them
will yield junk answers. You want to apply planar calculations to
planar coordinates, and spherical calculations to spherical
coordinates.

This is why we have the geometry (planar) and geography (spherical) bifurcation.

In my previous email I gave you a recipe to run your problem in
geography (spherical) space. Since you've sort of indicated that your
data are USA only, I'll also give you a planar (geometry) recipe,
using a suitable planar projection for the USA.

We will use the USA National Atlas Albers projection
(https://epsg.io/2163) for this example. It won't provide meter-level
accuracy over such a wide area, but presumable for something as coarse
as a "postal codes in radius" query, that won't matter.

ALTER TABLE tpostalcoordinate ADD COLUMN geom geometry(point, 2136);
UPDATE tpostalcoordinate SET geom =
ST_Transform(ST_SetSRID(ST_MakePoint(longitude, latitude), 4326),2163);
CREATE INDEX tpostalcoordinate_geom_ix ON tpostalcoordinate USING GIST (geom);
SELECT a.*
  FROM tpostalcoordinate a
  JOIN  tpostalcoordinate  b
  ON ST_DWithin(a.geom, b.geom, 10000)
  WHERE
    a.countrycode2tcountry = 'US' AND
    b.countrycode2tcountry = 'US' AND
    b.postalcode = '94404';

Note the differences with the geography approach:

- use use the geometry type, and we populate the column with data
transformed into our working planar projection
- then we're free to just use ST_DWithin(geom, geom, radius) and
everything magically works fine.

ATB,

P


On Tue, Apr 12, 2016 at 12:40 PM, Michael Moore <michaeljmoore at gmail.com> wrote:
> Lee, Paul,
> Thanks so much for your time.
> Lee,
> My understanding is that there are about 42,000 zips in the US, So a cross
> join should result in about 1.7 Billion entries into the
> postal_zone_distance table. Maybe that's not so bad, I'll have to run the
> numbers. Where as this might be the fastest possible solution, it's probably
> over-kill for my meager needs. I understand what you mean by needing to know
> first the SRID of the Latitude and Longitude from my source table. It means
> very little without knowing the Datum from which they were derived. Thanks
> to some youtube videos I think I grasp this.
>
> You said:
>>
>>  ...while lat-long SRIDs (like 4326) say the units are "meters",
>> ST_Distance and other distance and area functions always just work in
>> Cartesian coordinates, meaning they treat long and lat as X and Y in a
>> Cartesian plan and they take parameters and return answers in degrees, which
>> isn't really meaningful.
>
>
>  I believe you and my experience seems to confirm it, however I can't seem
> to reconcile this with the documentation for ST_Distance which says ...
>
>  http://postgis.net/docs/ST_DWithin.html
>>
>> For Geometries: The distance is specified in units defined by the spatial
>> reference system of the geometries. For this function to make sense, the
>> source geometries must both be of the same coordinate projection, having the
>> same SRID.
>
>
> Is the documentation wrong??? or am I just not interpreting it correctly.
>
> If it IS wrong, for god sake I hope they fix it as I have now just wasted
> too much of everybody's time on and issue that could have been cleared up by
> better documentation.
>
> Again, thanks to everybody who replied, you are most generous.
> Regards,
> Mike
>
>
>
> On Tue, Apr 12, 2016 at 9:26 AM, Lee Hachadoorian
> <Lee.Hachadoorian+L at gmail.com> wrote:
>>
>> Michael,
>>
>> There are many ways to accomplish this, and the most efficient approach is
>> going to depend on what else you are using this data for. If literally the
>> *only* thing you are using the geometries for is to calculate distances
>> between postal zones, I would consider creating a table like:
>>
>> CREATE TABLE postal_zone_distance (
>>   source_zone text,
>>   dest_zone text,
>>   distance_m double precision,
>>   PRIMARY KEY (source_zone, dest_zone)
>> );
>>
>> and populating it *once* with distances between all postal codes in the
>> US, so that you can run future queries using an attribute select (WHERE
>> distance_m < 50000) instead of a spatial select. Rerun whenever postal zones
>> change, which shouldn't be all that often. You can populate it with
>> something like:
>>
>> INSERT INTO postal_zone_distance
>> SELECT a.postalcode, b.postalcode, ST_Distance(a.geo_position::geography,
>> b.geo_position::geography)
>> FROM tpostalcoordinate a CROSS JOIN tpostalcoordinate b;
>>
>> Will take a while to run, but future queries will be faster.
>>
>> If you intend to use this for more than just these distances, there are
>> several options, all of which are valid, but again it depends what you want
>> to use it for. The main issue between using Paul's suggestion to just define
>> columns as geography vs keeping in geometry is whether you might visualize
>> the postal zones using a GIS. If you will be using QGIS or ArcGIS to connect
>> to PostGIS to map the zones, you will need to keep in geometry. In this case
>> you can, as Lucas suggest, define a functional index on the geography cast,
>> while keeping the underlying data in geometry.
>>
>> Some other notes:
>>
>> * To get to your original question about units of measurement, while
>> lat-long SRIDs (like 4326) say the units are "meters", ST_Distance and other
>> distance and area functions always just work in Cartesian coordinates,
>> meaning they treat long and lat as X and Y in a Cartesian plan and they take
>> parameters and return answers in degrees, which isn't really meaningful.
>> There are special functions like ST_Distance_Spheroid which take lat-long
>> geometry and return distance in meters on the spheroid. And ST_Distance on
>> geography type also returns meters.
>>
>> * Be careful about your choice of SRID. The fact that you tried to
>> transform (or assign?) this to EPSG 4896 is a red flag because because one
>> shouldn't just "try" SRIDs. Your data come to you in an SRID, and you should
>> only transform it for a particular purpose. In fact you should make sure
>> that you know the datum of your original lat-long geometries. Considering
>> that the data are for the US, it is quite possible that your data are in
>> EPSG 4269 (lat-long North American Datum 1983, used for much US government
>> data including by the Census Bureau), rather than being EPSG 4326 (World
>> Geodetic System 1984, the datum used by GPS satellites).
>>
>> I highly recommend Obe & Hsu's PostGIS in Action
>> (https://www.manning.com/books/postgis-in-action-second-edition), esp. Ch 3
>> on Spatial Reference System Considerations.
>>
>> Best,
>> --Lee
>>
>> --
>> Lee Hachadoorian
>> Assistant Professor of Instruction, Geography & Urban Studies
>> Assistant Director, Professional Science Master's in GIS
>> Temple University
>>
>> On Mon, Apr 11, 2016 at 11:00 PM, Paul Ramsey <pramsey at cleverelephant.ca>
>> wrote:
>>>
>>> Go back to basics, use a geography column from the start:
>>>
>>> ALTER TABLE tpostalcoordinate ADD COLUMN geog geography(point, 4326);
>>> UPDATE tpostalcoordinate SET geog =
>>> geography(ST_SetSRID(ST_MakePoint(longitude, latitude), 4326));
>>> CREATE INDEX tpostalcoordinate_gix ON tpostalcoordinate USING GIST
>>> (geog);
>>> SELECT a.*
>>>   FROM tpostalcoordinate a
>>>   JOIN  tpostalcoordinate  b
>>>   ON ST_DWithin(a.geog, b.geog, 10000)
>>>   WHERE
>>>     a.countrycode2tcountry = 'US' AND
>>>     b.countrycode2tcountry = 'US' AND
>>>     b.postalcode = '94404';
>>>
>>> A geocentric system is one where every point is referenced relative to
>>> the center of the earth, so they all have an x,y,z. It's not something
>>> you want to use.
>>>
>>> P
>>>
>>>
>>> On Mon, Apr 11, 2016 at 12:31 PM, Michael Moore <michaeljmoore at gmail.com>
>>> wrote:
>>> >
>>> >
>>> > On Mon, Apr 11, 2016 at 11:50 AM, Paul Ramsey
>>> > <pramsey at cleverelephant.ca>
>>> > wrote:
>>> >>
>>> >> Before I can even begin to address what's going on with the
>>> >> index/query side, I have to ask about the coordinate system:
>>> >>
>>> >> You are using a geocentric system for postal code points?
>>> >>
>>> >> https://epsg.io/4896
>>> >>
>>> >> This seems very odd indeed. If you were flying a satellite or drilling
>>> >> a deep well, maybe you'd use geocentric coordinates, but simple postal
>>> >> code queries? So, perhaps there's a core error to deal with first: why
>>> >> are you in EPSG:4896?
>>> >>
>>> >> P
>>> >>
>>> >>
>>> >> On Mon, Apr 11, 2016 at 11:21 AM, Michael Moore
>>> >> <michaeljmoore at gmail.com>
>>> >> wrote:
>>> >> > I am trying to find all zip codes withing a given range of current
>>> >> > zip
>>> >> > code.
>>> >> > For example, User is at zip code 95076 and want to know all zip
>>> >> > codes
>>> >> > within
>>> >> > a 30 mile range. This will always be a short distance, nothing over
>>> >> > 50
>>> >> > miles. The source zip code is any place in the USA.  My input table
>>> >> > has
>>> >> > a
>>> >> > latitude field, a longitude field and a POINT field for each zip
>>> >> > code.
>>> >> > I want to supply in input in meters, not degrees. I do NOT want to
>>> >> > cast
>>> >> > to
>>> >> > geography because this prevents the index from being used.
>>> >> > Here is a working query that uses the index, but ST_DWithin  is not
>>> >> > understanding ".05" as meters:
>>> >> >
>>> >> >
>>> >> > lcd1_dev=>   select t2.city, t2.postalcode,
>>> >> > ST_Distance(t1.geo_position,t2.geo_position)
>>> >> > lcd1_dev->   from      tpostalcoordinate t1
>>> >> > lcd1_dev->   left join tpostalcoordinate t2 on
>>> >> > ST_DWithin(t1.geo_position,t2.geo_position , .05)
>>> >> > lcd1_dev->   where t1.postalcode = '94404'and
>>> >> > t1.countrycode2tcountry =
>>> >> > 'US'
>>> >> > and t2.countrycode2tcountry= 'US';
>>> >> >      city     | postalcode |     st_distance
>>> >> > --------------+------------+---------------------
>>> >> >  Redwood City | 94065      |  0.0273766323714193
>>> >> >  San Mateo    | 94408      | 0.00504738546179631
>>> >> >  Belmont      | 94002      |  0.0440065904155286
>>> >> >  San Mateo    | 94404      |                   0
>>> >> >  San Mateo    | 94403      |  0.0370314731005901
>>> >> >  San Mateo    | 94407      |  0.0416118372581607
>>> >> >
>>> >> > This shows that I am using SRID 4896
>>> >> > lcd1_dev=> select * from geometry_columns where f_table_name =
>>> >> > 'tpostalcoordinate';
>>> >> >  f_table_catalog | f_table_schema |   f_table_name    |
>>> >> > f_geometry_column |
>>> >> > coord_dimension | srid | type
>>> >> >
>>> >> >
>>> >> > -----------------+----------------+-------------------+-------------------+-----------------+------+-------
>>> >> >  lcd1_dev        | qsn_app        | tpostalcoordinate | geo_position
>>> >> > |
>>> >> > 2 | 4896 | POINT
>>> >> >
>>> >> > 4896 is UNIT "metre" as show here:
>>> >> >  4896 | EPSG      |      4896 |
>>> >> >
>>> >> >
>>> >> > GEOCCS["ITRF2005",DATUM["International_Terrestrial_Reference_Frame_2005",SPHEROID["GRS
>>> >> > 1980",6378137,298.257222101,AUTHORIT
>>> >> >
>>> >> >
>>> >> > Y["EPSG","7019"]],AUTHORITY["EPSG","6896"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Geocentric
>>> >> > X",OTH
>>> >> > ER],AXIS["Geocentric Y",OTHER],AXIS["Geocentric
>>> >> > Z",NORTH],AUTHORITY["EPSG","4896"]] | +proj=geocent +ellps=GRS80
>>> >> > +units=m
>>> >> > +no_de
>>> >> >
>>> >> > The following (from EXPLAIN) proves that the index is being used in
>>> >> > the
>>> >> > working query shown above:
>>> >> >    ->  Index Scan using tpostalcoordinate_pk on tpostalcoordinate t1
>>> >> > (cost=0.42..8.45 rows=1 width=32)
>>> >> >
>>> >> > What do I need to do to get this in meters without losing my index
>>> >> > access?
>>> >> >
>>> >> > TIA,
>>> >> > Mike
>>> >> >
>>> >> > _______________________________________________
>>> >> > postgis-users mailing list
>>> >> > postgis-users at lists.osgeo.org
>>> >> > http://lists.osgeo.org/mailman/listinfo/postgis-users
>>> >> _______________________________________________
>>> >> postgis-users mailing list
>>> >> postgis-users at lists.osgeo.org
>>> >> http://lists.osgeo.org/mailman/listinfo/postgis-users
>>> >
>>> >
>>> > Here is what we do to create the column on the table:
>>> >
>>> >
>>> >
>>> > 1.    SELECT AddGeometryColumn('qsn_app','tpostalcoordinate',
>>> > 'geo_position',4326,'POINT',2);
>>> >
>>> >
>>> >
>>> > 2.    UPDATE tpostalcoordinate pc set pc.geo_position =
>>> > ST_SetSRID(ST_Point(pc.longitude, pc.latitude), 4326);
>>> >
>>> >
>>> >
>>> > 3.    CREATE INDEX tpostal_geo_position_idx ON tpostalcoordinate USING
>>> > gist(geo_position);
>>> >
>>> >
>>> > I am on EPSG:4896 because I tried about 10 others that were also meters
>>> > and
>>> > that's just the one I ended up on .  There is one thing you said that
>>> > leads
>>> > me to believe there is something I don't understand: "...geocentric
>>> > system...". My understanding is that there are two datatypes to choose
>>> > from
>>> > ; geometry and geography. I have chosen geometry because 1. it should
>>> > be
>>> > faster due to simpler calculations and 2. It should be faster because
>>> > it can
>>> > use the index.  Beyond that I thought that I would then need to choose
>>> > a
>>> > proper SRID. The two criteria for choosing an SRID for me are 1. the
>>> > units
>>> > are meters and 2. it is based on a datum that is appropriate for the
>>> > USA.
>>> > Now the fact that you said: "geocentric system" makes me think that
>>> > maybe
>>> > there is something else I need to be looking at when choosing a proper
>>> > SRID.
>>> > A little more on the application: The user wants to know all of the
>>> > schools
>>> > that are within a specific distance of his current location, his zip
>>> > code.
>>> > Precision is not important. We get the user's zip code from the screen
>>> > and
>>> > we have a list of schools and their zip codes.
>>> >
>>> >
>>> > Thanks,
>>> >
>>> > Mike
>>> >
>>> >
>>> > _______________________________________________
>>> > postgis-users mailing list
>>> > postgis-users at lists.osgeo.org
>>> > http://lists.osgeo.org/mailman/listinfo/postgis-users
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at lists.osgeo.org
>>> http://lists.osgeo.org/mailman/listinfo/postgis-users
>>
>>
>>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users


More information about the postgis-users mailing list