[postgis-users] Using SRID with meters, but still not getting UNITS in meters with ST_DWithin
Paul Ramsey
pramsey at cleverelephant.ca
Wed Apr 13 11:08:14 PDT 2016
Yep. go forth and sin no more.
P
On Tue, Apr 12, 2016 at 4:45 PM, Michael Moore <michaeljmoore at gmail.com> wrote:
> Okay, let's see if I finally get this. The reason I need to start with SRID
> 4326 is because the Coordinate system has an Axes of latitude and longitude
> and I only have latitude and longitude to start with. My starting latitude
> and longitude were based on the WSG84 Datum. looking at Coordinate system
> for https://epsg.io/4326. 4326 is also based on WSG84, so it's a match.
> Unfortunately for me, 4326 is ellipsoidal and as such
> the Unit is degrees. So in order do do my calculations in meters, I need to
> "translate" my point onto a flat map. 2163 represents such a map as as would
> be expected with a flat map, the unit is in meters.
> Also there is the little issue of the "covered area" that needs to be taken
> into consideration but I understand that.
> Thanks!
> Mike
>
> On Tue, Apr 12, 2016 at 12:50 PM, Paul Ramsey <pramsey at cleverelephant.ca>
> wrote:
>>
>> This:
>>
>> "while lat-long SRIDs (like 4326) say the units are "meters""
>>
>> is not correct. Long-lat SRS's pretty clearly state their units are
>> degrees. More to the point, the units are evaluated in a spherical
>> context, not a planar one, so applying planar calculations to them
>> will yield junk answers. You want to apply planar calculations to
>> planar coordinates, and spherical calculations to spherical
>> coordinates.
>>
>> This is why we have the geometry (planar) and geography (spherical)
>> bifurcation.
>>
>> In my previous email I gave you a recipe to run your problem in
>> geography (spherical) space. Since you've sort of indicated that your
>> data are USA only, I'll also give you a planar (geometry) recipe,
>> using a suitable planar projection for the USA.
>>
>> We will use the USA National Atlas Albers projection
>> (https://epsg.io/2163) for this example. It won't provide meter-level
>> accuracy over such a wide area, but presumable for something as coarse
>> as a "postal codes in radius" query, that won't matter.
>>
>> ALTER TABLE tpostalcoordinate ADD COLUMN geom geometry(point, 2136);
>> UPDATE tpostalcoordinate SET geom =
>> ST_Transform(ST_SetSRID(ST_MakePoint(longitude, latitude), 4326),2163);
>> CREATE INDEX tpostalcoordinate_geom_ix ON tpostalcoordinate USING GIST
>> (geom);
>> SELECT a.*
>> FROM tpostalcoordinate a
>> JOIN tpostalcoordinate b
>> ON ST_DWithin(a.geom, b.geom, 10000)
>> WHERE
>> a.countrycode2tcountry = 'US' AND
>> b.countrycode2tcountry = 'US' AND
>> b.postalcode = '94404';
>>
>> Note the differences with the geography approach:
>>
>> - use use the geometry type, and we populate the column with data
>> transformed into our working planar projection
>> - then we're free to just use ST_DWithin(geom, geom, radius) and
>> everything magically works fine.
>>
>> ATB,
>>
>> P
>>
>>
>> On Tue, Apr 12, 2016 at 12:40 PM, Michael Moore <michaeljmoore at gmail.com>
>> wrote:
>> > Lee, Paul,
>> > Thanks so much for your time.
>> > Lee,
>> > My understanding is that there are about 42,000 zips in the US, So a
>> > cross
>> > join should result in about 1.7 Billion entries into the
>> > postal_zone_distance table. Maybe that's not so bad, I'll have to run
>> > the
>> > numbers. Where as this might be the fastest possible solution, it's
>> > probably
>> > over-kill for my meager needs. I understand what you mean by needing to
>> > know
>> > first the SRID of the Latitude and Longitude from my source table. It
>> > means
>> > very little without knowing the Datum from which they were derived.
>> > Thanks
>> > to some youtube videos I think I grasp this.
>> >
>> > You said:
>> >>
>> >> ...while lat-long SRIDs (like 4326) say the units are "meters",
>> >> ST_Distance and other distance and area functions always just work in
>> >> Cartesian coordinates, meaning they treat long and lat as X and Y in a
>> >> Cartesian plan and they take parameters and return answers in degrees,
>> >> which
>> >> isn't really meaningful.
>> >
>> >
>> > I believe you and my experience seems to confirm it, however I can't
>> > seem
>> > to reconcile this with the documentation for ST_Distance which says ...
>> >
>> > http://postgis.net/docs/ST_DWithin.html
>> >>
>> >> For Geometries: The distance is specified in units defined by the
>> >> spatial
>> >> reference system of the geometries. For this function to make sense,
>> >> the
>> >> source geometries must both be of the same coordinate projection,
>> >> having the
>> >> same SRID.
>> >
>> >
>> > Is the documentation wrong??? or am I just not interpreting it
>> > correctly.
>> >
>> > If it IS wrong, for god sake I hope they fix it as I have now just
>> > wasted
>> > too much of everybody's time on and issue that could have been cleared
>> > up by
>> > better documentation.
>> >
>> > Again, thanks to everybody who replied, you are most generous.
>> > Regards,
>> > Mike
>> >
>> >
>> >
>> > On Tue, Apr 12, 2016 at 9:26 AM, Lee Hachadoorian
>> > <Lee.Hachadoorian+L at gmail.com> wrote:
>> >>
>> >> Michael,
>> >>
>> >> There are many ways to accomplish this, and the most efficient approach
>> >> is
>> >> going to depend on what else you are using this data for. If literally
>> >> the
>> >> *only* thing you are using the geometries for is to calculate distances
>> >> between postal zones, I would consider creating a table like:
>> >>
>> >> CREATE TABLE postal_zone_distance (
>> >> source_zone text,
>> >> dest_zone text,
>> >> distance_m double precision,
>> >> PRIMARY KEY (source_zone, dest_zone)
>> >> );
>> >>
>> >> and populating it *once* with distances between all postal codes in the
>> >> US, so that you can run future queries using an attribute select (WHERE
>> >> distance_m < 50000) instead of a spatial select. Rerun whenever postal
>> >> zones
>> >> change, which shouldn't be all that often. You can populate it with
>> >> something like:
>> >>
>> >> INSERT INTO postal_zone_distance
>> >> SELECT a.postalcode, b.postalcode,
>> >> ST_Distance(a.geo_position::geography,
>> >> b.geo_position::geography)
>> >> FROM tpostalcoordinate a CROSS JOIN tpostalcoordinate b;
>> >>
>> >> Will take a while to run, but future queries will be faster.
>> >>
>> >> If you intend to use this for more than just these distances, there are
>> >> several options, all of which are valid, but again it depends what you
>> >> want
>> >> to use it for. The main issue between using Paul's suggestion to just
>> >> define
>> >> columns as geography vs keeping in geometry is whether you might
>> >> visualize
>> >> the postal zones using a GIS. If you will be using QGIS or ArcGIS to
>> >> connect
>> >> to PostGIS to map the zones, you will need to keep in geometry. In this
>> >> case
>> >> you can, as Lucas suggest, define a functional index on the geography
>> >> cast,
>> >> while keeping the underlying data in geometry.
>> >>
>> >> Some other notes:
>> >>
>> >> * To get to your original question about units of measurement, while
>> >> lat-long SRIDs (like 4326) say the units are "meters", ST_Distance and
>> >> other
>> >> distance and area functions always just work in Cartesian coordinates,
>> >> meaning they treat long and lat as X and Y in a Cartesian plan and they
>> >> take
>> >> parameters and return answers in degrees, which isn't really
>> >> meaningful.
>> >> There are special functions like ST_Distance_Spheroid which take
>> >> lat-long
>> >> geometry and return distance in meters on the spheroid. And ST_Distance
>> >> on
>> >> geography type also returns meters.
>> >>
>> >> * Be careful about your choice of SRID. The fact that you tried to
>> >> transform (or assign?) this to EPSG 4896 is a red flag because because
>> >> one
>> >> shouldn't just "try" SRIDs. Your data come to you in an SRID, and you
>> >> should
>> >> only transform it for a particular purpose. In fact you should make
>> >> sure
>> >> that you know the datum of your original lat-long geometries.
>> >> Considering
>> >> that the data are for the US, it is quite possible that your data are
>> >> in
>> >> EPSG 4269 (lat-long North American Datum 1983, used for much US
>> >> government
>> >> data including by the Census Bureau), rather than being EPSG 4326
>> >> (World
>> >> Geodetic System 1984, the datum used by GPS satellites).
>> >>
>> >> I highly recommend Obe & Hsu's PostGIS in Action
>> >> (https://www.manning.com/books/postgis-in-action-second-edition), esp.
>> >> Ch 3
>> >> on Spatial Reference System Considerations.
>> >>
>> >> Best,
>> >> --Lee
>> >>
>> >> --
>> >> Lee Hachadoorian
>> >> Assistant Professor of Instruction, Geography & Urban Studies
>> >> Assistant Director, Professional Science Master's in GIS
>> >> Temple University
>> >>
>> >> On Mon, Apr 11, 2016 at 11:00 PM, Paul Ramsey
>> >> <pramsey at cleverelephant.ca>
>> >> wrote:
>> >>>
>> >>> Go back to basics, use a geography column from the start:
>> >>>
>> >>> ALTER TABLE tpostalcoordinate ADD COLUMN geog geography(point, 4326);
>> >>> UPDATE tpostalcoordinate SET geog =
>> >>> geography(ST_SetSRID(ST_MakePoint(longitude, latitude), 4326));
>> >>> CREATE INDEX tpostalcoordinate_gix ON tpostalcoordinate USING GIST
>> >>> (geog);
>> >>> SELECT a.*
>> >>> FROM tpostalcoordinate a
>> >>> JOIN tpostalcoordinate b
>> >>> ON ST_DWithin(a.geog, b.geog, 10000)
>> >>> WHERE
>> >>> a.countrycode2tcountry = 'US' AND
>> >>> b.countrycode2tcountry = 'US' AND
>> >>> b.postalcode = '94404';
>> >>>
>> >>> A geocentric system is one where every point is referenced relative to
>> >>> the center of the earth, so they all have an x,y,z. It's not something
>> >>> you want to use.
>> >>>
>> >>> P
>> >>>
>> >>>
>> >>> On Mon, Apr 11, 2016 at 12:31 PM, Michael Moore
>> >>> <michaeljmoore at gmail.com>
>> >>> wrote:
>> >>> >
>> >>> >
>> >>> > On Mon, Apr 11, 2016 at 11:50 AM, Paul Ramsey
>> >>> > <pramsey at cleverelephant.ca>
>> >>> > wrote:
>> >>> >>
>> >>> >> Before I can even begin to address what's going on with the
>> >>> >> index/query side, I have to ask about the coordinate system:
>> >>> >>
>> >>> >> You are using a geocentric system for postal code points?
>> >>> >>
>> >>> >> https://epsg.io/4896
>> >>> >>
>> >>> >> This seems very odd indeed. If you were flying a satellite or
>> >>> >> drilling
>> >>> >> a deep well, maybe you'd use geocentric coordinates, but simple
>> >>> >> postal
>> >>> >> code queries? So, perhaps there's a core error to deal with first:
>> >>> >> why
>> >>> >> are you in EPSG:4896?
>> >>> >>
>> >>> >> P
>> >>> >>
>> >>> >>
>> >>> >> On Mon, Apr 11, 2016 at 11:21 AM, Michael Moore
>> >>> >> <michaeljmoore at gmail.com>
>> >>> >> wrote:
>> >>> >> > I am trying to find all zip codes withing a given range of
>> >>> >> > current
>> >>> >> > zip
>> >>> >> > code.
>> >>> >> > For example, User is at zip code 95076 and want to know all zip
>> >>> >> > codes
>> >>> >> > within
>> >>> >> > a 30 mile range. This will always be a short distance, nothing
>> >>> >> > over
>> >>> >> > 50
>> >>> >> > miles. The source zip code is any place in the USA. My input
>> >>> >> > table
>> >>> >> > has
>> >>> >> > a
>> >>> >> > latitude field, a longitude field and a POINT field for each zip
>> >>> >> > code.
>> >>> >> > I want to supply in input in meters, not degrees. I do NOT want
>> >>> >> > to
>> >>> >> > cast
>> >>> >> > to
>> >>> >> > geography because this prevents the index from being used.
>> >>> >> > Here is a working query that uses the index, but ST_DWithin is
>> >>> >> > not
>> >>> >> > understanding ".05" as meters:
>> >>> >> >
>> >>> >> >
>> >>> >> > lcd1_dev=> select t2.city, t2.postalcode,
>> >>> >> > ST_Distance(t1.geo_position,t2.geo_position)
>> >>> >> > lcd1_dev-> from tpostalcoordinate t1
>> >>> >> > lcd1_dev-> left join tpostalcoordinate t2 on
>> >>> >> > ST_DWithin(t1.geo_position,t2.geo_position , .05)
>> >>> >> > lcd1_dev-> where t1.postalcode = '94404'and
>> >>> >> > t1.countrycode2tcountry =
>> >>> >> > 'US'
>> >>> >> > and t2.countrycode2tcountry= 'US';
>> >>> >> > city | postalcode | st_distance
>> >>> >> > --------------+------------+---------------------
>> >>> >> > Redwood City | 94065 | 0.0273766323714193
>> >>> >> > San Mateo | 94408 | 0.00504738546179631
>> >>> >> > Belmont | 94002 | 0.0440065904155286
>> >>> >> > San Mateo | 94404 | 0
>> >>> >> > San Mateo | 94403 | 0.0370314731005901
>> >>> >> > San Mateo | 94407 | 0.0416118372581607
>> >>> >> >
>> >>> >> > This shows that I am using SRID 4896
>> >>> >> > lcd1_dev=> select * from geometry_columns where f_table_name =
>> >>> >> > 'tpostalcoordinate';
>> >>> >> > f_table_catalog | f_table_schema | f_table_name |
>> >>> >> > f_geometry_column |
>> >>> >> > coord_dimension | srid | type
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > -----------------+----------------+-------------------+-------------------+-----------------+------+-------
>> >>> >> > lcd1_dev | qsn_app | tpostalcoordinate |
>> >>> >> > geo_position
>> >>> >> > |
>> >>> >> > 2 | 4896 | POINT
>> >>> >> >
>> >>> >> > 4896 is UNIT "metre" as show here:
>> >>> >> > 4896 | EPSG | 4896 |
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > GEOCCS["ITRF2005",DATUM["International_Terrestrial_Reference_Frame_2005",SPHEROID["GRS
>> >>> >> > 1980",6378137,298.257222101,AUTHORIT
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > Y["EPSG","7019"]],AUTHORITY["EPSG","6896"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Geocentric
>> >>> >> > X",OTH
>> >>> >> > ER],AXIS["Geocentric Y",OTHER],AXIS["Geocentric
>> >>> >> > Z",NORTH],AUTHORITY["EPSG","4896"]] | +proj=geocent +ellps=GRS80
>> >>> >> > +units=m
>> >>> >> > +no_de
>> >>> >> >
>> >>> >> > The following (from EXPLAIN) proves that the index is being used
>> >>> >> > in
>> >>> >> > the
>> >>> >> > working query shown above:
>> >>> >> > -> Index Scan using tpostalcoordinate_pk on tpostalcoordinate
>> >>> >> > t1
>> >>> >> > (cost=0.42..8.45 rows=1 width=32)
>> >>> >> >
>> >>> >> > What do I need to do to get this in meters without losing my
>> >>> >> > index
>> >>> >> > access?
>> >>> >> >
>> >>> >> > TIA,
>> >>> >> > Mike
>> >>> >> >
>> >>> >> > _______________________________________________
>> >>> >> > postgis-users mailing list
>> >>> >> > postgis-users at lists.osgeo.org
>> >>> >> > http://lists.osgeo.org/mailman/listinfo/postgis-users
>> >>> >> _______________________________________________
>> >>> >> postgis-users mailing list
>> >>> >> postgis-users at lists.osgeo.org
>> >>> >> http://lists.osgeo.org/mailman/listinfo/postgis-users
>> >>> >
>> >>> >
>> >>> > Here is what we do to create the column on the table:
>> >>> >
>> >>> >
>> >>> >
>> >>> > 1. SELECT AddGeometryColumn('qsn_app','tpostalcoordinate',
>> >>> > 'geo_position',4326,'POINT',2);
>> >>> >
>> >>> >
>> >>> >
>> >>> > 2. UPDATE tpostalcoordinate pc set pc.geo_position =
>> >>> > ST_SetSRID(ST_Point(pc.longitude, pc.latitude), 4326);
>> >>> >
>> >>> >
>> >>> >
>> >>> > 3. CREATE INDEX tpostal_geo_position_idx ON tpostalcoordinate
>> >>> > USING
>> >>> > gist(geo_position);
>> >>> >
>> >>> >
>> >>> > I am on EPSG:4896 because I tried about 10 others that were also
>> >>> > meters
>> >>> > and
>> >>> > that's just the one I ended up on . There is one thing you said
>> >>> > that
>> >>> > leads
>> >>> > me to believe there is something I don't understand: "...geocentric
>> >>> > system...". My understanding is that there are two datatypes to
>> >>> > choose
>> >>> > from
>> >>> > ; geometry and geography. I have chosen geometry because 1. it
>> >>> > should
>> >>> > be
>> >>> > faster due to simpler calculations and 2. It should be faster
>> >>> > because
>> >>> > it can
>> >>> > use the index. Beyond that I thought that I would then need to
>> >>> > choose
>> >>> > a
>> >>> > proper SRID. The two criteria for choosing an SRID for me are 1. the
>> >>> > units
>> >>> > are meters and 2. it is based on a datum that is appropriate for the
>> >>> > USA.
>> >>> > Now the fact that you said: "geocentric system" makes me think that
>> >>> > maybe
>> >>> > there is something else I need to be looking at when choosing a
>> >>> > proper
>> >>> > SRID.
>> >>> > A little more on the application: The user wants to know all of the
>> >>> > schools
>> >>> > that are within a specific distance of his current location, his zip
>> >>> > code.
>> >>> > Precision is not important. We get the user's zip code from the
>> >>> > screen
>> >>> > and
>> >>> > we have a list of schools and their zip codes.
>> >>> >
>> >>> >
>> >>> > Thanks,
>> >>> >
>> >>> > Mike
>> >>> >
>> >>> >
>> >>> > _______________________________________________
>> >>> > postgis-users mailing list
>> >>> > postgis-users at lists.osgeo.org
>> >>> > http://lists.osgeo.org/mailman/listinfo/postgis-users
>> >>> _______________________________________________
>> >>> postgis-users mailing list
>> >>> postgis-users at lists.osgeo.org
>> >>> http://lists.osgeo.org/mailman/listinfo/postgis-users
>> >>
>> >>
>> >>
>> >
>> >
>> > _______________________________________________
>> > postgis-users mailing list
>> > postgis-users at lists.osgeo.org
>> > http://lists.osgeo.org/mailman/listinfo/postgis-users
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/postgis-users
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> http://lists.osgeo.org/mailman/listinfo/postgis-users
More information about the postgis-users
mailing list