[postgis-users] Merging census block data in postgresql

Tsering W. Shawa shawatw at princeton.edu
Tue May 12 06:39:02 PDT 2020


Hi Paul,

Thanks for your email. I have a background in a GIS but not in a relational database. I am trying to understand conceptually how process should work. This is what I was thinking.....after I import all the individual state census block layer or table on the Postgresql database through the PostGIS Shapefile Import/Export Manager then I have to merge or combine them into one. After I merged all the census block boundaries and imported geocoded location data that is in a shapefile format onto the Postgresql database then I could run the spatial join SQL statement something like below example.



SELECT *

FROM census_blocks AS c

JOIN address AS a

ON ST_Contains (c.geom, a.geom);


I, therefore, trying to understand what options are there to do a similar function in the Postgresql - PostGIS world.

Any help and suggestions will be appreciated.

Thanks,
-Tsering


________________________________
From: postgis-users <postgis-users-bounces at lists.osgeo.org> on behalf of Paul Ramsey <pramsey at cleverelephant.ca>
Sent: Monday, May 11, 2020 4:54 PM
To: PostGIS Users Discussion <postgis-users at lists.osgeo.org>
Subject: Re: [postgis-users] Merging census block data in postgresql

Check for a GIST index on the geometry column of the address points in particular, but also the blocks.
If missing,

CREATE INDEX blocks_geom_x ON blocks USING GIST(geom);
CREATE INDEX addresses_geom_x ON addresses USING GIST(geom);

What SQL you run will depend to some extent on what you are trying to accomplish with the query. I'll assume you just want to make an address/block lookup table.

CREATE table addresses_blocks (
  address_id bigint,
  block_id bigint
);

INSERT INTO addresses_blocks (address_id, block_id)
  SELECT a.address_id, b.block_id
  FROM addresses a
  JOIN blocks b ON ST_Intersects(b.geom, a.geom);

Then just wait. Depending on your version of PgSQL and PostGIS this may or may not result in a parallel plan. Use EXPLAIN on the query before running it for real to see if you're getting multiple workers.

If you aren't getting multiple workers, you can get parallel behaviour by establishing multiple connections to the database and running independent queries in each that work on separate pieces of the problem.

INSERT INTO addresses_blocks (address_id, block_id)
  SELECT a.address_id, b.block_id
  FROM addresses a
  JOIN blocks b ON ST_Intersects(b.geom, a.geom);
  WHERE b.state = 'NY';

INSERT INTO addresses_blocks (address_id, block_id)
  SELECT a.address_id, b.block_id
  FROM addresses a
  JOIN blocks b ON ST_Intersects(b.geom, a.geom);
  WHERE b.state = 'CA';

Etc...
Depending on your platform, you might find that GNU parallel is a useful tool for spawning a set of workers that chew through a task file and run all the jobs you want in an optimally parallel way (ie, 4 cpus => 4 workers).

ATB,

P

> On May 11, 2020, at 1:27 PM, Tsering W. Shawa <shawatw at princeton.edu> wrote:
>
>
> I have to run a spatial join between millions of addresses and the census block boundaries covering whole United States. I can do this function in ArcGIS Pro by merging all the census blocks of each state to one File geodatabase and then run a spatial join tool. Many of you might know that the census block boundaries for the whole United States is over 11 GB and therefore too large for a shapefile.
>
> I am learning PostGIS and wanted to see how fast spatial join function works in PostGIS compare to ArcGIS Pro or QGIS. I have already imported all the census block boundaries in postgresql and plan to import geocoded addresses also in the postgresql database.
>
> My question is...what sql function should I use to merge them together? Do I need to create a new database or table that has all the census block boundaries merged together? Any spatial index I need to use to run it faster?
>
> Any workflow or suggestion or example of sql will be appreciated.
>
> Many thanks in advance.
>
> -Tsering
> _______________________________________________
> postgis-users mailing list
> postgis-users at lists.osgeo.org
> https://lists.osgeo.org/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users at lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/postgis-users
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20200512/285282f8/attachment.html>


More information about the postgis-users mailing list