[postgis-devel] pgsql2shp dump optimization and change

Tue Sep 21 03:11:08 PDT 2004

On Tue, Sep 21, 2004 at 08:17:27PM +1200, Jeremy Palmer wrote:
> Hi,
> 
> I was just try to dump a shape from postgis that had 30 million rows was
> wondering why the initialisation was taking so long. I think I found the
> problem: On the geometry type validation pgsql2shp a distinct clause is
> being used to aggregate geometry types in the table. In PostgreSQL a group
> by aggregation would be must faster. i.e. change
> 
> 	if ( schema )
> 	{
> 		sprintf(query, "SELECT DISTINCT geometrytype(\"%s\") "
> 			"FROM \"%s\".\"%s\" WHERE NOT geometrytype(\"%s\") "
> 			"IS NULL", geo_col_name, schema, table,
> geo_col_name);
> 	}
> 	else
> 	{
> 		sprintf(query, "SELECT DISTINCT geometrytype(\"%s\") "
> 			"FROM \"%s\" WHERE NOT geometrytype(\"%s\") IS
> NULL",
> 			geo_col_name, table, geo_col_name);
> 	}
> 
> To
> 
> 	if ( schema )
> 	{
> 		sprintf(query, "SELECT geometrytype(\"%s\") "
> 			"FROM \"%s\".\"%s\" WHERE NOT geometrytype(\"%s\") "
> 			"IS NULL GROUP BY SELECT geometrytype(\"%s\")", 
>                   geo_col_name, schema, table, geo_col_name, geo_col_name );
> 	}
> 	else
> 	{
> 		sprintf(query, "SELECT geometrytype(\"%s\") "
> 			"FROM \"%s\" WHERE NOT geometrytype(\"%s\") IS NULL
> ",
> 			"GROUP BY geometrytype(\"%s\")", 
>                   geo_col_name, table, geo_col_name, geo_col_name);
> 	}

mmm.. I've made a test on a 386,048-records table.
distinct() ran in 2 secs
group by   ran in 18 secs

> 
> This is a side topic but related: Could it be possible to implement
> pgsql2shp to dump a shape file for each geometry class in a geometry table.
> Either that or also allow pgsql2shp to take a view or query as source input.

pgsql2shp can already be run on views.

> 
> Thanks.
> 

--strk;