[postgis-devel] Generation of statistics fails for 3D geometry attribute

Paul Ramsey pramsey at cleverelephant.ca
Sun Feb 3 16:12:13 PST 2019


Several questions here...

> On Jan 31, 2019, at 9:04 AM, Ditzel, Lars <lars.ditzel at here.com> wrote:
> 
> Hello All,
>  
> We use a generic service to store spatial data in a Postgres backend.
> The service defines a generic table structure with a 3D geometry column as follows:
>                 geometry(GeometryZ,4326)
> with a spatial index of
> "table" gist (geo)
>  
> However, from a use case perspective we only have 2D data. So the data in the table e.g. looks like this:
> LINESTRING Z (24.2062447 47.7438004 nan,24.2072655 47.7446771 nan, …)
> with a bounding box of
> BOX3D(-71.14481 42.74781 nan,-71.14431 42.7482 nan)
>  
> After users had reported slow queries we came to know that the analyze command fails for the table with:
>                 NOTICE:  no non-null/empty features, unable to compute statistics
>  
> By looking at the source code it became clear that all samples from the table are considered to have broken geometry, hence statistics creation fails.
> Without debug information available we are not sure were exactly execution fails, but there are two calls to the method which does the actual work, once for 2D and once for ND modes:
>  
> static void
> compute_gserialized_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
>                           int sample_rows, double total_rows)
> {
>        /* 2D Mode */
>        compute_gserialized_stats_mode(stats, fetchfunc, sample_rows, total_rows, 2);
>        /* ND Mode */
>        compute_gserialized_stats_mode(stats, fetchfunc, sample_rows, total_rows, 0);
> }
>  
> I assume that invocation for 2D mode is fine, but within ND mode it fails, because the bounding boxes have a Z value of NaN.
>  
> So far, so good. But what I do not get here is, why do we need the ND mode here at all, as the spatial index is 2D only, as per PostGIS documentation?
> What is the reasoning behind this implementation, and what would be our options to work around that failure, other than writing 0 values as Z ordinates?

We don’t “need” it per se, but the stats are gathered per-column, not per-operator, so we have to gather stats suitable for all the operators that might be called on it. In theory we could only gather an n-d histogram, but in practice having separate 2d stats works “better”. I cannot remember why. 

Working around… yeah, having the z ordinates be valid numbers certainly wouldn’t hurt. I cannot honestly thing of another way. Just store 2D features? Why have the Z stored if you’re just storing junk in it?

> Wouldn’t it be preferable to treat failure of ND mode independent of success of 2D mode?

Yes, I think that’s a reasonable enhancement. There are also folks with ZM shape files where the app has filled in the values with REALLY HUGE NUMBER, and that too stymies the stats gatherer.

ATB,

P.

>  
> Thanks in advance,
> Lars
>  
> _______________________________________________
> postgis-devel mailing list
> postgis-devel at lists.osgeo.org <mailto:postgis-devel at lists.osgeo.org>
> https://lists.osgeo.org/mailman/listinfo/postgis-devel <https://lists.osgeo.org/mailman/listinfo/postgis-devel>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-devel/attachments/20190203/bb17d3b9/attachment.html>


More information about the postgis-devel mailing list