[postgis-devel] RFC: Remove mmin and mmax from GBOX
Mark Cave-Ayland
mark.cave-ayland at siriusit.co.uk
Mon Jul 11 08:00:51 PDT 2011
On 11/07/11 15:05, chodgson at refractions.net wrote:
>> If you are starting to ask those type of queries, then M is not an
>> appropriate place for this data - you would spin it out into a separate
>> column within your table and index it appropriately there. Then the
>> PostgreSQL planner has a lot more information to be able to choose a
>> better query plan.
>
> If your query is - In this set of traces, select all the traces within a
> specified bounding box that also pass through regions where the mosquito count
> is within a given range, then I think it is appropriate to put your data in the
> M column, and to index it as 4-D. So I think the semantics of what is a
> "dimension" and what is not are dependent on the problem space more that on the
> units of the M value.
Really? In my experience, stuffing everything into a single
multi-dimensional index is the *worst* way to handle this type of query.
Think of how much larger the multi-dimensional index will become, and
also if you're using mosquito bites/temperature then the fact that these
readings like within a very narrow range will skew the index and make it
much less efficient.
Effectively what you're stating here is that instead of having a table
with columns X, Y, Z and M, a table with a single column of ARRAY[X, Y,
Z, M] would be more efficient which is very unlikely to be the case. For
example, having a separate M column with a B-Tree index means that range
searches on M can eliminate huge chunks of I/O if the lower selectivity
means the comparatively larger M-D index is ignored.
(As a separate off-topic thought, I've never understood why the OGC
model includes M as part of the main object instead of placing the value
into a separate column. It comes across to me that while they understood
its purpose within shapefiles, they didn't quite understand its
equivalence within the relational model)
> However I do agree that there are many situations where it doesn't make sense to
> index the M value, and it would be nice if there was a way to disable it. And it
> also makes sense that these same situations will affect the treatment of the
> data by other functions. I'm not certain that the situation of "4-d index makes
> sense/is useful" is necessarily the same situation as "4th coordinate is a
> 'true' dimension". So perhaps there are two separate bits of metadata to capture
> there.
I agree you bring up a valid point here; how do we control how many
dimensions should be indexed? It seems to me that we both see that there
is something lacking in the current setup, but have different ideas on
how to go about solving this.
ATB,
Mark.
--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063
Sirius Labs: http://www.siriusit.co.uk/labs
More information about the postgis-devel
mailing list