[postgis-users] Re: Help: Performance of spatial query

Jeremy Palmer jpalmer at paradise.net.nz
Fri Mar 26 04:02:34 PST 2004


Hi sorry to bother everyone again but I'll disparate to find an answer to
this problem. I have tried multiple ways to re-write the query but every
time it crashes the server unless I limit the queries to 15 or so records.

I know that the postgresql Redhat server that I have setup is capable of
running queries on high volumes of data (i.e tables with 20 million rows).
So I get the feeling that is not setup of the server but postgis. I have the
setup the log level up on postmaster to see if anything further is logged
during the crash of the database but all it returned was this:

DEBUG:  PortalRun
DEBUG:  reaping dead processes
DEBUG:  child process (PID 24913) was terminated by signal 11
LOG:  server process (PID 24913) was terminated by signal 11
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
DEBUG:  shmem_exit(0)
DEBUG:  invoking IpcMemoryCreate(size=10436608)
LOG:  database system was interrupted at 2004-03-27 11:35:23 NZST
LOG:  checkpoint record is at 0/D6BC1184
LOG:  redo record is at 0/D6BC1184; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 1192; next OID: 10283784
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  record with zero length at 0/D6BC11C4
LOG:  redo is not required
LOG:  database system is ready
DEBUG:  proc_exit(0)
DEBUG:  shmem_exit(0)
DEBUG:  exit(0)
DEBUG:  reaping dead processes

Which was not very helpful. Can anyone guide me or help me out to track down
the errors as to why the postgis library is crashing the server.

Much appreciated

Jeremy Palmer


Quoting Jeremy Palmer <jpalmer at paradise.net.nz>

>Hi sorry about my late reply I have been on holiday. I tried using the your
>suggested method with the expand function to create a bounding box instead
>of using the buffer function
>
>i.e
>
>SELECT  NOD1.id as nod_id1,
>        NOD2.id as nod_id2,
>        distance_spheroid(NOD1.shape, NOD2.shape,
>'SPHEROID["GRS_1980",6378137,298.257222101]')
>FROM    node NOD1,
>        node NOD2
>WHERE   expand(NOD1.shape, 0.000002) && NOD2.shape
>AND     distance(NOD1.shape, NOD2.shape) <= 0.000002
>AND     NOD1.id < NOD2.id;
>
>
>However about 3mins into the query it seems that the database crashes and
>the client return this
>message: " server closed the connection unexpectedly
>	This probably means the server terminated abnormally
>	before or while processing the request."
>
>I have checked the shape geometries to see if they are valid plus I have
>re-created the spatial index but I still get the same error message. Is
>there something wrong with my query?
>I don't have a great knowledge of how the postgresql backend server works,
>is it possible to see debug information as to why the crash occurred?
>
>My postgres version is 7.41 and
>Postgis version is CVS from 26/02/2004
>
>Thanks again
>Jeremy Palmer
>
>Quoting Paul Ramsey <pramsey at refractions.net>
>
>Check out the expand function, for a handy shortcut for creating a 
>"slightly larger" bbox to use for this problem.
>
>Chris is right, the slowness is caused by using an expensive function 
> (buffer) in the midst of a high volume query.
>
>On Monday, March 15, 2004, at 09:59 AM, chodgson at refractions.net wrote:
>
>> I expect that using the geos function to buffer your point is slowing 
>> things
>> down because it has to convert into a geos object and back toa postgis 
>> object.
>> It would probably be better to just use the distance function, and 
>> then in
>> order to index it, build a bounding box which is effectively your 
>> buffered
>> point by using the minx() and miny() functions and string 
>> concatenation ie.
>> build the wkt of a box object by concatenating some strings with the 
>> minx() and
>> miny() function. Your where clause should look something like:
>>
>> WHERE NOD2.shape && (concatenated string using minx/y( NOD1.shape) )
>> AND distance( NOD1.shape, NOD2.shape ) <= 0.2
>>
>> I don't know the exact syntax for the string stuff otherwise I'd give 
>> it to
>> you...
>>
>> HTH.
>> Chris
>>
>>
>> Quoting Jeremy Palmer <jpalmer at paradise.net.nz>:
>>
>>> Hi,
>>>
>>> I have been trying to complete a query that tries to locate duplicate 
>>> points
>>> i.e. points that have been accidentally created twice or more  These 
>>> points
>>> may not exactly be in the same geometric position. My basic 
>>> methodology is
>>> to complete a 0.2m proximity check for each geometry point in the 
>>> database.
>>> The datum of the geometries is in NZGD2000 (basically the same 
>>> parameters as
>>> WGS84).
>>> To do this I was using the buffer function with the overlaps operator 
>>> to
>>> find points within 0.000002 of a degree (~ 0.2m) of each other. I was 
>>> also
>>> trying to stop the point reporting on itself by checking that NOD1.id 
>>> <
>>> NOD2.id. However as explain query shows below my method produces a 
>>> Cartesian
>>> product between the 2 aliases of the node table.
>>>
>>> Can anyone help me with constructing a more efficient query? I have 
>>> provided
>>> the query explain and table details if that helps.
>>>
>>>
>>> SELECT NOD1.id as nod_id1,
>>>        NOD2.id as nod_id2,
>>>        NOD1.shape as shape1,
>>>        NOD2.shape as shape2
>>> FROM node NOD1,
>>>           node NOD2
>>> WHERE NOD2.shape && buffer(NOD1.shape, 0.000002)
>>> AND   NOD1.id < NOD2.id;
>>>
>>> Explain Query was 
>>>
>>>   Nested Loop  (cost=590312.88..5062636139085.67 rows=3509048593179
>>> width=288)
>>>    Join Filter: (("inner".shape && buffer("outer".shape, 2e-06::double
>>> precision)) AND ("outer".id < "inner".id))
>>>    ->  Seq Scan on crs_node nod1  (cost=0.00..379897.88 rows=10260188
>>> width=144)
>>>    ->  Materialize  (cost=590312.88..903329.76 rows=10260188 
>>> width=144)
>>>          ->  Seq Scan on crs_node nod2  (cost=0.00..379897.88 
>>> rows=10260188
>>> width=144)
>>>
>>> Table "public. node"
>>>      Column      |         Type         | Modifiers
>>> -----------------+----------------------+-----------
>>>  id              | integer              | not null
>>>  cos_id_official | integer              | not null
>>>  type            | character varying(4) | not null
>>>  status          | character varying(4) | not null
>>>  order_group_off | integer              | not null
>>>  sit_id          | integer              |
>>>  alt_id          | integer              |
>>>  wrk_id_created  | integer              |
>>>  se_row_id       | integer              |
>>>  audit_id        | integer              | not null
>>>  shape           | geometry             |
>>> Indexes:
>>>     "pkey_node" primary key, btree (id)
>>>     "nod_aud_id" unique, btree (audit_id)
>>>     "fk_nod_alt" btree (alt_id)
>>>     "fk_nod_cos" btree (cos_id_official)
>>>     "fk_nod_sit" btree (sit_id)
>>>     "fk_nod_wrk" btree (wrk_id_created)
>>>     "nod_shape_index" gist (shape)
>>> Check constraints:
>>>     "$1" CHECK (srid(shape) = 104108)
>>>     "$2" CHECK (geometrytype(shape) = 'POINT'::text OR shape IS NULL)
>>>
>>> Thank you very much
>>>
>>> Jeremy Palmer
>>>
>>>
>>>
>>> _______________________________________________
>>> postgis-users mailing list
>>> postgis-users at postgis.refractions.net
>>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>>
>>
>>
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
 >     Paul Ramsey
 >     Refractions Research
 >     Email: pramsey at refractions.net
 >     Phone: (250) 885-0632





More information about the postgis-users mailing list