[postgis-users] Improving ST_DWithin Query Performance
robertvc
robert.cazaciuc at baesystems.com
Fri Mar 11 03:39:42 PST 2011
Hi All,
As a new PostGIS user I thought it would be a good idea to post here and
get some advice/help on my particular problem.
I have a table that receives around 30,000 new entries every 10 minutes or
so. The table format is the following:
-- Table: locations
-- DROP TABLE locations;
CREATE TABLE locations
(
"timestamp" timestamp without time zone NOT NULL,
id integer NOT NULL,
"position" geography(Point,4326),
CONSTRAINT locations_pkey PRIMARY KEY ("timestamp", id),
CONSTRAINT locations_mmsi_fkey FOREIGN KEY (id)
REFERENCES cars (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
ALTER TABLE locations OWNER TO postgres;
-- Index: locations_gix
-- DROP INDEX locations_gix;
CREATE INDEX locations_gix
ON locations
USING gist
("position");
Now, I want to design/implement a function that, given a certain timestamp,
returns a table with all pairs of locations within a certain range (x) of
each other at that time, along with the distance between them. The
returned table format would be:
timestamp, id1, id2, distance
I have tried a few things, and the best solution I have implemented so far
is my own function in PL/pgSQL. The function is the following:
create type carPairs as (t timestamp,id1 int,id2 int,d float8);
create type idDistancePair as (id int,distance float8);
-- Function: getallpairsinrange(timestamp without time zone)
-- DROP FUNCTION getallpairsinrange(timestamp without time zone);
CREATE OR REPLACE FUNCTION getallpairsinrange(timestamp without time zone)
RETURNS SETOF carPairs AS
$BODY$
declare
    inRangePair carPairs;
    locRow locations%rowtype;
    locRowComp idDistancePair;
begin
    -- Outer loop: every location recorded at the requested timestamp
    for locRow in select * from locations where timestamp = $1 loop
        -- Inner loop: all other locations at the same timestamp within
        -- 3000 m of the outer row (id < locRow.id avoids duplicate pairs)
        for locRowComp in
            select id, ST_Distance(position, locRow.position) as distance
            from locations
            where timestamp = $1
              and ST_DWithin(position, locRow.position, 3000)
              and id < locRow.id
        loop
            inRangePair.t := locRow.timestamp;
            inRangePair.id1 := locRow.id;
            inRangePair.id2 := locRowComp.id;
            inRangePair.d := locRowComp.distance;
            return next inRangePair;
        end loop;
    end loop;
    return;
end
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION getallpairsinrange(timestamp without time zone) OWNER TO
postgres;
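For completeness, this is how I call the function (the timestamp literal
here is just an example value, not real data):

```sql
-- Example invocation; the timestamp is illustrative only
SELECT * FROM getallpairsinrange('2011-03-11 03:30:00');
```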
This function performs a lot faster than anything I had previously been
able to implement. I would just like to ask whether you think there is a
way to improve it further. At the moment the query executes in around 30
seconds, given that at a certain timestamp there are around 30,000
locations.
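To make the intended result precise: the pair set I am after should be
equivalent to a single self-join like the sketch below (same table and
column names as above; the 3000 m radius is the range x, and the timestamp
literal is only an example):

```sql
-- Sketch: set-based equivalent of getallpairsinrange for one timestamp.
-- b.id < a.id matches the function's "id < locRow.id" de-duplication.
SELECT a.timestamp,
       a.id AS id1,
       b.id AS id2,
       ST_Distance(a.position, b.position) AS distance
FROM   locations a
JOIN   locations b
  ON   b.timestamp = a.timestamp
 AND   b.id < a.id
 AND   ST_DWithin(a.position, b.position, 3000)
WHERE  a.timestamp = '2011-03-11 03:30:00';  -- example timestamp
```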
I should probably also mention the following changes I made to
postgresql.conf after some searching online:
#1st change
#max_connections = 100 # (change requires restart)
max_connections = 10
#2nd change
#shared_buffers = 32MB # min 128kB (change requires restart)
shared_buffers = 1024MB
#3rd change
max_locks_per_transaction = 300 # min 10 (change requires restart)
# I don't seem to have the initial value saved anywhere in the file
#4th change
#effective_cache_size = 128MB
effective_cache_size = 1024MB
The computer is running on Windows XP SP2 with 3543MB of memory available.
Any suggestions would be appreciated.
Thank you,
Robert
--
View this message in context: http://old.nabble.com/Improving-ST_DWithin-Query-Performance-tp31124281p31124281.html
Sent from the PostGIS - User mailing list archive at Nabble.com.