<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Trang,<br>
<br>
I think, it could help to create btree indices on "startts" and
"uuid", too, since you are using them in your where clause as a
filter (a probably unnecessary question regarding your date filter:
I would expect the result of "t.startts > '2015-01-16' and
t.startts < '2015-01-17'" to be null because there are no days
between the 16th and the 17th of january - but perhaps it was only
an example...). And in general, my suggestion would be to reduce the
use of st_intersects to the necessary mimimum.<br>
You could use the with clause for the filtration of your input data
and afterwards double join the two tables first on the startloc and
second on the endloc for the assignment of the origin and the
destination zone. Then group by origin and destination zones while
counting your trips and you should have your end result.<br>
<br>
Here is how I would imagine the query:<br>
<br>
with<br>
t as (select trip_id, startloc, endloc from od1.trip_v1 where
startts between 'minimum start date' and 'maximum start date'),<br>
z as (select zone from od1.taz where uuid in ('kansas_303',
'kansas_601', 'kansas_603', etc))<br>
<p class="MsoNormal">select count(t.trip_id) as number_of_trips,
orig.zone as orig_zone, dest.zone as dest_zone from t left join z
as orig on st_intersects(t.startloc, z.geom) left join z as dest
on st_intersects(t.endloc, z.geom) group by orig.zone, dest.zone;<br>
</p>
<p class="MsoNormal"><br>
</p>
However, I am not sure about how the gist indices work together with
the subselects of the with clause...<br>
<br>
Hope this helps,<br>
<br>
Birgit.<br>
<br>
<br>
<br>
<div class="moz-cite-prefix">Am 11.02.2015 um 08:19 schrieb Trang
Nguyen:<br>
</div>
<blockquote
cite="mid:7BA3BA5B1126A649BF2B6E9A6C43B528CEC5A0@COREXH10.inrix.corpnet.local"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<meta name="Generator" content="Microsoft Word 14 (filtered
medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:"Calibri","sans-serif";
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
<div class="WordSection1">
<p class="MsoNormal">Hi,<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I am a newbie to Postgres/PostGIS and have
a long running query that I would like to optimize.<o:p></o:p></p>
<p class="MsoNormal">There are two tables (trip and zone) that I
am joining in the query, one which has “startloc” and “endloc”
columns with type Geometry(Point) and other which contains a
Geometry(MultiPolygon). There are GIST indexes on all above
columns:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">CREATE TABLE od1.trip_v1<o:p></o:p></p>
<p class="MsoNormal">(<o:p></o:p></p>
<p class="MsoNormal"> pkey bigint NOT NULL,<o:p></o:p></p>
<p class="MsoNormal"> trip_id character varying,<o:p></o:p></p>
<p class="MsoNormal"> startts timestamp without time zone,<o:p></o:p></p>
<p class="MsoNormal"> endts timestamp without time zone,<o:p></o:p></p>
<p class="MsoNormal"> startloc geometry(Point),<o:p></o:p></p>
<p class="MsoNormal"> endloc geometry(Point),<o:p></o:p></p>
<p class="MsoNormal"> …<o:p></o:p></p>
<p class="MsoNormal">)<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">CREATE TABLE od1.taz<o:p></o:p></p>
<p class="MsoNormal">(<o:p></o:p></p>
<p class="MsoNormal"> uuid character varying NOT NULL,<o:p></o:p></p>
<p class="MsoNormal"> zone character varying NOT NULL,<o:p></o:p></p>
<p class="MsoNormal"> createdts timestamp without time zone NOT
NULL,<o:p></o:p></p>
<p class="MsoNormal"> updatedts timestamp without time zone NOT
NULL,<o:p></o:p></p>
<p class="MsoNormal"> geom geometry(MultiPolygon) NOT NULL,<o:p></o:p></p>
<p class="MsoNormal"> CONSTRAINT taz_pkey PRIMARY KEY (uuid)<o:p></o:p></p>
<p class="MsoNormal">)<o:p></o:p></p>
<p class="MsoNormal"><br>
I’m interested in building a matrix that, for a given set of
input zones, returns trips along with their start and end
zones. Output looks like:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">10 trips that start at Zone A, ends at Zone
B<o:p></o:p></p>
<p class="MsoNormal">2 trips that start at Zone A, ends at Zone
C<o:p></o:p></p>
<p class="MsoNormal">9 trips that start at Zone A, ends at other<o:p></o:p></p>
<p class="MsoNormal">13 trips that start at Zone C, ends at Zone
D<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Since I am dealing with a large dataset
(> 24 million records and growing), I was planning on
writing a query that returns the trips grouped by each zone
along with match condition (start, end or both) and doing
aggregation on the client layer. I’m not sure whether this is
the best approach but I expect that otherwise, I would end up
having to write a very complex query to handle that type of
aggregation.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Even so, the current query is very slow
with very high cost:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">SELECT t.trip_id, <o:p></o:p></p>
<p class="MsoNormal">case <o:p></o:p></p>
<p class="MsoNormal"> when
ST_intersects(t.startloc, z.geom) and ST_intersects(t.endloc,
z.geom) then 'orig-dest'<o:p></o:p></p>
<p class="MsoNormal"> when
ST_intersects(t.startloc, z.geom) then 'orig'<o:p></o:p></p>
<p class="MsoNormal"> when
ST_intersects(t.endloc, z.geom) then 'dest'<o:p></o:p></p>
<p class="MsoNormal"> else 'none'<o:p></o:p></p>
<p class="MsoNormal">end as match_cond,<o:p></o:p></p>
<p class="MsoNormal">z.zone from od1.trip_v1 t,
od1.taz z <o:p></o:p></p>
<p class="MsoNormal">where t.startts > '2015-01-16' and
t.startts < '2015-01-17' <o:p>
</o:p></p>
<p class="MsoNormal">and z.uuid in ('kansas_303', 'kansas_601',
'kansas_603', etc)<o:p></o:p></p>
<p class="MsoNormal">and ST_intersects(t.startloc, z.geom) <o:p></o:p></p>
<p class="MsoNormal">or ST_intersects(t.endloc, z.geom)<o:p></o:p></p>
<p class="MsoNormal">group by z.zone, t.trip_id, match_cond;<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Explain plan:<o:p></o:p></p>
<p class="MsoNormal">"Group
(cost=231446695055.73..245971533247.59 rows=14240037443
width=3498)"<o:p></o:p></p>
<p class="MsoNormal">" -> Sort
(cost=231446695055.73..231482295149.34 rows=14240037443
width=3498)"<o:p></o:p></p>
<p class="MsoNormal">" Sort Key: z.zone, t.trip_id, (CASE
WHEN ((t.startloc && z.geom) AND
_st_intersects(t.startloc, z.geom) AND (t.endloc &&
z.geom) AND _st_intersects(t.endloc, z.geom)) THEN
'orig-dest'::text WHEN ((t.startloc && z.geom) AND
_st_intersects(t.startloc, (...)"<o:p></o:p></p>
<p class="MsoNormal">" -> Nested Loop
(cost=91.70..14401634128.24 rows=14240037443 width=3498)"<o:p></o:p></p>
<p class="MsoNormal">" -> Seq Scan on taz z
(cost=0.00..739.19 rows=4619 width=3406)"<o:p></o:p></p>
<p class="MsoNormal">" -> Bitmap Heap Scan on
trip_v1 t (cost=91.70..4151.26 rows=453 width=107)"<o:p></o:p></p>
<p class="MsoNormal">" Recheck Cond:
((startloc && z.geom) OR (endloc && z.geom))"<o:p></o:p></p>
<p class="MsoNormal">" Filter: (((startts
> '2015-01-16 00:00:00'::timestamp without time zone) AND
(startts < '2015-01-17 00:00:00'::timestamp without time
zone) AND ((z.uuid)::text = ANY
('{kansas_303,kansas_601,kansas_603,kansas_604,kansas_10,kansas_11,kan
(...)"<o:p></o:p></p>
<p class="MsoNormal">" -> BitmapOr
(cost=91.70..91.70 rows=2706 width=0)"<o:p></o:p></p>
<p class="MsoNormal">" -> Bitmap
Index Scan on idx_trip_v1_startloc (cost=0.00..45.74
rows=1353 width=0)"<o:p></o:p></p>
<p class="MsoNormal">" Index
Cond: (startloc && z.geom)"<o:p></o:p></p>
<p class="MsoNormal">" -> Bitmap
Index Scan on idx_trip_v1_endloc (cost=0.00..45.74 rows=1353
width=0)"<o:p></o:p></p>
<p class="MsoNormal">" Index
Cond: (endloc && z.geom)"<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Some help or suggestions on how to speed up
the query would be much appreciated.<o:p></o:p></p>
<p class="MsoNormal">Also, I currently don’t have a specific map
projection defined on the geom columns so they are using the
default of 0 in postgis. The points in both trip and zone
geometries are lon/lat. Is this an issue for the ST_intersect?<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks,<br>
Trang<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
postgis-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a>
<a class="moz-txt-link-freetext" href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users</a></pre>
</blockquote>
<br>
</body>
</html>