<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi Trang,<br>

    <br>

    I think, it could help to create btree indices on "startts" and

    "uuid", too, since you are using them in your where clause as a

    filter (a probably unnecessary question regarding your date filter:

    I would expect the result of "t.startts > '2015-01-16' and

    t.startts < '2015-01-17'" to be null because there are no days

    between the 16th and the 17th of january - but perhaps it was only

    an example...). And in general, my suggestion would be to reduce the

    use of st_intersects to the necessary mimimum.<br>

    You could use the with clause for the filtration of your input data

    and afterwards double join the two tables first on the startloc and

    second on the endloc for the assignment of the origin and the

    destination zone. Then group by origin and destination zones while

    counting your trips and you should have your end result.<br>

    <br>

    Here is how I would imagine the query:<br>

    <br>

    with<br>

    t as (select trip_id, startloc, endloc from od1.trip_v1 where

    startts between 'minimum start date' and 'maximum start date'),<br>

    z as (select zone from od1.taz where uuid in ('kansas_303',

    'kansas_601', 'kansas_603', etc))<br>

    <p class="MsoNormal">select  count(t.trip_id) as number_of_trips,

      orig.zone as orig_zone, dest.zone as dest_zone from t left join z

      as orig on st_intersects(t.startloc, z.geom) left join z as dest

      on st_intersects(t.endloc, z.geom) group by orig.zone, dest.zone;<br>

    </p>

    <p class="MsoNormal"><br>

    </p>

    However, I am not sure about how the gist indices work together with

    the subselects of the with clause...<br>

    <br>

    Hope this helps,<br>

    <br>

    Birgit.<br>

    <br>

    <br>

    <br>

    <div class="moz-cite-prefix">Am 11.02.2015 um 08:19 schrieb Trang

      Nguyen:<br>

    </div>

    <blockquote

cite="mid:7BA3BA5B1126A649BF2B6E9A6C43B528CEC5A0@COREXH10.inrix.corpnet.local"

      type="cite">

      <meta http-equiv="Content-Type" content="text/html;

        charset=windows-1252">

      <meta name="Generator" content="Microsoft Word 14 (filtered

        medium)">

      <style><!--

/* Font Definitions */

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        margin-bottom:.0001pt;

        font-size:11.0pt;

        font-family:"Calibri","sans-serif";}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {mso-style-priority:99;

        color:purple;

        text-decoration:underline;}

span.EmailStyle17

        {mso-style-type:personal-compose;

        font-family:"Calibri","sans-serif";

        color:windowtext;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-family:"Calibri","sans-serif";}

@page WordSection1

        {size:8.5in 11.0in;

        margin:1.0in 1.0in 1.0in 1.0in;}

div.WordSection1

        {page:WordSection1;}

--></style><!--[if gte mso 9]><xml>

<o:shapedefaults v:ext="edit" spidmax="1026" />

</xml><![endif]--><!--[if gte mso 9]><xml>

<o:shapelayout v:ext="edit">

<o:idmap v:ext="edit" data="1" />

</o:shapelayout></xml><![endif]-->

      <div class="WordSection1">

        <p class="MsoNormal">Hi,<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">I am a newbie to Postgres/PostGIS and have

          a long running query that I would like to optimize.<o:p></o:p></p>

        <p class="MsoNormal">There are two tables (trip and zone) that I

          am joining in the query, one which has “startloc” and “endloc”

          columns with type Geometry(Point) and other which contains a

          Geometry(MultiPolygon). There are GIST indexes on all above

          columns:<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">CREATE TABLE od1.trip_v1<o:p></o:p></p>

        <p class="MsoNormal">(<o:p></o:p></p>

        <p class="MsoNormal">  pkey bigint NOT NULL,<o:p></o:p></p>

        <p class="MsoNormal">  trip_id character varying,<o:p></o:p></p>

        <p class="MsoNormal">  startts timestamp without time zone,<o:p></o:p></p>

        <p class="MsoNormal">  endts timestamp without time zone,<o:p></o:p></p>

        <p class="MsoNormal">  startloc geometry(Point),<o:p></o:p></p>

        <p class="MsoNormal">  endloc geometry(Point),<o:p></o:p></p>

        <p class="MsoNormal">  …<o:p></o:p></p>

        <p class="MsoNormal">)<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">CREATE TABLE od1.taz<o:p></o:p></p>

        <p class="MsoNormal">(<o:p></o:p></p>

        <p class="MsoNormal">  uuid character varying NOT NULL,<o:p></o:p></p>

        <p class="MsoNormal">  zone character varying NOT NULL,<o:p></o:p></p>

        <p class="MsoNormal">  createdts timestamp without time zone NOT

          NULL,<o:p></o:p></p>

        <p class="MsoNormal">  updatedts timestamp without time zone NOT

          NULL,<o:p></o:p></p>

        <p class="MsoNormal">  geom geometry(MultiPolygon) NOT NULL,<o:p></o:p></p>

        <p class="MsoNormal">  CONSTRAINT taz_pkey PRIMARY KEY (uuid)<o:p></o:p></p>

        <p class="MsoNormal">)<o:p></o:p></p>

        <p class="MsoNormal"><br>

          I’m interested in building a matrix that, for a given set of

          input zones, returns trips along with their start and end

          zones. Output looks like:<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">10 trips that start at Zone A, ends at Zone

          B<o:p></o:p></p>

        <p class="MsoNormal">2 trips that start at Zone A, ends at Zone

          C<o:p></o:p></p>

        <p class="MsoNormal">9 trips that start at Zone A, ends at other<o:p></o:p></p>

        <p class="MsoNormal">13 trips that start at Zone C, ends at Zone

          D<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">Since I am dealing with a large dataset

          (> 24 million records and growing), I was planning on

          writing a query that returns the trips grouped by each zone

          along with match condition (start, end or both) and doing

          aggregation on the client layer. I’m not sure whether this is

          the best approach but I expect that otherwise, I would end up

          having to write a very complex query to handle that type of

          aggregation.<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">Even so, the current query is very slow

          with very high cost:<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">SELECT t.trip_id, <o:p></o:p></p>

        <p class="MsoNormal">case <o:p></o:p></p>

        <p class="MsoNormal">                when

          ST_intersects(t.startloc, z.geom) and ST_intersects(t.endloc,

          z.geom) then 'orig-dest'<o:p></o:p></p>

        <p class="MsoNormal">                when

          ST_intersects(t.startloc, z.geom) then 'orig'<o:p></o:p></p>

        <p class="MsoNormal">                when

          ST_intersects(t.endloc, z.geom) then 'dest'<o:p></o:p></p>

        <p class="MsoNormal">                else 'none'<o:p></o:p></p>

        <p class="MsoNormal">end  as match_cond,<o:p></o:p></p>

        <p class="MsoNormal">z.zone from od1.trip_v1               t,

          od1.taz z <o:p></o:p></p>

        <p class="MsoNormal">where t.startts > '2015-01-16' and

          t.startts < '2015-01-17' <o:p>

          </o:p></p>

        <p class="MsoNormal">and z.uuid in ('kansas_303', 'kansas_601',

          'kansas_603', etc)<o:p></o:p></p>

        <p class="MsoNormal">and ST_intersects(t.startloc, z.geom) <o:p></o:p></p>

        <p class="MsoNormal">or ST_intersects(t.endloc, z.geom)<o:p></o:p></p>

        <p class="MsoNormal">group by z.zone, t.trip_id, match_cond;<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">Explain plan:<o:p></o:p></p>

        <p class="MsoNormal">"Group 

          (cost=231446695055.73..245971533247.59 rows=14240037443

          width=3498)"<o:p></o:p></p>

        <p class="MsoNormal">"  ->  Sort 

          (cost=231446695055.73..231482295149.34 rows=14240037443

          width=3498)"<o:p></o:p></p>

        <p class="MsoNormal">"        Sort Key: z.zone, t.trip_id, (CASE

          WHEN ((t.startloc && z.geom) AND

          _st_intersects(t.startloc, z.geom) AND (t.endloc &&

          z.geom) AND _st_intersects(t.endloc, z.geom)) THEN

          'orig-dest'::text WHEN ((t.startloc && z.geom) AND

          _st_intersects(t.startloc, (...)"<o:p></o:p></p>

        <p class="MsoNormal">"        ->  Nested Loop 

          (cost=91.70..14401634128.24 rows=14240037443 width=3498)"<o:p></o:p></p>

        <p class="MsoNormal">"              ->  Seq Scan on taz z 

          (cost=0.00..739.19 rows=4619 width=3406)"<o:p></o:p></p>

        <p class="MsoNormal">"              ->  Bitmap Heap Scan on

          trip_v1 t  (cost=91.70..4151.26 rows=453 width=107)"<o:p></o:p></p>

        <p class="MsoNormal">"                    Recheck Cond:

          ((startloc && z.geom) OR (endloc && z.geom))"<o:p></o:p></p>

        <p class="MsoNormal">"                    Filter: (((startts

          > '2015-01-16 00:00:00'::timestamp without time zone) AND

          (startts < '2015-01-17 00:00:00'::timestamp without time

          zone) AND ((z.uuid)::text = ANY

          ('{kansas_303,kansas_601,kansas_603,kansas_604,kansas_10,kansas_11,kan

          (...)"<o:p></o:p></p>

        <p class="MsoNormal">"                    ->  BitmapOr 

          (cost=91.70..91.70 rows=2706 width=0)"<o:p></o:p></p>

        <p class="MsoNormal">"                          ->  Bitmap

          Index Scan on idx_trip_v1_startloc  (cost=0.00..45.74

          rows=1353 width=0)"<o:p></o:p></p>

        <p class="MsoNormal">"                                Index

          Cond: (startloc && z.geom)"<o:p></o:p></p>

        <p class="MsoNormal">"                          ->  Bitmap

          Index Scan on idx_trip_v1_endloc  (cost=0.00..45.74 rows=1353

          width=0)"<o:p></o:p></p>

        <p class="MsoNormal">"                                Index

          Cond: (endloc && z.geom)"<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">Some help or suggestions on how to speed up

          the query would be much appreciated.<o:p></o:p></p>

        <p class="MsoNormal">Also, I currently don’t have a specific map

          projection defined on the geom columns so they are using the

          default of 0 in postgis. The points in both trip and zone

          geometries are lon/lat. Is this an issue for the ST_intersect?<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

        <p class="MsoNormal">Thanks,<br>

          Trang<o:p></o:p></p>

        <p class="MsoNormal"><o:p> </o:p></p>

      </div>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <br>

      <pre wrap="">_______________________________________________

postgis-users mailing list

<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@lists.osgeo.org">postgis-users@lists.osgeo.org</a>

<a class="moz-txt-link-freetext" href="http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users">http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users</a></pre>

    </blockquote>

    <br>

  </body>

</html>