[postgis-users] speeding up "not in" query

Paragon Corporation lr at pcorp.us
Sun Jul 25 22:05:58 PDT 2010


David,

You are probably being bitten by nulls.  Check to make sure your parcelnumbs
don't have nulls in them.

This article we wrote might help

http://www.postgresonline.com/journal/archives/166-NOT-IN-NULL-Uniqueness-tr
ickery.html

Leo and Regina
http://www.postgis.us 

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of David
Epstein
Sent: Monday, July 26, 2010 12:24 AM
To: postgis-users at postgis.refractions.net
Subject: Re: [postgis-users] speeding up "not in" query

Thank you all for your suggestions. 

My original query not only took 4.6 hours to run, but also returned the
wrong answer. The "not exists" version below returned the correct answer (as
far as I can tell) in 2 minutes. I thought these were essentially the same
query but apparently they are not. I have not yet tried the left join
version.

SELECT *
FROM parcels_esri
WHERE NOT EXISTS
(SELECT survey.parcelnum
FROM survey WHERE parcels_esri.parcelnumb = survey.parcelnum)

"Seq Scan on parcels_esri  (cost=0.00..3320785.37 rows=193444
width=4537) (actual time=17.793..4154.274 rows=84489 loops=1)"
"  Filter: (NOT (subplan))"
"  SubPlan"
"    ->  Index Scan using parcelnum_idx on survey  (cost=0.00..8.35
rows=1 width=26) (actual time=0.009..0.009 rows=1 loops=386887)"
"          Index Cond: ($0 = parcelnum)"
"Total runtime: 4200.203 ms"


Running EXPLAIN ANALYZE on my original query takes more time than I am
willing to wait. This seems like a very problematic formulation:

SELECT *
FROM parcels_esri
WHERE parcels_esri.parcelnumb NOT IN
(SELECT survey.parcelnum
FROM survey)


Thanks again for your help. I now know the huge improvements that can come
from such variations.

-david

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users





More information about the postgis-users mailing list