[postgis-users] speeding up "not in" query

Sat Jul 24 21:35:23 PDT 2010

Even better idea since you are looking for non matches.  Use a LEFT JOIN

SELECT survey.*
FROM survey LEFT JOIN parcel_esri.parcelnum
	ON (survey.parcelnum = parcel_esri.parcelnum)
WHERE parcel_esri.primary_key IS NULL;

Your primary key stuff a field that never has null values.

Leo and Regina
http://www.postgis.us

-----Original Message-----
From: postgis-users-bounces at postgis.refractions.net
[mailto:postgis-users-bounces at postgis.refractions.net] On Behalf Of David
Epstein
Sent: Saturday, July 24, 2010 5:09 PM
To: postgis-users at postgis.refractions.net
Subject: [postgis-users] speeding up "not in" query

I have one spatial table and one non-spatial table each with about 380k
rows. They can be joined by "parcelnum". However, there are some parcel
numbers in each table that do not match. I want the full rows of the
non-matching from each. I've made a standard (b-tree) index for "parcelnum"
in both tables and then ran this query to get non-matching rows in a single
direction:

SELECT *
FROM survey
WHERE survey.parcelnum NOT IN
(SELECT parcel_esri.parcelnum
FROM parcels_esri)

This query has already run for 35 minutes and is still running. I have a
laptop running Ubuntu 9.04 64bit with two P8700 2.53GHz CPUs and 3.8GB of
memory. Only 1 CPU and 1.3GB of memory is being used. Is there anything I
can do to speed up not-in (and join) queries?

thank you,
-david

_______________________________________________
postgis-users mailing list
postgis-users at postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users