[postgis-users] Loading spatial point from text file
Sairam Krishnamurthy
kmsram420 at gmail.com
Mon Dec 6 19:34:11 PST 2010
Kevin,
Thanks a lot. I did figure out that Rules doesn't work with "COPY FROM". And
the triggers are much faster. Normally it takes like 20 min to insert the
data and update the spatial points. With triggers it is reduced now to 7
mins.
Thanks Again.
-Sairam
On Mon, Dec 6, 2010 at 7:38 PM, Kevin Neufeld <kneufeld.ca at gmail.com> wrote:
> So after some testing, it looks like the COPY command bypasses any RULEs
> placed on the target table, so disregard my RULE suggestion. I'm sure you
> can still use the TRIGGER suggestion, that's up to you. Is it faster? Do a
> test and find out.
>
> Initial setup (create a 1,000,000 row sample csv file):
> create temp table sample as
> select
> (random()::double precision * 360) - 180 AS lon,
> (random()::double precision * 180) - 90 AS lat
> from generate_series(1, 1000000);
>
> copy sample TO E'D:\\Program Files\\PostgreSQL\\9.0\\data\\sample.csv'
> CSV;
>
>
> Trigger approach: (9906ms)
> CREATE TABLE foo (lon double precision, lat double precision, pt
> geometry);
>
> CREATE OR REPLACE FUNCTION foo_insert_tr() RETURNS TRIGGER AS
> $body$
> BEGIN
> NEW.pt := ST_SetSRID(ST_MakePoint(NEW.lon, NEW.lat), 4326);
> RETURN NEW;
> END;
> $body$ LANGUAGE plpgsql;
>
> CREATE TRIGGER foo_insert_tr BEFORE INSERT OR UPDATE ON foo
> FOR EACH ROW EXECUTE PROCEDURE foo_insert_tr();
>
> COPY foo (lon, lat) FROM E'D:\\Program
> Files\\PostgreSQL\\9.0\\data\\sample.csv' CSV;
> -- 9656 ms
>
> ANALYZE foo;
> -- 250ms
>
>
> COPY / UPDATE approach: (27531 ms)
> CREATE TABLE foo (lon double precision, lat double precision, pt
> geometry);
>
> COPY foo (lon, lat) FROM E'D:\\Program
> Files\\PostgreSQL\\9.0\\data\\sample.csv' CSV;
> -- 3625 ms
>
> -- An update is an expensive operation ... it's like a DELETE followed by
> an INSERT
> -- and you still have to deal with the dead space (it'll make your table
> twice as
> -- big physically)
> UPDATE foo SET pt = ST_SetSRID(ST_MakePoint(lon, lat), 4326);
> -- 18609 ms
>
> VACUUM FULL ANALYZE foo;
> -- 5297 ms.
>
>
> Cheers,
> -- Kevin
>
>
>
> On 12/6/2010 12:06 PM, Sairam Krishnamurthy wrote:
>
> Hey Kevin,
>
> Thanks for your reply. Will this be faster than updating the spatial
> points after loading the lat/lon values? I will sure try this out in a
> while. But if this is going to slower that that things wont be easier for
> me. What is your opinion?
>
> Thanks,
> Sairam
>
> On Mon, Dec 6, 2010 at 12:00 PM, Kevin Neufeld <kneufeld.ca at gmail.com>wrote:
>
>> You could create a BEFORE INSERT TRIGGER on your table that modifies the
>> records you are inserting by populating the spatial column using the
>> NEW.long and NEW.lat values.
>>
>> Alternatively, you mentioned that you didn't even want long/lat in the
>> database at all. In that case, as part of your loading script, you could
>> try to:
>> - create a temp dummy table as the target for your COPY command
>> - place a RULE on the dummy table to redirects INSERTs into a table for
>> all your data
>> - run the COPY and drop the dummy table.
>>
>> i.e., this works.
>>
>> CREATE TABLE foo (pt geometry);
>> CREATE TEMP TABLE foo_tmp (long double precision, lat double precision);
>>
>> CREATE RULE foo_tmp_insert_rule AS
>> ON INSERT TO foo_tmp
>> DO INSTEAD
>> INSERT INTO foo (pt) VALUES (ST_MakePoint(NEW.long, NEW.lat));
>>
>> -- You would use your COPY command here to insert records
>> INSERT INTO foo_tmp (long, lat) VALUES (-56.57647, -64.7647);
>> DROP TABLE foo_tmp CASCADE;
>>
>> SELECT ST_AsText(pt) FROM foo;
>> st_astext
>> ---------------------------
>> POINT(-56.57647 -64.7647)
>> (1 rows)
>>
>>
>> Cheers,
>> Kevin
>>
>>
>>
>> On 12/6/2010 8:43 AM, Sairam Krishnamurthy wrote:
>>
>> All,
>>
>> I have a test file that contains the data to the loaded to the spatial
>> table I have. The table structure is <lat,lon,data1,data2,spatialPoint>.
>> Sample data in the txt file : <-64.7647, -56.57647, 1234548, 1221312>.
>>
>> So I want to load the table from this text file. I use the COPY query
>> below.
>>
>> *COPY "table_name" (lat,lon,data1,data2) FROM 'FILE' DELIMITER ','*
>>
>> But the problem is I am not able to update the spatialPoint using copy
>> query. So for now I am loading the lat,lon,data1,data2 fields and then
>> update the spatialPoint using a separate query similar to one below:
>>
>> *UPDATE "table_name" SET "spatialPoint" =
>> ST_SetSRID(ST_MakePoint(lat,lon),4326) WHERE "spatialPoint" IS NULL*
>>
>> My question is, is there a way to avoid the second query so that I can
>> also load the spatialPoint in the COPY query?
>>
>> Also I initial dint want to have lat,lon in the table and have only the
>> spatialPoint field. Because of the above problem I was forced to have
>> lat,lon fields in the table. Someway to achieve this will help me a lot.
>>
>>
>>
>> Thanks,
>> Sairam Krishnamurthy
>> +1 612 859 8161
>>
>>
>> _______________________________________________
>> postgis-users mailing listpostgis-users at postgis.refractions.nethttp://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>>
>> _______________________________________________
>> postgis-users mailing list
>> postgis-users at postgis.refractions.net
>> http://postgis.refractions.net/mailman/listinfo/postgis-users
>>
>>
>
> _______________________________________________
> postgis-users mailing listpostgis-users at postgis.refractions.nethttp://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20101206/55da65e7/attachment.html>
More information about the postgis-users
mailing list