<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#ffffff" text="#000000">

    <font face="sans-serif">Hi,<br>

      <br>

      I suggest you try "double precision" instead of "numeric".<br>
      From a Java perspective, "double precision" maps to a double whilst
      "numeric" maps to a BigDecimal, which consumes much more memory and
      network bandwidth than a double. You could also consider "real"
      instead of "double precision", which maps to a Java float, if
      roughly six significant decimal digits are enough for your data
      (note that your sample coordinates carry more digits than that).<br>

      <br>

      <a class="moz-txt-link-freetext" href="http://www.postgresql.org/docs/8.4/static/datatype-numeric.html">http://www.postgresql.org/docs/8.4/static/datatype-numeric.html</a><br>
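      <br>
      As a rough sketch only (untested against your data, and assuming you
      keep the table and column names from your TestSet1), the table
      definition would become something like the following. The two size
      queries at the end are just one way to compare the result after
      loading; the figures you get will depend on your data.<br>
      <br>
      <tt>
      -- Sketch: same table as in TestSet1, with fixed-width floating point columns<br>
      CREATE TABLE tunnel6<br>
      (<br>
      latitude double precision,<br>
      longitude double precision,<br>
      altitude double precision<br>
      );<br>
      <br>
      -- After loading, check the heap size and the total size including indexes<br>
      SELECT pg_size_pretty(pg_relation_size('tunnel6')) AS table_size,<br>
      pg_size_pretty(pg_total_relation_size('tunnel6')) AS total_size;<br>
      <br>
      -- Per-value comparison for one sample coordinate (in-memory size;<br>
      -- the size stored on disk can differ slightly)<br>
      SELECT pg_column_size(53.39211958::numeric) AS numeric_bytes,<br>
      pg_column_size(53.39211958::double precision) AS double_bytes;<br>
      </tt>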

      <br>

      I hope this helps.<br>

      <br>

      Cheers :)<br>

      <br>

    </font>

    <pre class="moz-signature" cols="72">Richard Gomes

<a class="moz-txt-link-freetext" href="http://www.jquantlib.org/index.php/User:RichardGomes">http://www.jquantlib.org/index.php/User:RichardGomes</a>

twitter: frgomes

JQuantLib is a library for Quantitative Finance written in Java.

<a class="moz-txt-link-freetext" href="http://www.jquantlib.com/">http://www.jquantlib.com/</a>

twitter: jquantlib

</pre>

    <br>

    On 07/04/11 17:57, Paul &amp; Caroline Lewis wrote:

    <blockquote cite="mid:DUB103-w38B4DB5D4AF648D3684579B7A40@phx.gbl"

      type="cite">


      DB SIZE PROBLEM:<br>

         I have run the following tests on a Postgresql 9 with postgis

      1.5 platform and am getting significant table and index size

      differences.<br>

      TestSet1 is run with a file tunnel6.asc, a CSV file with the

      following being a sample of the data:<br>

      -6.34223029,53.39211958,132.586<br>

      The file is 6 GB in size with 70 million rows. After running
      TestSet1 the table has the correct number of rows (70 million)
      and a random inspection of the data looks fine; the table size
      is 16 GB and the index size is 6195 MB.<br>

       <br>

      I then drop the tunnel6 table (with CASCADE) before running TestSet2.<br>

       <br>

      TestSet2 is run on a preprocessed version of tunnel6.asc,
      now called tunnel6_py.asc, with the following being a sample of
      the data:<br>

      -6.34223029,53.39211958,132.586,SRID=4326;POINT(-6.34223029

      53.39211958 132.586)<br>

      This file grows to 8 GB and still has 70 million rows. After
      following the TestSet2 steps the table still has the correct
      number of rows (70 million) and a random inspection of the data
      still looks fine, but now the table size is 9.5 GB and the index
      size is 3363 MB.<br>

       <br>

      Have I done something significantly wrong in these tests?<br>

      The TestSet2 process loads the data about 10 minutes quicker than

      TestSet1 so I would like to use it but I don't trust it now given

      the significant differences in table sizes.<br>

       <br>

--********************************************************************/<br>

      --TestSet1<br>

--********************************************************************/<br>

      CREATE TABLE tunnel6<br>

      (<br>

      latitude numeric,<br>

      longitude numeric,<br>

      altitude numeric);<br>

       <br>

      COPY tunnel6 (<br>

         latitude,<br>

         longitude,<br>

         altitude)<br>

        FROM '/media/storage/tunnel6.asc'<br>

        CSV<br>

        HEADER;<br>

       <br>

      SELECT AddGeometryColumn('tunnel6','wgs_geom','4326','POINT',3);<br>

       <br>

      UPDATE tunnel6 SET wgs_geom = ST_SETSRID(ST_MAKEPOINT(longitude,

      latitude, altitude),4326);<br>

       <br>

      CREATE INDEX tunnel6_wgs_point ON tunnel6 USING gist(wgs_geom);<br>

--********************************************************************/<br>

      --TestSet2 - Python PreProcessed Input File<br>

--********************************************************************/<br>


      CREATE TABLE tunnel6<br>

      (<br>

      latitude numeric,<br>

      longitude numeric,<br>

      altitude numeric<br>

      );<br>

       <br>

      SELECT AddGeometryColumn('tunnel6','wgs_geom','4326','POINT',3);<br>

       <br>

      COPY tunnel6(<br>

         latitude,<br>

         longitude,<br>

         altitude,<br>

         wgs_geom)<br>

        FROM '/media/storage/tunnel6_py.asc'<br>

        CSV<br>

        HEADER;<br>

       <br>

      CREATE INDEX tunnel6_wgs_point ON tunnel6 USING gist(wgs_geom);<br>

--********************************************************************/<br>

       <br>

      Any help or insights would be much appreciated.<br>

      Thanks.<br>

      <pre wrap="">


_______________________________________________

postgis-users mailing list

<a class="moz-txt-link-abbreviated" href="mailto:postgis-users@postgis.refractions.net">postgis-users@postgis.refractions.net</a>

<a class="moz-txt-link-freetext" href="http://postgis.refractions.net/mailman/listinfo/postgis-users">http://postgis.refractions.net/mailman/listinfo/postgis-users</a>

</pre>

    </blockquote>

  </body>

</html>