[postgis-users] design problem

Basques, Bob (CI-StPaul) bob.basques at ci.stpaul.mn.us
Fri Mar 1 14:31:17 PST 2013


Steve,

Could this inherited-tables approach be used to roll up GPS data by time increments, for example?  Maybe roll up the data by day, week, or month?  I need to figure out a way to handle queries against potentially millions of records for reporting purposes.

Bobb
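
The roll-up Bobb asks about could be sketched roughly as below. This is only an illustration, not code from the thread; the table and column names (gps_points, vehicle_id, recorded_at, geom) are hypothetical:

```sql
-- Hypothetical schema: gps_points(vehicle_id, recorded_at, geom).
-- Pre-aggregate raw GPS fixes into a daily summary table so reports
-- scan thousands of summary rows instead of millions of raw fixes.
CREATE TABLE gps_daily_summary AS
SELECT vehicle_id,
       date_trunc('day', recorded_at) AS day,
       count(*)                       AS fix_count,
       ST_Collect(geom)               AS fixes
FROM   gps_points
GROUP  BY vehicle_id, day;
```

Weekly or monthly roll-ups would use date_trunc('week', ...) or date_trunc('month', ...) in the same way.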



>>  -----Original Message-----
>>  From: postgis-users-bounces at lists.osgeo.org [mailto:postgis-
>>  users-bounces at lists.osgeo.org] On Behalf Of Stephen Woodbridge
>>  Sent: Friday, March 01, 2013 4:23 PM
>>  To: postgis-users at lists.osgeo.org
>>  Subject: Re: [postgis-users] design problem
>>  
>>  On 3/1/2013 4:44 PM, Andy Colson wrote:
>>  > On 3/1/2013 3:11 PM, Denise Janson wrote:
>>  >> Hi,
>>  >>
>>  >> I have an application that is going to receive lots of
>>  >> georeferenced files every day. Each file has information of
>>  >> several points. Probably in a few years my application will
>>  >> have a terabyte of point information stored.
>>  >>
>>  >> I think I can do this design in two ways:
>>  >>
>>  >> 1. Two tables, one of “uploaded_files”, and another of “points”
>>  >> (one uploadedFile to N points). And I'll have to partition the
>>  >> points table, maybe by month …
>>  >> 2. Or I can create one table per file, having thousands of
>>  >> tables in a few years.
>>  >>
>>  >> Which case is better for my application? Is there any better
>>  >> way to do this?
>>  >>
>>  >
>>  > If performance is a concern, and the files are of any meaningful
>>  > size, you might consider leaving them on the filesystem and
>>  > having the table point to them (full path name sort of thing).
>>  >
>>  > Storing the files in PG is possible, and it's nice because
>>  > everything is kept together, but if you have to get to and read
>>  > the files fast, then leave them on the filesystem.
>>  >
>>  > The lots-of-tables approach is problematic if you ever want to
>>  > write queries that look back in time. It's much harder to say,
>>  > give me every record from the beginning of time at this point.
>>  >
>>  > With a good index, PG won't have a problem with a single table
>>  > containing billions of rows. Just try to avoid doing bulk
>>  > operations (like update and delete) on the entire table.
>>  >
>>  >
>>  >  > uploadedFile to N points). And I'll have to partition the
>>  >  > points table,
>>  >
>>  >
>>  > Why will you have to partition it?
>>  
>>  You might want to consider using inherited tables. This way you
>>  can have something like:
>>  
>>  master_table
>>      - table1 inherits from master_table
>>      - table2 inherits from master_table
>>      - etc
>>  
>>  This has the advantage that you can set constraints on the
>>  sub-tables, like date_from, date_to, or other constraints that
>>  you might need in your queries.
>>  
>>  Then when you make your query on the master_table, it will
>>  eliminate all the tables that fail the constraint test, and this
>>  is very fast. Also, if you ever need to make ad hoc queries on
>>  the master_table, you still have a structure that supports that.
>>  
>>  There might be some issues with inheriting 10s of 1000s of tables.
>>  
>>  The right design can really only be determined by understanding
>>  what your queries are going to look like with respect to all this
>>  data.
>>  
>>  -Steve
>>  
>>  _______________________________________________
>>  postgis-users mailing list
>>  postgis-users at lists.osgeo.org
>>  http://lists.osgeo.org/cgi-bin/mailman/listinfo/postgis-users
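
The inherited-table setup Steve describes could be sketched as below. This is a minimal illustration, not code from the thread; the table names (points_master, points_2013_03) and the monthly partitioning column are assumptions:

```sql
-- One child table per month inherits from the master; a CHECK
-- constraint tells the planner which rows each child can hold.
CREATE TABLE points_master (
    id          bigserial,
    recorded_at timestamptz NOT NULL,
    geom        geometry(Point, 4326)
);

CREATE TABLE points_2013_03 (
    CHECK (recorded_at >= '2013-03-01' AND recorded_at < '2013-04-01')
) INHERITS (points_master);

-- With constraint exclusion enabled, a query against the master
-- skips every child whose CHECK constraint rules out the WHERE range.
SET constraint_exclusion = partition;

SELECT count(*)
FROM   points_master
WHERE  recorded_at >= '2013-03-10' AND recorded_at < '2013-03-11';
```

Inserts would go into the matching child table (typically via a trigger on the master), and ad hoc queries can still hit points_master directly.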
