[postgis-users] OT: Generic postgresql Q about LARGE number of tables

Guyren G Howe gisborne at emailuser.net
Sat Nov 27 13:31:47 PST 2010


On Nov 27, 2010, at 1:27 PM, Stephen Woodbridge wrote:

> Hi all,
> 
> I'm helping a nephew with a thesis project in economics and finance. We need to load historical stock quote data for around 8,000 symbols, where each symbol will have something like 2,000-30,000 records. I have the data in CSV files, so loading it with COPY will work fine.
> 
> I'm wondering if people have any thoughts on organizing this. The obvious options that come to mind are:
> 
> 1. 8000 tables
> 2. 8000 tables inherited from a master table
> 3. one huge table with something like 80M records (estimated as 8,000 symbols at roughly 10,000 records each on average). I would need to add a symbol column to every record to identify which symbol each row belongs to.
> 
> So 1. and 2. sound simple enough, and the symbol can be the table name. I think there would be some advantages if I want to update the data in the future.
> 
> 3. sounds like I might run into performance problems, especially if I need to compare two symbols, because that would require a self-join.

3. There are petabyte-scale databases running on Postgres; 80M rows is nothing, and a single table will be much, much easier to use (otherwise, if you want to analyze data across 1,000 different symbols, you would have to union 1,000 separate queries).
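For concreteness, here is a minimal sketch of option 3. All names are made up for illustration, and I'm assuming each per-symbol CSV has a date,open,high,low,close,volume layout with no symbol column:

    -- One table for everything; the (symbol, day) primary key gives an
    -- index that keeps per-symbol scans and joins fast.
    CREATE TABLE quotes (
        symbol  text    NOT NULL,
        day     date    NOT NULL,
        open    numeric,
        high    numeric,
        low     numeric,
        close   numeric,
        volume  bigint,
        PRIMARY KEY (symbol, day)
    );

    -- Per-symbol CSV files don't carry the symbol, so stage each file
    -- and stamp the symbol on insert.
    CREATE TEMP TABLE staging (
        day date, open numeric, high numeric, low numeric,
        close numeric, volume bigint
    );
    COPY staging FROM '/data/quotes/AAPL.csv' CSV HEADER;
    INSERT INTO quotes SELECT 'AAPL', s.* FROM staging s;
    TRUNCATE staging;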
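And the self-join you're worried about is just an equi-join on the date, e.g.:

    -- Compare two symbols day by day; the primary-key index lets each
    -- side of the join be an index scan rather than a full-table pass.
    SELECT a.day, a.close AS aapl_close, b.close AS msft_close
    FROM   quotes a
    JOIN   quotes b ON b.day = a.day
    WHERE  a.symbol = 'AAPL'
      AND  b.symbol = 'MSFT'
    ORDER  BY a.day;

With that index in place, the planner should only touch rows for the two symbols involved, so the 80M-row total shouldn't matter for a query like this.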
