[postgis-users] OT: Generic postgresql Q about LARGE number of tables

Stephen Woodbridge woodbri at swoodbridge.com
Sat Nov 27 13:27:14 PST 2010


Hi all,

I'm helping a nephew with a thesis project in economics and finance. We 
need to load historical stock quote data for roughly 8,000 symbols, 
where each symbol will have something like 2,000-30,000 records. I have 
the data in csv files, so loading it with COPY should work fine.
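
For reference, I'm picturing a load along these lines for each file 
(table and column names are just placeholders for whatever the csv 
actually holds):

    -- one COPY per symbol's csv file, e.g. into a per-symbol table:
    COPY aapl (quote_date, open, high, low, close, volume)
        FROM '/data/AAPL.csv' WITH CSV HEADER;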

I'm wondering if people have any thoughts on organizing this. The 
obvious options that come to mind are:

1. 8000 tables, one per symbol
2. 8000 tables inherited from a master table (sketched below)
3. one huge table with something like 80M records in it (est. as 8000 
symbols averaging 10,000 records each); I would need a symbol column on 
every record to identify which rows belong to which symbol.
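
To be concrete about 2., I mean something like this (column names are 
placeholders again, and the master table on its own is basically the 
one-huge-table layout of 3.):

    CREATE TABLE quotes (
        symbol     text    NOT NULL,
        quote_date date    NOT NULL,
        close      numeric
        -- plus open/high/low/volume as needed
    );

    -- one child table per symbol, e.g.:
    CREATE TABLE quotes_aapl (
        CHECK (symbol = 'AAPL')
    ) INHERITS (quotes);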

So 1. and 2. sound simple enough, and the symbol can be the table name. 
I think there are some advantages if I want to update the data in the 
future.

With 3. it sounds like I might run into performance problems, 
especially if I need to compare two symbols, because that would require 
a self join.
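
For example, lining up two symbols by date in the single-table design 
would look something like this (names invented, as above):

    SELECT a.quote_date, a.close AS aapl_close, b.close AS msft_close
      FROM quotes a
      JOIN quotes b ON b.quote_date = a.quote_date
     WHERE a.symbol = 'AAPL'
       AND b.symbol = 'MSFT';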

Typically we will want to compare one symbol against another, 
synchronizing records by date to do correlation analysis. This makes me 
think that separate tables are probably the best way to go.
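
With separate tables the date sync becomes a plain join, and if I 
understand it right the correlation could even come straight from the 
built-in corr() aggregate, something like:

    SELECT corr(a.close, m.close)
      FROM aapl a
      JOIN msft m USING (quote_date);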

My hesitation is that I have never created a database with 8000+ tables 
in postgresql, and I'm not sure how that will work out in and of 
itself.

Anyway, sorry for the off topic post, but if anyone has experience with 
large numbers of tables I would appreciate your thoughts. Off list 
would be fine to avoid additional OT noise here.

Thanks,
   -Steve
