[postgis-users] OT: Generic postgresql Q about LARGE number of tables
Stephen Woodbridge
woodbri at swoodbridge.com
Sat Nov 27 13:27:14 PST 2010
Hi all,
I'm helping a nephew with a thesis project in economics and finance. We
need to load historical stock quote data for roughly 8,000 symbols, where
each symbol will have something like 2,000-30,000 records. I have the
data in CSV files, so loading it with COPY will work fine.
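For the loading step, something like this per-symbol sketch is what I
have in mind (the table name, column list, and file path are just
placeholders for whatever is actually in the CSV files):

    -- hypothetical per-symbol table; real columns depend on the CSVs
    CREATE TABLE quotes_aapl (
        quote_date  date,
        open        numeric,
        high        numeric,
        low         numeric,
        close       numeric,
        volume      bigint
    );

    -- one CSV file per symbol, loaded in a single pass
    COPY quotes_aapl FROM '/data/quotes/AAPL.csv' WITH CSV HEADER;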
I'm wondering if people have any thoughts on organizing this. The
obvious options that come to mind are:
1. 8000 tables
2. 8000 tables inherited from a master table
3. one huge table with something like 80M records in it (estimated as
8,000 symbols at roughly 10,000 records each); I would need to add a
symbol column to every record to identify which rows belong to which
symbol (see the sketch after this list).
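To make options 2 and 3 concrete, here is roughly what I am picturing
(all names invented, and the column list trimmed down):

    -- option 2: per-symbol children inheriting from a master table
    CREATE TABLE quotes (
        quote_date  date,
        close       numeric,
        volume      bigint
    );
    CREATE TABLE quotes_aapl () INHERITS (quotes);
    CREATE TABLE quotes_ibm  () INHERITS (quotes);
    -- ... one child per symbol, each loaded directly with COPY

    -- option 3: one big table keyed by symbol
    CREATE TABLE quotes_all (
        symbol      text,
        quote_date  date,
        close       numeric,
        volume      bigint,
        PRIMARY KEY (symbol, quote_date)
    );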
So 1. and 2. sound simple enough, and the symbol can be the table name.
I think there are some advantages there if I want to update the data in
the future.
3. sounds like I might run into performance problems, especially if I
need to compare two symbols, because that would require a self-join.
Typically we will want to compare one symbol against another,
synchronizing records by date to do correlation analysis. This
makes me think that separate tables are probably the best way to go.
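For example, the kind of comparison query I have in mind, again with
the invented table names from the sketches above:

    -- option 3: self-join of the big table, matching rows by date
    SELECT corr(a.close, b.close)
    FROM   quotes_all a
    JOIN   quotes_all b USING (quote_date)
    WHERE  a.symbol = 'AAPL'
    AND    b.symbol = 'IBM';

    -- options 1/2: a plain join of two per-symbol tables
    SELECT corr(a.close, b.close)
    FROM   quotes_aapl a
    JOIN   quotes_ibm  b USING (quote_date);

(With the primary key on (symbol, quote_date), the self-join may well
be fine too; that is exactly the part I cannot judge.)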
My hesitation is that I have never created a database with 8,000+
tables in PostgreSQL, and I'm not sure how that will work out in and
of itself.
Anyway, sorry for the off-topic post, but if anyone has experience with
large numbers of tables, I would appreciate your thoughts. Off-list
replies would be fine to avoid additional OT noise here.
Thanks,
-Steve