[postgis-users] OT: Generic postgresql Q about LARGE number of tables

Paragon Corporation lr at pcorp.us
Sun Nov 28 09:13:23 PST 2010


> As I noted above, I am not sure if 8000+ tables itself is a problem. My 
sense is that that should actually work fine. It is 8000+ *inherited* 
tables that are a problem. This sense was concurred with on Pg list by 
some of the Pg gurus.


I'll both disagree and agree with Puneet here.  The cost of the inherited
tables approach is really in the planning.   After 500 tables in a hierarchy
trying to do a query from the top bottom becomes
unusuable as the cost of query planning becomes really high. Though some of
this has imporoved in 9.0 and 9.1.

However -- there is really not much of a difference between 8000 tables
non-inherited verses 8000 inherited if you do a self-join against two child
tables.  There is no  inheritance planning going on there.

The inheritance thing we generally stratify though given your dataset, you
may not want to stratisfy of may not stratisfy by date range.

You might actually want to split your data by symbol ranges rather than
individual symbols

So for example put all the A-Cs in one table
Ds-Fs in another 

And so on.

Then you are not talking about 8000 tables but say 10 manageable inherited
tables.  Then your check constraints would be by symbol ranges instead of
individual symbols.

So you still get a fairly speedy self-join 

E.g if you want to query MSFT you could do 

SELECT M.stock, J.stock
FROM m_symbols As M INNER JOIN o_symbols As O ON M.ticker_date =
S.ticker_date
WHERE M.stock = 'MSFT'  AND O.stock = 'ORCL'


(note that m_symbols goes staight to MSFT table, so you save a few cycles by
going straight to the table instead of having the planner figure out the
m_symbols table contains Microsoft and O contains Oracle)

Leo
http://www.postgis.us







More information about the postgis-users mailing list