[postgis-users] When to use Vacuum Analyze

Stephen Woodbridge woodbri at swoodbridge.com
Tue Feb 14 18:28:19 PST 2012


May it is best to try to understand what this command does and that will 
help you understand how to use it. First, I'm not an expert on this, so 
I fear the my description may have some of my own misconceptions in it, 
but here goes:

vacuum and analyze are actually two separate operations but they can be 
run at the same time.

Vacuum is the operations of recovering free space and compacting data 
into pages. When you do operations on a table like update and delete, 
records get rewritten and the old row version is marked as dead. If you 
want to retrieve 5 rows and they all happen to be on the same page , 
then you only need to read one page of data, but if updates and 
deletions have fragmented you data across multiple pages, then to fetch 
all 5 records you need to load multiple pages which is more costly. The 
same thing happens to indexes as you update, delete and add records to a 
table.

I have a lot of databases where a prepare data initially and then the 
data is only queried repeatedly. So after I'm done preparing it I vacuum 
analyze it once and I'm done.

Analyze typically scans some percentage of the rows of a table and 
updates the statistics about the table. This helps the query planner 
make the correct assumptions about how to plan a query's execution in 
the most efficient manner.

You can ANALYZE <table> without running VACUUM ANALYZE <table> and this 
is fast.

Hope this helps.
   -Steve

On 2/14/2012 7:38 PM, Sheara Cohen wrote:
> Hi –
>
> I’m a relatively new user to PostGIS/Postgres and was told that I should
> use Vacuum Analyze whenever a I create a new table. I have been
> discovering, however, that the time it takes to run vacuum analyze on a
> table is often **MUCH** greater than the time saved during dependent
> operations.
>
> So, this makes me wonder when it is really critical to run Vacuum
> Analyze vs. when it is not worth it. (I’m leaning towards not worth it
> the vast majority of the time.)
>
> For instance, is it important to run AFTER…
>
> ·deleting rows?
>
> ·inserting rows?
>
> ·running st_union?
>
> ·intersecting two geometries?
>
> Or perhaps BEFORE…
>
> ·intersecting two geometries
>
> ·joining two or more tables
>
> ·making row selections and column calculations
>
> Etc…
>
> Thank you!
>
> Sheara
>
> *Sheara Cohen*
> Planner
>
> *C A L T H O R P E**A S S O C I A T E S*
> 2095 ROSE STREET, SUITE 201, BERKELEY, CALIFORNIA, 94709 USA
> 510 809-1165 (direct) | 510-548-6800 x35 (main) | 510 548-6848 (fax)
> sheara at calthorpe.com <mailto:sheara at calthorpe.com> | www.calthorpe.com
>
>
>
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users




More information about the postgis-users mailing list