[postgis-users] Greetings, and road data questions

Andy Turk andy at streetlight.com
Fri Oct 18 13:35:17 PDT 2002


> I tried doing this, but it didn't work.  What's the syntax?

I knew you were going to ask that. It's been a while since I tried this,
and I don't remember the exact incantations. One of the issues that I do
remember is that I couldn't get GiST to work with a geometry column and
a character column, since GiST only does numeric types. In the contrib
directory of postgres, there are some other integer operators that you
can use. Either convert your cfcc codes to integers, or add another
column to your table that contains a hash of the character fields you
want to index on. Sorry about the vagueness... I tried this as an
experiment and since it didn't yield any significant performance gains,
I didn't bother fleshing it out.
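
For what it's worth, the rough shape of it was something like this
(untested, from memory; "roads", "the_geom" and the index name are
made up, the integer opclass comes from the contrib btree_gist module,
and hashtext() is just one way to get an integer -- any mapping will
do):

    -- integer stand-in for the cfcc code
    ALTER TABLE roads ADD COLUMN cfcc_int integer;
    UPDATE roads SET cfcc_int = hashtext(cfcc);

    -- two-column GiST index over geometry + integer
    CREATE INDEX roads_geom_cfcc_idx
        ON roads USING gist (the_geom, cfcc_int);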
 
> It's not typically hard to join them.  I have a Perl script that will
> do this, in fact, after generating a table with tlid and start and end
> points.  It takes multiple days to run, but afterwards it has produced
> a lot of new line segments.

What do you mean by "a lot of new line segments"?

> The problem is address ranges.  I sort of want to retain that as much
> as possible, since one thing I'd like to be able to do is "show me a
> map of 1234 south 4th ave in Chicago" or some such.  I suppose I could
> retain that info by using a clever "from point 3 to 4 in line 123123,
> the address range is 1200 to 1299" or something.

For geocoding, you probably want to keep the street segment table
structure. For rendering, you don't need the detailed data, just the
geometries. In fact, you might even think about Douglas-Peucker line
simplification to *reduce* the number of segments that you need to render.
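
If your PostGIS build has a simplify() function (I haven't checked
which releases do), you could pre-thin a copy of the table just for
rendering and keep the full-detail one for geocoding -- something like
this, where the table names and the 0.001 tolerance are only
placeholders:

    -- thinned copy for rendering; tolerance is in the data's units
    CREATE TABLE roads_render AS
      SELECT tlid, simplify(the_geom, 0.001) AS the_geom
        FROM roads;
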
> 
> BTW, the machine I'm using is a box I threw together for this side
> project.  It only has 512 MB of ram currently, but if this becomes
> serious it'll end up with 3.5 or so.  It has two Athlon XP 2000+ CPUs.

RAM is the key thing. By default, postgres uses a relatively small
amount of shared memory, so you're probably seeing a lot of disk
activity just to traverse your GiST index. For our dataset, I think the
index all by itself is several gigabytes. We've got a 4-way Xeon box
with much slower processors than yours (i.e., 500 MHz CPUs), but it has
4 gig of RAM, of which 1 gig is specifically allocated to the postgres
buffer cache.
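
On the versions I've run, that allocation is the shared_buffers setting
in postgresql.conf (the value is in 8 kB pages), plus sort_mem for
per-sort working memory. These numbers are only a starting point for a
512 MB box, not gospel, and you may have to raise the kernel's SHMMAX
to match:

    # postgresql.conf
    shared_buffers = 16384     # 8 kB pages, so ~128 MB; scale up with your RAM
    sort_mem = 16384           # per-sort working memory, in kB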

We do geocoding with some PL/pgSQL functions that run in the database,
and they're very snappy performance-wise.
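
As a toy sketch of the idea (not our actual code -- the column names
follow the TIGER fields fename/fraddl/toaddl, so adjust them to
whatever your loader produced, and a real version would interpolate the
point along the segment from the address range):

    CREATE OR REPLACE FUNCTION geocode_street(text, integer)
    RETURNS geometry AS '
    DECLARE
        seg RECORD;
    BEGIN
        -- first segment whose left-side address range covers the house number
        SELECT INTO seg the_geom
          FROM roads
         WHERE fename = $1
           AND $2 BETWEEN fraddl AND toaddl
         LIMIT 1;
        IF NOT FOUND THEN
            RETURN NULL;
        END IF;
        -- return the whole segment; interpolating the exact point is
        -- the next refinement
        RETURN seg.the_geom;
    END;
    ' LANGUAGE 'plpgsql';

Called as, say, SELECT geocode_street('4th', 1234); for the
"1234 south 4th ave" sort of query (ignoring the direction prefix and
street type).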




