[postgis-users] ESRI Shapes from database

Chris Hermansen chris.hermansen at timberline.ca
Tue Oct 16 14:18:59 PDT 2007


OK, I think I'm following, let's see.

Your C# code has difficulty converting the ESRI geometry into ASCII info
due to some kind of limitation in the C# string processing routines in
conjunction with large-ish MULTIPOLYGON data, so you want to shove the
ESRI shape stuff out to the PostgreSQL back end and have it reprocess
that data into PostGIS geometry.  Is that right?

So I see the "c" word in your description of your problem - "concatenate".

How big are these MULTIPOLYGONs of yours, anyway (# of points, for
instance)?

If a MULTIPOLYGON has, say, 10,000 points, then you would be doing
10,000 string concatenations, probably times 2, for every MULTIPOLYGON,
and as you got to the end of the MULTIPOLYGON you'd be throwing away
bigger and bigger strings.  This doesn't sound efficient to me.  If I
were doing something like this in Java, I'd think seriously about using
a byte[] or char[] or StringBuffer rather than a String because I
wouldn't want to be doing all that garbage collecting.

When you get the BYTEA data, you know how many points are in it. 
Therefore you know how big to build your char[] - or whatever - array;
it has to be (# of points) * (# of chars per point) + overhead.  So you
allocate that, then pass over your BYTEA data and fill in the output buffer.

You can also re-use that buffer until you hit a bigger chunk of BYTEA
data, at which time you either extend the array or get yourself a new one.

Much less garbage collection required, and that which is, is small
stuff, 20 characters at a time or so.  No 9,999 * 20 * 2 ~ 400K strings
to be garbage collecting, except the odd time you need a new, bigger,
buffer.

Lee Keel wrote:
>> -----Original Message-----
>> From: postgis-users-bounces at postgis.refractions.net [mailto:postgis-users-
>> bounces at postgis.refractions.net] On Behalf Of Chris Hermansen
>> Sent: Tuesday, October 16, 2007 3:13 PM
>> To: PostGIS Users Discussion
>> Subject: Re: [postgis-users] ESRI Shapes from database
>>  
>> So in C#, you tried building a great big string containing a set of SQL
>> statements, which you were going to hand off to some kind of C# /
>> PostgreSQL interface.  The string got too big and it broke.
>>
>> What is the nature of that C# interface?  Is it ODBC-like?  Or were you
>> just going to heave the string at psql embedded in some kind of "system"
>> call?  Pardon my ignorance of C# - PostgreSQL interfaces...
>>
>> If the C# interface is ODBC like, can you not issue a bunch of method
>> calls, one for the CREATE TABLE, one for each INSERT INTO, one for the
>> index, etc.  Then your strings should be nice and short, and you can
>> conceal - to some degree at least - the user name and password within
>> the compiled code?
>>
>> If that's not the case, what about writing the SQL out to a temporary
>> text file, or set of text files if length is an issue, and then either
>> using the "system" facility to call psql or writing a small Python
>> program that uses say psychopg to fire the data off into PostgreSQL?
>> You can obscure the user name and password by compiling the Python.
>>
>> Or you could write the whole thing in Python, which you should be able
>> to use to pull data out of the geodatabase as well as put data into
>> PostgreSQL.  Or maybe Java, JDBC allowing you the same ability to hide
>> user names and passwords within the compiled code.
>>
>>     
> [Lee Keel] 
>
> The process that I am using now is select * from table using an sqlserver
> driver on sql server side and select * from table using npgsql driver on
> postgres side.  Yes, these drivers are ODBC-like.  I then go through and
> update all columns on npgsql table with values from sql server.  When I get
> to the geometry column, I parse that binary as a byte array and determine
> the shape type, number of parts, points, etc.  The problem comes in when
> trying to get a string together with all the points in it for some of my
> multipart polygons.  The Microsoft bug that I have found is that I am using
> a stringbuilder to concatenate these points together (this is a microsoft
> object that does all the concatenation work for you and is suppose to be the
> most efficient in .NET) and there is a limitation to the size this object
> can get.  The problem is that garbage collection doesn't occur so it tries
> to allocate double memory next time it comes into the function.  I have
> added a garbage collection call and seeing if that will get me around that
> problem, but in either case the driver to push the data into the system is
> just not performing as well as I had hoped, so that is why I went to the
> postgres function and tried to do the math in the function for diffing the
> shapes.  Which I could do everything except the one bytea to float method.
> Which I have looked at the C code from shp2pgsql and it is doing a memcpy to
> push the 8bytes from the byte array to the float.  I would be willing to do
> this part in C or Perl or Anything else if I just knew how to get these
> functions added so a pgsql function could call it.
>
> Hope that all makes sense...
> LK
> This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the sender. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail.
> _______________________________________________
> postgis-users mailing list
> postgis-users at postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users
>   


-- 
Regards,

Chris Hermansen · mailto:clh at timberline.ca
tel:+1.604.714.2878 · fax:+1.604.733.0631
Timberline Natural Resource Group · http://www.timberline.ca
401 · 958 West 8th Avenue · Vancouver BC · Canada · V5Z 1E5

C'est ma façon de parler.




More information about the postgis-users mailing list