[OSGeo-Discuss] Re: "git" like for geodata management

Kalyan Janakiraman Kalyan.Janakiraman at lpma.nsw.gov.au
Mon Sep 27 03:13:39 EDT 2010


Hi

Versions are also used to demarcate the geospatial transaction boundary. I didn’t see this point articulated.

We have been successfully replicating an ArcSDE geodatabase from the data maintenance environment to about 150 different geodatabase repositories for many years now through an event-driven mediation framework. Because we used the event-driven mediation approach, we could replicate irrespective of the version or the vendor.

Because ESRI didn’t support robust replication before, we did this ourselves. In short, the version is the boundary of each geospatial transaction. When a version is posted, the transactions in it are picked up and shipped across as XML event feeds.
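
To make the posting step concrete, here is a minimal Python sketch of turning the edits in one posted version into a single XML event. The element names (versionPosted, edit, attr), the function name and the shape of the edit tuples are all assumptions for illustration; they are not the format used by the framework described above.

```python
import xml.etree.ElementTree as ET

def version_to_event_xml(version_name, edits):
    """Serialize the edits of one posted version as a single XML event.

    `edits` is a list of (operation, feature_class, feature_id, attributes)
    tuples; this shape is an assumption made for illustration.
    """
    event = ET.Element("versionPosted", {"version": version_name})
    for op, feature_class, feature_id, attrs in edits:
        edit = ET.SubElement(event, "edit", {
            "op": op,
            "featureClass": feature_class,
            "id": str(feature_id),
        })
        # One <attr> child per attribute, in a stable order.
        for name, value in sorted(attrs.items()):
            ET.SubElement(edit, "attr", {"name": name}).text = str(value)
    return ET.tostring(event, encoding="unicode")

# Hypothetical edits made inside one version before it was posted.
edits = [
    ("insert", "parcels", 101, {"owner": "Smith", "wkt": "POINT (1 2)"}),
    ("delete", "parcels", 57, {}),
]
xml_event = version_to_event_xml("edit_session_42", edits)
```

Because the whole version travels as one event, a subscriber can apply it as a single unit or not at all, which is what treating the version as the transaction boundary implies.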


I published this as a paper. I can send it to anyone interested.


-       Kalyan

From: discuss-bounces at lists.osgeo.org [mailto:discuss-bounces at lists.osgeo.org] On Behalf Of Ragi Burhum
Sent: Friday, 24 September 2010 4:06 AM
To: discuss at lists.osgeo.org; Noli Sicad
Subject: [OSGeo-Discuss] Re: "git" like for geodata management

Hi Noli,

Thanks for the link. That is definitely a step in the right direction, but it is hardly comparable to git, or to ArcSDE versioning for that matter.

The article and sample code you describe above generate hashes for all rows and tables in the db and compare them to the target db. So 1 million rows in a db, regardless of whether the two dbs are identical, would cause 1 million hashes to go over the wire. Every single time you ask to sync you pay the price.
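
A rough Python sketch of that hash-comparison approach, to show where the cost comes from. This is not psync's actual code; the function names and the in-memory dicts standing in for tables are assumptions.

```python
import hashlib

def row_hashes(rows):
    """Map primary key -> SHA-1 digest of the row's values."""
    return {
        key: hashlib.sha1(repr(values).encode("utf-8")).hexdigest()
        for key, values in rows.items()
    }

def diff_by_hash(source_rows, target_rows):
    """Return (keys to copy, keys to delete, number of source hashes compared)."""
    src, dst = row_hashes(source_rows), row_hashes(target_rows)
    to_copy = sorted(k for k in src if dst.get(k) != src[k])
    to_delete = sorted(k for k in dst if k not in src)
    # Every source hash has to be sent for comparison, changed or not.
    return to_copy, to_delete, len(src)

source = {1: ("road", "LINESTRING (0 0, 1 1)"), 2: ("lake", "POINT (5 5)")}
target = dict(source)  # identical copy of the source table
to_copy, to_delete, hashes_sent = diff_by_hash(source, target)
```

Even though nothing differs here, one hash per source row is still computed and compared, so the work grows with table size rather than with the number of edits.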

Git and ArcSDE keep track of changesets, and when it is time to synchronize, they exchange that changeset and apply it. One insert? That is all that needs to be sent.
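
As a contrast, a toy Python sketch of changeset-based sync. The Replica class and its methods are hypothetical and only illustrate the idea of exchanging a log of edits; they are not how git or ArcSDE are implemented.

```python
class Replica:
    """A toy table that records every local edit in an append-only log."""

    def __init__(self):
        self.rows = {}
        self.log = []  # list of (op, key, values)

    def insert(self, key, values):
        self.rows[key] = values
        self.log.append(("insert", key, values))

    def delete(self, key):
        del self.rows[key]
        self.log.append(("delete", key, None))

    def changes_since(self, position):
        """Return the edits recorded after `position` in the log."""
        return self.log[position:]

    def apply(self, changeset):
        """Replay a changeset received from another replica."""
        for op, key, values in changeset:
            if op == "insert":
                self.rows[key] = values
            elif op == "delete":
                self.rows.pop(key, None)

source, target = Replica(), Replica()
source.insert(1, "road")
target.apply(source.changes_since(0))
synced_to = len(source.log)       # remember how far the target has caught up

source.insert(2, "lake")          # one new edit after the last sync
changeset = source.changes_since(synced_to)
target.apply(changeset)
```

Here the second sync ships exactly one entry, however many rows the table holds.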

Another issue is that there is nothing about conflict resolution there (what happens when you delete a row in one db and modify it in another?). There is also the problem of allowing multiple versions of the data in the same db (like having multiple heads).
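
The delete-versus-modify case can be detected with a three-way comparison against a common ancestor. A minimal Python sketch, assuming each db can be read as a {key: value} snapshot; find_conflicts is a hypothetical helper, and it only detects conflicts, it does not resolve them.

```python
def find_conflicts(base, ours, theirs):
    """Three-way comparison of {key: value} snapshots.

    A key conflicts when both sides changed it relative to `base`
    and ended up with different results. A missing key is read as
    None, so this assumes None is never a stored value.
    """
    conflicts = {}
    for key in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o != b and t != b and o != t:
            conflicts[key] = {"base": b, "ours": o, "theirs": t}
    return conflicts

base = {7: "Main St", 8: "Oak Ave"}
ours = {8: "Oak Ave"}                      # row 7 deleted in this db
theirs = {7: "Main Street", 8: "Oak Ave"}  # row 7 modified in the other db
conflicts = find_conflicts(base, ours, theirs)
```

Row 7 is reported as a conflict and row 8 is not. Deciding which side wins is a policy question that still has to be answered by a person or a rule.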

Regardless, thank you for the link,

- Ragi


Date: Thu, 23 Sep 2010 13:22:17 +1000
From: Noli Sicad <nsicad at gmail.com>
Subject: Re: [OSGeo-Discuss] Re: "git" like for geodata management
To: OSGeo Discussions <discuss at lists.osgeo.org>
Message-ID:
            <AANLkTi=3anC4BAANd4HK9UUZFsasXn-8ybPNKYoNG+Fw at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

PostgreSQL Synchronization Tool  --- psync [1]

" The article introduces a method of synchronizing two PostgreSQL
databases. Although, this seems to be an easy task, no product (slony,
londiste, ...) really satisfied the needs within the maps.bremen.de
project. Either they have special prerequisites that didn't apply for
our problem or they didn't support synchronizing of large objects.

Large objects are used to store tiles of a street/aerial map within
PostgreSQL. My GIS-server queries the database and gets the tiles out.
By using this construction we are getting a flexible infrastructure
for updating and maintaining different versions of the maps.

Everything was working fine until the service needs to be spread over
three servers. How can we easily synchronize the databases? I really
found no really working solution that is clean and easy to use.  "

[1] http://www.codeproject.com/KB/database/psync.aspx


Noli

On 9/23/10, Ragi Burhum <ragi at burhum.com> wrote:

Are you looking for an alternative to (1) ESRI's versioning, (2) ESRI's
disconnected editing, or (3) a git-like mix of both? The scenario that you
described first was more like (2), but this one fits (1).

I would love to see something like (3), but the truth of the matter is that,
AFAIK, there is nothing like that implemented for geo (yet).

On Sep 22, 2010, at 9:00 AM, discuss-request at lists.osgeo.org wrote:

On Wed, 2010-09-22 at 12:10 +0800, maning sambale wrote:
Any real world cases for this?

Imagine the following scenario:

* 50 ~ 70 digitizers
* 5 QA
* 1 Manager

Each QA has 10 digitizers assigned. After all the data is validated, the
manager merges it and generates the geodb.

All users work against the same DB, most of them linked. This causes
disconnections, duplicated data, and lots of random errors.

Also, they can't be forced to work on different DBs because they are
all working on the same project, at the same time.

This is the real scenario of GISWorking (http://www.gisworking.com/), a
company we are working with.

It would be perfect to have smaller groups (ideally 1 person), working
against separated databases, but that can be synchronized with the rest
of the data when needed.

Then each QA merges data from the people he supervises. After it's
validated the manager merges the complete dataset, and generates the
final "product".
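
A small Python sketch of that two-level merge (digitizers to QA, QA to manager). The dict-per-database representation and the merge and validate helpers are assumptions for illustration only; a real QA check would be far richer.

```python
def merge(datasets):
    """Union several {feature_id: feature} dicts, rejecting duplicate ids."""
    merged = {}
    for data in datasets:
        overlap = merged.keys() & data.keys()
        if overlap:
            raise ValueError(f"duplicate feature ids: {sorted(overlap)}")
        merged.update(data)
    return merged

def validate(features):
    """Placeholder QA check: every feature must have a non-empty geometry."""
    return all(f.get("wkt") for f in features.values())

# Two QA groups, each with two digitizers working in separate databases.
qa_groups = [
    [{"a1": {"wkt": "POINT (0 0)"}}, {"a2": {"wkt": "POINT (1 1)"}}],
    [{"b1": {"wkt": "POINT (2 2)"}}, {"b2": {"wkt": "POINT (3 3)"}}],
]

validated = []
for group in qa_groups:
    qa_result = merge(group)            # each QA merges their own digitizers
    if not validate(qa_result):
        raise ValueError("QA validation failed")
    validated.append(qa_result)

product = merge(validated)              # the manager merges the QA results
```

Duplicate ids are rejected here rather than silently overwritten, since duplicated data is one of the problems described above.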

I don't know if this is the exact same case, but we are working on it
with a similar approach.
