[OSGeo-Discuss] Raster data on a DBMS

Chris Puttick chris.puttick at thehumanjourney.net
Tue Nov 4 07:30:05 PST 2008


It is not necessary to store the image file itself in the database to get concurrency control, data protection, integrity and management features. There are a number of good document management systems (Alfresco, KnowledgeTree) that offer all of the above for files, and the Zimbra collaboration system uses a database for email for much the same reason. None of these actually stores the files in the database; the database provides all the controls, and the files can be reached only through an interface that consults the database and the additional functionality it provides.
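
To make that arrangement concrete, here is a minimal sketch in Python. It is not how Alfresco, KnowledgeTree or Zimbra are implemented; the class, table and column names (RasterCatalog, raster, sha256, version) are invented for illustration, and SQLite stands in for whatever DBMS you actually run. The image stays on disk, the database holds only the path, a checksum and a version counter, and all reads and writes go through the catalog.

```python
import hashlib
import os
import sqlite3
from pathlib import Path


class StaleVersion(Exception):
    """Another writer updated the image after we read it."""


class CorruptFile(Exception):
    """The file on disk no longer matches the checksum in the catalog."""


class RasterCatalog:
    """Hypothetical catalog: control data in the database, pixels on disk."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS raster ("
            " id INTEGER PRIMARY KEY,"
            " path TEXT NOT NULL UNIQUE,"
            " sha256 TEXT NOT NULL,"
            " version INTEGER NOT NULL DEFAULT 1)"
        )

    def register(self, path):
        """Record an existing file and return its catalog id."""
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        with self.db:  # commits on success, rolls back on error
            cur = self.db.execute(
                "INSERT INTO raster (path, sha256) VALUES (?, ?)",
                (str(path), digest),
            )
        return cur.lastrowid

    def read(self, raster_id):
        """Return (bytes, version); fail if the file was changed behind our back."""
        row = self.db.execute(
            "SELECT path, sha256, version FROM raster WHERE id = ?",
            (raster_id,),
        ).fetchone()
        if row is None:
            raise KeyError(raster_id)
        path, digest, version = row
        data = Path(path).read_bytes()  # FileNotFoundError if it was deleted
        if hashlib.sha256(data).hexdigest() != digest:
            raise CorruptFile(path)
        return data, version

    def update(self, raster_id, data, expected_version):
        """Replace the file only if nobody else has updated it since we read it."""
        digest = hashlib.sha256(data).hexdigest()
        with self.db:
            cur = self.db.execute(
                "UPDATE raster SET sha256 = ?, version = version + 1"
                " WHERE id = ? AND version = ?",
                (digest, raster_id, expected_version),
            )
            if cur.rowcount != 1:
                raise StaleVersion(raster_id)  # the transaction rolls back
            path = self.db.execute(
                "SELECT path FROM raster WHERE id = ?", (raster_id,)
            ).fetchone()[0]
            tmp = path + ".tmp"
            Path(tmp).write_bytes(data)
            os.replace(tmp, path)  # atomic swap on the same filesystem
        return expected_version + 1
```

Two users who both read version 1 cannot both write: the second update() raises StaleVersion rather than silently overwriting the first. The sketch only helps if nothing touches the files except through the catalog, and it does not cover a crash between the file swap and the database commit; a real system would need to handle both.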

OTOH Microsoft put all their Exchange emails into the database, and anyone who has ever managed an Exchange installation of any size can tell you just how many problems that can cause you...

Chris

----- "Gilberto Camara" <gilberto.camara at inpe.br> wrote:
> Dear OSGEO
> 
> Jim Gray's paper and much more on
> this issue is on his site at MS Research.
> 
> Storing images in a database gives many
> more benefits than simple retrieval of
> metadata. Databases offer concurrency control,
> data protection, integrity and management features
> that simple file systems are lacking.
> 
> If you have hundreds of images scattered around
> as files, you lack data management. Your metadata
> may point to a file that could have been deleted.
> In a multi-user environment, file systems do not
> prevent different users from updating the same
> image. The result may be inconsistent data.
> 
> Allow me to reiterate my earlier argument, which is
> that FOSS4G should **allow** users the option of storing
> raster data in a database. Storing images in a database
> is not recommended in each and every situation.
> The user should have the option, according to his needs.
> 
> The current debate on whether images should be stored
> on an RDBMS reminds me of a similar debate during the
> early 90s, concerning whether vector data should be
> stored in an RDBMS. Remember the days of ARC-INFO?
> 
> In the mid-90s, our team at INPE tried to use the
> Postgres-95 RDBMS to store vector data. The result
> was a system with very slow performance.
> The concept was right, but the implementation was
> lacking. It was only when PostgreSQL and PostGIS
> came of age that we could develop a multi-user
> spatial database with good performance.
> 
> By the same argument, these are the early days of
> storing raster data in an RDBMS. There are missing
> features in the database and the performance may
> be slower than file systems. But the concept
> is fundamentally correct. I predict that five
> years hence this debate will be settled and we
> will look at it as a relic of the past.
> 
> Best Regards
> Gilberto
> 
> Christopher Schmidt said:
> > I don't see anything in that paragraph that indicates that storing the
> > *image data* in the database is important. (A link to the paper online
> > or something could change that, of course.) Specifically, I don't think
> > there's any doubt that if you have many-many files, the *queryable
> > image information* -- things like spatial extent, temporal extent,
> > etc. -- belongs in a database. The question is, in the "data" column,
> > do you store a File Path, or the Image Data? Until/Unless databases
> > get/have image manipulation tools directly, I can't see the value of
> > storing the image data itself in the database.
> > 
> > The points above argue against file-system based metadata
> > storage/retrieval: sorting files by date, searching through index files,
> > etc., so far as I can tell, but I don't see a compelling argument for
> > image data in the database above.
> > 
> > Of course, this is assuming that the image data access pattern is the
> > same "in the database" and "on disk": for example, storing GeoTIFF data,
> > then using GDAL to parse the string from the database as a GeoTIFF file.
> > If the database you're using has a different (faster) Image access
> > algorithm, then of course there can be benefits. However, those same
> > benefits could presumably be realized with sufficiently complete
> > libraries for accessing the image externally: If Oracle's Database
> > product, for example, internally tiles the image, and they had a library
> > to access the image in the same way, presumably you could store those
> > bits on disk as well. However, if that library depends internally on a
> > database, then integration of all points into the same database might
> > help in some ways.
> > 
> > In any case, I think there are obvious reasons to store your image
> > metadata in a database -- and *using the same tools for accessing the
> > images*, I don't think we've yet seen a compelling argument for storing
> > image blobs in the database. Of course, all things are not equal  :)
> >
> > If your database has built in MrSID support, for example, you could
> > imagine using Database Storage for Images, because you'd get the
> > automatic compression combined with the querying -- but that's not about
> > the Database Specifically, just the image storage/reading library that
> > comes along with it.
> > 
> > Regards,
> > -- Christopher Schmidt Web Developer
> 
> 
> -- 
> ===========================================
> Dr.Gilberto Camara
> Director General
> National Institute for Space Research (INPE)
> Sao Jose dos Campos, Brazil
> 
> voice: +55-12-3945-6035
> fax:   +55-12-3921-6455
> web:   http://www.dpi.inpe.br/gilberto
> blog:  http://techne-episteme.blogspot.com/
> ============================================
> 