[gdal-dev] Follow on to the "ISO Metadata" post

Damian Dixon damian.dixon at gmail.com
Mon Oct 26 12:04:38 PDT 2015


My thoughts on an XML encapsulation of metadata would be (I'll leave the
exact layout and details to the experts):


<product>
   <name>name of product</name>
   <data>
     <format>vpf/shape etc...</format>
     <item key="k1" value="val1" />
     <item key="k2" value="val2" />
     <item key="kN" value="valN" />
   </data>
</product>
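
A minimal sketch of how a layout like the one above could be produced from a
plain dictionary of key/value pairs, using only Python's standard library. The
function name build_product_xml and the sample values are hypothetical; the
element and attribute names are taken from the layout above, which is itself
only a suggestion.

```python
import xml.etree.ElementTree as ET


def build_product_xml(name, data_format, items):
    """Serialise a product name, carrier format and key/value pairs."""
    product = ET.Element("product")
    ET.SubElement(product, "name").text = name
    data = ET.SubElement(product, "data")
    ET.SubElement(data, "format").text = data_format
    for key, value in items.items():
        # Every pair becomes one <item>, so nothing passed in is dropped.
        ET.SubElement(data, "item", {"key": key, "value": str(value)})
    return product


# Hypothetical sample values, mirroring the placeholders in the layout.
product = build_product_xml(
    "name of product", "shape", {"k1": "val1", "k2": "val2"}
)
print(ET.tostring(product, encoding="unicode"))
```

The dictionary could be filled from whatever the reader of the carrier format
exposes; that step is left out here because it depends on the format.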


Problems I can see with this:


   - Should the data wrap the product?
   - How do you encapsulate XML metadata?
   - What information should be captured?

*Product*

GDAL/OGR reads data but does not identify the product.


A product tends to use a carrier format such as shape, S57, VPF, GML,
etc... If you know what the product is you can derive additional
information that can be very useful in automatic styling or handling of the
data.


*Encapsulating XML metadata*

Some data formats may contain a mixture of binary and XML. Take for example
JPEG, which can contain both binary information and XML data such as
geospatial information.


I see no point translating native XML metadata to a different XML format.
You risk losing information.
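
One way to carry native XML metadata without translating it, sketched with
Python's standard library. The <native> element, its source attribute and the
function name embed_native_xml are assumptions made for illustration; they are
not part of any existing schema. The native document is stored as escaped text
so that it comes back character for character. Embedding it as child elements
would make it easier to query, but a parser may then rewrite namespace prefixes
or drop comments.

```python
import xml.etree.ElementTree as ET


def embed_native_xml(product, native_xml, source):
    """Attach native XML metadata to <product> without remapping any field."""
    wrapper = ET.SubElement(product, "native", {"source": source})
    # Stored as text: ElementTree escapes it on output and restores it on input.
    wrapper.text = native_xml
    return wrapper


def extract_native_xml(product):
    """Return the native XML string exactly as it was embedded."""
    return product.find("native").text


# Hypothetical native metadata fragment.
native = "<meta><!-- kept --><title>Roads &amp; rivers</title></meta>"
product = ET.Element("product")
embed_native_xml(product, native, "example")

serialised = ET.tostring(product, encoding="unicode")
restored = extract_native_xml(ET.fromstring(serialised))
print(restored == native)
```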


*What information should be captured?*

Some of the information can be derived from the data or the way the data is
stored on the media.


Information that can be derived from the data can be just as important as
the metadata stored in the data.


This in part refers to identifying the product, scale of the data, intended
use, provenance, use restrictions, modification dates, creation dates,
expiry dates, who created the data, etc...


The information may not be in the data itself but alongside the data in
additional files.


Derived metadata should be generated at the point that the data is read.
That sounds odd, but consider that metadata is used in the process of
cataloguing data so that you can find the data you need for your GIS
application.
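
A rough sketch of deriving a few items at read time from the way a file is
stored, rather than from its contents. The function name derive_items and the
"derived." key prefix are hypothetical, and treating files that share the same
stem as companion files is only an assumption that suits shapefile-style
layouts.

```python
import os
from datetime import datetime, timezone


def derive_items(path):
    """Collect key/value pairs derivable from how a file is stored on disk."""
    folder, filename = os.path.split(os.path.abspath(path))
    stem = os.path.splitext(filename)[0]
    info = os.stat(path)
    # Files sharing the stem (e.g. .prj or .dbf beside a .shp) may hold
    # information that sits alongside the data rather than inside it.
    siblings = sorted(
        f for f in os.listdir(folder)
        if f != filename and os.path.splitext(f)[0] == stem
    )
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "derived.file_name": filename,
        "derived.size_bytes": str(info.st_size),
        "derived.modified_utc": modified.isoformat(),
        "derived.sidecar_files": ",".join(siblings),
    }
```

The returned dictionary has the same key/value shape as the items in the XML
layout, so derived and stored metadata can be catalogued together.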


*Key/Value pairs*

The keys are unique to the data and potentially to the product.


The aim should be to not lose information that is read from the data.




On 26 October 2015 at 12:59, Tim Crook <tim.crook at sympatico.ca> wrote:

> Yes, it had occurred to me that XSLT would be a flexible way of handling a
> lot of the metadata mappings.
>
> *From:* Damian Dixon <damian.dixon at gmail.com>
> *Sent:* Monday, October 26, 2015 8:36 AM
> *To:* Tim Crook <tim.crook at sympatico.ca>
> *Cc:* doug_newcomb at fws.gov ; gdal dev <gdal-dev at lists.osgeo.org>
> *Subject:* Re: Follow on to the "ISO Metadata" post
>
> Hi Tim,
>
> Personally I would not use ISO 19115-1 as an internal format.
>
> There are not a huge number of data formats/products that store metadata
> as XML out of the box. When they do store metadata it is usually specific
> to the data and data product (regardless of how the metadata is stored).
>
> There have been attempts at adding metadata alongside data products such
> as the UK MOD profile of ISO 19115 (the MOD profile has problems). The French
> equivalent of the MOD have for a number of years mandated a metadata format
> alongside all data products used by them (wish I could find the actual
> standard for the metadata).
>
> The biggest problem is actually mapping from data/'data product' metadata
> to the target metadata specification.
>
> Just to highlight how much of a problem the mapping of fields from one
> metadata format to another is: we have been arguing off and on for more
> than a year internally about the meaning of dates and which date should be
> in which field. Two of our big customers do not agree on the meaning of
> some of the source data date fields and the mappings we have done.
>
> I believe ESRI have their own internal metadata format that they provide a
> tool to translate to other XML metadata specifications.
>
> Where I work I have been pushing a per data/'data product' format that is
> XML based that uses tag value pairs. The tags would basically be a dump of
> all available information and specific to each data/'data product'. A set
> of XSLT scripts would then translate the information to whatever metadata
> standard you wanted to use and if you needed to modify the mapping you
> could change the XSLT script for that data/'data product'.
>
> We have found that hard-coding the mapping is too costly to maintain and
> very difficult to get right.
>
> Probably not the answer you are looking for.
>
> Regards
> Damian
>
>
> On 22 October 2015 at 13:29, Tim Crook <tim.crook at sympatico.ca> wrote:
>
>> Hello Doug and Damian.
>>
>> I saw your post about ISO 19103, ISO 19115 and  ISO 19115-1. I am
>> starting to look at ticket #3549 (https://trac.osgeo.org/gdal/ticket/3549).
>> This ticket is a specific problem for metadata translation for image
>> transformations to the PCIDSK format. The ticket references JPEG and TIFF.
>>
>> The first thing I thought of when I saw your posts was mapping the
>> XML metadata from different sources into an internal format to GDAL, then
>> passing through the information for mapping to the destination format. I
>> suppose there are some image source formats that don't use XML to store
>> their metadata, so this would require additional handling.
>>
>> I suppose the internal format to GDAL could be XML in the ISO 19115-1
>> format.
>>
>> Am I completely off base here?
>>
>
>