[gdal-dev] Decoding feature blobs and tracking FIDs in MVT datasets

Linda Kladivová L.Kladivova at seznam.cz
Tue Dec 23 05:00:09 PST 2025


Hi Even,




I was a bit confused about FIDs in the MVT driver, so thanks a lot for your 
earlier explanations. (Sorry I didn’t reply directly to your email — I can 
see it in the GDAL archives but not in my inbox, not sure what happened.)




When implementing CreateFeature and DeleteFeature, I initially assumed that 
the idx identifier in the temporary SQLite table (created in ICreateFeature)
could be relied on as a kind of “global FID”. So I implemented DeleteFeature
with idx parameter as input. :-) I now understand that this assumption was 
wrong, and that there are actually three distinct identifiers involved:

   1. idx/nSerial (ID in the temporary SQLite db)
      This identifier is created in ICreateFeature and simply represents the
      order in which features are written into the temporary table. It has no
      direct relationship to the FIDs exposed later through the GDAL API.
   2. localFID (feature ID in tile)
      This ID exists only when reading a specific tile. It corresponds to
      poUnderlyingFeature->GetFID() and uniquely identifies a feature fragment
      within a single tile.
   3. dataset-level (global) FID
      This is the FID returned by GetFID() when reading an MVT dataset. It is
      computed from a tile-specific FIDBase combined with the localFID, as you
      described in your email.
   

In addition, I implemented the small extension you suggested: when the
OGR_MVT_ADD_TILE_FIELDS option is enabled, I can now retrieve the tile
coordinates (z, x, y) for each feature while reading the dataset. I’ve
pushed it as a PR, https://github.com/OSGeo/gdal/pull/13596, since it may be
useful when reading the dataset anyway. Could you please take a look at it?




With this in place, I am currently able to build a table like the following 
when reading an MVT dataset (example shown for the first three parcel 
features):

globalFID   fidBase   z   x   y   parcel_id      localFID
5926        5926      7   46  38  76343521010    0
22310       5926      7   46  38  73527254010    1
38694       5926      7   46  38  73527253010    2

So now I know which globalFIDs correspond to each parcel, including its 
fidBase and the tile coordinates (z, x, y). Nice :-) 
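
For reference, the three sample rows above are consistent with a simple bit layout: fidBase = (x << z) | y and globalFID = fidBase + (localFID << (2 * z)). The sketch below only reproduces that arithmetic; the formula is inferred from the sample rows rather than quoted from the driver source, and the names TileFid, FidBase, GlobalFid and SplitGlobalFid are hypothetical helpers, not GDAL API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: one feature fragment addressed by tile and local FID.
struct TileFid
{
    int z;
    int x;
    int y;
    std::int64_t localFID;
};

// Assumed layout (inferred from the sample table): x in the high bits, y in the low z bits.
std::int64_t FidBase(int z, int x, int y)
{
    assert(z >= 0 && z < 31);
    return (static_cast<std::int64_t>(x) << z) | y;
}

// Assumed layout: the local FID sits above the 2*z bits used by fidBase.
std::int64_t GlobalFid(const TileFid& t)
{
    return FidBase(t.z, t.x, t.y) + (t.localFID << (2 * t.z));
}

// Inverse operation. The zoom level must be known, e.g. from the tile fields.
TileFid SplitGlobalFid(std::int64_t globalFID, int z)
{
    assert(z >= 0 && z < 31);
    const std::int64_t baseMask = (std::int64_t{1} << (2 * z)) - 1;
    const std::int64_t base = globalFID & baseMask;
    const std::int64_t yMask = (std::int64_t{1} << z) - 1;
    return TileFid{z, static_cast<int>(base >> z),
                   static_cast<int>(base & yMask), globalFID >> (2 * z)};
}
```

With z = 7, x = 46, y = 38 this gives fidBase 5926, and localFID 0, 1, 2 map to 5926, 22310 and 38694, matching the table. Note that the split only works when z is known, which is what the tile fields provide.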

And internally I also have the full update pipeline implemented - when the 
dataset is opened in GDAL_OF_UPDATE mode, an instance of OGRMVTWriterDataset
is created, which allows me to call DeleteFeature (D) and CreateFeature (I),
or a combination of both when an existing feature needs to be updated (U). 
During these operations, I collect the affected tiles, and at the end of the
update process the pipeline calls the UpdateOutput method to regenerate only
the .pbf files or MBTiles rows that were affected.




So the remaining challenge is that I don’t see a clear way to modify the 
driver so that a feature can be reliably deleted from the intermediate 
SQLite database. I initially considered whether it might be possible to 
store the globalFID (or at least the localFID) back into the temporary 
SQLite table during the CreateOutput step, after encoding each individual 
tile. However, at that stage the driver is not in a dataset-reading context,
and I do not see a reliable way to access or derive the localFID during 
writing (perhaps such a mechanism could be implemented, but even if it 
exists, it does not seem like an ideal approach).




Given this, I started thinking about an alternative solution based on a
stable, domain-level identifier such as domain_id (in practice, e.g.,
parcel_id), which is already known at write time. This would require storing
such an identifier explicitly in the intermediate SQLite table (in addition
to the internal idx, geometry, and tile coordinates). DeleteFeature could
then be implemented by removing all rows associated with that domain-level
identifier (that would be an input parameter).
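
To make the idea concrete, here is a minimal in-memory sketch of what I mean. It is not driver code: TempRow, TileKey and DeleteByDomainId are hypothetical names, and a std::vector stands in for the intermediate SQLite table. The point is only that one delete by domain_id removes every fragment of the feature and, at the same time, yields the set of tiles that would need to be regenerated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for one row of the intermediate table.
struct TempRow
{
    std::int64_t idx;      // write-order serial, as today
    std::string domainId;  // proposed extra column, e.g. parcel_id
    int z;
    int x;
    int y;
    std::string blob;      // encoded feature fragment
};

using TileKey = std::tuple<int, int, int>;  // (z, x, y)

// Removes all rows of one domain-level feature and returns the affected tiles.
std::set<TileKey> DeleteByDomainId(std::vector<TempRow>& rows,
                                   const std::string& domainId)
{
    assert(!domainId.empty());
    std::set<TileKey> affected;
    auto newEnd = std::remove_if(
        rows.begin(), rows.end(),
        [&](const TempRow& r)
        {
            if (r.domainId != domainId)
                return false;
            affected.emplace(r.z, r.x, r.y);  // remember the tile for later regeneration
            return true;
        });
    rows.erase(newEnd, rows.end());
    return affected;
}
```

In the real table this would presumably be a single DELETE filtered on the new column, with the affected (z, x, y) values collected beforehand, but I have not checked how that fits the existing schema.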




What do you think, Even? Does this approach make architectural sense to you?




Thanks again for your help, and I hope you have a great Christmas!
Linda Karlovská




---------- Original e-mail ----------
From: Linda Kladivová via gdal-dev <gdal-dev at lists.osgeo.org>
To: gdal-dev at lists.osgeo.org
Date: 23. 11. 2025 13:07:27
Subject: [gdal-dev] Decoding feature blobs and tracking FIDs in MVT datasets
"
Hi,




I’d like to ask about one specific aspect of the MVT driver. In my workflow,
I need to track the FIDs of features inside an MVT dataset together with 
their corresponding internal codes (actual parcel IDs).




My use case involves an MVT dataset of about 20 million features that is 
frequently updated. For this purpose, I have implemented an extension of the
MVT driver that can open an MVT dataset in update mode and create/delete 
features based on changes detected by a preceding ETL process. Updates 
typically affect only a few individual features every hour.




After the initial loading of the full dataset using ogr2ogr I need a way to 
iterate through the dataset and build a separate table mapping each internal
FID to the parcel’s real ID. I have tried opening and inspecting the temp.db
file (which I do not delete):




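// Needs "gdal_priv.h", "ogrsf_frmts.h" and <iostream>.
GDALDataset* poTempDb = GDALDataset::FromHandle(
    GDALOpenEx(dstTempDb.c_str(), GDAL_OF_VECTOR, nullptr, nullptr, nullptr));
OGRLayer* poTempLayer =
    poTempDb ? poTempDb->GetLayerByName("temp") : nullptr;

if (poTempLayer != nullptr)
{
    poTempLayer->ResetReading();
    // Look the blob column up once instead of once per feature.
    const int idx = poTempLayer->GetLayerDefn()->GetFieldIndex("feature");

    OGRFeature* poFeat = nullptr;
    while (idx >= 0 && (poFeat = poTempLayer->GetNextFeature()) != nullptr)
    {
        int blobSize = 0;
        const GByte* pBlob = poFeat->GetFieldAsBinary(idx, &blobSize);
        (void)pBlob;  // still undecoded; see the question below
        std::cout << "FID=" << poFeat->GetFID()
                  << ", blob size=" << blobSize << std::endl;
        // GetNextFeature() hands ownership to the caller.
        OGRFeature::DestroyFeature(poFeat);
    }
}
if (poTempDb != nullptr)
    GDALClose(poTempDb);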



I can successfully access the feature blob, but I need to decode it in order
to extract my custom parcel ID attribute and store those values externally. 
This would allow me to keep track of which parcel corresponds to which 
internal FID.


This is a one-time (potentially slow) process. Afterward, during incremental
updates, I will be calling CreateFeature and DeleteFeature (and updating my 
external mapping table accordingly). For edits, I currently use a 
combination of DeleteFeature and CreateFeature (I haven’t implemented 
SetFeature yet).




My question is:
Is there a way to decode this blob using existing GDAL/MVT functionality, or
would this require implementing a new function?

Thank you very much for your help.




Linda Karlovská
"