[gdal-dev] Decoding feature blobs and tracking FIDs in MVT datasets
Linda Kladivová
L.Kladivova at seznam.cz
Tue Dec 23 05:00:09 PST 2025
Hi Even,
I was a bit confused about FIDs in the MVT driver, so thanks a lot for your
earlier explanations. (Sorry I didn’t reply directly to your email — I can
see it in the GDAL archives but not in my inbox, not sure what happened.)
When implementing CreateFeature and DeleteFeature, I initially assumed that
the idx identifier in the temporary SQLite table (created in ICreateFeature)
could be relied on as a kind of "global FID", so I implemented DeleteFeature
with the idx parameter as input. :-) I now understand that this assumption
was wrong and that there are actually three distinct identifiers involved:
1. idx/nSerial (row ID in the temporary SQLite database)
   This identifier is created in ICreateFeature and simply represents the
   order in which features are written into the temporary table; it has no
   direct relationship to the FIDs exposed later through the GDAL API.
2. localFID (feature ID within a tile)
   This ID exists only when reading a specific tile. It corresponds to
   poUnderlyingFeature->GetFID() and uniquely identifies a feature fragment
   within a single tile.
3. dataset-level (global) FID
   This is the FID returned by GetFID() when reading an MVT dataset. It is
   computed from a tile-specific FIDBase combined with the localFID, as you
   described in your email.
In addition, I implemented the small extension you suggested: when the
OGR_MVT_ADD_TILE_FIELDS option is enabled, I can now retrieve the tile
coordinates (z, x, y) for each feature while reading the dataset. I have
pushed it as a PR (https://github.com/OSGeo/gdal/pull/13596), as it seems
like a useful improvement for reading MVT datasets in general. Could you
please take a look at it?
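For reference, my usage looks roughly like the sketch below. It is only a
sketch: I assume the extension is driven by a configuration option (the
"OGR_" prefix suggests that), and the field names mvt_tile_z/x/y are
placeholders for whatever names the PR actually adds; the file and layer
names are illustrative too.

// Sketch only: enable the extension and read the per-feature tile
// coordinates. File, layer and field names below are placeholders.
CPLSetConfigOption("OGR_MVT_ADD_TILE_FIELDS", "YES");

GDALDataset* poDS = static_cast<GDALDataset*>(GDALOpenEx(
    "parcels.mbtiles", GDAL_OF_VECTOR, nullptr, nullptr, nullptr));
OGRLayer* poLayer = poDS->GetLayerByName("parcels");

OGRFeature* poFeat = nullptr;
while ((poFeat = poLayer->GetNextFeature()) != nullptr)
{
    const GIntBig nGlobalFID = poFeat->GetFID();
    const int z = poFeat->GetFieldAsInteger("mvt_tile_z");
    const int x = poFeat->GetFieldAsInteger("mvt_tile_x");
    const int y = poFeat->GetFieldAsInteger("mvt_tile_y");
    // ... store (nGlobalFID, z, x, y, parcel_id) in the mapping table ...
    OGRFeature::DestroyFeature(poFeat);
}
GDALClose(poDS);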
With this in place, I am currently able to build a table like the following
when reading an MVT dataset (example shown for the first three parcel
features):
globalFID   fidBase   z   x    y    parcel_id      localFID
5926        5926      7   46   38   76343521010    0
22310       5926      7   46   38   73527254010    1
38694       5926      7   46   38   73527253010    2
So now I know which globalFIDs correspond to each parcel, including its
fidBase and the tile coordinates (z, x, y). Nice :-)
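As a sanity check, the numbers above are internally consistent: fidBase
matches x * 2^z + y (46 * 128 + 38 = 5926), and consecutive localFIDs within
the same tile are spaced by 2^(2z), i.e. the number of tiles at that zoom
level. The tiny sketch below only reproduces the values from my table; it is
not taken from the driver source, so the actual combination may differ.

// Illustrative only: reproduces the table values for z=7, x=46, y=38.
const int z = 7, x = 46, y = 38;
const GIntBig nTilesPerAxis = GIntBig(1) << z;            // 128
const GIntBig fidBase = GIntBig(x) * nTilesPerAxis + y;   // 5926
for (GIntBig localFID = 0; localFID < 3; ++localFID)
{
    // 5926, 22310, 38694 for localFID = 0, 1, 2
    const GIntBig globalFID =
        localFID * nTilesPerAxis * nTilesPerAxis + fidBase;
    std::cout << "localFID=" << localFID
              << " -> globalFID=" << globalFID << std::endl;
}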
Internally, I also have the full update pipeline implemented: when the
dataset is opened in GDAL_OF_UPDATE mode, an instance of OGRMVTWriterDataset
is created, which allows me to call DeleteFeature (D) and CreateFeature (I),
or a combination of both when an existing feature needs to be updated (U).
During these operations I collect the affected tiles, and at the end of the
update process the pipeline calls the UpdateOutput method to regenerate only
the .pbf files or MBTILES rows that were impacted.
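From the application side, the call pattern I have in mind looks roughly
like this. It is a sketch only: update mode on MVT datasets exists only in
my extended driver, and the file, layer and attribute names/values are
illustrative.

// Sketch of the intended application-level usage of the update pipeline.
void ApplyParcelUpdate(const OGRGeometry* poNewGeom)
{
    GDALDataset* poDS = static_cast<GDALDataset*>(GDALOpenEx(
        "parcels.mbtiles", GDAL_OF_VECTOR | GDAL_OF_UPDATE,
        nullptr, nullptr, nullptr));
    OGRLayer* poLayer = poDS->GetLayerByName("parcels");

    // (D) delete the old version of the parcel by its dataset-level FID
    poLayer->DeleteFeature(22310);

    // (I) insert the new version; an update (U) is simply (D) followed by (I)
    OGRFeature* poNew = OGRFeature::CreateFeature(poLayer->GetLayerDefn());
    poNew->SetField("parcel_id", "73527254010");
    poNew->SetGeometry(poNewGeom);
    poLayer->CreateFeature(poNew);
    OGRFeature::DestroyFeature(poNew);

    // Closing the dataset finishes the update: the affected tiles are
    // collected and UpdateOutput regenerates only the impacted .pbf files
    // or MBTILES rows.
    GDALClose(poDS);
}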
So the remaining challenge is that I don’t see a clear way to modify the
driver so that a feature can be reliably deleted from the intermediate
SQLite database. I initially considered whether it might be possible to
store the globalFID (or at least the localFID) back into the temporary
SQLite table during the CreateOutput step, after encoding each individual
tile. However, at that stage the driver is not in a dataset-reading context,
and I do not see a reliable way to access or derive the localFID during
writing (perhaps such a mechanism could be implemented, but even if it
exists, it does not seem like an ideal approach).
Given this, I started thinking about an alternative solution based on a
stable, domain-level identifier such as domain_id (in practice, e.g.,
parcel_id), which is already known at write time. This would require storing
such an identifier explicitly in the intermediate SQLite table (in addition
to the internal idx, geometry, and tile coordinates). DeleteFeature could
then be implemented by removing all rows associated with that domain-level
identifier (which would be the input parameter).
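To make this concrete, a minimal sketch of what DeleteFeature could do
against the intermediate database is shown below. It assumes the temp table
has been extended with a domain_id column, which does not exist in the
current driver; "temp" is the layer name I already read from the temporary
database.

// Hypothetical sketch: remove every row of the intermediate SQLite table
// that belongs to one domain-level identifier (e.g. a parcel_id).
void DeleteByDomainId(GDALDataset* poTempDb, const char* pszDomainId)
{
    // Escape the identifier so it can be embedded safely in the statement.
    char* pszEscaped = CPLEscapeString(pszDomainId, -1, CPLES_SQL);
    CPLString osSQL;
    osSQL.Printf("DELETE FROM \"temp\" WHERE domain_id = '%s'", pszEscaped);
    CPLFree(pszEscaped);

    // The intermediate database is SQLite, so use the SQLite dialect.
    OGRLayer* poResult = poTempDb->ExecuteSQL(osSQL.c_str(), nullptr, "SQLITE");
    if (poResult)
        poTempDb->ReleaseResultSet(poResult);
}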
What do you think, Even? Does this approach make architectural sense to you?
Thanks again for your help, and I hope you have a great Christmas!
Linda Karlovská
---------- Original e-mail ----------
From: Linda Kladivová via gdal-dev <gdal-dev at lists.osgeo.org>
To: gdal-dev at lists.osgeo.org
Date: 23. 11. 2025 13:07:27
Subject: [gdal-dev] Decoding feature blobs and tracking FIDs in MVT datasets
"
Hi,
I’d like to ask about one specific aspect of the MVT driver. In my workflow,
I need to track the FIDs of features inside an MVT dataset together with
their corresponding internal codes (actual parcel IDs).
My use case involves an MVT dataset of about 20 million features that is
frequently updated. For this purpose, I have implemented an extension of the
MVT driver that can open an MVT dataset in update mode and create/delete
features based on changes detected by a preceding ETL process. Updates
typically affect only a few individual features every hour.
After the initial load of the full dataset using ogr2ogr, I need a way to
iterate through the dataset and build a separate table mapping each internal
FID to the parcel's real ID. I have tried opening and inspecting the temp.db
file (which I do not delete):
// Open the temporary SQLite database left behind by the MVT writer.
GDALDataset* poTempDb = static_cast<GDALDataset*>(
    GDALOpenEx(dstTempDb.c_str(), GDAL_OF_VECTOR, nullptr, nullptr, nullptr));
OGRLayer* poTempLayer = poTempDb->GetLayerByName("temp");
poTempLayer->ResetReading();

OGRFeature* poFeat = nullptr;
while ((poFeat = poTempLayer->GetNextFeature()) != nullptr)
{
    const GIntBig fid = poFeat->GetFID();

    // The "feature" field holds the encoded feature blob.
    const int idx = poFeat->GetFieldIndex("feature");
    int blobSize = 0;
    const GByte* pBlob = poFeat->GetFieldAsBinary(idx, &blobSize);
    (void)pBlob;  // the blob itself is not decoded here

    std::cout << "FID=" << fid << "  Blob size=" << blobSize << std::endl;

    OGRFeature::DestroyFeature(poFeat);  // avoid leaking each feature
}
GDALClose(poTempDb);
I can successfully access the feature blob, but I need to decode it in order
to extract my custom parcel ID attribute and store those values externally.
This would allow me to keep track of which parcel corresponds to which
internal FID.
This is a one-time (potentially slow) process. Afterward, during incremental
updates, I will be calling CreateFeature and DeleteFeature (and updating my
external mapping table accordingly). For edits, I currently use a
combination of DeleteFeature and CreateFeature (I haven’t implemented
SetFeature yet).
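To illustrate, the edit path currently looks roughly like the sketch below.
It is illustrative only: DeleteFeature/CreateFeature here rely on my
extended driver, the std::map stands in for my external mapping table, the
attribute name parcel_id is my own, and I assume CreateFeature assigns the
new FID back to the passed feature.

// Sketch of an edit expressed as delete + create, while keeping an
// external FID -> parcel_id mapping in sync.
std::map<GIntBig, std::string> oFidToParcel;  // stand-in for the mapping table

void EditParcel(OGRLayer* poLayer, GIntBig nOldFID,
                const char* pszParcelId, const OGRGeometry* poNewGeom)
{
    if (poLayer->DeleteFeature(nOldFID) == OGRERR_NONE)
        oFidToParcel.erase(nOldFID);

    OGRFeature* poNew = OGRFeature::CreateFeature(poLayer->GetLayerDefn());
    poNew->SetField("parcel_id", pszParcelId);
    poNew->SetGeometry(poNewGeom);
    if (poLayer->CreateFeature(poNew) == OGRERR_NONE)
        oFidToParcel[poNew->GetFID()] = pszParcelId;  // assumes FID is assigned
    OGRFeature::DestroyFeature(poNew);
}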
My question is:
Is there a way to decode this blob using existing GDAL/MVT functionality, or
would this require implementing a new function?
Thank you very much for your help.
Linda Karlovská
"