[gdal-dev] RFC 31 - OGR 64bit Fields and FID

Even Rouault even.rouault at mines-paris.org
Fri Sep 10 13:50:58 EDT 2010


Frank,

yes I definitely think that 64bit integer field/FID support is a must-have for 
OGR.

Here are my remarks on the proposal :

0) Minor point : about "Note that the old interfaces using "long" are already 
64bit on 64bit operating systems so there is little harm to applications 
continuing to use these interfaces on 64bit operating systems". This is true 
on almost all platforms, except on Win64 where a long is still 32 bit...

1) I assume that the corresponding C API will be also added to match the new 
C++ methods ?

2) There are also 2 methods that should be revisited for 64 bit support :

OGRLayer::DeleteFeature(long nFID)
OGRLayer::GetFeature(long nFID)

So we probably need :

OGRLayer::DeleteFeature64(GIntBig nFID)
OGRLayer::GetFeature64(GIntBig nFID)

3) I had thought also about the 64bit support before, and backward 
compatibility was a point for which I didn't come with a satisfactory 
solution. The API additions in the proposal are obviously done in a way to 
preserve backward compatibility in the traditionnal sense of it, but the 
changes done in the drivers to expose OFTInteger64 can be disruptive to 
existing applications. For example, I suppose we'll take the opportunity of 64 
bit integer support to solve that long standing issue with shapefiles and 
decide to map DBF integer fields of width >= 10 to be OFTInteger64 field (e.g. 
4000000000 cannot fit into a signed int32)

What happens if an application relies on the result of a switch(poFDefn-
>GetType()) ? Like in :

switch(poFDefn->GetType())
{
	case OFTString: printf("%s\n", poFeature->GetFieldAsString(i)); break;
	case OFTInteger: printf("%d\n", poFeature->GetFieldAsInteger(i)); break;
	case OFTReal: printf("%f\n", poFeature->GetFieldAsDouble(i)); break;
	default: fatal("Doh ! That can definitely not happen with shapefiles !!!
\n"); break;
}

One solution to workaround this could be to add a global function 
OGRAllow64BitInteger() so that applications can declare that they have been 
ported and now capable of dealing with 64 bit integer. If set, the drivers 
will be allowed to create OFTInteger64 fields, otherwise they will just expose 
them as OFTInteger. The drawback is that it increases a bit the complexity of 
drivers, and that applications aware of the 64bit additions must remind of 
calling the function soon after OGRRegisterAll()


4) Python support : I see 2 possiblities.
    a) as Python code rarely depends on the integer size, we could keep 
existing methods and make them use the 64 bit C API instead. This is indded 
what I've done indeed for the new GDALSetCacheMax64() that is now called by 
gdal.SetCacheMax(). But this could be perhaps a bit more risky to do the same 
for GetFID(), GetField(), GetFieldAsInteger() if the user really expects the 
result to be a 32 bit integer, although I fail to see a situation where it 
would be a problem. Indeed, Python dynamically adapts the type to the stored 
value :

>>> a = 12345
>>> print type(a)
<type 'int'>


>>> a = 3000000000
>>> print type(a)
<type 'long'>

So users won't notice until the returned value is actually a 64 bit integer. 
And I'd note that since Python 3, the long type no longer exist. It is just 
'<class int'> for both...

    b) If we want zero risk, introduce new methods with 64 suffix

   As far as the testing of RFC31 is concerned, once we have Python support 
for it in place, this should be easy. Just put a 11 char int field in a 
shapefile and try to get a value from it.
       
5) Java support. Java has strong typing and method signatures depend on the 
exact types, so no choice here if we want the new gdal.jar to be a drop-in 
replacement. We, at minimum, need explicit 64 bit variants for the getters. 
But for SetFID(), we could keep the existing SetFID(int) and add SetFID(long) 
and/or SetFID64(long) (Note: I'm talking about the Java API where a long is 
always a 64bit integer). As I feel that being consistant with the C++ API is a 
nice plus, my proposal would be :

	keep existing "int Feature.GetFID()" and "int Feature.SetFID(int)"
 	add           "long Feature.GetFID64()" and "int Feature.SetFID64(long)"

   I volunteer to make the appropriate changes in the Java bindings.

6) I realize this is a bit out of scope of the RFC, but as we are introducing 
GetFID64(), why not also introducing GetFIDAsString(), 
OGRLayer::GetFeature(const char* pszId) ... ? I'm thinking here to GML/WFS 
where gml:id is a string. We currently have a logic in the GML driver that 
tries to build an integer FID from the string, but that cannot obviously work 
in all circumstances, so we can end up with accidentally duplicated FIDs.

Best regards,

Even


More information about the gdal-dev mailing list