[fdo-internals] TIcket 471: ArcSDE Long Text Fields.. testing patch

Romica Dascalescu Romica.Dascalescu at autodesk.com
Mon Mar 9 14:02:48 EDT 2009


See inline...

Thanks,
Romi.
________________________________________
From: fdo-internals-bounces at lists.osgeo.org [fdo-internals-bounces at lists.osgeo.org] On Behalf Of Chris Erickson [chris at cartopac.com]
Sent: Monday, March 09, 2009 1:39 PM
To: FDO Internals Mail List
Subject: RE: [fdo-internals] TIcket 471: ArcSDE Long Text Fields.. testing      patch

Romi,
Thanks, Here are some questions:
                1) ArcSDE/Src/Provider/ArcSDEDescribeSchemaCommand.cpp

                You should use the value from capabilities when we set length ("data_definition->SetLength(2147483647);")
                See below the capabilities reported by the provider:

                FdoInt64 ArcSDESchemaCapabilities::GetMaximumDataValueLength (FdoDataType dataType) {
                        switch (dataType)
                        {
                                case FdoDataType_String:   return (FdoInt64)4294967296LL; // ArcSDE's limit itself
                                case FdoDataType_BLOB:     return (FdoInt64)4294967296LL; // Oracle's limit, which is higher than SQL Server's limit
                                case FdoDataType_CLOB:     return (FdoInt64)4294967296LL; // Oracle's limit, which is higher than SQL Server's limit
                        }
                        return (FdoInt64)-1;
                }

I had done this because SetLength() asks for a 32 bit int.  I can change it, but just wanted to explain why I'd done that.
[RD] ok, your change makes sense.


                2) ArcSDE\Src\Provider\ArcSDEReader.cpp

                Note: FDO keeps length in bytes, SDE expects length in characters (bytes/wchar_t depending of the type of the CLOB and SDE_UNICODE).

                Mixing SE_NCLOB_TYPE with SE_CLOB_TYPE will create problems when SDE_UNICODE is not defined (when we use 9.1).
                SE_NCLOB_INFO - contains wchar_t* characters and length is provided in wchar_t and not in bytes Either you add support for CLOB only for
                9.2 and not for 9.1 (add the new code with #ifdef SDE_UNICODE) or you fix the issue (in ArcSDEReader::GetLOBStreamReader).

The code already disallows the streamreader for 9.1, because it can't convert multibyte to Unicode as a stream.. This is handled by this block:
#if !SDE_UNICODE
        if (columnDef->mColumnType == SE_CLOB_TYPE)
                throw FdoException::Create (NlsMsgGet2(ARCSDE_VALUE_TYPE_MISMATCH, "Cannot get LOB stream on property '%2$ls'.  Underlying DBMS is multibyte, stream requires unicode.  Use GetLOB instead.", L"FdoBLOB", identifier));
#endif


                You need to fix ArcSDEReader::GetLOBStreamReader and ArcSDEReader::GetLOB since the size of the buffer is not blob_length it is sizeof(CHAR)*blob_length.
                FdoLOBValue and ArcSDEBLOBStreamReader expects byte* and length (in bytes), we provide (byte*)(wchar_t*) and length (in wchar_t) and we lose half of the array.

I'm a little confused here.  I've just double checked in the debugger, and for a CLOB coming out of ArcSDE that has 12 characters, the blob_length was 24, which would be the Unicode length.  I'm unclear what you're asking me to do.
[RD] Ok since SDE API returns number of bytes then changes are ok.

                ArcSDEReader::GetLOB you allocate twice a FdoByteArray could you please change the code and depending of the conditions allocate it only once?
I am only allocating it once, it just has 2 places where it could get allocated.  Are you asking that I move the allocation out of the if...else so it only appears once?  It is there twice because it either allocates from 'unicodeByteArray' or 'byteArray', depending on if it was coming from a CLOB field w/o Unicode defined.

Is it that you want me to make the following change:

Current:
#if !SDE_UNICODE
        if (columnDef->mColumnType == SE_CLOB_TYPE)
        {
                //Convert to UNICODE
                FdoPtr<FdoByteArray> unicodeByteArray = FdoByteArray::Create(2 * (strlen((char*)byteArray.p) + 1));
                multibyte_to_wide_noalloc((wchar_t*)unicodeByteArray.p, (char*)byteArray.p);
                return (FdoLOBValue*)FdoLOBValue::Create(unicodeByteArray, FdoDataType_BLOB);
        }
        else
#endif
        {
                // Return the LOB value:
                return (FdoLOBValue*)FdoLOBValue::Create(byteArray, FdoDataType_BLOB);
        }

Proposed:
#if !SDE_UNICODE
        if (columnDef->mColumnType == SE_CLOB_TYPE)
        {
                //Convert to UNICODE
                FdoPtr<FdoByteArray> unicodeByteArray = FdoByteArray::Create(2 * (strlen((char*)byteArray.p) + 1));
                multibyte_to_wide_noalloc((wchar_t*)unicodeByteArray.p, (char*)byteArray.p);
                byteArrary = unicodeByteArray;
        }
#endif
        // Return the LOB value:
        return (FdoLOBValue*)FdoLOBValue::Create(byteArray, FdoDataType_BLOB);
[RD] No. Move

    // Create a FdoByteArray that contains the entire blob:
    FdoPtr<FdoByteArray> byteArray = FdoByteArray::Create ((FdoByte*) columnDef->mBindVariable._blob.blob_buffer,
        columnDef->mBindVariable._blob.blob_length);
in the else section. See below
        else
#endif
        {
               // Create a FdoByteArray that contains the entire blob:
                FdoPtr<FdoByteArray> byteArray = FdoByteArray::Create ((FdoByte*) columnDef->mBindVariable._blob.blob_buffer,
                columnDef->mBindVariable._blob.blob_length);
                // Return the LOB value:
                return (FdoLOBValue*)FdoLOBValue::Create(byteArray, FdoDataType_BLOB);
        }



                3) ArcSDE\Src\Provider\ArcSDEFeatureCommand.h

                Could you please tell why you replaced
                "ret = SE_stream_set_string (stream, columnIndex, NULL);"
                with
                "ret = SDENAME_C(SE_stream_set_string) (stream, columnIndex, NULL);"?

                Naming of function should be handled by define of SDE_UNICODE.

I changed it there to be consistant with my name change down below:
                                          if (isUnicode)
                                    ret = SE_stream_set_nstring (stream, columnIndex, (SE_WCHAR*)_string->GetString());
                                else
                                                ret = SDENAME_C(SE_stream_set_string) (stream, columnIndex, (SE_WCHAR*)_string->GetString());

I could change the one down below to :
                                          if (isUnicode)
                                    ret = SE_stream_set_nstring (stream, columnIndex, (SE_WCHAR*)_string->GetString());
                                else
                                                ret = SE_stream_set_stringW (stream, columnIndex, (SE_WCHAR*)_string->GetString());
and revert the previous change if you'd prefer.

[RD] Changing this you will not be able to build 9.1 provider (SDE using 9.1 is not supporting Unicode)
You must keep the function name as it was and SE_stream_set_string will be changed to SE_stream_set_stringW only when SDE_UNICODE is defined.

                case FdoDataType_CLOB: // Character Large Object No conversion should be done here. The client of the CLOB should update the right values in CLOB
                        and you just need to call the right function with the buffer and right length.

I disagree -- My understanding is that the CLOB coming out of FDO should ALWAYS be in unicode.  Therefore, if I'm submitting a UNICODE string, but ArcSDE is 9.1/non-unicode, I need to translate it to multibyte before I send it in using SE_stream_set_clob.

                Note: FDO keeps length in bytes, SDE expects length in characters (bytes/wchar_t depending of the type of the CLOB and SDE_UNICODE).
I'll need your help to see specifically where I'm not passing in the correct lengths.  On the stream reader, I'm getting the Unicode lengths, not # characters, so I'd be suspicious if that was different here.
[RD] Ok.

I had to make a series of changes and I've completed testing against SDE/SQL Server, and am going to send out my results. In a follow-up email, as this one is rather long and unwieldy.

chris erickson
developer
chris at cartopac.com
970.493.9500 x 191
970.482.1485 (fax)



-----Original Message-----
From: fdo-internals-bounces at lists.osgeo.org [mailto:fdo-internals-bounces at lists.osgeo.org] On Behalf Of Romica Dascalescu
Sent: Monday, March 09, 2009 8:49 AM
To: FDO Internals Mail List
Subject: RE: [fdo-internals] TIcket 471: ArcSDE Long Text Fields.. testing patch

Hi Chris,

I have a few comments about your change, see below.

1) ArcSDE/Src/Provider/ArcSDEDescribeSchemaCommand.cpp

You should use the value from capabilities when we set length ("data_definition->SetLength(2147483647);")
See below the capabilities reported by the provider:

FdoInt64 ArcSDESchemaCapabilities::GetMaximumDataValueLength (FdoDataType dataType) {
    switch (dataType)
    {
        case FdoDataType_String:   return (FdoInt64)4294967296LL; // ArcSDE's limit itself
        case FdoDataType_BLOB:     return (FdoInt64)4294967296LL; // Oracle's limit, which is higher than SQL Server's limit
        case FdoDataType_CLOB:     return (FdoInt64)4294967296LL; // Oracle's limit, which is higher than SQL Server's limit
    }
    return (FdoInt64)-1;
}

2) ArcSDE\Src\Provider\ArcSDEReader.cpp

Note: FDO keeps length in bytes, SDE expects length in characters (bytes/wchar_t depending of the type of the CLOB and SDE_UNICODE).

Mixing SE_NCLOB_TYPE with SE_CLOB_TYPE will create problems when SDE_UNICODE is not defined (when we use 9.1).
SE_NCLOB_INFO - contains wchar_t* characters and length is provided in wchar_t and not in bytes Either you add support for CLOB only for 9.2 and not for 9.1 (add the new code with #ifdef SDE_UNICODE) or you fix the issue (in ArcSDEReader::GetLOBStreamReader).
You need to fix ArcSDEReader::GetLOBStreamReader and ArcSDEReader::GetLOB since the size of the buffer is not blob_length it is sizeof(CHAR)*blob_length.
FdoLOBValue and ArcSDEBLOBStreamReader expects byte* and length (in bytes), we provide (byte*)(wchar_t*) and length (in wchar_t) and we lose half of the array.

ArcSDEReader::GetLOB you allocate twice a FdoByteArray could you please change the code and depending of the conditions allocate it only once?


3) ArcSDE\Src\Provider\ArcSDEFeatureCommand.h

Could you please tell why you replaced
"ret = SE_stream_set_string (stream, columnIndex, NULL);"
with
"ret = SDENAME_C(SE_stream_set_string) (stream, columnIndex, NULL);"?

Naming of function should be handled by define of SDE_UNICODE.

case FdoDataType_CLOB: // Character Large Object No conversion should be done here. The client of the CLOB should update the right values in CLOB and you just need to call the right function with the buffer and right length.
Note: FDO keeps length in bytes, SDE expects length in characters (bytes/wchar_t depending of the type of the CLOB and SDE_UNICODE).

Thanks,
Romi.

________________________________
From: fdo-internals-bounces at lists.osgeo.org [fdo-internals-bounces at lists.osgeo.org] On Behalf Of Chris Erickson [chris at cartopac.com]
Sent: Friday, March 06, 2009 5:12 PM
To: FDO Internals Mail List
Subject: [fdo-internals] TIcket 471: ArcSDE Long Text Fields.. testing patch

I'm not done testing this patch, but if anyone would like to take a look at or review it, I'd appreciate it.

I've not tested getting streams on large text fields yet.

It contains the following fixes:
Support for CLOB/NCLOB fields.
Allocating string fields for conversion on heap.

There are known errors with SQL Server VARCHAR(MAX)/NVARCHAR(MAX) fields, where they are coming out empty for some reason.  I have no clues on this, if anyone else wants to look.

I've not been able to fully test against Oracle either, nor complie against 9.1.

[cid:image001.png at 01C99E6D.EE56CE60]<http://www.cartopac.com/>

chris erickson
developer
chris at cartopac.com<mailto:chris at cartopac.com>
970.493.9500 x 191
970.482.1485 (fax)



_______________________________________________
fdo-internals mailing list
fdo-internals at lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/fdo-internals


More information about the fdo-internals mailing list