[fdo-internals] Use of SQL pass through

Wed Jun 3 09:27:56 EDT 2009

Hi,

I would like to solicit feedback, discussion, and suggestions on requirements and issues that we are seeing with the FdoISQLCommand which is used for the purpose of SQL pass through for some of the RDBMS-based providers.

Please see the discussion below.

Thanks,
Orest.

Enhanced SQL Command Support Discussion
Overview
The FDO API currently defines support for a SQL command that allows for pass-through execution of SQL statements either through a non-query execution of the SQL statement directly in the underlying Data Store, or through a query mechanism that returns a simple data reader listing the properties returned as a result of the SQL execution. The definition of the SQL command has remained fairly static over the last number of releases as primary development efforts have focused on extending other aspects of the FDO API, implementing new providers, etc. However, requirements for change to the SQL command have accumulated for us as RDBMS providers have implemented SQL pass through support and clients have attempted to integrate use SQL pass through into their applications in a seamless manner. While we generally expect applications to be using the generic FDO Select and other commands using FDO expressions, application still need to execute direct SQL against RDBMS-based providers on an exception basis for things that cannot do through FDO. While we hope to improve the FDO expression capabilities over time, there will always be this need for direct SQL processing for exception cases.
This email is meant to start discussion on how to handle the requirements that we've been seeing and to get feedback on how to modify the FDO api to handle these requirements. We have some ideas as described below, but would like to get other feedback. We'll draft an RFC once we get close to a consensus on how it should look.
Requirements
One key request has been the desire to have the FDO API support SQL pass-through commands that return an FDO feature reader, referencing a proper FDO schema, not simply an FDO data reader. The feature reader will contain proper geometry properties, relations and associations. This enhancement is also intended to allow client applications that use FDO Feature Readers to code their applications in a generic manner so that data coming back from Select or SQL Pass Through statements can be processed in a uniform manner, thus reducing complexity, costs and time to implement.
Other more SQL specific requirements related to this have also arisen, including:

 *   A need to allow a client to set the size of the fetch array an FDO provider will use when executing SQL statements ([OH] I still have an issue with this one. Fetch sizes are internal tuning parameters and are not FDO api concepts. Other providers that deal with select or insert buffering have default internal sizes.)
 *   A need to specify bind variables for the SQL command, including arrays of bind variables. Since SQL commands may include stored procedure execution, bind variables need indication of whether they are input, output, or return parameters.
Discussion
In general, the intent of SQL pass-through is to deal with physical schemas. There is no parsing of the SQL statements, and what are identified are physical schema tables, columns, functions, commands, etc. The SQL statement can be any SQL command, not just select, but any DML (select, insert, update, delete) or any DDL (create, drop, alter, ...). However, FDO feature readers deal with FDO logical schemas, which is at a different conceptual level. It's the mixing of these levels that causes difficulty for applications using the FDO API. Applications are required to use different code pathways to handle select statement as opposed to direct SQL execution. If clients could use the result of either of these types of operations in a generic routine or component, such as a Data Table or Symbolization packages, applications would be provided a much more seamless and dynamic mechanism on which they can build and extend their applications.
To a certain degree, the current FDO feature reader implementation assumes an FDO class definition is directly related to the properties being returned. With physical schema SQL (let's say just Select command), there isn't necessarily any FDO class definition that applies. This is why currently the SQL command's Execute method returns a data reader, which handles any generically returned data. Note that the FDO select aggregates command doesn't return a feature reader either, since it's returning computed data.
One response to this issue has been to suggest that providers reverse engineer the select results and attempt to find the FDO class definition (coming from a describe schema request) that matches it. Other suggestions have been to construct a class definition on the fly, one that would not result from a call to describe schema. If the select is from a single table, providers would find the class definition that is based on that table and use it. However, then matching up the columns being selected with the properties in the logical schema, there may be some slight differences. This would result in the mixing of physical schema and logical schema elements together. For example, class names that are not the same as table names, property names that are not the same as column names, the use of additional computed columns, inclusion of pkey columns, etc. Granted that in many cases, the logical and physical views would be virtually identical. However, since that cannot be guaranteed, the design that is adopted will have to be able to handle the general case.
To account for these scenarios, providers will need to be modified to reverse engineer class definitions from the selected data and not attempt to match the select request to an existing FDO class. In reality, FDO Providers will be required to handle this in any case since a generic SQL select may not match up at all with an existing FDO class. An example of this can be found when selecting from a table with an owner that is different from the connected data store. For example, Connect to data store called Denver and select from Boulder.Roads - the schema may or may not be similar to Denver.Roads. In these types of circumstances, providers should manufacture a new class definition.
In cases where the resulting columns come from an existing FDO feature table, a provider can return the class definition corresponding to that table. In cases where the columns come from an unknown table, a class definition can be constructed on the fly. By definition, the FDO class definition returned by a feature reader does not necessarily correspond exactly to an existing FDO class definition. Existing class definitions may contain the properties that were asked for in the Select command, plus additional computed properties. It is perfectly legal to return a constructed class definition, which is only valid for the select that was executed, and not usable for further updates or inserts.
In the circumstance that a computed class is generated, the FDO class definition's IsComputed property will return true. In this manner, applications are able to distinguish the content of the feature reader responses coming from the providers and tailor their implementation accordingly. In such a situation, some care will also need to be given to the name of the generated FDO schema and class definitions. At this point no standards exist for naming auto-generated schema, class and property names. It would be beneficial if, as a result of this RFC, some uniform naming conventions could be adopted.
Providers that do return feature readers from SQL commands will need to come up with the appropriate class definition that the feature reader could expose. Here are a few general use cases:

 *   Select is against a table that has an existing class definition and the select returns the same information as defined by the class definition (e.g. select * from roads).
 *   Select is against a table that has an existing class definition, but the select returns other information such as a subset of properties or additional computed properties. The FDO select command handles this type of case as well. The class definition returned by the feature reader includes the specific properties for that select.
 *   Select is against a table that doesn't have an existing class definition. Providers will be required to generate a temporary class definition.
 *   Select is against a table that has an existing class definition, but the select is an aggregation that doesn't return actual objects (e.g. select count(*), max(length) from roads). In this case it doesn't make sense for the provider to return the same class name as for the underlying table since it is not actually returning road objects at all. This is basically another temporary class definition.
FDO API Ideas
In order to support the SQL pass-through objective outlined above, the FDO API will need to be modified to provide a new capability function to state that this is supported, and to provide a way to return an explicit FDO Feature Reader. Two possibilities have been identified for returning the feature reader. One idea is to add a new ExecuteFeatureReader (or similar name) method to FdoISQLCommand. The existing ExecuteReader method will remain as is for backwards compatibility. Another possibility is to keep the SQL pass through interface unchanged but provide a utility that will convert the returned SQL Data Reader into a Feature Reader after the fact.
A related question is around the class definition that would be generated from the SQL since it often would be a temporary class definition for the command and not something found from the FDO Describe Schema command. The feature reader would expose that class definition, but would applications need to get that earlier, e.g. with a new method on FdoISQLCommand to describe the generated class definition prior to getting the reader?
Parameter Direction
FDO already includes API's for defining and using parameters (bind variables). However, there is no indication of direction (input versus output) and if an application is planning to use stored procedure calls in the SQL pass through, they would need to provide some indication of input, output, or return parameters. A possibility is to add a new FdoParameterDirection enumeration and add set/get methods to the FdoParameterValue or FdoParameterValueCollection interface.
A related question is whether FdoISQLCommand needs the application to tell it whether the SQL command being executed is a stored procedure call or some other type of SQL? We want to avoid having providers be forced to parse SQL where currently they probably all just send the SQL directly to the underlying server. The ExecuteNonQuery versus ExecuteReader methods that the application uses already tell the provider whether to expect results back from the SQL or not. A stored procedure call is a "non-query" but can return results, so is it another case?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.osgeo.org/pipermail/fdo-internals/attachments/20090603/be74992e/attachment-0001.html