Optimizing with keys
This topic includes the following sections:
Optimization starts with the design of your database and repository (see Setting up a repository). For xfODBC to quickly process SQL statements, the data files and the repository should have well-chosen keys that reflect the way users access data.
What are keys?
A key is a portion of a record structure that individually identifies records and enables records to be quickly accessed and sorted. For ISAM files, a key can be a portion (segment) or a group of separate portions of the record structure. For repositories, a key can be a single field or a group of fields. The fields that make up a repository key are also called segments.
Keys are created at two different points in your database’s development: when you create the data files and when you define the repository for those files. When you create an ISAM file, for instance, you define the primary key and any secondary keys for that file. When you create a relative file, the record number is automatically defined as the primary key (this is the only key that can be defined for a relative file). ASCII sequential files, on the other hand, don’t have keys.
In addition to defining keys when you create data files, you also define keys when you define the repository that describes your data files. At this point, there are two types of keys you can create: access and foreign. Access keys correspond to the keys you created when you created the data file, are used to locate and sort records, and can be used to define relationships between tables. Foreign keys, however, are not keys in the data file, but can be used to create relationships between tables. (Note that foreign keys are useful only with ODBC-enabled applications that support the ODBC API function SQLForeignKeys.)
For information on defining keys for ISAM files and relative files, see Synergy DBMS.
How xfODBC uses keys
As xfODBC processes an SQL statement, it looks to the database’s system catalog for indexes (keys) it can use to speed the processing of the statement. To determine which key to use, xfODBC evaluates two types of SQL clause: restriction clauses and sort clauses. Restriction clauses include WHERE, HAVING, and JOIN clauses. Sort clauses include ORDER BY and GROUP BY clauses.
To evaluate a restriction clause, xfODBC attempts to match the columns (field names) in the restriction clause with the key’s segments. To evaluate a sort clause, xfODBC matches the sort clause’s order (ASC or DESC) with the key’s segment order and matches the columns in the sort clause with the key’s segments. If no key can be used with the sort clause, xfODBC will then create a temporary table sort, which results in many more I/O operations and poorer performance.
Note the following:
- xfODBC evaluates JOINs before other restriction clauses.
- xfODBC evaluates restriction clauses before ORDER BY clauses. If for the restriction clause xfODBC uses a key that includes the field specified for the ORDER BY, and that field is the first segment in the key, xfODBC won’t perform an additional sort because the result set will already be in the correct order.
- If there is an ORDER BY clause but no restriction clause, xfODBC may choose a key based on the ORDER BY to avoid a sort of the result set.
- xfODBC will optimize a sort clause only if there is no join, the clause has a constant predicate, and the clause has one of the following operators: >, >=, <, or <=.
- For xfODBC to use a key for a restriction or sort clause, the clause doesn’t need to have a column for every segment in the key, but it must contain columns for one or more contiguous segments starting with the first segment. For example, to use the fourth segment of a key that has four segments—seg1, seg2, seg3, and seg4—there must be columns for seg1, seg2, and seg3 as well as seg4. If, for example, a clause has columns that correspond only to seg1 and seg3, xfODBC can use the first segment, seg1, of the key, but not the third, seg3.
- If xfODBC cannot use a key with either a sort or a non-join restriction clause, xfODBC uses the primary key as a sequential read key. This means the driver must read the entire file.
- For inner joins, xfODBC creates a temporary index for each join table whose columns are not part of any key. Each temporary index includes all the join columns for its table. You can prevent xfODBC from creating this temporary index by setting TMPINDEX to OFF. (For information, see SET OPTION.)
If you use xfODBC with inner joins that result in temporary indexes, note that your data files must use static RFAs if update, store, or delete operations will occur while the xfODBC driver is using a temporary index to access a file.
- A key can have a literal segment in any position. See Keys with literals.
- If a key contains a field that overlays other fields, dbcreate creates an alternate index made up of the individual fields that make up the overlay (if it overlays more than one field). See Indexes for overlay segments below.
- Crystal Reports 9 and higher do not support foreign keys. These versions of Crystal Reports will not automatically use foreign keys to optimize a query, so users must understand database relationships in order to optimize joins.
- If a key of reference is not explicitly defined (with the Key of ref setting in Repository), it is assumed to be the key order defined in the repository. For information on the Key of ref setting, see Defining keys.
Note that if a “Key of ref” value is specified for a key in a structure, it should be explicitly set for all keys in that structure. Otherwise dbcreate will report an error indicating there is a duplicate key of reference.
If a structure has a key with an overlay segment, dbcreate creates an index that includes the non-overlay fields (i.e., the fields that the segment overlays) if the overlay field
- overlays more than one primary field.
- does not have an offset.
This index is named as follows in GENESIS_INDEXES: KeyName_GENIX_n (where n is an ascending numeric value). For instance, if a key has an a22 segment that overlays four fields (as in the following example), dbcreate creates an index consisting of the four fields (custid, date, code, and text) in addition to an index consisting of seg1.
record custid ,d4 date ,d6 code ,d2 text ,a10 record ,x seg1 ,a22 ;key
This would look like the following in GENESIS_INDEXES:
KEY1
KEY1_GENIX_0
If a structure has a tag index and a key named KEY_0 that overlays two fields, dbcreate will create the following indexes:
$_VTX_TAG_VIX_0001
KEY_0
KEY_0_GENIX_0
Note, however, that if a field is only partially included in an overlay, that field will be omitted from any GENIX index created for that overlay.
On the other hand, if multiple fields overlay a key segment defined as a primary field (the overlaid field)—the inverse of the first example—dbcreate will not create an index for the overlaying fields. Instead, it will create a single index for the key segment. For example, if four overlay fields—custid (a d4), date (a d6), code (a d2), and text (an a10)—overlay an a22 segment (as in the following structure), dbcreate creates only one index, an index for the a22 key field.
record key1 ,a22 ;key record ,x custid ,d4 date ,d6 code ,d2 text ,a10
In this case, xfODBC won’t use an index for a statement with a restriction or sort clause for the custid, date, code, or text fields.
Finally, if multiple overlay fields are defined as a key (as in the following example), xfODBC won’t be able to use the primary field (the overlaid field) for optimization. For the following example, xfODBC won’t be able to use the key1 field for optimization.
record key1 ,a22 record ,x custid ,d4 ;key segment date ,d6 ;key segment code ,d2 ;key segment text ,a10 ;key segment
Defining keys
Keep the following in mind when defining keys:
- Analyze your data. Remember that adding keys slows down the add/update process. Create keys only if they’ll be used often with restriction clauses or sort clauses. Then write your SQL statements to use these keys. Note that it’s better to define your keys to work with restriction clauses than sort clauses. A key used with a restriction clause has precedence over a key used with a sort clause. It’s even better to define a key that works with both the restriction and sort clause. This improves performance by eliminating the sort phase.
- Define an access key for each key in the data files. xfODBC recognizes keys only if they’re defined in the system catalog, which is generated from the repository, not the data files.
- Avoid creating keys that contain null values. xfODBC uses them only for join optimization.
- Avoid creating keys that are case insensitive or unsigned. xfODBC can’t use them for optimization.
- Avoid creating keys that have a date field with a rolling century (RR). xfODBC can’t use them for optimization.
- Not all date formats can be used as key segments. Only JJJJJJ dates and dates with formats that begin with YYYY can be key segments. (Note, however, that the YYYYMMDDHHMISS date format listed in Date and time fields should not be used as a key segment because you have to use a user type field to specify it.)
- Avoid creating access or foreign keys that contain user-defined fields. The xfODBC query optimizer ignores them. However, there are three exceptions (i.e., cases where xfODBC can use keys with user-defined fields):
- a key with a user-defined date field that has a ^CLASS^ format
- a foreign key with an alpha user-defined field
- an access key with an alpha user-defined field whose key type override is set as alpha. (For information on setting the type override, see Defining keys.)
Note the following:
- If a key contains multiple field segments, xfODBC uses only those segments that are defined prior to a user-defined field segment.
- If a user-defined field is the first segment in a key, xfODBC won’t use the key.
- If the data file and the repository have different data types for the same segments, use the Type option (in the Key Definition window of Repository) to override decimal segments to alpha segments if the segments will have only positive values. This prevents fcompare (see Validate, verify, and compare) from generating warnings when it discovers that the key has a different type than some of its segments. If a segment can have negative values, the segment must be decimal, so do not override the segment data type. For information on the Type option, see Defining keys.
- If a key segment is an overlay field, the overlay is not usually relevant to the query. Because of this, dbcreate creates an alternate index that consists of all the fields that make up the overlay. However, if a key segment is overlaid with multiple overlay fields, dbcreate will not create an index for the primary fields (the overlaid fields). See Indexes for overlay segments for more information.
- If you plan to import data from your Synergy database to another database, such as Oracle or SQL Server, make sure that no keys are null. Many databases, such as SQL Server, do not allow null key fields.
- If a file has a tag field, we recommend including the tag as the first segment in each key for the file. See Tags and optimization for more information.
Keys with literals
A key can have a literal segment in any position in the key. This means that for foreign keys, a literal can be used to correspond to a literal tag in a related table. For example, in the sample database included with the Connectivity Series distribution (the repository is in connect\synodbc\dict), the literal at the beginning of TAG_KEY_VEND in the ORDERS table enables a relation to be created between TAG_KEY_VEND and TAG_KEY, as illustrated in figure 1. Without this literal, we couldn’t use the tag for the VENDORS table, and the keys wouldn’t correspond. We need an equivalent segment as the first segment in TAG_KEY_VEND.
If the ORDERS and VENDORS tables had the following values there would be matches for all the rows in the ORDERS table except the row with the OR_VENDOR value of 42:
Likewise, all but one of the rows in the VENDORS table would have a match. The row with the VEND_KEY value of 40 wouldn’t have a match because there is no row in the ORDERS table with an OR_VENDOR value of 40 and because this row has a VEND_R_TYPE value of 2, while the corresponding literal segment in TAG_KEY_VEND is always 1.
Tags and optimization
Keep the following in mind when defining tags or creating keys for a file with a tag:
- Make sure the tags won’t be modified by user-defined data routines. The xfODBC driver evaluates tags before invoking user-defined data routines, so the values produced by user-defined data routines won’t be evaluated as tag values.
- If a file has a tag field, we recommend including the tag as a segment (preferably the first segment) in each key for the file. Note the following:
- xfODBC isn’t able to use a key that consists solely of a tag for automatic optimization unless it’s the first key in the file. We don’t recommend creating such keys (though they can be used as part of a join, restriction clause, or sort clause).
- Like any segment, a tag field can be used for optimization only if it’s the first segment in the key or if all the segments that precede it are also part of the join, sort clause, or restriction that uses the equal (=) operator.
- If a structure has more than one set of tag criteria joined with the OR connect option, xfODBC won’t be able to use a tag for optimization.
If your repository has a set of keys that either do not include the tag or do not include it as the first segment, create a new key with the tag as the first segment, and then use the Repository option “Excluded by ODBC” to exclude similar keys that either have the tag in other positions or that do not have the tag at all. (For information on this Repository option, see Defining keys.)
Similarly, if your repository has two or more keys that are identical except for the inclusion or placement of a tag field, and one of the keys has the tag as the first segment (as we recommend), do one of the following to ensure that the key with the tag as the first segment is used for optimization:
- Set the Synergy/DE Repository option “Excluded by ODBC” for the keys that don’t have the tag as the first field.
- Make sure the key with the tag as the first segment is lower than or equal to the others in key of reference (KRF) sequence. If it is equal, make sure it’s first in the repository.
- If a tag is based on the Boolean operator EQ, dbcreate automatically creates an index with tag-related information. These are named as follows in the GENESIS_INDEXES file: $_VTX_TAG_VIX_nnnn (where nnnn is a numeric value, starting at 0001 for the first key).
- If a tag is based on an operator other than EQ, xfODBC may not be able to use the tag to optimize queries. If it’s based on the NE operator, xfODBC will not be able to use it. If it is based on LE, LT, GE, or GT, xfODBC will be able to use it only if it is the last tag for the structure and all prior tags are based on the EQ operator. If your repository has a tag that for one of these reasons cannot be used for optimization, add a new tag that uses EQ to the file. Then either use the new tag as the first segment for all keys used by xfODBC for the file, or create a new set of keys that use the new tag, and use the Repository option “Excluded by ODBC” to exclude keys that use the unoptimizable tag.