Understanding updates and locking
This topic includes the following sections:
This topic presents general information on concepts, features, and procedures that differ from one database management system to another. We include this information as a starting point; for complete information on these subjects, see your database documentation. |
When multiple users access and modify data from the same database, there can be data access conflicts (multiple users attempting to access the same data) and update conflicts (different versions of the same data being modified and committed to the database at the same time). This is possible because users of front-end applications can simultaneously access the same data on the same database. Additionally, when a user accesses data for viewing or modifying, one or more rows of data are copied from the database into network buffers or into host variables in the front-end application. From this point on, or until the data is stored back into the database, the data the user views may no longer be the same as the data residing in the database. Other users may have accessed and modified the data between the time the user accessed data and the time the data was committed.
To prevent conflicts and ensure data integrity, relational database management systems (RDBMSs) provide controls that enable developers to specify how data will be accessed. Row locking is one such method that’s commonly employed, but it differs from one database management system to another. And in some situations, the front-end application must be able to verify that updates made to the database by one user do not overwrite updates made by another user (see Optimistic locking and unique row identifiers).
Row locking
To balance the need for good performance against the potential for data conflicts when multiple users access a database, two types of locking are generally employed by database engines: pessimistic locking and optimistic locking. In simple terms,
- pessimistic locking assumes that multiple users might be accessing the same data and attempts to prevent conflicts. Pessimistic locking locks data for much of the duration of a transaction. For SQL Connection (when the correct commands are used) pessimistic locking locks rows from the first fetch with %SSC_MOVE or insert until the transaction ends with %SSC_COMMIT, and an error is returned if another user attempts to access locked data. (See figure 1 below.)
- optimistic locking assumes that there will not be a conflict for data resources and therefore relies on the front-end application to ensure data integrity. When optimistic locking is used and data is committed by a user, the database engine does not check to see if data has been accessed by another user, and you may get lock failures at write time rather than read time.
Different databases management systems use different methods to initiate locking. For many that follow ANSI standards, a SELECT FOR UPDATE operation invokes the database engine’s inherent locking method (which typically is pessimistic). For example, Oracle uses pessimistic locking when a SELECT FOR UPDATE operation is performed. On the other hand, when using SQL Server with the VTX12_SQLNATIVE database driver, pessimistic locking is generally invoked automatically on SELECT if you use a dynamic cursor (the default for VTX12_SQLNATIVE) with the UPDLOCK hint and, optionally, the ROWLOCK hint. Otherwise, SQL Server does not use row locking with these database drivers. Also note that when rows have been selected with the UPDLOCK hint, a commit does not release locks.
Allowing the database engine to use pessimistic locking may result in longer locks on data and increased demand on database resources, especially when many users access a database simultaneously.
|
Optimistic locking and unique row identifiers
The concept of unique row identifiers is central to relational database operation. For row locking to occur at all, the database engine must be able to identify each row with a unique identifier, or pointer. For example, SQL Server enables you to include a rowversion (timestamp) column. In an SQL database table, the unique identifier is usually derived from some combination of the row location within the database and a unique numeric value or timestamp.
For optimistic locking, you can compare these values in your SQL Connection program to determine if a row has been updated by another user or process. You can then write your program to handle both successful and failed comparisons. Verifying unique row identifiers
- provides an additional safeguard for data integrity.
- enables you to write an application that locks data resources more efficiently with lower overhead than would be achieved by depending on the database’s inherent locking mechanism. This increased efficiency is achieved because with unique row identifiers you can lock resources on a row-by-row basis, affecting only those rows where the unique identifier hasn’t changed and locking them only for the duration of the update transaction. Using unique row identifiers can reduce contention for data in a multi-user application and thereby lead to increased throughput for all other application requests.
- improves performance for positioned updates on some databases, such as Oracle. Because SQL Connection can use the unique row identifier for an update, the database doesn’t have to use the index a second time to locate the fetched row via a WHERE clause.
SQL Connection has a convenient method for using unique row identifier information as a condition for update. See Using SQL Connection’s automatic verification for details.
MySQL and optimistic locking
To verify a unique row identifier for a MySQL database, use the method described in Using SQL Connection’s automatic verification or use a CURRENT_TIMESTAMP clause with a timestamp column in an ON UPDATE statement. For an example of the latter, see the exam_create_table.dbl sample program included in the Connectivity Series distribution.
SQL Server and optimistic locking
SQL Server provides the following methods for client-side concurrency control:
- Cursors, as defined by SQL Server, enable applications to perform a positional fetch within result sets and update rows. The client and server data-access components work in unison to manage concurrent data access and consistency. See %SSC_CMD and SQL Server documentation for more information.
- At a lower level, a rowversion pseudo column enables you to manually control data concurrency. See Using a rowversion column below.
- Globally unique identifiers (GUIDs) also enable you to manually control data concurrency. See Using GUID columns below.
You can use the method described in Using SQL Connection’s automatic verification, but for performance reasons, it’s not recommended for SQL Server.
With SQL Server, you can use a rowversion column to ensure data integrity. The rowversion column has a user-defined varbinary(8) data type and is updated with the current date and time when an INSERT or UPDATE command is executed. When creating a table, you can optionally specify a rowversion column, but note that a table can have only one such column and that rowversion columns are accessible to client applications only as read-only columns. (In addition to identifying the row, these IDs also identify a version of the row—i.e., the state of a given row at a given time. If you change the contents of a row, the row’s ID column will get a new value. Think of it as an RFA, a record file address, that changes every time a column in the row is updated.)
Rowversion columns are particularly useful for synchronizing multiple remote databases that are replications of a central database. For example, when data in several remote databases is committed to a large central database on a periodic basis, the rowversion column can be used to verify that modified data is not overwritten with earlier data.
Note the following:
- In SQL statements, you must enter the column name as rowversion or timestamp in lowercase letters.
- If you do not specify the rowversion column when creating a table in a SQL Server database, you will get an error with SQL Connection when using SSQL_FORUPDATE.
See SQL Server documentation for more information.
You can use the NEWID() Transact-SQL function to create a globally unique identifier value (GUID) for a row. To produce a unique identifier value in an inserted row, either the table must have a DEFAULT clause specifying the NEWID() function, or this function must be included in the INSERT statement (which is not necessary for a rowversion column). However, unlike a rowversion column, you can fetch GUID column values with a SELECT statement. (Rowversion columns are invisible to SELECT queries.)
The following example demonstrates how to create a unique identifier both automatically and manually. The first INSERT statement automatically creates a NEWID() value for the GUID column (triggered by the DEFAULT NEWID() clause in the CREATE TABLE command). The second INSERT manually generates the value.
CREATE TABLE e_anniv ( guid UNIQUEIDENTIFIER CONSTRAINT Guid_Default DEFAULT NEWID(), Start_Date DATETIME, Employee_Name VARCHAR(60), ) GO INSERT INTO e_anniv (Start_Date, Employee_Name) VALUES ('7/1/1976','John') INSERT INTO e_anniv VALUES (NEWID(), '3/8/1982','Mary') GO
Note that you should use an a16 to store GUIDs in your SQL Connection program, and if you use a GUID in a stored procedure, you must use SSQL_EXBINARY.
Oracle Server and optimistic locking
For Oracle, you cannot write client-side concurrency code for optimistic locking unless you are using Oracle 10g or later with the ORA_ROWSCN column when the table is set up with ROWDEPENDENCIES enabled. In this case, you can retrieve the SCN in a binary field and use it in a WHERE clause.
You can, however, use ROWID values to identify the current row during updates and deletes. ROWID has the format BBBBBBBB.RRRR.FFFF (hexadecimal), where BBBBBBBB is the block in the database file, RRRR is the row in the block (0 = first row), and FFFF is the database file. For example, 0000000E.000A.0007 denotes the 11th row in the 15th block in the 7th file.
Note the following:
- Adding a FOR UPDATE clause locks fetched rows.
- If you want to use COMMITs and FETCHs with two different cursors, do not use the CURRENT OF clause. CURRENT OF links cursors, so the COMMIT statement will close the operation and make good all updates. You will also lose the selected result set. Instead, select the ROWID and use that value to identify the current row during the update or delete.
Oracle Rdb and optimistic locking
Oracle Rdb uses DBKEY as a unique row identifier. DBKEY values are binary values. When you access a row by DBKEY, the database system can retrieve, delete, or update that row directly, without accessing an index or sequentially scanning a table row by row.
By default, DBKEY values are guaranteed to be valid until you end the transaction in which you retrieve them. However, DBKEY values will remain valid until you detach from the database if you include the clause DBKEY SCOPE IS ATTACH when you declare the schema for the database to which the DBKEY values belong.
To specify a DBKEY as a value expression, use the keyword DBKEY. This keyword is valid only in a selection list (to retrieve DBKEY values) or in a basic predicate with the equal (=) operator (to access rows by the DBKEY values your program retrieves). For example, your program might contain the following types of statements for accessing an Oracle Rdb database:
SELECT col_1, col_2, DBKEY INTO col_1_parm, col_2_parm, dbkey_parm FROM table_a WHERE col_1 = input_retrieval_parm UPDATE table_a SET col_2 = update_parm WHERE DBKEY = dbkey_parm
Even though DBKEY values are stored in binary format in the schema, you must declare the host variable your program will use to store a DBKEY as a character string. The size of the string depends on which operating system Oracle Rdb is running. Check Oracle documentation for the required length of the character string for each operating system. |
IBM DB2
IBM DB2 does not provide any client-side concurrency control or unique column identifier. Front-end applications must use SELECT FOR UPDATE statements. We recommend that you always use SSQL_FORUPDATE when using SQL Connection with DB2.
Informix and optimistic locking
With Informix, the ROWID keyword can be used in RDSQL statements to refer to the C-ISAM record number associated with a row in a database table. ROWID can be thought of as a hidden column in every table. SELECT statements will not return ROWID values unless you specify ROWID in the column list. The following example returns the ROWID value for each row:
SELECT ROWID, * FROM table
The next statement, however, does not return ROWID:
SELECT * FROM table
ROWID can also be used in WHERE clauses to select rows based on their C-ISAM record number. This feature is especially useful when there are no other unique columns in a table. For example:
SELECT ROWID, * FROM table WHERE ROWID > 7
Note that if a row is deleted from the table, its ROWID may be assigned to a new row.
Using SQL Connection’s automatic verification
SQL Connection has a convenient method for using unique row identifier information as a condition for update. This method frees you from having to code row identifier comparisons in your SQL Connection program. With this method, the unique row identifier information is saved when the row is fetched, and then it is automatically used as a condition for the update. You can think of it as an under-the-hood WHERE clause that compares the fetched row identifier with the row identifier that is there at the update—something like “WHERE current_row_id = fetched_row_id”.
1. | Use a SELECT FOR UPDATE statement and specify the SSQL_FORUPDATE and SSQL_POSITION options in your call to %SSC_OPEN. |
2. | Use %SSC_SQLLINK in conjunction with %SSC_EXECUTE. |
Note that for performance reasons we don’t recommend using SELECT FOR UPDATE statements with SQL Server (see note in Row locking). However, if you do use this method, you must have both a rowversion (timestamp) column (see Using a rowversion column) and a unique index.