S_PARSE
Parse a string and return information about it
Supported on Windows, Unix, and OpenVMS, and in Synergy .NET
xcall S_PARSE(string, start, dimension, item_position, item_length, item_type, #items, end [, no_quote])
Arguments
string
The string to parse. (a)
start
The beginning parse position within string. (n)
dimension
The dimension of the item position, item length, and item type arrays. (n)
item_position
The first element of the item position array. (n)
item_length
The first element of the item length array. (n)
item_type
The first element of the item type array. (n)
#items
The variable that will be loaded with the number of items parsed. (n)
end
The variable that will be loaded with either the ending parse position (if string contained more than dimension items) or zero (if all items were parsed). (n)
no_quote
(optional) Overrides the default handling of quote characters. (n)
Discussion
The S_PARSE subroutine parses a string and loads three arrays with information about each item (or token) in the string.
String is an alphanumeric literal or variable. Each item in string will be parsed, and the item characteristics will be loaded into item_position, item_length, and item_type.
The starting position (base one), the length, and the type of each token are loaded into the item position, item length, and item type arrays, respectively. Each array must have at least dimension elements. In other words, item_position(1), item_length(1), and item_type(1) describe the first item within string; item_position(2), item_length(2), and item_type(2) describe the second item; and so forth.
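Because the position and length arrays are parallel, the text of an item can be recovered with a ranged reference into the original string. The following is a minimal sketch rather than a definitive implementation: the record layout mirrors the program in the Examples section, the sample contents of line are made up for illustration, and the check for a zero length is a precaution for empty quoted strings.

```dbl
.define TTCHN   ,1

record
    line        ,a40, "RATE = 'net pay' + 12.50"   ;sample text (illustrative only)
    start       ,i4, 1
    dim         ,i4, 20
    pos         ,20i4
    len         ,20i4
    type        ,20i4
    items       ,i4
    end         ,i4
    ix          ,i4
proc
    open(TTCHN, o, "tt:")
    xcall s_parse(line, start, dim, pos, len, type, items, end)
    for ix from 1 thru items
      begin
        if (len(ix) .gt. 0) then
          ;line(position:length) selects the characters of item ix
          writes(TTCHN, "item " + %string(ix) + " = [" +
          &      line(pos(ix):len(ix)) + "]")
        else
          writes(TTCHN, "item " + %string(ix) + " is empty")
      end
    close TTCHN
    stop
end
```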
The type of each item is coded as follows. The I_ mnemonics are defined by the compiler.
Item Type Coding

Item | Mnemonic | Description |
---|---|---|
1 | I_ANUM | Case-insensitive alpha character, followed by zero or more case-insensitive alphanumeric characters. |
2 | I_IDENT | Case-insensitive alpha character, followed by one or more case-insensitive alphanumeric, dollar sign ($), or underscore (_) characters, with at least one of the characters being a dollar sign or an underscore. |
3 | I_DIGIT | One or more decimal digits. |
4 | I_FIXED | One or more decimal digits, followed by a period (.), followed by one or more decimal digits. |
5 | I_SPACE | One or more spaces and/or tabs. |
6 | I_SQUOTE | String enclosed in single quotation marks. |
7 | I_DQUOTE | String enclosed in double quotation marks. |
8 | I_SPECIAL | Any single character that is not part of one of the other item types. |
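One way to act on the item type codes is to dispatch on each element of the item type array. The sketch below assumes the I_ mnemonics are available to the program as described above; the grouping of types into "name", "number", and so on is purely illustrative, as is the sample text in line.

```dbl
.define TTCHN   ,1

record
    line        ,a40, "qty_1 = 3 * 2.5"     ;sample text (illustrative only)
    start       ,i4, 1
    dim         ,i4, 20
    pos         ,20i4
    len         ,20i4
    type        ,20i4
    items       ,i4
    end         ,i4
    ix          ,i4
proc
    open(TTCHN, o, "tt:")
    xcall s_parse(line, start, dim, pos, len, type, items, end)
    for ix from 1 thru items
      begin
        using type(ix) select
        (I_ANUM, I_IDENT),
          writes(TTCHN, %string(ix) + ": name")
        (I_DIGIT, I_FIXED),
          writes(TTCHN, %string(ix) + ": number")
        (I_SQUOTE, I_DQUOTE),
          writes(TTCHN, %string(ix) + ": quoted string")
        (I_SPACE),
          nop                               ;ignore white space
        (),
          writes(TTCHN, %string(ix) + ": special character")
        endusing
      end
    close TTCHN
    stop
end
```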
Parsing continues either until all items within string have been parsed or until dimension items have been parsed. If dimension items are parsed and string still contains more items, end is the base one position within string at which the next item begins. (In other words, passing end as start on another S_PARSE call will continue the parsing.) If all of the items within string are parsed, end is returned with a value of zero.
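The continuation rule can be sketched as a loop that feeds end back in as start until end comes back as zero. This is an illustrative outline only: the arrays are deliberately small (five elements) so that several calls are needed, and total is a hypothetical counter added for the sketch.

```dbl
.define TTCHN   ,1

record
    line        ,a80
    start       ,i4
    dim         ,i4, 5              ;small on purpose, to force several calls
    pos         ,5i4
    len         ,5i4
    type        ,5i4
    items       ,i4
    end         ,i4
    total       ,i4                 ;hypothetical running count of items
proc
    open(TTCHN, o, "tt:")
    writes(TTCHN, "Enter a string to parse: ")
    reads(TTCHN, line)
    start = 1
    total = 0
    repeat
      begin
        xcall s_parse(line, start, dim, pos, len, type, items, end)
        total = total + items
        ;pos(1) thru pos(items), and the matching len and type
        ;elements, would be processed here
        if (end .eq. 0)
          exitloop                  ;all items were parsed
        start = end                 ;resume where the previous call stopped
      end
    writes(TTCHN, %string(total) + " items parsed in total")
    close TTCHN
    stop
end
```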
When a quoted string item is parsed, the item_position array element is the position of the first character that follows the double or single quotation mark, and item_length is the length of the item up to the closing quotation mark. Thus, the delimiters of a quoted string are the only characters within string that aren’t enclosed in one of the items.
Note that quoted strings can be implicitly terminated by the end of string, and that it is possible to have a quoted string with a length of zero (two successive quote characters).
If you want successive quotes in a string to represent a single occurrence of that quote (for example, 'O''Leary' to represent "O'Leary"), the calling program must detect successive occurrences of I_DQUOTE or I_SQUOTE items. In particular, if one of these items is the last item parsed, and more items are on the line, the item should be "pushed back" before the next call to S_PARSE. In other words, set start to the following rather than end for the next call, and don't process the last item on the current call:
item_position(dimension) - 1
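The push-back technique might look like the sketch below. Several things here are assumptions made for illustration: the sample text, the five-element arrays (chosen so that a quoted item lands last on the first call), the count variable, and the adjacency test. The test relies on the positions described earlier: if one quoted item ends at its closing quotation mark and the next begins immediately with an opening quotation mark, the second item's position is the first item's position plus its length plus 2. Only single quotation marks are checked for doubling; double quotation marks would be handled the same way. The arrays need at least two elements, or a pushed-back item would be the last item again on every call.

```dbl
.define TTCHN   ,1

record
    line        ,a40, "NAME = 'O''Leary'"   ;sample text (illustrative only)
    start       ,i4, 1
    dim         ,i4, 5
    pos         ,5i4
    len         ,5i4
    type        ,5i4
    items       ,i4
    end         ,i4
    count       ,i4                 ;items actually processed on this call
    ix          ,i4
proc
    open(TTCHN, o, "tt:")
    repeat
      begin
        xcall s_parse(line, start, dim, pos, len, type, items, end)
        count = items
        start = end
        if (end .ne. 0)
          begin
            if (type(items) .eq. I_SQUOTE .or. type(items) .eq. I_DQUOTE)
              begin
                count = items - 1         ;don't process the last item yet
                start = pos(items) - 1    ;push back: restart at its opening quote
              end
          end
        for ix from 1 thru count
          begin
            if (ix .lt. items)            ;a following item exists on this call
              begin
                if (type(ix) .eq. I_SQUOTE .and. type(ix+1) .eq. I_SQUOTE
                &   .and. pos(ix+1) .eq. pos(ix) + len(ix) + 2)
                  writes(TTCHN, "doubled quote after item " + %string(ix))
              end
          end
        if (end .eq. 0)
          exitloop
      end
    close TTCHN
    stop
end
```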
On completion, #items is returned with the number of items that were parsed.
If no_quote is passed and nonzero, S_PARSE returns I_SPECIAL for single and double quotation marks instead of I_SQUOTE and I_DQUOTE.
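To see the effect of no_quote, the same string can be parsed twice, once without the argument and once with a nonzero value. This is a small sketch with made-up sample text. Following the rules above, the first call should report the quoted text as one I_SQUOTE item, while the second should report each quotation mark as a separate I_SPECIAL item, so the item counts are expected to differ.

```dbl
.define TTCHN   ,1

record
    line        ,a20, "SAY 'HI'"    ;sample text (illustrative only)
    start       ,i4, 1
    dim         ,i4, 20
    pos         ,20i4
    len         ,20i4
    type        ,20i4
    items       ,i4
    end         ,i4
proc
    open(TTCHN, o, "tt:")
    ;default handling: quoted strings are single items
    xcall s_parse(line, start, dim, pos, len, type, items, end)
    writes(TTCHN, "default:  " + %string(items) + " items")
    ;no_quote nonzero: quotation marks are returned as I_SPECIAL
    xcall s_parse(line, start, dim, pos, len, type, items, end, 1)
    writes(TTCHN, "no_quote: " + %string(items) + " items")
    close TTCHN
    stop
end
```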
S_PARSE is affected by the case value of the LOCALIZE routine wherever the operation depends on (or is independent of) the case of a character.
Examples
    .define TTCHN   ,1

    record
        line        ,a20
        start       ,i4, 1
        dim         ,i4, 20
        pos         ,20i4
        len         ,20i4
        type        ,20i4
        items       ,i4
        end         ,i4
        ix          ,i4
    proc
        open(TTCHN, o, "tt:")
        writes(TTCHN, "Enter a string to parse: ")
        reads(TTCHN, line)
        xcall s_parse(line, start, dim, pos, len, type, items, end)
        writes(TTCHN, %string(items) + " items parsed")
        if (end) then
          writes(TTCHN, %string(end) + " is the end position")
        else
          writes(TTCHN, "All items were parsed!")
        for ix from 1 thru items            ;Display contents of arrays
          writes(TTCHN, "item " + %string(ix) + " pos=" +
          &      %string(pos(ix), "ZX") + " len=" +
          &      %string(len(ix)) + " type=" +
          &      %string(type(ix)))
        close TTCHN
        stop
    end
Let’s assume the following line is input:
FRED EARNS $17/HR
The program above produces the following output (based on the recommended .DEFINEs):
    Enter a string to parse: FRED EARNS $17/HR
    9 items parsed
    All items were parsed!
    item 1 pos= 1 len=4 type=1
    item 2 pos= 5 len=1 type=5
    item 3 pos= 6 len=5 type=1
    item 4 pos=11 len=1 type=5
    item 5 pos=12 len=1 type=8
    item 6 pos=13 len=2 type=3
    item 7 pos=15 len=1 type=8
    item 8 pos=16 len=2 type=1
    item 9 pos=18 len=3 type=5