Functions on strings

Introduction

Character strings (text, string) can occur as part of a device protocol, e.g. as an error message from a subsystem or as measurement data transmitted via a serial interface. Many web-based communication protocols are also based on strings.

The math module supports the scalar type string, which typically contains characters encoded with UTF-8.

All data types can be converted into the string type, for example with str(). Conversely, strings can be converted into the corresponding target types with the other cast operators, provided that the text contains no syntactical errors for the target type.

Operators on text

The following operators are supported for the string data type:

- All comparison operators. The numerical order of the ASCII characters applies, e.g. '4' < 'A' < 'a'.
- The + operator, which concatenates two strings:
```
s_connected = s1 + s2;
```
tip
str(x) converts any data type into a string representation, including multidimensional objects such as vectors and matrices. To get more control over the formatting, the function sFormat(f, ...) can be helpful.
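
The ordering rule for the comparison operators can be checked in any language that compares strings by character code. The following Python lines are only an analog of the behavior described above, not the math module itself; the variable names are chosen for illustration.

```python
# ASCII codes determine the order: digits < upper case < lower case.
codes = [ord(ch) for ch in "4Aa"]   # [52, 65, 97]
ordered = "4" < "A" < "a"           # True

# Consequence: upper-case text sorts before lower-case text.
zebra_first = "Zebra" < "apple"     # True
```

If a case-insensitive order is needed, both operands can be converted to the same case before comparing.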

Properties and basic functions

Length of a character string "sLength"

The length of a character string s in bytes is calculated as follows

l = sLength(s);

In UTF-8, characters outside the ASCII range (e.g. special characters) occupy several bytes. This function returns the number of bytes, not the number of (printable) characters!
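
The difference between bytes and characters can be illustrated with a short Python analog. It does not call the math module; `byte_length` is a hypothetical helper that mimics the byte count described for sLength().

```python
def byte_length(s: str) -> int:
    """Number of bytes that s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))

text = "5 °C"              # the degree sign needs two bytes in UTF-8
chars = len(text)          # 4 characters
size = byte_length(text)   # 5 bytes
```

For pure ASCII text both numbers are identical, so the distinction only matters for text with non-ASCII characters.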

Extraction of a substring as a copy "sCopy"

A substring r of the string s can be extracted as follows

r = sCopy(s, b); // Copy of the character string s from position b
r = sCopy(s, b, c); // Copy of the string s from position b of length c
r = sCopy(s, BC); // BC is a vector of length 2 that contains b and c (behavior as with sCopy(s, b, c))
r = sCopy(s, BC, sel); // BC is a matrix with two columns corresponding to b and c,
                        // where sel represents a selective row index

In many cases, the matrix BC is the result of a search function, e.g. sFind(), which can return several hits in the string. The corresponding text part can then be extracted directly via the row index sel.

Example:

//     0         1         2         3
//     0123456789012345678901234567890123456789
geo = 'lat: 51.234567, long: 12.3456789';
p6 = sFind(geo, 'lat:\s*([0-9.+-]+)', {subex: true});
// => p6 := [ [ 0,14]   // refers to 'lat: 51.234567'
//          , [ 5, 9]]; // refers to '51.234567'
lat = dbl(sCopy(geo, p6, 1));
// => lat := 51.234567;
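
The way a row of a position/length matrix selects a substring can be sketched in Python. This is an analog under assumptions, not the math module's implementation: `copy_from`, `copy_row` and the hard-coded matrix are illustrative, and Python counts characters where the math module may count bytes (identical for ASCII text).

```python
def copy_from(s, b, c=None):
    """Copy of s from position b, optionally limited to c characters."""
    return s[b:] if c is None else s[b:b + c]

def copy_row(s, bc, sel):
    """Copy the part of s described by row sel of the [position, length] matrix bc."""
    pos, length = bc[sel]
    return copy_from(s, pos, length)

geo = "lat: 51.234567, long: 12.3456789"
p6 = [[0, 14], [5, 9]]              # matrix as shown in the example above
whole = copy_row(geo, p6, 0)        # 'lat: 51.234567'
lat = float(copy_row(geo, p6, 1))   # 51.234567
tail = copy_from(geo, 16)           # 'long: 12.3456789'
```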

Extraction of the start and end of the string "sLeft", "sRight"

The start b and the end e of the string s, each of length l, can be obtained as copies as follows

b = sLeft(s,l);
e = sRight(s,l);

Remove a substring "sErase"

To obtain a copy of a string s from which a substring has been removed, the following can be used

r = sErase(s, b); // Copy of the character string s without characters from position b
r = sErase(s, b, c); // Copy of the string s without characters from position b of length c
r = sErase(s, BC); // BC is a vector of length 2 that contains b and c (behavior as with sErase(s, b, c))
r = sErase(s, BC, sel); // BC is a matrix with two columns corresponding to b and c,
                         // where sel represents a selective row index

In many cases, the matrix BC is the result of a search function, e.g. sFind(), which can return several hits in the string. The corresponding text part can then be removed directly via the row index sel.

Insert a string "sInsert"

To insert a string t within a string s at position b and return a copy r, the following can be used

r = sInsert(s, b, t);
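
Removing and inserting can both be expressed as slicing around the position b. The following Python sketch is an analog of the behavior described for sErase() and sInsert(), assuming zero-based positions; the helper names `erase` and `insert` are hypothetical.

```python
def erase(s, b, c=None):
    """Copy of s without the characters from position b (only c of them if c is given)."""
    return s[:b] if c is None else s[:b] + s[b + c:]

def insert(s, b, t):
    """Copy of s with t inserted at position b."""
    return s[:b] + t + s[b:]

s = "Hello world"
without_tail = erase(s, 5)          # 'Hello'
without_hello = erase(s, 0, 6)      # 'world'
with_big = insert(s, 6, "big ")     # 'Hello big world'
```

In both cases the original string stays unchanged and a new string is returned, which matches the "copy" wording used in this section.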

Create, format, read, clean up

Extraction of numerical values from a string "sScan"

Under construction: this function is in preparation.

To extract numerical values from a string s and return a vector v of these values, the following can be used

v = sScan(f, s);

Here, f represents a format string in the style of scanf().

Formatting of a string "sFormat"

The following function formats its arguments, e.g. numerical values, into a string r according to the format string f

r = sFormat(f, ...);

Here, f represents a format string in the style of printf(). Each placeholder has the form %[<width>][.<precision>]<type>, and a corresponding argument must be specified for each placeholder.

<width>: optional field width of the output; width < 0: left-aligned
.<precision>: optional for f/g/e; number of decimal places or significant digits

<type>: The arguments are converted according to the selected output type. Additional length specifications for the data type are therefore not required.

| `<type>` | Data type | Output |
| --- | --- | --- |
| b | `<bool>` | Output of 'false' or 'true' |
| d | `<int>` | Signed integer, base 10 |
| u | `<uint>` | Unsigned integer, base 10 |
| x, X | `<uint>` | Unsigned integer, base 16, with lower-case (x) or upper-case (X) digits `A`-`F` |
| o | `<uint>` | Unsigned integer, base 8 |
| f | `<dbl>` | Floating point number without power-of-ten exponent. Example: 1234.5678 |
| g | `<dbl>` | Floating point number in optimized representation like f or e, depending on the order of magnitude of the number. Examples: 1234.45678, 1.2345e9, 1.2345e-12 |
| e | `<dbl>` | Floating point number with power-of-ten exponent. Example: 1.2345678e3 |
| s | `<str>` | Text |

Example:

template = 'alt: %6.2f, lat: %.9f, lon: %.9f';
tx = sFormat(template, 140.4, 49.8765432, -3.14);
// tx := 'alt: 140.40, lat: 49.876543200, lon: -3.140000000';
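
Since the format string follows printf() conventions, the effect of width, precision and type can be tried out with Python's `%` operator, which uses the same placeholder style. This is only an analog: Python has no `b` type for booleans, and left alignment is written with the printf flag `-` in front of the width.

```python
template = "alt: %6.2f, lat: %.9f, lon: %.9f"
tx = template % (140.4, 49.8765432, -3.14)
# tx == 'alt: 140.40, lat: 49.876543200, lon: -3.140000000'

left = "[%-6d]" % 42             # '[42    ]'  left-aligned in a field of width 6
right = "[%6d]" % 42             # '[    42]'  right-aligned (default)
hexa = "%x / %X" % (255, 255)    # 'ff / FF'
octal = "%o" % 8                 # '10'
expo = "%e" % 1234.5678          # '1.234568e+03'
```

The number of placeholders and arguments has to match; in Python a mismatch raises a TypeError, and how sFormat() reacts in that case is not stated here.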

Access to the content of a resource "sResource"

The function returns the content of the designated resource as a constant (binary) string.

resStr = sResource({@ref:'myResource'});

warning

Depending on the configuration, the content of the resource may not only contain pure UTF-8 text (e.g. an INI file or a JSON object), but also binary data such as double arrays, jpg images, etc. Binary data in particular may only be further processed by suitable special functions. Please contact us for the implementation of such functions.

Remove whitespace "sTrim"

This function removes all whitespace at the beginning and end of its argument:

s_trimmed = sTrim(s);

Conversion to upper/lower case "sUpper", "sLower"

The conversion of a string with regard to upper/lower case is carried out using

s_uppercase = sUpper(s);
s_lowercase = sLower(s);

Simplification "sSimplify" of strings

The function sSimplify() replaces all multiple occurrences of white space (spaces, tabs, CR, LF, ...) with a single space.

s_simple = sSimplify(s);
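
The difference between trimming and simplifying can be sketched with Python's standard library. The helper `simplify` is hypothetical and follows the description literally: runs of white space are collapsed, but leading and trailing white space is not removed. Whether sSimplify() additionally trims is not stated here.

```python
import re

def simplify(s):
    """Replace every run of white space (spaces, tabs, CR, LF, ...) with one space."""
    return re.sub(r"\s+", " ", s)

raw = "  error:\t code\r\n 42  "
trimmed = raw.strip()              # 'error:\t code\r\n 42'  (inner white space kept)
simple = simplify(raw)             # ' error: code 42 '
both = simplify(raw).strip()       # 'error: code 42'
```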

Normalization "sNormalize" of strings

The function sNormalize() removes matching quotation marks "..." or '...' at the beginning and end of the string and replaces all backslash escape sequences (\...) with the characters they designate.

s_normalized = sNormalize(s);
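
The two steps of the normalization can be sketched in Python as follows. This is an illustration under assumptions: the set of escape sequences in `ESCAPES` is chosen for the example and is not taken from the math module, and unknown sequences are simply left unchanged.

```python
import re

# Assumed subset of escape sequences; the real function may support more.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\", '"': '"', "'": "'"}

def normalize(s):
    # Step 1: remove one pair of matching quotation marks around the string.
    if len(s) >= 2 and s[0] == s[-1] and s[0] in "\"'":
        s = s[1:-1]
    # Step 2: replace each backslash sequence with the character it designates.
    return re.sub(r"\\(.)",
                  lambda m: ESCAPES.get(m.group(1), m.group(0)),
                  s, flags=re.DOTALL)

tabbed = normalize('"a\\tb"')       # 'a', TAB, 'b'
plain = normalize("'x'")            # 'x'
mixed = normalize("\"it's\"")       # "it's" without the outer double quotes
```

Quotation marks that do not match, or that appear on only one side, are kept.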

Search, find and disassemble

Search for substrings "sFind"

The search function sFind searches the string s for the pattern p. It returns either the index r of the hit or a matrix with the position and length of the text parts found, and can be used as follows

r1 = sFind(s, {...}); // configuration of the 'pattern' is mandatory
r1 = sFind(s,p); // direct search from the start of the string
r2 = sFind(s,p,b); // direct search from position b
// Optional configuration object for all variants
rx = sFind(..., { pattern: <string>
                , case: <bool>
                , all: <bool>
                , regex: <bool>
                , subex: <bool>
                });

If the pattern is not found, this function returns -1.

A matrix with the search results has the following structure, where s_i is the starting position and c_i the length of the i-th text part found:

rx = \left[ \begin{array}{cc} s_0 & c_0 \\ s_1 & c_1 \\ \vdots & \vdots \\ s_{n-1} & c_{n-1} \end{array} \right]

This matrix can be passed directly, together with the row index of the individual result, to the functions sCopy() or sErase().

A single result can be further processed with the function GetRow() or by directly specifying the indexing:

part4 = sCopy(s, rx, 3); // counting from zero
part4a = sCopy(s, rx[3,0], rx[3,1]); // equivalent
part4_sc = GetRow(rx, 3); // vector [s3, c3]
part4b = sCopy(s, part4_sc); // equivalent

| Property | Value | Description |
| --- | --- | --- |
| pattern | `<str>` | Constant search pattern if parameter p is not used. This is somewhat more performant in the regex modes, as the expression does not have to be translated again for every call. |
| case | `<bool>` | Case-sensitive search (def.: false, case-insensitive) |
| all | `<bool>` | Returns the starting positions and lengths of all matching text parts; cannot be combined with subex |
| regex | `<bool>` | Interprets pattern or p as a regular expression; the result is a matrix with the position and length of the expressions/sub-expressions found |
| subex | `<bool>` | Returns position and length of the extracted text parts in a result matrix; sets regex automatically |

Examples

//     0         1         2         3
//     0123456789012345678901234567890123456789
str = 'Hello world and hello my dear friends!';

p0 = sFind(str, "dog");
// => p0 := -1; // not found, independent of the selected modes
// may be tested with (p0 < 0) ? ... : ...
// or isMatrix(p0) ? ... : ...     for 'all' or 'regex' modes

p1 = sFind(str, "hello");
// => p1 := 0;

p2 = sFind(str, "hello", {case: true});
// => p2 := 16; // first one is now skipped

p3 = sFind(str, "hello", {all: true});
// => p3 := [ [ 0, 5]
//          , [16, 5]];

p4 = sFind(str, "(and|dear)", { regex: true, all: true});
// => p4 := [ [12, 3]
//          , [25, 4]];

p5 = sFind(str, 'dear\s+(\w+)', { subex: true });
// => p5 := [ [25, 12]   // first row is the complete match
//          , [30,  7]]; // then the (...) extractions follow

dear = sCopy(str, p5, 1);
// => dear := 'friends';

//     0         1         2         3
//     0123456789012345678901234567890123456789
geo = 'lat: 51.234567, long: 12.3456789';
p6 = sFind(geo, 'lat:\s*([0-9.+-]+)', {subex: true});
// => p6 := [ [ 0,14]
//          , [ 5, 9]];
lat = dbl(sCopy(geo, p6, 1));
// => lat := 51.234567;
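
The results of the examples above can be reproduced with a small Python analog built on the `re` module. It is a sketch under assumptions, not the math module's implementation: the function `find` and its keyword names are hypothetical, positions are counted in characters, and the return value for `regex` without `find_all` (a one-row matrix) is a guess.

```python
import re

def find(s, pattern, case=False, find_all=False, regex=False, subex=False):
    """Return an index, a [[position, length], ...] matrix, or -1 if nothing is found."""
    flags = 0 if case else re.IGNORECASE
    # Plain patterns are escaped so that they are matched literally.
    rx = re.compile(pattern if (regex or subex) else re.escape(pattern), flags)
    if subex:
        m = rx.search(s)
        if m is None:
            return -1
        # Row 0 is the complete match, the following rows are the (...) groups.
        return [[m.start(i), m.end(i) - m.start(i)] for i in range(rx.groups + 1)]
    hits = [[m.start(), m.end() - m.start()] for m in rx.finditer(s)]
    if not hits:
        return -1
    if find_all:
        return hits
    return [hits[0]] if regex else hits[0][0]

text = "Hello world and hello my dear friends!"
p1 = find(text, "hello")                                   # 0
p2 = find(text, "hello", case=True)                        # 16
p3 = find(text, "hello", find_all=True)                    # [[0, 5], [16, 5]]
p4 = find(text, "(and|dear)", regex=True, find_all=True)   # [[12, 3], [25, 4]]
p5 = find(text, r"dear\s+(\w+)", subex=True)               # [[25, 12], [30, 7]]
```

Because the "not found" result is a scalar while the other modes return a matrix, callers have to check for -1 before indexing into the result, as noted in the comments of the first example.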

Replace substrings "sReplace"

To replace (several) character strings t_k within the character string s with the corresponding texts s_k, the following can be used

r = sReplace(s,t_1,s_1, ..., t_N,s_N);

The replacements are carried out sequentially in the order of the arguments. The values s_k are implicitly converted into a text representation. The sFormat() function can be used for more control over the conversion.

Example:

template = 'alt: <alt>, lat: <lat>, lon: <lon>';
tx = sReplace(template, '<alt>', 140.4, '<lat>', 49.8765432, '<lon>', -3.14);
// tx := 'alt: 140.400000, lat: 49.876543, lon: -3.140000';
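
Sequential replacement means that each pair works on the result of the previous pair. The following Python sketch illustrates this; `replace_sequential` is a hypothetical helper, and converting floating point values with `%f` is an assumption derived from the output shown in the example above.

```python
def replace_sequential(s, *pairs):
    """Replace target_1 by value_1, then target_2 by value_2, and so on."""
    if len(pairs) % 2:
        raise ValueError("targets and values must come in pairs")
    for target, value in zip(pairs[0::2], pairs[1::2]):
        # Assumed implicit conversion: floats as '%f', everything else via str().
        text = "%f" % value if isinstance(value, float) else str(value)
        s = s.replace(target, text)
    return s

template = "alt: <alt>, lat: <lat>, lon: <lon>"
tx = replace_sequential(template, "<alt>", 140.4, "<lat>", 49.8765432, "<lon>", -3.14)
# tx == 'alt: 140.400000, lat: 49.876543, lon: -3.140000'

# The order matters: the second pair also sees the output of the first pair.
chained = replace_sequential("a", "a", "b", "b", "c")    # 'c'

# Pre-formatted text gives full control over the conversion.
short = replace_sequential("alt: <alt>", "<alt>", "%.1f" % 140.4)   # 'alt: 140.4'
```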

Search for key-value combination "sGetKV"

Under construction: this function is in preparation.

This function searches a string (text block) for a key-value pair and returns the value.

v1 = sGetKV(s, k);
// Optional configuration block
vx = sGetKV(..., { format: <enum>
                 , assign: <str>
                 , ending: <str>
                 , quotes: <bool>
                 , trim: <bool>
                 });

| Property | Value | Description |
| --- | --- | --- |
| format | `<enum>` | Defines the format of the string. Possible values are: json, ini, csv |
| assign | `<regex>` | Separator between key and value (def.: `:`) |
| ending | `<regex>` | Separator between key-value entries (def.: `,]`) |
| quotes | `<bool>` | Automatic removal of quotation marks from keys and values (def.: true) |
| trim | `<bool>` | Automatic removal of white space at the beginning and end of a value (def.: true) |

Splitting character strings "sSplit"

The sSplit function can be used to split a character string at a character ch. This returns a two-column matrix of all start positions and lengths of the corresponding substrings

r1 = sSplit(s, ch);
// Optional configuration block
rx = sSplit(..., { trim: <bool>
                 });
pos = rx[0, 0];
len = rx[0, 1];

| Property | Value | Description |
| --- | --- | --- |
| trim | `<bool>` | Removes white space at the beginning and end of elements |
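
How a split can be reported as start positions and lengths instead of substrings is sketched below in Python. The helper `split_positions` is hypothetical; in particular, reporting empty elements and shifting the position when `trim` is set are assumptions about details not described here.

```python
def split_positions(s, ch, trim=False):
    """Return [[start, length], ...] for the parts of s separated by ch."""
    rows = []
    start = 0
    for part in s.split(ch):
        pos, length = start, len(part)
        if trim:
            stripped = part.strip()
            if stripped:
                # Skip the leading white space of this element.
                pos += len(part) - len(part.lstrip())
            length = len(stripped)
        rows.append([pos, length])
        start += len(part) + len(ch)
    return rows

record = "a, bb ,c"
raw_rows = split_positions(record, ",")                # [[0, 1], [2, 4], [7, 1]]
trim_rows = split_positions(record, ",", trim=True)    # [[0, 1], [3, 2], [7, 1]]
second = record[trim_rows[1][0]:trim_rows[1][0] + trim_rows[1][1]]   # 'bb'
```

Returning positions rather than copies keeps the result compatible with functions that accept a position/length matrix and a row index, such as sCopy().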

Number of lines within a string "sLines"

The number of lines in a string s can be determined as follows

lines = sLines(s);

All line break variants are supported (LF, CRLF, LFCR, CR).

Extraction of a line from a multi-line string "sLine"

The following function can be used to extract the i-th line from a string s

lineI = sLine(s,i);
// Optional configuration object
lineX = sLine(..., { trim: <bool>
                   });

| Property | Value | Description |
| --- | --- | --- |
| trim | `<bool>` | Removes white space at the beginning and end of the line |
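
Counting and extracting lines with mixed line break variants can be sketched in Python with one regular expression. The helpers are hypothetical, and two details are assumptions: the line index is counted from zero, and a trailing line break is counted as starting one more (empty) line.

```python
import re

# Two-character variants first, so that CRLF and LFCR count as one break.
EOL = re.compile(r"\r\n|\n\r|\r|\n")

def lines_of(s):
    return EOL.split(s)

def line_count(s):
    return len(lines_of(s))

def line(s, i, trim=False):
    text = lines_of(s)[i]
    return text.strip() if trim else text

log = "first\nsecond\r\n  third  \rfourth"
count = line_count(log)               # 4
third_raw = line(log, 2)              # '  third  '
third = line(log, 2, trim=True)       # 'third'
```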

Read in complete lines "sGetLine"

The sGetLine() function accumulates the individual values from the string channel s and stores them until a complete line can be output. This line is then removed from the buffer, and the accumulation continues with the remainder.

To obtain a complete line from a stream of characters, the following can be used

line1 = sGetLine(s);
// Optional configuration object
lineX = sGetLine(..., { trim: <bool>
                      , eoln: <str>
                      , timeout: <dbl>
                      });

| Property | Value | Description |
| --- | --- | --- |
| trim | `<bool>` | Additionally removes white space at the beginning and end of the output line |
| eoln | `<string>` | Definition of the line terminator |
| timeout | `<dbl>` | Time in seconds after the last fragment until a line end is inserted |
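
The accumulation described above can be sketched as a small buffer class in Python. This is an illustration only: `LineBuffer` and `feed` are hypothetical names, the `timeout` option is left out, and returning None while no complete line is available is an assumption.

```python
class LineBuffer:
    """Collects fragments and hands out one complete line at a time."""

    def __init__(self, eoln="\n", trim=False):
        self.eoln = eoln
        self.trim = trim
        self.pending = ""

    def feed(self, fragment):
        """Append a fragment; return a complete line, or None if none is ready."""
        self.pending += fragment
        head, sep, rest = self.pending.partition(self.eoln)
        if not sep:
            return None                 # terminator not seen yet, keep collecting
        self.pending = rest             # continue with the remainder
        return head.strip() if self.trim else head

buf = LineBuffer(eoln="\r\n", trim=True)
fragments = ["TEMP ", "21.5\r", "\nHUM 40", "\r\n"]
results = [buf.feed(part) for part in fragments]
# results == [None, None, 'TEMP 21.5', 'HUM 40']
```

Note that the terminator may itself arrive split across two fragments, as in the example, which is why the search runs over the whole buffer and not over the latest fragment alone.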
