4 Primitive Types

Wafl has four primitive types: Integer, Float, String and Bool. In this chapter we discuss the important elements of the primitive types.

This chapter is quite long and detailed, but its content is elementary, so we suggest a cursory read. It is enough to have an idea of what is supported, and later use this chapter as a reference as needed.

4.1 Literals

Integer and float literals have a syntax similar to that of the C and C++ programming languages.

Float literals must contain a decimal point and at least one digit before it.

Logical literal constants are true and false.

String literals are quoted with single or double quotes. Special characters are specified by escape sequences, as in C/C++. The most important escape sequences are: single quote (\'), double quote (\"), backslash (\\), new line (\n), carriage return (\r), horizontal tab (\t), vertical tab (\v), form feed (\f) and backspace (\b). As in C/C++, characters may be encoded by three octal digits: \nnn.

Examples of integer literals:

{#
    0, 42, -21
#}

{# 0, 42, -21 #}

Examples of float literals:

{#
    3.4, 0., 1.2e-3
#}

{# 3.4, 0, 0.0012 #}

Examples of bool literals:

{#
    true, false
#}

{# true, false #}

Examples of string literals:

{#
    'single quotes',
    "double quotes",
    "two\nlines",
    "octal codes A=\101 a=\141"
#}

{# 'single quotes', 'double quotes', 'two\012lines', 'octal codes A=A a=a' #}

4.2 Operators

4.2.1 Integer operators

Wafl has the usual arithmetic operators: addition (+), subtraction (-), multiplication (*), division (/), remainder (%), modulus (%%) and power (**).

The division of integer values always computes an integer result. The integer remainder operator (%) follows the sign of the dividend: it returns non-negative values for a positive dividend and non-positive values for a negative dividend. The modulus operator (%%) always returns a non-negative result:

{#
    17 / 10,
    17 % 10,
    17 %% 10,
    -17 / 10,
    -17 % 10,
    -17 %% 10,
    17 / -10,
    17 % -10,
    17 %% -10,
    -17 / -10,
    -17 % -10,
    -17 %% -10
#}

{# 1, 7, 7, -1, -7, 3, -1, 7, 7, 1, -7, 3 #}
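The results above are consistent with the usual identity that links integer division and remainder, (a / b) * b + a % b = a. The following sketch uses only operators shown in this section and checks the identity for all four sign combinations of 17 and 10:

```wafl
{#
    (17 / 10) * 10 + (17 % 10) = 17,
    (-17 / 10) * 10 + (-17 % 10) = -17,
    (17 / -10) * (-10) + (17 % -10) = 17,
    (-17 / -10) * (-10) + (-17 % -10) = -17
#}
```

Substituting the values from the result above, each comparison should evaluate to true. The modulus operator does not satisfy this identity for a negative dividend: in the result above, -17 %% 10 is 3, which is the remainder -7 increased by the absolute value of the divisor.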

Bit-level integer operators are syntactically and semantically equivalent to these operators in C/C++:

{#
    // '11110000' & '00111111' = '00110000' = 48
    240 & 63,   
    // '011' | '110' = '111' = 7
    3 | 6,      
    // bit-level complement
    ~5,
    // '11110' << 3 = '11110000' = 240
    30 << 3,   
    // '11111111' >> 3 = '11111' = 31
    255 >> 3
#}

{# 48, 7, -6, 240, 31 #}

The power operator a ** b evaluates a to the power of b. Any integer raised to a negative power evaluates to zero, except for one. One raised to any integer power always evaluates to 1:

{#
    2 ** 3,
    2 ** -3,
    1 ** 3,
    1 ** -3,
    -2 ** 3,
    -2 ** -3
#}

{# 8, 0, 1, 1, -8, 0 #}

Multiplication, division, remainder, modulus and bit-level conjunction have higher priority than addition, subtraction and bit-level disjunction. The power operator has the highest priority. Shift operators have the lowest priority.
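A short sketch of how these priority rules group mixed expressions. The comments show the grouping implied by the rules stated above. The example is illustrative and not taken from the Wafl distribution, so use explicit parentheses when in doubt:

```wafl
{#
    2 + 3 * 4,      // multiplication first: 2 + (3 * 4)
    2 * 3 ** 2,     // power first: 2 * (3 ** 2)
    6 | 1 & 3,      // conjunction first: 6 | (1 & 3)
    1 << 2 + 1      // shift last: 1 << (2 + 1)
#}
```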

4.2.2 Float Operators

Wafl has the usual float operators:

{#
    2.1 + 3.45678,
    3.0 - 1.2,
    3.14 * 2.17,
    17.0 / 10.,
    2.0 ** 3.0,
    2.0 ** 0.5
#}

{# 5.55678, 1.8, 6.8138, 1.7, 8, 1.414213562 #}

4.2.3 String operators

Wafl has a single binary string operator, concatenation (+):

"One" + "Two"

OneTwo

There are also indexing operators. They will be discussed with sequence types.

4.2.4 Bool operators

The usual logical operators are supported in both C-like and SQL-like syntax:

{#
    true or false,
    true || false,
    true and false,
    true && false,
    not true,
    !true
#}

{# true, true, false, false, false, false #}

4.2.5 Comparison Operators

The usual comparison operators are defined for Integer, Float and String types:

{#
    1 < 2,
    1.2 <= 2.3,
    "abcd" > "ABCD",
    "abcd" >= "AB",
    21 * 2 = 42,
    21 == 42 / 2,
    17 != 18,
    -3.14 <> 3.14
#}

{# true, true, true, true, true, true, true, true #}

Wafl has no variables and no assignments. The operator ‘=’ has only two roles: (1) to separate the definition name from the body and (2) as the equality operator. It can never be ambiguous, so there is no need to use the operator ‘==’ instead, although it is available for those who prefer it.

4.3 Conversion Functions

Wafl is a strongly typed programming language and no implicit type conversions are allowed. Thus, the Wafl core library contains the conversion functions asInt, asFloat, asString and asBool, described in the following sections.

There are some other conversion functions available for specific pairs of types.

It may seem strange that the names of these functions begin with as..., but it is quite natural if we expect to use them mainly with dot syntax.
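A minimal sketch of the conversion functions written with dot syntax. It assumes that x.f() means the same as f(x), as in the expression s.strSplitLines() used later in this chapter. The names x, s and n are illustrative:

```wafl
{#
    x.asInt(),
    s.asInt(),
    n.asString()
#}
where {
    x = 3.6;
    s = '3';
    n = 42;
}
```

Under that assumption, the result should match the prefix calls asInt(3.6), asInt('3') and asString(42), that is {# 4, 3, '42' #}.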

4.3.1 Conversion to Integer

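Function asInt(x) converts any non-integer primitive value x to the Integer type.

Additionally, for conversion from Float to Integer there are round, ceil and floor.

For conversion of String values to Integer there is also ascii, which returns the code of the first character of the string, or 0 for an empty string.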

Examples of conversions from Float to Integer:

{#
    asInt(3.6),
    asInt(-3.6),
    round(3.6),
    round(-3.6),
    ceil(3.6),
    ceil(-3.6),
    floor(3.6),
    floor(-3.6)
#}

{# 4, -4, 4, -4, 4, -3, 3, -4 #}

Examples of conversions from String to Integer:

{#
    asInt('3'),
    asInt('3.8'),
    asInt('abc'),
    ascii('abc'),
    ascii('')
#}

{# 3, 3, 0, 97, 0 #}

Examples of conversions from Bool to Integer:

{#
    asInt(true),
    asInt(false)
#}

{# 1, 0 #}

4.3.2 Conversion to Float

Function asFloat(x) converts any non-float primitive value x to Float type:

{#
    asFloat(7),
    asFloat('6.2'),
    asFloat('abc'),
    asFloat(true),
    asFloat(false)
#}

{# 7, 6.2, 0, 1, 0 #}

4.3.3 Conversion to String

There are four functions for conversion of values of other data types to strings:

Function / Type and Description

asString

('1 -> String)
Converts a value to a string.

asChar

(PrimeNotString['1] -> String)
Converts a value to a character.

toString

(Float * Int -> String)
Converts a value to a string with given precision.

asPreview

('1 -> String)
Converts a value to a shortened string.

Function asString(x) converts any non-string value x to String. It converts a value x to its string representation, according to the Wafl syntax.

There is a synonymous postfix operator $ with the same behavior.

“Any” means “any” - the function asString and postfix operator $ convert any Wafl value of any type to its String representation.

{#
    asString(3),
    asString(3.14),
    asString(true),
    asString(false),
    asString('123'),
    asString( {# 1, 2.3, "abc", {# true, 's' #} #} ),
    asString([1,2,3,4,5,6,7,8,9,10]),

    3$,
    3.14$,
    true$,
    false$,
    '123'$,
    {# 1, 2.3, "abc", {# true, 's' #} #}$,
    [1,2,3,4,5,6,7,8,9,10]$
#}

{# '3', '3.14', 'true', 'false', '123', '{# 1, 2.3, \'abc\', {# true, \'s\' #} #}', '[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]', '3', '3.14', 'true', 'false', '123', '{# 1, 2.3, \'abc\', {# true, \'s\' #} #}', '[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]' #}

Function asPreview(x) is similar to asString, but returns a shorter string. For simple data it behaves the same as asString. For larger structured data and longer strings, it extracts just a part of the complete string representation.

{#
    asPreview('01234567989'),
    asPreview( 
        '01234567890123456789012345678901234567890123456789'
        '01234567890123456789012345678901234567890123456789'
    ),
    asPreview([1,2,3,4,5,6,7,8,9,10]),
    asPreview(1..1000)
#}

{# '01234567989', '012345678901234567890123456789 ... 0123456789 (len=100)', '[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]', '[1, 2, 3, 4, 5, ..., 999, 1000] (len=1000)' #}

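Function asChar(x) converts a non-string primitive value to a single-character string. As the following example shows, an integer is converted to the character with that code, a float is first rounded to an integer, and the logical values are converted to 'T' and 'F':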

{#
    asChar(65),
    asChar(65.7),
    asChar(true),
    asChar(false)
#}

{# 'A', 'B', 'T', 'F' #}

Function toString(x,n) converts a float value x to a string with n digits after the decimal point.

{#
    toString(1234.56789,0),
    toString(1234.56789,1),
    toString(1234.56789,2),
    toString(1234.56789,3)
#}

{# '1235', '1234.6', '1234.57', '1234.568' #}
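Because no implicit conversions are allowed, numbers have to be converted explicitly before they are concatenated with strings. A small sketch that combines asString, toString and the string operator +, using only values already shown in this section:

```wafl
"Answer: " + asString(42) + ", ratio: " + toString(1234.56789, 2)
```

Given the results above, this should evaluate to the string Answer: 42, ratio: 1234.57.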

4.3.4 Conversion to Bool

Function asBool(x) converts any non-bool primitive value x to Bool type:

{#
    asBool( 2 ),
    asBool( -3 ),
    asBool( 0 ),
    asBool( 2.1 ),
    asBool( -3.2 ),
    asBool( 0.0 ),
    asBool( "true" ),
    asBool( "True" ), // this is not the same as "true"
    asBool( "T" ),
    asBool( "t" )     // this is not the same as "T"
#}

{# true, true, false, true, true, false, true, false, true, false #}

4.4 Integer Functions

The Wafl core library includes three integer functions: abs, sgn and random.

4.4.1 Integer Function abs

Integer function abs(x) computes the absolute value of the given integer x:

{#
    abs( 123 ),
    abs( -123 ),
    abs( -0 )
#}

{# 123, 123, 0 #}

4.4.2 Integer Function sgn

Integer function sgn(x) returns the sign of the number x. It returns 1 for positive values, -1 for negative values and zero for zero:

{#
    sgn( 20 ),
    sgn( -2 ),
    sgn( 0 )
#}

{# 1, -1, 0 #}

4.4.3 Function random

Integer function random(x) computes a random integer value in range [0,x-1]. In the following example, we compute 20 random values in range [0,4]:

{#
    random( 5 ), random( 5 ), random( 5 ), random( 5 ),
    random( 5 ), random( 5 ), random( 5 ), random( 5 ),
    random( 5 ), random( 5 ), random( 5 ), random( 5 ),
    random( 5 ), random( 5 ), random( 5 ), random( 5 ),
    random( 5 ), random( 5 ), random( 5 ), random( 5 )
#}

{# 0, 2, 0, 3, 0, 3, 1, 1, 0, 1, 3, 2, 2, 3, 0, 4, 0, 2, 4, 3 #}
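Since random(x) always produces values starting from zero, a different range can be obtained by adding an offset. A minimal sketch, with ranges chosen only for illustration:

```wafl
{#
    1 + random( 6 ),    // a die roll in range [1,6]
    10 + random( 11 )   // a value in range [10,20]
#}
```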

Randomizing

The default behavior is to initialize the random number generator seed on first use, using the current system timer. This is usually exactly what the programmer expects and requires.

However, sometimes the same sequence of random numbers is required on each program run (for debugging, benchmarking and similar cases). For such cases there is the clwafl command line option -nornd, which sets a predefined seed.

Executing the previous program using:

clwafl -nornd program.wafl

will always produce the same result.

If a program uses parallel evaluation, then the random sequence is not guaranteed to be repeatable. The generated sequence is the same, but the way different threads consume it may differ on each run.

4.5 Float Functions

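The Wafl core library includes the following float functions: abs, sgn, roundTo, exp, ln, log, log2, pow, sqrt and the trigonometric functions.

The conversion functions that work with Float values (such as asInt, round, ceil, floor, asFloat and toString) are already presented in the previous sections.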

4.5.1 Float Function abs

Float function abs(x) computes the absolute value of the given float x.

{#
    abs( 123.456 ),
    abs( -123.456 ),
    abs( -0.0 )
#}

{# 123.456, 123.456, 0 #}

4.5.2 Float Function sgn

Float function sgn(x) returns the sign of the number x. It returns 1.0 for positive values, -1.0 for negative values and zero for zero:

{#
    sgn( 20.3 ),
    sgn( -2.4 ),
    sgn( 0.0 )
#}

{# 1, -1, 0 #}

4.5.3 Float Function roundTo

Float function roundTo(x,y) rounds the float value x. The given float value y defines the lowest significant digit.

{#
    roundTo( 1234.56789, 0.001 ),
    roundTo( 1234.56789, 0.01 ),
    roundTo( 1234.56789, 0.1 ),
    roundTo( 1234.56789, 1. ),
    roundTo( 1234.56789, 10. ),
    roundTo( 1234.56789, 100. ),
    roundTo( 1234.56789, 1000. ),
    roundTo( 1234.56789, 10000. )
#}

{# 1234.568, 1234.57, 1234.6, 1235, 1230, 1200, 1000, 0 #}
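Functions roundTo and toString both round a float value, but roundTo returns a Float while toString returns a String. A small sketch that places the two side by side, using values from the examples in this chapter:

```wafl
{#
    roundTo( 1234.56789, 0.01 ),
    toString( 1234.56789, 2 )
#}
```

According to the results shown for each function, this should evaluate to {# 1234.57, '1234.57' #}.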

4.5.4 Function exp

Exponential function exp(x) computes e^x:

{#
    exp( -10.0 ),
    exp( 0.0 ),
    exp( 1.0 ),
    exp( 10.0 )
#}

{# 4.539992976e-05, 1, 2.718281828, 22026.46579 #}

4.5.5 Function ln

Float function ln(x) computes the natural logarithm log_e(x). It is defined for positive float values.

{#
    ln( 0.1 ),
    ln( 1.0 ),
    ln( 2.7182818284590452353602874),
    ln( 100. ),
    ln( 1000. )
#}

{# -2.302585093, 0, 1, 4.605170186, 6.907755279 #}

4.5.6 Function log

Float function log(x) computes the logarithm log_10(x). It is defined for positive float values.

{#
    log( 0.001 ),
    log( 0.01 ),
    log( 0.1 ),
    log( 1.0 ),
    log( 10. ),
    log( 100. ),
    log( 1000. )
#}

{# -3, -2, -1, 0, 1, 2, 3 #}

4.5.7 Function log2

Float function log2(x) computes the logarithm log_2(x). It is defined for positive float values.

{#
    log2( 0.001 ),
    log2( 0.0078125 ),
    log2( 0.25 ),
    log2( 0.5 ),
    log2( 1.0 ),
    log2( 2. ),
    log2( 4. ),
    log2( 128. ),
    log2( 1000. )
#}

{# -9.965784285, -7, -2, -1, 0, 1, 2, 7, 9.965784285 #}

4.5.8 Function pow

Float function pow(x,y) computes x^y, that is, x to the power of y.

It is defined for positive x and any y. Negative x is allowed only if y is a whole number. Zero x is allowed only for positive y.

{#
    pow( 2., 3. ),
    pow( 2., -3. ),
    pow( 2.5, -3.7 ),
    pow( -2., 3. ),
    pow( 0., 3.2 )
#}

{# 8, 0.125, 0.03369938443, -8, 0 #}

4.5.9 Function sqrt

Float function sqrt(x) computes the square root of x. It is defined for non-negative float values x.

{#
    sqrt(1.),
    sqrt(4.),
    sqrt(9.),
    sqrt(16.),
    sqrt(3433.32)
#}

{# 1, 2, 3, 4, 58.59453899 #}
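The square root can also be written with pow or with the float power operator, using the exponent 0.5. A minimal sketch comparing the three forms:

```wafl
{#
    sqrt( 2. ),
    pow( 2., 0.5 ),
    2.0 ** 0.5
#}
```

All three elements should print the same value, 1.414213562, which is the result shown earlier for 2.0 ** 0.5.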

4.5.10 Trigonometric functions

The trigonometric functions sin, cos, tan, asin, acos, atan and atan2 are available.

Angles are measured in radians. Additionally, atan2(x,y) maps a pair of float values to the appropriate angle. If y is positive then atan2(x,y) = atan(x/y), but atan2 is defined even for y=0.

{#
    sin(3.14/2.0) * cos(3.14/2.0),
    tan(3.14/2.0),
    asin(0.5) + acos(0.5),
    atan(0.5),
    atan2(1.0,2.0),
    atan2(1.0,0.0)
#}

{# 0.0007963264582, 1255.765592, 1.570796327, 0.463647609, 0.463647609, 1.570796327 #}
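Because angles are measured in radians, values given in degrees have to be converted first. A minimal sketch, assuming the conversion factor is defined locally. The names piValue and deg are hypothetical and are introduced only for this example:

```wafl
{#
    sin( deg * piValue / 180.0 ),
    cos( deg * piValue / 180.0 )
#}
where {
    piValue = 3.141592653589793;    // hypothetical name, defined here
    deg = 30.0;                     // angle in degrees, as a float
}
```

For an angle of 30 degrees the two values are approximately 0.5 and 0.866. Note that deg and 180.0 are written as floats, since Wafl does not mix Integer and Float values implicitly.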

4.6 String Functions

In this section we present the string functions.

Conversion functions (asChar, asString, ascii and toString) are presented in the previous sections.

The Wafl String type works with both single-byte strings and multi-byte UTF-8 encoded strings. However, some of the functions work only with strings of single-byte characters. If a function may not work well for UTF-8 strings, it is noted in this tutorial. Please take care.

4.6.1 Basic String Functions

Function / Type and Description

strLen

(String -> Int)
Get the string length.

length

(Indexable['1]['2]['3] -> Int)
Get the collection size.

size

(Indexable['1]['2]['3] -> Int)
Get the collection size.

strCat

(String * String -> String)
String concatenation. Same as string addition.

isNull

(String -> Bool)
Check if a string represents a database NULL value.

ifNull

(String * String -> String)
Replace null with given value:
    ifNull(x,c) = if isNull(x) then c else x

strLen

Function strLen(x) computes the length of the string x.

It is important to understand that string x may include any characters, and that characters with ASCII code zero are not handled as string terminators. Thus, the String type may work not only with character strings, but also with byte strings.

There are two more general synonyms, length and size.

{#
    strLen( "abc" ),
    strLen( "abc\0abc" ),
    length( "abc\0abc" ),
    size( "abc\0abc" )
#}

{# 3, 7, 7, 7 #}

In case of UTF-8 strings, strLen, length and size return the size in bytes. To get a real UTF-8 string length in UTF-8 code-points, please use utfLen.

strCat

Function strCat(x,y) computes the concatenation of two given strings. It is equivalent to string operator +.

{#
    "abc" + "def",
    strCat( "abc", "def" )
#}

{# 'abcdef', 'abcdef' #}

isNull, ifNull

To support work with databases, the String type has a special undefined value, NULL. Function isNull(s) checks if string s is NULL. Function ifNull(s,x) returns s if s is not NULL, and x if s is NULL.

ifNull(s,x) == if isNull(s) then x else s

{#
    $-1,    //  This will return null string
    isNull('a'),
    isNull($-1),
    ifNull("abc","xyz"),
    ifNull($-1,"xyz")
#}

{# 'NULL', false, true, 'abc', 'xyz' #}

In the previous example, we used expression $-1 to create NULL strings. Operator $ will be presented later.

4.6.2 String Extraction Functions

String extraction functions extract and return a part of the given string. Wafl core library includes the following string extraction functions:

Function / Type and Description

sub

(SequenceStr['2]['1] * Int * Int -> SequenceStr['2]['1])
Extracts the subsequence from given 0-based position and given length:
    sub(seq,pos,len)

subStr

(String * Int * Int -> String)
Returns a substring from given position (from 0) and with given length.

strLeft

(String * Int -> String)
Returns first N characters of the string. If N is negative, returns all but last -N elements.

strRight

(String * Int -> String)
Returns last N characters of the string. If N is negative, returns all but first -N elements.

strLTrim

(String -> String)
Trims all spaces from left side.

strRTrim

(String -> String)
Trims all spaces from right side.

strTrim

(String -> String)
Trims all spaces from the string.

sub and subStr

Function sub(s,p,n) returns a substring of string s, beginning at (zero based) position p with length n.

In Wafl core library there is subStr, which is a synonym for sub. In current version both functions are supported, but it is possible that only sub will remain.

{#
    subStr( "abcdefgh", 0, 3 ),
    sub( "abcdefgh", 0, 3 ),
    sub( "abcdefgh", 2, 3 ),
    sub( "abcdefgh", -2, 5 ),
    sub( "abcdefgh", 5, 10 ),
    sub( "abcdefgh", 5, -2 )
#}

{# 'abc', 'abc', 'cde', '', 'fgh', '' #}

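Special cases, as the example shows: a negative position or a negative length yields an empty string, and a length that reaches past the end of the string is truncated at the end of the string.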

In case of UTF-8 strings, sub and subStr may return invalid strings. These functions treat strings as having single byte characters only. If a substring begins or ends in the middle of a multi-byte UTF-8 code-point, the result will not be a valid UTF-8 string. To get a valid UTF-8 substring, with positions denoted in UTF-8 code-points, please use utfSub.

strLeft and strRight

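Function strLeft(s,n) returns a substring containing the first n characters of string s. If n is greater than the length of s, the whole string is returned. If n is negative, all but the last -n characters are returned.

Function strRight(s,n) returns a substring containing the last n characters of string s. If n is greater than the length of s, the whole string is returned. If n is negative, all but the first -n characters are returned: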

{#
    strLeft( "abcdefgh", 3 ),    //  first 3 characters
    strLeft( "abcdefgh", 10 ),   //  whole string
    strLeft( "abcdefgh", -5 ),   //  all but last 5 characters
    strLeft( "abcdefgh", -10 ),  //  empty string
    strRight( "abcdefgh", 3 ),   //  last 3 characters
    strRight( "abcdefgh", 10 ),  //  whole string
    strRight( "abcdefgh", -5 ),  //  all but first 5 characters
    strRight( "abcdefgh", -10 )  //  empty string
#}

{# 'abc', 'abcdefgh', 'abc', '', 'fgh', 'abcdefgh', 'fgh', '' #}
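For a non-negative n that does not exceed the string length, strLeft and strRight can be expressed with sub. A small sketch of that correspondence:

```wafl
{#
    strLeft( s, 3 ) = sub( s, 0, 3 ),
    strRight( s, 3 ) = sub( s, strLen( s ) - 3, 3 )
#}
where {
    s = "abcdefgh";
}
```

Both comparisons should evaluate to true, since both sides give 'abc' in the first case and 'fgh' in the second.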

In case of UTF-8 strings, strLeft and strRight may return invalid strings. These functions treat strings as having single byte characters only. If a substring begins or ends in the middle of a multi-byte UTF-8 code-point, the result will not be a valid UTF-8 string. To get a valid UTF-8 substring, with positions denoted in UTF-8 code-points, please use utfLeft and utfRight.

strLTrim, strRTrim and strTrim

Function strLTrim(s) returns the substring of s without leading non-visible characters. Function strRTrim(s) returns the substring of s without trailing non-visible characters. Function strTrim(s) returns the substring of s without both leading and trailing non-visible characters.

{#
    strLTrim( "\0 \t \n   abcd \b \003 \0 \r " ),
    strRTrim( "\0 \t \n   abcd \b \003 \0 \r " ),
    strTrim( "\0 \t \n   abcd \b \003 \0 \r " )
#}

{# 'abcd \010 \003 \000 \015 ', '\000 \011 \012   abcd', 'abcd' #}

4.6.3 String Index and Slice Operators

Index operator s[i] is equivalent to subStr(s, i %% strLen(s), 1). That means that indexing beyond the string bounds is possible.

Consequently, index operator s[i] is similar, but not equivalent, to subStr(s,i,1). They are equivalent only if 0 <= i < strLen(s) holds.

{# 
  s[-4], s[-3], s[-2], s[-1],
  s[0], s[1], s[2], s[3], 
  s[4], s[6], s[7], s[8]
#}
where {
  s = "abcd";
}

{# 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'c', 'd', 'a' #}

In case of UTF-8 strings, indexing operator may return an invalid string. It treats strings as having single byte characters only. If index points to a UTF-8 multi-byte code-point element, then result will not be a valid UTF-8 code-point. To get a valid UTF-8 code-point, with positions denoted in UTF-8 code-points, please use utfAt.

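The slice operator uses a syntax similar to the index operator, but behaves like subStr, strLeft and strRight. If 0 < n < m <= strLen(s), then s[:m] is equivalent to strLeft(s,m), s[n:] is equivalent to strRight(s,-n), and s[n:m] is equivalent to subStr(s,n,m-n).

If index n is negative or greater than strLen(s), then n %% strLen(s) is used. The same holds for m.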

{# 
    s[:6],
    s[:-2],
    s[2:],
    s[-6:],
    s[2:6], 
    s[2:-2], 
    s[-6:6],
    s[-6:-2]
#}
where {
  s = "abcdefgh";
}

{# 'abcdef', 'abcdef', 'cdefgh', 'cdefgh', 'cdef', 'cdef', 'cdef', 'cdef' #}

It is often easier to use slice operator than extraction functions, but they essentially do the same thing.
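The correspondence between slices and extraction functions can be checked directly. A minimal sketch for indexes within the string bounds, based on the results shown in this section:

```wafl
{#
    s[:6] = strLeft( s, 6 ),
    s[2:] = strRight( s, -2 ),
    s[2:6] = subStr( s, 2, 6 - 2 )
#}
where {
    s = "abcdefgh";
}
```

Each comparison should evaluate to true: both sides give 'abcdef', 'cdefgh' and 'cdef', respectively.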

In case of UTF-8 strings, slice operators may return invalid strings. These operators treat strings as having single byte characters only. If a slice begins or ends in the middle of a multi-byte UTF-8 code-point, the result will not be a valid UTF-8 string. To get a valid UTF-8 slice, with positions denoted in UTF-8 code-points, please use utfSlice.

4.6.4 String Search Functions

Wafl core library includes the following string searching functions:

Function / Type and Description

strPos

(String * String -> Int)
Finds first position of a substring in the string, or -1 if not found.

strPosI

(String * String -> Int)
Same as strPos, but ignores letter case.

strNextPos

(String * String * Int -> Int)
Finds next position of a substring in the string, after given pos.

strNextPosI

(String * String * Int -> Int)
Same as strNextPos, but ignores letter case.

strLastPos

(String * String -> Int)
Finds last position of a substring in the string, or -1 if not found.

strLastPosI

(String * String -> Int)
Same as strLastPos, but ignores letter case.

strNextLastPos

(String * String * Int -> Int)
Finds next last position of a substring in the string, before given pos.

strNextLastPosI

(String * String * Int -> Int)
Same as strNextLastPos, but ignores letter case.

strBeg

(String * String -> Bool)
Check if the 2nd string is at the beginning of the 1st.

strEnd

(String * String -> Bool)
Check if the 2nd string is at the end of the 1st.

All position search functions return the beginning position of the second specified string in the first specified string if it is found, and -1 if it is not. Functions strBeg and strEnd return a Bool value instead.

Functions whose names end with ‘I’ ignore case: strPosI, strNextPosI, strLastPosI, and strNextLastPosI.

Case-insensitive search works by first converting both strings to uppercase. This can be inefficient for larger strings.

strPos, strPosI, strNextPos and strNextPosI

Function strPos(s,p) returns the position of the first occurrence of the string p in string s.

Function strPosI(s,p) returns the position of the first occurrence of the string p in string s, ignoring the letter case.

Function strNextPos(s,p,i) returns the position of the first occurrence of the string p in string s after the position i.

Function strNextPosI(s,p,i) returns the position of the first occurrence of the string p in string s after the position i, ignoring the letter case.

{# 
    strPos( s, n ),     //  not existing
    strPos( s, x ),     
    strNextPos( s, x, 0 ),
    strNextPos( s, x, 3 ),
    strNextPos( s, x, 6 ),
    strNextPos( s, x, 9 )
#}
where {
    s = "abcabcabcab";
    x = "ab";
    n = "xxx";
};

{# -1, 0, 3, 6, 9, -1 #}
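Because strNextPos searches after the given position, successive occurrences can be found by passing the previous result as the starting position. A sketch that nests the calls, using the same strings as above:

```wafl
{#
    strPos( s, x ),
    strNextPos( s, x, strPos( s, x ) ),
    strNextPos( s, x, strNextPos( s, x, strPos( s, x ) ) )
#}
where {
    s = "abcabcabcab";
    x = "ab";
};
```

Given the results above, this should evaluate to {# 0, 3, 6 #}, the positions of the first three occurrences.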

strLastPos, strLastPosI, strNextLastPos and strNextLastPosI

Function strLastPos(s,p) returns the position of the last occurrence of the string p in string s.

Function strLastPosI(s,p) returns the position of the last occurrence of the string p in string s, ignoring the letter case.

Function strNextLastPos(s,p,i) returns the position of the last occurrence of the string p in string s before the position i.

Function strNextLastPosI(s,p,i) returns the position of the last occurrence of the string p in string s before the position i, ignoring the letter case.

Please note that case insensitive functions strLastPosI and strNextLastPosI do not work well for UTF-8 multi-byte characters.

{# 
    strLastPos( s, n ),     //  not existing
    strLastPos( s, x ),     
    strNextLastPos( s, x, 9 ),
    strNextLastPos( s, x, 6 ),
    strNextLastPos( s, x, 3 ),
    strNextLastPos( s, x, 0 )
#}
where {
    s = "abcabcabcab";
    x = "ab";
    n = "xxx";
};

{# -1, 9, 6, 3, 0, -1 #}

strBeg and strEnd

Functions strBeg and strEnd check if the first string has the second string at the beginning or at the end:

{#
    strBeg( "abcdef", "ab" ),
    strBeg( "abcdef", "cd" ),
    strBeg( "abcdef", "ef" ),
    strEnd( "abcdef", "ab" ),
    strEnd( "abcdef", "cd" ),
    strEnd( "abcdef", "ef" )
#}

{# true, false, false, false, false, true #}

4.6.5 Substring Counting Functions

Wafl core library includes the following substring counting functions:

Function / Type and Description

strCountSub

(String * String -> Int)
Count occurrences of a substring in the given string:
    strCountSub('aaaaaA','aa') == 4

strCountSubI

(String * String -> Int)
Same as strCountSub, but ignores letter case:
    strCountSubI('aaaaaA','aa') == 5

strCountSubDis

(String * String -> Int)
Count disjoint (non-overlapping) occurrences of a substring in the given string:
    strCountSubDis('aaaaaA','aa') == 2

strCountSubDisI

(String * String -> Int)
Same as strCountSubDis, but ignores letter case:
    strCountSubDisI('aaaaaA','aa') == 3

All counting functions return the number of occurrences of the given substring in the given string. The functions whose names end with ‘I’ ignore the letter case. The functions with ‘Dis’ in their names count only disjoint (non-overlapping) occurrences.

{# 
    strCountSub('aaaaaA','aa'),
    strCountSubI('aaaaaA','aa'),
    strCountSubDis('aaaaaA','aa'),
    strCountSubDisI('aaaaaA','aa')
#}

{# 4, 5, 2, 3 #}

Functions strCountSubI and strCountSubDisI work by converting both strings to uppercase first. They may be inefficient for larger strings.

4.6.6 String Replace Functions

The Wafl core library includes the following string replace functions:

Function / Type and Description

strReplace

(String * String * String * Int -> String)
Replaces Nth occurrence of substring with given string:
    strReplace('ababa','b','c',2) == 'abaca'

strReplaceI

(String * String * String * Int -> String)
Same as strReplace, but ignores letter case.

strReplaceAll

(String * String * String -> String)
Replaces all occurrences of substring with given string.

strReplaceAllI

(String * String * String -> String)
Same as strReplaceAll, but ignores letter case.

Each of strReplace* functions evaluates a new string and does not modify any of the given strings.

Function strReplace(s,p,x,i) returns a copy of string s where the i-th occurrence of substring p is replaced with x. String s remains unmodified.

Function strReplaceI(s,p,x,i) evaluates the same as strReplace, but ignores letter case while searching for p.

Function strReplaceAll(s,p,x) returns a copy of string s where all occurrences of substring p are replaced with x. String s remains unmodified.

Function strReplaceAllI evaluates the same as strReplaceAll, but ignores letter case while searching for p.

{#
    strReplace( s, 'a', '@', 2 ),
    strReplaceI( s, 'a', '@', 2 ),
    strReplaceAll( s, 'a', '@' ),
    strReplaceAllI( s, 'a', '@' )
#}
where {
    s = "abABabAB";
}

{# 'abAB@bAB', 'ab@BabAB', '@bAB@bAB', '@b@B@b@B' #}

Please note that case insensitive functions strReplaceI and strReplaceAllI may be inefficient for larger strings.

4.6.7 Functions strSplit... and strJoin

Here we meet a list of strings. The list is one of the most important concepts of functional programming languages, including Wafl. We will discuss lists in detail in the following chapter.

Function strSplit(s,p) returns a list of all substrings of the string s which are separated from each other by substring p.

Function strJoin(lst,p) concatenates all elements of the list lst, inserting the delimiter p between them.

Function / Type and Description

strSplit

(String * String -> List[String])
Splits a string into a list of strings, by extracting the given separator.

strSplitTrim

(String * String -> List[String])
Splits a string into a list of strings, by extracting the given separator. All spaces are trimmed from the left and right side of each segment.

strSplitLines

(String -> List[String])
Splits a string into a list of strings, by extracting the new-line separator.

strSplitLinesTrim

(String -> List[String])
Splits a string into a list of strings, by extracting the new-line separator. All spaces are trimmed from the left and right side of each segment.

strJoin

(Sequence['1][String] * String -> String)
Joins (concatenates) a sequence of strings, adding the given separator.

strSplit( 'a,bb,c,dd,e', ',' )

['a', 'bb', 'c', 'dd', 'e']

strJoin( ['a','b','c','d','e','f','g','h'], ';' )

a;b;c;d;e;f;g;h

strJoin( strSplit( 'a,bb,c,dd,e', ','), ';' )

a;bb;c;dd;e
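Splitting by one string and joining by another has the same effect as replacing all occurrences of the separator. A small sketch that compares this combination with strReplaceAll from the previous section, on the string used there:

```wafl
{#
    strJoin( strSplit( s, 'a' ), '@' ),
    strReplaceAll( s, 'a', '@' )
#}
where {
    s = "abABabAB";
}
```

Both elements should evaluate to '@bAB@bAB'. The leading separator produces an empty first segment, which keeps the replacement at the beginning of the string.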

Function strSplitTrim is similar to strSplit, but it detects and removes all spaces before and after the separators. It is functionally equivalent to, but more efficient than, mapping strTrim after the split:

s.strSplit(p).map(strTrim) == s.strSplitTrim(p)
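A concrete instance of this equivalence, with an illustrative input string that has spaces around the separators:

```wafl
{#
    s.strSplit( ',' ).map( strTrim ),
    s.strSplitTrim( ',' )
#}
where {
    s = ' a , bb ,c ';
}
```

By the equivalence stated above, both elements should be the list ['a', 'bb', 'c'].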

Function strSplitLines(s) is similar to strSplit(s,'\n'), but it detects and removes both LF (Linux, ‘\n’) and CRLF (Windows, ‘\r\n’) new line sequences:

{#
    strSplit( '\nabc\ndef\nghi\n', '\n' ),
    strSplit( '\r\nabc\r\ndef\r\nghi\r\n', '\n' ),
    strSplitLines( '\nabc\ndef\nghi\n' ),
    strSplitLines( '\r\nabc\r\ndef\r\nghi\r\n' )
#}

{# ['', 'abc', 'def', 'ghi', ''], ['\015', 'abc\015', 'def\015', 'ghi\015', ''], ['', 'abc', 'def', 'ghi', ''], ['', 'abc', 'def', 'ghi', ''] #}

Function strSplitLinesTrim is similar to strSplitTrim and strSplitLines. It is functionally equivalent to, but more efficient than, mapping strTrim after using strSplitLines:

s.strSplitLines().map(strTrim) == s.strSplitLinesTrim()

Function strChars converts a string to a list of characters.

{#
    strChars( 'abc\ndef' )
#}

{# ['a', 'b', 'c', '\012', 'd', 'e', 'f'] #}

In case of UTF-8 strings, strChars will cut a string in bytes. To get a valid UTF-8 code-points list, please use utfChars instead.

4.6.8 Encoding functions

Sometimes we need to encode a string in a format that follows the specific syntax. Wafl contains four string encoding functions. All of these functions have the common type: (String -> String).

Function strEncodeHtml(s) returns encoded string ready to include in HTML. All special characters are replaced with appropriate HTML character sequences:

strEncodeHtml( "abc&<>def" )

abc&amp;&lt;&gt;def

Function strEncodeSql(s) returns encoded string ready for use in SQL string literals. All special characters are replaced with SQL escape sequences:

strEncodeSql( "abc'quotes'abc" )

abc''quotes''abc
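A typical use is building an SQL statement from a string value. In the following sketch the table name users and the column name are hypothetical and serve only as an illustration:

```wafl
"SELECT * FROM users WHERE name = '" + strEncodeSql( "O'Brien" ) + "'"
```

Since strEncodeSql doubles each single quote, this should evaluate to SELECT * FROM users WHERE name = 'O''Brien', where the quote inside the name no longer terminates the SQL literal.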

Function strEncodeUri(s) returns the string encoded according to the rules of URI syntax:

strEncodeUri( "a + b = c" )

a%20%2B%20b%20%3D%20c

Function strEncodeWafl(s) returns the string encoded according to the Wafl syntax:

strEncodeWafl( "a\n \0 \'\"..." )

a\012 \000 \'\"...

4.6.9 UTF-8 Functions

To handle UTF-8 strings in a proper way, please use only the functions treating strings as sequences of UTF-8 multi-byte code-points.

Most of the UTF-8 specific behavior is covered by the following functions. However, please note that two important functionalities are not yet supported by the Wafl library:

BOM Functions

Some applications require files with UTF-8 content to have a UTF-8 BOM (byte order mark) at the beginning of the file. The following functions provide UTF-8 BOM handling.

Please note that UTF-8 BOM usage is not recommended, because byte-order is irrelevant in case of UTF-8 format. It is based on individual bytes, not words.

Function / Type and Description

utfBom

( -> String)
Get UTF-8 BOM sequence.

utfIsBom

(String -> Bool)
Check if a string content is UTF-8 BOM.

utfHasBom

(String -> Bool)
Check if a string begins with UTF-8 BOM.

utfAddBom

(String -> String)
Adds a UTF-8 BOM, if not already present.

utfTrimBom

(String -> String)
Trims leading BOM, if present.

Function utfBom() returns the UTF-8 BOM.

Function utfIsBom(s) checks if the string content is exactly the UTF-8 BOM.

Function utfHasBom(s) checks if the string begins with a UTF-8 BOM.

{#
    utfBom(),
    utfIsBom( utfBom() ),
    utfIsBom( 'abc' ),
    utfHasBom( utfBom() + 'abc' ),
    utfHasBom( 'abc' )
#}

{# '\357\273\277', true, false, true, false #}

Function utfAddBom(s) returns a string with a UTF-8 BOM added to the beginning, if not already present.

Function utfTrimBom(s) returns a string without a leading UTF-8 BOM.

{#
    utfAddBom('abc'),
    utfHasBom( utfAddBom('abc') ),
    utfTrimBom( utfAddBom('abc') ),
    utfHasBom( utfTrimBom( utfAddBom('abc') ) )
#}

{# '\357\273\277abc', true, 'abc', false #}
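The same BOM logic can be sketched in Python on raw bytes. The helper names has_bom, add_bom and trim_bom are hypothetical and only mirror the behavior described above; they are not part of Wafl.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF (octal 357 273 277).
BOM = codecs.BOM_UTF8

def has_bom(data: bytes) -> bool:
    # True if the data begins with the BOM.
    return data.startswith(BOM)

def add_bom(data: bytes) -> bytes:
    # Prepend the BOM only if it is not already present.
    return data if has_bom(data) else BOM + data

def trim_bom(data: bytes) -> bytes:
    # Drop a leading BOM, if there is one.
    return data[len(BOM):] if has_bom(data) else data

print(add_bom(b"abc"), trim_bom(add_bom(b"abc")))
```

Because add_bom checks for an existing BOM first, applying it twice gives the same result as applying it once.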

Validity Functions

Function / Type and Description

utfIsValid

(String -> Bool)
Check if a string is a valid UTF-8 encoded string.

utfRepInvalid

(String * String -> String)
Replace invalid code points with the given character.

Function utfIsValid(s) checks if a string is a valid UTF-8 string. Please note that every string consisting only of single-byte (7-bit ASCII) characters is a valid UTF-8 string.

Function utfRepInvalid(s,c) computes a string in which all invalid UTF-8 code-points in s are replaced with the character c.

{#
    utfIsValid( sub('abc€def',4,4) ),
    utfRepInvalid( sub('abc€def',0,4), '@' ),  //  only the first byte of MB
    utfRepInvalid( sub('abc€def',4,4), '@' )   //  only the second and third bytes of MB
#}

{# false, 'abc@', '@@de' #}
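To see why these substrings are invalid, it helps to look at the bytes. The following Python sketch reproduces the example on raw bytes; is_valid_utf8 and replace_invalid are hypothetical helpers that rely on Python's decoder and are only an approximation of what the Wafl functions do.

```python
def is_valid_utf8(data: bytes) -> bool:
    # A byte string is valid UTF-8 exactly when strict decoding succeeds.
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def replace_invalid(data: bytes, c: str) -> str:
    # errors="replace" puts U+FFFD where decoding fails; we then swap in c.
    # Caveat: a genuine U+FFFD already present in the input is replaced too.
    return data.decode("utf-8", errors="replace").replace("\ufffd", c)

raw = "abc€def".encode("utf-8")  # 9 bytes; the euro sign occupies bytes 3..5

print(is_valid_utf8(raw[4:8]))
print(replace_invalid(raw[0:4], "@"))
print(replace_invalid(raw[4:8], "@"))
```

The slice raw[0:4] ends with only the lead byte of the euro sign, while raw[4:8] starts with its two continuation bytes, so neither slice is valid UTF-8.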

utfLen

Function / Type and Description

utfLen

(String -> Int)
Get UTF-8 length, as a number of complete code points.

Function utfLen(s) computes the string length by counting the complete UTF-8 code-points. The string length in code-points is always less than or equal to the length in bytes (strLen, length or size).

{#
    strLen( 'abc€def' ),
    utfLen( 'abc€def' )
#}

{# 9, 7 #}
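One common way to count code-points in valid UTF-8 data is to count every byte that is not a continuation byte (a byte of the form 10xxxxxx). The Python sketch below shows the idea; utf8_len is a hypothetical helper, and nothing here claims that Wafl computes utfLen this way.

```python
def utf8_len(data: bytes) -> int:
    # Every code point contributes exactly one non-continuation byte.
    return sum(1 for b in data if (b & 0xC0) != 0x80)

raw = "abc€def".encode("utf-8")

print(len(raw), utf8_len(raw))
```

The euro sign is encoded with one lead byte and two continuation bytes, which accounts for the difference of two between the byte length and the code-point length.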

utfAt

Function / Type and Description

utfAt

(String * Int -> String)
Returns the code point at the given position, indexed by code points.

Function utfAt(s,i) is similar to the indexing operator s[i], but it uses code-point based indexes. It returns the i-th UTF-8 code-point of string s.

For more details please study the indexing operator.

{#
    s[0], s[1], s[2], s[3],
    s.utfAt(0), s.utfAt(1), s.utfAt(2), s.utfAt(3)
#}
where {
  s = "abАБабAB";
}

{# 'a', 'b', '\320', '\220', 'a', 'b', '\320\220', '\320\221' #}

{#
    s[-1], s[-2], s[-3], s[-4],
    s.utfAt(-1), s.utfAt(-2), s.utfAt(-3), s.utfAt(-4)
#}
where {
  s = "abАБабAB";
}

{# 'B', 'A', '\261', '\320', 'B', 'A', '\320\261', '\320\260' #}

utfSub, utfSlice, utfLeft, utfRight

Function / Type and Description

utfSub

(String * Int * Int -> String)
Returns a substring from given position (from 0) and with given length, indexing complete UTF-8 code points instead of characters.

utfSlice

(String * Int * Int -> String)
Returns a substring between two given positions, indexing complete UTF-8 code points instead of characters.

utfLeft

(String * Int -> String)
Returns first N UTF-8 code points of the string.

utfRight

(String * Int -> String)
Returns last N UTF-8 code points of the string.

Function utfSub(s,p,n) is similar to subStr(s,p,n), but uses code-point based indexes. It returns a substring of string s, beginning at (zero based) position p with length n, where position and length are counted based on UTF-8 code-points instead of characters.

For more details please study subStr.

{#
    subStr( "abcdАБВГабвгABCD", 0, 8 ),
    utfSub( "abcdАБВГабвгABCD", 0, 8 )
#}

{# 'abcd\320\220\320\221', 'abcd\320\220\320\221\320\222\320\223' #}

Function utfSlice(s,n,m) is similar to the string slice operator s[n:m], but uses code-point based indexes. It returns a substring of string s, beginning at (zero based) position n and ending before the position m, where positions n and m are counted based on UTF-8 code-points instead of characters.

For more details please study the string slice operator.

{#
    "abcdАБВГабвгABCD" [2:6],
    utfSlice( "abcdАБВГабвгABCD", 2, 6 )
#}

{# 'cd\320\220', 'cd\320\220\320\221' #}

Function utfLeft(s,n) is similar to string function strLeft, but uses code-point based indexes. It returns a substring containing the first n UTF-8 code-points of string s.

For more details please study strLeft.

{#
    strLeft( "abcdАБВГабвгABCD", 6 ),
    utfLeft( "abcdАБВГабвгABCD", 6 )
#}

{# 'abcd\320\220', 'abcd\320\220\320\221' #}

Function utfRight(s,n) is similar to string function strRight, but uses code-point based indexes. It returns a substring containing the last n UTF-8 code-points of string s.

For more details please study strRight.

{#
    strRight( "abcdАБВГабвгABCD", 6 ),
    utfRight( "abcdАБВГабвгABCD", 6 )
#}

{# '\320\263ABCD', '\320\262\320\263ABCD' #}
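The difference between byte-based and code-point-based positions can also be reproduced in Python, where slicing a bytes object counts bytes and slicing a str counts code-points. This is only an illustration of the two indexing schemes on the string used above, not Wafl code.

```python
s = "abcdАБВГабвгABCD"
raw = s.encode("utf-8")  # 16 code points, 24 bytes

# Byte-based slicing may cut a two-byte Cyrillic letter in half.
byte_slice = raw[2:6]
byte_left = raw[:6]
byte_right = raw[-6:]

# Code-point-based slicing always keeps whole letters.
cp_slice = s[2:6].encode("utf-8")
cp_left = s[:6].encode("utf-8")
cp_right = s[-6:].encode("utf-8")

print(byte_right, cp_right)
```

Here byte_right starts with the complete two-byte letter г only by coincidence of where the cut falls, while cp_right contains the last six letters in full.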

Other UTF-8 Functions

Function / Type and Description

utfChars

(String -> List[String])
Splits a string to a list of UTF-8 code points.

utfReverse

(String -> String)
Reverses UTF-8 string.

Function utfChars(s) is similar to string function strChars, but uses code-points instead of characters. It returns a list of all UTF-8 code-points of string s.

For more details please study strChars.

{#
    strChars( "abАБабAB" ),
    utfChars( "abАБабAB" )
#}

{# ['a', 'b', '\320', '\220', '\320', '\221', '\320', '\260', '\320', '\261', 'A', 'B'], ['a', 'b', '\320\220', '\320\221', '\320\260', '\320\261', 'A', 'B'] #}

Function utfReverse(s) is similar to string function strReverse, but uses code-points instead of characters. It returns a reversed string, taking care to preserve the UTF-8 code-points.

For more details please study strReverse.

{#
    strReverse( "abАБабAB" ),
    utfReverse( "abАБабAB" )
#}

{# 'BA\261\320\260\320\221\320\220\320ba', 'BA\320\261\320\260\320\221\320\220ba' #}
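The effect of the two reversal strategies can be checked on raw bytes with a short Python sketch. It is only an illustration: reversing a bytes object plays the role of the byte-based reversal, and reversing a str plays the role of the code-point-based one.

```python
s = "abАБабAB"
raw = s.encode("utf-8")

# Reversing bytes puts continuation bytes in front of their lead bytes.
byte_reversed = raw[::-1]

# Reversing code points keeps every two-byte letter intact.
cp_reversed = s[::-1].encode("utf-8")

def is_valid_utf8(data: bytes) -> bool:
    # Strict decoding succeeds only for valid UTF-8.
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(byte_reversed), is_valid_utf8(cp_reversed))
```

The byte-reversed result is no longer valid UTF-8, whereas the code-point-reversed result still decodes cleanly.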

4.6.10 Other String Functions

Here we discuss three more string functions:

Function / Type and Description

strLowerCase

(String -> String)
Converts all letters to lower case.

strUpperCase

(String -> String)
Converts all letters to upper case.

strReverse

(String -> String)
Reverses the string.

Function strLowerCase converts a string to lowercase.

Function strUpperCase converts a string to uppercase.

{#
    strLowerCase( 'aAbBcC' ),
    strUpperCase( 'aAbBcC' )
#}

{# 'aabbcc', 'AABBCC' #}

Function strReverse computes the reversed string.

{#
    strReverse( 'aAbBcC' )
#}

{# 'CcBbAa' #}

In the case of UTF-8 strings, strReverse may return invalid strings, because it treats strings as sequences of single-byte characters. To get a valid reversed UTF-8 string, please use utfReverse.