Last update: 29.01.2025.
Wafl has four primitive types: Integer, Float, String and Bool. In this chapter we discuss the important elements of the primitive types.
This chapter is quite long and detailed, but its content is elementary, so we recommend a cursory reading. It is enough to get an idea of what is supported, and later you can use this chapter as a reference if needed.
Integer and float literal constants have the same syntax as in the programming languages C and C++.
Floating point literals must contain a decimal point and at least one digit in front of it.
Logical literal constants are true and false.
String literals are quoted using single or double quotation marks. It does not matter which type of quotation mark is used, but the same type must be used at the beginning and at the end of the string.
Special characters are specified by escape sequences, like in C/C++.
The most important escape sequences are: single quotation mark (\'), double quotation mark (\"), backslash (\\), new line (\n), carriage return (\r), horizontal tab (\t), vertical tab (\v), form feed (\f) and backspace (\b). As in C/C++, characters can be encoded with 3 octal digits: \nnn.
Examples of integer literals:
{#0, 42, -21
#}
{# 0, 42, -21 #}
Examples of float literals:
{#3.4, 0., 1.2e-3
#}
{# 3.4, 0, 0.0012 #}
Examples of bool literals:
{#
true, false #}
{# true, false #}
Examples of string literals:
{#'single quotation marks',
"double quotation marks",
"two\nlines",
"octal codes A=\101 a=\141"
#}
{# 'single quotation marks', 'double quotation marks', 'two\012lines', 'octal codes A=A a=a' #}
Wafl has the usual arithmetic operators: addition (+); subtraction (-); multiplication (*); division (/); remainder (%); modulus (%%); power (**) and unary minus (-).
The division of integer values always produces an integer result.
The remainder and modulus operators are very similar. The difference is that the result of the remainder of the integer division (%) takes the sign of the dividend, while the modulus operator (%%) never returns a negative result:
{#17 / 10,
17 % 10,
17 %% 10,
-17 / 10,
-17 % 10,
-17 %% 10,
17 / -10,
17 % -10,
17 %% -10,
-17 / -10,
-17 % -10,
-17 %% -10
#}
{# 1, 7, 7, -1, -7, 3, -1, 7, 7, 1, -7, 3 #}
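The difference between the two operators can be modelled outside Wafl. The following is a minimal Python sketch, not Wafl code: the helper names int_div, remainder and modulus are hypothetical, and the rules (division truncating toward zero, modulus using the absolute value of the divisor) are assumptions inferred from the results listed above.

```python
def int_div(a: int, b: int) -> int:
    """Integer division truncating toward zero (assumed from the results above)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def remainder(a: int, b: int) -> int:
    """Model of %: the result takes the sign of the dividend."""
    return a - b * int_div(a, b)


def modulus(a: int, b: int) -> int:
    """Model of %%: the result is never negative."""
    # Python's % is already non-negative when the divisor is positive.
    return a % abs(b)


pairs = [(17, 10), (-17, 10), (17, -10), (-17, -10)]
results = [f(a, b) for a, b in pairs for f in (int_div, remainder, modulus)]
print(results)  # [1, 7, 7, -1, -7, 3, -1, 7, 7, 1, -7, 3]
```

The printed list reproduces the twelve values of the Wafl example in the same order.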
Bit-level integer operators are syntactically and semantically equivalent to these operators in C/C++: bit-level conjunction (&); bit-level disjunction (|); shift left (<<); shift right (>>) and bit-level complement (~).
{#// '11110000' & '00111111' = '00110000' = 48
240 & 63,
// '011' | '110' = '111' = 7
3 | 6,
// bit-level complement
~5,
// '11110' << 3 = '11110000' = 240
30 << 3,
// '11111111' >> 3 = '11111' = 31
255 >> 3
#}
{# 48, 7, -6, 240, 31 #}
The power operator a ** b returns a to the power of b. Any integer raised to a negative power will yield zero, except for one. One raised to any power will always yield 1:
{#2 ** 3,
2 ** -3,
1 ** 3,
1 ** -3,
-2 ** 3,
-2 ** -3
#}
{# 8, 0, 1, 1, -8, 0 #}
Multiplication, division, remainder, modulus and bit-level conjunction have a higher precedence than addition, subtraction and bit-level disjunction. The power operator has the highest precedence. Shift operators have the lowest precedence.
Wafl has the usual float operators: addition (+); subtraction (-); multiplication (*); division (/); power (**) and unary minus (-).
{#2.1 + 3.45678,
3.0 - 1.2,
3.14 * 2.17,
17.0 / 10.,
2.0 ** 3.0,
2.0 ** 0.5
#}
{# 5.55678, 1.8, 6.8138, 1.7, 8, 1.414213562 #}
Wafl has a single binary string operator:
"One" + "Two"
OneTwo
There are also indexing operators. They are discussed in the section on the sequence types.
The usual logical operators are supported in both C-like and SQL-like syntax: conjunction (&&, and); disjunction (||, or) and negation (!, not).
{#
true or false,
true || false,
true and false,
true && false,
not true,
!true
#}
{# true, true, false, false, false, false #}
The usual comparison operators are defined for the types Integer, Float and String: equal, C-like (==) and SQL-like (=); not equal, C-like (!=) and SQL-like (<>); less than (<); less than or equal (<=); greater than (>) and greater than or equal (>=).
{#1 < 2,
1.2 <= 2.3,
"abcd" > "ABCD",
"abcd" >= "AB",
21 * 2 = 42,
21 == 42 / 2,
17 != 18,
-3.14 <> 3.14
#}
{# true, true, true, true, true, true, true, true #}
Wafl has no variables and no assignments. The operator ‘=’ has only two purposes: (1) to separate the definition name from the body and (2) as an equality operator. It can never be ambiguous, so there’s no reason to use the ‘==’ operator instead, but if someone likes it better, that’s fine.
Wafl is a strongly typed programming language and no implicit type conversions are allowed. Therefore, the Wafl core library contains the conversion functions:
asInt - from any other primitive type to Integer;
asFloat - from any other primitive type to Float;
asString - from any other type to String;
asChar - from any other primitive type to a single character String and
asBool - from any other primitive type to Bool.
There are several other conversion functions for specific type pairs.
It may seem strange to call these functions as..., but it is quite natural if we assume that we will mainly use them with the dot syntax.
The function asInt(x) converts every non-integer primitive value x into the Integer type:
asInt(x) converts a Float value x to the nearest integer, just like the synonymous function round;
asInt(x) converts a String value x, which is a valid integer literal, into the corresponding integer value;
if x is a float value literal, only the digits before the decimal point are used;
if x is not a valid integer literal, asInt returns zero;
asInt(x) converts the Bool value true to the integer value 1 and false to 0.
For the conversion from Float to Integer, there are also:
round(x) - the same as asInt(x), converts a float value into the nearest Integer;
ceil(x) - returns the nearest integer that is not smaller and
floor(x) - returns the nearest integer that is not larger.
There are also functions for converting String values to Integer:
ascii(x) - converts a string value x to the ASCII code of the first character of the string;
if x is an empty string, the result of the function is zero.
Examples of conversions from Float to Integer:
{#asInt(3.6),
asInt(-3.6),
round(3.6),
round(-3.6),
ceil(3.6),
ceil(-3.6),
floor(3.6),
floor(-3.6)
#}
{# 4, -4, 4, -4, 4, -3, 3, -4 #}
Examples of conversions from String to Integer:
{#asInt('3'),
asInt('3.8'),
asInt('abc'),
ascii('abc'),
ascii('')
#}
{# 3, 3, 0, 97, 0 #}
Examples of conversions from Bool to Integer:
{#asInt(true),
asInt(false)
#}
{# 1, 0 #}
The function asFloat(x) converts every primitive non-float value x into the type Float:
asFloat(x) converts the Integer value x to the corresponding Float value;
asFloat(x) converts the String value x, which represents a valid Float literal, into the corresponding Float value;
if x is not a valid Float literal, asFloat returns zero;
asFloat(x) converts the Bool value true into the value 1.0 and the value false into 0.0.
{#asFloat(7),
asFloat('6.2'),
asFloat('abc'),
asFloat(true),
asFloat(false)
#}
{# 7, 6.2, 0, 1, 0 #}
There are four functions and an operator for converting values of other data types into strings:
Function / Type and Description
asString ('1 -> String) - Converts a value to a string.
asChar (PrimeNotString['1] -> String) - Converts a value to a character.
toString (Float * Int -> String) - Converts a float value to a string with given precision.
asPreview ('1 -> String) - Converts a value to a shortened string.
The function asString(x) converts every non-string value x into String. It converts a value x into its string representation, according to the Wafl syntax.
There is a synonymous unary postfix operator $ with the same behavior.
“Any” means “any” - the function asString and the postfix operator $ convert any Wafl value of any type to its String representation.
{#asString(3),
asString(3.14),
asString(true),
asString(false),
asString('123'),
asString( {# 1, 2.3, "abc", {# true, 's' #} #} ),
asString([1,2,3,4,5,6,7,8,9,10]),
3$,
3.14$,
true$,
false$,
'123'$,
{# 1, 2.3, "abc", {# true, 's' #} #}$,
[1,2,3,4,5,6,7,8,9,10]$
#}
{# '3', '3.14', 'true', 'false', '123', '{# 1, 2.3, \'abc\', {# true, \'s\' #} #}', '[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]', '3', '3.14', 'true', 'false', '123', '{# 1, 2.3, \'abc\', {# true, \'s\' #} #}', '[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]' #}
The function asPreview(x) is similar to asString, but returns a shorter string. For simple data, it behaves in the same way as asString. For larger structured data and longer strings, it extracts only a part of the complete string representation.
{#asPreview('01234567989'),
asPreview(
'01234567890123456789012345678901234567890123456789'
'01234567890123456789012345678901234567890123456789'
),
asPreview([1,2,3,4,5,6,7,8,9,10]),
asPreview(1..1000)
#}
{# '01234567989', '012345678901234567890123456789 ... 0123456789 (len=100)', '[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]', '[1, 2, 3, 4, 5, ..., 999, 1000] (len=1000)' #}
The function asChar(x) works with integers and logical values. It:
converts an Integer into a string consisting of a single character with the given ASCII code;
converts the logical value true into the string value "T" and the logical value false into the string value "F".
{#asChar(65),
asChar(65.7),
asChar(true),
asChar(false)
#}
{# 'A', 'B', 'T', 'F' #}
The function toString(x,n) converts the float value x into a string with n decimal places.
{#toString(1234.56789,0),
toString(1234.56789,1),
toString(1234.56789,2),
toString(1234.56789,3)
#}
{# '1235', '1234.6', '1234.57', '1234.568' #}
The function asBool(x) converts every primitive non-bool value x into the type Bool:
asBool(x) converts all non-zero Integer values to true and zero to false;
asBool(x) converts all non-zero Float values to true and zero to false;
asBool(x) converts string values "true" and "T" to true, and all other values to false.
{#asBool( 2 ),
asBool( -3 ),
asBool( 0 ),
asBool( 2.1 ),
asBool( -3.2 ),
asBool( 0.0 ),
asBool( "true" ),
asBool( "True" ), // this is not the same as "true"
asBool( "T" ),
asBool( "t" ) // this is not the same as "T"
#}
{# true, true, false, true, true, false, true, false, true, false #}
The Wafl core library includes the following integer functions:
abs(x) - absolute value;
sgn(x) - sign;
between(x,a,b) - check whether x is between a and b and
random(x) - random value.
abs
The integer function abs(x) computes the absolute value of the given integer value x:
{#abs( 123 ),
abs( -123 ),
abs( -0 )
#}
{# 123, 123, 0 #}
sgn
The integer function sgn(x) returns the sign of the number x. For positive values it returns 1, for negative values -1 and for zero it returns zero:
{#sgn( 20 ),
sgn( -2 ),
sgn( 0 )
#}
{# 1, -1, 0 #}
between
The integer function between(x,a,b) checks whether the number x lies between the numbers a and b, including the limits. It is usually applied as x.between(a,b). This is equivalent to the expression x >= a and x <= b:
{#10 .between( 5, 20 ),
10 .between( 10, 20 ),
10 .between( 5, 10 ),
10 .between( 5, 9 ),
10 .between( 11, 20 )
#}
{# true, true, true, false, false #}
random
The integer function random(x) returns a random integer value in the range [0,x-1]. In the following example, we compute 20 random values in the range [0,4]:
{#random( 5 ), random( 5 ), random( 5 ), random( 5 ),
random( 5 ), random( 5 ), random( 5 ), random( 5 ),
random( 5 ), random( 5 ), random( 5 ), random( 5 ),
random( 5 ), random( 5 ), random( 5 ), random( 5 ),
random( 5 ), random( 5 ), random( 5 ), random( 5 )
#}
{# 3, 1, 3, 1, 3, 2, 1, 1, 4, 2, 3, 1, 0, 1, 3, 4, 1, 1, 3, 4 #}
By default, the random number generator is initialized the first time it is used, using the current system timer as the seed. This is usually exactly what is expected and required.
However, sometimes it is necessary to have the same random number sequence every time the program is executed (for debugging, benchmarking and some other cases). For such cases there is the clwafl command line option -nornd, which specifies a fixed predefined seed initialization.
Execute the previous program with:
clwafl -nornd program.wafl
and it will always return the same result.
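The effect of a fixed seed can be illustrated outside Wafl. The following Python sketch only models the general idea and says nothing about the generator that clwafl actually uses; the function random_values and the seed value 12345 are hypothetical.

```python
import random


def random_values(count: int, upper: int, seed=None) -> list:
    """Return `count` random integers in the range [0, upper-1].

    seed=None models the default behaviour (a fresh seed for every run);
    a fixed seed models a run with a predefined seed.
    """
    rng = random.Random(seed)
    return [rng.randrange(upper) for _ in range(count)]


first = random_values(20, 5, seed=12345)
second = random_values(20, 5, seed=12345)
print(first == second)  # True: the same seed gives the same sequence
```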
If a program uses parallel evaluation, the order in which the randomly generated numbers are used is not guaranteed. If the -nornd option is used, the sequence of generated numbers is the same in every run, but the numbers can be consumed by different threads in a different order, so the results are not guaranteed to be the same each time the program is run.
The Wafl core library contains the following float functions:
abs(x) - absolute value;
sgn(x) - sign;
between(x,a,b) - check whether x is between a and b;
roundTo(x,y) - rounding;
exp(x) - e to the power of x;
ln(x) - natural logarithm;
log(x) - logarithm to the base 10;
log2(x) - logarithm to the base 2;
pow(x,y) - x to the power of y;
sqrt(x) - square root;
sin(x) - sine;
cos(x) - cosine;
tan(x) - tangent;
asin(x) - arc sine;
acos(x) - arc cosine;
atan(x) - arc tangent and
atan2(y,x) - arc tangent of y/x (works for x=0).
The following conversion functions have already been presented in the previous sections:
round(x) - converts a float value to the nearest Integer value;
ceil(x) - returns the nearest Integer value that is not smaller and
floor(x) - returns the nearest Integer value that is not larger.
abs
The float function abs(x) returns the absolute value of a given float value x.
{#abs( 123.456 ),
abs( -123.456 ),
abs( -0.0 )
#}
{# 123.456, 123.456, 0 #}
sgn
The float function sgn(x) returns the sign of the number x. For positive values it returns 1.0, for negative values -1.0 and for zero it returns zero:
{#sgn( 20.3 ),
sgn( -2.4 ),
sgn( 0.0 )
#}
{# 1, -1, 0 #}
between
The float function between(x,a,b) checks whether the number x lies between the numbers a and b, including the limit values. It is usually applied as x.between(a,b). This is equivalent to the expression x >= a and x <= b:
{#10.0 .between( 9.5, 10.5 ),
10.0 .between( 10.0, 10.5 ),
10.0 .between( 9.5, 10.0 ),
10.0 .between( 10.1, 10.5 ),
10.0 .between( 9.5, 9.9 )
#}
{# true, true, true, false, false #}
roundTo
The float function roundTo(x,y) rounds the float value x. The given float value y defines the least significant digit that is kept.
{#roundTo( 1234.56789, 0.001 ),
roundTo( 1234.56789, 0.01 ),
roundTo( 1234.56789, 0.1 ),
roundTo( 1234.56789, 1. ),
roundTo( 1234.56789, 10. ),
roundTo( 1234.56789, 100. ),
roundTo( 1234.56789, 1000. ),
roundTo( 1234.56789, 10000. )
#}
{# 1234.568, 1234.57, 1234.6, 1235, 1230, 1200, 1000, 0 #}
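One way to read these results is that x is rounded to the nearest multiple of y. The following Python sketch models that reading. It is an assumption based on the example above, not a description of the Wafl implementation, and the name round_to is hypothetical.

```python
def round_to(x: float, y: float) -> float:
    """Round x to the nearest multiple of y (assumed reading of roundTo)."""
    # Python's round() resolves exact ties to the even neighbour;
    # how Wafl treats ties is not stated in this chapter.
    return round(x / y) * y


for unit in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0):
    # Results are binary floats, so tiny representation errors are possible.
    print(unit, round_to(1234.56789, unit))
```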
exp
The float function exp(x) returns e to the power of x.
{#exp( -10.0 ),
exp( 0.0 ),
exp( 1.0 ),
exp( 10.0 )
#}
{# 4.539992976e-05, 1, 2.718281828, 22026.46579 #}
ln
The float function ln(x) returns the natural logarithm log_e x. It is defined for positive float values.
{#ln( 0.1 ),
ln( 1.0 ),
ln( 2.7182818284590452353602874),
ln( 100. ),
ln( 1000. )
#}
{# -2.302585093, 0, 1, 4.605170186, 6.907755279 #}
log
The float function log(x) returns the logarithm log_10 x. It is defined for positive float values.
{#log( 0.001 ),
log( 0.01 ),
log( 0.1 ),
log( 1.0 ),
log( 10. ),
log( 100. ),
log( 1000. )
#}
{# -3, -2, -1, 0, 1, 2, 3 #}
log2
The float function log2(x) returns the logarithm log_2 x. It is defined for positive float values.
{#log2( 0.001 ),
log2( 0.0078125 ),
log2( 0.25 ),
log2( 0.5 ),
log2( 1.0 ),
log2( 2. ),
log2( 4. ),
log2( 128. ),
log2( 1000. )
#}
{# -9.965784285, -7, -2, -1, 0, 1, 2, 7, 9.965784285 #}
pow
The float function pow(x,y) returns x^y - x to the power of y.
It is defined for positive x and any y. Negative x is only permitted if y is an integer. Zero x is only permitted for positive y.
{#pow( 2., 3. ),
pow( 2., -3. ),
pow( 2.5, -3.7 ),
pow( -2., 3. ),
pow( 0., 3.2 )
#}
{# 8, 0.125, 0.03369938443, -8, 0 #}
sqrt
The float function sqrt(x) returns the square root of x. It is defined for non-negative float values x.
{#sqrt(1.),
sqrt(4.),
sqrt(9.),
sqrt(16.),
sqrt(3433.32)
#}
{# 1, 2, 3, 4, 58.59453899 #}
The following trigonometric functions are available:
sin(x) - sine;
cos(x) - cosine;
tan(x) - tangent;
asin(x) - arc sine;
acos(x) - arc cosine;
atan(x) - arc tangent and
atan2(y,x) - arc tangent of y/x (works for x=0).
The angles are measured in radians. In addition, atan2(y,x) maps a pair of float values to the corresponding angle. If x is not zero, atan2(y,x) = atan(y/x), but atan2 is also defined for x=0.
{#sin(3.14/2.0) * cos(3.14/2.0),
tan(3.14/2.0),
asin(0.5) + acos(0.5),
atan(0.5),
atan2(1.0,2.0),
atan2(1.0,0.0)
#}
{# 0.0007963264582, 1255.765592, 1.570796327, 0.463647609, 0.463647609, 1.570796327 #}
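The last two results can be checked with any standard math library. The following Python sketch uses math.atan2 and math.atan from the Python standard library, on the assumption that the Wafl functions follow the same conventions as the C library. The variable names are only illustrative.

```python
import math

# For a positive second argument, atan2(y, x) agrees with atan(y / x).
same_angle = math.isclose(math.atan2(1.0, 2.0), math.atan(1.0 / 2.0))

# With x = 0 the quotient y / x is undefined, but atan2 still returns
# the angle of the point (0, 1), which is pi / 2.
vertical = math.atan2(1.0, 0.0)

print(same_angle)                           # True
print(math.isclose(vertical, math.pi / 2))  # True
```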
In this section we introduce the string functions.
The conversion functions (asChar, asString, ascii and toString) are presented in the previous sections.
The Wafl String type works with both single-byte strings and UTF-8 encoded multi-byte strings. However, some of the functions work with single-byte strings only. If a function does not work well with UTF-8 strings, this will be noted in this tutorial.
Function / Type and Description
strLen (String -> Int) - Returns the length of the character string.
length (Indexable['1]['2]['3] -> Int) - Returns the size of the collection.
size (Indexable['1]['2]['3] -> Int) - Returns the size of the collection.
strCat (String * String -> String) - String concatenation. The same as string addition.
isNull (String -> Bool) - Checks whether a string is a database NULL value.
ifNull (String * String -> String) - Replaces null with the given value: ifNull(x,c) = if isNull(x) then c else x
strLen
The function strLen(x) returns the length of the string x.
It is important to understand that string x can contain any characters and that characters with the ASCII code zero are not treated as string terminators. Therefore, the String type can work not only with character strings, but also with byte strings.
There are two more general synonyms: length and size.
{#strLen( "abc" ),
strLen( "abc\0abc" ),
length( "abc\0abc" ),
size( "abc\0abc" )
#}
{# 3, 7, 7, 7 #}
In the case of UTF-8 strings, strLen, length and size return the size in bytes. To get the real length of a UTF-8 string in UTF-8 code-points, please use utfLen.
strCat
The strCat(x,y) function computes the concatenation of two given strings. It is equivalent to the string operator +.
{#"abc" + "def",
strCat( "abc", "def" )
#}
{# 'abcdef', 'abcdef' #}
isNull, ifNull
Because of the database support, the String type supports the special undefined value NULL. The isNull(s) function checks whether the string s is NULL. The function ifNull(s,x) returns s if s is not NULL, but x if s is NULL.
ifNull(s,x) == if isNull(s) then x else s
{#
$-1, // This returns null string
isNull('a'),
isNull($-1),
ifNull("abc","xyz"),
ifNull($-1,"xyz")
#}
{# 'NULL', false, true, 'abc', 'xyz' #}
In the previous example, we used the expression $-1 to generate NULL strings. The prefix operator $ will be introduced later.
String extraction functions extract a part of the given string and return it. Wafl core library contains the following string extraction functions:
Function / Type and Description
sub (SequenceStr['2]['1] * Int * Int -> SequenceStr['2]['1]) - Extracts the subsequence from given 0-based position and given length: sub(seq,pos,len)
subStr (String * Int * Int -> String) - Returns a substring from given position (from 0) and with given length. [Deprecated. Use ‘sub’.]
strLeft (String * Int -> String) - Returns first N characters of the string. If N is negative, returns all but last -N elements.
strRight (String * Int -> String) - Returns last N characters of the string. If N is negative, returns all but first -N elements.
strLTrim (String -> String) - Trims all spaces from left side.
strRTrim (String -> String) - Trims all spaces from right side.
strTrim (String -> String) - Trims all spaces from both sides.
sub and subStr
The function sub(s,p,n) returns a substring of character string s that starts at the zero-based position p and has the length n.
In the Wafl core library there is also subStr, which is a synonym for sub. In the current version both functions are supported, but it is possible that only sub remains in future versions.
{#subStr( "abcdefgh", 0, 3 ),
sub( "abcdefgh", 0, 3 ),
sub( "abcdefgh", 2, 3 ),
sub( "abcdefgh", -2, 5 ),
sub( "abcdefgh", 5, 10 ),
sub( "abcdefgh", 5, -2 )
#}
{# 'abc', 'abc', 'cde', '', 'fgh', '' #}
Special cases, as the results above show:
a negative position yields an empty string (sub( "abcdefgh", -2, 5 ));
a length that reaches beyond the end of the string yields the rest of the string (sub( "abcdefgh", 5, 10 )) and
a negative length yields an empty string (sub( "abcdefgh", 5, -2 )).
In the case of UTF-8 strings, sub and subStr can return invalid strings. These functions treat strings as if they only consist of single byte characters. If a substring starts or ends in the middle of a multi-byte UTF-8 code-point, the result is not a valid UTF-8 string. To obtain a valid UTF-8 substring whose positions are specified in UTF-8 code-points, please use utfSub.
strLeft and strRight
The strLeft(s,n) function returns a substring containing the first n characters of the string s:
for non-negative n that is less than strLen(s), it is the same as sub(s,0,n);
for negative n it is the same as strLeft(s,strLen(s)+n).
The strRight(s,n) function returns a substring containing the last n characters of the string s:
for non-negative n that is less than strLen(s), it is the same as sub(s,strLen(s)-n,n);
for negative n it is the same as strRight(s,strLen(s)+n).
{#strLeft( "abcdefgh", 3 ), // first 3 characters
strLeft( "abcdefgh", 10 ), // whole string
strLeft( "abcdefgh", -5 ), // all but last 5 characters
strLeft( "abcdefgh", -10 ), // empty string
strRight( "abcdefgh", 3 ), // last 3 characters
strRight( "abcdefgh", 10 ), // whole string
strRight( "abcdefgh", -5 ), // all but first 5 characters
strRight( "abcdefgh", -10 ) // empty string
#}
{# 'abc', 'abcdefgh', 'abc', '', 'fgh', 'abcdefgh', 'fgh', '' #}
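The rules for positive, too large and negative n can be summarized in a small model. The following Python sketch is not Wafl code: the names str_left and str_right are hypothetical, and the handling of n beyond the string length is inferred from the results above. Python strings are indexed by code-points, so the byte-level UTF-8 caveat does not show up in this model.

```python
def str_left(s: str, n: int) -> str:
    """Model of strLeft: first n characters; negative n drops the last -n."""
    if n < 0:
        n = max(len(s) + n, 0)  # nothing is left if -n exceeds the length
    return s[:n]                # slicing clips n to the string length


def str_right(s: str, n: int) -> str:
    """Model of strRight: last n characters; negative n drops the first -n."""
    if n < 0:
        n = max(len(s) + n, 0)
    return s if n >= len(s) else s[len(s) - n:]


s = "abcdefgh"
print([str_left(s, n) for n in (3, 10, -5, -10)])   # ['abc', 'abcdefgh', 'abc', '']
print([str_right(s, n) for n in (3, 10, -5, -10)])  # ['fgh', 'abcdefgh', 'fgh', '']
```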
In the case of UTF-8 strings, strLeft and strRight can return invalid strings. These functions treat strings as if they only consist of single byte characters. If a substring starts or ends in the middle of a multi-byte UTF-8 code-point, the result is not a valid UTF-8 string. To obtain a valid UTF-8 substring whose positions are specified in UTF-8 code-points, please use utfLeft and utfRight.
strLTrim, strRTrim and strTrim
The function strLTrim(s) returns the largest substring of s that does not contain any leading non-visible characters.
The function strRTrim(s) returns the largest substring of s that does not contain any trailing non-visible characters.
The function strTrim(s) returns the largest substring of s that contains neither leading nor trailing non-visible characters.
{#strLTrim( "\0 \t \n abcd \b \003 \0 \r " ),
strRTrim( "\0 \t \n abcd \b \003 \0 \r " ),
strTrim( "\0 \t \n abcd \b \003 \0 \r " )
#}
{# 'abcd \010 \003 \000 \015 ', '\000 \011 \012 abcd', 'abcd' #}
The index operator s[i] is equivalent to subStr(s, i %% strLen(s), 1). This means that indexing beyond the length is possible. The index operator s[i] is similar, but not equivalent, to subStr(s,i,1). They are only equivalent if the following applies: 0 <= i < strLen(s).
{# s[-4], s[-3], s[-2], s[-1],
s[0], s[1], s[2], s[3],
s[4], s[6], s[7], s[8]
#}
where {
s = "abcd";
}
{# 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'c', 'd', 'a' #}
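The wrap-around can be modelled in a few lines. The following Python sketch is not Wafl code: the name char_at is hypothetical, and Python's % operator stands in for Wafl's %%, because it also yields a non-negative result for a positive divisor.

```python
def char_at(s: str, i: int) -> str:
    """Model of s[i]: the index is first reduced modulo the string length."""
    # Assumes a non-empty string; an empty one would raise ZeroDivisionError.
    return s[i % len(s)]


s = "abcd"
wrapped = [char_at(s, i) for i in (-4, -3, -2, -1, 0, 1, 2, 3, 4, 6, 7, 8)]
print(wrapped)
# ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'c', 'd', 'a']
```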
With UTF-8 strings, the indexing operator can return an invalid string. It treats strings as if they only consist of single byte characters. If the index points to an element of a UTF-8 multi-byte code-point, the result is not a valid UTF-8 string. To get a valid UTF-8 code-point whose position is specified in UTF-8 code-points, please use utfAt.
The slice operator uses a similar syntax to the index operator, but behaves like subStr, strLeft and strRight. If 0 < n < m <= strLen(s), then:
s[:n] is the same as strLeft(s,n), it extracts the first n characters;
s[n:] is the same as strRight(s,-n), it extracts all but the first n characters and
s[n:m] is the same as strRight(strLeft(s,m),-n), and the same as subStr(s,n,m-n).
If the index n is negative or greater than strLen(s), then n %% strLen(s) is used. The same applies to m.
{# s[:6],
s[:-2],
s[2:],
s[-6:],
s[2:6],
s[2:-2],
s[-6:6],
s[-6:-2]
#}
where {
s = "abcdefgh";
}
{# 'abcdef', 'abcdef', 'cdefgh', 'cdefgh', 'cdef', 'cdef', 'cdef', 'cdef' #}
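The normalization of the slice bounds can be sketched in the same way. The following Python model is not Wafl code. The names normalize and str_slice are hypothetical, and the treatment of an omitted bound (0 for the start, the string length for the end) is an assumption taken from the results above.

```python
def normalize(i: int, length: int) -> int:
    """Keep bounds inside [0, length]; reduce all others modulo the length."""
    # Assumes a non-empty string when a bound is out of range.
    return i if 0 <= i <= length else i % length


def str_slice(s: str, start=None, stop=None) -> str:
    """Model of s[start:stop] with the modulus rule applied to both bounds."""
    n = 0 if start is None else normalize(start, len(s))
    m = len(s) if stop is None else normalize(stop, len(s))
    return s[n:m]


s = "abcdefgh"
bounds = [(None, 6), (None, -2), (2, None), (-6, None),
          (2, 6), (2, -2), (-6, 6), (-6, -2)]
slices = [str_slice(s, a, b) for a, b in bounds]
print(slices)
# ['abcdef', 'abcdef', 'cdefgh', 'cdefgh', 'cdef', 'cdef', 'cdef', 'cdef']
```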
It is often easier to use the slice operator than extraction functions, but basically they do the same thing.
In the case of UTF-8 strings, slice operators can return invalid strings. They treat strings as if they only consist of single byte characters. If a slice starts or ends in the middle of a multi-byte UTF-8 code-point, the result is not a valid UTF-8 string. To obtain a valid UTF-8 slice whose positions are specified in UTF-8 code-points, please use utfSlice.
The Wafl core library contains the following string search functions:
Function / Type and Description
strPos (String * String -> Int) - Finds first position of a substring in the string, or -1 if not found.
strPosI (String * String -> Int) - Same as strPos, but ignores upper and lower case.
strNextPos (String * String * Int -> Int) - Finds next position of a substring in the string, after given pos.
strNextPosI (String * String * Int -> Int) - Same as strNextPos, but ignores upper and lower case.
strLastPos (String * String -> Int) - Finds last position of a substring in the string, or -1 if not found.
strLastPosI (String * String -> Int) - Same as strLastPos, but ignores upper and lower case.
strNextLastPos (String * String * Int -> Int) - Finds next last position of a substring in the string, before given pos.
strNextLastPosI (String * String * Int -> Int) - Same as strNextLastPos, but ignores upper and lower case.
strBeg (String * String -> Bool) - Checks whether the 2nd string is at the beginning of the 1st.
strEnd (String * String -> Bool) - Checks whether the 2nd string is at the end of the 1st.
Except for strBeg and strEnd, all search functions return the start position of the second specified string in the first specified string if it is found, and -1 if it is not found.
Functions whose names end with ‘I’ are case-insensitive: strPosI, strNextPosI, strLastPosI and strNextLastPosI.
In case-insensitive searches, both strings are first converted to upper case. This can be inefficient for larger strings.
strPos, strPosI, strNextPos and strNextPosI
The function strPos(s,p) returns the position of the first occurrence of the string p in the string s.
The function strPosI(s,p) returns the position of the first occurrence of the string p in the string s, ignoring upper and lower case.
The function strNextPos(s,p,i) returns the position of the first occurrence of the string p in the string s after position i.
The function strNextPosI(s,p,i) returns the position of the first occurrence of the string p in the string s after position i, ignoring upper and lower case.
{# strPos( s, n ), // not found
strPos( s, x ),
strNextPos( s, x, 0 ),
strNextPos( s, x, 3 ),
strNextPos( s, x, 6 ),
strNextPos( s, x, 9 )
#}
where {
s = "abcabcabcab";
x = "ab";
n = "xxx";
};
{# -1, 0, 3, 6, 9, -1 #}
strLastPos, strLastPosI, strNextLastPos and strNextLastPosI
The function strLastPos(s,p) returns the position of the last occurrence of the string p in the string s.
The function strLastPosI(s,p) returns the position of the last occurrence of the string p in the string s, ignoring upper and lower case.
The function strNextLastPos(s,p,i) returns the position of the last occurrence of the string p in the string s before the position i.
The function strNextLastPosI(s,p,i) returns the position of the last occurrence of the string p in the string s before the position i, ignoring upper and lower case.
Please note that the case-insensitive functions strLastPosI and strNextLastPosI do not work well for UTF-8 multibyte characters.
{# strLastPos( s, n ), // not existing
strLastPos( s, x ),
strNextLastPos( s, x, 9 ),
strNextLastPos( s, x, 6 ),
strNextLastPos( s, x, 3 ),
strNextLastPos( s, x, 0 )
#}
where {
s = "abcabcabcab";
x = "ab";
n = "xxx";
};
{# -1, 9, 6, 3, 0, -1 #}
strBeg and strEnd
The functions strBeg and strEnd check whether the first string has the second string at the beginning or at the end:
{#strBeg( "abcdef", "ab" ),
strBeg( "abcdef", "cd" ),
strBeg( "abcdef", "ef" ),
strEnd( "abcdef", "ab" ),
strEnd( "abcdef", "cd" ),
strEnd( "abcdef", "ef" )
#}
{# true, false, false, false, false, true #}
The Wafl core library contains the following functions for counting substrings:
Function / Type and Description
strCountSub (String * String -> Int) - Count occurrences of substring in the given string: strCountSub('aaaaaA','aa') == 4
strCountSubI (String * String -> Int) - Same as strCountSub, but ignores upper and lower case: strCountSubI('aaaaaA','aa') == 5
strCountSubDis (String * String -> Int) - Count disjunct occurrences of substring in the given string: strCountSubDis('aaaaaA','aa') == 2
strCountSubDisI (String * String -> Int) - Same as strCountSubDis, but ignores upper and lower case: strCountSubDisI('aaaaaA','aa') == 3
All counting functions return the number of occurrences of the given substring in the given string. The functions whose names end with ‘I’ ignore upper and lower case. The functions with ‘Dis’ in their names count only disjoint (non-overlapping) occurrences.
{# strCountSub('aaaaaA','aa'),
strCountSubI('aaaaaA','aa'),
strCountSubDis('aaaaaA','aa'),
strCountSubDisI('aaaaaA','aa')
#}
{# 4, 5, 2, 3 #}
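The difference between counting all occurrences and counting only disjoint ones can be modelled as follows. This is a Python sketch, not Wafl code: the function names are hypothetical, and the case-insensitive variants simply convert both strings to upper case, as described for the search functions.

```python
def count_sub(s: str, p: str) -> int:
    """Count all occurrences of p in s, including overlapping ones."""
    count, pos = 0, s.find(p)
    while pos != -1:
        count += 1
        pos = s.find(p, pos + 1)  # continue one character further
    return count


def count_sub_dis(s: str, p: str) -> int:
    """Count only disjoint (non-overlapping) occurrences of p in s."""
    return s.count(p)  # str.count never counts overlapping matches


def count_sub_i(s: str, p: str) -> int:
    return count_sub(s.upper(), p.upper())


def count_sub_dis_i(s: str, p: str) -> int:
    return count_sub_dis(s.upper(), p.upper())


counts = [f("aaaaaA", "aa")
          for f in (count_sub, count_sub_i, count_sub_dis, count_sub_dis_i)]
print(counts)  # [4, 5, 2, 3]
```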
The functions strCountSubI and strCountSubDisI first convert both strings to upper case letters. They can be inefficient with larger strings.
The Wafl core library contains the following functions for replacing the parts of the strings with another string:
Function / Type and Description
strReplace (String * String * String * Int -> String) - Replaces Nth occurrence of substring with given string: strReplace('ababa','b','c',2) == 'abaca'
strReplaceI (String * String * String * Int -> String) - Same as strReplace, but ignores upper and lower case.
strReplaceAll (String * String * String -> String) - Replaces all occurrences of substring with given string.
strReplaceAllI (String * String * String -> String) - Same as strReplaceAll, but ignores upper and lower case.
Each of the strReplace* functions evaluates a new string and does not change any of the given strings.
The function strReplace(s,p,x,i) returns a copy of string s in which the i-th occurrence of the substring p is replaced by x. The string s remains unchanged.
The function strReplaceI(s,p,x,i) is similar to the function strReplace, but ignores upper and lower case when searching for p.
The function strReplaceAll(s,p,x) returns a copy of the string s in which all occurrences of substring p are replaced by x. The string s remains unchanged.
The function strReplaceAllI is similar to the function strReplaceAll, but ignores upper and lower case when searching for p.
{#strReplace( s, 'a', '@', 2 ),
strReplaceI( s, 'a', '@', 2 ),
strReplaceAll( s, 'a', '@' ),
strReplaceAllI( s, 'a', '@' )
#}
where {
s = "abABabAB";
}
{# 'abAB@bAB', 'ab@BabAB', '@bAB@bAB', '@b@B@b@B' #}
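Replacing only the i-th occurrence can be modelled with a simple search loop. The following Python sketch is not Wafl code. The names replace_nth and replace_all_i are hypothetical, and two points are assumptions: occurrences are counted from 1, and the string is returned unchanged when it has fewer than i occurrences.

```python
import re


def replace_nth(s: str, p: str, x: str, i: int, ignore_case: bool = False) -> str:
    """Return a copy of s with the i-th (1-based) occurrence of p replaced by x."""
    if i < 1:
        return s
    # Upper-casing is only length-preserving for plain ASCII text.
    hay, needle = (s.upper(), p.upper()) if ignore_case else (s, p)
    pos = -1
    for _ in range(i):
        pos = hay.find(needle, pos + 1)
        if pos == -1:
            return s  # assumed: fewer than i occurrences leaves s unchanged
    return s[:pos] + x + s[pos + len(p):]


def replace_all_i(s: str, p: str, x: str) -> str:
    """Replace every occurrence of p in s, ignoring upper and lower case."""
    return re.sub(re.escape(p), lambda match: x, s, flags=re.IGNORECASE)


s = "abABabAB"
replaced = [replace_nth(s, "a", "@", 2),
            replace_nth(s, "a", "@", 2, ignore_case=True),
            s.replace("a", "@"),
            replace_all_i(s, "a", "@")]
print(replaced)  # ['abAB@bAB', 'ab@BabAB', '@bAB@bAB', '@b@B@b@B']
```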
Please note that the case-insensitive functions strReplaceI and strReplaceAllI can be inefficient with larger strings.
strSplit... and strJoin
Here we have a list of strings. The list is one of the most important concepts of functional programming languages, including Wafl. We will discuss lists in detail in the following chapter.
The function strSplit(s,p) returns a list of all substrings of the string s that are separated by the substring p.
The function strJoin(lst,p) concatenates all elements of the list lst and inserts the separator p between them.
Function / Type and Description
strSplit (String * String -> List[String]) - Splits a string to a list of string, by extracting the given separator.
strSplitTrim (String * String -> List[String]) - Splits a string to a list of string, by extracting the given separator. All spaces are trimmed from each segment from left and right side.
strSplitLines (String -> List[String]) - Splits a string to a list of string, by extracting new-line separator.
strSplitLinesTrim (String -> List[String]) - Splits a string to a list of string, by extracting new-line separator. All spaces are trimmed from each segment from left and right side.
strJoin (Sequence['1][String] * String -> String) - Joins (concatenates) a sequence of strings, adding the given separator.
strSplit( 'a,bb,c,dd,e', ',' )
['a', 'bb', 'c', 'dd', 'e']
strJoin( ['a','b','c','d','e','f','g','h'], ';' )
a;b;c;d;e;f;g;h
strJoin( strSplit( 'a,bb,c,dd,e', ','), ';' )
a;bb;c;dd;e
The function strSplitTrim
is similar to
strSplit
, but it also removes all spaces
before and after the separators. It is functionally equivalent to, but more
efficient than, mapping strTrim
over the result of the split:
s.strSplit(p).map(strTrim) == s.strSplitTrim(p)
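As a small illustration of this equivalence, the following sketch splits a comma-separated string whose items are padded with spaces. The input string is an arbitrary example. Based on the description above, the second and third expressions should yield the same list of trimmed segments, while the first one keeps the surrounding spaces.

```wafl
{# strSplit( ' a , bb ,c ', ',' ),                // segments keep their surrounding spaces
   strSplitTrim( ' a , bb ,c ', ',' ),            // segments are trimmed during the split
   strSplit( ' a , bb ,c ', ',' ).map( strTrim )  // same result, but with an extra pass
#}
```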
The function strSplitLines(s)
is similar to
strSplit(s,'\n')
, but it detects and removes both LF
(Linux, ‘\n
’) and CRLF (Windows, ‘\r\n
’) new
line sequences:
{#strSplit( '\nabc\ndef\nghi\n', '\n' ),
strSplit( '\r\nabc\r\ndef\r\nghi\r\n', '\n' ),
strSplitLines( '\nabc\ndef\nghi\n' ),
strSplitLines( '\r\nabc\r\ndef\r\nghi\r\n' )
#}
{# ['', 'abc', 'def', 'ghi', ''], ['\015', 'abc\015', 'def\015', 'ghi\015', ''], ['', 'abc', 'def', 'ghi', ''], ['', 'abc', 'def', 'ghi', ''] #}
The function strSplitLinesTrim
is similar to
strSplitTrim
and strSplitLines
. It is
functionally equivalent, but more efficient than mapping
strTrim
after using strSplitLines
:
s.strSplitLines().map(strTrim) == s.strSplitLinesTrim()
The function strChars
converts a string into a list of
characters.
{#strChars( 'abc\ndef' )
#}
{# ['a', 'b', 'c', '\012', 'd', 'e', 'f'] #}
In the case of UTF-8 strings, strChars
cuts a string
into bytes. To get a valid UTF-8 code-point list, use
utfChars
instead.
Sometimes we need to encode a string in a format that follows a
specific syntax. Wafl contains four string encoding functions. All these
functions have the common type: (String -> String)
.
The function strEncodeHtml(s)
returns an encoded string
that can be inserted into HTML. All special characters are replaced by
corresponding HTML escape sequences:
strEncodeHtml( "abc&<>def" )
abc&amp;&lt;&gt;def
The function strEncodeSql(s)
returns an encoded string
ready for use in SQL string literals. All special characters are
replaced by SQL escape sequences:
strEncodeSql( "abc'quotes'abc" )
abc''quotes''abc
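A typical use is building an SQL string literal from a value that may contain quotation marks. The following is only a minimal sketch: the table and column names are hypothetical, and the value is written inline for brevity.

```wafl
// The value is encoded first and then wrapped in single quotation marks.
// The table "customers" and the column "name" are hypothetical.
"select * from customers where name = '" + strEncodeSql( "O'Brien" ) + "'"
```

Following the example above, the quotation mark inside the name should be doubled, so the query ends with the literal 'O''Brien'.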
The function strEncodeUri(s)
returns an encoded string
according to the rules of URI syntax:
strEncodeUri( "a + b = c" )
a%20%2B%20b%20%3D%20c
The function strEncodeWafl(s)
returns an encoded string
according to the Wafl syntax:
strEncodeWafl( "a\n \0 \'\"..." )
a\012 \000 \'\"...
To handle UTF-8 strings correctly, please only use the functions that handle strings as sequences of UTF-8 multibyte code-points.
Most of the specific UTF-8 behavior is covered by the following functions. Please note, however, that two important functionalities are not yet supported by the Wafl library:
Some applications require that files with UTF-8 content have a UTF-8 BOM (Byte Order Mark) sequence at the beginning of the file. The following functions enable the handling of UTF-8 BOM.
Please note that the use of UTF-8 BOM is not recommended, as the byte-order in UTF-8 format is irrelevant. It is based on individual bytes, not words.
Function / Type and Description
utfBom
( -> String)
Returns UTF-8 BOM sequence.
utfIsBom
(String -> Bool)
Checks whether a string content
is UTF-8 BOM.
utfHasBom
(String -> Bool)
Checks whether a string begins
with UTF-8 BOM.
utfAddBom
(String -> String)
Adds a UTF-8 BOM, if not
already present.
utfTrimBom
(String -> String)
Trims leading BOM, if
present.
The function utfBom()
returns the UTF-8 BOM.
The function utfIsBom(s)
checks whether the string
content corresponds exactly to the UTF-8 BOM.
The function utfHasBom(s)
checks whether the string
begins with a UTF-8 BOM.
{#utfBom(),
utfIsBom( utfBom() ),
utfIsBom( 'abc' ),
utfHasBom( utfBom() + 'abc' ),
utfHasBom( 'abc' )
#}
{# '\357\273\277', true, false, true, false #}
The function utfAddBom(s)
returns a string with a UTF-8
BOM prepended, if it is not already present.
The function utfTrimBom(s)
returns a string without a
leading UTF-8 BOM.
{#utfAddBom('abc'),
utfHasBom( utfAddBom('abc') ),
utfTrimBom( utfAddBom('abc') ),
utfHasBom( utfTrimBom( utfAddBom('abc') ) )
#}
{# '\357\273\277abc', true, 'abc', false #}
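Because the BOM is added only if it is not already present, applying utfAddBom twice should not duplicate it. The sketch below checks this under the assumption that strings can be compared with the == operator. The BOM occupies three bytes, as its octal form above shows.

```wafl
{# utfAddBom( utfAddBom('abc') ) == utfAddBom('abc'),  // a second call adds nothing
   strLen( utfAddBom('abc') ),                        // 3 bytes of BOM + 3 bytes of text
   utfTrimBom( 'abc' )                                // no BOM present, string is unchanged
#}
```

Given the descriptions above, the three expressions should evaluate to true, 6 and 'abc'.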
Function / Type and Description
utfIsValid
(String -> Bool)
Checks whether a string is a
valid UTF-8 encoded string.
utfRepInvalid
(String * String -> String)
Replaces invalid
code points with the given character.
The function utfIsValid(s)
checks whether a string is a
valid UTF-8 string. Please note that every string consisting only of
single-byte (7-bit ASCII) characters is a valid UTF-8 string.
The function utfRepInvalid(s,c)
returns a string in
which all invalid UTF-8 code-points are replaced by the character
c
.
{#utfIsValid( sub('abc€def',4,4) ),
utfRepInvalid( sub('abc€def',0,4), '@' ), // only the first byte of a MB
utfRepInvalid( sub('abc€def',4,4), '@' ) // only the second byte of a MB
#}
{# false, 'abc@', '@@de' #}
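One practical consequence is that the result of utfRepInvalid should itself pass the validity check, provided that the replacement character is valid. A short sketch, reusing the strings from the example above:

```wafl
{# utfIsValid( 'abc€def' ),                                // complete code points only
   utfIsValid( sub('abc€def',0,4) ),                       // ends in the middle of '€'
   utfIsValid( utfRepInvalid( sub('abc€def',0,4), '@' ) )  // repaired copy 'abc@'
#}
```

Under the behavior described above, the expected values are true, false and true.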
utfLen
Function / Type and Description
utfLen
(String -> Int)
Returns UTF-8 length, as a
number of complete code points.
The function utfLen(s)
returns the string length by
counting the complete UTF-8 code-points. The string length in
code-points is always less than or equal to the length in bytes
(strLen
, length
or size
).
{#strLen( 'abc€def' ),
utfLen( 'abc€def' )
#}
{# 9, 7 #}
utfAt
Function / Type and Description
utfAt
(String * Int -> String)
Returns the code point at
the given position, indexed by code points.
The function utfAt(s,i)
is similar to the indexing
operator s[i]
, but uses indices based on code-points. It
returns the i
-th UTF-8 code-point of the string
s
.
For more details, see the indexing operator.
{#s[0], s[1], s[2], s[3],
s.utfAt(0), s.utfAt(1), s.utfAt(2), s.utfAt(3)
#}where {
s = "abАБабAB";
}
{# 'a', 'b', '\320', '\220', 'a', 'b', '\320\220', '\320\221' #}
{#s[-1], s[-2], s[-3], s[-4],
s.utfAt(-1), s.utfAt(-2), s.utfAt(-3), s.utfAt(-4)
#}where {
s = "abАБабAB";
}
{# 'B', 'A', '\261', '\320', 'B', 'A', '\320\261', '\320\260' #}
utfSub
,
utfSlice
, utfLeft
, utfRight
Function / Type and Description
utfSub
(String * Int * Int -> String)
Returns a
substring from given position (from 0) and with given length, indexing
complete UTF-8 code points instead of characters.
utfSlice
(String * Int * Int -> String)
Returns a
substring between two given positions, indexing complete UTF-8 code
points instead of characters.
utfLeft
(String * Int -> String)
Returns first N UTF-8
code points of the string.
utfRight
(String * Int -> String)
Returns last N UTF-8
code points of the string.
The function utfSub(s,p,n)
is similar to
subStr(s,p,n)
, but uses code-point based indices. It
returns a substring of the string s
, beginning at the zero
based position p
with length n
, where position
and length are counted based on UTF-8 code-points instead of
characters.
For more details, please read subStr
.
{#subStr( "abcdАБВГабвгABCD", 0, 8 ),
utfSub( "abcdАБВГабвгABCD", 0, 8 )
#}
{# 'abcd\320\220\320\221', 'abcd\320\220\320\221\320\222\320\223' #}
The function utfSlice(s,n,m)
is similar to the string
slice operator s[n:m]
, but uses code-point based indices.
It returns a substring of the string s
, starting at the
zero based position n
and ending before the position
m
, where positions n
and m
are
counted based on UTF-8 code-points instead of characters.
For more details, see the string slice operator.
{#"abcdАБВГабвгABCD" [2:6],
utfSlice( "abcdАБВГабвгABCD", 2, 6 )
#}
{# 'cd\320\220', 'cd\320\220\320\221' #}
The function utfLeft(s,n)
is similar to the string
function strLeft
, but uses code-point based indices. It
returns a substring containing the first n
UTF-8
code-points of the string s
.
For more details please study strLeft
.
{#strLeft( "abcdАБВГабвгABCD", 6 ),
utfLeft( "abcdАБВГабвгABCD", 6 )
#}
{# 'abcd\320\220', 'abcd\320\220\320\221' #}
The function utfRight(s,n)
is similar to the string function
strRight
, but uses indices based on code-points. It returns
a substring containing the last n
UTF-8 code-points of the
string s
.
For more details, please see strRight
.
{#strRight( "abcdАБВГабвгABCD", 6 ),
utfRight( "abcdАБВГабвгABCD", 6 )
#}
{# '\320\263ABCD', '\320\262\320\263ABCD' #}
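These four functions are closely related. Assuming that they behave exactly as described above, and that strings can be compared with the == operator, all comparisons in the following sketch should hold:

```wafl
{# s.utfLeft(6)  == s.utfSub(0,6),                 // the first n code points
   s.utfSub(4,4) == s.utfSlice(4,8),               // position and length vs. two positions
   s.utfRight(6) == s.utfSub( s.utfLen() - 6, 6 )  // the last n code points
#} where {
    s = "abcdАБВГабвгABCD";
}
```

The same relations would not hold in general if code-point functions were mixed with the byte-based ones, because the positions then refer to different units.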
Function / Type and Description
utfChars
(String -> List[String])
Splits a string to a
list of UTF-8 code points.
utfReverse
(String -> String)
Reverses UTF-8 string.
The function utfChars(s)
is similar to the string
function strChars
, but uses code-points instead of
characters. It returns a list of all UTF-8 code-points of the string
s
.
For more details please study strChars
.
{#strChars( "abАБабAB" ),
utfChars( "abАБабAB" )
#}
{# ['a', 'b', '\320', '\220', '\320', '\221', '\320', '\260', '\320', '\261', 'A', 'B'], ['a', 'b', '\320\220', '\320\221', '\320\260', '\320\261', 'A', 'B'] #}
The function utfReverse(s)
is similar to the string
function strReverse
, but uses code-points instead of
characters. It returns a reversed string, taking care to preserve the
UTF-8 code-points.
For more details, please read strReverse
.
{#strReverse( "abАБабAB" ),
utfReverse( "abАБабAB" )
#}
{# 'BA\261\320\260\320\221\320\220\320ba', 'BA\320\261\320\260\320\221\320\220ba' #}
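The difference matters when the result is processed further as UTF-8. According to the descriptions above, the byte-wise reversal of this string is no longer valid UTF-8, while the code-point reversal is, and reversing twice should restore the original string. The sketch assumes that strings can be compared with the == operator.

```wafl
{# utfIsValid( strReverse( s ) ),      // continuation bytes now precede the lead bytes
   utfIsValid( utfReverse( s ) ),      // code points stay intact
   utfReverse( utfReverse( s ) ) == s  // reversing twice gives the original back
#} where {
    s = "abАБабAB";
}
```

The expected values are false, true and true.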
Here we discuss three other string functions:
Function / Type and Description
strLowerCase
(String -> String)
Converts all letters to lower
case.
strUpperCase
(String -> String)
Converts all letters to upper
case.
strReverse
(String -> String)
Reverses the string.
The function strLowerCase
converts all letters in a
string to lower case.
The function strUpperCase
converts all letters in a
string to upper case.
{#strLowerCase( 'aAbBcC' ),
strUpperCase( 'aAbBcC' )
#}
{# 'aabbcc', 'AABBCC' #}
The function strReverse
returns the reversed string.
{#strReverse( 'aAbBcC' )
#}
{# 'CcBbAa' #}
In the case of UTF-8 strings, strReverse
can return
invalid strings. This function treats strings as if they only consisted
of single byte characters. To get a valid UTF-8 reversed string, please
use utfReverse
.