Table of Contents
Wafl is a strongly typed programming language. That means Wafl requires a detailed and very strict type checking before program evaluation. Each function application must fully comply with the function type.
Strong type checking is usually paired with explicit name type declarations. However, Wafl need no explicit type declarations, because the types are automatically inferenced during the code analysis.
Because of the strict type checking, no implicit type conversions are allowed.
In the following example we have a program with wrong type usage. Addition operators is defined for different types, but it is not defined to work on operands of different types. So, a type checking error is reported:
1 + '2.0'
12.0
The proper way to handle such problems is to use explicit type conversions. This will be covered later in the tutorial.
Wafl is a strongly typed programming language. However, Wafl syntax does not include explicit type specification. All types are implicitly inferred. Usually, the program evaluation does not present the inferred definition types to users, but if any type error is detected during the type checking, all inferred types are reported.
In the following example there are three local functions defined:
g
is recognized to be of type
(Float -> Float)
(maps single float argument to float
result);h
is recognized to be of type
(Integer -> Integer)
(maps single integer argument to
integer result).f
is not well defined, because there is an
addition operator with first operand of float type and second operand of
integer type, but implemented addition operator does not allow the
different operand types.7.0)
f(where{
f(x) = g(x) + h(x); // Type checking error!
g(x) = sin(x);
h(x) = x + 5;
}
--- Loading: C:\Users\smalkov\AppData\Local\Temp\wafltmpfile_727892_0.tmp
Typechecking failed!
- f ( x ) : '2 [Line: 4]
The argument (2) type is not suitable for the context :
g(x)+h(x) [Line: 4]
The argument (1) type is not suitable for the context :
h(x) [Line: 4]
Exp.type not compatible with : Int
The expression type is not suitable for the context:
x [Line: 4]
Exp.type not compatible with : Int
--- End loading: C:\Users\smalkov\AppData\Local\Temp\wafltmpfile_727892_0.tmp
The error report contains some redundant data because all branches of the type checking process are reported to allow better understanding of reported problems. If we read the report from the beginning, we can see all the actions performed by type checker:
f
type alone and failed, because
f
uses names g
and h
, which types
are not inferred, yet;g
type alone and inferred its type:
(Float -> Float)
;h
type alone and inferred its type:
(Int -> Int)
;f
type alone and failed again,
because h(x)
is not a valid usage in the context, which
means that x
is already assumed to have a non-float
type;f
type, but now in a wider context
and fails because of the same reason.Sometimes, functions use mutual recursion, i.e. each of them uses another. In that case all these functions have to be checked together, at the same time. Therefore, the type checking is performed in a wider context if a local type checking failed.
In most cases, it is better to read the report from the bottom. The
last block of reported not recognized types is the most relevant. Each
line in the report beginning with OK:
suggests that the
given definition type is well inferred.
Reading from the bottom, the error report shows that type checking
fails for Program
and f
and the problem is in
h(x)
subexpression. The further analysis shows that this is
h(x)
from f
definition body.
Polymorphism is the property of a function to be well defined for arguments of different data types. Implicit polymorphism is very important feature of programming language Wafl.
The type inference premises that each of local definitions is fully polymorphic, i.e. that any argument of any function can be of any data type. Step by step, the assumed polymorphic types are reduced to types for which expressions that define functions are applicable.
In the following example, two polymorphic functions add
and sub
are defined and used with arguments of different
types:
{#1,2),
add(1.3,2.3),
add('a','b'),
add(sub(5,1),
sub(5.4,1.2),
5,4,3),
f(1.2, 3.4, 1.3),
f(asFloat(add(1,2)), 3.1 )
add(
#}where{
add(x,y) = x + y;
sub(x,y) = x - y;
f(x,y,z) = sub( add(x,y), z);
}
{# 3, 3.6, 'ab', 4, 4.2, 6, 3.3, 6.1 #}
Function add
is determined to have type
(Value['1] * Value['1] -> Value['1])
. That means:
Integer
, Float
or String
);In a similar way, the type of the function sub
is
inferred to be
(Numeric['1] * Numeric['1] -> Numeric['1])
. Thus:
Integer
or Float
);Both Numeric['1]
and Value['1]
are
so-called combined data types. They can be further reduced to
different primitive data types. Type '1
represents a
polymorphic type name. Same polymorphic type names in a type must reduce
to same specific types in each specific function application.
In each particular function application, the function type has to be reduced to non-polymorphic data type. However, different applications of a same polymorphic function may reduce to different types in a same program. If used in functions definitions, polymorphic functions can be used in a polymorphic way, with no further data type reduction, or with reduction to more specific but yet another polymorphic type. Their types will be reduced further when the application of defining function induces that.
Function f
in our example is defined as:
= sub( add(x,y), z); f(x,y,z)
and its type is inferred to be
(Numeric['1] * Numeric['1] * Numeric['1] -> Numeric['1])
.
Thus:
Integer
or Float
);Function f
uses both add
and
sub
. In specific application, the type of function
add
is reduced from Value['1]
to
Numeric['1]
, while the type of function is sub
is not further reduced.
It is important to note that each specific application is
reduced independently of the other applications. This is why our
function add
is in the same example successfully applied to
a pair of integers, a pair of floats and a pair of strings, and
functions sub
and f
are successfully applied
to integers and floats. Moreover, in the last expression function
add
is applied to both integers and floats.
Function asFloat
converts a value from integer to float.
Conversion functions will be covered later in the tutorial.
One of the essential properties of functional programming languages is that functions are treated as first order citizens. Functions can be used as arguments and results of other functions and in a much simpler way than in the most of the imperative languages.
The functions which take functions as arguments or return a function as a result are referred as higher order functions. They are essential for functional programming.
In the following simple example, function double
is used
as argument of function apply
application.
2)
apply(double,where{
apply(f,x) = f(x);
double(x) = x + x;
}
4
Function double
has type
(Value['1] -> Value['1])
. It is similar to function
add
from one of the previous examples.
Function apply
has type
(('1 -> '1) * '1 -> '1)
. That means:
('1 -> '1)
;'1
,'1
and'1
may be replaced with
any type, but at the same time in all previous
assertions.In the given example, function apply
applies function
double
to 2
. In this specific application type
variable '1
is replaced with Int
.
More complex examples follow later.
A partial application is any function application where not all expected arguments are specified.
For example, if we would only specify that the first argument of the
function add
(from previous examples), the result would be
a unary function that returns its argument increased by one.
One way to do this is to define a new function:
= add( 1, x ); f(x)
Another way is to use partial application:
= add(1,_); f
In Wafl, partial application syntax is similar to a regular function
call, but underscore ‘_
’ is used instead of
unspecified arguments. It is always necessary to use as much
arguments as the function expects, but some of the arguments may be
specified, and others may remain unspecified.
The result of a partial application is a function. Its arguments correspond to unspecified arguments of the partial application, in the preserved order.
It is possible to have all of the arguments unspecified. However, this is not very useful because the result is the function itself.
In the following example we use function
subStr(str,pos,len)
. It extracts a substring of
str
that begins at position pos
(zero based)
and is len
characters long. String functions will be
discussed in details later.
Several partial applications are used in the following example:
{#"1234567890" ),
strGetFirst5a( "1234567890" ),
strGetFirst5b( "123456", 1 ),
charAt( "123456", 2 ),
charAt( 5 ),
add2( // The following expressions evaluate the same result
subStr( "1234567890", 5, 3 ),
subStr( _, 5, _ )( _, 3 )( "1234567890" )
#}where{
// The following definitions are the same.
// Both functions return
// the first 5 characters of a string.
strGetFirst5a( s ) = strLeft( s, 5 );
strGetFirst5b = strLeft( _, 5 );
// charAt(n) returns the n-th character of a String.
charAt = subStr( _, _, 1 );
// twice(f,x) applies f to x two times
twice( f, x ) = f( f(x) );
// inc(x) returns x incremented by 1
inc(x) = x + 1;
// add2 returns its argument incremented by 2
add2 = twice( inc, _ );
}
{# '12345', '12345', '2', '3', 7, '678', '678' #}
In many functional programming languages partial application is implemented by specifying only the first argument of a function (or more general, some of the arguments from the beginning of the argument list, in the respected order). Such approach is called curried functions. The result of a curried partial application is a function that maps the remaining arguments to the result.
The partial application as implemented in Wafl is more general than curried functions.
A reader may wonder what’s the use of partial application. Like some other topics in this part of the tutorial, their purpose will become clear once we introduce advanced sequence processing techniques.
Higher order functions are so often used in functional programming that we need as many methods as possible to define or create new functions. Partial application is one such method. Lambda functions are another, as we will see in the next section.
In functional programming languages, term lambda function is related to different concepts of defining unnamed functions at the place of their usage.
When we use higher order functions, we often need to specify as an argument a function that is not needed elsewhere. If we use a function only once in a program, it is easier and often more readable to define it inline than to introduce a regular function definition.
In programming language Wafl we define lambda functions in the following form:
\ <arg
1
>, ..., <arg
n
>: <exp>
A lambda function may appear at any place where we can use a regular function name. Sometimes it is required to parenthesize the lambda function to make the syntax clear.
We will often use simplified term lambda instead of lambda function or unnamed function.
In our first lambda example, we define and use an unnamed function that returns its argument increased by 1:
\x:x+1)(41) (
42
This function is the same as inc
function from previous
examples, but now it is defined as a lambda and used at the place of the
definition. It is much shorter than using a full definition syntax, like
in the following example:
41)
inc(where {
inc(x) = x+1;
}
42
The following examples are more complex than previous examples. It may be more difficult to understand them, for beginners in functional programming. But such examples are essential - please read carefully.
Let us now, for a first time, define a function that has only
functions for its arguments and returns a new function as a result.
Function combine
takes two functions as arguments. It
returns a unary function that applies the combined functions to its
argument. So, the following expression should always evaluate
true
:
== f(g(x)) combine(f,g)(x)
One way to write combine
is to define a helper
function c
first, which takes three arguments
f
, g
and x
, and to return its
partial application:
= c(f,g,_)
combine where {
c(f,g,x) = f(g(x));
}
A simpler way is to use lambda instead of defining the function
c
, but the solution is equivalent:
{#1),
add2(1),
add3(10),
combine( inc, inc )(\x:x+1,\x:x+1)(10)
combine(
#}where{
combine( f, g ) = ( \f,g,x : f(g(x)) )( f, g, _ );
add3 = combine( add2, inc );
add2 = combine( inc, inc );
inc(x) = x + 1;
}
{# 3, 4, 12, 12 #}
In the next section, we will see that combine
can be
implemented even more easily.
In the following example, we use combine
to define the
repeat
function that generalizes twice
from
the previous examples. For a given function f
and a given
integer n
, repeat
returns a function that
applies f
to its argument n
times. In
repeat
, we use the lambdas to write the identity function,
and in the main expression we use a lambda equivalent of the
inc
function:
{#\x:x+1, 0 )( 100 ),
repeat( \x:x+1, 1 )( 100 ),
repeat( \x:x+1, 10 )( 100 )
repeat(
#}where{
repeat( f, n ) =
if n=0
then \x:x
else combine( f, repeat(f,n-1) );
combine( f, g ) = ( \f,g,x : f(g(x)) )( f, g, _ );
}
{# 100, 101, 110 #}
Notice that this implementation of repeat
is not very
efficient. However, it shows how higher order functions may be combined
with lambdas to accomplish very interesting goals.
In functional programming languages, a closure is a function that binds some elements of its outer scope. We may say that partial applications are a kind of closures.
Let us go back to the combine
function definition. We
can say that combine
binds given functions f
and g
in a closure:
= ( \f,g,x : f(g(x)) )( f, g, _ ); combine( f, g )
Definitions like this one are very often: first we have to define a function with more arguments and then we have to create a closure by partial application of this function. It seems like we write some things twice.
Luckily, we may use lambda closures (or lambdas with bindings), as an easier way to create closures. To define a lambda closure we need not only the lambda arguments, but also the list of the names from the definition scope which we need to bind. These names are listed in square brackets, after the lambda arguments:
\ <arg
1
>, ..., <arg
n
> [ <name
1
>, ..., <name
k
> ] : <exp>
There is an alternative older syntax for the same thing, where a hash
symbol ‘#
’ is used to separate the list of lambda arguments
from the list of bound scope names:
\ <arg
1
>, ..., <arg
n
> # <name
1
>, ..., <name
k
>: <exp>
Now, using a lambda closure, we may define the combine
function even easier than before:
{#\x:x+1,\x:x+1)(10),
combine(+(2,_), operator*(8,_))(5)
combine(operator
#}where{
combine( f, g ) = \x [f,g]: f(g(x));
}
{# 12, 42 #}
The partial application syntax is not applicable to operators. Also,
operators cannot be used as function application arguments. This is why
Wafl introduces the functional syntax of the operators. Each operator
has a corresponding equivalent function. If there is an operator
<op>
, then there is also the corresponding function
operator<op>
, with the same number of arguments and
the same type.
The following example shows how we can use the operators’ functional
syntax to partially apply the operator ‘+
’:
{#+( _, 'b' )( 'a' ),
operator+( 'a', _ )( 'b' ),
operator+( _, 5 ), 10 ),
apply( operator+( _, 5 ), _ )( 10 ),
apply( operator+, _, _ )( 5, 10 ),
apply2( operator5 ,10 )( operator+ )
apply2( _,
#}where{
apply( f, x ) = f( x );
apply2( f, x, y ) = f( x, y );
}
{# 'ab', 'ab', 15, 15, 15, 15 #}
If an operator is available in both unary and binary form, then the
corresponding operator<op>
function corresponds to
binary form only. For example, while ‘-
’ operator is
available in both unary and binary form, the corresponding
operator-
function has only the binary form. It is the same
for ‘*
’ and ‘%
’ operators.
Because of the functional semantics of subprograms, the source code of programs written in functional programming languages is very often quite unreadable.
For example, consider the following definition of the function
g
. For a given string s
it extracts a
substring from position 5 of length 8, then replaces each
‘a
’ with ‘B
’, then replaces each
‘b
’ with ‘A
’, and finally converts the string
to lowercase letters. It is basically a very simple function, but it’s
very unreadable, no matter how carefully we indent its code:
=
g(s) strLowerCase(
strReplaceAll(
strReplaceAll(
subStr(s,5,8),
'a','B'
),'b','A'
); )
Wafl tries to resolve the problem by introducing a dot operator
.
, which is very similar to object-oriented-like dot
syntax. In short, any function defined for at least one argument is
applicable not only using the usual syntax:
f( x
1
, x
2
, ..., x
n
)
but also using the dot syntax, where the function application is written as the OO method of the first argument:
x
1
.f( x
2
, ..., x
n
)
These two syntaxes have the same semantics. Programmers are free to use both and choose the best one for each specific case.
The usage of dot syntax usually leads to more readable source code.
In the following example, we define two equivalent functions. For
f
we use the dot syntax and for g
we
use the usual syntax of the function application.
{#'abcdABCDabcdABCD' ),
f( 'abcdABCDabcdABCD' )
g(
#}where{
f(s) =
subStr(5,8)
s.strReplaceAll('a','B')
.strReplaceAll('b','A')
.strLowerCase();
.
g(s) =
strLowerCase(
strReplaceAll(
strReplaceAll(
subStr(s,5,8),
'a','B'
),'b','A'
);
)}
{# 'bcdbacda', 'bcdbacda' #}
If the dot operator is used after an integer literal, the parser may
treat the dot symbol ‘.
’ as a decimal point. To prevent the
confusion in such cases, just insert a space before the dot symbol. For
example, use 2 .abs()
instead of 2.abs()
.
A successive usage of the dot syntax may represent an application of transofrmational semantics. Each application of the dot syntax represents a single transformation step, which transforms its first argument into the result of the applied function. The entire sequence of dot operator applications represents a series of such transformational steps.
In the last example, the function f
is defined as a
series of transformations of the input string x
. Four
successive transformations are applied to calculate the result of the
function.
To enable such applications, having the dot syntax in mind, any function that has one main argument and can be thought of as some kind of transformation of that argument, should be defined to have the main argument as the first one.
The dot operator has the highest priority. The only operators with
higher precedence are the library referencing operator ‘::
’
and the record/tuple element selector ‘$
’ and record/tuple
element update operator ‘^
’. So, in the following example,
applying the abs
function via the dot operator will precede
the multiplication:
-3 * 2 .abs()
-6
The high precedence of the dot operator often requires the use of
parentheses around the preceding expression. To avoid burdensome
parentheses, Wafl supports the arrow operator
‘=>
’, which behaves very similar to the dot operator,
with the only difference that it has the lowest priority.
If we replace ‘.
’ with ‘=>
’ in the
previous example, then the abs
function application will
have the lowest priority and will be evaluated after the entire
preceding expression has been evaluated:
-3 * 2 => abs()
6
In first Wafl versions, this technique was called arrow
syntax and the operator ‘->
’ was used instead of
‘.
’. In the current version, the operators
‘->
’ and ‘.
’ are equivalent. However, we
encourage the use of the dot syntax. The arrow operator may not be
supported in some of the future versions. This old arrow operator
‘->
’ should not be confused with the new arrow operator
‘=>
’.
Few sections earlier, there was an example on Fibonacci numbers:
1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
{# fib(where{
fib( n ) =
if n<=2 then 1
else fib(n-1) + fib(n-2);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
This is an interesting example on recursive programming. However,
even a simple analysis shows that this approach is very inefficient. The
problem is that the function is evaluated again and again on the same
numbers. Let’s analyze the case of fib(6)
:
fib(6)
-> fib(5) + fib(4)
| -> fib(3) + fib(2)
| | -> 1
| -> fib(2) + fib(1)
| -> 1 -> 1
-> fib(4) + fib(3)
| -> fib(2) + fib(1)
| -> 1 -> 1
-> fib(3) + fib(2)
| -> 1
-> fib(2) + fib(1)
-> 1 -> 1
As we can see: fib(6)
is evaluated once,
fib(5)
once, fib(4)
twice, fib(3)
three times, fib(2)
5 times and fib(1)
3
times. We can see that this sequence corresponds to the Fibonacci
numbers sequence, except the last step: 1, 1, 2, 3, 5, 3. Because just
two lowest level calls return the exact value, it is simple to conclude
that for any n
, the total number of fib(2)
and
fib(1)
evaluations is equal to fib(n)
. And
that grows fast, approximately as ϕn, where ϕ is golden ratio
(1.61803…).
So, our function fib
becomes very slow for large
numbers. How can we improve this? From the formula it is obvious that
during the computation we should keep the last two numbers. It is
relatively simple to do in imperative programming. For example, in C/C++
we could do something like:
unsigned fib( unsigned n ) {
unsigned prev2=1, prev1=1, result=1;
// for n=0 and n=1 result is already ok
// result == 1 == fib(0) == fib(1)
// prev1 == 1 == fib(1)
// prev2 == 1 == fib(0)
for( int i=2; i<=n; i++ ) {
= prev1 + prev2; // result = fib(i)
result = prev1; // prev2 = fib(i-2)
prev2 = result; // prev1 = fib(i-1)
prev1 }
return result;
}
But in Wafl we do not have variables, and this is not as simple as in C++. There is no implicit computation state, and the only way to carry on the computation state is to do it explicitly. One solution may be a strait transition from the C++ example:
1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
{# fib(where {
fib(n) = if n <= 2 then 1
else fib_s(1,1,2,n);
fib_s(p1,p2,i,n) = if i=n then p2
else fib_s(p2,p1+p2,i+1,n);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
If we check this carefully, we may remove the constant factor
n
from fib_s
and get a better solution:
1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
{# fib(where {
fib(n) = if n <= 2 then 1
else fib_s(1,1,n-2);
fib_s(p1,p2,i) = if i=0 then p2
else fib_s(p2,p1+p2,i-1);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
However, it is not hard to see that this approach to the problem is not so nice.
Another approach to the exponentially growing recursion is to remember the previous results. This is where cached functions may help.
Cached functions is a Wafl technique for remembering the previous
results of a function. If we have a fast growing recursion tree, like in
fib
case, or we have a function that is very often computed
for the same arguments, then we can suggest that such function should
have its results cached.
The syntax is simple:
<name>(<args>) = cached <exp>;
If we try to use it with fib
, then it becomes:
1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
{# fib(where{
fib( n ) = cached
if n<=2 then 1
else fib(n-1) + fib(n-2);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
If we test it on larger numbers we can see that it is much faster
than the original solution, but that the previous solution, with
explicit state transfer, is still more efficient. It is not a surprise -
the problem is that the cached solution must store computed results and
look for previously stored result. However, if we use fib
over and over again on the same range of numbers, the cached solution
becomes much more efficient that any other.
And it is simple to use, which is a significant benefit.
The caching efficiency largely depends on the complexity of the arguments and the results. If they are complex (require a large storage or are expensive to compare), then its benefits are questionable.
Wafl internally tries to reduce the cached result storage size to prevent enormous memory usage, by removing some cached values. However, the memory usage can become significant if many functions are cached or if the range of the argument is very large, which is often the case for the functions with many arguments.