Wafl is a strongly typed programming language. That means Wafl requires a detailed and very strict type checking before program evaluation. Each function application must fully comply with the function type.
Strong type checking is usually paired with explicit name type declarations. However, Wafl need no explicit type declarations, because the types are automatically inferenced during the code analysis.
Because of the strict type checking, no implicit type conversions are allowed.
In the following example we have a program with wrong type usage. Addition operators is defined for different types, but it is not defined to work on operands of different types. So, a type checking error is reported:
1 + '2.0'
--- Loading: C:\Users\smalkov\AppData\Local\Temp\wafltmpfile_sgk8.18.tmp
Typecheck program in context: Program
The expression type is not suitable for the context: '2.0'
- Program : '1
Typechecking failed!
--- End loading: C:\Users\smalkov\AppData\Local\Temp\wafltmpfile_sgk8.18.tmp
The proper way to handle such problems is to use explicit type conversions. This will be covered later in the tutorial.
Wafl is a strongly typed programming language. However, Wafl syntax does not include explicit type specification. All types are implicitly inferred. Usually, the program evaluation does not present the inferred definition types to users, but if any type error is detected during the type checking, all inferred types are reported.
In the following example there are three local functions defined:
g
is recognized to be of type (Float -> Float)
(maps single float argument to float result);h
is recognized to be of type (Integer -> Integer)
(maps single integer argument to integer result).f
is not well defined, because there is an addition operator with first operand of float type and second operand of integer type, but implemented addition operator does not allow the different operand types.f(7.0)
where{
f(x) = g(x) + h(x); // Type checking error!
g(x) = sin(x);
h(x) = x + 5;
}
--- Loading: C:\Users\smalkov\AppData\Local\Temp\wafltmpfile_sgk8.1a.tmp
Typecheck program in context: Program
Typecheck fnc: f in context: f
Cannot normalize undefined expression type: g(x)
- f : '1
Typecheck fnc: g in context: g
OK: g : (Float -> Float)
Typecheck fnc: h in context: h
OK: h : (Int -> Int)
Typecheck fnc: f in context: f
Type checking failed for expression:
--> h(x)
- f : '1
Typecheck fnc: f in context: Program
Type checking failed for expression:
--> h(x)
Subtypes are not recognized:
- f : '2
- Program : '1
Typechecking failed!
--- End loading: C:\Users\smalkov\AppData\Local\Temp\wafltmpfile_sgk8.1a.tmp
The error report contains some redundant data because all branches of the type checking process are reported to allow better understanding of reported problems. If we read the report from the beginning, we can see all the actions performed by type checker:
f
type alone and failed, because f
uses names g
and h
, which types are not inferred, yet;g
type alone and inferred its type: (Float -> Float)
;h
type alone and inferred its type: (Int -> Int)
;f
type alone and failed again, because h(x)
is not a valid usage in the context, which means that x
is already assumed to have a non-float type;f
type, but now in a wider context and fails because of the same reason.Sometimes, functions use mutual recursion, i.e. each of them uses another. In that case all these functions have to be checked together, at the same time. Therefore, the type checking is performed in a wider context if a local type checking failed.
In most cases, it is better to read the report from the bottom. The last block of reported not recognized types is the most relevant. Each line in the report beginning with OK:
suggests that the given definition type is well inferred.
Reading from the bottom, the error report shows that type checking fails for Program
and f
and the problem is in h(x)
subexpression. The further analysis shows that this is h(x)
from f
definition body.
Polymorphism is the property of a function to be well defined for arguments of different data types. Implicit polymorphism is very important feature of programming language Wafl.
The type inference premises that each of local definitions is fully polymorphic, i.e. that any argument of any function can be of any data type. Step by step, the assumed polymorphic types are reduced to types for which expressions that define functions are applicable.
In the following example, two polymorphic functions add
and sub
are defined and used with arguments of different types:
{#
add(1,2),
add(1.3,2.3),
add('a','b'),
sub(5,1),
sub(5.4,1.2),
f(5,4,3),
f(1.2, 3.4, 1.3),
add( asFloat(add(1,2)), 3.1 )
#}
where{
add(x,y) = x + y;
sub(x,y) = x - y;
f(x,y,z) = sub( add(x,y), z);
}
{# 3, 3.6, 'ab', 4, 4.2, 6, 3.3, 6.1 #}
Function add
is determined to have type (Value['1] * Value['1] -> Value['1])
. That means:
Integer
, Float
or String
);In a similar way, the type of the function sub
is inferred to be (Numeric['1] * Numeric['1] -> Numeric['1])
. Thus:
Integer
or Float
);Both Numeric['1]
and Value['1]
are so-called combined data types. They can be further reduced to different primitive data types. Type '1
represents a polymorphic type name. Same polymorphic type names in a type must reduce to same specific types in each specific function application.
In each particular function application, the function type has to be reduced to non-polymorphic data type. However, different applications of a same polymorphic function may reduce to different types in a same program. If used in functions definitions, polymorphic functions can be used in a polymorphic way, with no further data type reduction, or with reduction to more specific but yet another polymorphic type. Their types will be reduced further when the application of defining function induces that.
Function f
in our example is defined as:
f(x,y,z) = sub( add(x,y), z);
and its type is inferred to be (Numeric['1] * Numeric['1] * Numeric['1] -> Numeric['1])
. Thus:
Integer
or Float
);Function f
uses both add
and sub
. In specific application, the type of function add
is reduced from Value['1]
to Numeric['1]
, while the type of function is sub
is not further reduced.
It is important to note that each specific application is reduced independently of the other applications. This is why our function add
is in the same example successfully applied to a pair of integers, a pair of floats and a pair of strings, and functions sub
and f
are successfully applied to integers and floats. Moreover, in the last expression function add
is applied to both integers and floats.
Function asFloat
converts a value from integer to float. Conversion functions will be covered later in the tutorial.
One of the essential properties of functional programming languages is that functions are treated as first order citizens. Functions can be used as arguments and results of other functions and in a much simpler way than in the most of the imperative languages.
The functions which take functions as arguments or return a function as a result are referred as higher order functions. They are essential for functional programming.
In the following simple example, function double
is used as argument of function apply
application.
apply(double,2)
where{
apply(f,x) = f(x);
double(x) = x + x;
}
4
Function double
has type (Value['1] -> Value['1])
. It is similar to function add
from one of the previous examples.
Function apply
has type (('1 -> '1) * '1 -> '1)
. That means:
('1 -> '1)
;'1
,'1
and'1
may be replaced with any type, but at the same time in all previous assertions.In the given example, function apply
applies function double
to 2
. In this specific application type variable '1
is replaced with Int
.
More complex examples follow later.
A partial application is any function application where not all expected arguments are specified.
For example, if we would only specify that the first argument of the function add
(from previous examples), the result would be a unary function that returns its argument increased by one.
One way to do this is to define a new function:
f(x) = add( 1, x );
Another way is to use partial application:
f = add(1,_);
In Wafl, partial application syntax is similar to a regular function call, but underscore ‘_
’ is used instead of unspecified arguments. It is always necessary to use as much arguments as the function expects, but some of the arguments may be specified, and others may remain unspecified.
The result of a partial application is a function. Its arguments correspond to unspecified arguments of the partial application, in the preserved order.
It is possible to have all of the arguments unspecified. However, this is not very useful because the result is the function itself.
In the following example we use function subStr(str,pos,len)
. It extracts a substring of str
that begins at position pos
(zero based) and is len
characters long. String functions will be discussed in details later.
Several partial applications are used in the following example:
{#
strGetFirst5a( "1234567890" ),
strGetFirst5b( "1234567890" ),
charAt( "123456", 1 ),
charAt( "123456", 2 ),
add2( 5 ),
// The following expressions evaluate the same result
subStr( "1234567890", 5, 3 ),
subStr( _, 5, _ )( _, 3 )( "1234567890" )
#}
where{
// The following definitions are the same.
// Both functions return
// the first 5 characters of a string.
strGetFirst5a( s ) = strLeft( s, 5 );
strGetFirst5b = strLeft( _, 5 );
// charAt(n) returns the n-th character of a String.
charAt = subStr( _, _, 1 );
// twice(f,x) applies f to x two times
twice( f, x ) = f( f(x) );
// inc(x) returns x incremented by 1
inc(x) = x + 1;
// add2 returns its argument incremented by 2
add2 = twice( inc, _ );
}
{# '12345', '12345', '2', '3', 7, '678', '678' #}
In many functional programming languages partial application is implemented by specifying only the first argument of a function (or more general, some of the arguments from the beginning of the argument list, in the respected order). Such approach is called curried functions. The result of a curried partial application is a function that maps the remaining arguments to the result.
The partial application as implemented in Wafl is more general than curried functions.
A reader may wonder what’s the use of partial application. Like some other topics in this part of the tutorial, their purpose will become clear once we introduce advanced sequence processing techniques.
Higher order functions are so often used in functional programming that we need as many methods as possible to define or create new functions. Partial application is one such method. Lambda functions are another, as we will see in the next section.
In functional programming languages, term lambda function is related to different concepts of defining unnamed functions at the place of their usage.
When we use higher order functions, we often need to specify as an argument a function that is not needed elsewhere. If we use a function only once in a program, it is easier and often more readable to define it inline than to introduce a regular function definition.
In programming language Wafl we define lambda functions in the following form:
\ <arg
1
>, ..., <arg
n
>: <exp>
A lambda function may appear at any place where we can use a regular function name. Sometimes it is required to parenthesize the lambda function to make the syntax clear.
We will often use simplified term lambda instead of lambda function or unnamed function.
In our first lambda example, we define and use an unnamed function that returns its argument increased by 1:
(\x:x+1)(41)
42
This function is the same as inc
function from previous examples, but now it is defined as a lambda and used at the place of the definition. It is much shorter than using a full definition syntax, like in the following example:
inc(41)
where {
inc(x) = x+1;
}
42
The following examples are more complex than previous examples. It may be more difficult to understand them, for beginners in functional programming. But such examples are essential - please read carefully.
Let us now, for a first time, define a function that has only functions for its arguments and returns a new function as a result. Function combine
takes two functions as arguments. It returns a unary function that applies the combined functions to its argument. So, the following expression should always evaluate true
:
combine(f,g)(x) == f(g(x))
One way to write combine
is to define a helper function c
first, which takes three arguments f
, g
and x
, and to return its partial application:
combine = c(f,g,_)
where {
c(f,g,x) = f(g(x));
}
A simpler way is to use lambda instead of defining the function c
, but the solution is equivalent:
{#
add2(1),
add3(1),
combine( inc, inc )(10),
combine(\x:x+1,\x:x+1)(10)
#}
where{
combine( f, g ) = ( \f,g,x : f(g(x)) )( f, g, _ );
add3 = combine( add2, inc );
add2 = combine( inc, inc );
inc(x) = x + 1;
}
{# 3, 4, 12, 12 #}
In the next section, we will see that combine
can be implemented even more easily.
In the following example, we use combine
to define the repeat
function that generalizes twice
from the previous examples. For a given function f
and a given integer n
, repeat
returns a function that applies f
to its argument n
times. In repeat
, we use the lambdas to write the identity function, and in the main expression we use a lambda equivalent of the inc
function:
{#
repeat( \x:x+1, 0 )( 100 ),
repeat( \x:x+1, 1 )( 100 ),
repeat( \x:x+1, 10 )( 100 )
#}
where{
repeat( f, n ) =
if n=0
then \x:x
else combine( f, repeat(f,n-1) );
combine( f, g ) = ( \f,g,x : f(g(x)) )( f, g, _ );
}
{# 100, 101, 110 #}
Notice that this implementation of repeat
is not very efficient. However, it shows how higher order functions may be combined with lambdas to accomplish very interesting goals.
In functional programming languages, a closure is a function that binds some elements of its outer scope. We may say that partial applications are a kind of closures.
Let us go back to the combine
function definition. We can say that combine
binds given functions f
and g
in a closure:
combine( f, g ) = ( \f,g,x : f(g(x)) )( f, g, _ );
Definitions like this one are very often: first we have to define a function with more arguments and then we have to create a closure by partial application of this function. It seems like we write some things twice.
Luckily, we may use lambda closures (or lambdas with bindings), as an easier way to create closures. To define a lambda closure we need not only the lambda arguments, but also the list of the names from the definition scope which we need to bind. These names are listed in square brackets, after the lambda arguments:
\ <arg
1
>, ..., <arg
n
> [ <name
1
>, ..., <name
k
> ] : <exp>
The older syntax for the same thing is the following one - a hash symbol ‘#
’ is used to separate the list of lambda arguments from the list of bound scope names:
\ <arg
1
>, ..., <arg
n
> # <name
1
>, ..., <name
k
>: <exp>
Now, using a lambda closure, we may define the combine
function even easier than before:
{#
combine(\x:x+1,\x:x+1)(10),
combine(operator+(2,_), operator*(8,_))(5)
#}
where{
combine( f, g ) = \x [f,g]: f(g(x));
}
{# 12, 42 #}
The partial application syntax is not applicable to operators. Also, operators cannot be used as function application arguments. This is why Wafl introduces the functional syntax of the operators. Each operator has a corresponding equivalent function. If there is an operator <op>
, then there is also the corresponding function operator<op>
, with the same number of arguments and the same type.
The following example shows how we can use the operators’ functional syntax to partially applicate the operator ‘+
’:
{#
operator+( _, 'b' )( 'a' ),
operator+( 'a', _ )( 'b' ),
apply( operator+( _, 5 ), 10 ),
apply( operator+( _, 5 ), _ )( 10 ),
apply2( operator+, _, _ )( 5, 10 ),
apply2( _, 5 ,10 )( operator+ )
#}
where{
apply( f, x ) = f( x );
apply2( f, x, y ) = f( x, y );
}
{# 'ab', 'ab', 15, 15, 15, 15 #}
While the ‘-
’ operator is available in both unary and binary form, the corresponding operator-
function has only the binary form.
Because of the functional semantics of subprograms, the source code of programs written in functional programming languages is very often quite unreadable.
For example, consider the following definition of the function g
. For a given string s
it extracts a substring from position 5 of length 8, then replaces each ‘a
’ with ‘B
’, then replaces each ‘b
’ with ‘A
’, and finally converts the string to lowercase letters. It is basically a very simple function, but it’s very unreadable, no matter how carefully we indent its code:
g(s) =
strLowerCase(
strReplaceAll(
strReplaceAll(
subStr(s,5,8),
'a','B'
),
'b','A'
)
);
Wafl tries to resolve the problem by using object-oriented-like dot syntax. In short, any function defined for at least one argument is applicable not only using the usual syntax:
f( x
1
, x
2
, ..., x
n
)
but also using the dot syntax, where the function application is written as the OO method of the first argument:
x
1
.f( x
2
, ..., x
n
)
These two syntaxes have the same semantics. Programmers are free to use both and choose the best one for each specific case.
The usage of dot syntax usually leads to more readable source code. In the following example, we define two equivalent functions. For f
we use the dot syntax and for g
we use the usual syntax of the function application.
{#
f( 'abcdABCDabcdABCD' ),
g( 'abcdABCDabcdABCD' )
#}
where{
f(s) =
s.subStr(5,8)
.strReplaceAll('a','B')
.strReplaceAll('b','A')
.strLowerCase();
g(s) =
strLowerCase(
strReplaceAll(
strReplaceAll(
subStr(s,5,8),
'a','B'
),
'b','A'
)
);
}
{# 'bcdbacda', 'bcdbacda' #}
In first Wafl versions, this technique was called arrow syntax and the operator ‘->
’ was used instead of ‘.
’. In the current version, the operators ‘->
’ and ‘.
’ are equivalent. However, we encourage the use of the dot syntax. The arrow operator may not be supported in some of the future versions.
Having the dot syntax in mind, any function that has one main argument, e.g. evaluates some kind of transformation of that argument, should be defined to have the main argument as the first.
Few sections earlier, there was an example on Fibonacci numbers:
{# fib(1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
where{
fib( n ) =
if n<=2 then 1
else fib(n-1) + fib(n-2);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
This is an interesting example on recursive programming. However, even a simple analysis shows that this approach is very inefficient. The problem is that the function is evaluated again and again on the same numbers. Let’s analyze the case of fib(6)
:
fib(6)
-> fib(5) + fib(4)
| -> fib(3) + fib(2)
| | -> 1
| -> fib(2) + fib(1)
| -> 1 -> 1
-> fib(4) + fib(3)
| -> fib(2) + fib(1)
| -> 1 -> 1
-> fib(3) + fib(2)
| -> 1
-> fib(2) + fib(1)
-> 1 -> 1
As we can see: fib(6)
is evaluated once, fib(5)
once, fib(4)
twice, fib(3)
three times, fib(2)
5 times and fib(1)
3 times. We can see that this sequence corresponds to the Fibonacci numbers sequence, except the last step: 1, 1, 2, 3, 5, 3. Because just two lowest level calls return the exact value, it is simple to conclude that for any n
, the total number of fib(2)
and fib(1)
evaluations is equal to fib(n)
. And that grows fast, approximately as ϕn, where ϕ is golden ratio (1.61803…).
So, our function fib
becomes very slow for large numbers. How can we improve this? From the formula it is obvious that during the computation we should keep the last two numbers. It is relatively simple to do in imperative programming. For example, in C/C++ we could do something like:
unsigned fib( unsigned n ) {
unsigned prev2=1, prev1=1, result=1;
// for n=0 and n=1 result is already ok
// result == 1 == fib(0) == fib(1)
// prev1 == 1 == fib(1)
// prev2 == 1 == fib(0)
for( int i=2; i<=n; i++ ) {
= prev1 + prev2; // result = fib(i)
result = prev1; // prev2 = fib(i-2)
prev2 = result; // prev1 = fib(i-1)
prev1 }
return result;
}
But in Wafl we do not have variables, and this is not as simple as in C++. There is no implicit computation state, and the only way to carry on the computation state is to do it explicitly. One solution may be a strait transition from the C++ example:
{# fib(1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
where {
fib(n) = if n <= 2 then 1
else fib_s(1,1,2,n);
fib_s(p1,p2,i,n) = if i=n then p2
else fib_s(p2,p1+p2,i+1,n);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
If we check this carefully, we may remove the constant factor n
from fib_s
and get a better solution:
{# fib(1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
where {
fib(n) = if n <= 2 then 1
else fib_s(1,1,n-2);
fib_s(p1,p2,i) = if i=0 then p2
else fib_s(p2,p1+p2,i-1);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
However, it is not hard to see that this approach to the problem is not so nice.
Another approach to the exponentially growing recursion is to remember the previous results. This is where cached functions may help.
Cached functions is a Wafl technique for remembering the previous results of a function. If we have a fast growing recursion tree, like in fib
case, or we have a function that is very often computed for the same arguments, then we can suggest that such function should have its results cached.
The syntax is simple:
<name>(<args>) = cached <exp>;
If we try to use it with fib
, then it becomes:
{# fib(1), fib(2), fib(3), fib(4), fib(5), fib(6), fib(7) #}
where{
fib( n ) = cached
if n<=2 then 1
else fib(n-1) + fib(n-2);
}
{# 1, 1, 2, 3, 5, 8, 13 #}
If we test it on larger numbers we can see that it is much faster than the original solution, but that the previous solution, with explicit state transfer, is still more efficient. It is not a surprise - the problem is that the cached solution must store computed results and look for previously stored result. However, if we use fib
over and over again on the same range of numbers, the cached solution becomes much more efficient that any other.
And it is simple to use, which is a significant benefit.
The efficiency of the caching algorithm largely depends on the complexity of the arguments and the results. If they are complex (require a large storage or are expensive to compare), then its benefits are questionable.
Wafl internally tries to reduce the cached result storage size to prevent enormous memory usage, by removing some cached values. However, the memory usage can become significant if many functions are cached or if the range of the argument is very large, which is often the case for the functions with many arguments.