Recursion

Mathematical definitions

Recursion is a popular technique in Computer Science for solving problems with a divide-and-conquer approach. Recursion breaks a problem into smaller parts, solves each part recursively, and then combines all the sub-solutions into a solution to the original problem.

The main components of this technique are:

Divide the problem into a number of sub-problems that are smaller instances of the same problem.

Conquer the sub-problems by solving them recursively.

Combine the solutions to the sub-problems into the solution of the original problem.

In mathematics, a recursive process is defined when the terms of a function have an interdependent relationship, each value depending on previously computed values of the same function. In mathematical notation, a recursive function for a given geometric sequence can be defined as:

a(n) = r · a(n − 1)

where r is just the common ratio between the consecutive terms of the sequence. The common ratio is not specific to recursive functions; it is a property of geometric sequences.

Let's suppose we want to find the value of the function when n is 4. We can't compute it directly! We first need to find the value of the function when n is 3, then when n is 2, and so on. The function is defined over the set of positive integers, n ∈ {1, 2, 3, …}.

To find the value of the function when n is 4 it has to be calculated as:

a(4) = 2 · a(3) = 2 · (2 · a(2)) = 2 · (2 · (2 · a(1))) = 2 · 2 · 2 · 1 = 8

We can verify that indeed the fourth term of the sequence has a value of 8. This recursive function repeatedly applies a given relation or routine operation to known values of the same function. It's also important to point out that the minimal value in the range of all possible values that the function can take is specified in the recursive definition. This is commonly known as the base case in mathematics.

a(1) = 1

Base cases are also a very important part of mathematical logic for proving the validity of a mathematical expression. This is typically done by mathematical induction.

https://en.wikipedia.org/wiki/Mathematical_induction

Factorial

A classical example of a recursive function is the factorial of a number. In mathematics, the factorial of a positive integer n is the product of all positive integers less than or equal to n. The factorial operation is widely used in combinatorics, for example to calculate all the possible permutations of a given set of numbers.

Consider, for example, the following set of three positive integers:

S = {1, 2, 3}

All possible permutations of S are:

(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)

The recursive function for any positive integer n can be defined as:

n! = n · (n − 1)!, with the base case 0! = 1

Such that the total number of permutations that set S can have is 3! = 6, as illustrated above:

3! = 3 · 2! = 3 · 2 · 1! = 3 · 2 · 1 · 0! = 6

The Stack

Implementing a recursive algorithm simply means having a method call itself. Any method within a program, along with its parameters, its return address, and any local variables, gets stored in a particular region of memory called the stack. This brings up a very important topic, stack memory, which is essential to understand in order to successfully grasp the concept of recursion.

The Stack stores temporary information about the active subroutines of a program: local variables, value types, and method parameters. The main purpose of the stack is to keep track of the point to which each active subroutine should return control. The stack is basically a stack data structure, which behaves as a Last In First Out (LIFO) buffer. One of the essential components of stack memory operations is a register called the stack pointer.

The stack pointer’s task is to keep track of the current top of the stack, and the address it references is adjusted every time the stack changes. There are two main actions that cause a stack to change: pushing and popping data. When data is pushed, a new stack frame is added at the top of the stack, and when memory is freed, data gets popped off the top of the stack. Since a stack has LIFO behavior as outlined above, the stack pointer always points to the last filled memory address of the stack. Depending on the processor architecture, a stack can store or remove data using incremental or decremental address indexing. For the following diagrams I generalized it as an incremental addressing system. The best analogy to think about a stack is a stack of plates, books, or any other type of element that can be “stacked”: plates can be stacked one on top of another, and you can only add one plate to the top of the existing stack and, similarly, only remove the plate at the top.




Pushing data to the Stack – sequence

The following diagrams illustrate what happens when a piece of code executes and only value-type data is saved to the stack.

The general sequence of events that happen when a method is invoked are the following:

A. Space is allocated on the stack, in a single stack frame, for the information needed to execute the method. This includes the calling address, which is basically a pointer, so that when the thread finishes running the method it knows where to go back in the code to continue execution.

B. Method parameters are copied, but only if they are value types.

C. Control is then passed to the JIT (Just In Time) compiler and the current thread starts executing the code. This step depends on the programming language; JIT compilation is typical of the .NET environment.

D. As the thread executes the method, any local variables are saved to the stack.

E. If the method has a return value, it returns it. After the method finishes executing, all of its data is removed from the stack, and the stack pointer is decremented to point to the new memory address of the top of the stack.


Pushing data to the Stack….

  1. Main’s frame pointer gets pushed to the stack. When the thread finishes executing the method it knows where to go back to in order to continue execution.
  2. Increment the stack pointer, to reference its new location

  1. Main’s local variable a gets pushed to the stack.
  2. Increment the stack pointer, to reference its new location

  1. Main’s local variable s gets pushed to the stack.
  2. Increment the stack pointer, to reference its new location

  1. Foo’s frame pointer gets pushed to the stack. When the thread finishes executing the method it knows where to go back to in order to continue execution.
  2. Increment the stack pointer, to reference its new location

  1. Control is passed to the JIT Compiler.
  2. The value of the local variable a is copied bit by bit to Foo’s parameter and pushed to the stack
  3. Increment the stack pointer, to reference its new location


  1. Foo’s local variable loc1 is pushed to the stack .
  2. Increment the stack pointer, to reference its new location


  1. Foo’s local variable loc2 is pushed to the stack .
  2. Increment the stack pointer, to reference its new location


  1. Push the return value to the stack
  2. Increment the stack pointer, to reference its new location

Popping data off the Stack…

  1. Since this method returns a value, it will first return the value and then all the method’s data (local variables and parameters) will be popped off the stack.
  2. Decrement the stack pointer to reference its new location
  3. Thread continues execution from Foo’s current frame pointer


  1. Foo’s memory address and Main’s local variables get popped off the stack
  2. Decrement the stack pointer to reference its new location
  3. Thread returns control to Main’s entry point


This is roughly what happens, in terms of memory allocation, when methods and value types are executed by a thread. I will address this in more detail in a future post, including reference types as well.

Stack memory allocation, as stated previously, is very important to understand as the basis of how a recursive function works, simply because the stack saves the state of each function call. I will also dedicate a whole post to Stacks and Queues, which in general are very powerful data structures.

Going back to recursion

Lets consider our factorial function once again:

In code this is usually written as:

int64_t factorial(int n)
{
	//base case
	if (n == 0 || n == 1) return 1;

	return n * factorial(n - 1);
}


int main()
{	
	int64_t r = factorial(4);
}

Execution sequence.

Stack memory allocation

Every time Factorial calls itself, it decrements the value of n, which in this case starts at 4, until it reaches the base condition. This prevents the code from recursing infinitely. Every recursive function should have a base condition, just as the mathematical definition does.

  1. Space for r is allocated on the stack; its value is left uninitialized for now.
  2. Increment the stack pointer, to reference its new location

  1. Factorial’s memory address gets pushed to the stack
  2. The value of n is copied bit by bit to Factorial’s parameter and pushed to the stack
  3. Increment the stack pointer, to reference its new location

  1. The value of n is copied bit by bit to Factorial’s parameter and pushed to the stack
  2. Increment the stack pointer, to reference its new location

  1. The value of n is copied to the stack
  2. n is decreased by 1
  3. Increment the stack pointer, to reference its new location

  1. The value of n is copied to the stack
  2. n is decreased by 1
  3. Increment the stack pointer, to reference its new location

  1. The value of n is copied to the stack
  2. n is decreased by 1
  3. Increment the stack pointer, to reference its new location
  4. The base condition is met! The method will start returning the value of n for each stack frame and calculating n!

Stack memory de-allocation

  1. The value of n is returned
  2. n = 1 is popped off the stack
  3. Decrement the stack pointer, to reference its new location
  4. calculate n!

  1. The value of n is returned
  2. n = 2 is popped off the stack
  3. Decrement the stack pointer, to reference its new location
  4. calculate n!

  1. The value of n is returned
  2. n = 3 is popped off the stack
  3. Decrement the stack pointer, to reference its new location
  4. calculate n!

  1. The final value of n is returned
  2. n = 4 and Factorial’s frame pointer are popped off the stack
  3. Decrement the stack pointer, to reference its new location
  4. r now holds the value of 24

This is the basic sequence of what happens to the data when a method is being recursively called by the current execution thread of a program.

Iterative definition of a recursive function.

Mathematical representation.

Any recursive function in mathematics also has an iterative version. It's possible to convert an iterative function into a recursive one and vice versa. For example, let's continue with the factorial recursive function that we have been working with.

Its iterative definition is:

n! = ∏ k, for k = 1 to n

Which can be expanded as:

n! = 1 · 2 · 3 · … · n

Computational representation.

In software, as in mathematics, any recursive function can be converted to an iterative one. This can be done either by using tail recursion or by implementing a stack data structure to simulate how the program's call stack keeps track of a recursive function. In this example I have implemented a stack data structure to keep track of the data at each iteration.

#include <stack>

using namespace std;

int64_t FactorialIterative(int n)
{
	if (n <= 1) return 1;	// 0! = 1! = 1

	stack<int64_t> s;
	s.push(n);
	int p = n;
	while (p > 1)
	{
		p--;
		int64_t r = s.top() * p;
		s.pop();
		s.push(r);
	}

	return s.top();
}

Although in this particular case the code could be simplified to:

int64_t factorialIterative(int n)
{
	
	int64_t p = 1;
	while (n>1)
	{				
		p *= n;
		n--;
	}

	return p;
}

Heap's Algorithm

Heap's algorithm calculates all the possible permutations of n objects. The original algorithm was designed to be recursive, but it can also be implemented iteratively, as shown before. Below is an example that prints all the different permutations of a given set.

You can also notice that this fits nicely with the previous example, because counting permutations is also a factorial function. The total number of permutations K of a set S with n elements is:

K = n!

#include <algorithm>
#include <cstdio>
#include <vector>

using namespace std;

void PrintSet(vector<int>& data)
{
	printf("[");
	for (size_t i = 0; i < data.size(); i++)
	{
		if (i > 0) printf(",");
		printf("%d", data[i]);
	}
	printf("]\n");
}

void Permutations(vector<int>& data, int n)
{
	// base case: a single element has only one permutation
	if (n == 1)
	{
		PrintSet(data);
		return;
	}

	for (int i = 0; i < n; i++)
	{
		Permutations(data, n - 1);
		if (n % 2 == 0)
			swap(data[i], data[n - 1]);
		else
			swap(data[0], data[n - 1]);
	}
}

int main()
{
	vector<int> data{ 6,2,3,9 };
	int n = (int)data.size();
	Permutations(data, n);
}

The output would be:

[6,2,3,9]
[2,6,3,9]
[3,6,2,9]
[6,3,2,9]
[2,3,6,9]
[3,2,6,9]
[9,2,3,6]
[2,9,3,6]
[3,9,2,6]
[9,3,2,6]
[2,3,9,6]
[3,2,9,6]
[9,6,3,2]
[6,9,3,2]
[3,9,6,2]
[9,3,6,2]
[6,3,9,2]
[3,6,9,2]
[9,6,2,3]
[6,9,2,3]
[2,9,6,3]
[9,2,6,3]
[6,2,9,3]
[2,6,9,3]

Recursion and performance

There are cases where recursion hurts performance, because stack memory is consumed for every recursive function call. By nature, a recursive algorithm will consume much more stack space than an iterative algorithm, and stack memory is usually much more limited than heap memory. In this post I will not get into much detail on why one would even choose a recursive implementation over an iterative one; in short, a recursive implementation can sometimes require much less code. It all depends on the type of problem that must be solved. There are also optimization techniques that can be applied to recursive functions, such as tail recursion and memoization. For now, I will focus on memoization.

Memoization

Memoization is a popular technique to optimize algorithms by making sure that no calculation is repeated for the same input. This is done by storing the result of a computation and returning the stored result if the same input occurs again in the future. The key to implementing efficient memoization is choosing the right data structure. The most common data structure used to cache results is a hash table, since hash tables have very efficient key-value lookups that take O(1) constant time on average to add and retrieve values.

Fibonacci

Fibonacci is a very good example of how a recursive implementation can have huge performance costs as the input grows in size. The naive recursive Fibonacci has, roughly speaking, an exponential time complexity, O(2^n).

The mathematical function that defines the Fibonacci sequence is simply:

F(n) = F(n − 1) + F(n − 2), with F(0) = 0 and F(1) = 1

Recursive tree

Here we can see how the 3rd and 2nd terms are calculated more than once! For example, expanding F(5):

F(5) = F(4) + F(3)
F(4) = F(3) + F(2)
F(3) = F(2) + F(1)

F(3) is computed twice and F(2) three times.

This can be a big problem with a function that has an exponential time complexity. Computation times can get quickly out of hand as the size of the input increases.

Without memoization

int64_t fibonacciRecursive(int n)
{
	if (n == 0 || n == 1) return n;
	return fibonacciRecursive(n - 1) + fibonacciRecursive(n - 2);
}

With memoization

#include <unordered_map>

using namespace std;

int64_t FibonacciMemo(int n, unordered_map<int, int64_t>* p_hash)
{
	int64_t r;

	if (n == 0 || n == 1) return n;

	// if the value has already been calculated, just return it
	if (p_hash->find(n) != p_hash->end())
		r = p_hash->at(n);
	else
	{
		r = FibonacciMemo(n - 1, p_hash) + FibonacciMemo(n - 2, p_hash);
		p_hash->insert({ n, r });
	}

	return r;
}

Computation times

Below is a comparison between the recursion without optimization and the recursion with memoization. We can see that the difference is huge! I ran this on an Intel(R) Core(TM) i7-8750 CPU in release mode in Visual Studio.

----------------------FIBONACCI PERFORMANCE COMPARISON----------------------

Calculating 55th term

Recursive Fibonacci
Elapsed time: 553579853us (553580ms) (553.58s) (9.22633333min)
Result: 139583862445

Recursive Fibonacci with memoization
Elapsed time: 25us (0.025ms) (2.5e-05s) (4.166667e-7min)
Result: 139583862445

Conclusion

I am sure everyone can draw their own conclusions regarding the use of recursion to solve a computational problem. The more experienced you are as a programmer, the quicker you can tell which problems are easier to solve with recursion, and then apply the corresponding optimizations if needed. Having said that, the most difficult part for me is knowing when to apply recursion. In my programming experience until now, I had never really delved into some of the technicalities of recursion, and this post was an attempt to study it in more depth. I had not even thought much about using recursion to solve the computational problems I had encountered before. I suspect this is due to my informal Computer Science education so far, because recursion is a common topic in a Computer Science degree. I am currently studying from several books on the analysis and design of algorithms to hopefully fill the gaps I have as a self-taught programmer.
