Algorithm Best Practices June 2018: Recursive Tree Search Functions

Introduction to Recursion

Recursive functions can be a massive headache for all developers involved if the code is not properly documented, formatted, and abstracted. For those that don't know, a recursive function is a function that calls itself. The University of Wisconsin Madison defines it as follows:

A recursive function (DEF) is a function which either calls itself or is in a potential cycle of function calls. As the definition specifies, there are two types of recursive functions. Consider a function which calls itself: we call this type of recursion immediate recursion.

— University of Wisconsin - Madison

Recursive functions can do a variety of neat things and replicate a lot of natural phenomenons such as fractals. Fractal Sierpinski Triangle

Challenges

Code that can run itself can lead to a whole host of issues. The most common is a "stack overflow". This is when a recursive function calls itself over and over, never terminating, and pushes its function calls onto the "call stack" until the computer can't handle it anymore and shuts down - hence a "stack overflow." This is one of the many easy traps of recursive programming.

Furthermore, recursive code is difficult to read, especially if you're not the one who wrote it. Without proper code etiquette, it can get incredibly confusing.

Solution and Modularization

The solution to these problems is to a) write code that works b) document your code. In this section, I'll show you a technique I learned from my coworker, Brian Lonsdorf, that clearly abstracts the different pieces of recursion into modular, isolated functions. This not only helps keep the code organized, but combined with the correct commenting, can result in clear and concise code.

The code below was written in Python 3 and demonstrates a simple depth-first search using an accumulator object that is pass-by-reference. The coolest parts about this code in my opinion was the implementation of a visitor function and an accumulator. Using these methods, handling data and performing the main logic is clearly compartmentalized. This ensures the main recursive logic isn't affected by your logic and prevents bugs (like stack overflows) from occurring.

1# Python 3
2# Kevin Hou (khou22.com)
3
4# The condition function that determines if a node has child nodes
5# This ensures the code terminates eventually
6def hasChildren(node):
7    return len(node['children']) > 0
8
9# Visitor function
10# This function is called on every node the search function visits
11# This is where your main logic should go
12def visitor(element, accumulator):
13
14    # In this situation, we're simply checking if it's a leaf and storing it in our accumulator if it is
15    if not hasChildren(element): # Check if it's a leaf node
16        accumulator.append(element) # Add to accumulator if leaf
17
18# The recursive function that calls itself
19# Searches a node, applies a visitor function, and searches each child node
20def searchChildren(node, visitor, accumulator):
21    visitor(node, accumulator) # Visitor function
22
23    # Recursively search if possible
24    if hasChildren(node):
25        children = node['children'] # Get child nodes
26        list(map(lambda child: searchChildren(child, visitor, accumulator), children)) # Search each child node
27            
28# Get all SLDS components
29def getLeaves(tree):
30    accumulator = [] # Pass-by-reference accumulator object
31    
32    # Initial search call with visitor function
33    searchChildren(tree, visitor, accumulator)
34