← Back to Blog

Algorithm Best Practices June 2018: Recursive Tree Search Functions

python
sourceCode

By Kevin Hou

3 minute read

Introduction to Recursion

Recursive functions can be a massive headache for all developers involved if the code is not properly documented, formatted, and abstracted. For those that don't know, a recursive function is a function that calls itself. The University of Wisconsin Madison defines it as follows:

A recursive function (DEF) is a function which either calls itself or is in a potential cycle of function calls. As the definition specifies, there are two types of recursive functions. Consider a function which calls itself: we call this type of recursion immediate recursion.

University of Wisconsin - Madison

Recursive functions can do a variety of neat things and replicate a lot of natural phenomenons such as fractals. Fractal Sierpinski Triangle

Challenges

Code that can run itself can lead to a whole host of issues. The most common is a "stack overflow". This is when a recursive function calls itself over and over, never terminating, and pushes its function calls onto the "call stack" until the computer can't handle it anymore and shuts down - hence a "stack overflow." This is one of the many easy traps of recursive programming.

Furthermore, recursive code is difficult to read, especially if you're not the one who wrote it. Without proper code etiquette, it can get incredibly confusing.

Solution and Modularization

The solution to these problems is to a) write code that works b) document your code. In this section, I'll show you a technique I learned from my coworker, Brian Lonsdorf, that clearly abstracts the different pieces of recursion into modular, isolated functions. This not only helps keep the code organized, but combined with the correct commenting, can result in clear and concise code.

The code below was written in Python 3 and demonstrates a simple depth-first search using an accumulator object that is pass-by-reference. The coolest parts about this code in my opinion was the implementation of a visitor function and an accumulator. Using these methods, handling data and performing the main logic is clearly compartmentalized. This ensures the main recursive logic isn't affected by your logic and prevents bugs (like stack overflows) from occurring.

1# Python 3 2# Kevin Hou (khou22.com) 3 4# The condition function that determines if a node has child nodes 5# This ensures the code terminates eventually 6def hasChildren(node): 7 return len(node['children']) > 0 8 9# Visitor function 10# This function is called on every node the search function visits 11# This is where your main logic should go 12def visitor(element, accumulator): 13 14 # In this situation, we're simply checking if it's a leaf and storing it in our accumulator if it is 15 if not hasChildren(element): # Check if it's a leaf node 16 accumulator.append(element) # Add to accumulator if leaf 17 18# The recursive function that calls itself 19# Searches a node, applies a visitor function, and searches each child node 20def searchChildren(node, visitor, accumulator): 21 visitor(node, accumulator) # Visitor function 22 23 # Recursively search if possible 24 if hasChildren(node): 25 children = node['children'] # Get child nodes 26 list(map(lambda child: searchChildren(child, visitor, accumulator), children)) # Search each child node 27 28# Get all SLDS components 29def getLeaves(tree): 30 accumulator = [] # Pass-by-reference accumulator object 31 32 # Initial search call with visitor function 33 searchChildren(tree, visitor, accumulator) 34