Interactive Debugging of Pattern‑Matching Conditionals

Introduction

Many debugging concepts in common use today date back to the mainframe era, and today's debugging tools are not fundamentally different from their predecessors. They all record information during program execution and present it to users either while the program is running or after it finishes. However, the amount of data that can be collected during execution is enormous, and what users actually need for debugging is often just a tiny part of it. As storage has grown and interaction has improved, designing better debugger front-ends to display collected logs, traces, and symbol tables has become a widely discussed topic.

If all programs in the world were executed sequentially, there probably wouldn't be so many debugging tools. What makes programs complex is their control structures. When people debug, what they care about most is the program's execution flow. They often have questions like: “Why did this condition fail?”, “Why did this function reach this point? It should have ended earlier.”, or “Which branch was taken to reach this point?” Insightful debugging tools should help people answer these high-level questions, not just provide the values of variables at runtime.

Fortunately, there are already some tools that help people visualize program execution. For example, CrossCode can visualize the runtime execution of JavaScript functions from both control-flow and data-flow perspectives and supports navigation between multiple levels of abstraction (Hayatpur et al., 2023). Omnicode continuously executes the program and displays the intermediate variable values using scatter plots, in order to give users insight into the data distribution (Kang and Guo, 2017).

These works are all insightful, but they are mostly based on imperative programming languages. As a result, their visualizations of control flow are limited to if-then-else statements and loops. Pattern matching, a very commonly used control-flow structure in functional programming languages, is rarely addressed. Although there are many interactive debuggers in the world of functional programming, they lack the ability to provide users with high-level insight like the tools mentioned earlier. To our knowledge, no interactive debugger interface has been designed specifically for pattern matching, so such tools are still waiting to be explored.

In this essay, we present our recent work on this problem. We developed an interactive debugger to address it and designed three interface features for it. Our system is based on an expressive pattern-matching syntax: the Ultimate Conditional Syntax (UCS). The UCS is a unified syntax that introduces pattern matching, using the is operator, into nested multi-way if expressions, with support for factoring out common prefixes in conditions and interleaving computations within branches (Cheng and Parreaux, 2024).

The UCS allows people to express multi-level conditional logic very concisely using a single syntax. In other programming languages, such logic often requires nested pattern matching and if-then-else expressions. However, because the UCS is so flexible and extensible, people sometimes write very long UCS expressions, which are difficult to debug. Let’s first look at an example of the UCS taken from real-world code.

fun expr(prec: Int): Expr = if peek is
  Some of
    Token.Numeric(literal) and literal.toIntOption is Some(value) then
      consume
      Expr.Num(value) exprCont(prec)
    Token.Identifier(name) then
      consume
      Expr.Var(name) exprCont(prec)
    Token.Symbol("(") then
      consume
      expr(0) require(Token.Symbol(")", true)) exprCont(prec)
    token then Expr.error of "Unexpected token " + token Token.summary()
  None then Expr.error of "Unexpected end of input"

fun exprCont(acc: Expr, prec: Int): Expr =
  if peek is Some(Token.Symbol(op))
    and op !== ")"
    and Expr.opPrec(op) is [leftPrec, rightPrec]
    and leftPrec > prec then
      consume
      let right = expr of rightPrec
      Expr.Infix(op, acc, right) exprCont(prec)
  else acc

The example is taken from an arithmetic expression parser written in MLscript. It includes two functions, expr and exprCont. The expr function parses an entire expression in which all operators have precedence greater than prec. It calls exprCont to parse the continuation of the expression after the left-hand side of the operator.

Note the conditions connected by and in exprCont (line 16 to 19). These four conditions share the same else (line 23), so when the function returns acc, we don't know which condition failed. Debugging this kind of expression imposes a noticeable cognitive load. With a traditional debugger, users need to set breakpoints on each of the four condition expressions, or inspect the values of the variables involved in the conditions in the symbol table, to find out the reason. Users who prefer debugging with the print function may also need to do some small refactoring, since it’s not possible to insert print expressions between the conditions.

We argue that our debugger can accomplish this task and can flexibly handle more complex debugging needs through the following three features.

Culprit pattern highlighting helps users quickly and effectively locate the problem by using color channels to indicate whether each sub-pattern in the pattern matching matches the input value.
Visualizing conditional flow shows the execution path of the UCS expression when the debugger pauses the program inside it. This helps users understand which branches have been tried before and why the current branch was taken.
Automatic recursion tracing can visualize the entire execution path of recursive functions based on our understanding of the branch structure of UCS expressions. This helps users see which branches triggered recursion and reduces the mental effort needed to understand recursive functions.

Methodology

In this section, we introduce three main features of the interactive debugger we developed for MLscript. All of them make debugging the UCS more straightforward and intuitive.

Culprit pattern highlighting

The culprit pattern refers to the first pattern (or condition) that fails to match in a series of conjunctions. When the user uses the debugger and stops at a certain point in a UCS expression, the debugger shows the culprit patterns that caused the program to reach this branch. It's important to note that simply highlighting the failed condition is not enough. We also need to highlight the specific sub-pattern that is the first to fail.

Let's take the exprCont function from Section 1 as an example. We use red to highlight sub-patterns that fail to match or conditions that evaluate to false, and green to highlight sub-patterns that match successfully or conditions that evaluate to true. Gray is used to highlight sub-patterns that are not used. Figure 2.1 shows several inputs and how they appear in the debugger when execution pauses at the branch else acc.

Figure 2.1: The simulated debugger's view for four example inputs.

Example 1: The value of peek is None. Therefore, it does not match the first pattern. We highlight Some in red to indicate that the culprit is the constructor pattern rather than the inner pattern.

Example 2: The value of peek is Some(Token.Symbol("**")). It matches the first pattern and passes the second condition. But the implementation of Expr.opPrec does not recognize "**" as an operator, and it returns null. The culprit pattern is [leftPrec, rightPrec]. Note that we highlight the tuple and mark its elements as unused. This clearly shows the culprit pattern.

Example 3: The value of peek is Some(Token.Symbol("+")). It matches the first pattern and passes the second condition. The value returned by Expr.opPrec is [1, 2], which matches the third pattern. But the fourth condition does not pass. Therefore, the culprit is the fourth condition.

Example 4: The value of peek is Some(Token.Symbol("+")). It matches the first pattern and passes the second condition. The value returned by Expr.opPrec is [1, 2]. The fourth condition also passes, so there is no culprit and execution enters the then branch instead of else acc.
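The culprit search for the four conditions of exprCont can be sketched as follows. This is only an illustrative model in Python, not the debugger's implementation: the Symbol class, the OP_PREC table (standing in for Expr.opPrec), and the find_culprit function are hypothetical names, and an optional token is modelled as either None or a Symbol.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    """Hypothetical stand-in for Token.Symbol(op)."""
    op: str


# Hypothetical precedence table standing in for Expr.opPrec.
# "**" is deliberately missing, as in the second example.
OP_PREC = {"+": (1, 2), "*": (3, 4)}


def find_culprit(peek, prec):
    """Return (condition number, failing sub-pattern) for the first
    condition that fails, or None when all four conditions pass."""
    # Condition 1: peek is Some(Token.Symbol(op))
    if peek is None:
        return (1, "Some")  # the constructor pattern itself fails
    if not isinstance(peek, Symbol):
        return (1, "Token.Symbol")  # the inner pattern fails
    op = peek.op
    # Condition 2: op !== ")"
    if op == ")":
        return (2, 'op !== ")"')
    # Condition 3: Expr.opPrec(op) is [leftPrec, rightPrec]
    result = OP_PREC.get(op)
    if result is None:
        return (3, "[leftPrec, rightPrec]")
    left_prec, _right_prec = result
    # Condition 4: leftPrec > prec
    if not left_prec > prec:
        return (4, "leftPrec > prec")
    return None
```

Conditions are checked in source order and the search stops at the first failure, which is why only one sub-pattern is reported as the culprit even when later conditions would also fail.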

Visualizing conditional flow

A single UCS expression can be long and complex. We found that when people use UCS expressions, they are usually clear about their logic. However, when debugging, they sometimes overlook the flow between different branches.

Consider the following concise balance function for balancing an AVL tree. This function checks the height difference between the subtrees of the current node and calls the corresponding tree rotation function. This part of the implementation is easy to get wrong when learning about AVL trees. Beginners might write one of the conditions incorrectly, which can cause the tree to not be rotated properly.

We can see that in this implementation, the last line else t is used as the fallback for many failed conditions. In fact, almost every failed pattern match or condition before it ends up in this branch. When people debug this program, they can only see that the program entered the final branch, but they cannot tell what happened in the earlier branches. So it's very important to show the execution path of the entire UCS expression.

Figure 2.2: Simulated debugger view, with numbers marking the branches that have been tried.

Figure 2.2 shows the visualization of the UCS expression of the balance function. We use numbers to mark the branches that have been tried before entering the current branch at the breakpoint.
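The idea of numbering the branches that were tried can be sketched in Python as follows. This is a minimal model under stated assumptions, not the essay's balance function: first_match and balance_decision are hypothetical helpers, the rotation names are placeholders, and the guards follow the textbook AVL rules, where a balance factor is the left height minus the right height.

```python
from typing import Callable, List, Tuple

# A branch is (label shown to the user, guard, result when the guard holds).
Branch = Tuple[str, Callable[[], bool], str]


def first_match(branches: List[Branch], fallback: str):
    """Try branches in order. Return the result together with the numbered
    list of branches that were attempted before the result was chosen."""
    tried = []
    for number, (label, guard, result) in enumerate(branches, start=1):
        tried.append((number, label))
        if guard():
            return result, tried
    return fallback, tried


def balance_decision(left_height, right_height, left_bf, right_bf):
    """Hypothetical rebalancing decision for one AVL node.
    left_bf and right_bf are the balance factors of the two children."""
    diff = left_height - right_height
    branches = [
        ("left-heavy, left child not right-heavy",
         lambda: diff > 1 and left_bf >= 0, "rotateRight"),
        ("left-heavy, left child right-heavy",
         lambda: diff > 1 and left_bf < 0, "rotateLeftRight"),
        ("right-heavy, right child not left-heavy",
         lambda: diff < -1 and right_bf <= 0, "rotateLeft"),
        ("right-heavy, right child left-heavy",
         lambda: diff < -1 and right_bf > 0, "rotateRightLeft"),
    ]
    return first_match(branches, fallback="unchanged")
```

When the fallback is reached, the returned list contains every branch in the order it was attempted, which is the information the numbered markers convey in the debugger view.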

Automatic recursion tracing

Because of the flexibility and simplicity of the UCS, many recursive functions in functional programming can be written as a single UCS expression. For example, the following is an implementation of capture-avoiding substitution in lambda calculus.

fun subst(t: Term, x: Str, v: Term): Term = if t is
  Var(y) and x == y then v
  Abs(Var(y) as p, t') and x != y and
    hasFree(v, y) then
      let y' = freshVar(t')
      let t'' = subst(t', y, Var(y'))
      Abs(Var(y'), subst(t'', x, v))
    else Abs(p, subst(t', x, v))
  App(lhs, rhs) then App(subst(lhs, x, v), subst(rhs, x, v))
  else t

The body of this function consists of a single UCS expression. Here we briefly explain what this function does.

Its first branch (line 2) checks whether t is a Var whose name y equals x. If so, it directly returns v.
The second branch (line 3) is a bit more complex. It first checks whether t is an Abs, and binds the Var representing the parameter to p. Then it checks whether the parameter name y differs from the name x to be replaced. If they are the same, x is shadowed and matching continues with the third branch. If they are different, it proceeds with the inner branches from line 4 to line 8.
1. The first inner branch (line 4) checks whether the term v has y as a free variable. If it does, it generates a new variable y' and substitutes y with y' in t'. Then it performs the substitution recursively on the renamed body.
2. Otherwise, the second inner branch (line 8) performs the substitution recursively on t'.
The third branch (line 9) checks whether t is an App. If so, it performs the substitution recursively on both sides.
The final branch (line 10) serves as a fallback for all the cases not covered by the previous branches.
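To make the fall-through behaviour of these branches concrete, here is a rough Python port of subst. It is a sketch under assumptions rather than the MLscript semantics themselves: the source does not show hasFree or freshVar, so has_free and fresh_var below are hypothetical implementations, and fresh_var assumes that user-written variable names never start with "_v".

```python
from dataclasses import dataclass
import itertools


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Abs:
    param: Var
    body: object


@dataclass(frozen=True)
class App:
    lhs: object
    rhs: object


def free_vars(t):
    """Set of free variable names in a term."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Abs):
        return free_vars(t.body) - {t.param.name}
    if isinstance(t, App):
        return free_vars(t.lhs) | free_vars(t.rhs)
    return set()


def has_free(t, name):
    """Hypothetical version of hasFree."""
    return name in free_vars(t)


_counter = itertools.count()


def fresh_var(t):
    """Hypothetical version of freshVar. The argument only mirrors the
    original signature; freshness relies on the reserved "_v" prefix."""
    return f"_v{next(_counter)}"


def subst(t, x, v):
    # Branch 1: Var(y) and x == y
    if isinstance(t, Var) and x == t.name:
        return v
    # Branch 2: Abs(Var(y) as p, t') and x != y
    if isinstance(t, Abs) and x != t.param.name:
        y, body = t.param.name, t.body
        if has_free(v, y):
            # Inner branch 1: rename the parameter to avoid capture
            fresh = fresh_var(body)
            renamed = subst(body, y, Var(fresh))
            return Abs(Var(fresh), subst(renamed, x, v))
        # Inner branch 2: no capture is possible
        return Abs(t.param, subst(body, x, v))
    # Branch 3: App(lhs, rhs)
    if isinstance(t, App):
        return App(subst(t.lhs, x, v), subst(t.rhs, x, v))
    # Fallback: also reached by a Var with another name and by a shadowing Abs
    return t
```

Note that a Var whose name differs from x and an Abs whose parameter equals x both end up in the final fallback, even though they partially matched an earlier branch.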

Our system visualizes this UCS expression as a tree, as shown in Figure 2.3. Blue nodes represent the values being checked in the UCS expression. The sub-trees extending from each blue node represent the branches taken after the value is checked. Gray dashed arrows indicate the possible fallback execution paths between branches.

Figure 2.3: A visualization of the UCS expression of the subst function.

Such a tree can also represent the trace of a single call to the subst function. For example, the call subst(Var("x"), "x", Var("y")) can be visualized as in Figure 2.4.

Figure 2.4: A visualization of the trace of the call subst(Var("x"), "x", Var("y")).

Now that we know such a tree can represent the trace of a single execution of a recursive function, we can record the traces of all calls of the recursive function and combine them into one large tree.
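Combining per-call traces into one tree can be sketched as follows. This is a simplified Python model, not the tool's implementation: CallTree, outline, and the branch labels are hypothetical, terms are encoded as plain tuples such as ("var", "x"), ("abs", "y", body), and ("app", lhs, rhs), and fresh names are drawn from a generator passed in by the caller.

```python
from contextlib import contextmanager
import itertools


class CallTree:
    """Collects one node per call; nested calls become child nodes."""

    def __init__(self):
        self.roots = []
        self._stack = []

    @contextmanager
    def call(self, description):
        node = {"call": description, "branch": None, "children": []}
        parent = self._stack[-1]["children"] if self._stack else self.roots
        parent.append(node)
        self._stack.append(node)
        try:
            yield node
        finally:
            self._stack.pop()


def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "abs":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])  # "app"


def subst(t, x, v, tree, fresh):
    """Substitution that records which branch each call took."""
    with tree.call(("subst", t, x, v)) as node:
        if t[0] == "var" and x == t[1]:
            node["branch"] = "Var"
            return v
        if t[0] == "abs" and x != t[1]:
            y, body = t[1], t[2]
            if y in free_vars(v):
                node["branch"] = "Abs/rename"
                y2 = next(fresh)
                renamed = subst(body, y, ("var", y2), tree, fresh)
                return ("abs", y2, subst(renamed, x, v, tree, fresh))
            node["branch"] = "Abs/else"
            return ("abs", y, subst(body, x, v, tree, fresh))
        if t[0] == "app":
            node["branch"] = "App"
            return ("app", subst(t[1], x, v, tree, fresh),
                    subst(t[2], x, v, tree, fresh))
        node["branch"] = "else"
        return t


def outline(nodes, depth=0):
    """Flatten the call tree into indented branch labels."""
    lines = []
    for node in nodes:
        lines.append("  " * depth + node["branch"])
        lines.extend(outline(node["children"], depth + 1))
    return lines


tree = CallTree()
fresh = (f"_v{i}" for i in itertools.count())
term = ("app", ("var", "x"), ("var", "z"))
result = subst(term, "x", ("var", "y"), tree, fresh)
```

Each node stores the branch taken by one call, and its children are the recursive calls triggered from that branch, so the outline shows at a glance which branches caused recursion.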

Conclusion

In this essay, we have presented a new debugging tool for UCS expressions. We have showcased three user interface features of our tool: culprit pattern highlighting, visualizing conditional flow, and automatic recursion tracing. They aim to help users locate problems quickly and effectively. The tool is still under development. In the future, we will conduct a user study to evaluate the effectiveness of our approach.