Recent interactive debugging tools offer high-level insight into
program execution, yet most target imperative programming languages
and neglect an important component of functional programming: pattern
matching. We present an interactive debugger built atop an expressive
pattern-matching syntax (the Ultimate Conditional Syntax) and describe
three user interfaces for understanding failures and control flow:
(1) culprit-pattern highlighting that pinpoints successful and failed
subpatterns, (2) fallback-flow visualization that reveals how default
branches are shared and taken, and (3) automatic recursion tracing
that shows the high-level control flow of recursive functions.
Introduction
Many debugging concepts commonly used today were already invented
during the era of mainframes. Compared with the past, today's
debugging tools are not fundamentally different. They all record
information during program execution and present it to users either
while the program is running or after it finishes. However, the amount
of data that can be collected during program execution is enormous.
What users actually need to focus on for debugging is often just a
tiny part of it. As computer storage has increased and interaction has
improved, designing better debugger front-ends to display collected
logs, traces, and symbol tables has become a widely discussed topic.
If all programs in the world were executed sequentially, there
probably wouldn't be so many debugging tools. What makes computer
programs complex is their control structures. When people debug, what
they care about most is the program's execution flow. They often have
questions like: “Why did this condition fail?”, “Why did this function
reach this point? It should have ended earlier.”, or “Which branch was
taken to reach this point?” Insightful debugging tools should help
people answer these high-level questions, not just provide the values
of variables at runtime.
Fortunately, there are already some tools that help people visualize
program execution. For example, CrossCode visualizes the runtime
execution of JavaScript functions from both control-flow and data-flow
perspectives and supports navigation between multiple levels of
abstraction (Hayatpur et al., 2023). Omnicode continuously executes
the program and displays intermediate variable values using scatter
plots to give users insight into the data distribution (Kang and Guo,
2017).
These works are all insightful, but they are mostly based on
imperative programming languages. As a result, their visualizations of
control flow are limited to if-then-else statements and loops. Pattern
matching, a very commonly used control-flow structure in functional
programming languages, is rarely addressed. Although there are many
interactive debuggers in the world of functional programming, they
lack the ability to provide users with the kind of high-level insight
offered by the tools mentioned earlier. Interactive debugging tools
optimized for pattern matching are still waiting to be explored: to
our knowledge, no one has designed an interactive debugger interface
specifically for it.
In this essay, we present our recent work on this problem. We
developed an interactive debugger and designed three interfaces for
it. Our system is based on an expressive pattern matching syntax: the
Ultimate Conditional Syntax (UCS). The UCS is a unified syntax that
introduces pattern matching, via the is operator, into nested
multi-way if expressions, with support for factoring out common
prefixes in conditions and interleaving computations within branches
(Cheng and Parreaux, 2024).
The UCS allows people to express multi-level conditional logic very
concisely using a single syntax. In other programming languages, such
logic often requires nested pattern matching and if-then-else
expressions. However, because the UCS is so flexible and extensible,
people sometimes write very long UCS expressions, which are difficult
to debug. Let's first look at a real-world example of the UCS.
fun expr(prec: Int): Expr = if peek is
  Some of
    Token.Numeric(literal) and literal.toIntOption is Some(value) then
      consume
      Expr.Num(value) exprCont(prec)
    Token.Identifier(name) then
      consume
      Expr.Var(name) exprCont(prec)
    Token.Symbol("(") then
      consume
      expr(0) require(Token.Symbol(")", true)) exprCont(prec)
    token then Expr.error of "Unexpected token " + token Token.summary()
  None then Expr.error of "Unexpected end of input"

fun exprCont(acc: Expr, prec: Int): Expr =
  if peek is Some(Token.Symbol(op))
    and op !== ")"
    and Expr.opPrec(op) is [leftPrec, rightPrec]
    and leftPrec > prec then
      consume
      let right = expr of rightPrec
      Expr.Infix(op, acc, right) exprCont(prec)
    else acc
The example is taken from an arithmetic expression parser written in
MLscript. It includes two functions, expr and exprCont. The expr
function parses an entire expression where all operators have
precedence greater than prec. It calls exprCont to parse the
continuation of the expression after the left-hand side of the
operator.
Note the four conditions connected by and in exprCont (lines 16 to
19). They share the same else (line 23), so when the function returns
acc, we don't know which condition failed. Debugging this kind of
expression imposes a considerable cognitive load. With a traditional
debugger, users need to set breakpoints on each of the four condition
expressions, or check the values of the variables involved in the
conditions in the symbol table, to find the reason. Users accustomed
to print debugging may also need to do some small refactoring, since
it's not possible to insert print expressions between the conditions.
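To make the refactoring cost concrete, here is a minimal sketch in Python (not MLscript) of the last three conditions of exprCont. The function names, the `op_prec` lookup parameter, and the `log` callback are assumptions made for illustration; only the conditions themselves come from the listing above.

```python
def can_continue(op, prec, op_prec):
    """Single conjunction: there is no place to print between the conditions."""
    return op != ")" and op_prec(op) is not None and op_prec(op)[0] > prec


def can_continue_logged(op, prec, op_prec, log=print):
    """Same logic, split into statements so each failure can be reported."""
    if op == ")":
        log("failed: op is ')'")
        return False
    entry = op_prec(op)  # hypothetical lookup returning (leftPrec, rightPrec) or None
    if entry is None:
        log(f"failed: no precedence for {op!r}")
        return False
    left_prec, _right_prec = entry
    if not left_prec > prec:
        log(f"failed: leftPrec {left_prec} is not greater than prec {prec}")
        return False
    return True
```

The second version is roughly three times as long as the first, which is the kind of temporary restructuring that print debugging tends to force on a chain of conditions.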
We argue that our debugger can accomplish this task and flexibly
handle more complex debugging needs through the following three
features.
Culprit pattern highlighting helps users quickly and effectively
locate the problem by using color channels to indicate whether each
sub-pattern in a pattern match matches the input value.
Visualizing conditional flow shows the execution path of the UCS
expression when the debugger pauses the program inside it. This helps
users understand which branches have been tried before and why the
current branch was taken.
Automatic recursion tracing visualizes the entire execution path of
recursive functions based on the branch structure of UCS expressions.
This helps users see which branches triggered recursion and reduces
the mental effort needed to understand recursive functions.
Methodology
In this section, we introduce three main features of the interactive
debugger we developed for MLscript. All of them make debugging the UCS
more straightforward and intuitive.
Culprit pattern highlighting
The culprit pattern refers to the first pattern (or condition) that
fails to match in a series of conjunctions. When the user pauses the
debugger at a certain point in a UCS expression, the debugger shows
the culprit patterns that caused the program to reach this branch.
It's important to note that simply highlighting the failed condition
is not enough. We also need to highlight the specific sub-pattern that
is the first to fail.
Let's take the exprCont function from Section 1 as an example. We use
red to highlight sub-patterns that fail to match or conditions that
evaluate to false, and green to highlight sub-patterns that match
successfully or conditions that evaluate to true. Gray is used to
highlight sub-patterns that are not used. Figure 2.1 shows some inputs
and how they appear in the debugger when execution pauses at the
branch else acc.
Figure 2.1: The simulated debugger's view.
In the first case, the value of peek is None. Therefore, it does not
match the first pattern. We highlight Some in red to indicate that the
culprit is the constructor pattern rather than the inner pattern.
In the second case, the value of peek is Some(Token.Symbol("**")). It
matches the first pattern and passes the second condition. But the
implementation of Expr.opPrec does not recognize "**" as an operator,
and it returns null. The culprit pattern is [leftPrec, rightPrec].
Note that we highlight the tuple pattern as failed and mark its
elements as unused, which clearly shows the culprit pattern.
In the third case, the value of peek is Some(Token.Symbol("+")). It
matches the first pattern and passes the second condition. The value
returned by Expr.opPrec is [1, 2], but the fourth condition does not
pass. Therefore, the culprit is the fourth condition.
In the fourth case, the value of peek is Some(Token.Symbol("+")). It
matches the first pattern and passes the second condition. The value
returned by Expr.opPrec is [1, 2]. The fourth condition also passes.
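The four cases above can be reproduced with a small sketch in Python (not MLscript) that walks the conjunction of exprCont and reports the first sub-pattern or condition that fails. Everything about the encoding is an assumption for illustration: tokens are modeled as `(kind, text)` tuples with `None` standing for an empty input, `op_prec` is a hypothetical precedence table that, like the Expr.opPrec described above, does not know "**", and the `prec` values are chosen so that the third and fourth cases differ.

```python
def op_prec(op):
    """Hypothetical precedence table; "**" is deliberately absent."""
    table = {"+": (1, 2), "*": (3, 4)}
    return table.get(op)


def culprit(peek, prec):
    """Return the first failing sub-pattern or condition, or None if all pass."""
    if peek is None:
        return "Some"                      # constructor pattern fails
    kind, op = peek
    if kind != "Symbol":
        return "Token.Symbol(op)"          # inner pattern fails
    if op == ")":
        return 'op !== ")"'
    entry = op_prec(op)
    if entry is None:
        return "[leftPrec, rightPrec]"     # tuple pattern fails, elements unused
    left_prec, _right_prec = entry
    if not left_prec > prec:
        return "leftPrec > prec"
    return None
```

A real implementation would also have to record which sub-patterns succeeded and which were never reached in order to color them green and gray; this sketch only locates the red one.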
Visualizing conditional flow
A single UCS expression can be long and complex. We found that when
people use UCS expressions, they are usually clear about their logic.
However, when debugging, they sometimes overlook the flow between
different branches.
Consider the following concise
balance
function for balancing an AVL tree. This function checks the height
difference between the subtrees of the current node and calls the
corresponding tree rotation function. This part of the implementation
is easy to get wrong when learning about AVL trees. Beginners might
write one of the conditions incorrectly, which can cause the tree to
not be rotated properly.
fun balance(t: AVLTreeNode): AVLTreeNode = if t is
  Node(_, l, r, _) and height(r) - height(l)
    > 1 and r is Node(_, rl, rr, _) and height(rr) - height(rl)
      > 0 then rotateLeft(t)
      < 0 and rl is Node then rotateRightLeft(t)
    < 1 and l is Node(_, ll, lr, _) and height(lr) - height(ll)
      > 0 and lr is Node then rotateLeftRight(t)
      < 0 then rotateRight(t)
  else t
We can see that in this implementation, the last line else t is used
as the fallback for many failed conditions. In fact, almost every
failed pattern match or condition before it ends up in this branch.
When people debug this program, they can only see that the program
entered the final branch, but they have no idea what happened in the
earlier branches. So it's very important to show the execution path of
the entire UCS expression.
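As a rough illustration of what "showing the execution path" involves, the following Python sketch (not MLscript) mirrors the branch structure of balance on precomputed height differences and logs every condition it tries. The helper `check`, the function `choose_rotation`, and its integer parameters are assumptions made for this sketch; the pattern checks on sub-trees are omitted, and the assumption that a failed inner branch falls through to the next outer branch is a simplification of the UCS semantics.

```python
def check(trace, label, outcome):
    """Record one tried condition together with its outcome."""
    trace.append((label, outcome))
    return outcome


def choose_rotation(diff, right_diff, left_diff, trace):
    """Pick a rotation name, logging each condition on the way."""
    if check(trace, "height(r) - height(l) > 1", diff > 1):
        if check(trace, "height(rr) - height(rl) > 0", right_diff > 0):
            return "rotateLeft"
        if check(trace, "height(rr) - height(rl) < 0", right_diff < 0):
            return "rotateRightLeft"
    if check(trace, "height(r) - height(l) < 1", diff < 1):
        if check(trace, "height(lr) - height(ll) > 0", left_diff > 0):
            return "rotateLeftRight"
        if check(trace, "height(lr) - height(ll) < 0", left_diff < 0):
            return "rotateRight"
    return "fallback: else t"
```

Calling `choose_rotation(2, 0, 0, trace)` and `choose_rotation(0, 0, 0, trace)` both end in the fallback, but the recorded traces differ: the first enters the right-heavy branch and fails both inner tests, while the second enters the other outer branch. That difference is exactly what is invisible when only the final branch is shown.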
Automatic recursion tracing
Because of the flexibility and simplicity of the UCS, many recursive
functions in functional programming can be written as a single UCS
expression. For example, the following is an implementation of
capture-avoiding substitution in lambda calculus.
fun subst(t: Term, x: Str, v: Term): Term = if t is
  Var(y) and x == y then v
  Abs(Var(y) as p, t') and x != y and
    hasFree(v, y) then
      let y' = freshVar(t')
      let t'' = subst(t', y, Var(y'))
      Abs(Var(y'), subst(t'', x, v))
    else Abs(p, subst(t', x, v))
  App(lhs, rhs) then App(subst(lhs, x, v), subst(rhs, x, v))
  else t
The body of this function consists of a single UCS expression. Here we
briefly explain what this function does.
Its first branch (line 2) checks whether t is a Var whose name y
equals x. If so, it directly returns v.
The second branch (line 3) is a bit more complex. It first checks
whether t is an Abs, and binds the Var representing the parameter to
p. Then it checks whether the parameter name y differs from the name x
to be replaced. If they are the same, it continues to match the third
branch. If they are different, it proceeds with the inner branches
from line 4 to line 8.
The first inner branch (line 4) checks whether the term v has a free
variable y. If it does, it generates a new variable y' and substitutes
y with y' in t', then performs the substitution recursively on the
result.
Otherwise, the second inner branch (line 8) performs the substitution
recursively on t'.
The third branch (line 9) checks whether t is an App. If so, it
performs substitution recursively on both sub-terms.
The final branch (line 10) serves as a fallback for all the cases
not covered by the previous branches.
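For readers who want to run the branches described above, here is a sketch of the same function in Python (not MLscript). The term classes and the bodies of `has_free` and `fresh_var` are assumptions, since the listing only uses hasFree and freshVar without defining them. In particular, `fresh_var` follows the listing's one-argument signature and only avoids names occurring in the body, so it is a simplification rather than a complete freshness check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Abs:
    param: Var
    body: object


@dataclass(frozen=True)
class App:
    lhs: object
    rhs: object


def names(t):
    """All variable names occurring in t, free or bound."""
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Abs):
        return {t.param.name} | names(t.body)
    return names(t.lhs) | names(t.rhs)


def has_free(t, y):
    """Assumed meaning of hasFree: does y occur free in t?"""
    if isinstance(t, Var):
        return t.name == y
    if isinstance(t, Abs):
        return t.param.name != y and has_free(t.body, y)
    return has_free(t.lhs, y) or has_free(t.rhs, y)


def fresh_var(t):
    """Assumed meaning of freshVar: a name not occurring in t.
    A full implementation would also avoid x and the names in v."""
    used = names(t)
    i = 0
    while f"v{i}" in used:
        i += 1
    return f"v{i}"


def subst(t, x, v):
    if isinstance(t, Var) and t.name == x:           # first branch
        return v
    if isinstance(t, Abs) and t.param.name != x:     # second branch
        y, body = t.param.name, t.body
        if has_free(v, y):                           # first inner branch
            y2 = fresh_var(body)
            renamed = subst(body, y, Var(y2))
            return Abs(Var(y2), subst(renamed, x, v))
        return Abs(t.param, subst(body, x, v))       # second inner branch
    if isinstance(t, App):                           # third branch
        return App(subst(t.lhs, x, v), subst(t.rhs, x, v))
    return t                                         # fallback
```

Note how the Python version needs a sequence of separate if statements to imitate the fall-through that the single UCS expression of subst expresses directly.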
Our system visualizes this UCS expression as a tree, as shown in
Figure 2.2. Blue nodes represent the values being checked in the UCS
expression. The sub-trees extending from each blue node represent the
branches taken after the value is checked. Gray dashed arrows indicate
the possible fallback execution paths between branches.
Figure 2.2: A visualization of the UCS expression of the subst
function.
Such a tree can represent the trace of a single call to the subst
function. For example, the call subst(Var("x"), "x", Var("y")) can be
visualized as in Figure 2.3.
Figure 2.3: A visualization of the trace of the call
subst(Var("x"), "x", Var("y")).
Now that we know such a tree can represent the trace of a single
execution of a recursive function, we can record the traces of all
calls of the recursive function and combine them into one large tree.
Conclusion
In this essay, we have presented a new debugging tool for UCS
expressions. We have showcased three user interface features of our
tool: culprit pattern highlighting, visualizing conditional flow, and
automatic recursion tracing. They aim to help users quickly and
effectively locate the problem. Currently, the tool is still being
developed. In the future, we will conduct a user study to evaluate the
effectiveness of our approach.