Term rewriting

Notation
Preface
Motivating Examples
- Symbolic derivation
- Group Theory
Abstract reduction system
- Notation
- Equivalence and reduction
Sudoku solver

Notation

$x \to y$ denotes rewrite rule which rewrites $x$ to $y$ , but not the other way around

Preface

Term rewriting vs. equational logic
- In term rewriting: equations are used as directed replacement rules, i.e. LHS can be replaced by RHS, but not vice versa
Equational reasoning is concerned with a rather restricted class of first-order languages
- Only predicate symbol is equality

Motivating Examples

Term rewriting system
Terms are built up by
- variables
- constant symbols
- function symbols
Termination: is it always the case that after finitely many rule applications we reach an expression to which no more rules apply?
- The resulting expression is called a normal form

Let $t$ be a term.

The normal form of $t$ , denoted $\hat{t}$ , is a term s.t. no further rewrite rules are applicable.

Symbolic derivation

Consider rules

$\begin{alignat*}{3} (R1) \quad & D_X(X) & \to 1 \\ (R2) \quad & D_X(Y) & \to 0 \\ (R3) \quad & D_X(u + v) & \to D_X(u) + D_X(v) \\ (R4) \quad & D_X(u * v) & \to D_X(u) * v + u * D_X(v) \end{alignat*}$

struct Termz{Op}
    args
end

struct D
    var
    expr
end

Termz(x) = x
Termz(expr::Expr) = begin
    @assert expr.head === :call
    Termz{eval(expr.args[1])}(Termz.(expr.args[2:end]))
end

terms(op, args::AbstractArray) = length(args) > 1 ? Termz{op}(args) : args[1]

# Have to implement this explicitly since using different arrays to hold `args`
import Base: ==
==(t1::Termz{Op1}, t2::Termz{Op2}) where {Op1, Op2} = false
==(t1::Termz{Op}, t2::Termz{Op}) where Op = (t1.args ∩ t2.args) == t1.args

t = Termz(:(3x + 2))

D(x, y::Termz) = Termz{D}([x, y])
D(x::Termz, y::Termz) = x == y ? 1 : Termz{D}([x, y])
D(x::Symbol, y::Symbol) = x == y ? 1 : 0
D(x, y::Number) = 0
D(x, t::Termz{+}) = begin
    # @info "calling Termz{+} with args" t

    Termz{+}(D.(x, t.args))
end
D(x, t::Termz{*}) = begin
    # @info "calling Termz{*} with args" t

    Termz{+}([
        Termz{*}([ D(x, t.args[1]), t.args[2:end]... ]),
        Termz{*}([ t.args[1], D(x, terms(*, t.args[2:end])) ])
    ])
end
D(x, t::Termz{^}) = begin
    # @info "calling Termz{^} with args" t

    Termz{*}([
        t.args[end], # exponent
        Termz{^}([
            t.args[1],
            Termz{+}([t.args[end], -1])
        ])
    ])
end

eval(x::Number) = x
eval(x::Number, dict) = x
eval(x::Symbol, dict) = dict[x]
eval(t::Termz{Op}) where Op = foldl(Op, eval.(t.args))
eval(t::Termz{Op}, dict) where Op = foldl(Op, [eval(a, dict) for a ∈ t.args])
(t::Termz)(; kwargs...) = eval(t, kwargs)

function variables(t::Termz; res = Set())
    for t_ ∈ t.args
        if t_ isa Symbol
            push!(res, t_)
        elseif t_ isa Termz
            variables(t_; res = res)
        end
    end

    return res
end


"Generates an anonymous function which evaluates together with the variable ordering."
generate(t::Termz) = begin
    vars = collect(variables(t))

    if length(vars) == 0
        return () -> eval(t, Dict()), []
    end

    expr = quote
        ($(vars...)) -> eval(t, Dict(zip($vars, [$(vars...)])))
    end

    return eval(expr), vars
end


# TESTING

eval(D(:x, t), Dict(:x => 2.0))

eval(t, Dict(:x => 1.0))

der = D(:x, Termz(:(3 * x * y)))
der(; x = 0.0, y = 1.0)

# Addition
t = Termz(:(x + 1))
@assert t(; x = 1) == 2
@assert D(:x, t)(; x = 100) == 1

# Multiplication
t = Termz(:(x * y))
@assert t(; x = 1.0, y = 2.0) == 2.0
@assert D(:x, t)(; x = 1.0, y = 2.0) == 2.0

# Power
t = Termz(:(x^3))
@assert t(; x = 2.0) == 8.0
@assert D(:x, t)(; x = 2.0) == 12.0

# All combined
t = Termz(:((3 + x)x^2 + 2x + 1))
@assert t(; x = 1) == 7
@assert D(:x, t)(; x = 2) == 26

println(D(:x, t))
println(D(:x, t)(; x = 2))

println(D(:x, Termz(:(x * x))))

# Macros for convenience

macro termz(expr)
    Termz(expr)
end

macro D(var, expr)
    D(Termz(var), Termz(expr))
end

println()
println("Macros:")
println(@termz 1 + x)
println(@D(x, 1 + 100x))
println(@D(x, x * x))


println(@D(x, f(x)))
println(D(:x, @termz f(x)))
println(@D(f(x), f(x)))
# println(@D(g(x), f(x)))  # FIXME: currently requires the function to be defined

g() = 1  # but doesn't have to have the correct signature; though `eval` will fail
println(@D(g(x), f(x)))

Termz{+}(Any[Termz{+}(Termz{*}[Termz{*}(Termz[Termz{+}([0, 1]), Termz{^}(Any[:x, 2])]), Termz{*}(Termz[Termz{+}(Any[3, :x]), Termz{*}(Any[2, Termz{^}(Any[:x, Termz{+}([2, -1])])])])]), Termz{+}(Termz{*}[Termz{*}(Any[0, :x]), Termz{*}([2, 1])]), 0])
26
Termz{+}(Termz{*}[Termz{*}(Any[1, :x]), Termz{*}(Any[:x, 1])])

Macros:
Termz{+}(Any[1, :x])
Termz{+}(Any[0, Termz{+}(Termz{*}[Termz{*}(Any[0, :x]), Termz{*}([100, 1])])])
Termz{+}(Termz{*}[Termz{*}(Any[1, :x]), Termz{*}(Any[:x, 1])])
Termz{D}(Any[:x, Termz{f}(Symbol[:x])])
Termz{D}(Any[:x, Termz{f}(Symbol[:x])])
1
Termz{D}(Termz[Termz{g}(Symbol[:x]), Termz{f}(Symbol[:x])])

By applying the corresponding rules in sequence we get

$\begin{equation*} D_X(X * X) \to (X * 1) + (1 * X) \end{equation*}$

Confluence
- Say there are different ways of applying rules to a given term $t$
- Suppose these different ways lead to two different terms $t_1$ and $t_2$
- Can we always find a common term $s$ that can be reached from both $t_1$ and $t_2$ by application of the defined rules?
- If this is true (for all terms in this system), we have confluence for the defined rules
Completion
- Can we always make a non-confluent system into a confluent system by adding implied rules
- This is referred to as completion of the system

One can show that $(R1)-(R4)$ is actually confluent.

Suppose we have some arbitrary term $t$
Consider all possible rules that was applied to the previous term $D_X(t_{-1}) \to t$

All rules are independent
Rules constitutes a functional program
- On the LHS, the defined function $D_X$ occurs only at the very outside
Termination of the resul means that $D_X is a total function
Confluence of the rules means that the result of a computation is independent of the evaluation/rewrite strategy

All term rewriting systems that constitute functional programs are confluent.

Group Theory

$\begin{alignat*}{3} (G1) & (x \circ y) \circ z & \approx x \circ (y \circ z) \\ (G2) & e \circ x & \approx x \\ (G3) & i(x) \circ x & \approx e \end{alignat*}$

Given a set of identities $E$ and two terms $s$ and $t$ , is it possible to transform the term $s$ into the term $t$ , using the identities in $E$ as rewrite rules that can be applied in both directions?

There are a couple of problems we need to overcome:

Equivalent terms can have distinct normal forms
Normal forms need not exist

Abstract reduction system

An abstract reduction system is a pair $(A, \to)$ , where the reduction $\to$ is a binary relation on the set $A$ , i.e. $\to \ \subseteq A \times A$ .

Instead of $(a, b) \in \to$ we write $a \to b$ ..

So a binary relation is just a subset of $A \times A$ which satisfy some property.

Notation

Identity:

$\begin{equation*} \overset{0}{\to} \ := \left\{ (x, x) \mid x \in A \right\} \end{equation*}$
(i + 1)-fold composition, $i \ge 0$

$\begin{equation*} \overset{i + 1}{\to} \ := \overset{i}{\to} \circ \to \end{equation*}$
Transistive closure

$\begin{equation*} \overset{+}{\to} \ := \bigcup_{i > 0}^{} \overset{i}{\to} \end{equation*}$
Reflexive transistive closure

$\begin{equation*} \overset{ * }{\to} \ := \overset{ + }{\to} \cup \overset{0}{\to} \end{equation*}$
Reflexive closure

$\begin{equation*} \overset{=}{\to} \ := \to \cup \overset{0}{\to} \end{equation*}$
Inverse

$\begin{equation*} \overset{-1}{\to} \ := \left\{ (y, x) \mid x \to y \right\} \end{equation*}$
Inverse

$\begin{equation*} \leftarrow \ := \overset{-1}{\to} \end{equation*}$
Symmetric closure

$\begin{equation*} \leftrightarrow \ := \rightarrow \cup \leftarrow \end{equation*}$
Transistive symmetric closure

$\begin{equation*} \overset{ + }{\leftrightarrow} \ := \big(\leftrightarrow \big)^{ + } \end{equation*}$
Reflexive transistive symmetric closure

$\begin{equation*} \overset{ * }{\leftrightarrow} \ := \big( \leftrightarrow \big)^* \end{equation*}$
$x$ is reducible if and only if

$\begin{equation*} \exists y: \quad x \to y \end{equation*}$
$x$ is in normal form (irreducible) if and only if it is not reducible
$y$ is a normal form of $x$ if and only if $x \overset{*}{\to} y$ and $y$ is in normal form. If $x$ has a uniquely determined normal form, the latter is denoted by $x \downarrow$ .
$y$ is a direct successor of $x$ if and only if $x \to y$
$y$ is a successor of $x$ if and only if $x \overset{+}{\to} y$
$x$ and $y$ are joinable if and only if

$\begin{equation*} \exists z: \quad x \overset{ * }{\to} z \overset{ * }{\leftarrow} y \end{equation*}$

in which case we write $x \downarrow y$
A reduction $\to$ is called Church-Rosser if and only if

$\begin{equation*} x \overset{ * }{\leftrightarrow} y \implies x \downarrow y \end{equation*}$
A reduction $\to$ is called confluent if and only if

$\begin{equation*} y_1 \overset{\leftarrow}{ * } x \overset{ * }{\to} y_2 \implies y_1 \downarrow y_2 \end{equation*}$
A reduction $\to$ is called terminating if and only if there is no infinitiely descending chain $a_0 \to a_1 \to \cdots$
A reduction $\to$ is called normalizing if and only if every element has a normal form
A reduction $\to$ is called convergent if and only if it is both confluent and terminating

Equivalence and reduction

Can view reduction in two ways

Directed computation, starting from some point , and then tries to reac a normal form by following the reduction
- Corresponds to idea of program evaluation
Consider merely as a description of , where means that there is a path between and where the arrows can be traversed in both directions, e.g.
- Corresponds to idea of identities which can be used in both directions

(2) is of course very slow to do in general, so would be nice if we could do (1) until we reach normal form and then compare the normal forms of the expressions
- Only works if reduction terminates and normal forms are unique: formally, we need termination and confluence of reduction

Basic definitions

Two relations $R \subseteq A \times B$ and $S \subseteq B \times C$

Their composition is defined by

$\begin{equation*} R \circ S := \left\{ (x, z) \in A \times C \mid \exists y \in B: (x, y) \in R \land (y, z) \in S \right\} \end{equation*}$

$\begin{alignat*}{2} \overset{0}{\to} &:= \left\{ (x, x) \mid x \in A \right\} & \text{identity} \\ \overset{i + 1}{\to} &:= \overset{i}{\to} \circ \to & (i + 1) \text{-fold composition, } i \ge 0 \\ \overset{+}{\to} &:= \bigcup_{i > 0}^{} \overset{i}{\to} & \text{transitive closure} \\ \overset{ * }{\to} & := \overset{ + }{\to} \cup \overset{0}{\to} & \text{reflexive transistive closure} \\ \overset{=}{\to} & := \to \cup \overset{0}{\to} & \text{reflexive closure} \\ \overset{-1}{\to} & := \left\{ (y, x) \mid x \to y \right\} & \text{inverse} \\ \leftarrow &:= \overset{-1}{\to} & \text{inverse} \\ \leftrightarrow &:= \rightarrow \cup \leftarrow & \text{symmetric closure} \\ \overset{ + }{\leftrightarrow} & := \big(\leftrightarrow \big)^{ + } & \text{transistive symmetric closure} \\ \overset{ * }{\leftrightarrow} &:= \big( \leftrightarrow \big)^* & \text{reflexive transistive symmetric closure} \end{alignat*}$

Notations like and only work for arrow-like symbols
- In the case of arbitrary relations $R \subseteq A \times A$ we write $R^*, R^{-1}$ , etc.
Some of the constructions can also be expressed nicely in terms of paths:
- $x \overset{n}{\to} y$ if there is a path of length $n$ from $x$ to $y$
- $x \overset{*}{\to} y$ if there is some finite path from $x$ to $y$
- $x \overset{ + }{\to} y$ if there is some finite nonempty path from $x$ to $y$
The of is the least set with property which contains
- E.g. $\overset{ * }{\to}$ , the reflexive transistive closure of $\to$ , is the least reflexive and transisitve relation which contains $\to$ .
- For arbitrary and , the of need not exist, but in the above cases they always do because reflexivity, transistivity and symmetry are closed under arbitrary intersections
  - In such cases, the $P \text{-closure}$ can be be defined directly as the intersection of all sets with property $P$ which contain $R$
Claim: is the least equivalence relation containing
- Proof: An equivalence relation containing $\to$ must also contain $\leftarrow$ by the symmetry property of a equivalence relation, thus is contains $\leftrightarrow$ . We just need to verify that it also contains $0$ and any number of $\leftrightarrow$ . Observe first that $\overset{0}{\leftrightarrow}$ is definitively in it because of the reflexitivity of the equivalence relation. Finally it then contains $(\leftrightarrow)^*$ because of transistivity. Hence any other equivalence relation $R'$ containing $\to$ must contain $R$ (set corresponding to $\overset{*}{\leftrightarrow}$ ), i.e. it's the least equivalence relation containing $\to$ .

Sudoku solver

Why? Because we can

cell_candidates(i, j, grid) = [k for k = 1:size(grid, 1) if k ∉ (Set(grid[i, :]) ∪ Set(grid[:, j]))]

subsolve!(grid) = begin
    # solve all the straight forward ones
    for i = 1:size(grid, 1)
        for j = 1:size(grid, 2)
            if grid[i, j] == 0
                candidates = cell_candidates(i, j, grid)
                if length(candidates) == 1
                    grid[i, j] = candidates[1]
                end
            end
        end
    end

    grid
end

solve!(grid; level = 0) = begin
    @info level

    success = false
    prev_soln = zeros(size(grid))

    num_steps = 0
    while grid != prev_soln
        copy!(prev_soln, grid)
        subsolve!(grid)

        # increment step counter
        num_steps += 1
        # @info "Completed step $num_steps"
    end

    # at this point, we either solved it completely or there are ambigious entries
    unsolved = findall(zeros(size(grid)) .== grid)
    # @info "unsolved" unsolved
    issolved = length(unsolved) == 0
    if !issolved
        # try out filling in different entries; starting with the ones with the fewest
        # possible candidates
        unsolved_ns = [length(cell_candidates(idx[1], idx[2], grid)) for idx ∈ unsolved]
        if 0 ∈ unsolved_ns
            return false
        end

        unsolved_indices = sort(1:length(unsolved); by = i -> unsolved_ns[i])
        # @info unsolved_ns[unsolved_indices]

        for i ∈ unsolved_indices
            idx = unsolved[i]
            soln = copy(grid)

            for v ∈ cell_candidates(idx[1], idx[2], soln)
                soln[idx] = v
                issolved = solve!(soln; level = level + 1)
                issolved && @info soln
                if issolved
                    grid .= soln
                    return true
                end
            end

            # @info "failed to solve grid"
        end
    end

    unsolved = findall(zeros(size(grid)) .== grid)
    return length(unsolved) == 0 # issolved
end

solve(grid) = begin
    soln = copy(grid)
    issolved = solve!(soln)
    if !issolved
        @warn "failed to solve Sudoku"
    end

    return soln
end

# # simple case
# grid = [
#     0 1 3;
#     3 2 0;
#     0 0 2
# ]
# @assert solve(grid) == [2 1 3; 3 2 1; 1 3 2]

# grid = [
#     0 1 3 0 5;
#     3 2 0 0 0;
#     0 0 2 0 0;
#     4 0 0 3 0;
#     1 0 0 0 0
# ]

# solve(grid)

# FAILED!!!
# Takes forever to run; might be a mistake in the grid
grid = [
    5 3 0 0 7 0 0 0 0;
    6 0 0 1 9 6 0 0 0;
    0 9 8 0 0 0 0 6 0;
    8 0 0 0 6 0 0 0 3;
    4 0 0 8 0 3 0 0 1;
    7 0 0 0 2 0 0 0 6;
    0 6 0 0 0 0 2 8 0;
    0 0 0 4 1 9 0 0 5;
    0 0 0 0 8 0 0 7 9;
]
size(grid)
@time res = solve(grid)

# Solved ✓
# - Requires 1 lvl nesting
grid = [
    0 0 3 0 1 0;
    5 6 0 3 2 0;
    0 5 4 2 0 3;
    2 0 6 4 5 0;
    0 1 2 0 4 5;
    0 4 0 1 0 0;
]

# Solved ✓
# - Requires 4 lvl nesting
grid = [
    0 0 0 1 0 6;
    6 0 4 0 0 0;
    1 0 2 0 0 0;
    0 0 0 5 0 1;
    0 0 0 6 0 3;
    5 0 6 0 0 0;
]

# Solved ✓
# - Requires 14 lvl nesting
grid = [
    0 0 0 4 0 0;
    0 0 6 0 0 3;
    0 0 0 0 0 0;
    0 0 0 0 0 0;
    6 0 0 5 0 0;
    0 0 1 0 0 0
]

# 7 x 7
# Solved ✓
# - Requires 17 lvl nesting
grid = [
    0 4 0 0 3 1 6 0;
    1 0 0 0 0 0 0 0;
    8 0 0 0 0 0 0 0;
    4 3 0 0 2 0 0 0;
    0 0 0 8 0 0 3 4;
    0 0 0 0 0 0 0 5;
    0 0 0 0 0 0 0 6;
    0 7 4 5 0 0 8 0;
]

# 8 x 8
# Solved ✓
# - Requires 10 lvl nesting
grid = [
    0 0 0 0 7 0 0 6;
    0 0 0 5 3 0 0 7;
    0 0 0 0 0 6 1 0;
    3 0 2 0 0 5 0 0;
    0 0 0 0 0 0 0 1;
    8 4 0 0 0 0 0 0;
    0 0 8 0 0 2 0 0;
    0 6 0 1 5 0 0 0
]

# Solved ✓
# - Requires lvl 14 nesting
grid = [
    0 0 0 2 0 3 0 0;
    0 0 6 0 0 7 0 0;
    3 8 0 0 0 0 6 0;
    0 0 0 7 3 0 0 5;
    7 0 0 4 1 0 0 0;
    0 2 0 0 0 0 7 4;
    0 0 3 0 0 5 0 0;
    0 0 1 0 7 0 0 0;
]
@assert size(grid) == (8, 8)

# 9 x 9
# Solved ✓
# - Requires lvl 14 nesting
# - @time: 1.177324 seconds (16.63 M allocations: 1.678 GiB, 17.42% gc time)
grid = [
    5 3 0 0 7 0 0 0 0;
    6 0 0 1 9 5 0 0 0;
    0 9 8 0 0 0 0 6 0;
    8 0 0 0 6 0 0 0 3;
    4 0 0 8 0 3 0 0 1;
    7 0 0 0 2 0 0 0 6;
    0 6 0 0 0 0 2 8 0;
    0 0 0 4 1 9 0 0 5;
    0 0 0 0 8 0 0 7 9
]

# Solved ✓
# - Requires lvl 18 nesting
# - @time: 0.020624 seconds (285.23 k allocations: 28.252 MiB, 11.97% gc time)
grid = [
    0 0 0 1 0 5 0 6 8;
    0 0 0 0 0 0 7 0 1;
    9 0 1 0 0 0 0 3 0;
    0 0 7 0 2 6 0 0 0;
    5 0 0 0 0 0 0 0 3;
    0 0 0 8 7 0 4 0 0;
    0 3 0 0 0 0 8 0 5;
    1 0 5 0 0 0 0 0 0;
    7 9 0 4 0 1 0 0 0;
]

# Solved ✓
# - Requires lvl 39 nesting
# - @time: 0.036620 seconds (693.66 k allocations: 68.903 MiB, 16.69% gc time)
grid = [
    1 0 0 0 0 0 0 0 0;
    0 0 0 0 1 2 0 0 0;
    0 0 0 0 0 0 0 0 1;
    0 8 0 0 0 0 5 0 1;
    0 5 0 0 8 9 0 0 7;
    0 9 0 0 0 0 0 0 0;
    0 0 0 0 0 0 6 5 0;
    0 0 0 0 0 0 0 0 0;
    0 0 0 0 3 0 0 0 0;
]

# Solved ✓
# - Requires 2 lvl nesting
# - @time: 0.011032 seconds (160.33 k allocations: 16.076 MiB, 20.82% gc time)
grid = [
    0 0 9 0 7 0 4 3 6;
    0 0 0 3 8 0 9 0 0;
    6 3 2 5 0 0 0 0 0;
    7 6 3 8 0 0 0 4 0;
    4 0 0 1 0 7 0 0 8;
    0 2 0 0 0 3 5 6 7;
    0 0 0 0 0 8 2 7 3;
    0 0 6 0 3 9 0 0 0;
    3 5 8 0 1 0 6 0 0;
]

@assert size(grid) == (9, 9)
@time res = solve(grid)
res

Term rewriting

Table of Contents

Notation

Preface

Motivating Examples

Symbolic derivation

Group Theory

Abstract reduction system

Notation

Equivalence and reduction

Basic definitions

Sudoku solver