# Term rewriting

## Table of Contents

## Notation

- denotes rewrite rule which rewrites to , but not the other way around

## Preface

- Term rewriting vs. equational logic
- In term rewriting: equations are used as
*directed*replacement rules, i.e. LHS can be replaced by RHS, but*not*vice versa

- In term rewriting: equations are used as
**Equational reasoning**is concerned with a rather restricted class of first-order languages- Only predicate symbol is equality

## Motivating Examples

**Term rewriting system****Terms**are built up by- variables
- constant symbols
- function symbols

**Termination:**is it always the case that after finitely many rule applications we reach an expression to which no more rules apply?- The resulting expression is called a
**normal form**

- The resulting expression is called a

Let be a term.

The **normal form** of , denoted , is a term s.t. no further rewrite rules are applicable.

### Symbolic derivation

Consider rules

struct Termz{Op} args end struct D var expr end Termz(x) = x Termz(expr::Expr) = begin @assert expr.head === :call Termz{eval(expr.args[1])}(Termz.(expr.args[2:end])) end terms(op, args::AbstractArray) = length(args) > 1 ? Termz{op}(args) : args[1] # Have to implement this explicitly since using different arrays to hold `args` import Base: == ==(t1::Termz{Op1}, t2::Termz{Op2}) where {Op1, Op2} = false ==(t1::Termz{Op}, t2::Termz{Op}) where Op = (t1.args ∩ t2.args) == t1.args t = Termz(:(3x + 2)) D(x, y::Termz) = Termz{D}([x, y]) D(x::Termz, y::Termz) = x == y ? 1 : Termz{D}([x, y]) D(x::Symbol, y::Symbol) = x == y ? 1 : 0 D(x, y::Number) = 0 D(x, t::Termz{+}) = begin # @info "calling Termz{+} with args" t Termz{+}(D.(x, t.args)) end D(x, t::Termz{*}) = begin # @info "calling Termz{*} with args" t Termz{+}([ Termz{*}([ D(x, t.args[1]), t.args[2:end]... ]), Termz{*}([ t.args[1], D(x, terms(*, t.args[2:end])) ]) ]) end D(x, t::Termz{^}) = begin # @info "calling Termz{^} with args" t Termz{*}([ t.args[end], # exponent Termz{^}([ t.args[1], Termz{+}([t.args[end], -1]) ]) ]) end eval(x::Number) = x eval(x::Number, dict) = x eval(x::Symbol, dict) = dict[x] eval(t::Termz{Op}) where Op = foldl(Op, eval.(t.args)) eval(t::Termz{Op}, dict) where Op = foldl(Op, [eval(a, dict) for a ∈ t.args]) (t::Termz)(; kwargs...) = eval(t, kwargs) function variables(t::Termz; res = Set()) for t_ ∈ t.args if t_ isa Symbol push!(res, t_) elseif t_ isa Termz variables(t_; res = res) end end return res end "Generates an anonymous function which evaluates together with the variable ordering." generate(t::Termz) = begin vars = collect(variables(t)) if length(vars) == 0 return () -> eval(t, Dict()), [] end expr = quote ($(vars...)) -> eval(t, Dict(zip($vars, [$(vars...)]))) end return eval(expr), vars end # TESTING eval(D(:x, t), Dict(:x => 2.0)) eval(t, Dict(:x => 1.0)) der = D(:x, Termz(:(3 * x * y))) der(; x = 0.0, y = 1.0) # Addition t = Termz(:(x + 1)) @assert t(; x = 1) == 2 @assert D(:x, t)(; x = 100) == 1 # Multiplication t = Termz(:(x * y)) @assert t(; x = 1.0, y = 2.0) == 2.0 @assert D(:x, t)(; x = 1.0, y = 2.0) == 2.0 # Power t = Termz(:(x^3)) @assert t(; x = 2.0) == 8.0 @assert D(:x, t)(; x = 2.0) == 12.0 # All combined t = Termz(:((3 + x)x^2 + 2x + 1)) @assert t(; x = 1) == 7 @assert D(:x, t)(; x = 2) == 26 println(D(:x, t)) println(D(:x, t)(; x = 2)) println(D(:x, Termz(:(x * x)))) # Macros for convenience macro termz(expr) Termz(expr) end macro D(var, expr) D(Termz(var), Termz(expr)) end println() println("Macros:") println(@termz 1 + x) println(@D(x, 1 + 100x)) println(@D(x, x * x)) println(@D(x, f(x))) println(D(:x, @termz f(x))) println(@D(f(x), f(x))) # println(@D(g(x), f(x))) # FIXME: currently requires the function to be defined g() = 1 # but doesn't have to have the correct signature; though `eval` will fail println(@D(g(x), f(x)))

Termz{+}(Any[Termz{+}(Termz{*}[Termz{*}(Termz[Termz{+}([0, 1]), Termz{^}(Any[:x, 2])]), Termz{*}(Termz[Termz{+}(Any[3, :x]), Termz{*}(Any[2, Termz{^}(Any[:x, Termz{+}([2, -1])])])])]), Termz{+}(Termz{*}[Termz{*}(Any[0, :x]), Termz{*}([2, 1])]), 0]) 26 Termz{+}(Termz{*}[Termz{*}(Any[1, :x]), Termz{*}(Any[:x, 1])]) Macros: Termz{+}(Any[1, :x]) Termz{+}(Any[0, Termz{+}(Termz{*}[Termz{*}(Any[0, :x]), Termz{*}([100, 1])])]) Termz{+}(Termz{*}[Termz{*}(Any[1, :x]), Termz{*}(Any[:x, 1])]) Termz{D}(Any[:x, Termz{f}(Symbol[:x])]) Termz{D}(Any[:x, Termz{f}(Symbol[:x])]) 1 Termz{D}(Termz[Termz{g}(Symbol[:x]), Termz{f}(Symbol[:x])])

By applying the corresponding rules in sequence we get

**Confluence**- Say there are different ways of applying rules to a given term
- Suppose these different ways lead to two
*different*terms and - Can we always find a common term that can be reached from both and by application of the defined rules?
- If this is true (for all terms in this system), we have
*confluence*for the defined*rules*

**Completion**- Can we always make a
*non-confluent*system into a*confluent*system by adding implied rules - This is referred to as
*completion*of the system

- Can we always make a

One can show that is actually *confluent*.

- Suppose we have some arbitrary term
- Consider all possible rules that was applied to the previous term

OR

- All rules are independent
- Rules constitutes a
*functional program*- On the LHS, the defined function occurs only at the very outside

- Termination of the resul means that $D
_{X}is a*total function* - Confluence of the rules means that the result of a computation is independent of the evaluation/rewrite strategy

All term rewriting systems that constitute functional programs are confluent.

### Group Theory

Given a set of identities and two terms and , is it possible to transform the term into the term , using the identities in as rewrite rules that can be applied in *both* directions?

There are a couple of problems we need to overcome:

- Equivalent terms can have distinct normal forms
- Normal forms need not exist

## Abstract reduction system

An **abstract reduction system** is a pair , where the **reduction** is a binary relation on the set , i.e. .

Instead of we write ..

So a binary *relation* is just a subset of which satisfy some property.

### Notation

Identity:

(i + 1)-fold composition,

Transistive closure

Reflexive transistive closure

Reflexive closure

Inverse

Inverse

Symmetric closure

Transistive symmetric closure

Reflexive transistive symmetric closure

is

**reducible**if and only if- is
**in normal form (irreducible)**if and only if it is*not*reducible - is
**a normal form of**if and only if and is in normal form. If has a uniquely determined normal form, the latter is denoted by . - is a
**direct successor**of if and only if - is a
**successor**of if and only if and are

**joinable**if and only ifin which case we write

A reduction is called

**Church-Rosser**if and only ifA reduction is called

**confluent**if and only if- A reduction is called
**terminating**if and only if there is*no*infinitiely descending chain - A reduction is called
**normalizing**if and only if every element has a normal form - A reduction is called
**convergent**if and only if it is both confluent and terminating

### Equivalence and reduction

Can view reduction in two ways

- Directed computation, starting from some point , and then tries to reac a normal form by following the reduction
- Corresponds to idea of program evaluation

- Consider merely as a description of , where means that there is a path between and where the arrows can be traversed in
*both*directions, e.g.- Corresponds to idea of identities which can be used in both directions

So

- (2) is of course very slow to do in general, so would be nice if we could do (1) until we reach normal form and then compare the normal forms of the expressions
- Only works if reduction terminates and normal forms are unique: formally, we need
*termination*and*confluence*of reduction

- Only works if reduction terminates and normal forms are unique: formally, we need

#### Basic definitions

- Two relations and

Their **composition** is defined by

- Notations like and only work for arrow-like symbols
- In the case of arbitrary relations we write , etc.

- Some of the constructions can also be expressed nicely in terms of
*paths*:- if there is a path of length from to
- if there is some finite path from to
- if there is some finite
*nonempty*path from to

- The of is the least set with property which contains
- E.g. , the reflexive transistive closure of , is the least reflexive and transisitve relation which contains .
- For arbitrary and , the of need not exist, but in the above cases they always do because reflexivity, transistivity and symmetry are closed under arbitrary intersections
- In such cases, the can be be defined directly as the intersection of all sets with property which contain

- Claim: is the least equivalence relation containing
- Proof: An equivalence relation containing must also contain by the symmetry property of a equivalence relation, thus is contains . We just need to verify that it also contains and any number of . Observe first that is definitively in it because of the reflexitivity of the equivalence relation. Finally it then contains because of transistivity. Hence any other equivalence relation containing must contain (set corresponding to ), i.e. it's the
*least*equivalence relation containing .

- Proof: An equivalence relation containing must also contain by the symmetry property of a equivalence relation, thus is contains . We just need to verify that it also contains and any number of . Observe first that is definitively in it because of the reflexitivity of the equivalence relation. Finally it then contains because of transistivity. Hence any other equivalence relation containing must contain (set corresponding to ), i.e. it's the

## Sudoku solver

- Why? Because we can

cell_candidates(i, j, grid) = [k for k = 1:size(grid, 1) if k ∉ (Set(grid[i, :]) ∪ Set(grid[:, j]))] subsolve!(grid) = begin # solve all the straight forward ones for i = 1:size(grid, 1) for j = 1:size(grid, 2) if grid[i, j] == 0 candidates = cell_candidates(i, j, grid) if length(candidates) == 1 grid[i, j] = candidates[1] end end end end grid end solve!(grid; level = 0) = begin @info level success = false prev_soln = zeros(size(grid)) num_steps = 0 while grid != prev_soln copy!(prev_soln, grid) subsolve!(grid) # increment step counter num_steps += 1 # @info "Completed step $num_steps" end # at this point, we either solved it completely or there are ambigious entries unsolved = findall(zeros(size(grid)) .== grid) # @info "unsolved" unsolved issolved = length(unsolved) == 0 if !issolved # try out filling in different entries; starting with the ones with the fewest # possible candidates unsolved_ns = [length(cell_candidates(idx[1], idx[2], grid)) for idx ∈ unsolved] if 0 ∈ unsolved_ns return false end unsolved_indices = sort(1:length(unsolved); by = i -> unsolved_ns[i]) # @info unsolved_ns[unsolved_indices] for i ∈ unsolved_indices idx = unsolved[i] soln = copy(grid) for v ∈ cell_candidates(idx[1], idx[2], soln) soln[idx] = v issolved = solve!(soln; level = level + 1) issolved && @info soln if issolved grid .= soln return true end end # @info "failed to solve grid" end end unsolved = findall(zeros(size(grid)) .== grid) return length(unsolved) == 0 # issolved end solve(grid) = begin soln = copy(grid) issolved = solve!(soln) if !issolved @warn "failed to solve Sudoku" end return soln end # # simple case # grid = [ # 0 1 3; # 3 2 0; # 0 0 2 # ] # @assert solve(grid) == [2 1 3; 3 2 1; 1 3 2] # grid = [ # 0 1 3 0 5; # 3 2 0 0 0; # 0 0 2 0 0; # 4 0 0 3 0; # 1 0 0 0 0 # ] # solve(grid) # FAILED!!! # Takes forever to run; might be a mistake in the grid grid = [ 5 3 0 0 7 0 0 0 0; 6 0 0 1 9 6 0 0 0; 0 9 8 0 0 0 0 6 0; 8 0 0 0 6 0 0 0 3; 4 0 0 8 0 3 0 0 1; 7 0 0 0 2 0 0 0 6; 0 6 0 0 0 0 2 8 0; 0 0 0 4 1 9 0 0 5; 0 0 0 0 8 0 0 7 9; ] size(grid) @time res = solve(grid) # Solved ✓ # - Requires 1 lvl nesting grid = [ 0 0 3 0 1 0; 5 6 0 3 2 0; 0 5 4 2 0 3; 2 0 6 4 5 0; 0 1 2 0 4 5; 0 4 0 1 0 0; ] # Solved ✓ # - Requires 4 lvl nesting grid = [ 0 0 0 1 0 6; 6 0 4 0 0 0; 1 0 2 0 0 0; 0 0 0 5 0 1; 0 0 0 6 0 3; 5 0 6 0 0 0; ] # Solved ✓ # - Requires 14 lvl nesting grid = [ 0 0 0 4 0 0; 0 0 6 0 0 3; 0 0 0 0 0 0; 0 0 0 0 0 0; 6 0 0 5 0 0; 0 0 1 0 0 0 ] # 7 x 7 # Solved ✓ # - Requires 17 lvl nesting grid = [ 0 4 0 0 3 1 6 0; 1 0 0 0 0 0 0 0; 8 0 0 0 0 0 0 0; 4 3 0 0 2 0 0 0; 0 0 0 8 0 0 3 4; 0 0 0 0 0 0 0 5; 0 0 0 0 0 0 0 6; 0 7 4 5 0 0 8 0; ] # 8 x 8 # Solved ✓ # - Requires 10 lvl nesting grid = [ 0 0 0 0 7 0 0 6; 0 0 0 5 3 0 0 7; 0 0 0 0 0 6 1 0; 3 0 2 0 0 5 0 0; 0 0 0 0 0 0 0 1; 8 4 0 0 0 0 0 0; 0 0 8 0 0 2 0 0; 0 6 0 1 5 0 0 0 ] # Solved ✓ # - Requires lvl 14 nesting grid = [ 0 0 0 2 0 3 0 0; 0 0 6 0 0 7 0 0; 3 8 0 0 0 0 6 0; 0 0 0 7 3 0 0 5; 7 0 0 4 1 0 0 0; 0 2 0 0 0 0 7 4; 0 0 3 0 0 5 0 0; 0 0 1 0 7 0 0 0; ] @assert size(grid) == (8, 8) # 9 x 9 # Solved ✓ # - Requires lvl 14 nesting # - @time: 1.177324 seconds (16.63 M allocations: 1.678 GiB, 17.42% gc time) grid = [ 5 3 0 0 7 0 0 0 0; 6 0 0 1 9 5 0 0 0; 0 9 8 0 0 0 0 6 0; 8 0 0 0 6 0 0 0 3; 4 0 0 8 0 3 0 0 1; 7 0 0 0 2 0 0 0 6; 0 6 0 0 0 0 2 8 0; 0 0 0 4 1 9 0 0 5; 0 0 0 0 8 0 0 7 9 ] # Solved ✓ # - Requires lvl 18 nesting # - @time: 0.020624 seconds (285.23 k allocations: 28.252 MiB, 11.97% gc time) grid = [ 0 0 0 1 0 5 0 6 8; 0 0 0 0 0 0 7 0 1; 9 0 1 0 0 0 0 3 0; 0 0 7 0 2 6 0 0 0; 5 0 0 0 0 0 0 0 3; 0 0 0 8 7 0 4 0 0; 0 3 0 0 0 0 8 0 5; 1 0 5 0 0 0 0 0 0; 7 9 0 4 0 1 0 0 0; ] # Solved ✓ # - Requires lvl 39 nesting # - @time: 0.036620 seconds (693.66 k allocations: 68.903 MiB, 16.69% gc time) grid = [ 1 0 0 0 0 0 0 0 0; 0 0 0 0 1 2 0 0 0; 0 0 0 0 0 0 0 0 1; 0 8 0 0 0 0 5 0 1; 0 5 0 0 8 9 0 0 7; 0 9 0 0 0 0 0 0 0; 0 0 0 0 0 0 6 5 0; 0 0 0 0 0 0 0 0 0; 0 0 0 0 3 0 0 0 0; ] # Solved ✓ # - Requires 2 lvl nesting # - @time: 0.011032 seconds (160.33 k allocations: 16.076 MiB, 20.82% gc time) grid = [ 0 0 9 0 7 0 4 3 6; 0 0 0 3 8 0 9 0 0; 6 3 2 5 0 0 0 0 0; 7 6 3 8 0 0 0 4 0; 4 0 0 1 0 7 0 0 8; 0 2 0 0 0 3 5 6 7; 0 0 0 0 0 8 2 7 3; 0 0 6 0 3 9 0 0 0; 3 5 8 0 1 0 6 0 0; ] @assert size(grid) == (9, 9) @time res = solve(grid) res