4 Lazy pattern matching
4.1 Lazy pattern matching in theory
In previous sections we only considered strict ML.
In a strict language we can define ML pattern matching as a predicate
operating on terms, which are also the values of program expressions.
In other words, pattern matching applies to completely evaluated
expressions, or normal forms.
Studying pattern matching in a lazy language such as Haskell requires a
more sophisticated semantical setting.
Essentially, lazy languages manipulate values that are known only partially
and, more significantly for our study, pattern matching operates on
such incomplete values.
Partial values:
  v ::= Ω                      undefined value
      | c(v1, v2, …, va)       (constructor) head-normal form
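For illustration, partial values admit a direct Haskell representation. The following sketch is our own, not the paper's; constructors are named by strings.

-- Partial values: either the undefined value Ω, or a head-normal
-- form whose arguments may themselves be partial.
data Value = Omega            -- Ω, not yet evaluated to a constructor
           | C String [Value] -- c(v1, ..., va)
  deriving (Eq, Show)

-- Example: the fully known value True, and a partially known pair.
vTrue, vPair :: Value
vTrue = C "True" []
vPair = C "Pair" [C "True" [], Omega]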
Definition 1 of the instance relation for patterns and
values applies unchanged to partial values (we have _ ≼ Ω,
as value Ω possesses all types). Thus we keep the same notation
≼, and maintain Definition 3 of the
instance relation for matrices.
However, we cannot keep Definition 2 of ML pattern matching.
Example 3
Let us consider a simple example:
case e of True -> 1 | _ -> 2
If the root symbol of expression e is not a constructor, its partial
value is Ω.
Then, since True ⋠ Ω and _ ≼ Ω,
the value of the whole expression is 2.
But, if we compute e further, its value may become True. Then,
the value of the whole expression becomes 1.
Something is wrong, since the value of the whole expression changed
from 2 to 1.
More generally, partial values and computation interact.
Let us consider some expression e.
If the root symbol of expression e is a constructor c, then
expression e is a head-normal form and we express the “current value”
of e as c(v1,v2, … ,va)
— see Huet and Lévy [1991] for a more precise treatment in the context of term
rewriting systems.
Otherwise, the “current value” of expression e is Ω.
Then, we consider various “current values” along the
evaluation of e. As constructors cannot be reduced, those values are
increasing according to the following precision ordering.
Definition 7 (Precision ordering)
Relation ≼Ω is defined on pairs of values (v, w) as follows.

  Ω ≼Ω w
  c(v1, …, va) ≼Ω c(w1, …, wa)    iff (v1⋯va) ≼Ω (w1⋯wa)
  (v1⋯vn) ≼Ω (w1⋯wn)              iff vi ≼Ω wi, for all i ∈ [1…n]
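Using the Value type sketched earlier, the precision ordering transcribes directly (again our code; arities of equal constructors are assumed to agree).

-- lessDef v w tests v ≼Ω w: Ω is below every value, and equal
-- constructors compare argument-wise.
lessDef :: Value -> Value -> Bool
lessDef Omega    _         = True
lessDef (C c vs) (C c' ws) = c == c' && and (zipWith lessDef vs ws)
lessDef (C _ _)  Omega     = False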
To be of practical use, a predicate P that defines pattern matching
must be monotonic. That is, when P(v) holds,
P(w) also holds for all w such that v ≼Ω w.
With monotonic predicates, matching
decisions do not change during computations. One should notice that,
given any pattern p, the predicate p ≼ v is monotonic
in v. Example 3 shows that the predicate P ⋠ v→ is not monotonic in general.
We thus need a new definition of pattern matching. For the moment, we
leave most of lazy pattern matching unspecified.
Definition 8 (General (lazy) pattern matching)
Let P(P, v→) be a predicate defined over
pattern matrices P and value vectors v→, where the size n
of v→ is equal to the width of P.
Row number i in P filters v→,
if and only if the following condition holds:

P(P[1…i), v→) ∧ p→i ≼ v→.
We call P the disambiguating predicate and
now look for sufficient conditions on P that account for our
intuition of pattern matching in a lazy language.
- Pattern matching is deterministic, in the sense that at most one
  clause is matched. Hence, for all P and v→, we assume:
  P ≼ v→ =⇒ ¬P(P, v→).
- Matching the first row of a matrix reduces to the instance
  relation. Hence, for all v→, we assume:
  P(∅, v→) = True.
- We require predicate P to be monotonic in its value
  component. That is, given any matrix P, for all value vectors
  v→ and w→, we assume:
  P(P, v→) ∧ v→ ≼Ω w→ =⇒ P(P, w→).
The three conditions above are our basic restrictions on P.
We further define UP and MP as U and M
(Definition 6) parameterized by P.
Now, given a definition of lazy pattern matching, we face
the temptation to assume that the computation of U described in
Section 3.1 still works for UP.
More precisely, by finding additional sufficient conditions on
predicate P we aim at proving UP = Urec.
Thus, we re-examine the proof of Proposition 2 in the context of
lazy pattern matching.
Base cases follow from basic restrictions.
-
By our first basic restriction and since
( ) ≼ (), we have
P(( ), ()) = False. Thus we have UP(( ), ()) =
False.
- By our second basic restriction, we directly get
UP(∅, q→ ) = True.
To prove the inductive cases, it suffices to reformulate the key
properties of Lemma 1, replacing P ⋠ v→ by
P(P, v→). However, the key properties are now assumed rather than
established.
Definition 9 (Key properties)
We say that predicate P meets the key properties when the following four
properties hold.
For any matrix P, constructor c, and value vector v→
such that v1 = c(w1, …, wa), we assume:

P(P, v→) ⇐⇒ P(S(c, P), S(c, v→)).   (1)

Additionally, for any value vector v→, we assume:

P(P, (v1 v2⋯vn)) =⇒ P(D(P), (v2⋯vn)).   (2)

Furthermore, given any matrix P, let Σ be the set of the root
constructors of P’s first column.
If Σ is not empty, then for any constructor c not in Σ and
any value vector (w1⋯wa v2⋯vn), we assume:

P(D(P), (v2⋯vn)) =⇒ P(P, (c(w1, …, wa) v2⋯vn)).   (3)

If Σ is empty, then, for any value vector v→, we instead assume:

P(D(P), (v2⋯vn)) =⇒ P(P, (v1 v2⋯vn)).   (4)
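The key properties are phrased in terms of the specialization S(c, ·) and default D(·) transformations of Section 3. As a reminder, here is a hedged Haskell sketch of both, using our own pattern representation (wildcards, constructor patterns, or-patterns); the arity of c is passed explicitly, where a compiler would look it up.

-- Patterns: wildcard, constructor pattern, or-pattern.
data Pat = Wild | Con String [Pat] | Or Pat Pat
  deriving Show

type Row = [Pat]   -- a pattern vector (p1 ... pn)

-- S(c, P): rows whose first pattern is compatible with constructor c
-- (of arity a), with the a sub-patterns exposed.
specialize :: String -> Int -> [Row] -> [Row]
specialize c a = concatMap spec
  where
    spec (Con c' qs : ps)
      | c == c'          = [qs ++ ps]
      | otherwise        = []
    spec (Wild : ps)     = [replicate a Wild ++ ps]
    spec (Or q1 q2 : ps) = spec (q1 : ps) ++ spec (q2 : ps)
    spec []              = []

-- D(P): rows whose first pattern does not inspect its argument.
defaultMatrix :: [Row] -> [Row]
defaultMatrix = concatMap def
  where
    def (Con _ _ : _)   = []
    def (Wild : ps)     = [ps]
    def (Or q1 q2 : ps) = def (q1 : ps) ++ def (q2 : ps)
    def []              = []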
It is not obvious that assuming the key properties
suffices to prove that UP can be computed as U is,
since Ω does not appear in the proof of Proposition 2.
Indeed, monotonicity plays some part here.
Proposition 3
We have UP = Urec.
Proof:
Base cases follow from the basic restrictions, while the proofs of all
inductive cases in Proposition 2, except 2-(a), apply unchanged.
Hence, we assume q1 to be a wildcard and the set Σ to be
a complete signature.
We need to prove:

UP(P, ((_ : t) q2⋯qn)) = ⋁_{ck ∈ Σ} UP(S(ck, P), S(ck, q→)).

By (1), for any constructor ck in Σ and
any vector v→ such that v1 = ck(w1, …, wak), we have:

v→ ∈ MP(P, q→) ⇐⇒ S(ck, v→) ∈ MP(S(ck, P), S(ck, q→)).

Hence, a potential difficulty arises for vectors v→
in MP(P, q→) whose first component v1 is Ω.
Then, by the monotonicity of P (and of ≼) and the non-empty type axiom,
for any constructor c in the signature of
type t, there exists (w1⋯wa) such that
(c(w1, …, wa) v2⋯vn) ∈ MP(P, q→).
That is (by (1) in the forward direction), UP(S(ck, P),
S(ck, q→)) holds for all ck in Σ.
□
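To make this statement concrete, here is a hedged sketch of algorithm Urec itself, built on the specialize and defaultMatrix functions sketched above. The signatureOf function is a stand-in for the type information a real compiler would supply; we hard-wire Booleans only for the running example, and the or-pattern case below is the plain utility rule, not the refined treatment the paper develops for or-pattern warnings.

import Data.List (nub)

-- Stand-in for the type environment: all constructors (with arities)
-- of the type that constructor c belongs to.
signatureOf :: String -> [(String, Int)]
signatureOf c
  | c `elem` ["True", "False"] = [("True", 0), ("False", 0)]
  | otherwise                  = error "signatureOf: unknown constructor"

-- Root constructors (with arities) of P's first column.
rootCons :: [Row] -> [(String, Int)]
rootCons rows = nub (concatMap (roots . head) rows)
  where
    roots (Con c qs) = [(c, length qs)]
    roots Wild       = []
    roots (Or p1 p2) = roots p1 ++ roots p2

-- urec P q: is pattern vector q useful with respect to matrix P?
urec :: [Row] -> Row -> Bool
urec rows []       = null rows        -- base cases
urec rows (q : qs) = case q of
  Con c args ->                       -- first pattern is a constructor
    urec (specialize c (length args) rows) (args ++ qs)
  Or q1 q2 ->                         -- or-pattern: either branch useful
    urec rows (q1 : qs) || urec rows (q2 : qs)
  Wild ->                             -- first pattern is a wildcard
    let sigma    = rootCons rows
        complete = not (null sigma) &&
                   all (\(c, _) -> c `elem` map fst sigma)
                       (signatureOf (fst (head sigma)))
    in if complete                    -- complete signature: case 2-(a)
       then or [ urec (specialize c a rows) (replicate a Wild ++ qs)
               | (c, a) <- sigma ]
       else urec (defaultMatrix rows) qs   -- otherwise: default matrix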
As an immediate consequence of the proposition above, the
useless clause problem is now solved in the lazy case
(see second item in Proposition 1).
However, the exact formulation of exhaustiveness needs a slight change.
Reconsider Example 3.
case e of True -> 1 | _ -> 2
By definition of ≼, True ≼ Ω does not
hold, hence Ω cannot match the first clause.
However, P((True), Ω)
does not hold either, by monotonicity.
Hence, there is no row that filters value Ω and
Definition 4 would flag this matching as non-exhaustive,
a clear contradiction with our intuition of exhaustiveness.
Thus, we now directly define an exhaustive matrix P from the condition
UP(P, (_⋯_)) = False. That is,
P is exhaustive, if and only if for all vectors v→,
P(P, v→) does not hold.
By this new definition, the example is exhaustive:
for any value v in {Ω, False, True},
we have _ ≼ v, and thus P ≼ v.
Hence, by our first basic restriction, for all v, P(P, v) does not hold.
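With the urec sketch above, this revised exhaustiveness check is a single utility query on a row of wildcards; the encoding of the Example 3 matrix is ours.

-- P is exhaustive iff a row of n wildcards is useless w.r.t. P.
exhaustive :: [Row] -> Int -> Bool
exhaustive rows n = not (urec rows (replicate n Wild))

-- Example 3: rows (True) and (_); the match is exhaustive.
example3Exhaustive :: Bool
example3Exhaustive = exhaustive [[Con "True" []], [Wild]] 1  -- True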
4.2 Lazy pattern matching, à la Laville
Laville’s definition of lazy pattern matching Laville [1991] stems
directly from the need for a monotonic P:
if we decide that some term is evaluated enough not to match a
pattern, then we want this to remain true when the term is evaluated
further.
By definition, matrix P and value v→ are incompatible,
written P # v→, when
making v→ more precise cannot produce an instance of P. That is,
P # v→ means:

for all w→ such that v→ ≼Ω w→, we have P ⋠ w→.
Definition 10 (Lazy pattern matching (Laville))
Define P(P, v→) = P # v→ in the generic Definition 8.
The first basic restriction follows by letting
w→ be v→ in the definition of P #v→,
the second restriction follows from ∅ ⋠w→ for
all w→, and monotonicity is a
consequence of the transitivity of ≼Ω.
Incompatibility is the most general P in the following sense:
for any predicate P, any matrix P and
any value vector v→, we have:

P(P, v→) =⇒ P # v→.

For, if P and v→ are compatible (i.e. not incompatible), then there exists w→, with
v→ ≼Ω w→ and P ≼ w→.
Thus, by the first basic restriction, P(P, w→) does not hold,
and, by the monotonicity of P, P(P, v→) does not hold either.
Incompatibility is easily computed by the following rules.
c(p1, …, pa) # c′(v1, …, va′)    (where c ≠ c′)
c(p1, …, pa) # c(v1, …, va)      iff (p1⋯pa) # (v1⋯va)
(p1⋯pn) # (v1⋯vn)                iff there exists i ∈ [1…n], pi # vi
(p1∣p2) # v                      iff p1 # v and p2 # v
P # v→                           iff for all i ∈ [1…m], p→i # v→
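These rules transcribe directly; the sketch below uses our Pat and Value types (the zipWith assumes matching arities).

-- incompat p v: no refinement of v can be an instance of p.
incompat :: Pat -> Value -> Bool
incompat (Con c ps) (C c' vs)
  | c /= c'   = True                  -- clashing constructors
  | otherwise = incompatRow ps vs     -- same head: look inside
incompat (Or p1 p2) v = incompat p1 v && incompat p2 v
incompat _ _ = False                  -- wildcards and Ω never clash

-- A vector is incompatible as soon as one component is.
incompatRow :: [Pat] -> [Value] -> Bool
incompatRow ps vs = or (zipWith incompat ps vs)

-- P # v: every row of the matrix is incompatible with v.
incompatMat :: [Row] -> [Value] -> Bool
incompatMat rows vs = all (`incompatRow` vs) rows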
It is routine to show that incompatibility meets key properties.
Hence, by Proposition 3,
algorithm Urec is correct with respect to Laville’s semantics.
Laville’s definition is quite appealing as a good, implementation-independent
definition of lazy pattern matching. However, there is a
slight difficulty: for some matrices P, predicate P # v→ is not sequential
in v→, in the sense of Kahn and Plotkin [1978].
This means that its compilation on an
ordinary, sequential computer is problematic Maranget [1992], Sekar et al. [1992].
As a consequence, the Haskell committee adopted
another semantics for pattern matching.
Their definition is aware of the
presence of Ω and solves the difficulty by specifying
left-to-right testing order.
4.3 Pattern matching in Haskell
By interpreting the Haskell report Hudak et al. [1998] we
can formulate a pattern matching predicate for this
language. Matching can yield three different results: it may either
succeed, fail or diverge. Furthermore, matching of arguments is
performed left-to-right. We encode “success”, “failure” and
“divergence” by the three values T, F and ⊥, and
define the following H function.
H(_, v) = T
H(c(p1, …, pa), Ω) = ⊥
H(c(p1, …, pa), c′(v1, …, va′)) = F    (where c ≠ c′)
H(c(p1, …, pa), c(v1, …, va)) = H((p1⋯pa), (v1⋯va))
H((p1 p2⋯pn), (v1 v2⋯vn)) = H(p1, v1) ∧⊥ H((p2⋯pn), (v2⋯vn))
H((), ()) = T
Where the extended (left-to-right) boolean connectives
are defined as follows.

⊥ ∧⊥ x = ⊥    F ∧⊥ x = F    T ∧⊥ x = x
⊥ ∨⊥ x = ⊥    T ∨⊥ x = T    F ∨⊥ x = x
We ignore some Haskell patterns, such as irrefutable patterns.
We also ignore or-patterns at the moment.
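The H function and its lazy connectives are easy to transcribe; a sketch in our notation, writing Bot for ⊥ and omitting or-patterns as announced.

-- Three possible outcomes of Haskell matching.
data B3 = T | F | Bot deriving (Eq, Show)

-- Left-to-right connectives: the left argument is examined first,
-- so Bot on the left contaminates the whole result.
andB, orB :: B3 -> B3 -> B3
andB Bot _ = Bot
andB F   _ = F
andB T   x = x
orB  Bot _ = Bot
orB  T   _ = T
orB  F   x = x

-- H on one pattern and one value (or-patterns deferred to 4.4).
hPat :: Pat -> Value -> B3
hPat Wild       _     = T
hPat (Con _ _)  Omega = Bot
hPat (Con c ps) (C c' vs)
  | c /= c'   = F
  | otherwise = hRow ps vs
hPat (Or _ _)   _     = error "or-patterns: see Section 4.4"

-- H on vectors: left-to-right conjunction of the columns.
hRow :: [Pat] -> [Value] -> B3
hRow []       []       = T
hRow (p : ps) (v : vs) = hPat p v `andB` hRow ps vs
hRow _        _        = error "hRow: arity mismatch"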
From this definition one easily shows the following two properties
on vectors:
H(p→, v→) = T ⇐⇒ p→ ≼ v→
H(p→, v→) = F =⇒ p→ # v→ =⇒ p→ ⋠ v→.
We then interpret
the many program equivalences of section 3.17.3 in the Haskell report as
expressing a downward search for a pattern of which v→ is an
instance:

H(∅, v→) = F
H(P, v→) = H(p→1, v→) ∨⊥ H(P[2…m], v→).
Informally, H(P, v→ ) = T means “v→ is found to
match some row in P in the Haskell way”,
H(P, v→ ) = F means “no row of P is found to
be matched”, and H(P, v→ ) = ⊥ means “v→ is not
precise enough to make a clear decision”.
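In code, this downward search is a fold of the rows under ∨⊥, continuing the sketch above.

-- H on a matrix: scan the rows top-down under the lazy 'orB';
-- an empty matrix reports failure.
hMat :: [Row] -> [Value] -> B3
hMat []           _  = F
hMat (row : rest) vs = hRow row vs `orB` hMat rest vs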
We can now formulate the Haskell way of pattern matching
in our setting.
Definition 11 (Haskell pattern matching)
Define P(P, v→) to be H(P, v→) = F in the generic Definition 8.
One easily checks that predicate H(P, v→ ) = F meets all
basic restrictions and key properties
(decomposing along first columns is instrumental).
Hence, save for or-patterns, algorithm Urec also computes the utility
of pattern matching in Haskell.
4.4 Or-patterns in Haskell
As this work is partly dedicated to specific warnings
for or-patterns, we wish to enrich Haskell matching with or-patterns.
The H function is extended to consider or-patterns, sticking
to left-to-right bias:
H((p1∣p2), v) = H(p1, v) ∨⊥ H(p2, v).
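In the sketch of Section 4.3, this amounts to replacing the error clause of hPat by the following one.

-- Or-patterns: try the left alternative first, then the right.
hPat (Or p1 p2) v = hPat p1 v `orB` hPat p2 v
-- e.g. hPat (Or (Con "True" []) Wild) Omega evaluates to Bot.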
Semantical consequences are non-negligible, since the
equivalence H(p→, v→ ) = T ⇐⇒ p→ ≼ v→
does not hold any more, as can be seen by considering
H((True∣_), Ω) = ⊥.
However, the left-to-right implication still holds, and the
following definition of Haskell pattern matching makes sense.
Definition 12 (Haskell matching with or-patterns)
Let P be a pattern matrix and v→ be a value vector.
Vector v→ matches row i in P, if and only if the following
proposition holds:

H(P[1…i), v→) = F ∧ H(p→i, v→) = T.
From this definition of matching, we define the utility predicate UH
and the set of matching values MH as we did in Definition 6.
The definition above is not the application of the generic
Definition 8 to P(P, v→) = (H(P, v→) = F),
because we have written H(p→i, v→) = T
in place of the instance relation p→i ≼ v→. However,
as illustrated by the following lemma,
H(q→, v→) = T and q→ ≼ v→ are closely related.
Lemma 2
Let p be a pattern and v be a value such that H(p, v) = ⊥.
There exists a value w such that v ≼Ω w and H(p, w) ≠ ⊥.
Furthermore, if p ≼ v, then H(p, w) = T.
Proof:
We first prove the existence of w by induction on p.
-
If p = c(p1, … , pa), then, by hypothesis H(p, v) = ⊥,
we have two sub-cases.
- Value v is c(v1, …, va), with
  H(pi, vi) = ⊥ for i in some (non-empty) index set I.
  Applying the induction hypothesis
  to all such i yields values v′i such that
  vi ≼Ω v′i and H(pi, v′i) ≠ ⊥.
  Then, we define w = c(w1, …, wa) where wi = v′i
  for i ∈ I, and wi = vi otherwise.
- Otherwise, value v is Ω.
Let v′ be c(Ω, … , Ω).
If H(p, v′) is not ⊥, then we define w = v′.
Otherwise, we reason as in the previous case.
- If p = (q1∣q2), we have two sub-cases.
- If H(q1, v) = F and H(q2, v) = ⊥,
  then (by induction) there exists a value w, with v ≼Ω w,
  such that H(q2, w) ≠ ⊥.
  Since the result H(q1, v) = F is stable when v is made more precise,
  we still have H(q1, w) = F; hence, by definition of H,
  H(p, w) = H(q2, w) ≠ ⊥.
- If H(q1, v) = ⊥, then (by induction) there exists
  a value w′, such that v ≼Ω w′ and H(q1, w′) ≠ ⊥.
  If H(q1, w′) = T, we define w to be w′ and we conclude.
  Otherwise, H(q1, w′) = F, and thus
  H((q1∣q2), w′) = H(q2, w′).
  Then we conclude, either directly, or by induction in the case where
  H(q2, w′) = ⊥.
Additionally, H(p, w) = T holds under the extra hypothesis
p ≼ v, by H(p, w) = F =⇒ p ⋠ w and by the
monotonicity of ≼.
□
The lemma above suffices to relate
Haskell matching to generic lazy matching, and thus to compute the
utility of Haskell matching.
Proposition 4
We have UH = Urec.
Proof:
We note UH≼ the utility predicate that results from the generic
definition, taking P(P, v→) to be H(P, v→) = F.
From the generic Proposition 3, we have UH≼ = Urec.
(Formally, we check that predicate H(P, v→) = F
meets the key properties even when some of the patterns in P are
or-patterns.)
Then we show UH = UH≼.
From the implication
H(q→, v→ ) = T =⇒ q→ ≼ v→ ,
we have UH(P, q→ ) =⇒
UH≼(P,q→ );
the converse implication follows from Lemma 2.
□
It is time to clearly stress an important consequence of the
propositions U = Urec, UP = Urec and UH = Urec:
all our utility predicates are in fact equal.
This suggests a quite powerful and elegant “semantical”
proof technique, which we immediately demonstrate.
Lemma 3 (Irrelevance of column order)
Let P be a pattern matrix and q→ be a pattern vector.
By permuting the same columns in both P and q→ we get matrix P′
and vector q→′.
Then we have UH(P, q→) = UH(P′, q→′).
Proof:
Consider strict matching.
Since predicates
P ⋠v→ and q→ ≼ v→ do not depend on
column order, we have U(P, q→ ) = U(P′, q→ ′).
From UH = U, we conclude.
□
First observe that proving the irrelevance of column order for Haskell
matching by induction on matrix and pattern structure would be quite
cumbersome.
Also notice that the lemma above is not obvious,
since Haskell matching depends upon column order
in a strong sense.
For instance, let P, q→ , P′ and q→ ′ be as follows.
P = (True True)     q→ = (False _)
P′ = (True True)    q→′ = (_ False).
Matrix P′ (resp. vector q→ ′) is P (resp. q→ )
with columns swapped.
The sets of matching values are as follows.
MH(P, q→) = { (False Ω), (False True), (False False) }
MH(P′, q→′) = { (True False), (False False) }
Swapping the components of the elements of MH(P, q→ ) does
not yield MH(P′, q→ ′), since (Ω False)
does not belong to MH(P′, q→ ′).
However, some of the values of the MH sets above are related by the
permutation. Moreover, the equality UH=U can be seen as
telling us that there is at least one such value.
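These sets can be checked mechanically with the H sketch of Section 4.3, by enumerating the nine pairs of partial Booleans (this assumes our reconstruction of P and q→ above).

-- All partial Boolean values.
bools3 :: [Value]
bools3 = [Omega, C "True" [], C "False" []]

-- M_H(P, q): values v with H(P, v) = F and H(q, v) = T.
matchingValues :: [Row] -> Row -> [[Value]]
matchingValues rows q =
  [ [v1, v2] | v1 <- bools3, v2 <- bools3
             , hMat rows [v1, v2] == F
             , hRow q    [v1, v2] == T ]

-- matchingValues [[tt, tt]] [ff, Wild] and
-- matchingValues [[tt, tt]] [Wild, ff], with tt = Con "True" [] and
-- ff = Con "False" [], return exactly the two sets displayed above.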
From now on, we simply write U for any utility predicate, regardless
of semantics. We also write “algorithm U” for Urec.