The work of the NFA→DFA conversion is done by the following function determinize_cc. The function takes a list of NFA states, the dictionary of the already built states, and returns a pair ([dfa_state], updated_dictionary) where [dfa_state] is a singleton list.
The front end of the NFA→DFA transformer: finAuDeterminize states = fst $ determinize_cc states []
At the heart of the credit card transform is the phrase from the above code:
The phrase expresses the addition to the dictionary of the converted_states of a dfa_state that we haven't built yet. The computation of the dfa_state is written 4 lines below the phrase in question. Because (,) is non-strict in its arguments and locate is non-strict in its result, we can get away with a mere promise to "pay".
Note that the computation of the dfa_state needs t0' and t1', which in turn rely on converted_states1. This fact shows that we can tie the knot by making a promise to compute a state, add this promise to the dictionary of the built states, and use the updated dictionary to build the descendants. Because Haskell is a non-strict language, we don't need to do anything special to make the promise. Every computation is Haskell is by default a promise.
We can print the DFA for dom18 to see what we've got:
which is indeed a DFA (which happens to be minimal) recognizing (0+1)* - 1(0+1)*
We can run the determinized FA using the same function finAuAcceptStringQ:
The complete code for this example is in http://pobox.com/~oleg/ftp/Haskell/CCard-transform-DFA.lhs.
Another example of tying a knot in the case of forward links, by using a fixed-point combinator, is discussed in http://www.mail-archive.com/[email protected]/msg10687.html.
Improved error-recovery for transformations of cyclic graphs
(...some observations about the aforementioned forward links/fixed-point combinator example)
For a long time, I've had an issue with Oleg's reply to Hal Daume III, the "forward links" example. The problem is that it doesn't really exploit laziness or circular values. It's solution would work even in a strict language. It's simply a functional version of the standard approach: build the result with markers and patch it up afterwards.
It is a fairly clever way of doing purely something that is typically done with references and mutable update, but it doesn't really address what Hal Daume III was after. Fixing Hal Daume's example so that it won't loop is relatively trivial; simply change the case to a let or equivalently use a lazy pattern match in the case. However, if that's all there was to it, I would've written this a long time ago.
The problem is that it no longer gives you control of the error message or anyway to recover from it. With GHC's extensions to exception handling you could do it, but you'd have to put readDecisionTree in the IO monad to recover from it, and if you wanted better messages you'd have to put most of the parsing in the IO monad so that you could catch the error earlier and provide more information then rethrow.
What's kept me is that I couldn't figure out a way to tie the knot when the environment had a type like, Either String [(String,DecisionTree)]. This is because it's impossible for this case; we decide whether to return:
- Left "could not find subtree" or
- Right someValue
and therefore whether the environment is Left or Right based on whether we could find the subtree in the environment. In effect, we need to lookup a value in an environment we may return to know whether to return it. Obviously this is a truly circular dependency.
This made me think that Oleg's solution was as good as any other and better than some (actually, ironically Oleg's solution also uses a let instead of a case, however, there's nothing stopping it from being a case, but it still would provide no way to recover from it without effectively doing what is mentioned below). Recently, I've thought about this again and the solution is obvious and follows directly from the original definition modified to use let.
It doesn't loop because only particular values in the lookup table fail, in fact, you might never know there was a variable lookup error if you didn't touch all of the tree. This translates directly into the environment having type [(String,Either String DecisionTree)].
There are several benefits to this approach compared to Oleg's:
- it solves my original problem, you are now able to specify the error messages (Oleg's can do this),
- it goes beyond that (and beyond Hal Daume's original "specification") and also allows you to recover from an error without resorting to the IO monad and/or extensions (Oleg's can't do this),
- it does implicitly what Oleg's version does explicitly,
- because of (3) it shares properly while Oleg's does not,
- both the environment and the returned value are made up of showable values, not opaque functions,
- it requires less changes to the original code and is more localized than Oleg's solution; only the variable lookup and top-level function will need to change.
To recover, all one needs to do is make sure all the values in the lookup table are Right values. If they aren't, there are various ways you could collect the information; there are also variations on how to combine error information and what to provide. Even without a correctness check, you can still provide better error messages for the erroneous thunks.
A possible variation that loses some of the benefits, is to change the DecisionTree type (or have a different version, IndirectComposite comes to mind here) that has Either ErrorInfo ErrorDecisionTree subnodes, which will allow you to recover at any time (though, if you want to make a normal DecisionTree out of it you will lose sharing). Also, the circular dependency only comes up if you need to use the environment to decide on an error.
For example:
- a plain old syntactic parse error can cyclicly use an Either ErrorInfo [(String,DecisionTree)] perfectly fine (pass in fromRight env where fromRight ~(Right x) = x). It will also work even with the above approach giving the environment the type Either [(String,Either ErrorInfo DecisionTree)]. Below is code for a simplified scenario that does most of these things,
| input | outputs |
| define x *y *y define y a b | Right [("x",Right "a b a b"),
("y",Right "a b")]
|
| define x *y *y aousht define y a b | Left "parse error with: aousht"
|
| define x *y *z define y a b define z *w | Right [("x",Left "couldn't find w in z"),
("y",Right "a b"),
("z",Left "couldn't find w in z")]
|
- Consider a tree Y that contains the subtree X twice:
.png)


