Monads in Scala Part Two: More Maybes
2013-09-10 21:53
http://scabl.blogspot.com/2013/03/monads-in-scala-2.html
In the first installment of this series, we translated the examples from the first
page of the chapter on monads in the Haskell
wikibook from Haskell to Scala. In this blog post, we will continue on with the second page. This page continues
with more Maybe examples, which will help us to generalize our monadic laws test for Maybes a bit. We'll also learn about Kleisli
composition, which composes functions suitable for passing to flatMap.
All the source code presented here is available on GitHub. I'm compiling against Scala 2.10.1. Some of
this code will not compile under Scala 2.9.x.
Safe Functions
The first section of the wiki page discusses safe functions - versions of mathematical functions that normally produce a Just wrapping the result, but that produce MaybeNot when the input is not within the domain of that mathematical
function. (As a reminder, I am using MaybeNot where the Haskell examples use Nothing, to avoid confusion with the Nothing found
in the Scala library.) This will allow us to "floor out" when chaining mathematical operations - anything along the way that produces a MaybeNot will cause the MaybeNot to
propagate right on down the chain.
I'm not sure what Haskell's numeric types look like, but we do already get this kind of behavior when using the Java doubles that underlie Scala Doubles.
For example, sqrt(-1.0d) will produce the double value NaN, which stands for "not a number". If we then take the log of that result
- log(NaN) - we will still get NaN. Basically, almost any math operation where one of the operands is NaN will
produce a NaN. So safe math functions may not be as useful in Java or Scala, but it's a good example, so let's roll with it.
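To make the NaN behavior concrete, here is a minimal sketch using only scala.math. The object name NaNDemo and its fields are made up for illustration and are not part of the series' source code.

```scala
object NaNDemo {
  // The square root of a negative number is not a real number, so we get NaN.
  val root: Double = scala.math.sqrt(-1.0)

  // Feeding NaN into a later operation yields NaN again.
  val logOfRoot: Double = scala.math.log(root)

  // NaN never compares equal to anything, itself included, so use isNaN to detect it.
  val equalsItself: Boolean = root == root
}
```

Note that nothing in the types tells us a NaN may be lurking in a Double, which is the gap that a Maybe[Double] return type fills.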
Here are my safe versions of sqrt and log:
    def safeSqrt(d: Double): Maybe[Double] =
      if (d >= 0) Just(scala.math.sqrt(d)) else MaybeNot

    def safeLog(d: Double): Maybe[Double] =
      if (d > 0) Just(scala.math.log(d)) else MaybeNot
Now when mathematicians say things like "log square root", they mean you should take the square root first, and then take the log of that. Some people, like myself, might find this confusing, so I wanted to point that out. The initial implementation of safeLogSqrt in
the example simply checks the outcome of sqrt before passing it on to log, like so:
    def safeLogSqrt0(d: Double): Maybe[Double] =
      safeSqrt(d) match {
        case Just(d) => safeLog(d)
        case MaybeNot => MaybeNot
      }
This is a mouthful, especially when the unsafe version is as simple as this:
    def unsafeLogSqrt(d: Double): Double = log(sqrt(d))
Of course, we want to use flatMap (>>= in Haskell) instead of manually testing if safeSqrt produced
a Just or a MaybeNot. Translated to Scala, the Haskell example presented here looks like this:
    def safeLogSqrt1(d: Double): Maybe[Double] =
      Just(d) flatMap safeSqrt flatMap safeLog
Which, according to the "left unit" Monad law, can be simplified to this:
    def safeLogSqrt2(d: Double): Maybe[Double] =
      safeSqrt(d) flatMap safeLog
I'm not sure why the Haskell example uses the longer form here. My guess is that it's because return x >>= safeSqrt >>= safeLog looks sexier than safeSqrt(x)
>>= safeLog, but that's just my best guess.
Before moving on, here's a version of safeLogSqrt that uses a for comprehension:
    def safeLogSqrt3(d: Double): Maybe[Double] =
      for (s <- safeSqrt(d); l <- safeLog(s)) yield l
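It may help to see what the compiler does with such a for expression. The sketch below is an illustration under assumptions: the Maybe type is a cut-down stand-in for the one from part one (only flatMap and map), and the names ForDesugarDemo, viaFor and desugared are hypothetical. The two methods should behave identically, since a for expression with two generators is rewritten into a flatMap with a nested map.

```scala
object ForDesugarDemo {
  // Cut-down stand-in for the Maybe type from part one (assumed shape).
  sealed trait Maybe[+A] {
    def flatMap[B](f: A => Maybe[B]): Maybe[B] = this match {
      case Just(a)  => f(a)
      case MaybeNot => MaybeNot
    }
    def map[B](f: A => B): Maybe[B] = flatMap[B](a => Just(f(a)))
  }
  case class Just[+A](a: A) extends Maybe[A]
  case object MaybeNot extends Maybe[Nothing]

  def safeSqrt(d: Double): Maybe[Double] =
    if (d >= 0) Just(scala.math.sqrt(d)) else MaybeNot

  def safeLog(d: Double): Maybe[Double] =
    if (d > 0) Just(scala.math.log(d)) else MaybeNot

  // The for expression as written in safeLogSqrt3.
  def viaFor(d: Double): Maybe[Double] =
    for (s <- safeSqrt(d); l <- safeLog(s)) yield l

  // The same computation spelled out: every generator but the last
  // becomes a flatMap, and the last one becomes a map over the yield.
  def desugared(d: Double): Maybe[Double] =
    safeSqrt(d).flatMap(s => safeLog(s).map(l => l))
}
```

Because the trailing map(l => l) changes nothing, the desugared form is equivalent to the shorter safeSqrt(d) flatMap safeLog.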
Kleisli Composition
The Haskell wikibook points out that you can produce an unsafe logSqrt function by simply composing the unsafe functions log and sqrt. For function composition, mathematicians use a little dot like this: ∘. In Haskell, the composition is done like this:
    unsafeLogSqrt = log . sqrt
The function on the right is the one that gets called first. In Scala, it would look something like this:
    val unsafeLogSqrt: (Double) => Double =
      scala.math.log _ compose scala.math.sqrt _
Let's step back and examine what's going on here. In Scala, when we follow a method's name with an underscore, we are indicating that we are talking about the function itself, and not trying to apply that function. The expression scala.math.log _ is
a scala.Function1, and as such, has a compose method that takes a scala.Function1 as
an argument. The result is a new function that behaves by calling sqrt first, and then calling log on the result of that.
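The ordering is easy to get backwards, so here is a small sketch with integer functions where the two orders give different answers. The names ComposeDemo, addOne and double are made up for illustration; compose and andThen are the standard Function1 methods.

```scala
object ComposeDemo {
  val addOne: Int => Int = _ + 1
  val double: Int => Int = _ * 2

  // compose calls the right-hand function first: addOne(double(x))
  val composed: Int => Int = addOne compose double

  // andThen calls the left-hand function first: double(addOne(x))
  val chained: Int => Int = addOne andThen double
}
```

For an input of 3, composed doubles first and gives 7, while chained adds first and gives 8.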
At this point, the wikibook brings in the Kleisli composition operator, <=<,
and shows how you can define a safeLogSqrt with similar brevity, like so:
    safeLogSqrt = safeLog <=< safeSqrt
This is similar to function composition, but ordinary function composition will not work, because the output from safeSqrt is Maybe[Double],
and the input to safeLog is Double. So we really want to chain these together in a flatMap style,
as in the body of a Scala for comprehension.
As a first attempt at a Kleisli composition operator in Scala, let's write a method that takes in two functions as arguments, and produces the Kleisli composition of those two functions:
    def kleisliCompose[A, B, C](
      f: (B) => Maybe[C],
      g: (A) => Maybe[B]): (A) => Maybe[C] = {
      a: A =>
        for (b <- g(a); c <- f(b)) yield c
    }
We are producing a function that takes an A as input and produces a Maybe[C]. We first apply g,
which takes an A as input and produces a Maybe[B]. Then we flatMap the Maybe[B] with f,
which takes a B as input and produces a Maybe[C]. Now we can define safeLogSqrt as
follows:
    def safeLogSqrt4(d: Double): Maybe[Double] =
      kleisliCompose(safeLog _, safeSqrt _)(d)
This is nice, but wouldn't it be better if we could achieve this using an infix notation, to match scala.math.log _ compose scala.math.sqrt _? We cannot simply add a kleisliCompose method
to Function1. What we do in Scala >= 2.10 in these circumstances is build an implicit class that contains the desired new method. Let's take a look at our implicit class, and then break it down:
    implicit class MaybeFunction[B, C](f: (B) => Maybe[C]) {
      def kleisliCompose[A](g: (A) => Maybe[B]): (A) => Maybe[C] = {
        a: A =>
          for (b <- g(a); c <- f(b)) yield c
      }
    }
The kleisliCompose method itself is the same, except that type parameters B and C,
and parameter f, have been lifted out of the method and into the enclosing class. The MaybeFunction constructor takes a single Function1 as
argument - in particular, a Function1 that returns a Maybe.
Let's take note that MaybeFunction is implicit, and that it has a method kleisliCompose.
Now, when the compiler encounters something like safeLog _ kleisliCompose safeSqrt _, and tries to resolve the method call, it does not immediately find a method kleisliCompose in Function1,
which is the type of safeLog _. Before it gives up, it looks for any implicit conversions it could use to transform safeLog _ into
something that has a method named kleisliCompose with an applicable signature. Assuming our MaybeFunction class is in the right scope, it
will implicitly construct a MaybeFunction from the Function1, and call kleisliCompose on
that.
This Pimp my Library pattern is available in languages like Ruby and Groovy via meta-programming. In Scala, it's done
at compile-time in a type-safe way. While implicit classes are newly available in Scala 2.10, the same effect has been achieved for years with implicit functions. The behavior of implicit classes is more controlled than that of implicit functions, as the target
of the type conversion is constrained to be the new, implicit class.
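For comparison, here is a hedged sketch of how the same enrichment might be written in the older style, with an ordinary wrapper class plus an implicit conversion function. The names MaybeFunctionOps and toMaybeFunctionOps are hypothetical, the Maybe type is a cut-down stand-in for the one from part one (flatMap only), and safeSqrt and safeLog are declared as function values here so that no trailing underscore is needed.

```scala
object ImplicitDefDemo {
  import scala.language.implicitConversions

  // Cut-down stand-in for the Maybe type from part one (assumed shape).
  sealed trait Maybe[+A] {
    def flatMap[B](f: A => Maybe[B]): Maybe[B] = this match {
      case Just(a)  => f(a)
      case MaybeNot => MaybeNot
    }
  }
  case class Just[+A](a: A) extends Maybe[A]
  case object MaybeNot extends Maybe[Nothing]

  // An ordinary, non-implicit wrapper class that carries the extra method.
  class MaybeFunctionOps[B, C](f: B => Maybe[C]) {
    def kleisliCompose[A](g: A => Maybe[B]): A => Maybe[C] =
      a => g(a).flatMap(f)
  }

  // The implicit conversion function that the compiler applies when it
  // fails to find kleisliCompose on a plain Function1.
  implicit def toMaybeFunctionOps[B, C](f: B => Maybe[C]): MaybeFunctionOps[B, C] =
    new MaybeFunctionOps(f)

  val safeSqrt: Double => Maybe[Double] =
    d => if (d >= 0) Just(scala.math.sqrt(d)) else MaybeNot

  val safeLog: Double => Maybe[Double] =
    d => if (d > 0) Just(scala.math.log(d)) else MaybeNot

  // Resolved as toMaybeFunctionOps(safeLog).kleisliCompose(safeSqrt)
  val safeLogSqrt: Double => Maybe[Double] = safeLog kleisliCompose safeSqrt
}
```

An implicit class is essentially shorthand for this pair of declarations, with the conversion target fixed to the class itself.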
So let's go ahead and use this pimped Function1 to define our safeLogSqrt function:
    def safeLogSqrt5(d: Double): Maybe[Double] =
      (safeLog _ kleisliCompose safeSqrt _)(d)
Testing Safe Functions
What about testing all the safe functions we have defined above? First of all, we could write tests that assure that all of the versions of safeLogSqrt presented above produce the same results. To do this, we first need to define a set of test data to use. To be sure to test every outcome, I've included input values for which safeLogSqrt will return both a Just and
a MaybeNot. I've also included a value (0) for which safeSqrt will
return a Just, but safeLogSqrt will return a MaybeNot. So I'm
covering the three major code paths: Double to Just[Double] to Just[Double]; Double to Just[Double] to MaybeNot;
and Double to MaybeNot to MaybeNot.
    /** A series of doubles useful for testing safe methods. */
    val doubles = List[Double](
      -2,  // log, sqrt produce NaN
      -1,  // log, sqrt produce NaN
      0,   // log produces -Infinity, sqrt produces 0
      0.5, // log produces negative
      1,   // log produces 0
      2)   // keeping things positive
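To see which inputs exercise which code path, here is a rough sketch that runs the test inputs through a flatMap-based safeLogSqrt. It assumes a cut-down Maybe with only flatMap, and the names CodePathsDemo and results are made up for illustration.

```scala
object CodePathsDemo {
  // Cut-down stand-in for the Maybe type from part one (assumed shape).
  sealed trait Maybe[+A] {
    def flatMap[B](f: A => Maybe[B]): Maybe[B] = this match {
      case Just(a)  => f(a)
      case MaybeNot => MaybeNot
    }
  }
  case class Just[+A](a: A) extends Maybe[A]
  case object MaybeNot extends Maybe[Nothing]

  def safeSqrt(d: Double): Maybe[Double] =
    if (d >= 0) Just(scala.math.sqrt(d)) else MaybeNot

  def safeLog(d: Double): Maybe[Double] =
    if (d > 0) Just(scala.math.log(d)) else MaybeNot

  def safeLogSqrt(d: Double): Maybe[Double] =
    safeSqrt(d) flatMap safeLog

  val doubles: List[Double] = List(-2.0, -1.0, 0.0, 0.5, 1.0, 2.0)

  // -2 and -1 fail at safeSqrt; 0 passes safeSqrt but fails at safeLog;
  // 0.5, 1 and 2 pass both steps.
  val results: List[Maybe[Double]] = doubles map safeLogSqrt
}
```

The first three results are MaybeNot and the last three are Justs, with the input 1 giving exactly Just(0.0).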
Now the test. Just a word of note here, I am switching my tests from just using asserts to using ScalaTest's ShouldMatchers.
I'm using "org.scalatest" %% "scalatest" % "2.0.M5b", but any recent version should work. I have five flatMap-style versions of safeLogSqrt, numbered 1 through 5, so
testing four pairs will do:
    package maybe

    import org.scalatest.FlatSpec
    import org.scalatest.matchers.ShouldMatchers

    class SafeMathSpec extends FlatSpec with ShouldMatchers {

      import safe._

      behavior of "various implementations of safe.safeLogSqrt"

      they should "agree over a range of test input" in {
        shouldAgreeOverTestInputRange(safeLogSqrt1 _, safeLogSqrt2 _)
        shouldAgreeOverTestInputRange(safeLogSqrt1 _, safeLogSqrt3 _)
        shouldAgreeOverTestInputRange(safeLogSqrt1 _, safeLogSqrt4 _)
        shouldAgreeOverTestInputRange(safeLogSqrt1 _, safeLogSqrt5 _)
      }

      def shouldAgreeOverTestInputRange(
        safeLogSqrt1: (Double) => Maybe[Double],
        safeLogSqrt2: (Double) => Maybe[Double]) {
        doubles foreach { d =>
          (safeLogSqrt1(d)) should equal (safeLogSqrt2(d))
        }
      }
    }
Testing Maybe Obeys Monad Laws
In the last blog post, we developed tests to assure that Maybe obeyed the monadic laws in regards to some sample Person data. We now have another data set that we can test the Maybe monad against. But the old
version of the test is hard-coded to use Person data. Let's generalize that into a function that takes the required test data as arguments. Then we can call this function for different sets
of test data. We can easily abstract over the types involved. We also pass a String description to include in a FlatSpec behavior clause.
The function opens like this:
    def maybeShouldObeyMonadLaws[A, B, C](
      testDataDescription: String,
      testItems: List[A],
      f: Function1[A, Maybe[B]],
      g: Function1[B, Maybe[C]]) {

      behavior of "Maybe monad with respect to " + testDataDescription
As you will recall, the left unit Monad law states that return x >>= f = f x. In Scala terms, Just(a) flatMap f must equal f(a). Let's assert this over all of our
test data:
      it should "obey left unit monadic law" in {
        testItems foreach { a =>
          {
            Just(a) flatMap f
          } should equal {
            f(a)
          }
        }
      }
For right unit and associativity, we need to seed a list of all possible Maybes. We'll throw MaybeNot in with the rest of our test
data wrapped in Justs:
      val maybes = MaybeNot +: (testItems map { Just(_) })

      it should "obey right unit monadic law" in {
        maybes foreach { m =>
          {
            m flatMap { Just(_) }
          } should equal {
            m
          }
        }
      }

      it should "obey associativity monadic law" in {
        maybes foreach { m =>
          {
            m flatMap f flatMap g
          } should equal {
            m flatMap { a => f(a) flatMap g }
          }
        }
      }
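If you want to poke at the three laws without pulling in ScalaTest, a minimal sketch with plain Booleans might look like the following. It assumes a cut-down Maybe with only flatMap, and the names MonadLawsDemo, leftUnit, rightUnit and associativity are hypothetical; it checks a handful of sample values rather than proving anything.

```scala
object MonadLawsDemo {
  // Cut-down stand-in for the Maybe type from part one (assumed shape).
  sealed trait Maybe[+A] {
    def flatMap[B](f: A => Maybe[B]): Maybe[B] = this match {
      case Just(a)  => f(a)
      case MaybeNot => MaybeNot
    }
  }
  case class Just[+A](a: A) extends Maybe[A]
  case object MaybeNot extends Maybe[Nothing]

  val f: Double => Maybe[Double] =
    d => if (d >= 0) Just(scala.math.sqrt(d)) else MaybeNot

  val g: Double => Maybe[Double] =
    d => if (d > 0) Just(scala.math.log(d)) else MaybeNot

  val testItems: List[Double] = List(-1.0, 0.0, 1.0)
  val maybes: List[Maybe[Double]] = MaybeNot +: testItems.map(Just(_))

  // Left unit: wrapping a value and then flatMapping f equals applying f directly.
  val leftUnit: Boolean =
    testItems.forall(a => Just(a).flatMap(f) == f(a))

  // Right unit: flatMapping the wrapping function leaves a Maybe unchanged.
  val rightUnit: Boolean =
    maybes.forall(m => m.flatMap(Just(_)) == m)

  // Associativity: the grouping of chained flatMaps does not matter.
  val associativity: Boolean =
    maybes.forall(m => m.flatMap(f).flatMap(g) == m.flatMap(a => f(a).flatMap(g)))
}
```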
We have a couple more Monad tests from last time. I won't go into the details, and you can check it out in the source
code. I want to get down to business, which is calling this function over our two data sets:
      maybeShouldObeyMonadLaws(
        "Person data",
        Person.persons,
        { p: Person => p.mother },
        { p: Person => p.father })

      maybeShouldObeyMonadLaws(
        "safe math operations",
        safe.doubles,
        safe.safeSqrt _,
        safe.safeLog _)
Lookup Tables
The next major section of the wikibook page is on lookup tables. A couple of things perplexed me here. First, they are using a list of pairs as a lookup table. I did check, and it seems Haskell has a perfectly
serviceable Map type (Data.Map). Using a list of pairs instead of a library collection class would never occur to me. I'm probably spoiled by the quality of Scala's collection library. The second thing that seemed quite different to me was the fact that they are
putting raw strings and numbers into the map. There are two maps in their example that are semantically different, but both have the same type signature: string to string. Haskell has such a rich type system, it seems like they could easily do better than
this. They probably made these choices to keep the example simple, and to focus the discussion on the use of Monads. But it wouldn't be natural Scala without using case classes and maps. Typing the data is trivial in Scala:
    case class Name(name: String)
    case class Number(number: String)
    case class Registration(registration: String)
    case class TaxOwed(taxOwed: Double)
Before writing any lookup methods, we need to seed the maps. First, we'll generate a hundred or so pieces of test data for each type:
    private val seedTestData: List[Int] = (1 to 100).toList
    private def intToName(i: Int): Name = Name(i.toString)
    private def intToNumber(i: Int): Number = Number(i.toString)
    private def intToRegistration(i: Int): Registration = Registration(i.toString)
    private def intToTaxOwed(i: Int): TaxOwed = TaxOwed(i.toDouble)

    val names: List[Name] = seedTestData map intToName
    val numbers: List[Number] = seedTestData map intToNumber
    val registrations: List[Registration] = seedTestData map intToRegistration
    val taxesOwed: List[TaxOwed] = seedTestData map intToTaxOwed
We're going to generate the maps in such a way that there is at least one test value for every permutation of Just and MaybeNot.
First, we agree that in every key/value pair in a map, the key and the value relate back to the same Int. Then, we filter each map down to the pairs whose underlying Int is
divisible by some small prime. Using 2, 3, and 5 over a hundred element test set will give all permutations. Here's how it looks:
    private def seedMap[A, B](
      intFilter: (Int) => Boolean,
      intToA: (Int) => A,
      intToB: (Int) => B): Map[A, B] = {
      import language.postfixOps
      seedTestData filter intFilter map { i => intToA(i) -> intToB(i) } toMap
    }

    // only those divisible by 2 are in the number map
    private val numberMap: Map[Name, Number] =
      seedMap(_ % 2 == 0, intToName, intToNumber)

    // only those divisible by 3 are in the registration map
    private val registrationMap: Map[Number, Registration] =
      seedMap(_ % 3 == 0, intToNumber, intToRegistration)

    // only those divisible by 5 are in the tax map
    private val taxOwedMap: Map[Registration, TaxOwed] =
      seedMap(_ % 5 == 0, intToRegistration, intToTaxOwed)
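The claim that 2, 3 and 5 cover every combination over a hundred seeds is easy to check directly. The following sketch is only a sanity check; the names SeedCoverageDemo, membership, combinations and smallestWitness are made up for illustration.

```scala
object SeedCoverageDemo {
  val seedTestData: List[Int] = (1 to 100).toList

  // For a seed: (in the number map?, in the registration map?, in the tax map?)
  def membership(i: Int): (Boolean, Boolean, Boolean) =
    (i % 2 == 0, i % 3 == 0, i % 5 == 0)

  // Every distinct membership combination that occurs among the seeds.
  val combinations: Set[(Boolean, Boolean, Boolean)] =
    seedTestData.map(membership).toSet

  // The smallest seed that exhibits each combination.
  val smallestWitness: Map[(Boolean, Boolean, Boolean), Int] =
    seedTestData.groupBy(membership).map { case (combo, seeds) => combo -> seeds.min }
}
```

All eight combinations show up: 1 is in none of the maps, and 30 is the first seed that appears in all three.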
We need to add a conversion from Option to Maybe to translate the result of Map#get.
A natural place for this is in the Maybe companion object:
    object Maybe {
      def apply[A](option: Option[A]): Maybe[A] = option match {
        case Some(a) => Just(a)
        case None => MaybeNot
      }
    }
We'll define the single-level lookup methods in terms of a generic lookup function that also takes the Map as parameter:
    private def lookup[A, B](map: Map[A, B])(a: A): Maybe[B] =
      Maybe(map.get(a))

    def lookupNumber(name: Name): Maybe[Number] =
      lookup(numberMap)(name)

    def lookupRegistration(number: Number): Maybe[Registration] =
      lookup(registrationMap)(number)

    def lookupTaxOwed(registration: Registration): Maybe[TaxOwed] =
      lookup(taxOwedMap)(registration)
Finally, we can implement the lookup methods that span multiple dictionaries with for comprehensions or flatMaps:
    def lookupRegistration1(name: Name): Maybe[Registration] =
      for (
        number <- lookupNumber(name);
        registration <- lookupRegistration(number))
      yield registration

    def lookupRegistration2(name: Name): Maybe[Registration] =
      lookupNumber(name) flatMap lookupRegistration

    def lookupTaxOwed1(name: Name): Maybe[TaxOwed] =
      for (
        number <- lookupNumber(name);
        registration <- lookupRegistration(number);
        taxOwed <- lookupTaxOwed(registration))
      yield taxOwed

    def lookupTaxOwed2(name: Name): Maybe[TaxOwed] =
      lookupNumber(name) flatMap lookupRegistration flatMap lookupTaxOwed
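To see the chained lookups short-circuit at different stages, here is a condensed, self-contained sketch of the same setup. It assumes a cut-down Maybe with only flatMap, and it compresses the seeding code; the object name LookupDemo and the helper names seeds and keep are hypothetical, while the lookup methods mirror the ones above.

```scala
object LookupDemo {
  // Cut-down stand-in for the Maybe type from part one (assumed shape).
  sealed trait Maybe[+A] {
    def flatMap[B](f: A => Maybe[B]): Maybe[B] = this match {
      case Just(a)  => f(a)
      case MaybeNot => MaybeNot
    }
  }
  case class Just[+A](a: A) extends Maybe[A]
  case object MaybeNot extends Maybe[Nothing]
  object Maybe {
    def apply[A](option: Option[A]): Maybe[A] = option match {
      case Some(a) => Just(a)
      case None    => MaybeNot
    }
  }

  case class Name(name: String)
  case class Number(number: String)
  case class Registration(registration: String)
  case class TaxOwed(taxOwed: Double)

  private val seeds: List[Int] = (1 to 100).toList

  private def seedMap[A, B](keep: Int => Boolean, toA: Int => A, toB: Int => B): Map[A, B] =
    seeds.filter(keep).map(i => toA(i) -> toB(i)).toMap

  // Seeds divisible by 2, 3 and 5 respectively make it into each map.
  val numberMap: Map[Name, Number] =
    seedMap(_ % 2 == 0, i => Name(i.toString), i => Number(i.toString))
  val registrationMap: Map[Number, Registration] =
    seedMap(_ % 3 == 0, i => Number(i.toString), i => Registration(i.toString))
  val taxOwedMap: Map[Registration, TaxOwed] =
    seedMap(_ % 5 == 0, i => Registration(i.toString), i => TaxOwed(i.toDouble))

  def lookup[A, B](map: Map[A, B])(a: A): Maybe[B] = Maybe(map.get(a))

  def lookupNumber(name: Name): Maybe[Number] = lookup(numberMap)(name)
  def lookupRegistration(number: Number): Maybe[Registration] = lookup(registrationMap)(number)
  def lookupTaxOwed(registration: Registration): Maybe[TaxOwed] = lookup(taxOwedMap)(registration)

  def lookupTaxOwed2(name: Name): Maybe[TaxOwed] =
    lookupNumber(name).flatMap(lookupRegistration).flatMap(lookupTaxOwed)
}
```

With this seeding, Name("30") survives all three lookups, Name("6") fails only at the tax map, Name("4") fails at the registration map, and Name("7") fails at the very first step.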
More Testing Maybe Obeys Monad Laws
We can easily add two more Monad laws tests, like so:
      maybeShouldObeyMonadLaws(
        "looking up Numbers and Registrations by Name",
        lookup.names,
        lookup.lookupNumber _,
        lookup.lookupRegistration _)

      maybeShouldObeyMonadLaws(
        "looking up Registrations and TaxesOwed by Number",
        lookup.numbers,
        lookup.lookupRegistration _,
        lookup.lookupTaxOwed _)
Conclusion
These were fun examples that are common use-cases for a Maybe class. We learned about Kleisli composition, and solidified our understanding of the Maybe Monad. As a Java refugee, it took me a while to get used to Scala's Option class as an alternative to nulls. But once you get the hang of it, it's really handy. Methods like map, flatMap, orElse,
and getOrElse are so much more elegant than the equivalent operations with Java nulls. All that being said, I am looking forward
to a non-Maybe example.
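For anyone who has not used those Option methods yet, here is a small sketch of each one. The maps and the names OptionDemo, numbers and owners are invented sample data; the methods themselves are the standard ones on scala.Option.

```scala
object OptionDemo {
  val numbers: Map[String, String] = Map("alice" -> "555-0100")
  val owners: Map[String, String] = Map("555-0100" -> "Alice")

  // Map#get returns an Option rather than null when a key is missing.
  val found: Option[String] = numbers.get("alice")
  val missing: Option[String] = numbers.get("bob")

  // map transforms the value if there is one.
  val digits: Option[Int] = found.map(_.length)

  // flatMap chains a second lookup that may itself come up empty.
  val owner: Option[String] = found.flatMap(n => owners.get(n))

  // orElse supplies an alternative Option; getOrElse supplies a plain default.
  val fallback: Option[String] = missing.orElse(found)
  val display: String = missing.getOrElse("unlisted")
}
```

None of these calls needs a null check, and a missing key simply flows through as None.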