Lecture 07

## IO
                    Lecture 07

---

### Information

About Individual Projects

---

### Restrictions

- Web HTTP server
                    - Several functionalities defined by __you__
                    - External storage (PostgreSQL, Redis, Kafka, ...)
                    - Clear instruction how to build & run
                    - Clear instruction how to use

---

### Grading

You may request 2 intermediates reviews (no effect on your grade)

---

### Grading Criteria

- Correctness in terms of staged functionality
                    - Correctness in terms of usage of libraries and tools
                    - Readable code organization
                    - Tests

---

### Submission

Github / Gitlab repository

---

Links:
                    - [Full Information about Projects](https://hackmd.io/@jOaKtRJ0S7ixy4dRtK5_AQ/HkwG-7PPt)
                    - [Petstore Example](https://github.com/pauljamescleary/scala-pet-store)

We know that

- Total Functions are _good_
                    - Pure Functions are _good_
                    - Referential Transparency is _good_

but ...
                    ---

It is impossible to write a _useful_ program without making a side-effect

---

Unless we want to transform out computer into a stove

---

What to do then?

How can we express a side-effect in the type system?
                    ---

Suppose, we have a special type `World`, that would include a snapshot of the real world with
                    everything that exists:

```scala [1-15]
                    // Type alias - just renaming
                    type World
                    ```

---

How then would a function that makes a side-effect look like?

What signature will it have?

For example, what signature will function writing to console have? reading from console?
                    ---

```scala [1-15]
                    // Take a snapshot of the world as an argument
                    // Extract console from the world,
                    // and put there a new line
                    // Returning a new snapshot of the world
                    def putStrLn(world: World, input: String): World = ???

// Take a snapshot of the world as an argument
                    // Extract console from the world,
                    // and read a new line from it
                    // Returning a new snapshot of the world
                    def readStrLn(world: World): (World, String) = ???
                    ```

---

Now, let's write program using these functions:

```scala [1-15]
                    def program(world: World): World = {
                      val (newWorld: World, stranger: String) =
                        readStlLn(world)
                      val greeting: String = s"Hello, $stranger"
                      putStrLn(world, greeting)
                    }
                    ```
                    ---

Now, all we have to do is creating a instance of `World`

However, we cannot do that

Let's rewrite our program without actually mentioning the `World`
                    explicitly

---

```scala [1-15]
                    def putStrLn(input: String): World => World = ???
                    def readStrLn: World => (World, String) = ???

def program: World => World = { world =>
                      val (newWorld: World, stranger: String) =
                        readStlLn(world)
                      val greeting: String = s"Hello, $stranger"
                      putStrLn(greeting)(world)
                    }
                    ```

---

We still are using `world` as an argument inside our program.

Let's define a new type that will hide it

---

```scala [1-15]
                    type WithSideEffect[A] = World => (World, A)
                    ```

---

```scala [1-15]
                    object WithSideEffect {
                      def pure[A](a: A): WithSideEffect[A] =
                        world => (world, a)

def flatMap[A, B](
                        fa: WithSideEffect[A],
                        f: A => WithSideEffect[B]
                      ): WithSideEffect[B] = { world =>
                        val (newWorld, res) = fa(world)
                        f(res)(newWorld)
                      }
                      // ...
                    ```

---

```scala[1-15]
                      // ...
                      def map[A, B](
                        fa: WithSideEffect[A],
                        f: A => B
                      ): WithSideEffect[B] =
                        flatMap[A, B](fa, f.andThen(pure[B]))
                    }
                    ```

---

Let's use define our program through `map` and `flatMap`:

```scala [1-15]
                    def putStrLn(input: String): WithSideEffect[Unit] = ???
                    def readStrLn: WithSideEffect[String] = ???

def program: WithSideEffect[Unit] = {
                      readStrLn.flatMap { stranger =>
                        val greeting: String = s"Hello, $stranger"
                        putStrLn(greeting)
                      }
                    }
                    ```

---

Scala has special syntax called `for-comprehension` to make several
                    nested `flatMap` and `map` calls readable

It allows writing programs in imperative style

---

Rewriting one more time:

```scala [1-15]
                    def program: WithSideEffect[Unit] =
                      for {
                        stranger <- readStrLn
                        // Local variable - same as val
                        greeting: String = s"Hello, $stranger"
                        _ <- putStrLn(stranger)
                      } yield ()
                    ```

---

We erased all mentions of the `World` type

However, our program (type `WithSideEffect`) is still just a function

It does not do anything useful, unless we explicitly call it

---

We need something to run it. Suppose we have an alternative entrypoint:

```scala [1-15]
                    def run(args: Array[String]): WithSideEffect[Unit] =
                      program

def main(args: Array[String]): Unit = {
                      // somehow calling run(args)
                    }
                    ```

---

This function is called the end of the world

It represents a boundary between pure programs and its impure runtime

Side-effects in this setting are evaluated somewhere where end of the world is called

Thus, in our code we only operate with pure functions and referentially transparent expressions

---

```scala [1-15]
                    def programTwo: WithSideEffect[Unit] =
                      for {
                        stranger <- readStrLn
                        greeting: String = s"Hello, $stranger"
                        _ <- putStrLn(greeting)
                        _ <- putStrLn(greeting)
                        _ <- putStrLn(greeting)
                      } yield ()
                    // Printing 3 times: Hello, $stranger
                    ```

---

```scala [1-15]
                    def programThree: WithSideEffect[Unit] =
                      for {
                        stranger <- readStrLn
                        greeting: String = s"Hello, $stranger"
                        print = putStrLn(greeting)
                        _ <- print
                        _ <- print
                        _ <- print
                      } yield ()
                    // Printing 3 times: Hello, $stranger
                    ```

---

Everytime we see that function returns `WithSideEffect`, we know that this function
                    will make a side-effect during evaluation

We explicitly declare side-effects this way, while maintaining purity and referential
                    transparency

---

Moreover, if we call such function, we don't have a choice - we must return `WithSideEffect` in
                    our function

```scala [1-15]
                    def complexComputation: WithSideEffect[BigDecimal] = ???
                    def pureLogic(from: BigDecimal): String = ???

def myProgram: WithSideEffect[String] =
                      map(complexComputation, pureLogic)
                    ```

---

Side-effects in such notation can contain anything: reading from console, saving data in database,
                    calling another service through HTTP

We still should separate our business logic in functions that do not return `WithSideEffect` in order
                    to maintain __Local Reasoning__

---

__Local Reasoning__ - ability to think about correctness of the function _locally_, using
                    only its signature and body

---

Overall concept of declaring computations with side-effects is called `IO` monad

```scala [1-15]
                    type IO[A] = WithSideEffect[A]
                    ```
                    ---

Scala ecosystem has several implementations of such concept:

- `cats.effect.IO`
                    - `monix.Task`
                    - `zio.ZIO`

Moreover, all them enable not only synchronous computations. They also enable concurrent, too. However,
                    this is a topic for another lecture

## Laziness

Now it is not about you

---

Scala has several built in mechanisms for lazy evaluation

First - `lazy val`. They are evaluated only when they are first called

---

```scala [1-15]
                    lazy val later: Int = {
                      println("One")
                      1
                    }
                    println("Two")
                    println(s"Three $later")
                    println(s"Four $later")

// std.out: "Two"
                    //          "One"
                    //          "Three 1"
                    //          "Four 1"
                    ```

---

Second mechanism - _by-name_ argument pass:

```scala [1-15]
                    @tailrec
                    def whileLoop(condition: => Boolean)(action: => Unit): Unit =
                      if (condition) {
                        action
                        whileLoop(condition)(action)
                      }
                    ```

`=>` arguments will be evaluated each time they are called

---

Example:

```scala [1-15]
                    var x = 0

whileLoop (x < 3) {
                      x += 1
                      println(s"Current $x")
                    }
                    // Current: 1
                    // Current: 2
                    // Current: 3
                    ```

## Implementation of `IO`

How it works in Scala?

---

Special type for function: __`() => T`__

```scala [1-15]
                        trait IO[+T] {
                          def impureCompute(): T
                        }
                        ```

---

```scala [1-15]
                        def putStrLn(line: String): IO[Unit] =
                          () => println(line)

putStrLn("Hello, world!")  // does nothing
                        putStrLn("Hello, world!").impureCompute() // prints
                        ```

---

This definition holds the referential transparency

```scala [1-15]
                        val a = putStrLn("Hello, world!")
                        a
                        a
                        ```

```scala [1-15]
                        putStrLn("Hello, world!")
                        putStrLn("Hello, world!")
                        ```
                        And both do nothing

---

This type is often called a suspended computation

It is lazy, because it does not start at the moment of its creation,
                        but only when it is explicitly started.

Unlike `lazy val`, these computations are not memoized and repeat on each invocation.

`impureCompute` invocations are usually limited to only a few places in the program.

---

Is this a pure function?

```scala [1-15]
                        def readln(): String = scala.io.StdIn.readLine()
                        ```

---

```scala [1-15]
                        val getStrLn: IO[String] = { () =>
                          scala.io.StdIn.readLine()
                        }
                        ```

This is a `val` instead of a `def`, is it correct? Why?

---
                        How do we define a pure function with an argument which read a user input
                        and prints the argument and the user input?

```scala [1-15]
                        def echo(line: String): Unit = {
                          val userInput = scala.io.StdIn.readLine()
                          println(line)
                          println(userInput)
                        }
                        ```
                        ---

```scala [1-15]
                        def printTwice(line: String): IO[Unit] = { () =>
                          val userInput = scala.io.StdIn.readLine()
                          println(line)
                          println(userInput)
                        }
                        ```

```scala [1-15]
                        def printTwice(line: String): IO[Unit] = { () =>
                          val userInput = getStrLn.impureCompute()
                          putStrLn(line).impureCompute()
                          putStrLn(userInput).impureCompute()
                        }
                        ```

---

Let's introduce `map` and `flatMap` combinators:

```scala [1-15]
                        trait IO[+T] {
                          def impureCompute(): T

def flatMap[O](f: T => IO[O]): IO[O] = { () =>
                            val intermediate: T = impureCompute()
                            f(intermediate).impureCompute()
                          }

def map[O](f: T => O): IO[O] = { () =>
                            f(impureCompute())
                          }
                        }
                        ```

---

Chaining of computations via `flatMap`:

```scala [1-15]
                        def printTwice(line: String): IO[Unit] =
                          getStrLn.flatMap { userInput =>
                            putStrLn(line).flatMap { _ =>
                              putStrLn(userInput)
                            }
                          }

---

And with some for-comprehensions:

```scala [1-15]
                        def printTwice(line: String): IO[Unit] =
                          for {
                            userInput <- getStrLn
                            _ <- putStrLn(line)
                            _ <- putStrLn(userInput)
                          } yield ()
                        ```

---

Compare to this:

```scala [1-15]
                        def printTwice(line: String): Unit = {
                          val userInput = readLn()
                          println(line)
                          println(userInput)
                        }
                        ```

## Operations with `IO`

What can we do?

---

Each pure computation can be lifted to an impure one:

```scala [1-15]
                    def add(l: Int, r: Int): Int = l + r

def impureAdd(l: Int, r: Int): IO[Int] = {
                      val res = add(l, r)
                      () => res
                    }

def deferredAdd(l: Int, r: Int): Computation[Int] =
                      { () => add(l, r) }
                    ```

What is the difference between `impureAdd` and `deferredAdd`?

---

Generally, putting a value `T` into `IO[T]` is done by function `pure`:

```scala [1-15]
                    object IO {
                      def pure[T](value: T): IO[T] =
                        { () => value }
                    }
                    ```

---

Suppose we have a list of strings

```scala [1-15]
                    val strings = List("Hello", "World")
                    ```

How do we print all these strings sequentially using `putStrLn`, `flatMap` and recursion?

---

`traverse` combinator is analogous to `foreach` and `map` methods on lists.

```scala [1-15]
                    def traverse[T, O](
                      list: List[T],
                      computation: T => IO[O],
                    ): IO[List[O]] =
                      list match {
                        case x :: xs =>
                          for {
                            res <- computation(x)
                            resRest <- traverse(xs, computation)
                          } yield res :: resRest
                        case Nil => Computation.pure(Nil)
                      }
                    ```

It runs a provided side-effectful function on each element sequentially and returns a list of results.

---
                    Another useful combinator is `sequence`:

```scala [1-15]
                    def sequence[T](
                      list: List[Computation[T]]
                    ): Computation[List[T]] =
                      traverse(list, identity)
                    ```

It chains a list of computations sequentially and returns a computation with list of results.

## Errors in `IO`

How to work with them?

---

Another important aspect to consider is potential errors and exceptions in the effectful code.

```scala [1-15]
                    val fail: IO[Nothing] = { () =>
                      throw new RuntimeExceptions("no code is executed after me")
                    }
                    ```
                    Like in ordinary imperative code, exceptions stop the computation chain and return early.
                    ---

```scala [1-15]
                    (for {
                      _ <- putStrLn("Hello")
                      _ <- fail
                      _ <- putStrLn("world")
                    } yield ()).impureCompute()
                    ```
                    How will this code behave?
                    ---

Let's suppose we want to catch errors and introduce additional logic.

We can introduce an `attempt` function which returns an infallible computation with `Try` in result.

```scala [1-15]
                    def attempt[T](
                      computation: IO[T],
                    ): IO[Try[T]] = { () =>
                      Try(computation.impureCompute())
                    }
                    ```

---

```scala [1-15]
                    (for {
                      _ <- putStrLn("Hello")
                      result <- attempt(fail)
                      _ <- result match {
                        case Success(i) => putStrLn("world")
                        case Failure(throwable) =>
                          putStrLn("I failed with: " + throwable)
                      }
                    } yield ()).impureCompute()
                    ```

---

We can also handle errors partially using a `PartialFunction`

```scala [1-15]
                    def catchErrors[T](computation: IO[T])(
                      handler: PartialFunction[Throwable, IO[T]],
                    ): IO[T] = attempt(computation).flatMap {
                      case Success(i) => IO.pure(i)
                      case Failure(throwable)
                        if handler.isDefinedAt(throwable) =>
                          handler(throwable)
                      case Failure(throwable) => throw throwable
                    }
                    ```

---
                    ```scala [1-15]
                    (for {
                      _ <- putStrLn("Hello")
                      result <- catchErrors(fail) {
                        case throwable =>
                          putStrLn("I failed with: " + throwable)
                      }
                    } yield ()).impureCompute()
                    ```

---

Finally, let's implement end of the world

```scala [1-15]
                    abstract class IOApp {

def run(args: Array[String]): IO[Unit]

final def main(args: Array[String]): Unit =
                        run(args).impureCompute()
                    }
                    ```

---

```scala [1-15]
                    object Main extends IOApp {
                      def printTwice(line: String): IO[Unit] =
                        for {
                          userInput <- getStrLn
                          _ <- putStrLn(line)
                          _ <- putStrLn(userInput)
                        } yield ()

def run(args: Array[String]): IO[Unit] =
                        printTwice("Hello!")
                    }
                    ```

---

## Questions?