Lesson 22. Interacting with the command line and lazy I/O

After reading lesson 22, you’ll be able to

  • Access command-line arguments
  • Use the traditional approach to interacting through I/O
  • Write I/O code using lazy evaluation to make I/O easier

Often when people first learn about I/O and Haskell, they assume that I/O is somewhat of a challenge for Haskell because Haskell is all about pure programs and I/O is anything but pure. But there’s another way to view I/O that makes it uniquely suited to Haskell, and somewhat clunky in other programming languages. Often when working with I/O in any language, we talk about I/O streams, but what is a stream? One good way to understand I/O streams is as a lazily evaluated list of characters. STDIN streams user input into a program until an eventual end is reached. But this end isn’t always known (and in theory could never occur). This is exactly how to think about lists in Haskell when using lazy evaluation.

This view of I/O is used in nearly every programming language when reading from large files. Often it’s impractical, or even impossible, to read a large file into memory before operating on it. But imagine that a given large file was simply some text assigned to a variable, and that variable was a lazy list. As you learned earlier, lazy evaluation allows you to operate on infinitely long lists. No matter how large your input is, you can handle it if you treat the problem like a large list.
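To make the lazy-list idea concrete, here is a minimal sketch. The names evens, firstEvens, and firstLine are made up for illustration and don’t appear elsewhere in this lesson. The point is only that a consumer can take what it needs from a list that never ends.

```haskell
-- An infinite list: only the elements that are demanded get computed.
evens :: [Int]
evens = map (* 2) [1 ..]

-- Taking a finite prefix forces just that prefix.
firstEvens :: Int -> [Int]
firstEvens n = take n evens

-- The same idea on characters: stop at the first newline, no matter
-- how long (or endless) the rest of the "stream" is.
firstLine :: String -> String
firstLine = takeWhile (/= '\n')
```

In GHCi, firstEvens 3 evaluates to [2,4,6] even though evens has no end.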

In this lesson, you’ll look at a simple problem and solve it in a few ways. All you want to do is create a program that reads in an arbitrarily long list of numbers entered by a user, and then adds them all up and returns the result to the user. Along the way, you’ll learn both how to write traditional I/O and how to use lazy evaluation to come up with a much easier way to reason about the solution.

Consider this

You want to write a program that will let a user test whether words are palindromes. This is easy for a single word, but how can you let the user supply a continuous list of potential palindromes and keep checking as long as the user has words to check?

22.1. Interacting with the command line the nonlazy way

First let’s design a command-line tool that reads a list of numbers entered by the user and adds them all up. You’ll create a program called sum.hs. In the preceding lesson, you dealt with taking in user inputs and performing computations on them. The tricky thing this time is that you don’t know how many items the user is going to enter in advance.

One way to solve this is to have the user pass the number of values they plan to enter as an argument to the program; for example:

$ ./sum 4
"enter your numbers"

 3
 5
 9
 25
"your total is 42"

To get arguments, you can use the getArgs function found in System.Environment. The type signature of getArgs is as follows:

getArgs :: IO [String]

So you get a list of Strings in the context of IO. Here’s an example of using getArgs in your main.

Listing 22.1. Getting command-line arguments by using getArgs
import System.Environment

main :: IO ()
main = do
 args <- getArgs
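If you’d like to experiment with getArgs from GHCi rather than a compiled program, one option is withArgs, which also lives in System.Environment. The sketch below is only an illustration; countArgs and demoArgs are hypothetical names, not part of sum.hs.

```haskell
import System.Environment (getArgs, withArgs)

-- Report how many arguments the program received.
countArgs :: IO Int
countArgs = length <$> getArgs

-- withArgs runs an action as if the program had been started with
-- the given arguments.
demoArgs :: IO Int
demoArgs = withArgs ["2", "3", "4"] countArgs
```

Running demoArgs yields 3, because countArgs sees the three supplied arguments.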

To get a feel for how getArgs works, it would be nice to print out all the args you have. Because you know that args is a list, you could use map to iterate over each value. But you have a problem, because you’re working in the context of a do statement with an IO type. What you want is something like this.

Listing 22.2. Proposed solution to print your args (note: won’t compile)
map putStrLn args

But putStrLn isn’t an ordinary function: it returns an IO action, so mapping it over args would give you a list of IO actions rather than performing them. To run an action on each item of a list inside IO, you need a special version of map that works in the context of IO (technically, in any member of the Monad type class). For that, there’s a special helper function called mapM (the M stands for Monad).

Listing 22.3. Next improvement: using mapM (still won’t compile)
main :: IO ()
main = do
 args <- getArgs
 mapM putStrLn args

Now when you compile your program, you still end up getting an error:

Couldn't match type '[()]' with '()'

GHC is complaining because the type of main is supposed to be IO (), but mapM, like map, always gives back a list; here that’s a list of () values in the IO context, of type IO [()]. The trouble is that you just want to iterate over args and perform an IO action. You don’t care about the results, and don’t want a list back at the end. To solve this, there’s another function called mapM_ (note the underscore). This works just like mapM but throws away the results. Typically, when a function name ends with an underscore in Haskell, it indicates that you’re throwing away the results. With this small refactor, you’re ready to go:

main :: IO ()
main = do
 args <- getArgs
 mapM_ putStrLn args
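As an aside, the difference between the two functions is easier to see when the action produces something worth keeping. In this sketch (printWithLengths and printOnly are illustrative names, not part of sum.hs), the first function returns the length of each string it prints, and the second discards everything.

```haskell
-- mapM keeps the result of every action, so it hands back a list in IO.
printWithLengths :: [String] -> IO [Int]
printWithLengths = mapM (\s -> putStrLn s >> pure (length s))

-- mapM_ runs the same kind of actions but discards the results,
-- giving plain IO ().
printOnly :: [String] -> IO ()
printOnly = mapM_ putStrLn
```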

You can try a few commands and see what you get:

$ ./sum
$ ./sum 2
2
$ ./sum 2 3 4 5
2
3
4
5

Quick check 22.1

Q1:

Write a main that uses mapM to call getLine three times, and then use mapM_ to print out the values that were entered. (Hint: You’ll need to throw away an argument when using mapM with getLine; use (\_ -> ...) to achieve this.)

QC 22.1 answer

1:

exampleMain :: IO ()
exampleMain = do
   vals <- mapM (\_ -> getLine) [1..3]
   mapM_ putStrLn vals

 

Now you can add the logic to capture your argument. You should also cover the case of a user failing to enter an argument. You’ll treat that as 0 lines. Also note that you’re using the print function for the first time. The print function is (putStrLn . show) and makes printing any type of value easier.

Listing 22.4. Using a command-line argument to determine how many lines to read
main :: IO ()
main = do
 args <- getArgs
 let linesToRead = if length args > 0
                   then read (head args)
                   else 0 :: Int
 print linesToRead
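One caveat with Listing 22.4 is that read raises a runtime error if the argument isn’t a number (for example, ./sum four). A possible alternative, sketched here under the assumption that a bad argument should also count as 0 lines, is to pattern match on the argument list and parse with readMaybe from Text.Read. The name linesToReadFrom is hypothetical.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Decide how many lines to read from the raw argument list.
-- No argument, or an argument that isn't a whole number, counts as 0
-- (that fallback is an assumption of this sketch).
linesToReadFrom :: [String] -> Int
linesToReadFrom (firstArg : _) = fromMaybe 0 (readMaybe firstArg)
linesToReadFrom []             = 0
```

Because this is a pure function, main would only need let linesToRead = linesToReadFrom args.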

Now that you know how many lines you need, you need to repeatedly call getLine. Haskell has another useful function for iterating in this way called replicateM. The replicateM function takes a value for the number of times you want to repeat and an IO action and repeats the action as expected. You need to import Control.Monad to do this.

Listing 22.5. Reading a number of lines equal to the user’s argument
import System.Environment
import Control.Monad

main :: IO ()
main = do
 args <- getArgs
 let linesToRead = if length args > 0
                   then read (head args)
                   else 0
 numbers <- replicateM linesToRead getLine
 print "sum goes here"

Okay, you’re almost there! Remember that getLine returns a String in the IO context. Before you can take the sum of all these inputs, you need to convert them to Ints, and then you can print the sum of the resulting list.

Listing 22.6. The full content of your sum.hs program
import System.Environment
import Control.Monad

main :: IO ()
main = do
 args <- getArgs
 let linesToRead = if length args > 0
                   then read (head args)
                   else 0 :: Int
 numbers <- replicateM linesToRead getLine
 let ints = map read numbers :: [Int]
 print (sum ints)

That was a bit of work, but now you have a tool that lets users enter as many ints as they want, and you can add them up for them:

$ ./sum 2
4
5
9

$ ./sum 4
1
2
3
4
10

Even in this simple program, you’ve covered a number of the tools used to handle user inputs. Table 22.1 covers some useful functions for iterating in an IO type.

Table 22.1. Functions for iterating in an IO context

Function       Behavior
mapM           Takes a function that returns an IO action and a regular list, performs the action on each item in the list, and returns the results as a list in the IO context
mapM_          Same as mapM, but it throws away the results (note the underscore)
replicateM     Takes an Int n and an IO action, repeats the action n times, and returns the results as a list in the IO context
replicateM_    Same as replicateM, but it throws away the results
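The table can also be read as a set of type signatures. The sketch below narrows each function’s general type to IO and lists; the aliases mapIO, mapIO_, replicateIO, and replicateIO_ are invented here purely so the specialized signatures can be written down and type-checked.

```haskell
import Control.Monad (replicateM, replicateM_)

-- The four helpers, with their general types narrowed to IO and lists.
mapIO :: (a -> IO b) -> [a] -> IO [b]
mapIO = mapM

mapIO_ :: (a -> IO b) -> [a] -> IO ()
mapIO_ = mapM_

replicateIO :: Int -> IO a -> IO [a]
replicateIO = replicateM

replicateIO_ :: Int -> IO a -> IO ()
replicateIO_ = replicateM_
```

Notice that the underscore versions differ from their counterparts only in the result: IO () instead of a list in IO.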

Next you’ll look at how much easier this would be if you used lazy evaluation.

Quick check 22.2

Q1:

Write your own version of replicateM, myReplicateM, that uses mapM. (Don’t worry too much about the type signature.)

QC 22.2 answer

1:

myReplicateM :: Monad m => Int -> m a -> m [a]
myReplicateM n func = mapM (\_ -> func) [1 .. n]

 

22.2. Interacting with lazy I/O

Your last program worked but had a few issues. First is that you require the user to input the specific number of lines needed. The user of your sum program needs to know this ahead of time. What if users are keeping a running tally of visitors to a museum, or piping in the output of another program to yours? Recall that the primary purpose of having an IO type is to separate functions that absolutely must work in I/O from more general ones. Ideally, you want as much of your program logic outside your main. In this program, all your logic is wrapped up in IO, which indicates that you’re not doing a good job of abstracting out your overall program. This is partially because so much I/O behavior is intermingled with what your program is supposed to be doing.

The root cause of this issue is that you’re treating your I/O data as a sequence of values that you have to deal with immediately. An alternative is to think of the stream of data coming from the user in the same way you would any other list in Haskell. Rather than think of each piece of data as a discrete user interaction, you can treat the entire interaction as a list of characters coming from the user. If you treat your input as a list of Chars, it’s much easier to design your program and forget all about the messy parts of I/O. To do this, you need just one special action: getContents. The getContents action lets you treat the I/O stream for STDIN as a list of characters.

You can use getContents with mapM_ to see how strangely this can act. You’ll be working with a new file named sum_lazy.hs for this section.

Listing 22.7. A simple main to explore lazy I/O
main :: IO ()
main = do
  userInput <- getContents
  mapM_ print userInput

The getContents action reads input until it gets an end-of-file signal. For a normal text file, this is the end of the file, but for user input you have to enter it manually (usually with Ctrl-D in a terminal). Before running this program, it’s worth thinking about what’s going to happen, given lazy evaluation. In a strict (nonlazy) language, you’d assume that you have to wait until you manually enter Ctrl-D before your input would be printed back to you. Let’s see what happens in Haskell:

$ ./sum_lazy
hi
'h'
'i'
'\n'
what?
'w'
'h'
'a'
't'
'?'
'\n'

As you can see, because Haskell can handle lazy lists, it’s able to process your text as soon as you enter it! This means you can handle continuous interaction in interesting ways.
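As one possible illustration of continuous interaction, here is a sketch of the palindrome checker from the start of the lesson. It relies on interact, a Prelude function not otherwise used in this lesson, which feeds all of STDIN through a pure String -> String function. The names isPalindrome, respond, and palindromeLoop are ours, and the exact moment each answer appears depends on how your terminal buffers input and output.

```haskell
-- Pure logic: is a word the same forwards and backwards?
isPalindrome :: String -> Bool
isPalindrome word = word == reverse word

-- Pure logic: turn the whole input stream into the whole output
-- stream, one verdict per input line.
respond :: String -> String
respond = unlines . map verdict . lines
  where
    verdict word
      | isPalindrome word = word ++ " is a palindrome"
      | otherwise         = word ++ " is not a palindrome"

-- The only IO: connect STDIN to STDOUT through the pure function.
palindromeLoop :: IO ()
palindromeLoop = interact respond
```

Because respond is lazy, it can produce the verdict for the first line without needing to see the end of the input.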

Quick check 22.3

Q1:

Use lazy I/O to write a program that reverses your input and prints it back to you.

QC 22.3 answer

1:

reverser :: IO ()
reverser = do
   input <- getContents
   let reversed = reverse input
   putStrLn reversed

 

22.2.1. Thinking of your problem as a lazy list

With getContents, you can rewrite your program, this time completely ignoring IO until later. All you need to do now is work with a list of characters consisting of digits and newline characters. Here’s a sample list.

Listing 22.8. Sample data representing a string of input characters
sampleData = ['6','2','\n','2','1','\n']

If you can write a function that converts this into a list of Ints, you’ll be all set! There’s a useful function for Strings that you can use to make this easy. The lines function allows you to split a string by lines. Here’s an example in GHCi with your sample data:

GHCi> lines sampleData
["62","21"]
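It can help to see what lines does at the edges of its input. The sketch below compares it with a naive hand-written splitter that cuts at every occurrence of a separator character. splitOnChar is a hypothetical helper written for this comparison, not a library function.

```haskell
-- Cut a string at every occurrence of the separator character.
splitOnChar :: Char -> String -> [String]
splitOnChar sep str = case break (== sep) str of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOnChar sep rest

-- The same sample data as in Listing 22.8, written as a String literal.
sample :: String
sample = "62\n21\n"
```

With this input, lines sample gives ["62","21"], whereas splitOnChar '\n' sample gives ["62","21",""]: the naive splitter keeps an empty string after the final newline, and lines drops it. That matters here because read would fail on an empty string.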

The Data.List.Split module contains a more generic function than lines, splitOn, which splits a String based on another String. Data.List.Split isn’t part of base Haskell; it comes from the split package, which is included in the Haskell Platform. If you aren’t using the Haskell Platform, you may need to install the split package. The splitOn function is a useful one to know when processing text. Here’s how lines could be written with splitOn.

Listing 22.9. Defining myLines with splitOn from Data.List.Split
myLines = splitOn "\n"

With lines, all you need is to map the read function over your new lists and you’ll get your list of Ints. You’ll create a toInts function to do this.

Listing 22.10. toInts function to convert your Char list into a list of Ints
toInts :: String -> [Int]
toInts = map read . lines
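Note that read fails at runtime on any line that isn’t a number, including a blank line. If that matters for your input, one hedged alternative is to skip unparseable lines with readMaybe and mapMaybe. This is a sketch, not part of the lesson’s program; toIntsSafe is a made-up name, and silently skipping bad lines is a design choice you may not want.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Like toInts, but lines that don't parse as an Int are skipped
-- instead of crashing the program.
toIntsSafe :: String -> [Int]
toIntsSafe = mapMaybe readMaybe . lines
```

The rest of this lesson sticks with the simpler toInts.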

Making this function work with IO is remarkably easy. You apply it to your userInput you captured with getContents.

Listing 22.11. Your lazy solution to processing your numbers
main :: IO ()
main = do
  userInput <- getContents
  let numbers = toInts userInput
  print (sum numbers)

As you can see, your final main is much cleaner than your first version. Now you can compile your program and test it out:

$ ./sum_lazy
4
234
23
1
3
<ctrl-d>
265

This is much nicer than before, as your code is cleaner and users don’t have to worry about how many numbers are in the list when they start. In this lesson, you’ve seen how to structure your program to work in a way similar to most other programming languages. You request data from the user, process that data, and then request more input from the user. In this model, you’re performing strict I/O, meaning that you evaluate each piece of data as you get it. In many cases, if you treat the user input as a regular lazy list of Chars, you can abstract out nearly all of your non-I/O code much more easily. In the end, you have only one point where you need to treat your list as I/O: when you first receive it. This allows all the rest of your code to be written as code that operates on a normal list in Haskell.
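The museum-tally scenario mentioned earlier suggests a small variation: printing a running total instead of a single final sum. The following is a sketch under the assumption that input arrives one number per line; runningTotals and tallyLoop are illustrative names, and toInts is repeated from Listing 22.10 so the block stands on its own.

```haskell
-- Same conversion as toInts in Listing 22.10.
toInts :: String -> [Int]
toInts = map read . lines

-- Running totals: element n is the sum of the first n numbers.
runningTotals :: String -> [Int]
runningTotals = scanl1 (+) . toInts

-- Print each subtotal; with lazy input, each one can be printed
-- once the corresponding line has been read.
tallyLoop :: IO ()
tallyLoop = do
  userInput <- getContents
  mapM_ print (runningTotals userInput)
```

All the arithmetic lives in the pure runningTotals, which works even on endless input because scanl1 produces each subtotal as soon as its number is available.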

Quick check 22.4

Q1:

Write a program that returns the sum of the squares of the input.

QC 22.4 answer

1:

mainSumSquares :: IO ()
mainSumSquares = do
   userInput <- getContents
   let numbers = toInts userInput
   let squares = map (^2) numbers
   print (sum squares)

 

Summary

In this lesson, our objective was to introduce you to the ways to write simple command-line interfaces in Haskell. The most familiar way is to treat I/O just as you would in any other programming language. You can use do-notation to create a procedural list of IO actions, and build interactions with I/O this way. A more interesting approach, possible in few languages other than Haskell, is to take advantage of lazy evaluation. With lazy evaluation, you can think of the entire input stream as a lazily evaluated list of characters, [Char]. You can radically simplify your code by writing out pure functions as though they were just working on the type [Char]. Let’s see if you got this.

Q22.1

Write a program, simple_calc.hs, that reads simple equations involving adding two numbers or multiplying two numbers. The program should solve the equation the user types on each line as that line is entered.

Q22.2

Write a program that allows a user to select a number between 1 and 5 and then prints a famous quote (quotes are of your choosing). After printing the quote, the program will ask whether the user would like another. If the user enters n, the program ends; otherwise, the user gets another quote. The program repeats until the user enters n. Try to use lazy evaluation and treat the user input as a list rather than recursively calling main at the end.
