
Repository: thma/WhyHaskellMatters

Repository URL: https://github.com/thma/WhyHaskellMatters

Language: Haskell (100.0%)

Why Haskell Matters

Actions Status

Haskell doesn't solve different problems than other languages. But it solves them differently.

-- unknown

Abstract

In this article I try to explain why Haskell keeps being such an important language by presenting some of its most important and distinguishing features and detailing them with working code examples.

The presentation aims to be self-contained and does not require any previous knowledge of the language.

The target audience is Haskell newcomers and developers with a background in non-functional languages who are eager to learn about concepts of functional programming and Haskell in particular.

Table of contents

Introduction

Exactly thirty years ago, on April 1st, 1990, a small group of researchers in the field of non-strict functional programming published the original Haskell language report.

Haskell never became one of the most popular languages in the software industry or part of the mainstream, but it has been and still is quite influential in the software development community.

In this article I try to explain why Haskell keeps being such an important language by presenting some of its most distinguishing features and detailing them with working code examples.

The presentation aims to be self-contained and does not require any previous knowledge of the language. I will also try to keep the learning curve moderate and to limit the scope of the presentation; nevertheless this article is by no means a complete introduction to the language.

(If you are looking for thorough tutorials, have a look at the Haskell Wikibook or Learn You a Haskell.)

Before diving directly into the technical details, I'd like to first take a closer look at the reception of Haskell in the software developer community:

A strange development over time

In a 2017 talk on Haskell's journey since its beginnings in the 1980s, Simon Peyton Jones speaks about the rather unusual life story of Haskell.

First he talks about the typical life cycle of research languages. They are often created by a single researcher (who is also the single user), and most of them will be abandoned after just a few years.

A more successful research language might gain some interest in a larger community but will still not escape the ivory tower and typically will be given up within ten years.

On the other hand, we have all those popular programming languages that are quickly adopted by large numbers of developers and thus reach "the threshold of immortality": the base of existing code grows so large that the language will remain in use for decades.

A little jokingly, he then depicts the sad fate of languages designed by committees as a flat line through zero: they simply never take off.

Finally, he presents a chart showing the Haskell timeline:

the haskell timeline

The development shown in this chart seems rather unexpected: Haskell started as a research language and was even designed by a committee; so in all probability it should have been abandoned long before the millennium!

Instead, it gained some momentum in its early years followed by a rather quiet phase during the decade of OO hype (Java being released in 1995). And then again we see a continuous growth of interest since about 2005. I'm writing this in early 2020, and we still see this trend!

Being used versus being discussed

Then Simon Peyton Jones points out another interesting characteristic of the reception of Haskell in recent years: in statistics that rank programming languages by actual usage, Haskell is typically not among the 30 most active languages. But in statistics that instead rank languages by the volume of discussions on the internet, Haskell typically scores much better (often in the top ten).

So why does Haskell keep being a hot topic in the software development community?

A very short answer might be: Haskell has a number of features that are clearly different from those of most other programming languages. Many of these features have proven to be powerful tools to solve basic problems of software development elegantly.

Therefore, over time other programming languages have adopted parts of these concepts (e.g. pattern matching or type classes). In discussions about such concepts the Haskell heritage is mentioned and differences between the original Haskell concepts and those of other languages are discussed. Sometimes people feel encouraged to have a closer look at the source of these concepts to get a deeper understanding of their original intentions. That's why we see a growing number of developers working in Python, TypeScript, Scala, Rust, C++, C# or Java starting to dive into Haskell.

A further essential point is that Haskell is still an experimental laboratory for research in areas such as compiler construction, programming language design, theorem-provers, type systems etc. So inevitably Haskell will be a topic in the discussion about these approaches.

In the following sections we will try to find the longer answer by studying some of the most distinguishing features of Haskell.

Functions are First-class

In computer science, a programming language is said to have first-class functions if it treats functions as first-class citizens. This means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures.[1] Some programming language theorists require support for anonymous functions (function literals) as well.[2] In languages with first-class functions, the names of functions do not have any special status; they are treated like ordinary variables with a function type.

quoted from Wikipedia

We'll go through this one by one:

Functions can be assigned to variables exactly like any other values

Let's have a look at how this looks in Haskell. First we define some simple values:

-- define constant `aNumber` with a value of 42. 
aNumber :: Integer
aNumber = 42

-- define constant `aString` with a value of "hello world"
aString :: String
aString = "Hello World"

In the first line we see a type signature that defines the constant aNumber to be of type Integer. In the second line we define the value of aNumber to be 42. In the same way we define the constant aString to be of type String.

Haskell is a statically typed language: all type checks happen at compile time. Static typing has the advantage that type errors don't happen at runtime. This is especially useful if a function signature is changed and this change affects many dependent parts of a project: the compiler will detect the breaking changes at all affected places.

The Haskell compiler also provides type inference, which allows the compiler to deduce the concrete data type of an expression from the context. Thus, it is usually not required to provide type declarations. Nevertheless, using explicit type signatures is considered good style, as they are an important element of comprehensive documentation.
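As a small illustration of type inference (a sketch; the name inferredDouble is made up for this example), a definition without a signature is accepted and the compiler infers a general type for it, which you can inspect in GHCi with :type:

-- no type signature given; GHC infers the general type
-- inferredDouble :: Num a => a -> a
inferredDouble x = 2 * x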

Next we define a function square that takes an integer argument and returns the square value of the argument:

square :: Integer -> Integer
square x = x * x

Defining a function works in exactly the same way as defining any other value. The only special thing is that we declare the type to be a function type by using the -> notation. So :: Integer -> Integer represents a function from Integer to Integer. In the second line we define the function square to compute x * x for any Integer argument x.

Ok, seems not too difficult, so let's define another function double that doubles its input value:

double :: Integer -> Integer
double n = 2 * n

Support for anonymous functions

Anonymous functions, also known as lambda expressions, can be defined in Haskell like this:

\x -> x * x

This expression denotes an anonymous function that takes a single argument x and returns the square of that argument. The backslash is read as λ (the Greek letter lambda).

You can use such expressions anywhere you would use any other function. For example, you could apply the anonymous function \x -> x * x to a number just like the named function square:

-- use named function:
result = square 5

-- use anonymous function:
result' = (\x -> x * x) 5

We will see more useful applications of anonymous functions in the following section.

Functions can be returned as values from other functions

Function composition

Do you remember function composition from your high-school math classes? Function composition is an operation that takes two functions f and g and produces a function h such that h(x) = g(f(x)). The resulting composite function is denoted h = g ∘ f, where (g ∘ f)(x) = g(f(x)). Intuitively, composing functions is a chaining process in which the output of function f is used as the input of function g.

So from a programmer's perspective, the ∘ operator is a function that takes two functions as arguments and returns a new composite function.

In Haskell this operator is represented by the dot operator (.):

(.) :: (b -> c) -> (a -> b) -> a -> c
(.) f g x = f (g x)

The parentheses around the dot are required as we want to use a non-alphabetic symbol as an identifier. In Haskell such identifiers can be used as infix operators (as we will see below). Otherwise (.) is defined like any other function. Please also note how close the syntax is to the original mathematical definition.

Using this operator we can easily create a composite function that first doubles a number and then computes the square of that doubled number:

squareAfterDouble :: Integer -> Integer
squareAfterDouble = square . double
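To see the order of application, here is a small sanity check using the definitions above (the name compositionCheck is made up for this sketch): the composite applies double first and square second.

-- squareAfterDouble 3 == square (double 3) == square 6 == 36
compositionCheck :: Bool
compositionCheck = squareAfterDouble 3 == 36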

Currying and Partial Application

In this section we look at another interesting example of functions producing other functions as return values. We start by defining a function add that takes two Integer arguments and computes their sum:

-- function adding two numbers 
add :: Integer -> Integer -> Integer
add x y = x + y

This looks quite straightforward. But there is still one interesting detail to note: the type signature of add is not something like

add :: (Integer, Integer) -> Integer

Instead it is:

add :: Integer -> Integer -> Integer

What does this signature actually mean? It can be read as "a function taking an Integer argument and returning a function of type Integer -> Integer". Sounds weird? But that's exactly what Haskell does internally. So if we call add 2 3, first add is applied to 2, which returns a new function of type Integer -> Integer, which is then applied to 3.

This technique is called currying.
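As a side note, the Prelude provides the functions curry and uncurry to convert between the curried form and the tuple-based form; a minimal sketch (the names addPair and add' are made up here, add refers to the definition above):

-- uncurry turns `add` into a function on pairs, curry does the reverse
addPair :: (Integer, Integer) -> Integer
addPair = uncurry add      -- addPair (2, 3) == 5

add' :: Integer -> Integer -> Integer
add' = curry addPair       -- add' 2 3 == 5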

Currying is widely used in Haskell as it allows another cool thing: partial application.

In the next code snippet we define a function add5 by partially applying the function add to only one argument:

-- partial application: applying add to 5 returns a function of type Integer -> Integer
add5 :: Integer -> Integer
add5 = add 5

The trick is as follows: add 5 returns a function of type Integer -> Integer which will add 5 to any Integer argument.

Partial application thus allows us to write functions that return functions as result values. This technique is frequently used to provide functions with configuration data.
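As a hypothetical sketch of this configuration idea (the names greet and greetInEnglish are invented for this example), a general function can be specialized by fixing its first argument:

-- a greeting function that takes the greeting word as "configuration"
greet :: String -> String -> String
greet greeting name = greeting ++ ", " ++ name ++ "!"

-- partially applying the configuration yields a ready-to-use function
greetInEnglish :: String -> String
greetInEnglish = greet "Hello"   -- greetInEnglish "World" == "Hello, World!"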

Functions can be passed as arguments to other functions

I could keep this section short by telling you that we have already seen an example for this: the function composition operator (.). It accepts two functions as arguments and returns a new one as in:

squareAfterDouble :: Integer -> Integer
squareAfterDouble = square . double

But I have another instructive example at hand.

Let's imagine we have to implement a function that doubles any odd Integer:

ifOddDouble :: Integer -> Integer
ifOddDouble n =
  if odd n
    then double n
    else n

The Haskell code is straightforward: the new ingredients are the if ... then ... else ... construct and the predicate odd from the Haskell standard library, which returns True if an integral number is odd.

Now let's assume that we also need another function that computes the square for any odd number:

ifOddSquare :: Integer -> Integer
ifOddSquare n =
  if odd n
    then square n
    else n

As vigilant developers we immediately detect a violation of the Don't Repeat Yourself principle, as both functions vary only in the growth function used: double versus square.

So we are looking for a way to refactor this code into a solution that keeps the original structure but allows us to vary the growth function.

What we need is a function that takes a growth function (of type (Integer -> Integer)) as first argument, an Integer as second argument and returns an Integer. The specified growth function will be applied in the then clause:

ifOdd :: (Integer -> Integer) -> Integer -> Integer
ifOdd growthFunction n =
  if odd n
    then growthFunction n
    else n

With this approach we can refactor ifOddDouble and ifOddSquare as follows:

ifOddDouble :: Integer -> Integer
ifOddDouble n = ifOdd double n

ifOddSquare :: Integer -> Integer
ifOddSquare n = ifOdd square n

Now imagine that we have to implement new functions ifEvenDouble and ifEvenSquare that work only on even numbers. Instead of repeating ourselves, we come up with a function ifPredGrow that takes a predicate function of type (Integer -> Bool) as first argument, a growth function of type (Integer -> Integer) as second argument and an Integer as third argument, returning an Integer.

The predicate function will be used to determine whether the growth function has to be applied:

ifPredGrow :: (Integer -> Bool) -> (Integer -> Integer) -> Integer -> Integer
ifPredGrow predicate growthFunction n =
  if predicate n
    then growthFunction n
    else n

Using this higher-order function, which even takes two functions as arguments, we can write the two new functions and further refactor the existing ones without breaking the DRY principle:

ifEvenDouble :: Integer -> Integer
ifEvenDouble n = ifPredGrow even double n

ifEvenSquare :: Integer -> Integer
ifEvenSquare n = ifPredGrow even square n

ifOddDouble'' :: Integer -> Integer
ifOddDouble'' n = ifPredGrow odd double n

ifOddSquare'' :: Integer -> Integer
ifOddSquare'' n = ifPredGrow odd square n
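Note that partial application pays off here as well: since ifPredGrow is curried, the trailing n argument can simply be dropped. A small sketch using the definitions above (the primed name is made up):

-- point-free variant: ifPredGrow applied to just the two functions
ifEvenDouble' :: Integer -> Integer
ifEvenDouble' = ifPredGrow even double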

Pattern matching

With the things that we have learnt so far, we can now start to implement some more interesting functions. So what about implementing the recursive factorial function?

The factorial function can be defined as follows:

For all n ∈ ℕ0:

0! = 1
n! = n * (n-1)!   (for n > 0)

With our current knowledge of Haskell we can implement this as follows:

import Numeric.Natural (Natural)  -- Natural models the non-negative integers ℕ0

factorial :: Natural -> Natural
factorial n =
  if n == 0
    then 1
    else n * factorial (n - 1)

We are using the Haskell data type Natural to denote the set of non-negative integers ℕ0. Using the literal factorial within the definition of the function factorial works as expected and denotes a recursive function call.

As this kind of recursive definition is typical for functional programming, the language designers have added a useful feature called pattern matching that allows defining functions by a set of equations:

fac :: Natural -> Natural
fac 0 = 1
fac n = n * fac (n - 1)

This style comes much closer to the mathematical definition and is typically more readable, as it helps to avoid nested if ... then ... else ... constructs.

Pattern matching can not only be used for numeric values but for any other data types. We'll see some more examples shortly.

Algebraic Data Types

Haskell supports user-defined data types by making use of a well thought out concept. Let's start with a simple example:

data Status = Green | Yellow | Red

This declares a data type Status which has exactly three different instances. For each instance a data constructor is defined that allows creating a new instance of the data type.

Each of those data constructors is a function (in this simple case a constant) that returns a Status instance.

The type Status is a so-called sum type as it represents the set defined by the sum of all three instances Green, Yellow, Red. In Java this corresponds to an enumeration.

Let's assume we have to create a converter that maps our Status values to Severity values representing severity levels in some other system. This converter can be written using the pattern matching syntax that we already have seen above:

-- another sum type representing severity:
data Severity = Low | Middle | High deriving (Eq, Show)

severity :: Status -> Severity
severity Green  = Low
severity Yellow = Middle
severity Red    = High

The compiler will tell us if we did not cover all instances of the Status type (when the -fwarn-incomplete-patterns compiler flag is enabled).
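One way to enable this check for a single module is a compiler pragma at the top of the file; a minimal sketch (newer GHC versions also accept the -Wincomplete-patterns spelling):

-- enable the incomplete-pattern warning for this module only
{-# OPTIONS_GHC -fwarn-incomplete-patterns #-}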

Now we look at data types that combine multiple different elements, like pairs, n-tuples, etc. Let's start with a PairStatusSeverity type that combines two different elements:

data PairStatusSeverity = P Status Severity

This can be understood as: the data type PairStatusSeverity can be constructed by a data constructor P that takes a value of type Status and a value of type Severity and returns a PairStatusSeverity instance.

So for example P Green High returns a PairStatusSeverity instance (the data constructor P has the signature P :: Status -> Severity -> PairStatusSeverity).

The type PairStatusSeverity can be interpreted as the set of all possible ordered pairs of Status and Severity values, that is, the Cartesian product of Status and Severity.

That's why such a data type is called product type.
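A small sketch of working with this product type (the helper pairSeverity is made up for this example): values are built with the data constructor P and taken apart again with pattern matching.

aPair :: PairStatusSeverity
aPair = P Green High

-- pattern matching extracts the components of the product type
pairSeverity :: PairStatusSeverity -> Severity
pairSeverity (P _ sev) = sev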

Haskell allows you to create arbitrary data types by combining sum types and product types. The complete range of data types that can be constructed in this way is called algebraic data types or ADT in short.
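As a hypothetical illustration of such a combination (the Shape type is not part of this article's running example), here is a sum type whose alternatives are themselves product types:

-- a sum of two products: a circle (radius) or a rectangle (width, height)
data Shape
  = Circle Double
  | Rectangle Double Double

area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h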

Using algebraic data types has several advantages:

  • Pattern matching can be used to analyze any concrete instance and select different behaviour based on the input data; as in the example that maps Status to Severity, there is no need for if..then..else.. constructs.
  • The compiler can detect incomplete pattern matches or other flaws.
  • The compiler can derive a lot of complex functionality automatically for ADTs, as they are constructed in such a regular way.

We will cover the interesting combination of ADTs and pattern matching in the following sections.

Polymorphic Data Types

Forming pairs or, more generally, n-tuples is a very common task in programming. Therefore it would be inconvenient and repetitive if we were forced to create new Pair or Tuple types for each concrete usage. Consider the following example:

data PairStatusSeverity = P Status Severity

data PairStatusString   = P' Status String

data PairSeverityStatus = P'' Severity Status

Luckily, data type declarations allow the use of type variables to avoid this kind of cluttered code. So we can define a generic data type Pair that allows us to freely combine different kinds of arguments:

-- a simple polymorphic type
data Pair a b = P a b

This can be understood as: the data type Pair uses two elements of (potentially) different types a and b; the data constructor P takes a value of type a and a value of type b and returns a Pair a b instance (the data constructor P has the signature P :: a -> b -> Pair a b). The type Pair can now be used to create many different concrete data types; it is thus called a polymorphic data type. As the polymorphism is defined by type variables, i.e. parameters to the type declaration, this mechanism is called parametric polymorphism.
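A quick sketch of how the same polymorphic Pair type can be instantiated with different concrete types (the value names are made up for this example):

pair1 :: Pair Status Severity
pair1 = P Green Low

pair2 :: Pair Status String
pair2 = P Red "emergency shutdown"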

As pairs and n-tuples are so frequently used, the Haskell language designers have added some syntactic sugar to work effortlessly with them.

So you can simply write tuples like this:

tuple :: (Status, Severity, String)
tuple = (Green, Low, "All green")
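For 2-tuples the Prelude also provides the accessor functions fst and snd; a minimal sketch (the value names are made up):

aStatusPair :: (Status, Severity)
aStatusPair = (Yellow, Middle)

itsSeverity :: Severity
itsSeverity = snd aStatusPair   -- Middle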

Lists

Another very useful polymorphic type is the List.

A list can either be the empty list (denoted by the data constructor []) or some element of a data type a followed by a list with elements of type a, denoted by [a].

This intuition is reflected in the following data type definition:

data [a] = [] | a : [a]

The cons operator (:) (which is an infix operator like (.) from the previous section) is declared as a data constructor to construct a list from a single element of type a and a list of type [a].

So a list containing only a single element 1 is constructed by:

1 : []

A list containing the three numbers 1, 2, 3 is constructed like this:

1 : 2 : 3 : []

Luckily the Haskell language designers have been so kind to offer some syntactic sugar for this. So the first list can simply be written as [1] and the second as [1,2,3].

Polymorphic type expressions describe families of types. For example, (forall a)[a] is the family of types consisting of, for every type a, the type of lists of a. Lists of integers (e.g. [1,2,3]), lists of characters (['a','b','c']), even lists of lists of integers, etc., are all members of this family.

Functions that work on lists can use pattern matching to select behaviour for the [] and the a:[a] case.

Take for instance the definition of the function length that computes the length of a list:

length :: [a] -> Integer
length []     =  0
length (x:xs) =  1 + length xs

We can read these equations as: The length of the empty list is 0, and the length of a list whose first element is x and remainder is xs is 1 plus the length of xs.
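Expanding the equations by hand makes the recursion visible (a sketch of the evaluation steps):

-- length [1,2,3]
--   == 1 + length [2,3]
--   == 1 + (1 + length [3])
--   == 1 + (1 + (1 + length []))
--   == 1 + (1 + (1 + 0))
--   == 3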

In our next example we want to work with a list of some random integers:

someNumbers :: [Integer]
someNumbers = [49,64,97,54,19,90,934,22,215,6,68,325,720,8082,1,33,31]

Now we want to select all even or all odd numbers from this list. We are looking for a function filter that takes two arguments: first a predicate function that will be used to check each element and second the actual list of elements. The function will return a list with all matching elements. And of course our solution should work not only for Integers but for any other types as well. Here is the type signature of such a filter function:

filter :: (a -> Bool) -> [a] -> [a]

In the implementation we will use pattern matching to provide different behaviour for the [] and the (x:xs) case:

filter :: (a -> Bool) -> [a] -> [a]
filter pred []     = []
filter pred (x:xs)
  | pred x         = x : filter pred xs
  | otherwise      = filter pred xs

The [] case is obvious. To understand the (x:xs) case we have to know that, in addition to simple matching on the data constructors, we can also use pattern guards to perform additional tests on the input data. In this case we compute pred x. If it evaluates to True, x is a match and will be cons'ed with the result of filter pred xs. If it does not evaluate to True, we do not add x to the result list and thus simply call filter recursively on the remainder of the list.

Now we can use filter to select elements from our sample list:

someEvenNumbers :: [Integer]
someEvenNumbers = filter even someNumbers

-- predicates may also be lambda expressions
someOddNumbers :: [Integer]
someOddNumbers = filter (\n -> n `rem` 2 /= 0) someNumbers  

Of course we don't have to invent functions like filter on our own but can rely on the extensive set of predefined functions working on lists in the Haskell base library.

Arithmetic sequences

There is a nice feature that often comes in handy when dealing with lists of numbers. It's called arithmetic sequences and allows you to define lists of numbers in a concise range notation.
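A small sketch of the notation (the value names are made up for this example): you give a start value, optionally a second value to fix the step, and an end value.

upToTen :: [Integer]
upToTen = [1..10]            -- [1,2,3,4,5,6,7,8,9,10]

evenUpToTwenty :: [Integer]
evenUpToTwenty = [2,4..20]   -- [2,4,6,8,10,12,14,16,18,20]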

