Software Development Resources
Create account | Log in | Log in with OpenID | Help

Lisp

From DocForge

Lisp is a family of computer programming languages with a long history and a distinctive fully-parenthesized syntax. Originally specified in 1958, Lisp is the second-oldest high-level programming language in widespread use today; only Fortran is older. Lisp has changed a great deal since its early days, and a number of dialects have existed over its history. Today, the most widely-known general-purpose Lisp dialects are Common Lisp and Scheme.

Lisp was originally created as a practical mathematical notation for computer programs, based on Alonzo Church's lambda calculus. It quickly became the favored programming language for artificial intelligence research. As one of the earliest programming languages, Lisp pioneered many ideas in computer science, including tree data structures, automatic storage management, dynamic typing, object-oriented programming, and the self-hosting compiler.

The name Lisp derives from "List Processing". Linked lists are one of Lisp languages' primary data structures, and Lisp source code is itself made up of lists. As a result, Lisp programs can manipulate source code as a data structure, giving rise to macro systems that allow programmers to create new syntax or even new "little languages" embedded in Lisp.


Contents

[edit] Major Modern Dialects

The two major dialects of Lisp used for general-purpose programming today are Common Lisp and Scheme. These languages represent significantly different design choices.

Common Lisp, descended mainly from ZetaLisp and Franz Lisp, is an expanded superset of earlier Lisp dialects, with a large language standard including many built-in data types and syntactic forms, as well as an object system. Scheme is a more minimalist design, with a much smaller set of standard features but with certain implementation features (such as tail-call optimization and full continuations) not necessarily found in Common Lisp.

In addition, Lisp dialects are used as as scripting languages in a number of applications, with the most well-known being Emacs Lisp in the Emacs editor and Autolisp in AutoCAD.

[edit] Language Innovations

Lisp was the first homoiconic programming language: the primary representation of program code is the same type of list structure that is also used for the main data structures. As a result, Lisp functions can be manipulated, altered or even created within a Lisp program without extensive parsing or manipulation of binary machine code. This is generally considered one of the primary advantages of the language with regards to its expressiveness, and makes the language amenable to metacircular evaluation.

The now-ubiquitous if-then-else structure, now taken for granted as an essential element of most programming languages, was invented by McCarthy for use in Lisp, where it saw its first appearance in a more general form (the cond structure). It was inherited by Algol, which popularized it.

Lisp deeply influenced Alan Kay, the leader of the research on Smalltalk, and then in turn Lisp was influenced by Smalltalk, by adopting object-oriented programming features (classes, instances, etc.) in the late 1970s.

Largely because of its resource requirements with respect to early computing hardware (including early microprocessors), Lisp did not become as popular outside of the AI community as Fortran and the ALGOL-descended C language. Newer languages such as Java and Python have incorporated some limited versions of some of the features of Lisp, but are necessarily unable to bring the coherence and synergy of the full concepts found in Lisp. Because of its suitability to ill-defined, complex, and dynamic applications, Lisp is currently enjoying some resurgence of popular interest.

[edit] Syntax and Semantics

Note: Examples here are written in Common Lisp (though most are also valid Scheme).

Lisp is an expression-oriented language. Unlike most other languages, no distinction is made between expressions and statements; all code and data are written as expressions. When an expression is evaluated, it produces a value (or list of values), which then can be embedded into other expressions.

McCarthy's 1958 paper introduced two types of syntax: S-expressions (Symbolic Expressions, also called "sexps"), which mirror the internal representation of code and data; and M-expressions (Meta Expressions), which express functions of S-expressions. M-expressions never found favour, and almost all Lisps today use S-expressions to manipulate both code and data.

The use of parentheses is Lisp's most immediately obvious difference from other programming language families. As a result, students have long given Lisp nicknames such as "Lots of Irritating Superfluous Parentheses". However, the S-expression syntax is also responsible for much of Lisp's power; the syntax is extremely regular, which facilitates manipulation by computer. However, the syntax of Lisp is not limited to traditional parentheses notation. It can be extended to include alternative notations. XMLisp, for instance, is a Common Lisp extension that employs the metaobject-protocol to integrate S-expressions with the XML.

The reliance on expressions gives the language great flexibility. Because Lisp functions are themselves written as lists, they can be processed exactly like data. This allows easy writing of programs which manipulate other programs (metaprogramming). Many Lisp dialects exploit this feature using macro systems, which enables extension of the language almost without limit.

A Lisp list is written with its elements separated by whitespace, and surrounded by parentheses. For example, (1 2 foo) is a list whose elements are three atoms, the values 1, 2, and foo. These values are implicitly typed: they are respectively two integers and a Lisp-specific data type called a "symbol", and do not have to be declared as such.

The empty list () is also represented as the special atom nil. This is the only entity in Lisp which is both an atom and a list.

Expressions are written as lists, using prefix notation. The first element in the list is the name of a form, i.e., a function, operator, macro, or "special operator" (see below.) The remainder of the list are the arguments. For example, the function list returns its arguments as a list, so the expression

(list '1 '2 'foo)

evaluates to the list (1 2 foo). The "quote" before the arguments in the preceding example is a "special operator" which prevents the quoted arguments being evaluated (not strictly necessary for the numbers, since 1 evaluates to 1, etc). Any unquoted expressions are recursively evaluated before the enclosing expression is evaluated. For example,

(list 1 2 (list 3 4))

evaluates to the list (1 2 (3 4)). Note that the third argument is a list; lists can be nested.

Arithmetic operators are treated similarly. The expression

(+ 1 2 3 4)

evaluates to 10. The equivalent under infix notation would be "1 + 2 + 3 + 4". Arithmetic operators in Lisp are variadic (or n-ary), able to take any number of arguments.

"Special operators" (sometimes called "special forms" by older users) provide Lisp's control structure. For example, the special operator if takes three arguments. If the first argument is non-nil, it evaluates to the second argument; otherwise, it evaluates to the third argument. Thus, the expression

(if nil
   (list 1 2 "foo")
   (list 3 4 "bar"))

evaluates to (3 4 "bar"). Of course, this would be more useful if a non-trivial expression had been substituted in place of nil.

[edit] Lambda Expressions

Another special operator, lambda, is used to bind variables to values which are then evaluated within an expression. This operator is also used to create functions: the arguments to lambda are a list of arguments, and the expression or expressions to which the function evaluates (the returned value is the value of the last expression that is evaluated). The expression

(lambda (arg) (+ arg 1))

is an expression which, when applied, takes one argument, bound to arg and returns the number one greater than that argument. Lambda expressions are treated no differently from named functions; they are invoked the same way. Therefore, the expression

((lambda (arg) (+ arg 1)) 5)

evaluates to 6.

[edit] Atoms

In the original LISP there were two fundamental data types: atoms and lists. A list was a finite ordered sequence of elements, where each element is in itself either an atom or a list, and an atom was a number or a symbol. A symbol was essentially a unique named item, written as an Alphanumeric string in source code, and used either as a variable name or as a data item in symbolic processing. For example, the list (FOO (BAR 1) 2) contains three elements: the symbol FOO, the list (BAR 1), and the number 2.

The essential difference between atoms and lists was that atoms were immutable and unique. Two atoms that appeared in different places in source code but were written in the exact same way represented the same object, whereas each list was a separate object that could be altered independently of other lists and could be distinguished from other lists by comparison operators.

As more data types were introduced in later Lisp dialects, and programming styles evolved, the concept of an atom lost importance. Many dialects still retained the predicate atom for legacy compatibility, defining it as true for anything that is not a cons cell (ie. a list or a partial list).

[edit] Conses and Lists

A Lisp list is a singly-linked list. Each cell of this list is called a cons (in Scheme, a pair), and is composed of two pointers, called the car and cdr respectively. These are equivalent to the data and next fields discussed in the article linked list.

Of the many data structures that can be built out of cons cells, one of the most basic is called a proper list. A proper list is either the special nil (empty list) symbol, or a cons in which the car points to a datum (which may be another cons structure, such as a list), and the cdr points to another proper list.

If a given cons is taken to be the head of a linked list, then its car points to the first element of the list, and its cdr points to the rest of the list. For this reason, the <ttcar</tt> and cdr functions are also called first and rest when referring to conses which are part of a linked list (rather than, say, a tree).

Thus, a Lisp list is not an atomic object, as an instance of a container class in C++ or Java would be. A list is nothing more than an aggregate of linked conses. A variable which refers to a given list is simply a pointer to the first cons in the list. Traversal of a list can be done by "cdring down" the list; that is, taking successive cdrs to visit each cons of the list; or by using any of a number of higher-order functions to map a function over a list.

Because conses and lists are so universal in Lisp systems, it is a common misconception that they are Lisp's only data structures. In fact, all but the most simplistic Lisps have other data structures – such as vectors (arrays), hash tables, structures, and so forth.

[edit] S-expressions Represent Lists

Parenthesized S-expressions represent linked list structure. There are several ways to represent the same list as an S-expression. A cons can be written in dotted-pair notation as (a . b), where a is the car and b</tt the cdr. A longer proper list might be written <tt>(a . (b . (c . (d . nil)))) in dotted-pair notation. This is conventionally abbreviated as (a b c d) in list notation. An improper list may be written in a combination of the two – as (a b c . d) for the list of three conses whose last cdr is d (i.e., the list (a . (b . (c . d))) in fully-specified form).

[edit] List-Processing Procedures

Lisp provides many built-in procedures for accessing and controlling lists. Lists can be created directly with the list procedure, which takes any number of arguments, and returns the list of these arguments.

(list 1 2 'a 3)
;Output: (1 2 a 3)
(list 1 '(2 3) 4)
;Output: (1 (2 3) 4)

Because of the way that lists are constructed from cons pairs, the cons procedure can be used to add an element to the front of a list. Note that the cons procedure is asymmetric in how it handles list arguments, because of how lists are constructed.

(cons 1 '(2 3))
;Output: (1 2 3)
(cons '(1 2) '(3 4))
;Output: ((1 2) 3 4)

The append procedure appends two (or more) lists to one another. Because Lisp lists are linked lists, appending two lists has asymptotic time complexity <math>O(n)</math>.

(append '(1 2) '(3 4))
;Output: (1 2 3 4)
(append '(1 2 3) '() '(a) '(5 6))
;Output: (1 2 3 a 5 6)

[edit] Shared Structure

Lisp lists, being simple linked lists, can share structure with one another. That is to say, two lists can have the same tail, or final sequence of conses. For instance, after the execution of the following Common Lisp code:

(setq foo (list 'a 'b 'c))
(setq bar (cons 'x (cdr foo)))

the lists foo and bar are (a b c) and (x b c) respectively. However, the tail (b c) is the same structure in both lists. It is not a copy; the cons cells pointing to b and c are in the same memory locations for both lists.

Sharing structure rather than copying can give a dramatic performance improvement. However, this technique can interact in undesired ways with functions that alter lists passed to them as arguments. Altering one list, such as by replacing the c with a goose, will affect the other:

(setf (third foo) 'goose)

This changes foo to (a b goose), but also changes bar to (x b goose) – a possibly unexpected result. This can be a source of bugs, and functions which alter their arguments are documented as destructive for this very reason.

Aficionados of functional programming avoid destructive functions. In the Scheme dialect, which favors the functional style, the names of destructive functions are marked with a cautionary exclamation point, or "bang" — such as set-car! (read set car bang), which replaces the car of a cons. In the Common Lisp dialect, destructive functions are commonplace; the equivalent of set-car! is named rplaca for "replace car." This function is rarely seen however as Common Lisp includes a special facility, setf, to make it easier to define and use destructive functions. A frequent style in Common Lisp is to write code functionally (without destructive calls) when prototyping, then to add destructive calls as an optimization where it is safe to do so.

[edit] Self-evaluating forms and quoting

Lisp evaluates expressions which are entered by the user. Symbols and lists evaluate to some other (usually, simpler) expression – for instance, a symbol evaluates to the value of the variable it names; (+ 2 3) evaluates to 5. However, most other forms evaluate to themselves: if you enter 5 into Lisp, you just get back 5.

Any expression can also be marked to prevent it from being evaluated (as is necessary for symbols and lists). This is the role of the quote special operator, or its abbreviation ' (a single quotation mark). For instance, usually if you enter the symbol foo you will get back the value of the corresponding variable (or an error, if there is no such variable). If you wish to refer to the literal symbol, you enter (quote foo) or, usually, 'foo.

Both Common Lisp and Scheme also support the backquote operator (often called quasiquote by Schemers), entered with the ` character. This is almost the same as the plain quote, except it allows variables to be interpolated into a quoted list with the comma and comma-at operators. If the variable snue has the value (bar baz) then `(foo ,snue) evaluates to (foo (bar baz)), while `(foo ,@snue) evaluates to (foo bar baz). The backquote is most frequently used in defining macro expansions.

Self-evaluating forms and quoted forms are Lisp's equivalent of literals. However, they are not necessarily constants. In some Lisp dialects it is possible to modify the values of literals in program code. For instance, if a quoted form is used in the body of a function, and is changed as a side-effect, that function's behavior may differ on subsequent iterations. This is usually a bug. When behavior like this is intentional, using a closure is the explicit way to do it.

Lisp's formalization of quotation has been noted by Douglas Hofstadter (in Gödel, Escher, Bach) and others as an example of the philosophical idea of self-reference.

[edit] Scope and Closure

The modern Lisp family splits over the use of dynamic or static (aka lexical) scope. Scheme and Common Lisp make use of static scoping by default, while the more primitive Lisp systems used as embedded languages in Emacs and AutoCAD use dynamic scoping.

[edit] List Structure of Program Code

A fundamental distinction between Lisp and other languages is that in Lisp, program code is not simply text. Parenthesized S-expressions, as depicted above, are the printed representation of Lisp code, but as soon as these are entered into a Lisp system they are translated by the parser (called the read function) into linked list and tree structures in memory.

Lisp macros operate on these structures. Because Lisp code has the same structure as lists, macros can be built with any of the list-processing functions in the language. In short, anything that Lisp can do to a data structure, Lisp macros can do to code. In contrast, in most other languages the parser's output is purely internal to the language implementation and cannot be manipulated by the programmer. Macros in C, for instance, operate on the level of the preprocessor, before the parser is invoked, and cannot re-structure the program code in the way Lisp macros can.

In simplistic Lisp implementations, this list structure is directly interpreted to run the program; a function is literally a piece of list structure which is traversed by the interpreter in executing it. However, most actual Lisp systems (including all conforming Common Lisp systems) also include a compiler. The compiler translates list structure into machine code or bytecode for execution.

[edit] Evaluation and the Read-Eval-Print Loop

Lisp languages are frequently used with an interactive command line, which may be combined with an integrated development environment. The user types in expressions at the command line, or directs the IDE to transmit them to the Lisp system. Lisp reads the entered expressions, evaluates them, and prints the result. For this reason, the Lisp command line is called a "read-eval-print loop", or REPL.

The basic operation of the REPL is as follows. This is a simplistic description which omits many elements of a real Lisp, such as quoting and macros.

The read function accepts textual S-expressions as input, and parses them into list structure. For instance, if you type the string (+ 1 2) at the prompt, read translates this into a linked list with three elements – the symbol +, the number 1, and the number 2. It so happens that this list is also a valid piece of Lisp code; that is, it can be evaluated. This is because the car of the list names a function – the addition operation.

The eval function evaluates list structure, returning some other piece of structure as a result. Evaluation does not have to mean interpretation; some Lisp systems compile every expression to native machine code. It is simple, however, to describe evaluation as interpretation: To evaluate a list whose car names a function, eval first evaluates each of the arguments given in its cdr, then applies the function to the arguments. In this case, the function is addition, and applying it to the argument list (1 2) yields the answer 3. This is the result of the evaluation. Evaluation is performed in applicative order.

It is the job of the print function to represent output to the user. For a simple result such as 3 this is trivial. An expression which evaluated to a piece of list structure would require that print traverse the list and print it out as an S-expression.

To implement a Lisp REPL, it is necessary only to implement these three functions and an infinite-loop function. (Naturally, the implementation of eval will be complicated, since it must also implement all special operators like if.) This done, a basic REPL itself is but a single line of code: (loop (print (eval (read)))).

[edit] Control Structures

Lisp originally had very few control structures, but many more were added during the language's evolution. (Lisp's original conditional operator, cond, is the precursor to later if-then-else structures.)

Programmers in the Scheme dialect often express loops using tail recursion. Scheme's commonality in academic computer science has led some students to believe that tail recursion is the only, or the most common, way to write iterations in Lisp; this is incorrect. All frequently-seen Lisp dialects have imperative-style iteration constructs, from Scheme's do loop to Common Lisp's complex loop expressions.

Some Lisp control structures are special operators, equivalent to other languages' syntactic keywords. Expressions using these operators have the same surface appearance as function calls, but differ in that the arguments are not necessarily evaluated -- or, in the case of an iteration expression, may be evaluated more than once.

In contrast to most other major programming languages, Lisp allows the programmer to implement control structures using the language itself. Several control structures are implemented as Lisp macros, and can even be macroexpanded by the programmer who wants to know how they work.

Both Common Lisp and Scheme have operators for non-local control flow. The differences in these operators are some of the deepest differences between the two dialects. Scheme supports re-entrant continuations using the call/cc procedure, which allows a program to save (and later restore) a particular place in execution. Common Lisp does not support re-entrant continuations, but does support several ways of handling escape continuations.

Frequently, the same algorithm can be expressed in Lisp in either an imperative or a functional style. As noted above, Scheme tends to favor the functional style, using tail recursion and continuations to express control flow. However, imperative style is still quite possible. The style preferred by many Common Lisp programmers may seem more familiar to programmers used to structured languages such as C, while that preferred by Schemers more closely resembles pure-functional languages such as Haskell.

Because of Lisp's early heritage in list processing, it has a wide array of higher-order functions relating to iteration over sequences. In many cases where an explicit loop would be needed in other languages (like a for loop in C) in Lisp the same task can be accomplished with a higher-order function. (The same is true of many functional programming languages.)

A good example is a function which in Scheme is called map and in Common Lisp is called mapcar. Given a function and one or more lists, mapcar applies the function successively to the lists' elements in order, collecting the results in a new list:

(mapcar #'+ '(1 2 3 4 5) '(10 20 30 40 50))

This applies the + function to each corresponding pair of list elements, yielding the result (11 22 33 44 55).

[edit] Examples

Here are examples of Common Lisp code. While unlike Lisp programs used in industry, they are similar to Lisp as taught in computer science courses.

The basic "Hello World" program:

 (print "Hello world")
 (print (list 'Hello 'world))

As the reader may have noticed from the above discussion, Lisp syntax lends itself naturally to recursion. Mathematical problems such as the enumeration of recursively-defined sets are simple to express in this notation.

Evaluate a number's factorial:

(defun factorial (n)
  (if (<= n 1)
    1
    (* n (factorial (- n 1)))))

An alternative implementation, often faster than the previous version if the Lisp system has tail recursion optimization:

(defun factorial (n &optional (acc 1))
  (if (<= n 1)
    acc
    (factorial (- n 1) (* acc n))))

Contrast with an iterative version which uses Common Lisp's loop macro:

(defun factorial (n)
  (loop for i from 1 to n
    for fac = 1 then (* fac i)
    finally (return fac)))

The following function reverses a list. (Lisp's built-in reverse function does the same thing.)

(defun -reverse (l &optional acc)
   (if (atom l)
     acc
     (-reverse (cdr l) (cons (car l) acc))))

[edit] Object Systems

Various object systems and models have been built on top of, alongside, or into Lisp, including:

  • ObjectLisp or Object Lisp, favored by Lisp Machines Incorporated
  • Loops (Lisp Object-Oriented Programming System) and the later CommonLoops
  • Flavors, built at MIT, and its descendant New Flavors, which were favored by Symbolics
  • The Common Lisp Object System, CLOS (descended from New Flavors and CommonLoops)
  • KR (short for Knowledge Representation), a constraints-based object system developed to aid the writing of Garnet, a GUI library for Common Lisp
  • SageCLOS, an Object Oriented Interface to AutoLISP invented by Ralph Gimenez

CLOS features multiple inheritance, multiple dispatch ("multimethods"), and a powerful system of "method combinations". In fact, Common Lisp, which includes CLOS, was the first object-oriented language to be officially standardized.

[edit] References

[edit] External links



Additional copyright notice: Some content of this page is a derivative work of a Wikipedia article under the CC-BY-SA License and/or GNU FDL. The original article and author information can be found at http://en.wikipedia.org/wiki/Lisp_%28programming_language%29.

Discuss