Grep
grep is a command-line text-search utility originally written for Unix. The name comes from the ed command g/re/p (global / regular expression / print). The grep command searches files or standard input globally for lines matching a given regular expression, and prints the lines to the program's standard output.

History

Grep was created by Ken Thompson as a standalone application adapted from the regular expression parser he had written for ed (which he also created). Its official creation date is given as March 3, 1973, in the Manual for Unix Version 4.

Usage

This is an example of a common grep usage:


grep apple fruitlist.txt


In this case, grep prints all lines containing apple from the file fruitlist.txt, regardless of word boundaries; therefore lines containing pineapple or apples are also printed. The grep command is case sensitive by default, so this example's output does not include lines containing Apple (with a capital A) unless they also contain apple.

To search all .txt files in a directory for apple in a shell that supports globbing, use an asterisk in place of the file name:


grep apple *.txt
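
The glob form can be tried with a minimal sketch like the one below. The file names a.txt and b.txt and their contents are hypothetical sample data, and the variable name glob_out is only illustrative. It shows that the shell, not grep, expands *.txt, and that grep prefixes each match with its file name when given more than one file.

```shell
# Create a throwaway directory so the glob only sees the sample files.
dir=$(mktemp -d)

# Two hypothetical sample files (names and contents are illustrative).
printf 'apple\nbanana\n' > "$dir/a.txt"
printf 'pineapple\npear\n' > "$dir/b.txt"

# The shell expands *.txt to "a.txt b.txt" before grep starts.
# Running inside $(...) keeps the cd from affecting the calling shell.
glob_out=$(cd "$dir" && grep apple *.txt)
printf '%s\n' "$glob_out"
```

With these sample files the matches are reported as a.txt:apple and b.txt:pineapple.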


Regular expressions can be used to match more complicated queries. The following prints all lines in the file that begin with the letter a, followed by any one character, then the letters ple.


grep '^a.ple' fruitlist.txt
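
As a rough illustration of what the anchor and the dot match, the following sketch builds a small sample file and runs the same pattern against it. The file contents and the variable name regex_out are assumptions made for the example. The pattern is single-quoted so that the shell passes it to grep unchanged.

```shell
dir=$(mktemp -d)

# Hypothetical sample data: "ample" and "a ple" show what "." can match.
printf 'apple\nample\na ple\npineapple\nApple\n' > "$dir/fruitlist.txt"

# ^ anchors the match at the start of the line; . matches any one character.
regex_out=$(grep '^a.ple' "$dir/fruitlist.txt")
printf '%s\n' "$regex_out"
```

Here apple, ample, and a ple are printed. pineapple is skipped because it does not begin with a, and Apple is skipped because the match is case sensitive.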


As noted above, the term "grep" derives from a usage in ed and related text editors. Before grep existed as a separate command, the same effect might have been achieved by doing:


ed fruitlist.txt
g/^a.ple/p
q


where the second line is the command given to ed to print the relevant lines, and the third line is the command to exit from ed.

Like most Unix commands, grep accepts options in the form of command-line arguments, to change many of its behaviors. For example:


grep -i apple fruitlist.txt


This prints all lines containing apple regardless of capitalization. The -i option tells grep to ignore case when matching.

To print all lines containing apple as a word (pineapple and apples will not match):


grep -w apple fruitlist.txt


However, -w treats any character other than a letter, digit, or underscore as a word boundary, so a line in which apple is adjacent to a hyphen (-) also matches:


cat fruitlist.txt
apple
apples
pineapple
apple-
apple-fruit
fruit-apple

grep -w apple fruitlist.txt
apple
apple-
apple-fruit
fruit-apple
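
The word-boundary rule can be checked with a small sketch such as the following. The sample lines apple_pie and apple2 and the variable name word_out are hypothetical additions for illustration. The behavior shown is that of common implementations such as GNU and BSD grep, where letters, digits, and the underscore count as word characters.

```shell
dir=$(mktemp -d)

# Hypothetical sample data covering several characters after "apple".
printf 'apple\napples\napple-fruit\napple_pie\napple2\n' > "$dir/fruitlist.txt"

# -w requires the match to be bounded by non-word characters or line ends.
# A hyphen is a non-word character; "s", "_" and "2" are word characters.
word_out=$(grep -w apple "$dir/fruitlist.txt")
printf '%s\n' "$word_out"
```

Only apple and apple-fruit are printed. apples, apple_pie, and apple2 are rejected because the character after apple is a word character.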


To print only the lines that consist of exactly apple and nothing else, use the -x (line-regexp) option instead of -w (word-regexp):


cat fruitlist.txt
apple
apples
pineapple
apple-
apple-fruit
fruit-apple

grep -x apple fruitlist.txt
apple


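The -v (lower-case v) option inverts the match and prints all lines that do not contain apple. In this example, fruitlist.txt is assumed to also contain the lines banana, pear, peach, and orange: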

grep -v apple fruitlist.txt
banana
pear
peach
orange
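
Options can be combined. The sketch below, using hypothetical sample data and illustrative variable names, contrasts -v alone with -v used together with -i.

```shell
dir=$(mktemp -d)

# Hypothetical sample data with one capitalized entry.
printf 'apple\nbanana\npineapple\npear\nApple\n' > "$dir/fruitlist.txt"

# -v alone: case-sensitive, so the line "Apple" does not contain "apple".
invert_out=$(grep -v apple "$dir/fruitlist.txt")
printf '%s\n' "$invert_out"

# -i and -v together: "Apple" now counts as a match and is excluded too.
invert_nocase_out=$(grep -i -v apple "$dir/fruitlist.txt")
printf '%s\n' "$invert_nocase_out"
```

The first command prints banana, pear, and Apple. The second prints only banana and pear.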

Variations

There are many implementations and derivatives of grep, available for a wide range of operating systems. Early variants of grep included egrep and fgrep. egrep applies an extended regular expression syntax that was added to Unix after Ken Thompson's original regular expression implementation. fgrep searches for any of a list of fixed strings using the Aho–Corasick string matching algorithm. These variants persist in most modern grep implementations as command-line switches, standardized as -E and -F in POSIX. In such combined implementations, grep may also behave differently depending on the name by which it is invoked, allowing fgrep, egrep, and grep to be links to the same program.
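
The difference between the three matching modes can be sketched as follows. The sample file contents and the variable names are assumptions made for the example. Only the POSIX switches -E and -F are used.

```shell
dir=$(mktemp -d)

# Hypothetical sample data, including a line with a literal dot.
printf 'apple\nample\na.ple\npear\n' > "$dir/fruitlist.txt"

# Default (basic) syntax: "." matches any single character.
bre_out=$(grep 'a.ple' "$dir/fruitlist.txt")

# -E (egrep behavior): "|" means alternation in extended syntax.
ere_out=$(grep -E 'apple|pear' "$dir/fruitlist.txt")

# -F (fgrep behavior): the pattern is a fixed string, so "." is literal.
fixed_out=$(grep -F 'a.ple' "$dir/fruitlist.txt")

printf 'basic:\n%s\nextended:\n%s\nfixed:\n%s\n' "$bre_out" "$ere_out" "$fixed_out"
```

With this data the basic pattern matches apple, ample, and a.ple, the extended pattern matches apple and pear, and the fixed string matches only a.ple.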

Other commands contain the word "grep" to indicate that they search (usually for regular expression matches). The pgrep utility, for instance, displays the processes whose names match a given regular expression.

In Perl, grep is the name of the built-in function that finds elements in a list that satisfy a certain property. This higher-order function is typically named filter in functional programming languages.

pcregrep is an implementation of grep that uses Perl regular expression syntax.

Ports of grep (within Cygwin and GnuWin32, for example) also run under Microsoft Windows. Some versions of Windows feature the similar qgrep command.

Usage as a verb

In December 2003, the Oxford English Dictionary Online added draft entries for "grep" as both a noun and a verb.

A common verb usage is the phrase "You can't grep dead trees", meaning one can more easily search through digital media, using tools such as grep, than one could with a hard copy (i.e., one made from dead trees, paper). Compare with google.

See also

  • Boyer–Moore string search algorithm
  • List of Unix utilities
  • vgrep, or "visual grep"
  • find


External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 