Substring
Encyclopedia
A subsequence, substring, prefix or suffix of a string
String (computer science)
In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set or alphabet....

 is a subset of the symbols in a string, where the order of the elements is preserved. In this context, the terms string and sequence
Sequence
In mathematics, a sequence is an ordered list of objects . Like a set, it contains members , and the number of terms is called the length of the sequence. Unlike a set, order matters, and exactly the same elements can appear multiple times at different positions in the sequence...

have the same meaning.

Subsequence

Main article subsequence
Subsequence
In mathematics, a subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements...



A subsequence of a string is a string such that , where . Subsequence is a generalisation of substring, suffix and prefix. Finding the longest string which is equal to a subsequence of two or more strings is known as the longest common subsequence problem
Longest common subsequence problem
The longest common subsequence problem is to find the longest subsequence common to all sequences in a set of sequences . Note that subsequence is different from a substring, see substring vs. subsequence...

.

Example: The string anna is equal to a subsequence of the string banana:

banana
|| ||
an na

Including the empty subsequence, the number of subsequences of a string of length where symbols only occur once, is simply the number of subsets of the symbols in the string, i.e. .

Substring

A substring (or factor) of a string is a string , where and . A substring of a string is a prefix of a suffix of the string, and equivalently a suffix of a prefix. If is a substring of , it is also a subsequence
Subsequence
In mathematics, a subsequence is a sequence that can be derived from another sequence by deleting some elements without changing the order of the remaining elements...

, which is a more general concept. Given a pattern , you can find its occurrences in a string with a string searching algorithm
String searching algorithm
String searching algorithms, sometimes called string matching algorithms, are an important class of string algorithms that try to find a place where one or several strings are found within a larger string or text....

. Finding the longest string which is equal to a substring of two or more strings is known as the longest common substring problem
Longest common substring problem
The longest common substring problem is to find the longest string that is a substring of two or more strings. It should not be confused with the longest common subsequence problem. The longest common substring problem is to find the longest string (or strings) that is a substring (or are...

.

Example: The string ana is equal to substrings (and subsequences) of banana at two different offsets:

banana
|||||
ana||
|||
ana

In the mathematical literature, substrings are also called subwords (in America) or factors (in Europe).

Not including the empty substring, the number of substrings of a string of length where symbols only occur once, is the number of ways to choose two distinct places between symbols to start/end the substring. Including the very beginning and very end of the string, there are such places. So there are non-empty substrings.

Prefix

A prefix of a string is a string , where . A proper prefix of a string is not equal to the string itself (); some sources in addition restrict a proper prefix to be non-empty (). A prefix can be seen as a special case of a substring.

Example: The string ban is equal to a prefix (and substring and subsequence) of the string banana:

banana
|||
ban

The square subset symbol is sometimes used to indicate a prefix, so that denotes that is a prefix of . This defines a binary relation
Binary relation
In mathematics, a binary relation on a set A is a collection of ordered pairs of elements of A. In other words, it is a subset of the Cartesian product A2 = . More generally, a binary relation between two sets A and B is a subset of...

 on strings, called the prefix relation.

In formal language theory, the term prefix of a string is also commonly understood to be the set of all prefixes of a string, with respect to that language. See the article on string functions for more details.

Suffix

A suffix of a string is a string , where . A proper suffix of a string is not equal to the string itself (); again, a more restricted interpretation is that it is also not empty (). A suffix can be seen as a special case of a substring.

Example: The string nana is equal to a suffix (and substring and subsequence) of the string banana:

banana
||||
nana

Superstring

Given a set of strings , a superstring of the set is single string that contains every string in as a substring.
For example, a concatenation of the strings of in any order gives a trivial superstring of . For a more interesting example, let . Then is a superstring of , and is another, shorter superstring of . Generally, we are interested in finding superstrings whose length is small.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK