Leaning toothpick syndrome
In computer programming, leaning toothpick syndrome (LTS) is the situation in which a quoted expression becomes unreadable because it contains a large number of escape characters, usually backslashes ("\"), to avoid delimiter collision.

The official Perl documentation introduced the term to wider usage; there, the phrase is used to describe regular expressions that match Unix-style paths in which the elements are separated by forward slashes.

LTS appears in many programming languages and in many situations, including in patterns that match Uniform Resource Identifiers (URIs) and in programs that output quoted text. Many quines fall into the latter category.

Pattern example

Consider the following Perl regular expression, intended to match URIs that identify files under the pub directory of an FTP site:


m/ftp:\/\/[^\/]*\/pub\//


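Every forward slash in the pattern has to be escaped, because the slash is also the delimiter of the expression. Perl solves this problem by allowing many other characters to serve as delimiters for a regular expression. For example, the following three forms are equivalent to the expression given above: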

m{ftp://[^/]*/pub/}
m#ftp://[^/]*/pub/#
m!ftp://[^/]*/pub/!
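
For comparison, here is a minimal Python sketch of the same idea. Python regular expressions are written as ordinary string literals rather than between slashes, so the forward slashes do not collide with any delimiter. The sample URIs are made up for illustration and are not part of the original example.

```python
import re

# Escaped form, transcribed from the Perl m/.../ pattern above.
escaped = re.compile(r"ftp:\/\/[^\/]*\/pub\/")
# Unescaped form: the pattern is not slash-delimited, so no escaping is needed.
plain = re.compile(r"ftp://[^/]*/pub/")

# Hypothetical sample URIs, chosen only for illustration.
samples = [
    "ftp://ftp.example.org/pub/readme.txt",
    "ftp://ftp.example.org/private/readme.txt",
    "http://www.example.org/pub/index.html",
]

# Record, for each URI, whether each of the two patterns matches it.
results = [(bool(escaped.search(s)), bool(plain.search(s))) for s in samples]
for uri, (with_escapes, without_escapes) in zip(samples, results):
    print(uri, with_escapes, without_escapes)
```

Both patterns accept and reject the same inputs; only the readability differs.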

Quoted text example

A Perl program to print an HTML link tag, where the URL and link text are stored in variables $url and $text respectively, might look like this. Notice the use of backslashes to escape the quoted double-quote characters:

print "<a href=\"$url\">$text</a>";

Using single quotes to delimit the string is not feasible, as Perl does not expand variables inside single-quoted strings. The code below, for example, would not work as intended:

print '<a href="$url">$text</a>'

Using the printf function is a viable solution in many languages (Perl, C, PHP):

printf('<a href="%s">%s</a>', $url, $text);

The qq operator in Perl allows for any delimiter:

print qq{<a href="$url">$text</a>};
print qq|<a href="$url">$text</a>|;
print qq(<a href="$url">$text</a>);
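
The same trade-offs can be sketched in Python, which offers two quote characters as delimiters as well as format strings. The values of `url` and `text` are hypothetical, and HTML escaping of the values is deliberately left out to keep the sketch short.

```python
# Hypothetical values, used only for illustration.
url = "http://www.example.org/"
text = "Example"

# Double-quoted literal: every inner double quote needs a backslash.
escaped = "<a href=\"" + url + "\">" + text + "</a>"

# Single-quoted literal: the inner double quotes no longer collide
# with the delimiter, so no backslashes are needed.
single = '<a href="' + url + '">' + text + '</a>'

# Format string, comparable to the printf approach.
formatted = '<a href="%s">%s</a>' % (url, text)

# f-string in a single-quoted literal: interpolation without escaping.
interpolated = f'<a href="{url}">{text}</a>'

print(interpolated)
```

All four expressions build the same string; the last three avoid the backslashes entirely.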


Here documents are especially well suited for multi-line strings; however, a traditional here document cannot be indented to match the surrounding code, because the leading whitespace becomes part of the string. This example shows the Perl syntax:

print <<HERE_IT_ENDS;
<a href="$url">$text</a>
HERE_IT_ENDS
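
A rough Python counterpart is the triple-quoted string. The sketch below assumes that stripping the common indentation with the standard library's `textwrap.dedent` is acceptable for the text being produced; `render_link` is a hypothetical helper name, not something from the original example.

```python
import textwrap


def render_link(url, text):
    """Return a small HTML fragment built from an indented multi-line literal."""
    # The literal is indented to match the surrounding code; dedent()
    # removes the common leading whitespace afterwards. The backslash
    # after the opening quotes suppresses the initial newline.
    template = textwrap.dedent("""\
        <p>
          <a href="{url}">{text}</a>
        </p>
        """)
    return template.format(url=url, text=text)


print(render_link("http://www.example.org/", "Example"))
```

The inner double quotes need no escaping, and the indentation inside the function does not end up in the output.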

C#

The C# programming language handles LTS through the '@' symbol, placed at the start of a string literal before the opening quotation mark, e.g.


string filePath = @"C:\Foo\Bar.txt";


rather than otherwise requiring:


string filePath = "C:\\Foo\\Bar.txt";

Python

Python has a similar construct using the 'r' prefix:


filePath = r"C:\Foo\Bar.txt"
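
A short sketch of what the raw-string prefix buys, using the path from the example above. The regular expression case is included because it needs two levels of escaping (one for the string literal, one for the pattern), which is where the toothpicks multiply.

```python
import re

# Ordinary literal: each backslash has to be doubled.
escaped_path = "C:\\Foo\\Bar.txt"
# Raw literal: backslashes are taken literally.
raw_path = r"C:\Foo\Bar.txt"

# A pattern matching one literal backslash takes four backslashes
# in an ordinary literal...
sep_escaped = re.compile("\\\\")
# ...and two in a raw literal.
sep_raw = re.compile(r"\\")

print(raw_path == escaped_path)
print(sep_raw.split(raw_path))
```

One caveat: a raw string literal cannot end in an odd number of backslashes, so a trailing path separator still has to be written another way.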

Scala

Scala allows usage of triple quotes in order to prevent escaping confusion:


val filePath = """C:\Foo\Bar.txt"""
val pubPattern = """ftp://[^/]*/pub/""".r


The triple quotes also allow for multi-line strings, as shown here:


val text = """First line,
second line."""

Sed

Sed regular expressions, particularly those used with the 's' (substitute) command, are in a similar situation to Perl's, and indeed sed is a predecessor of Perl. The default delimiter is '/', but other delimiters can also be used: the usual form is "s/regexp/replacement/", but "s,regexp,replacement," is also valid. For example, to match the URI prefix ending in the "pub" directory (as in the Perl example) and replace it with "foo", the default form (escaping the slashes) is:

s/ftp:\/\/[^\/]*\/pub\//foo/

Using a comma (',') as the delimiter instead yields:

s,ftp://[^/]*/pub/,foo,
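
The choice of delimiter can also be automated. The following Python sketch builds a sed substitution command and prefers a delimiter that appears in neither the pattern nor the replacement. The function `sed_substitute` and its list of candidate delimiters are hypothetical, and the fallback escaping is deliberately naive.

```python
def sed_substitute(pattern, replacement, candidates="/,#|!"):
    """Build a sed 's' command, preferring a delimiter that needs no escaping."""
    for delim in candidates:
        if delim not in pattern and delim not in replacement:
            return f"s{delim}{pattern}{delim}{replacement}{delim}"

    # Every candidate collides: fall back to '/' and escape each slash.
    # Naive assumption: the inputs contain no slashes that are already escaped.
    def escape(s):
        return s.replace("/", "\\/")

    return f"s/{escape(pattern)}/{escape(replacement)}/"


print(sed_substitute("ftp://[^/]*/pub/", "foo"))
```

With the default candidates, the slash is rejected because it occurs in the pattern, and the comma is chosen instead.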

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 