E-mail address
An email address identifies an email box to which email messages are delivered. An example format of an email address is lewis@example.net, which is read as lewis at example dot net. Many earlier email systems used different address formats.

Overview

Transport of email across the Internet uses the Simple Mail Transfer Protocol (SMTP), which is defined in RFC 5321, with the message format defined in RFC 5322, while mailboxes are most often accessed with the Post Office Protocol (POP) and the Internet Message Access Protocol (IMAP).

Email addresses, such as jsmith@example.org, have two parts. The part before the @ sign is the local-part of the address, often the username of the recipient (jsmith), and the part after the @ sign is a domain name to which the email message will be sent (example.org).
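The two-part structure can be illustrated with a minimal Python sketch. The function name split_address is an assumption made for this illustration, and the split is done at the last @ sign because a quoted local-part may itself contain one.

```python
from typing import Tuple


def split_address(address: str) -> Tuple[str, str]:
    """Split an address into (local_part, domain) at the last @ sign.

    split_address is a hypothetical helper; it checks structure only,
    not whether either part is valid.
    """
    local_part, at_sign, domain = address.rpartition("@")
    if not at_sign or not local_part or not domain:
        raise ValueError(f"not a local-part@domain address: {address!r}")
    return local_part, domain


print(split_address("jsmith@example.org"))  # ('jsmith', 'example.org')
```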

An SMTP server looks up the domain name using the Domain Name System, which is a distributed database. A server queries the DNS for any mail exchanger records (MX records) to find the host name of a designated mail transfer agent (MTA) for that address. That way, the organization holding the delegation for a given domain (the mailbox provider) can define which are the target hosts for all email destined to its domain. The mail exchanger does not need to be located in the domain of the destination mailbox; it must simply accept mail for the domain. The target hosts are configured with a mechanism to deliver mail to all destination mailboxes. The local-part of an address is defined to be opaque to intermediate mail relay systems except the final mailbox host. For example, it must not be assumed to be case-insensitive.
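How a sending server might choose among several mail exchangers can be sketched as follows. This is only an illustration: no DNS query is made, the MX records are hypothetical (preference, host) pairs, and order_mail_exchangers is a name invented here. It relies on the convention that lower preference values are tried first.

```python
from typing import List, Tuple


def order_mail_exchangers(mx_records: List[Tuple[int, str]]) -> List[str]:
    """Return MX host names sorted by ascending preference value.

    Hosts with equal preference are ordered by name here; a real MTA
    would typically spread load among them instead.
    """
    return [host for _preference, host in sorted(mx_records)]


# Hypothetical answer for the domain example.org. Note that a mail
# exchanger may live outside the destination domain.
records = [(20, "backup.mail.example.net"), (10, "mx1.example.org")]
print(order_mail_exchangers(records))
```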

Multiple email addresses may point to the same mailbox. Conversely, a single email address may be an alias and have a distribution function to many mailboxes. Email aliases, electronic mailing lists, sub-addressing, and catch-all addresses, the latter being mailboxes that receive messages irrespective of the local part, are common patterns for achieving such results.

The addresses found in the header fields of an email message are not the ones used by SMTP servers to deliver the message. Servers use the so-called message envelope to route mail. While envelope and header addresses may be equal, forged email addresses are often seen in spam, phishing, and many other internet-based scams. This has led to several initiatives which aim to make such forgeries easier to spot.

To indicate whom the message is intended for, a user can use the "display name" of the recipient followed by the address specification surrounded by angle brackets, for example: John Smith <john.smith@example.org>.
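The display-name form can be built and taken apart with Python's standard email.utils module. The following is a small sketch; the name and address used are illustrative placeholders, not taken from any real mailbox.

```python
from email.utils import formataddr, parseaddr

# Build a header value from a display name and an address specification.
header_value = formataddr(("John Smith", "john.smith@example.org"))
print(header_value)  # John Smith <john.smith@example.org>

# Recover the two components from the header value.
display_name, addr_spec = parseaddr(header_value)
print(display_name)  # John Smith
print(addr_spec)     # john.smith@example.org
```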

Earlier forms of email addresses included the somewhat verbose notation required by X.400, and the UUCP "bang path" notation, in which the address was given in the form of a sequence of computers through which the message should be relayed. This was widely used for several years, but was superseded by the generally more convenient SMTP form.

Syntax

The format of email addresses is local-part@domain, where the local-part may be up to 64 characters long and the domain name may have a maximum of 253 characters. However, the maximum length of 256 characters for a forward or reverse path restricts the entire email address to no more than 254 characters. The format is formally defined in RFC 5322 (sections 3.2.3 and 3.4.1) and by RFC 5321.
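The three length limits can be checked together, as in the sketch below. The constant and function names are assumptions for illustration, and lengths are counted in characters, which matches the limits for ASCII-only addresses.

```python
MAX_LOCAL_PART = 64   # limit on the part before the @ sign
MAX_DOMAIN = 253      # limit on the part after the @ sign
MAX_ADDRESS = 254     # limit on the whole address


def within_length_limits(address: str) -> bool:
    """Check only the length limits; syntax is not examined."""
    local_part, _at_sign, domain = address.rpartition("@")
    return (
        0 < len(local_part) <= MAX_LOCAL_PART
        and 0 < len(domain) <= MAX_DOMAIN
        and len(address) <= MAX_ADDRESS
    )


print(within_length_limits("jsmith@example.org"))
```

The overall limit matters on its own: a 64-character local-part combined with a 190-character domain satisfies both per-part limits but gives a 255-character address, which is rejected.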

Local part

The local-part of the email address may use any of these ASCII characters (RFC 5322, Section 3.2.3):
  • Uppercase and lowercase English letters (a–z, A–Z) (ASCII: 65-90, 97-122)
  • Digits 0 to 9 (ASCII: 48-57)
  • Characters !#$%&'*+-/=?^_`{|}~ (ASCII: 33, 35-39, 42, 43, 45, 47, 61, 63, 94-96, 123-126)
  • Character . (dot, period, full stop) (ASCII: 46) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively (e.g. John..Doe@example.com is not allowed.).
  • Special characters are allowed with restrictions including:
    • Space and "(),:;<>@[\] (ASCII: 32, 34, 40, 41, 44, 58, 59, 60, 62, 64, 91-93)


The restrictions for special characters are that they must be contained between quotation marks, and that three of them (the space, the backslash \ and the quotation mark "; ASCII: 32, 92, 34) must also be preceded by a backslash \ (e.g. "\"\\\ ").

A quoted string may exist as a dot-separated entity within the local-part, or it may exist when the outermost quotes are the outermost characters of the local-part. For example, abc."defghi".xyz@example.com and "abcdefghixyz"@example.com are allowed; abc"defghi"xyz@example.com is not, and neither is abc\"def\"ghi@example.com. Quoted strings and characters, however, are not commonly used. RFC 5321 also warns that "a host that expects to receive mail SHOULD avoid defining mailboxes where the Local-part requires (or uses) the Quoted-string form" (sic).
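The rules for the common, unquoted form of the local-part (permitted characters, with dots allowed only between them) can be expressed as a short regular expression. The sketch below deliberately does not handle quoted strings, so it reports every quoted form as not matching; the names ATEXT, DOT_ATOM and is_unquoted_local_part are assumptions for this illustration.

```python
import re

# Characters permitted without quoting: letters, digits and
# !#$%&'*+-/=?^_`{|}~ (the dot is handled separately below).
ATEXT = r"A-Za-z0-9!#$%&'*+\-/=?^_`{|}~"

# One or more runs of permitted characters separated by single dots,
# which rules out a leading dot, a trailing dot and consecutive dots.
DOT_ATOM = re.compile(rf"[{ATEXT}]+(?:\.[{ATEXT}]+)*")


def is_unquoted_local_part(local_part: str) -> bool:
    """True if local_part is valid without any quoted strings."""
    return DOT_ATOM.fullmatch(local_part) is not None


for candidate in ["John.Doe", "John..Doe", ".john", "joeuser+work"]:
    print(candidate, is_unquoted_local_part(candidate))
```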

The local-part "postmaster" is treated specially: it is case-insensitive, and mail for it should be forwarded to the server's administrator. Technically all other local-parts are case-sensitive, so jsmith@example.com and JSmith@example.com specify different mailboxes. However, most organizations treat uppercase and lowercase letters as equivalent, and also do not allow use of some technically valid characters (such as space, ? and ^). Organizations are free to restrict the forms of their own email addresses as desired; Windows Live Hotmail, for example, only allows creation of email addresses using alphanumerics, dot (.), underscore (_) and hyphen (-).

Systems that send mail must be capable of handling outgoing mail for all valid addresses. Contrary to the relevant standards, some defective systems treat certain legitimate addresses as invalid and fail to handle mail to these addresses. Hotmail, for example, refuses to send mail to any address containing any of the following standards-permissible characters: !#$%*/?^`{|}~

Domain part

The domain name part of an email address has to conform to strict guidelines: it must match the requirements for a hostname, consisting of letters, digits, hyphens and dots. In addition, the domain part may be an IP address literal, surrounded by square brackets, such as jsmith@[192.168.2.1], although this is rarely seen except in email spam.
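A rough check of the domain part might look like the sketch below. It assumes the usual hostname conventions (labels of 1 to 63 characters that neither start nor end with a hyphen), which are not spelled out above, and it only recognizes IPv4 literals; is_valid_domain_part and LABEL are hypothetical names.

```python
import ipaddress
import re

# Assumed hostname label rule: letters, digits and hyphens, 1 to 63
# characters, with no hyphen at either end.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_domain_part(domain: str) -> bool:
    """Accept a dotted hostname or a bracketed IPv4 address literal."""
    if domain.startswith("[") and domain.endswith("]"):
        try:
            ipaddress.IPv4Address(domain[1:-1])
            return True
        except ValueError:
            return False
    if len(domain) > 253:
        return False
    # An empty label (from a leading, trailing or doubled dot) fails.
    return all(LABEL.fullmatch(label) for label in domain.split("."))


for candidate in ["example.org", "[192.168.2.1]", "-bad.example.com"]:
    print(candidate, is_valid_domain_part(candidate))
```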

The informational RFC 3696, written by the author of RFC 5321, explains the details in a readable way, with a few errors noted in the 3696 errata.

Valid email addresses

  • niceandsimple@example.com
  • a.little.unusual@example.com
  • a.little.more.unusual@dept.example.com
  • much."more\ unusual"@example.com
  • very.unusual."@".unusual.com@example.com
  • very.",:;<>[]".VERY."very\\\ \@\"very".unusual@strange.example.com

Invalid email addresses

  • Abc.example.com (character @ is missing)

  • A@b@c@example.com (only one @ is allowed outside quotation marks)
  • ",:;<>[\]@example.com (none of the characters before the @ in this example is allowed outside quotation marks)
  • just"not"right@example.com (quoted strings must be dot separated or the only element making up the local-part)
  • this\ is\"really\"not\\allowed@example.com (spaces, quotes and backslashes may only exist when within quoted strings and preceded by a backslash)

Common local-part semantics

According to RFC 5321, Section 2.3.11 (Mailbox and Address), "...the local-part MUST be interpreted and assigned semantics only by the host specified in the domain part of the address."
This means that no assumptions can be made about the meaning of a local-part handled by another mail server. It is entirely up to the configuration of that mail server.

Local-part normalization

Interpretation of the local-part of an email address is dependent on the conventions and policies implemented in the mail server. For example, case-sensitivity may distinguish mailboxes differing only in capitalization of characters of the local-part, although this is not very common. Gmail (Google Mail) ignores all dots in the local-part for the purposes of determining account identity. This prevents the creation of user accounts your.user.name or yourusername when the account your.username already exists.
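A dot-insensitive policy of this kind could be modelled as below. This is a sketch of the idea only, not a description of how any provider implements it; dot_insensitive_key is a hypothetical helper, and it should only be applied to addresses on one's own server, since the local-part of other servers is opaque.

```python
def dot_insensitive_key(address: str) -> str:
    """Return a lookup key in which dots in the local-part are ignored."""
    local_part, _at_sign, domain = address.rpartition("@")
    return local_part.replace(".", "") + "@" + domain


# All three spellings map to the same account key.
for spelling in ["your.username", "your.user.name", "yourusername"]:
    print(dot_insensitive_key(spelling + "@example.com"))
```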

Address tags

Some mail services allow a user to append a tag to their email address (e.g., where joeuser@example.com is the main address, which would also accept mail for joeuser+work@example.com or joeuser-family@example.com). The text of the tag may be used to apply filtering and to create single-use addresses. Some IETF standards-track documents, such as RFC 5233, refer to this convention as "sub-addressing".
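On the receiving server, separating the base name from the tag is a simple split at the first separator. The sketch below assumes the separator is configurable, since services differ in which character they use; split_tag is a name invented for this illustration.

```python
from typing import Optional, Tuple


def split_tag(local_part: str, separator: str = "+") -> Tuple[str, Optional[str]]:
    """Split a local-part into (base, tag) at the first separator.

    The tag is None when the separator does not occur.
    """
    base, found, tag = local_part.partition(separator)
    return base, (tag if found else None)


print(split_tag("joeuser+work"))         # ('joeuser', 'work')
print(split_tag("joeuser-family", "-"))  # ('joeuser', 'family')
print(split_tag("joeuser"))              # ('joeuser', None)
```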

Disposable email addresses of this form, using various separators between the base name and the tag, are supported by several email services, including Runbox (plus and hyphen), Google Mail (plus), Yahoo! Mail Plus (hyphen), Apple's MobileMe (plus), FastMail.FM (plus and Subdomain Addressing), and MMDF (equals).

Most installations of the qmail and Courier Mail Server products support the use of a hyphen '-' as a separator within the local-part, such as joeuser-tag@example.com or joeuser-tag-sub-anything-else@example.com. This allows qmail, through .qmail-default or .qmail-tag-sub-anything-else files, to sort, filter, forward, or run an application based on the tagging system established.

Some mail servers violate RFC 5322, and the recommendations in RFC 3696, by refusing to send mail addressed to a user on another system merely because the local-part of the address contains the plus sign (+). It is also quite common for web forms either to refuse to accept the plus sign as part of the username or to misbehave in an undetermined manner. This is generally much less of a problem when the hyphen (-) is used as the separator.

Validation

Email addresses are used not only in mail clients and on mail servers but also on websites, where a user-supplied email address is often validated.

An email address is generally recognized as having two parts joined with an at-sign (@); this in itself is a basic form of validation. However, the technical specification detailed in RFC 822 and subsequent RFCs goes far beyond this, offering very complex and strict restrictions.

Trying to match these restrictions is a complex task, often resulting in long regular expressions.

As a result, many mail servers adopt very relaxed validation that accepts and handles email addresses that are disallowed according to the RFCs, and instead verify the email address against relevant systems, such as the DNS for the domain part, or use callback verification to check whether the mailbox exists.

Conversely, many websites check email addresses much more strictly than the standard specifies, rejecting addresses containing valid characters like + or / signs, or setting arbitrary length limitations (e.g., 30 characters). RFC 3696 was written to give specific advice for validating internet identifiers, including email addresses.

Internationalization

The IETF conducts a technical and standards working group devoted to internationalization issues of email addresses, entitled Email Address Internationalization (EAI, also known as IMA - Internationalized Email Address). This group has published the informational RFC 4952, envisioning changes to the mail header environment to permit the full range of Unicode characters and an SMTP Extension to permit UTF-8 mail addressing. Experimental RFC 5335 describes internationalized email headers, including a UTF8-based address specification. The list of valid examples below is thus expected to undergo significant additions.

The basic EAI concepts involve exchanging mail in UTF-8. Though the original proposal included a downgrading mechanism for legacy systems, this has now been dropped. The local servers are responsible for the "local" part of the address, whereas the domain portion would be restricted by the rules of internationalized domain names, though still transmitted in UTF-8. The mail server is also responsible for any mapping mechanism between the IMA form and any ASCII alias.

When EAI is standardized, users will likely have a localized address in a native language script or character set, as well as an ASCII form for communicating with legacy systems or for script-independent use. Applications that recognize internationalized domain names and mail addresses must have facilities to convert these representations.
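For the domain portion, conversion between the native-script form and an ASCII form can be sketched with Python's built-in idna codec, which implements the older IDNA 2003 rules. The domain bücher.example is an illustrative placeholder, the helper names are assumptions, and the local-part is not covered by this conversion.

```python
def to_ascii_domain(domain: str) -> str:
    """Convert an internationalized domain name to its ASCII form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(ascii_domain: str) -> str:
    """Convert an ASCII (xn--) domain back to its Unicode form."""
    return ascii_domain.encode("ascii").decode("idna")


print(to_ascii_domain("bücher.example"))           # xn--bcher-kva.example
print(to_unicode_domain("xn--bcher-kva.example"))  # bücher.example
```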

Internationalization Examples

These addresses are not compliant with RFC 5322 and will therefore not work with many of the current generation of email servers and clients.
  • Latin Alphabet (with diacritics): Pelé@example.com
  • Greek Alphabet: Rδοκιμή@παράδειγμα.δοκιμή
  • Japanese Characters: 甲斐@黒川.日本
  • Cyrillic Characters: чебурашка@ящик-с-апельсинами.рф

External links

  • RFC 821 - Simple Mail Transfer Protocol (Obsoleted by RFC 2821)
  • RFC 822 - Standard for the Format of ARPA Internet Text Messages (Obsoleted by RFC 2822) (Errata)
  • RFC 1035 - Domain names - implementation and specification (Errata)
  • RFC 1123 - Requirements for Internet Hosts - Application and Support (Updated by RFC 2821, RFC 5321) (Errata)
  • RFC 2821 - Simple Mail Transfer Protocol (Obsoletes RFC 821, Updates RFC 1123, Obsoleted by RFC 5321) (Errata)
  • RFC 2142 - Mailbox Names for Common Services, Roles and Functions (Errata)
  • RFC 2822 - Internet Message Format (Obsoletes RFC 822, Obsoleted by RFC 5322) (Errata)
  • RFC 3696 - Application Techniques for Checking and Transformation of Names (Errata)
  • RFC 4291 - IP Version 6 Addressing Architecture (Updated by RFC 5952) (Errata)
  • RFC 5321 - Simple Mail Transfer Protocol (Obsoletes RFC 2821, Updates RFC 1123) (Errata)
  • RFC 5322 - Internet Message Format (Obsoletes RFC 2822) (Errata)
  • RFC 5952 - A Recommendation for IPv6 Address Text Representation (Updates RFC 4291) (Errata)
  • http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx I Knew How To Validate An Email Address Until I Read The RFC
  • http://www.remote.org/jochen/mail/info/chars.html Characters in the Local Part of an EMail address
  • http://www.remote.org/jochen/mail/info/address.html Anatomy of an EMail Address
  • http://code.iamcal.com/php/rfc822/ RFC 822 / RFC 2822 / RFC 3696 An EMail address parser in PHP
  • http://isemail.info/about Online EMail Address Validator using PHP
  • http://emailverify.net/Demo.aspx Online email address validator using Microsoft .NET
  • http://newbeatsmedia.com/2010/02/05/your-virtual-image-part-1/ Picking the Right EMail Address
  • http://code.iamcal.com/php/rfc822/full_regexp.txt Regular Expression for Determining if an EMail Address is Valid
  • http://squiloople.com/2009/12/20/email-address-validation/ E-mail address validation tutorial with examples
  • http://www.circleid.com/posts/print/digging_through_the_problem_of_ipv6_and_email_part_1/ Digging Through the Problem of IPv6 and Email Part 1 Part 2 Part 3
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 