Proxy auto-config
A proxy auto-config (PAC) file defines how web browsers and other user agents can automatically choose the appropriate proxy server (access method) for fetching a given URL.

A PAC file contains a JavaScript function "FindProxyForURL(url, host)". This function returns a string with one or more access method specifications. These specifications cause the user agent to use a particular proxy server or to connect directly.

Multiple specifications provide a fallback when a proxy fails to respond. The browser fetches this PAC file before retrieving other pages. The URL of the PAC file is either configured manually or determined automatically by the Web Proxy Autodiscovery Protocol.
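
To make the format of the returned string concrete, the following sketch splits a return value into its ordered access methods. The helper name parseAccessMethods is hypothetical and not part of the PAC format; it only illustrates that the specifications are separated by semicolons and tried in order.

```javascript
// Hypothetical helper (not part of the PAC format itself): splits a
// FindProxyForURL return value into an ordered list of access methods.
function parseAccessMethods(result) {
  return result
    .split(";")
    .map(function (spec) { return spec.trim(); })
    .filter(function (spec) { return spec.length > 0; })
    .map(function (spec) {
      var parts = spec.split(/\s+/);
      var method = { type: parts[0].toUpperCase() };
      if (parts.length > 1) {
        method.address = parts[1]; // "host:port" as written in the PAC file
      }
      return method;
    });
}

// proxy.example.com is a placeholder host.
var methods = parseAccessMethods("PROXY proxy.example.com:8080; DIRECT");
console.log(methods);
```

The first entry is the preferred access method; later entries are the fallbacks.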

Context

Modern web browsers implement several levels of automation; users can choose the level that is appropriate to their needs. The following methods are commonly implemented:
  • Manual proxy selection: Specify a hostname and a port number to be used for all URLs. Most browsers allow you to specify a list of domains (such as localhost) that will bypass this proxy.
  • Proxy auto-configuration (PAC): Specify the URL for a PAC file with a JavaScript function that determines the appropriate proxy for each URL. This method is more suitable for laptop users who need several different proxy configurations, or complex corporate setups with many different proxies.
  • Web Proxy Autodiscovery Protocol (WPAD): Let the browser guess the location of the PAC file through DHCP and DNS lookups.

Proxy Configuration

Computer operating systems (e.g., Microsoft Windows, Mac OS X, Linux) require a number of settings to communicate over the Internet. These settings are typically obtained from an Internet Service Provider (ISP). A network connection may be established using either proxy settings (for example, to connect anonymously through a proxy server) or real, direct settings. For more information, see Windows Proxy Connection, contact your ISP, or search the Web for your own OS proxy requirements.

The PAC file

The Proxy auto-config file format was originally designed by Netscape in 1996 for Netscape Navigator 2.0 and is a text file that defines at least one JavaScript function, FindProxyForURL(url, host), with two arguments: url is the URL of the object and host is the hostname derived from that URL. By convention, the PAC file is normally named proxy.pac. The WPAD standard uses wpad.dat.

To use it, a PAC file is published to a web server, and client user agents are instructed to use it, either by entering the URL in the proxy connection settings of the browser or through the use of the WPAD protocol.

Even though most clients will process the script regardless of the MIME type returned in the HTTP response, for the sake of completeness and to maximize compatibility, the web server should be configured to declare the MIME type of this file to be either application/x-ns-proxy-autoconfig or application/x-javascript-config.

There is little evidence to favor the use of one MIME type over the other. It would be, however, reasonable to assume that application/x-ns-proxy-autoconfig will be supported in more clients than application/x-javascript-config as it was defined in the original Netscape specification, the latter type coming into use more recently.
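
As an illustration of declaring the MIME type, here is a minimal sketch of a web server that publishes a PAC file, written for Node.js and its built-in http module. The names createPacServer, PAC_MIME_TYPE and PAC_BODY, as well as port 8000, are assumptions made for this example; any web server that can set the Content-Type header would serve equally well.

```javascript
var http = require("http");

// MIME type from the original Netscape specification, as discussed above.
var PAC_MIME_TYPE = "application/x-ns-proxy-autoconfig";

// Example PAC body; proxy.example.com is a placeholder host.
var PAC_BODY =
  'function FindProxyForURL(url, host) {\n' +
  '  return "PROXY proxy.example.com:8080; DIRECT";\n' +
  '}\n';

// Builds (but does not start) a server that answers requests for the two
// conventional file names, proxy.pac and wpad.dat.
function createPacServer() {
  return http.createServer(function (req, res) {
    if (req.url === "/proxy.pac" || req.url === "/wpad.dat") {
      res.writeHead(200, { "Content-Type": PAC_MIME_TYPE });
      res.end(PAC_BODY);
    } else {
      res.writeHead(404, { "Content-Type": "text/plain" });
      res.end("Not found\n");
    }
  });
}

// To serve the file on an (assumed) port 8000, call:
//   createPacServer().listen(8000);
```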

A very simple example of a PAC file is:

function FindProxyForURL(url, host)
{
    return "PROXY proxy.example.com:8080; DIRECT";
}


This function instructs the browser to retrieve all pages through the proxy on port 8080 of the server proxy.example.com. Should this proxy fail to respond, the browser contacts the website directly, without using a proxy. The latter may fail if firewalls or other intermediary network devices reject requests from sources other than the proxy, a common configuration in corporate networks.

A more complicated example demonstrates some available JavaScript functions to be used in the FindProxyForURL function:


function FindProxyForURL(url, host) {
    // our local URLs from the domains below example.com don't need a proxy:
    if (shExpMatch(host, "*.example.com"))
    {
        return "DIRECT";
    }

    // URLs within this network are accessed through
    // port 8080 on fastproxy.example.com:
    if (isInNet(host, "10.0.0.0", "255.255.248.0"))
    {
        return "PROXY fastproxy.example.com:8080";
    }

    // All other requests go through port 8080 of proxy.example.com.
    // should that fail to respond, go directly to the WWW:
    return "PROXY proxy.example.com:8080; DIRECT";
}
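
To try such a function outside a browser, the helper functions that the browser normally provides have to be supplied. The sketch below does this with simplified stand-ins for shExpMatch and isInNet; these stand-ins (and the helper ipToNumber) are assumptions written for this example, not the browsers' actual implementations. In particular, the stand-in for isInNet accepts only IP address literals and performs no DNS lookup.

```javascript
// Simplified stand-in for the browser-provided shExpMatch (an assumption):
// "*" matches any run of characters, "?" matches a single character.
function shExpMatch(str, pattern) {
  var escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  var regex = "^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$";
  return new RegExp(regex).test(str);
}

// Hypothetical helper: converts "a.b.c.d" to a number, or null if invalid.
function ipToNumber(ip) {
  var parts = ip.split(".");
  if (parts.length !== 4) { return null; }
  var n = 0;
  for (var i = 0; i < 4; i++) {
    if (!/^\d+$/.test(parts[i]) || Number(parts[i]) > 255) { return null; }
    n = n * 256 + Number(parts[i]);
  }
  return n;
}

// Simplified stand-in for the browser-provided isInNet (an assumption).
// The real helper resolves host names via DNS; this one only accepts
// dotted-quad IP literals and returns false for anything else.
function isInNet(host, pattern, mask) {
  var h = ipToNumber(host), p = ipToNumber(pattern), m = ipToNumber(mask);
  if (h === null || p === null || m === null) { return false; }
  return ((h & m) >>> 0) === ((p & m) >>> 0);
}

// The PAC function from the example above, unchanged in behavior.
function FindProxyForURL(url, host) {
  if (shExpMatch(host, "*.example.com")) {
    return "DIRECT";
  }
  if (isInNet(host, "10.0.0.0", "255.255.248.0")) {
    return "PROXY fastproxy.example.com:8080";
  }
  return "PROXY proxy.example.com:8080; DIRECT";
}

console.log(FindProxyForURL("http://www.example.com/", "www.example.com"));
console.log(FindProxyForURL("http://10.0.1.5/", "10.0.1.5"));
console.log(FindProxyForURL("http://www.example.org/", "www.example.org"));
```

The three calls exercise the three branches in turn: a local domain, an address inside 10.0.0.0/255.255.248.0, and everything else.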

PAC file character encoding

Browsers such as Firefox and Internet Explorer support only PAC files in the system default encoding, and do not support Unicode encodings such as UTF-8.

dnsResolve

The function dnsResolve (and similar other functions) performs a DNS lookup that can block your browser for a long time if the DNS server does not respond.
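
One way to limit the impact is to order the tests so that hosts which can be decided by plain string comparison never reach dnsResolve. The sketch below illustrates this ordering. To run outside a browser it defines stand-ins for dnsResolve and isPlainHostName (assumptions for this example); the counter dnsLookups, the address 192.0.2.10 and the host names are likewise made up for illustration.

```javascript
// Counting stand-in for the browser-provided dnsResolve (an assumption so
// the sketch can run outside a browser); the returned address is made up.
var dnsLookups = 0;
function dnsResolve(host) {
  dnsLookups++;
  return "192.0.2.10";
}

// Stand-in for the PAC helper isPlainHostName: true if host has no dots.
function isPlainHostName(host) {
  return host.indexOf(".") === -1;
}

function FindProxyForURL(url, host) {
  // Cheap string tests first: none of these needs a DNS lookup.
  if (isPlainHostName(host)) {
    return "DIRECT";
  }
  var suffix = ".example.com";
  if (host.slice(-suffix.length) === suffix) {
    return "DIRECT";
  }

  // Only the remaining hosts are resolved, and only once per call.
  var ip = dnsResolve(host);
  if (ip && ip.indexOf("10.") === 0) {
    return "DIRECT";
  }
  return "PROXY proxy.example.com:8080; DIRECT";
}

console.log(FindProxyForURL("http://intranet/", "intranet"), dnsLookups);
console.log(FindProxyForURL("http://www.example.org/", "www.example.org"), dnsLookups);
```

This does not remove the blocking behavior of a slow DNS server; it only reduces how often a lookup is attempted.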

Caching of proxy autoconfiguration results by domain name in Microsoft's Internet Explorer 5.5 or higher limits the flexibility of the PAC standard. In effect, you can choose the proxy based on the domain name, but not on the path of the URL. Alternatively, you need to disable caching of proxy autoconfiguration results by editing the registry, a process described by de Boyne Pollard (listed in further reading).

It is recommended to always use IP addresses instead of host domain names in the isInNet function for compatibility with other Windows components which make use of the Internet Explorer PAC settings, such as the .NET Framework 2.0. For example,

if (isInNet(host, dnsResolve(sampledomain), "255.255.248.0")) // .NET 2.0 will resolve proxy properly

if (isInNet(host, sampledomain, "255.255.248.0")) // .NET 2.0 will not resolve proxy properly

The current convention is to fail over to a direct connection when a PAC file is unavailable.

When switching quickly between network configurations (e.g. when entering or leaving a VPN), dnsResolve may give outdated results due to DNS caching.

For instance, Firefox usually keeps 20 domain entries cached for 60 seconds. This may be configured via the network.dnsCacheEntries and network.dnsCacheExpiration preference variables. Flushing the system's DNS cache may also help, which can be achieved, e.g., in Linux by sudo service dns-clean start.

myIpAddress

The myIpAddress function has often been reported to give wrong or unusable results, e.g. 127.0.0.1, the IP address of the localhost. It may help to remove from the system's hosts file (e.g. /etc/hosts on Linux) any lines referring to the machine hostname, while the line 127.0.0.1 localhost can and should stay.

On Internet Explorer 9, isInNet("localHostName", "second.ip", "255.255.255.255") returns true and can be used as a workaround.
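
A PAC file that depends on myIpAddress can also guard against the loopback result explicitly. The following sketch treats 127.x.x.x or an empty result as "unknown" and falls back to a generic setting. The stand-in for myIpAddress, the variable fakeAddress, the 10.0. prefix and the proxy names are assumptions for illustration only.

```javascript
// Stand-in for the browser-provided myIpAddress (an assumption for running
// outside a browser); change fakeAddress to simulate different results.
var fakeAddress = "127.0.0.1";
function myIpAddress() {
  return fakeAddress;
}

function FindProxyForURL(url, host) {
  var ip = myIpAddress();

  // A loopback or empty address says nothing about the current network,
  // so fall back to a setting that works either way.
  if (!ip || ip.indexOf("127.") === 0) {
    return "PROXY proxy.example.com:8080; DIRECT";
  }

  // Assumed office network: use the nearby proxy.
  if (ip.indexOf("10.0.") === 0) {
    return "PROXY fastproxy.example.com:8080";
  }

  // Any other network (for example at home): connect directly.
  return "DIRECT";
}

console.log(FindProxyForURL("http://www.example.org/", "www.example.org"));
```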

Advanced functionality

More advanced PAC files can reduce load on proxies, do load balancing, fail over, or even black/white listing before the request hits the proxies.
One can return multiple proxies:


return "PROXY proxy1.example.com:8080; PROXY proxy2.example.com:8080";
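
As a rough sketch of load balancing with failover, the proxy listed first can be chosen from a hash of the host name, with the remaining proxies appended as fallbacks. The PROXIES list and the hashHost function below are hypothetical and chosen for simplicity; the sum-of-character-codes hash is not meant to give an even distribution.

```javascript
// Hypothetical proxy pool; the host names are placeholders.
var PROXIES = ["proxy1.example.com:8080", "proxy2.example.com:8080"];

// Small deterministic string hash (sum of character codes).
function hashHost(host) {
  var sum = 0;
  for (var i = 0; i < host.length; i++) {
    sum = (sum + host.charCodeAt(i)) % 65536;
  }
  return sum;
}

function FindProxyForURL(url, host) {
  var first = hashHost(host.toLowerCase()) % PROXIES.length;
  var specs = [];
  // Start with the proxy chosen by the hash, then list the others as
  // fallbacks in case the first one fails to respond.
  for (var i = 0; i < PROXIES.length; i++) {
    specs.push("PROXY " + PROXIES[(first + i) % PROXIES.length]);
  }
  return specs.join("; ");
}

console.log(FindProxyForURL("http://www.example.org/", "www.example.org"));
```

Because the hash depends only on the host name, requests for the same host consistently go to the same first proxy.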

Further reading

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.