posted on 3:30 PM, July 12, 2009
A class for parsing and composing URIs (web addresses).
Note that a URI is composed of the following components:
my $uri = new ExSite::URI(%option);
separator => parameter separator character (e.g. ";" or "&")
plaintext => output plaintext URLs if true; otherwise, output HTML URLs
uri => a URI string to initialize the object with
secure_query => encrypt the query data to make it tamper-resistant
By default, the object will be initialized with the current URI, will use '&' as the parameter separator, and will output HTML URIs.
The only difference between HTML and plaintext URIs is whether or not HTML metacharacters such as '&' are escaped. (In plaintext mode they are left unescaped.)
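As a rough illustration of that difference, the sketch below escapes HTML metacharacters in a URI string. The helper name `html_escape_uri` and the exact set of characters it escapes are assumptions made for this example; they are not taken from the ExSite::URI source.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ExSite::URI: escape HTML metacharacters
# so that a URI can be embedded in HTML markup.
sub html_escape_uri {
    my ($uri) = @_;
    $uri =~ s/&/&amp;/g;    # must run first, so later entities are not re-escaped
    $uri =~ s/</&lt;/g;
    $uri =~ s/>/&gt;/g;
    $uri =~ s/"/&quot;/g;
    return $uri;
}

my $plain = "/cgi/page.cgi?id=5&view=full";
print $plain, "\n";                     # plaintext form, left unescaped
print html_escape_uri($plain), "\n";    # /cgi/page.cgi?id=5&amp;view=full
```

The plaintext form suits contexts such as e-mail bodies or redirects, while the escaped form is what belongs inside an HTML attribute.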
You can change the separator character at any time:
The current separator character is used for both parsing URIs and composing new URIs, so you may need to switch if you want to use a different separator character for your input and output.
You can change the text mode with the following calls:
$uri->plaintext; # output plaintext URIs
$uri->html; # output HTML URIs
At any time, you can extract a structure with all of the parsed URI data using:
%parsed_uri = $uri->info;
You can also fetch individual URI components using:
$data = $uri->get($component);
This class can manage URIs from any source, in principle. Its defaults are optimized for handling ExSite URIs. ExSite URIs use a conventional format which assumes the following additional rules:
The path component of the URI consists of a
These are common URI conventions, so this class should be fairly
versatile, even with non-ExSite URIs. You might encounter minor
issues with non-ExSite URIs that do not use the same conventions. For
example, not all query strings are sequences of key/value pairs, so
we might not be able to extract intelligible parameters from unconventional
query strings. Also, it may not be possible for URI to tell which part
of a path corresponds to a
If you do not pass an explicit URI, the object will initialize itself with the URI of the current request, as read from the Apache environment.
You can re-initialize the object with a different URI at any time:
After modifying the URI (see below), it is often the case that you want to reset it back to its initial state. You can do this:
If the URI was explicitly passed to the object, this will restore the original state completely. If the URI was implicitly determined from the local environment, however, it may be different, depending on how local definitions have changed in the meantime. If the path or query data have been altered in ExSite's input buffers, then the URI will reflect those changes.
Sometimes you want this behaviour for explicit URIs. For example, the object may be forced to an explicit URI that is meant to reflect a local URI that would normally be implicit. (This happens when publishing, for instance, where we spoof the URI and environment for each page that we generate.) To get the implicit reset behaviour on an explicit URI, do this:
This tells the object to use any updated input data when constructing the implicit URI.
The query is the part of the URL after a question mark. It is typically broken into key=value pairs by a separator character, which is '&' by default.
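The following sketch shows one way a query string could be split into key=value pairs with one separator and written back out with another. The helpers `parse_query` and `compose_query` are hypothetical names invented for this example, and URL-decoding of values is left out for brevity; the real class may work differently.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not the ExSite::URI internals.

# Split a query string into a hash of parameters, using the given separator.
sub parse_query {
    my ($query, $sep) = @_;
    $sep = '&' unless defined $sep;
    my %param;
    foreach my $pair (split /\Q$sep\E/, $query) {
        next if $pair eq '';
        my ($key, $value) = split /=/, $pair, 2;
        $param{$key} = defined $value ? $value : '';
    }
    return %param;
}

# Join a hash of parameters back into a query string (keys sorted so the
# output is predictable).
sub compose_query {
    my ($param, $sep) = @_;
    $sep = '&' unless defined $sep;
    return join $sep, map { "$_=$param->{$_}" } sort keys %$param;
}

my %param = parse_query("id=5;view=full", ";");   # input uses ";"
$param{view} = "summary";                         # change a parameter
delete $param{id};                                # remove a parameter
$param{page} = 2;                                 # add a parameter
print compose_query(\%param, "&"), "\n";          # page=2&view=summary
```

Reading with one separator and writing with another mirrors the case described earlier, where the input and output conventions differ.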
To change a parameter in the URI:
To remove a parameter completely:
$uri->parameter($key,undef);
# OR
$uri->parameter($key);
To change multiple parameters:
The query string is written as
If you make the URI object secure:
then your query strings will be encrypted, making them tamper-proof. This is not recommended for normal usage, as it is quite convenient to be able to inspect and alter query strings. However, you may wish to make exceptions in some cases where sensitive data may be exposed in the query string, or there are security issues associated with editable query strings.
To go back to normal query strings, use:
(This is a misnomer, since there is nothing really insecure about a normal query string.)
The URI path includes the slash-separated values after the domain
name and before the '?'. This is typically broken down into two
/path = /script_name/path_info
The script_name is typically broken down into a diskpath to a CGI
program, while the
/script_name + /path_info = /cgi/page.cgi + /store/catalog.html/widgets/blue_grommet
In principle the
/path_info(CMS segment) + /path_info(Catalog segment) = /store/catalog.html + /widgets/blue_grommet
The breakdown of different
$uri->path("CMS","/store/catalog.html"); # scalar method
$uri->path("Catalog","widgets","red_grommet"); # array method
These new path segments will replace the original path segments, without altering the remaining segments of the path.
If you define a new path segment unknown to the Input manager, then the new path segment will be appended to those that are already defined. For example,
would result in "/foo" being appended to the existing path, resulting
in a new
To delete a path segment, just pass nothing as the segment data:
$uri->path("Catalog",undef);
$uri->path("Catalog"); # equivalent
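To make the replace, append, and delete rules for path segments concrete, here is a small stand-alone model of named path segments. It is only a sketch: the data layout, the helper names `set_segment` and `path_info`, and the segment name "Foo" are assumptions for illustration and do not come from ExSite.

```perl
use strict;
use warnings;

# Hypothetical model (not the ExSite implementation): an ordered list of
# segment names, plus a hash holding each segment's path data.
my @order   = ("CMS", "Catalog");
my %segment = (
    CMS     => "/store/catalog.html",
    Catalog => "/widgets/blue_grommet",
);

sub set_segment {
    my ($name, @parts) = @_;
    @parts = grep { defined $_ } @parts;   # treat undef the same as no data
    if (!@parts) {
        # no data: delete the segment
        delete $segment{$name};
        @order = grep { $_ ne $name } @order;
        return;
    }
    # accept one "/a/b" string (scalar style) or a list of elements (array style)
    my $value = join "/", @parts;
    $value = "/$value" unless $value =~ m{^/};
    push @order, $name unless exists $segment{$name};   # unknown segments are appended
    $segment{$name} = $value;
}

sub path_info { return join "", map { $segment{$_} } @order }

set_segment("Catalog", "widgets", "red_grommet");   # replace a known segment
print path_info(), "\n";   # /store/catalog.html/widgets/red_grommet
set_segment("Foo", "/foo");                         # new segment is appended
print path_info(), "\n";   # /store/catalog.html/widgets/red_grommet/foo
set_segment("Catalog");                             # delete a segment
print path_info(), "\n";   # /store/catalog.html/foo
```

Keeping the segment order separate from the segment data is what lets one segment be replaced without disturbing the others.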
To completely override the path segments defined by the Input manager, and explicitly define your path, use these:
A service page is a special page in the ExSite CMS that services
requests for a particular plug-in. If a page generates a URL that
will be processed by that plug-in, it should automatically adjust the
target URL so that it redirects to the service page. This is done in
the URI class by the
To change the current URI so that it directs to the service page instead of whatever page it happens to be on, use this:
Not all plug-ins are configured to use service pages, but there is no harm in calling this method in those cases; it will leave the current URI unchanged.
Some URIs direct to pages/screens that require a certain level of user access to view. Simply using the URI is not sufficient to view the contents; you also need to be logged in as a user with sufficient access. If you do not have this level of access, you are likely to get a permission denied error message, or be prompted for a login and password.
There is a feature by which you can include authentication credentials in a URI so that the user will not receive an error or login prompt. This trick uses encrypted "authtokens" embedded into the parameter string.
There are two things to consider when using authtokens:
To generate an encrypted authtoken string:
my $authtoken = $uri->authtoken($login_id, $expiry_in_days);
To modify the current URI to include an authtoken granting that URI special access:
You then must output the URI (see below) to actually use it. You
cannot really modify the URI any further at this point, because then
the authtoken won't match the updated URI, and it will fail to
validate. It may be necessary to reset the URI or remove the
my $auth_url = $uri->authorize_url($login_id, $expiry_in_days);
After a URI has been modified using the above methods, you can obtain
the changed URI using the
$newuri = $uri->write($type);
$newuri = $uri->write_relative();
This returns the URI after the authority. It presumes the same authority as the referrer.
$newuri = $uri->write_full();
This returns the full URI including the scheme and authority.
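The difference between the relative and full forms can be sketched with a plain hash of URI components. The hash keys, the example host name, and the helpers `relative_uri` and `full_uri` are assumptions made for this example; they are not the structure returned by `$uri->info` or the code behind `write_relative` and `write_full`.

```perl
use strict;
use warnings;

# Hypothetical component hash, for illustration only.
my %uri = (
    scheme    => "http",
    authority => "www.example.com",
    path      => "/cgi/page.cgi/store/catalog.html",
    query     => "id=5&view=full",
);

# Everything after the authority: the path plus the optional query string.
sub relative_uri {
    my ($u) = @_;
    my $out = $u->{path};
    $out .= "?" . $u->{query} if defined $u->{query} && $u->{query} ne "";
    return $out;
}

# Scheme and authority followed by the relative part.
sub full_uri {
    my ($u) = @_;
    return $u->{scheme} . "://" . $u->{authority} . relative_uri($u);
}

print relative_uri(\%uri), "\n";   # /cgi/page.cgi/store/catalog.html?id=5&view=full
print full_uri(\%uri), "\n";       # http://www.example.com/cgi/page.cgi/store/catalog.html?id=5&view=full
```

A relative URI is enough for links within the same site, whereas the full form is needed wherever the link leaves that context, such as in e-mail.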
Modifications to the URI are cumulative, so you can make changes,
output the new URI, make more changes, output again, etc. If you want
to reset the URI to its original state so that changes are not
cumulative, use the
This also syncs with the Input manager to retrieve any new path segments that were defined since the URI object was instantiated.