


Captchas

posted on 3:45 PM, August 3, 2007

Captchas are small puzzles you have to solve to prove to a website that you are not an automated program. The word stands for "Completely Automated Public Turing test to tell Computers and Humans Apart".

ExSite supports a few types of captcha which you can use when required. Some plug-ins will use captchas automatically.

When to use a Captcha

Captchas are required whenever you have a concern that an automatic program may exploit a form on your website to:

  • spam you or your website users (e.g. spam posted to feedback forms or forums)
  • register for free services or junk accounts
  • bombard a poll or contest entry form
  • etc.

As a general rule, captchas are not required in member-only areas or on forms that require the user to be authenticated first (e.g. member-only forums). By entering their password, the user has already proved their identity, so a captcha is needed only if you are concerned that a member may write an automated program to exploit one of your services.

Captchas are not completely foolproof, and there are ways of defeating them using partially or fully automated methods. However, at the very least they may slow undesirable posts to a more manageable level.

Image Captchas

This is an example of an image captcha:

[Image: example image captcha]

The user must be able to read the text in the image and transcribe it back to normal text. This is a difficult (but not impossible) task for a program, which would have to use OCR algorithms that could cope with the text distortions and noisy background.

Image captchas can be difficult for the visually impaired. ExSite provides a fallback in the form of plain-text captchas.

Text Captchas

Here are some examples of plain-text captchas:

Which one of these is not like the other?
APPLE BOB POTATO BEEF BREAD
9 × 8 = ?
Enter the first and second letters of the last word in the following list:
coming carpels stiff westwards mercuric

Text captchas are relatively easy to defeat programmatically, so ExSite uses several different text captcha algorithms to expand the problem space. Some of the algorithms (such as the last one in the above examples) make use of random words, and ExSite uses the system dictionary to provide a large pool of source data.
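
As a rough illustration of the dictionary-based approach, the sketch below builds a "letters of the last word" puzzle from a word list. It is not ExSite's actual code: the function name make_letter_captcha and its parameters are hypothetical, and the word list is passed in directly rather than read from the system dictionary.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical helper, not part of ExSite: build a "first N letters of the
# last word" puzzle from a pool of words.
#   $words     - array ref of candidate words (e.g. read from a dictionary file)
#   $set_size  - how many words to show (compare captcha.word_set_size)
#   $n_letters - how many leading letters to ask for (compare captcha.max_password_size)
sub make_letter_captcha {
    my ($words, $set_size, $n_letters) = @_;

    # keep only plain lowercase words long enough to take letters from
    my @pool = grep { /^[a-z]+$/ && length($_) >= $n_letters } @$words;
    die "not enough usable words\n" if @pool < $set_size;

    # pick a random set of words; the answer comes from the last one shown
    my @set      = (shuffle @pool)[0 .. $set_size - 1];
    my $solution = substr($set[-1], 0, $n_letters);

    my $question = "Enter the first $n_letters letters of the last word "
                 . "in the following list:\n" . join(" ", @set);
    return ($question, $solution);
}

my @words = qw(coming carpels stiff westwards mercuric apple bread);
my ($question, $solution) = make_letter_captcha(\@words, 5, 2);
print "$question\n";
```

Drawing from a large dictionary rather than a short fixed list is what makes the set of possible puzzles large, which is the point made above about expanding the problem space.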

Programming Captchas

To generate an image captcha (complete with form elements):

use ExSite::Captcha;

my $c = new ExSite::Captcha(); # captcha object
my $captcha_form_input = $c->make(); # HTML form input

To generate an image captcha that includes a link to fall back to a plain-text captcha if the user requests it:

use ExSite::Input;
use ExSite::Captcha;

my $input = ExSite::Input->new()->combine(); # fetch all input data
my $c = new ExSite::Captcha(textmode=>$input->{captchamode});
my $captcha_form_input = $c->make();

The form HTML that the captcha object returns looks like this:

<div class="captcha">
<table cellspacing="0" cellpadding="0"><tr><td>
<span id="captcha_image">
<img src="/cgi/captcha.cgi?fjFwiZ1K1JD" height="40" width="100">
<input type="hidden" name="captcha_solution" value="fjFwiZ1K1JD">
</span>
</td><td>
Enter the text you see: <input type=text name="captcha" size="8">
</td></tr></table>
<a href="/cgi/page.cgi?captchamode=text&_id=14">Can't see the image?</a>
</div>

Notes:

  1. The captcha.cgi script generates the captcha image. (It uses the ImageMagick tools to do this.) The query string passes the text to encode in the image, but in an encrypted form so that the solution cannot be stolen.
  2. The captcha_solution hidden input contains the solution to the captcha, encrypted so that it cannot be decoded or tampered with by the user.
  3. The captcha input is where the user enters their solution for the captcha.
  4. If permitted, a link to switch to text captchas is provided. This URL is ignored if text captchas are not allowed.

Note that text captchas also include the captcha and captcha_solution form fields. These fields can be renamed in the declaration of the captcha object, if necessary.
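
The idea behind the hidden solution field can be sketched without ExSite. The notes above describe the solution as encrypted; the sketch below swaps in a different, simpler technique (a keyed hash, HMAC-SHA256, from the core Digest::SHA module) that lets the server check an answer without storing it and without revealing it in the page. The function names, the secret key, and the rule of ignoring case and surrounding whitespace are all assumptions for illustration, not ExSite's scheme.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Assumption: a server-side secret that is never sent to the browser.
my $SECRET = 'replace-with-a-long-random-server-secret';

# Hypothetical helper: derive the hidden form field value from the solution.
sub solution_token {
    my ($solution) = @_;
    return hmac_sha256_hex(lc $solution, $SECRET);
}

# Hypothetical helper: check the user's answer against the hidden field.
sub check_answer {
    my ($answer, $token) = @_;
    return 0 unless defined $answer && defined $token;
    $answer =~ s/^\s+|\s+$//g;    # ignore surrounding whitespace
    return solution_token($answer) eq $token ? 1 : 0;
}

my $token = solution_token('fx7qd');
print check_answer(' FX7QD ', $token) ? "pass\n" : "fail\n";
print check_answer('wrong',   $token) ? "pass\n" : "fail\n";
```

Without the secret, a user cannot forge a token for an answer of their choosing. A scheme like this does not by itself stop someone from replaying an old token together with its known answer, so it is only a starting point.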

To test whether the user has passed the captcha, use this template code:

use ExSite::Input;
use ExSite::Captcha;

my $input = ExSite::Input->new()->combine();
my $c = new ExSite::Captcha();
if ($input->{captcha}) {
    if ($c->pass()) {
        # PASS: code to accept the form goes here
    }
    else {
        # FAIL: user entered the wrong solution
    }
}
else {
    # FAIL: no captcha data - user attempted to submit a form
    # with no captcha fields in it.
}

We use the combine() method here because we don't know whether the form in question is using GET or POST. If you know which method the form uses, you can call get() or post() instead.

Complete Example

This self-contained Perl script illustrates how to use ExSite captchas. It does not include a check for missing captcha data.

#!/usr/bin/perl
use strict;
use ExSite::Config;
use ExSite::Input;
use ExSite::Captcha;

&exsite_init;
print "content-type:text/html\n\n";

my $input = ExSite::Input->new()->combine();
my $c = new ExSite::Captcha(textmode=>$input->{captchamode});
# evaluate the user's response
if ($input->{captcha}) {
    if ($c->pass()) {
        print "Thanks, that's the correct answer!";
        exit;
    }
    else {
        print "Sorry, that's not the correct answer.";
    }
}
# output a new captcha
my $captcha = $c->make();

print <<END;
<form method=POST>
$captcha
<input type=submit>
</form>
END

Configuration

Image captchas are drawn onto the blank captcha file _ExSite/images/captcha.png. You can create your own captcha background by replacing this file.

The Captcha module makes use of the following configuration settings:

captcha.charsize=19
captcha.color=#333366
captcha.dictionary=/usr/share/dict/words
captcha.distort=.25
captcha.font=ps:Courier-Bold
captcha.max_password_size=2
captcha.pointsize=30
captcha.start_x=5
captcha.start_y=30
captcha.word_set_size=5

The parameters that are most likely to need special attention on your installation are dictionary and font.

charsize is the horizontal spacing of characters in the image captcha.

color is the color of the characters in the image captcha, if the ImageMagick on your system will accept this parameter.

dictionary is a file containing a source of random words, one per line. The default points to a common system dictionary on Linux servers.

distort is the degree of distortion desired (captchas currently use the "implode" method of distorting text).

font is the font of the characters in the image captcha. Values of ps:Courier-Bold and Courier-Bold have been found to work well. Consult the documentation for the convert program in the ImageMagick suite to see what other fonts you may be able to use. The font you choose must be installed on your server.

max_password_size is the number of characters that have to be extracted from random dictionary words in some text captchas.

pointsize is the height of the characters in the image captcha.

start_x is the X coordinate of the first character in an image captcha.

start_y is the Y coordinate of the first character in an image captcha.

word_set_size is the number of random dictionary words to use in some text captchas.

Troubleshooting

No text in image captcha
    The font you are using is probably not installed on your server. Try a different font.
No image, or broken image
    Missing captcha.png base file, or broken/faulty convert program.
Missing text data in text captchas
    Cannot find dictionary of source words. Try another dictionary, or provide one.
Text data is overlapping or running off the edge of the image
    Your font or text size is too large for the base image. Choose a smaller pointsize or charsize, or create a larger base image.
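
Some of these checks can be scripted. The following is a minimal sketch, assuming a hypothetical helper named captcha_problems that is not part of ExSite. It tests whether the dictionary and base image are readable, and makes a rough estimate of whether the text fits, on the assumption that each character occupies charsize pixels starting at start_x. The captcha length of 5 is an assumption; the width of 100 is taken from the example image tag shown earlier. It does not check the font or the convert program.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of ExSite: return a list of likely problems
# for a set of captcha settings.
#   %opt keys: dictionary, base_image (file paths), start_x, charsize,
#              n_chars (captcha text length), image_width (pixels)
sub captcha_problems {
    my (%opt) = @_;
    my @problems;

    push @problems, "dictionary not readable: $opt{dictionary}"
        unless -r $opt{dictionary};
    push @problems, "base image not readable: $opt{base_image}"
        unless -r $opt{base_image};

    # Rough estimate only: assumes each character occupies charsize pixels.
    my $needed = $opt{start_x} + $opt{charsize} * $opt{n_chars};
    push @problems,
        "text needs about $needed px but image is $opt{image_width} px wide"
        if $needed > $opt{image_width};

    return @problems;
}

my @problems = captcha_problems(
    dictionary  => '/usr/share/dict/words',
    base_image  => '_ExSite/images/captcha.png',
    start_x     => 5,
    charsize    => 19,
    n_chars     => 5,      # assumed captcha length
    image_width => 100,    # width of the example image shown earlier
);
print @problems ? join("\n", @problems) . "\n" : "no obvious problems\n";
```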

To test alternative captcha parameters, you can run the captcha generation program directly:

convert \
-font [font] \
-pointsize [pointsize] \
-draw "text [start_x],[start_y] '[captcha_text]'" \
-background white \
-fill '[color]' \
-implode [distort] \
_ExSite/images/captcha.png output.png
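
When trying several parameter combinations, it may be convenient to assemble this command from a set of settings. The sketch below does that with a hypothetical helper, convert_args, which is not part of ExSite. It keeps the options in the same order as the command above, and the base_image key is an assumed name for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: assemble the convert arguments shown above from a
# hash of captcha settings. Returns a list suitable for system().
sub convert_args {
    my ($conf, $text, $outfile) = @_;
    return (
        'convert',
        '-font',       $conf->{font},
        '-pointsize',  $conf->{pointsize},
        '-draw',       "text $conf->{start_x},$conf->{start_y} '$text'",
        '-background', 'white',
        '-fill',       $conf->{color},
        '-implode',    $conf->{distort},
        $conf->{base_image},
        $outfile,
    );
}

my %conf = (
    font       => 'ps:Courier-Bold',
    pointsize  => 30,
    start_x    => 5,
    start_y    => 30,
    color      => '#333366',
    distort    => '.25',
    base_image => '_ExSite/images/captcha.png',    # assumed key name
);
my @cmd = convert_args(\%conf, 'test1', 'output.png');
print join(' ', @cmd), "\n";

# To actually run it (requires ImageMagick):
#   system(@cmd) == 0 or die "convert failed: $?";
```

Passing the arguments to system() as a list avoids shell quoting problems with values such as #333366. Captcha text that itself contains a single quote would still break the -draw argument, so the sketch assumes plain alphanumeric text.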
Filed under: programming