Generating CAPTCHA challenges

In many situations, such as blogs, forums, and online polls (to name a few), website operators want to guard against automated postings by spambots without burdening human visitors with registration and authentication. In such situations it has become common to present the visitor with a so-called CAPTCHA challenge (http://en.wikipedia.org/wiki/Captcha). A CAPTCHA challenge (or just Captcha) in its simplest form is a picture that should be hard for a computer to recognize yet simple for a human to decipher, typically a distorted or blurred word or number.

Of course, no method is foolproof and certainly Captchas are neither without their flaws nor immune to the ever-growing computing power available, but they still remain quite effective. Although the current consensus is that simple blurring and coloring schemes are not up to the task, computers still have a hard time separating individual characters in words when they slightly overlap, whereas humans have hardly any problem doing that.

Given these arguments, this might be an excellent application of 3D rendering of text as presumably three-dimensional renditions of words in suitable lighting conditions (that is, harsh shadows) are even harder to interpret than two-dimensional text. Our challenge then is to design a server that will respond to requests to render three-dimensional images of some text.

We will design our server as a web server that will respond to requests addressed to it as URLs of the form http://<hostname>:<port>/captcha?text=<sometext> and that will return a PNG image: a 3D rendition of that text. In this way it will be easy to integrate this server into an architecture where some software, such as a blog, can incorporate this functionality by simply accessing our server through HTTP. An example of a generated challenge is shown in the illustration:
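From the point of view of the software that uses the server, a request is nothing more than a URL. The following is a minimal sketch of how a client might build that URL, assuming Python 3; the captcha_url() helper and its default host and port are hypothetical and not part of captcha.py:

```python
from urllib.parse import quote, urlunsplit

def captcha_url(text, host="localhost", port=8080):
    """Build the request URL for a challenge text (host and port are placeholders)."""
    # Percent-encode the text so that spaces, '&' or '#' cannot break the query string.
    query = "text=" + quote(text, safe="")
    return urlunsplit(("http", "%s:%d" % (host, port), "/captcha", query, ""))

print(captcha_url("abcd"))   # http://localhost:8080/captcha?text=abcd
print(captcha_url("a b&c"))  # http://localhost:8080/captcha?text=a%20b%26c
```

Note that the request handler developed in this section uses the text exactly as it appears in the URL and does not decode percent-encoded characters, so challenge texts consisting of letters and digits are the safest choice.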

[Illustration: Generating CAPTCHA challenges]

Design of a CAPTCHA server

By making use of the modules available in a full Python distribution the task of implementing an HTTP server is not as daunting as it may seem. Our Captcha server will be based on the classes provided in Python's BaseHTTPServer module so we start by importing this module along with some additional utility modules:

import BaseHTTPServer
import re
import os
import shutil

The BaseHTTPServer module defines two classes that together comprise a complete HTTP server implementation. The HTTPServer class implements the basic server that will listen to incoming HTTP requests on some network port and we will use this class as is.
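The module can be inspected outside Blender to see both classes. The sketch below is an illustration only: it assumes that the code may also be tried on Python 3, where the same module is named http.server, and it falls back to that name if the Python 2 import fails:

```python
try:
    import BaseHTTPServer                 # Python 2, as bundled with Blender 2.4x
except ImportError:
    import http.server as BaseHTTPServer  # the same module under its Python 3 name

# The server class and the request handler base class used in this section
server_class = BaseHTTPServer.HTTPServer
handler_class = BaseHTTPServer.BaseHTTPRequestHandler
print(server_class.__name__)
print(handler_class.__name__)
```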

Upon receiving a valid HTTP request HTTPServer will dispatch this request to a request handler. Our implementation of such a request handler, based on BaseHTTPRequestHandler, is pretty lean as all it is expected to do is to field GET and HEAD requests for URIs of the form captcha?text=abcd. Therefore, all we have to do is override the do_GET() and do_HEAD() methods of the base class.

A HEAD request is expected to return only the headers of a requested object, not its content, to save time when the content has not changed since the last request (something that can be determined by checking the Last-Modified header). We ignore such niceties: we will return just the headers when we receive a HEAD request, but we will generate a completely new image nonetheless. This is something of a waste but does keep the code simple. If performance is important, another implementation may be devised.

Our implementation starts off by defining a do_GET() method that just calls the do_HEAD() method, which generates a Captcha challenge and returns the headers to the client. do_GET() subsequently copies the contents of the file object returned by do_HEAD() to the output file object of the request handler (self.wfile), which will in turn return this content to the client (the browser, for example):

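class CaptchaRequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        # do_HEAD() sends the headers and returns the open image file,
        # or None if it has already sent an error response
        f=self.do_HEAD()
        if f is not None:
            shutil.copyfileobj(f,self.wfile)
            f.close()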

The do_HEAD() method first determines whether we received a valid request (that is, a URI of the form captcha?text=abcd) by calling the gettext() method (highlighted, defined later in the code). If the URI is not valid, gettext() will return None and do_HEAD() will return a File not found error to the client by calling the send_error() method of the base class:

    def do_HEAD(self):
        text=self.gettext()
        if text is None:
            self.send_error(404, "File not found")
            return None

If a valid URI was requested, the actual image is generated by the captcha() method that will return the filename of the generated image. If this method fails for any reason an Internal server error is returned to the client:

        try:
            filename = self.captcha(text)
        except Exception:
            # any failure while rendering is reported as a server error
            self.send_error(500, "Internal server error")
            return None

If everything went well we open the image file, send a 200 response to the client (indicating a successful operation), and return a Content-type header stating that we will return a png image. Next, we use the fstat() function with the file descriptor of the open file as argument to retrieve the length of the generated image and return this as a Content-Length header (highlighted), followed by the modification time and an empty line signifying the end of the headers, before returning the open file object f:

        f = open(filename,'rb')
        self.send_response(200)
        self.send_header("Content-type", 'image/png')
        fs = os.fstat(f.fileno())
        self.send_header("Content-Length", str(fs.st_size))
        self.send_header("Last-Modified", self.date_time_string(fs.st_mtime))
        self.end_headers()
        return f
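The header values can be examined without a running server. The following is a small sketch, assuming Python 3; the temporary file merely stands in for the rendered image, and formatdate() from the standard email.utils module is used here as a stand-in for the handler's date_time_string() method:

```python
import os
import tempfile
from email.utils import formatdate

# Write a few bytes to a temporary file that stands in for the rendered image.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(b"\x89PNG\r\n\x1a\n")
    filename = tmp.name

with open(filename, "rb") as f:
    fs = os.fstat(f.fileno())          # stat information of the open file
    same_value = (fs.st_size == fs[6]) # the size is also available as tuple item 6
    headers = {
        "Content-type": "image/png",
        "Content-Length": str(fs.st_size),
        "Last-Modified": formatdate(fs.st_mtime, usegmt=True),
    }

os.remove(filename)
print(headers)
```

Because the size is read from the open file rather than from its name, the reported length always belongs to the file that is actually sent.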

The gettext() method verifies that the request passed to our request handler in the path variable is a valid URI by matching it against a regular expression. The match() function from Python's re module will return a MatchObject if the regular expression matches and None if it does not. If there actually is a match we return the contents of the first match group (the characters that match the expression between the parentheses in the regular expression, in our case the value of the text argument), otherwise we return None:

    def gettext(self):
        # the question mark is escaped so that it matches a literal '?'
        match = re.match(r'^.*/captcha\?text=(.*)$',self.path)
        if match is not None:
            return match.group(1)
        return None
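In a regular expression an unescaped ? is a quantifier that makes the preceding character optional, so the question mark that separates the path from the query has to be written as \? to be matched literally. The difference can be checked outside Blender with this sketch, in which gettext() is rewritten as a standalone function and the two pattern names are chosen only for illustration:

```python
import re

ESCAPED = re.compile(r'^.*/captcha\?text=(.*)$')   # \? matches a literal question mark
UNESCAPED = re.compile(r'^.*/captcha?text=(.*)$')  # a? means "an optional letter a"

def gettext(path, pattern=ESCAPED):
    """Return the value of the text argument, or None if the path does not match."""
    match = pattern.match(path)
    if match is not None:
        return match.group(1)
    return None

print(gettext('/captcha?text=abcd'))             # abcd
print(gettext('/captcha?text=abcd', UNESCAPED))  # None
print(gettext('/captchtext=abcd', UNESCAPED))    # abcd
print(gettext('/index.html'))                    # None
```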

Now we come to the Blender-specific task of actually generating the rendered 3D text that will be returned as a png image. The captcha() method will take the text to render as an argument and will return the filename of the generated image. We will assume that the lights and camera in the .blend file we run captcha.py from are correctly set up to display our text in a readable way. Therefore, the captcha() method will just concern itself with configuring a suitable Text3d object and rendering it.

Its first task is to determine the current scene and check whether there is an Object called Text that can be reused (highlighted). Note that it is perfectly valid to have other objects in the scene to obfuscate the display even more:

    def captcha(self,text):
        import Blender
        scn = Blender.Scene.GetCurrent()
        text_ob = None
        for ob in scn.objects:
            if ob.name == 'Text' :
                text_ob = ob.getData()
                break

If there was no reusable Text3d object, a new one is created:

        if text_ob is None:
            text_ob = Blender.Text3d.New('Text')
            ob=scn.objects.new(text_ob)
            ob.setName('Text')

The next step is to set the text of the Text3d object to the argument passed to the captcha() method and make it 3D by setting its extrude depth. We also alter the width of the characters and shorten the spacing between them to make them harder to separate. Adding a small bevel will soften the contours of the characters, which may make it more difficult for a robot to discern the characters if the lighting is subtle (highlighted). We could have chosen to use a different font for our text that is even harder to read for a bot and this would be the place to set this font (see the following information box).

Note

Something is missing

Blender's API documentation has a small omission: there seems to be no way to configure a different font for a Text3d object. There is an undocumented setFont() method, however, that will take a Font object as argument. The code to accomplish the font change would look like this:

fancyfont=Text3d.Load( '/usr/share/fonts/ttf/myfont.ttf')
text_ob.setFont(fancyfont)

We have chosen not to include this code, however, partly because it is undocumented but mostly because the available fonts differ greatly from system to system. If you do have a suitable font available, by all means use it. Script type fonts, which resemble handwriting, may raise the bar even further for a computer.

The final step is to update Blender's display list for this object so that our changes will be rendered:

        text_ob.setText(text)
        text_ob.setExtrudeDepth(0.3)
        text_ob.setWidth(1.003)
        text_ob.setSpacing(0.8)
        text_ob.setExtrudeBevelDepth(0.01)

        ob.makeDisplayList()

Once our Text3d object is in place our next task is to actually render an image to a file. First, we retrieve the rendering context from the current scene and set the displayMode to 0 to prevent an additional render window popping up:

        context = scn.getRenderingContext()
        context.displayMode=0

Next, we set the image size and indicate that we want a png image. By enabling RGBA and setting the alpha mode to 2 we ensure that there won't be any sky visible and that our image will have a nice transparent background:

        context.imageSizeX(160)
        context.imageSizeY(120)
        context.setImageType(Blender.Scene.Render.PNG)
        context.enableRGBAColor()
        context.alphaMode=2

Even though we will render just a still image, we will use the renderAnim() method of the rendering context because otherwise the results will not be rendered to a file but to a buffer. Therefore, we set the start and end frames of the animation to 1 (just like the current frame) to ensure that we generate just a single frame. We then use the getFrameFilename() method to return the filename (with the complete path) of the rendered frame (highlighted). We then both store this filename and return it as a result:

        context.currentFrame(1)
        context.sFrame=1
        context.eFrame=1
        context.renderAnim()
        self.result=context.getFrameFilename()

        return self.result

The final part of the script defines a run() function to start the Captcha server and calls this function if the script is running standalone (that is, not included as a module). By defining a run() function this way we can encapsulate the often used server defaults, such as port number to listen on (highlighted), yet allow reuse of the module if a different setup is required:

def run(HandlerClass = CaptchaRequestHandler,
        ServerClass = BaseHTTPServer.HTTPServer,
        protocol="HTTP/1.1"):
    port = 8080

    server_address = ('', port)
    HandlerClass.protocol_version = protocol
    httpd = ServerClass(server_address, HandlerClass)
    httpd.serve_forever()

if __name__ == '__main__':
    run()
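The structure of the server can be exercised without Blender. The following is a minimal sketch, assuming Python 3 (where the BaseHTTPServer module is named http.server); DemoHandler, fetch(), and the fixed PAYLOAD are hypothetical stand-ins for the real handler and the rendered image, and binding to port 0 on 127.0.0.1 instead of port 8080 on all interfaces is a choice made only for this try-out:

```python
import io
import shutil
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"not really a PNG"  # hypothetical stand-in for the rendered image

class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        f = self.do_HEAD()
        if f is not None:  # otherwise do_HEAD() has already sent an error
            shutil.copyfileobj(f, self.wfile)
            f.close()

    def do_HEAD(self):
        if not self.path.startswith("/captcha?text="):
            self.send_error(404, "File not found")
            return None
        self.send_response(200)
        self.send_header("Content-type", "image/png")
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        return io.BytesIO(PAYLOAD)

    def log_message(self, format, *args):
        pass  # keep the try-out quiet

def fetch(port, method, path):
    """Send one request to the local server and return (status, body)."""
    conn = HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request(method, path)
    response = conn.getresponse()
    body = response.read()
    conn.close()
    return response.status, body

server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: the OS picks a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

get_result = fetch(port, "GET", "/captcha?text=abcd")
head_result = fetch(port, "HEAD", "/captcha?text=abcd")
missing_result = fetch(port, "GET", "/other")

server.shutdown()      # stops serve_forever() in the background thread
server.server_close()
```

Running the server in a background thread is done here only so that the same script can also act as the client; inside Blender the serve_forever() call simply blocks until Blender is terminated.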

The full code is available as captcha.py in the file captcha.blend and the server may be started in a number of ways: from the text editor (with Alt + P), from the menu Scripts | render | captcha, or by invoking Blender in background mode from the command line. To stop the server again it is necessary to terminate Blender. Typically, this can be done by pressing Ctrl + C in the console or DOS box.

Note

Warning

Note that as this server responds to requests from anybody it is far from secure. As a minimum it should be run behind a firewall that restricts access to it to just the server that needs the Captcha challenges. Before running it in any location that might be accessible from the Internet you should think thoroughly about your network security!
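A firewall remains the safeguard recommended here. As a complement, and not as a replacement, the handler itself could refuse requests from unexpected addresses. The following is a sketch of such a check, assuming Python 3 and its standard ipaddress module; the is_allowed() function and the addresses in ALLOWED_NETWORKS are hypothetical examples and are not part of captcha.py:

```python
import ipaddress

# Hypothetical allowlist: the local machine and a single blog server
ALLOWED_NETWORKS = [ipaddress.ip_network("127.0.0.0/8"),
                    ipaddress.ip_network("192.0.2.10/32")]

def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip (a string) lies in one of the allowed networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)

print(is_allowed("127.0.0.1"))    # True
print(is_allowed("192.0.2.11"))   # False
```

Inside a request handler the address of the client is available as self.client_address[0], so a handler could call a check like this first and answer with send_error(403, "Forbidden") when it fails.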
