
Course News: TMA #2 Results and Discussion January 8, 2009

Posted by eingang in Course News.
Chart showing mark distribution for TMA 2. Average is in the 65 to 69 range.
Chart by Michelle A. Hoyle
Attribution-NonCommercial License


I just realized that, in the holiday madness, I finished marking TMA 2 and made the nice chart above, but I did not remember to post the chart or comment generally on TMA 2’s results. Here, belatedly, are some comments to remedy that.

The above chart shows the distribution of marks. The actual range was 44% to 87%, very similar to the range in TMA #1. The clustering, however, was more pronounced, with the majority of people scoring in the 65% to 69% range. Similar to TMA #1, the big issue for most people was relating what they had done to specific issues and research covered in the course. For the ECA it will be very important to write from your own context, but you must relate it specifically to content in the course. For example, “I provided both a PDF and an RTF version of my learning resource because, according to Taylor (2008), many assistive technology devices cannot read Word documents but can read RTFs.” This phrasing accomplishes three things. It explains what you did; it explains why you did it; and it supports your action and justification with a course-based reference. Just stating that you did something, while that something may demonstrate good practice, does not show that you understand why it is good practice. Likewise, being very vague about what specifically you did may not be helpful. Adding alternative text descriptions to graphics may be good practice, but it only enhances accessibility when the descriptions are added where they are needed and say something useful. The classic example is people using blank “spacer” images to position content on web pages and blindly adding text descriptions of “spacer image”. Imagine how tedious, annoying, and useless it is to hear that over and over again via your screen reading software.

On the whole, though, everyone is doing satisfactorily and is well on the way to passing the course. I did get to see some very neat resources. I was particularly impressed with Dwayne’s enthusiastic attempt at converting a PowerPoint presentation into a Flash-based application for use on a Moodle site. That was quite brave, even if the result wasn’t completely what would be desired. I’m sure he and everyone else learned a lot in the process of creating their learning resources. I know that I certainly had a very interesting experience in trying out the various resources using the built-in screen reading software for my computer.

That leaves only the end of course assessment (ECA), due January 23rd. I will be posting some separate advice and comments about that later.

CAPTCHAs: Accessibility vs Security January 8, 2009

Posted by eingang in Interesting.
Sample CAPTCHA image
Image by BMauer
Public Domain


You have probably signed up at a web site where you were presented with a graphic showing some combination of letters, numbers, or words and asked to type them in to prove that you are a real human being and not some kind of spam bot. The “following/finding” image above is an example of a word-based version of that task. These images are known as CAPTCHAs (Completely Automated Public Turing Test to tell Computers and Humans Apart). My earliest recollection of seeing them in wide use was on blog sites with open commenting. Automated programs would submit “comments” consisting of links to pornographic web sites or pharmacy sites. For popular bloggers, even if they had a system to moderate comments before making the comments publicly visible, the overhead in managing their blog could quickly become unreasonable. For similar reasons, sites like Yahoo Directory, Google Mail, and HotMail were also fairly quick to adopt CAPTCHAs.

For most people, the main issue with CAPTCHAs was whether they were effective or not. As with anti-virus efforts, it is an ongoing fight between the guys in the white hats, who want to protect their systems, and the guys in the black hats, who want to pervert the protected systems to their own ends. From an accessibility point of view, though, that issue was small potatoes. Even users with perfect vision often have trouble with CAPTCHAs because of the level of distortion involved in obscuring the letters or words. The solution to that was to add a “refresh” or “recycle” button to the CAPTCHA so it would give you a new CAPTCHA.

However, if you were blind or had poor vision, it was pretty much impossible to get past the graphic. What the initial CAPTCHA developers had failed to consider was how users relying on assistive technology to surf the web were going to be able to use a CAPTCHA graphic. Why was that? Consider the usual way of making graphical content accessible: add a description to the image. If the task for our sample CAPTCHA above is to type out the words in the picture, putting “CAPTCHA image with the words ‘following’ and ‘finding’” as the description is going to help those not using images, yes, but it is also one hundred percent accessible to automated programs. While we obviously like to endorse accessibility for all, there is a tension between accessibility and security; it is completely undesirable for automated programs to be able to circumvent a security system so easily.
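To make that concrete, here is a minimal sketch in Python of how little work an automated program would need to do if the answer were placed in the image description. The markup, the file name `captcha.png`, and the helper names are all made up for illustration; this is not taken from any real CAPTCHA service.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect the non-empty alt text of every <img> element in a page."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags are also routed here by HTMLParser.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alt_texts.append(alt)


def read_alt_texts(html):
    """Return the image descriptions a screen reader (or a bot) would see."""
    collector = AltTextCollector()
    collector.feed(html)
    return collector.alt_texts


# Hypothetical markup for the sample CAPTCHA, with the answer in the description.
page = (
    '<form><img src="captcha.png" '
    'alt="CAPTCHA image with the words \'following\' and \'finding\'"></form>'
)
print(read_alt_texts(page))
```

The same few lines that let a screen reader announce the description hand the answer straight to a spam bot, which is the tension in a nutshell.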

reCAPTCHA sample with refresh and audio components
Figure 1: “overlooks/inquiry” reCAPTCHA Example

One solution to the accessibility issue was to add an audio component to the CAPTCHA. The “overlooks/inquiry” image shows a reCAPTCHA example that incorporates the refresh button (the recycle-like symbol at the top of the column of icons), a help icon (at the bottom of the icon column), and the audio CAPTCHA icon (middle of the icon column). When you click the audio icon, the large word area of the CAPTCHA is replaced with a mini audio player and you are instructed to type what you hear. The audio in most examples I have tried is not the words in the graphical version. The audio quality is usually poor and may, on purpose, be distorted with additional people speaking or background noise in order to make it difficult for automated speech recognition programs to function. I often have trouble with the audio because of my own neurological hearing problems and the interference caused by background noise and lack of context. Try it yourself on a few examples at the reCAPTCHA site.

You might be thinking that the audio reCAPTCHA is a good compromise, ensuring accessibility for human beings while denying it to automated programs. Unfortunately, recent research has revealed that all of the common audio CAPTCHAs in use were vulnerable to automated speech processing techniques, which solved them with anywhere from roughly 50 percent to 70 percent accuracy. This excerpt from the December 8, 2008 Ars Technica article “Computer scientists find audio CAPTCHAs easy to crack” summarizes the important results:

The work involved gathering 1,000 audio CAPTCHAs from Google, Digg, and the reCAPTCHA service. 900 of these were used as a training set and the remaining 100 were set aside to test the system when done. The software first did a rough audio analysis, dividing each item into equal-sized chunks, each sufficiently long to fit any spoken character. Those segments with the highest energy peaks, which are considered most likely to contain actual letters, were set aside for analysis.

The authors tested a number of methods used to extract features from recordings of speech (for the curious, these are mel-frequency cepstral coefficients and two forms each of perceptual linear prediction and relative spectral transform-PLP). These features were then subjected to analysis using machine learning programs, which were trained on the identification of individual characters. Three methods—AdaBoost, support vector machines (SVM), and k-nearest neighbor (k-NN)—were trained using the 900 audio CAPTCHAs that had been processed manually. The result of this pairing of processing and analysis methods was a total of 15 different attempts at cracking each of the 100 test audio CAPTCHAs.

Google’s audio CAPTCHAs consist of a series of the digits 0 through 9 recited over background noise of speech played backwards. That was nowhere close to enough to consistently fool the researchers’ software; the SVM technique got the CAPTCHA right about two-thirds of the time, and AdaBoost wasn’t far behind (k-NN performed badly in this test). Digg uses both digits and letters, but plays them over a less complex background that sounds like flowing water. AdaBoost failed this test entirely, but SVM was able to clear 70 percent accuracy with several of the processing techniques; k-NN trailed it by a significant margin.

reCAPTCHA’s own audio version was similar to Google’s but used different speakers for different digits. This proved to be a significant barrier to the learning algorithms, which, at best, got it right a bit less than half the time (again, SVM was the star). As the authors point out, however, getting it right half the time would be more than worth the effort for spammers that may have hundreds or thousands of computers at their disposal. Some sites also allow the answer to be off by one digit, which would significantly increase the success rate.

[From Computer scientists find audio CAPTCHAs easy to crack]
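The first step described in the excerpt, cutting the audio into equal-sized chunks and keeping the most energetic ones, can be sketched roughly as follows. This is only my simplified illustration, not the researchers' code: it uses the sum of squared samples as the "energy" of a chunk, and the signal is synthetic (quiet noise with two loud bursts standing in for spoken digits). The function names and parameters are my own.

```python
import math


def chunk_energies(samples, chunk_size):
    """Split samples into equal-sized chunks; return (start_index, energy) pairs."""
    energies = []
    for start in range(0, len(samples) - chunk_size + 1, chunk_size):
        chunk = samples[start:start + chunk_size]
        # Energy here is simply the sum of squared sample values (an assumption).
        energies.append((start, sum(s * s for s in chunk)))
    return energies


def loudest_chunks(samples, chunk_size, keep):
    """Return the start indices of the `keep` highest-energy chunks, in time order."""
    ranked = sorted(
        chunk_energies(samples, chunk_size), key=lambda pair: pair[1], reverse=True
    )
    return sorted(start for start, _ in ranked[:keep])


# Synthetic signal: quiet background with two louder bursts at samples 100 and 300.
signal = [0.01 * math.sin(i) for i in range(400)]
for burst_start in (100, 300):
    for i in range(burst_start, burst_start + 50):
        signal[i] = math.sin(i * 0.3)

print(loudest_chunks(signal, chunk_size=50, keep=2))  # prints [100, 300]
```

In the study, the chunks that survive this kind of filtering are what get passed on to feature extraction and then to the classifiers (AdaBoost, SVM, k-NN); none of that is attempted here.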

We again have that tension between accessibility for people and inaccessibility for automated programs. A 50 percent success rate is not low enough to deter the bad guys. What can be done? The researchers did conclude that “more of just about everything is better: more speakers, more characters, more distortion, and longer strings of tokens all seem to make a difference. As a result, they have expanded their own service to include all numbers from 0 to 99.” Time will tell how that pans out. I still wish we did not have to rely on different speakers, distortion, and entire sentences for audio CAPTCHAs, as that too poses its own accessibility issues for those with physical or neurological hearing problems.
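A quick back-of-the-envelope calculation shows why a 50 percent success rate is no deterrent. The sketch below assumes each attempt is independent and that the bot can simply request a new CAPTCHA after a failure; both are my assumptions, and the function name is my own.

```python
def pass_within(success_rate, attempts):
    """Probability of at least one success in `attempts` independent tries."""
    return 1 - (1 - success_rate) ** attempts


# With a 50 percent success rate per attempt:
for attempts in (1, 2, 3, 5):
    print(attempts, pass_within(0.5, attempts))
```

Three tries already give a bot an 87.5 percent chance of getting through, and a spammer with hundreds of machines can afford far more than three tries.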

Perhaps there is mileage in some of the lesser-used systems that ask people to do simple mathematics or ask common-sense questions like “What colour is grass?” I suspect those too will be quite vulnerable to automated systems as the number of questions will be limited. Unsatisfactorily, we may have to settle for the situation as it currently stands until someone cleverer than me has a bright idea. If you had to solve the problem of making CAPTCHA technology accessible but secure, how would you do it? Or is there a better way to separate the people from the programs?


Course News: TMA 2 Marking Update December 8, 2008

Posted by eingang in Course News.
The Progress Bar with a Windows progress bar as its logo
Photo by jacksonmedeiros
Attribution-NonCommercial-Share Alike License


I’m afraid I’m only just starting the TMA 2 marking as my laptop had a horrible fan death a few days after I picked up everyone’s TMA 2. I had to back everything up and send the laptop away, which sucked up a week. The good news is that I am starting today and I hope <crosses fingers> to do at least one per day, returning all submitted on time by Saturday. I’d already hoped to be done, but these things happen.

In the meantime, I’ve noticed it’s been pretty quiet. Threads for Weeks 13 and 14 were created previously, so why not stop in and give those a whirl while you’re waiting for TMA 2 to come back? I’ll probably work ahead a little and create Week 15’s before TMA 2 is submitted.

Disability 2.0 Seminar by Sarah Lewthwaite November 26, 2008

Posted by eingang in Interesting.

If you’re in the region of the University of Sussex at Falmer this Friday afternoon, you might want to consider attending the Human-Centred Technology group’s seminar series. The speaker on November 28th is Sarah Lewthwaite from the University of Nottingham’s School of Education. She’ll be giving a talk about the experiences of disabled students and Web 2.0 technologies. The abstract is as follows:

Presenter: Sarah Lewthwaite (University of Nottingham)

Title: Disability 2.0: Facebook, the Academy and Student (dis)Connections.

Abstract: For many young people, online social networks such as Facebook are an essential part of their student experience. Other social web-based services like Wikipedia and YouTube are also an important facet of everyday student life. New technologies have always been scrutinized for their capacity to support education and, as these social technologies become more pervasive, universities are increasingly seeking to appropriate them for teaching and learning.

However, the educational impact of applying these Web 2.0 technologies for all users is unclear.

The experiences of disabled students crystallize many of the issues raised by the movement of the academy into the digital domain, disputing the notion of social networks as universally popular, transparent and inclusive. This presentation is based upon ongoing qualitative PhD research. Discussion will focus on data collected during 14 interviews with disabled students at different stages in their University studies. Interviews utilise screen capture, participatory and accessible methods to explore how the societal elements of disability transpire and transform online.

The seminar will be in the Interact Lab (Arundel 223), starting at 13:30 and lasting for an hour, with tea/coffee & cake afterwards. All are welcome to attend. More information can be found at http://www.informatics.sussex.ac.uk/events/HCTSeminars/. I shall be there and I’ve heard a rumour that Chris Douce will also be attending.

Course News: Seale e-Book Now Downloadable for Personal Use November 25, 2008

Posted by eingang in Course News.
Columns and columns of newspapers at a newspaper kiosk
Photo by birdfarm
Attribution-NonCommercial License


As you know, access to the e-book version of Jane Seale’s E-Learning and Disability in Higher Education: Accessibility Research and Practice was a little problematic at the beginning of the course, due to restrictions on concurrent access and saving/printing. Many people ended up ordering print copies of their own. The Course Team was aware of the issues and was working with the Library to come up with some kind of satisfactory solution. While they haven’t got the ideal solution, the situation has improved somewhat. Mary Taylor posted a message on the 21st of November in the course’s General Forum saying that an agreement had been reached whereby students could be given a one-time-only download link to a DRM-protected Adobe PDF version of the e-book. It can be printed out or read offline, but you cannot copy and paste from it. Nevertheless, it is an improvement. If you’re interested, drop Mary an e-mail as per the instructions in her post (requires authentication).