Text-based verification on websites and other digital forums may soon be a thing of the past. So, get ready to identify objects like cars, parks, and storefronts from CAPTCHA image grids.
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. It is a kind of authentication test meant to differentiate between humans and bots. However, the large number of fake accounts, especially on social media platforms like Facebook, clearly indicates that this mechanism isn’t foolproof.
The researchers behind a recent study on image recognition mechanisms and machine learning, titled “Yet Another Text Captcha Solver: A Generative Adversarial Network Based Approach,” believe that it is important to design harder puzzles that can accurately distinguish software from humans. To that end, they have developed a new artificial intelligence algorithm based on deep learning methods.
The new system is highly effective at exposing the loopholes in CAPTCHA security and authentication systems. It can also defeat the text CAPTCHA systems that the majority of websites rely on for protection.
Called the Solver, the new algorithm was developed jointly by computer scientists from Northwest University, Lancaster University in the UK, and Peking University in China. The solver is claimed to deliver more accurate results than earlier attacks on text-based CAPTCHAs and to crack versions of CAPTCHA that previous attack systems could not. It is also fast: it can solve a CAPTCHA in just 0.05 seconds on a desktop computer.
Text-based CAPTCHAs are no longer considered reliable when it comes to website security. Such CAPTCHAs combine letters and numbers with other features, such as occluding lines, to tell humans apart from malicious automated software. The basic idea behind a text-based CAPTCHA is that humans can identify letters and numbers more easily than machines can.
The solver uses a technique called a generative adversarial network (GAN). A CAPTCHA-generation program is taught to create large numbers of training CAPTCHAs that are indistinguishable from genuine ones. These synthetic CAPTCHAs are then used to train a solver much more quickly, and the solver is then tested against genuine CAPTCHAs. Because a machine-learned generator produces the training data automatically, far less time and effort goes into collecting and manually tagging CAPTCHAs to train the software.
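The workflow described above (generate labeled synthetic CAPTCHAs cheaply, train on them, then check the result against a small genuine set) can be illustrated with a toy sketch. This is not the researchers' method: the noisy glyph renderer below is a hypothetical stand-in for their GAN-trained generator, the nearest-centroid classifier is a stand-in for their deep learning solver, and every name, size, and noise level is an assumption chosen only for illustration.

```python
import random

# Hypothetical 3x3 bitmaps for a toy three-symbol alphabet (illustration only).
GLYPHS = {
    "I": [0, 1, 0, 0, 1, 0, 0, 1, 0],
    "L": [1, 0, 0, 1, 0, 0, 1, 1, 1],
    "X": [1, 0, 1, 0, 1, 0, 1, 0, 1],
}
LABELS = sorted(GLYPHS)


def render(label, noise, rng):
    """Return the glyph for `label`, flipping each pixel with probability `noise`."""
    return [(p ^ 1) if rng.random() < noise else p for p in GLYPHS[label]]


def make_dataset(n, noise, rng):
    """Produce n (pixels, label) pairs; labels come for free, no manual tagging."""
    return [(render(lab, noise, rng), lab)
            for lab in (rng.choice(LABELS) for _ in range(n))]


def train_centroids(samples):
    """Average the pixels per label: a tiny stand-in for a neural solver."""
    sums, counts = {}, {}
    for pixels, label in samples:
        acc = sums.setdefault(label, [0.0] * len(pixels))
        for i, p in enumerate(pixels):
            acc[i] += p
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, pixels):
    """Pick the label whose centroid is closest in squared distance."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, pixels))
    return min(centroids, key=lambda lab: dist(centroids[lab]))


def accuracy(centroids, samples):
    hits = sum(predict(centroids, px) == lab for px, lab in samples)
    return hits / len(samples)


rng = random.Random(0)
synthetic = make_dataset(2000, noise=0.10, rng=rng)  # cheap, automatically labeled
genuine = make_dataset(500, noise=0.10, rng=rng)     # stand-in for a small real set
solver = train_centroids(synthetic)
score = accuracy(solver, genuine)
print(f"accuracy on the held-out 'genuine' set: {score:.2f}")
```

The point of the sketch is the division of labor: the large training set costs nothing to label because the generator already knows what it drew, so the scarce genuine samples are needed only for the final check.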
Dr. Zheng Wang, a senior lecturer at Lancaster University and co-author of the research, says:
“This is the first time a GAN-based approach has been used to construct solvers. Our work shows that the security features employed by the current text-based captcha schemes are particularly vulnerable under deep learning methods. We show for the first time that an adversary can quickly launch an attack on a new text-based captcha scheme with very low effort.”
The system needs only 500 genuine CAPTCHAs, rather than the millions normally required to train an attack program effectively.
Since the new solver technique needs minimal human involvement, it is easy to rebuild it to target new or modified CAPTCHA schemes. The solver was tested on 33 CAPTCHA schemes, 11 of which are used by popular websites such as Microsoft, eBay, and Wikipedia.
According to the lead student author of the research, Mr. Guixin:
“Given the high success rate of our approach for most of the text captcha schemes, websites should be abandoning captchas.”
The researchers also say it is high time websites started looking at alternative authentication measures that use multiple layers of security, such as a user’s usage patterns, device location, or even biometric information.