google下载_在线阅读_12

首页 > google

is_048425

暂无简介

google The Robustness of Google CAPTCHAs Ahmad S El Ahmad, Jeff Yan, Mohamad Tayara School of Computer Science Newcastle University, UK {Ahmad.Salah-El-Ahmad, Jeff.Yan, Mohomad.Tayara}@ncl.ac.uk May 8, 2011 ABSTRACT We report a novel attack on two CAPTCHA...

The Robustness of Google CAPTCHAs Ahmad S El Ahmad, Jeff Yan, Mohamad Tayara School of Computer Science Newcastle University, UK {Ahmad.Salah-El-Ahmad, Jeff.Yan, Mohomad.Tayara}@ncl.ac.uk May 8, 2011 ABSTRACT We report a novel attack on two CAPTCHAs that have been widely deployed on the Internet, one being Google's home design and the other acquired by Google (i.e. reCAPTCHA). With a minor change, our attack program also works well on the latest ReCAPTCHA version, which uses a new defence mechanism that was unknown to us when we designed our attack. This suggests that our attack works in a fundamental level. Our attack appears to be applicable to a whole family of text CAPTCHAs that build on top of the popular segmentation-resistant mechanism of "crowding character together" for security. Next, we propose a novel framework that guides the application of our well-tested security engineering methodology for evaluating CAPTCHA robustness, and we propose a new general principle for CAPTCHA design. Categories and Subject Descriptors D.4.6 Security and Protection, H.1.2 User/Machine Systems. General Terms Security, Human factors Keywords CAPTCHA, robustness, segmentation attack 1. INTRODUCTION A CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart), also known as a human interaction proof, is a program that generates and grades tests that are human solvable but intended to be beyond the capabilities of current computer programs [1]. CAPTCHA makes use of a hard, open problem in AI, and is now a standard technology to defend against undesirable or malicious computer bot programs. The most widely deployed CAPTCHAs are text-based schemes, which require a user to recognize distorted characters, a task that state- of-the-art AI programs supposedly cannot do. CAPTCHAs’ robustness is the strength of their resistance to AI programs written to automatically solve CAPTCHA tests. It is well known that the robustness of a text CAPTCHA should rely on the difficulty of finding where each character is rather than what it is. The rationale is the following. Computers perform better than humans in recognizing individual characters, even under severe distortion [6]. However, locating individual characters in the right order (i.e. segmentation) is in general a computationally expensive and combinatorially hard problem for computers. Therefore, a text CAPTCHA should be segmentation- resistant; if a scheme is vulnerable to a segmentation attack, then it is effectively broken. A commonly accepted goal for CAPTCHA design is that automated attacks should not be more than 0.01% successful but that the human success rate should be at least 90%. Various segmentation-resistant mechanisms have been proposed (some representative ones are shown in Figure 1), but many of them were broken, including those developed by major companies such as Microsoft, Yahoo and Megaupload [3, 10]. A segmentation-resistant mechanism known as “crowding character together” or CCT, initially adopted by Google to let adjacent characters touch or overlap with each other at a random intersection (see Figure 1 for illustrations), was shown to be more resistant to known attacks [3]. This mechanism has been used by Google until now, and it has gained wide popularity and been in use by a large number of different text CAPTCHAs. Figure 1. CAPTCHAs designed by Microsoft, Yahoo, Megauplad and Google (clock wise), each using a different segmentation-resistant mechanism. This paper first answers an open problem that has intrigued the CAPTCHA community for years: Is the CCT mechanism vulnerable to novel attacks? Our analysis of a Google’s CAPTCHA, as currently deployed by Gmail, Blogger.com and BlogSpot, suggests that a simple but novel attack can break its CCT mechanism. This is to date the most effective attack against this scheme (For convenience, we will refer to this CAPTCHA as the Google scheme in this paper). Our attack is also applicable to ReCAPTCHA, a scheme that is widely used by millions of users of Facebook, Twitter and other Internet services on a daily basis (used by over 100,000 websites), and that was recently acquired by Google. Our attack was implemented for a ReCAPTCHA version that was active in May 2010. To our surprise, our attack, without any significant change, also works well on the latest ReCAPTCHA version, which uses a new defence mechanism that was unknown to us when we designed our attack. This suggests that our attack works in a fundamental level. Our attack exploits shape patterns found in individual characters (such as the “loop” shape in characters “a” and “p”, and the cross shape in characters ‘f’ and ‘t’), as well as connection patterns of adjacent characters. The key insight is that these features and patterns largely stay invariant under various distortions and transformations applied by the CAPTCHA generators, and can be exploited to segment connected characters. Given the CCT mechanism’s popularity, the impact of our attack is beyond the Google scheme and ReCAPTCHA. We urge the designers of other captcha schemes that rely on the CCT mechanism to carefully check how vulnerable their designs are to our attack or its variants. In the recent years, we have been working to establish a novel security engineering approach to the evaluation of CAPTCHA robustness through a series of work, including [3, 4, 7, 10] and this paper. In contrast to a parallel approach developed primarily in the computer vision, document analysis and pattern recognition communities, where sophisticated objection recognition algorithms are often the design goal, our approach applies adversarial thinking skills searching for and exploiting vulnerabilities hidden in CAPTCHAs. In this paper, we will summarise our approach, and for the first time provide a detailed framework that classifies CAPTCHA vulnerabilities into major categories. This framework provides key insights on security vulnerabilities that text CAPTCHAs should account for. It can be used to examine a CAPTCHA’s robustness, as well as to guide the design of next generation CAPTCHAs. We will also summarise general principles for the design of robust text CAPTCHAs. We not only revisit established principles, but also propose a new one. The rest of this paper is organized as follows. Section 2 discusses related work. Section 3 presents our attack on the Google scheme, and Section 4 shows that a variant of our attack works on ReCAPTCHA. Section 5 discusses further applicability of our attack and its defence. Section 6 introduces our framework for understanding Captcha vulnerabilities, and Section 7 focuses on general principles for CAPTCHA design. 2. RELATED WORK CAPTCHA has been an active field in the research communities. Due to space limit, we only review the literature that is most relevant to this paper. Kurt’s thesis [18] provided a good review of CAPTCHA research. Chellapillas’ et al [5] attacked a number of early CAPTCHAs using machine learning algorithms, and they achieved 4.89% success on an early version of Google’s CAPTCHA (around year 2004. In 2007, we developed a technique that exploited gaps (whit e space) between characters in Google’s CAPTCHA at the time, and our attack achieved 12% success [3]. It was reported in [14] that some spammers succeeded in breaking the Google CAPTCHA using two compromised zombie hosts, with each host using a variation of their attack. This attack claimed a success rate of 20%, yet no technical details have been revealed. An attack on ReCAPTCHA using a combination of image processing and OCR techniques was reported with a success rate of 17.5% [13]. At DEFCON’18, Houck presented another attack on ReCAPTCHA; using both character segmentation and character template matching technique, his attack achieved 10% success on an early version of ReCAPTCHA and 31% success on a more recent version [8]. Other notable attacks on (image recognition) CAPTCHAs include [3, 12]. 3. A SEGMENTATION ATTACK on the GOOGLE SCHEME In this section, we first review key features of the Google scheme, and then present a novel segmentation attack. Google have tweaked their CAPTCHA from time to time in order to improve its robustness. Figure 2(a) shows a version of the Google scheme that is not user-friendly. For example, it is difficult for users to tell between “d” and “cl”, “ri” and “n”, “rn” and “m”, and “w” and “vv”. We did not choose to work on this version of Google CAPTCHA, because even human cannot decode them properly with a high accuracy. For such captchas, users’ complaint will soon become a major issue or even push them offline, just as we have seen it again and again. Figure 2(b) shows another version of Google CAPTCHA, which still adopts the CCT mechanism but is easier for humans to solve than the other version, and therefore has been deployed most of the time in our experience. We have chosen to develop our attack based on the second version. (a) (b) Figure 2. Google CAPTCHA. (a) a user-unfriendly version; (b) a usable version. We studied a hundred samples that was randomly collected from the Internet1, and observed the following features. • Each challenge uses only two colours: Red, Green, or Blue for the challenge text, and White for the background. • CCT is the main segmentation-resistant mechanism so that characters connect or touch with one another horizontally. • Global warping is applied to add randomized distortion. • The thickness of characters varies much; even different portions of the same character differ in thickness with each other. This is a powerful feature for defending against potential segmentation attacks. • Random text strings are used; a text string’s length varies between 5 and 8 characters; only lower-case letters are used. • Multiple font typefaces and styles (such as bold, italics and regular) are used. The key insight behind our attack is that although the Google CAPTCHA uses different font typefaces and styles, and uses heavy distortions, shape patterns in some characters remain invariant, e.g. the “loop” in characters “a” and “p”, and the cross shape in ‘f’ and ‘t’. Our main idea is to first detect distinctive shape patterns, identify their associated characters, and then cut out these characters. It turns out that once these detectable 1 All samples we used for this paper were randomly collected from Google email registration page and Blogger.com, where the CAPTCHA was used whenever one attempted to create an account or post a comment. characters are cut out, most often a whole challenge text is also properly segmented already. Our attack includes the following sequential steps: • Pre-processing – A set of standard techniques is applied to prepare each challenge image. • Pattern based detection of characters – A set of algorithms is used to locate characters with their detectable shape patterns. • Character segmentation – A set of rules and heuristics is used to segment detected characters. 3.1 Preprocessing We first perform image up-sampling, which enlarges the image, increases its pixel details, and thus smoothes the embedded text. Then, we convert the up-sampled image into a black-and-white image. This binarising process is done via the standard thresholding method: all the pixels with a color value above a heuristically predetermined threshold is converted to black and those bellow it to white. Next, we apply Zhangs’ algorithm [16] to thin the image. Thinning is the process of identifying a binary image’s skeleton. Figure 3 shows the output of our preprocessing on a sample image. The main advantages of using thinning in our attack are: 1)Thinning normalizes the character thickness, which varied very much in a random way, to a uniformed one-pixel thickness. This greatly simplifies our attack, as it does not have to account for different character thicknesses. 2) It is much faster to process a thinned image than an un-thinned one, as the former contains far fewer pixels that matter. Clearly thinning does not segment characters – connected characters stay connected after thinning. Figure 3. After preprocessing (the image is not in its actual size due to space concerns; the original image is in Figure 2(b)) 3.2 Pattern Based Characters Detection In this step, we aim to detect categories of characters in a thinned image using shape patterns found in their generic shape. We identified four shape patterns and each belongs to a category of characters as follows. • Dot shape: “i” and “j”. • Loop shape: “a”, “b”, “d”, “e”, “g”, “o”, “p”, “q”. • Cross shape: “t” and “f”. • S shape or “S Vertical Histogram”: Characters that contain three vertically juxtaposed lines, the character “s” Detecting Characters that contains a Dot Shape. We first use “Color Filling” segmentation on the foreground components (on the thinned version of an image). A foreground component contains a single character, a group of connected characters, or a part of a character, and has a black color. “Color Filling” Segmentation or CFS is effectively like using a distinct color to flood each component, and it could be used to segment against any color, so we call it “Color Filling” segmentation, more about CFS in [3] . In Figure 4, each foreground component is segmented by CFS and is identified by its unique color. Figure 4. Segmentation of connecting component using CFS. Each component is indentified by a different color. Characters such as “i” and “j” consist of a dot and a body part underneath the dot. We detect both parts as follows: a) To detect the dot part: We use its relatively small pixel count (i.e., pixel count is the number of pixels in an component). b) To detect the body part: we apply a series of steps as follows: First, using the position of the dot, we locate all foreground components underneath it. In Figure 5 (a), only the component “spi” is positioned underneath the dot. Second, we use a modified version of CFS (flood up and down only) to ignore all parts of the components with a horizontal orientation. Figure 5 (b) shows the remaining components having a vertical orientation. Third, for each of the remaining components, we calculate the equation of a line representing its orientation (components far from the dot are ignored). In this case, only two lines are plotted as shown in figure 5 (c). Finally, the component with the line path closest to the dot component is considered as the body part of either “i” or “j”. Figure 5 (d) shows the only remaining component; in this case the body part of “i”. (a) (b) (c) (d) Figure 5. Detecting characters with a dot shape: (a) a connected component located under the dot .(b) components with a vertical orientation. (c) Plotting the line equation for the remaining thinned components underneath the dot. (c) The body part of “i”. Detecting Characters with a Loop Component. We adopted a loop detection method that we used in [3], and which involves two steps as follows: First, CFS is applied to the background color (i.e., white) of the image. As shown in Figure 6 (a), background components are now segmented and identified by different colors. Second, the above step return two types of loop components, “character loops” and “connection loops”. Connection loops are created as a result of the crowding of characters together. Characteristics of connection loops are: 1) a relatively small pixel count, or 2) a relatively large pixel count if vertically overlapping and in close proximity with character loop(s). We developed heuristics based on the pixel count and the relative position of loops to detect and remove connection loops. Figure 6 (b) shows a different example containing a “connection loop” before and after removal. (a) (b) Figure 6. Detection of characters with a loop shape. (a) CFS on the background color is used for loop detection. (b) An example of a connection loop before and after removal. Detecting Characters with a Cross. A unique characteristic of a cross shape is having four sides; upper, lower, left and right sides. We observed that drawing an imaginary box around the cross shape must intersect with the box once from each side, with each intersection representing one of the cross shape four sides. We detect the cross as follows. a) We traverse the image using the imaginary box, and if each of the four sides of the box intersects with one and only one foreground colored pixel, then the box position is labeled as a possible cross shape component. After that, we shift the box position and continue searching for other cross shapes, until the entire image is traversed. b) We filter through all the possible cross shapes, and we keep only those satisfying these conditions. First, the position of the cross shape is in the upper side of its foreground component. Second, all the foreground pixels covered by the box area are connected with each other (we used CFS to verify this condition), this condition is needed as all the pixels in a valid cross shape are connected with each other Finally, the cross shape must not overlap vertically with a loop shape. In Figure 7, the red box indicates the imaginary box and thus the location of a cross shape. Figure 7. Detecting characters with a cross shape (the detected cross shapes are highlighted by a red box). S Vertical Histogram. The unique shape characteristic of the character “s” is that it contains three vertically overlapping strokes in its shape. We detect it as follows. First, we map the image against a vertical histogram that represents the total number of foreground pixels in each column. Then we ignore all parts of the histogram that intersects with other character shapes (this is done to insure no false detection of characters having three vertically overlapping strokes, such as “a” or “e”). Second, we search the histogram for consecutive occurrence of columns with the value of three or more pixels in each column; we call such occurrence of columns as the “s-span”. Finally, if an “s-span” has a width larger that 25 pixels (a threshold for the character “s” minimum width; i.e., the component under analysis has a width large enough to contain an “s”), then we use the s-span’s left-most and right-most columns as a reference to draw a bounding box around the characters “s”. Figure 8 shows the histogram (magnified by a factor of 4) and the “s” character bounding box. Figure 8. S Vertical Histogram. Identification of the character “s” location, as highlighted by a bounding box. 3.3 Segmentation In this step, we cut out characters that have a shape pattern detected in the previous step. We use the examples used in the previous section to show how we separate ‘i’ from ‘sp’, and how to split ‘sp’, ‘ut’ and ‘ws’ – four examples illustrate how to segment a character with a dot, a character with a loop, a character with a cross, and a character with “s” shape, respectively. We first convert the detected shape pattern’s color to white (i.e. the image background color). This effectively hides the detected shape, and breaks connected characters into separate components, as shown in Figure 9. Note: for a character with dot, the detected shape includes the dot, and the vertical part of the body. All visible components in Figure 9 can be classified into two types. The first type belongs to only one character, and we call them private components. For exam

本文档为【google】，请使用软件OFFICE或WPS软件打开。作品中的文字与图均可以修改和编辑，图片更改请在作品中右键图片并更换，文字修改请直接点击文字进行修改，也可以新增和删除文档中的内容。

google

热门搜索

历史搜索