How to read the text from an image (CAPTCHA) using Selenium WebDriver with Java

【Question title】: How to read the text from image (captcha) by using Selenium WebDriver with Java
【Posted】: 2013-09-21 17:50:13
【Question】:

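I have a registration web page, and a CAPTCHA is displayed at the end of it.

I am unable to read the text from the image. My code and its output are below.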

@Test
public void loginTest() throws InterruptedException {
    System.out.println("Testing");
    driver.get("https://customer.onlinelic.in/ForgotPwd.htm");

    WebElement element = driver.findElement(By.xpath("//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"));
    System.out.println(" get the instance ");

    String elementTest = element.getAttribute("src");
    System.out.println("Element : " + elementTest);
}

Output: error

Exception in thread "main" org.openqa.selenium.NoSuchElementException: Unable to locate element: {"method":"xpath","selector":"//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"}
Command duration or timeout: 60.02 seconds
For documentation on this error, please visit: http://seleniumhq.org/exceptions/no_such_element.html
Build info: version: '2.35.0', revision: '8df0c6b', time: '2013-08-12 15:43:19'
System info: os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.6.0_26'
Session ID: 5f5b2e1a-56a4-49ad-8fd3-2870747a7768
Driver info: org.openqa.selenium.firefox.FirefoxDriver
Capabilities [{platform=XP, acceptSslCerts=true, javascriptEnabled=true, browserName=firefox, rotatable=false, locationContextEnabled=true, version=23.0.1, cssSelectorsEnabled=true, databaseEnabled=true, handlesAlerts=true, browserConnectionEnabled=true, nativeEvents=true, webStorageEnabled=true, applicationCacheEnabled=true, takesScreenshot=true}]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.openqa.selenium.remote.ErrorHandler.createThrowable(ErrorHandler.java:191)
    at org.openqa.selenium.remote.ErrorHandler.throwIfResponseFailed(ErrorHandler.java:145)
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:554)
    at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:307)
    at org.openqa.selenium.remote.RemoteWebDriver.findElementByXPath(RemoteWebDriver.java:404)
    at org.openqa.selenium.By$ByXPath.findElement(By.java:344)
    at org.openqa.selenium.remote.RemoteWebDriver.findElement(RemoteWebDriver.java:299)
    at seleniumtest.CaptchaTest.loginTest(CaptchaTest.java:41)
    at seleniumtest.CaptchaTest.main(CaptchaTest.java:59)
Caused by: org.openqa.selenium.remote.ErrorHandler$UnknownServerException: Unable to locate element: {"method":"xpath","selector":"//*[@id='forgotPassword']/table/tbody/tr[5]/td[3]/img"}
Build info: version: '2.35.0', revision: '8df0c6b', time: '2013-08-12 15:43:19'
System info: os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.6.0_26'
Driver info: driver.version: unknown
    at .FirefoxDriver.prototype.findElementInternal_(file:///C:/Users/lukup/AppData/Local/Temp/anonymous4043037924964932185webdriver-profile/extensions/fxdriver@googlecode.com/components/driver_component.js:8880)
    at .fxdriver.Timer.prototype.setTimeout/<.notify>

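【Comments】:

Tags: java selenium selenium-webdriver captcha

【Answer 1】:

Just to elaborate on the previous answers: CAPTCHA is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart". So if a "machine" can solve it, it is not really doing its job.

To get around this, there is one thing you can do: use the API of an external service such as http://www.deathbycaptcha.com. You implement their API, pass them the CAPTCHA, and get the text back. The average solving time I have observed is about 10-15 seconds.

Example implementation (taken from here):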

import com.DeathByCaptcha.AccessDeniedException;
import com.DeathByCaptcha.Captcha;
import com.DeathByCaptcha.Client;
import com.DeathByCaptcha.SocketClient;
import com.DeathByCaptcha.HttpClient;

/* Put your DeathByCaptcha account username and password here.
   Use HttpClient for HTTP API. */
Client client = (Client)new SocketClient(username, password);
try {
    double balance = client.getBalance();

    /* Put your CAPTCHA file name, or file object, or arbitrary input stream,
       or an array of bytes, and optional solving timeout (in seconds) here: */
    Captcha captcha = client.decode(captchaFileName, timeout);
    if (null != captcha) {
        /* The CAPTCHA was solved; captcha.id property holds its numeric ID,
           and captcha.text holds its text. */
        System.out.println("CAPTCHA " + captcha.id + " solved: " + captcha.text);

        /* Placeholder flag: set it to true when the site rejects captcha.text. */
        boolean solvedIncorrectly = false;
        if (solvedIncorrectly) {
            client.report(captcha);
        }
    }
} catch (AccessDeniedException e) {
    /* Access to DBC API denied, check your credentials and/or balance */
}

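【Comments】:

【Answer 2】:

Two problems.

Your XPath is wrong, which is why you get the NoSuchElementException.
Even with the correct XPath you would not be able to extract the text, because that would defeat the purpose of using a CAPTCHA.

【Comments】:

Please open the URL and look at the CAPTCHA: customer.onlinelic.in/ForgotPwd.htm

【Answer 3】:

The whole purpose of a CAPTCHA is to prevent UI automation! You may want to use an internal API to verify the operation instead.

【Comments】:

【Answer 4】:

I have a solution that works for specific websites. Take a snapshot of the whole page and extract the CAPTCHA image from it. Then divide the full width of the CAPTCHA image by the number of characters (which is usually constant for a given CAPTCHA). This gives you the individual characters of the CAPTCHA image. Reload the page repeatedly to collect every character the CAPTCHA can contain.

Once you have all the possible characters, you can take any CAPTCHA image, compare each of its characters with the images you collected, and determine which letter or digit it is.

Steps to follow:

Collect CAPTCHA images and split them into individual characters.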

private static BufferedImage cropImage(File filePath, int x, int y, int w,
            int h) {

        try {
            // Read the full CAPTCHA image and cut out the w-by-h region at (x, y).
            BufferedImage originalImage = ImageIO.read(filePath);
            BufferedImage subImage = originalImage.getSubimage(x, y, w, h);

            return subImage;
        } catch (IOException e) {
            e.printStackTrace();
            return null;
        }
    }
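As a rough sketch of how the equal-width split described above might be driven, the snippet below cuts an in-memory image into a fixed number of tiles with `getSubimage`. The class name `CaptchaSplitter`, the method `splitIntoCharacters`, and the synthetic 100x30 image are illustrative assumptions, not part of the original answer; a real CAPTCHA would be loaded with `ImageIO.read` instead.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: cuts a CAPTCHA image into equal-width character tiles. */
final class CaptchaSplitter {

    /**
     * Splits the image into charCount tiles of equal width.
     * Assumes fixed-width, non-overlapping characters; leftover pixel columns
     * on the right (when the width is not evenly divisible) are ignored.
     */
    static List<BufferedImage> splitIntoCharacters(BufferedImage captcha, int charCount) {
        if (charCount <= 0 || charCount > captcha.getWidth()) {
            throw new IllegalArgumentException("charCount must be between 1 and the image width");
        }
        int tileWidth = captcha.getWidth() / charCount;
        List<BufferedImage> tiles = new ArrayList<>();
        for (int i = 0; i < charCount; i++) {
            // getSubimage returns a view that shares pixel data with the source image.
            tiles.add(captcha.getSubimage(i * tileWidth, 0, tileWidth, captcha.getHeight()));
        }
        return tiles;
    }

    public static void main(String[] args) {
        // Synthetic 100x30 image standing in for a 5-character CAPTCHA.
        BufferedImage captcha = new BufferedImage(100, 30, BufferedImage.TYPE_INT_RGB);
        List<BufferedImage> tiles = splitIntoCharacters(captcha, 5);
        System.out.println(tiles.size() + " tiles, each " + tiles.get(0).getWidth() + "px wide");
    }
}
```

This only makes sense when every character occupies the same horizontal slot; CAPTCHAs with variable spacing or overlapping glyphs would need a different segmentation step.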

Save all the possible character images in a folder.

Now read each character image of the CAPTCHA and compare it with all the images in that folder. You can compare two images by their pixel values:

public static float getDiff(File f1, File f2, int width, int height) throws IOException {
        // Load both images; each must be at least width x height pixels.
        BufferedImage bi1 = ImageIO.read(f1);
        BufferedImage bi2 = ImageIO.read(f2);
        float diff = 0;
        for (int i = 0; i < width; i++) {
            for (int j = 0; j < height; j++) {
                int rgb1 = bi1.getRGB(i, j);
                int rgb2 = bi2.getRGB(i, j);

                int b1 = rgb1 & 0xff;
                int g1 = (rgb1 & 0xff00) >> 8;
                int r1 = (rgb1 & 0xff0000) >> 16;

                int b2 = rgb2 & 0xff;
                int g2 = (rgb2 & 0xff00) >> 8;
                int r2 = (rgb2 & 0xff0000) >> 16;

                diff += Math.abs(b1 - b2);
                diff += Math.abs(g1 - g2);
                diff += Math.abs(r1 - r2);
            }
        }
        return diff;
    }

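The image with the smallest difference value is the actual match. Append its name to the result string.

After all the character images of the CAPTCHA have been read, return the string.

In the image above, each image file name specifies the digit or character it contains.

This only works for simple CAPTCHAs such as this one: https://i.stack.imgur.com/FYPhd.png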
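The matching step described above could look roughly like the following sketch, which picks the stored character whose pixel difference to a tile is smallest. The names `TemplateMatcher`, `pixelDiff`, `bestMatch`, and `solid` are hypothetical, and the solid-color images merely stand in for real character templates loaded from the folder.

```java
import java.awt.image.BufferedImage;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: labels a character tile by its nearest stored template. */
final class TemplateMatcher {

    /** Sum of absolute per-channel RGB differences over the overlapping area. */
    static long pixelDiff(BufferedImage a, BufferedImage b) {
        int width = Math.min(a.getWidth(), b.getWidth());
        int height = Math.min(a.getHeight(), b.getHeight());
        long diff = 0;
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                diff += Math.abs(((p >> 16) & 0xff) - ((q >> 16) & 0xff)); // red
                diff += Math.abs(((p >> 8) & 0xff) - ((q >> 8) & 0xff));   // green
                diff += Math.abs((p & 0xff) - (q & 0xff));                 // blue
            }
        }
        return diff;
    }

    /** Returns the label of the closest template, or null if there are no templates. */
    static String bestMatch(BufferedImage tile, Map<String, BufferedImage> templates) {
        String bestLabel = null;
        long bestDiff = Long.MAX_VALUE;
        for (Map.Entry<String, BufferedImage> entry : templates.entrySet()) {
            long d = pixelDiff(tile, entry.getValue());
            if (d < bestDiff) {
                bestDiff = d;
                bestLabel = entry.getKey();
            }
        }
        return bestLabel;
    }

    /** Builds a uniformly colored 4x4 image; a stand-in for a real character template. */
    static BufferedImage solid(int rgb) {
        BufferedImage img = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < 4; x++) {
            for (int y = 0; y < 4; y++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) {
        Map<String, BufferedImage> templates = new LinkedHashMap<>();
        templates.put("A", solid(0x000000));
        templates.put("B", solid(0xFFFFFF));
        // A dark gray tile is closer to the black "A" template than to the white "B" one.
        System.out.println(bestMatch(solid(0x202020), templates));
    }
}
```

Calling `bestMatch` once per tile and concatenating the returned labels would give the CAPTCHA string. A raw pixel difference is sensitive to noise, rotation, and color changes, which is consistent with the caveat that the approach suits only simple CAPTCHAs.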

【Comments】:

【Answer 5】:

Here is sample code that reads the text from an image:

import java.awt.Image;
import java.awt.image.BufferedImage;
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import javax.imageio.ImageIO;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.testng.annotations.BeforeTest;
import org.testng.annotations.Test;
import com.asprise.util.ocr.OCR;

public class ExtractImage {

 WebDriver driver;

 @BeforeTest
  public void setUpDriver() {
   driver = new FirefoxDriver();
  }

 @Test
 public void start() throws IOException{

 /*Navigate to http://www.mythoughts.co.in/2013/10/extract-and-verify-text-from-image.html page
  * and get the image source attribute
  *  
  */  
 driver.get("http://www.mythoughts.co.in/2013/10/extract-and-verify-text-from-image.html");
 String imageUrl=driver.findElement(By.xpath("//*[@id='post-body-5614451749129773593']/div[1]/div[1]/div/a/img")).getAttribute("src");
 System.out.println("Image source path : \n"+ imageUrl);

 URL url = new URL(imageUrl);
 Image image = ImageIO.read(url);
 String s = new OCR().recognizeCharacters((RenderedImage) image);
 System.out.println("Text From Image : \n"+ s);
 System.out.println("Length of total text : \n"+ s.length());
 driver.quit();

 /* Use below code If you want to read image location from your hard disk   
  *   
   BufferedImage image = ImageIO.read(new File("Image location"));   
   String imageText = new OCR().recognizeCharacters((RenderedImage) image);  
   System.out.println("Text From Image : \n"+ imageText);  
   System.out.println("Length of total text : \n"+ imageText.length());   

   */ 
}

}

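Here is the output of the program above:

Image source path : http://2.bp.blogspot.com/-42SgMHAeF8U/Uk8QlYCoy-I/AAAAAAAADSA/TTAVAAgDhio/s1600/love.jpg

Never M2 use the O, ne who likes you Never say busy to the one who needs you Never cheat the one who ReaZZy trusts you, Never forget the one who remembers you.

Length of total text : 175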

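【Comments】:

【Answer 6】:

The forgot-password form is inside an iframe. That is why Selenium cannot find the element. You need to switch to the iframe that holds the form first and then call findElement. Your XPath is correct.

Use driver.switchTo().frame(arg0) to switch to the frame. See the javadoc.

As for getting the CAPTCHA text, I do not understand what you mean by "store the test and compare". Ideally, you should not be able to read the text from a CAPTCHA (as others have mentioned). Another approach I have seen is to store the CAPTCHA value as the alt text in the development and QA environments. That way you can read it and type it into the text box. When the code goes to production or any other external environment, this alt text can be removed.

【Comments】:

【Answer 7】:

You cannot read the text from a CAPTCHA. If you could read it, there would be no point in using a CAPTCHA.

【Comments】:

But from the back end, we can store the test and compare.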