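Some pages have to finish loading dynamically before a crawler can get the complete content, so I found an HtmlUnit-based approach online. However, when execution reaches
WebClient webClient = new WebClient();
it throws: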
ClassNotFoundException: org.w3c.css.sac.ErrorHandler
My code:
package htmlunit;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class WebClientTest {
    public static void main(String[] args) {
        // Constructing the WebClient is where the ClassNotFoundException is thrown.
        final WebClient webClient = new WebClient();
        try {
            // Use HtmlUnit's own HtmlPage, not org.htmlparser.visitors.HtmlPage.
            HtmlPage page = webClient.getPage("http://www.youerw.com");
            System.out.println(page.toString());
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Release the client even if loading the page fails.
            webClient.closeAllWindows();
        }
    }
}
Do you actually have the class org.w3c.css.sac.ErrorHandler on your classpath? Have you added all the required jars?
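One quick way to answer that question is to ask the class loader directly. The sketch below is only an illustration: the class name ClasspathCheck and the helper isClassPresent are made up for this example and are not part of HtmlUnit. If it reports the class as missing, the jar that provides the org.w3c.css.sac package (in HtmlUnit 2.x distributions this is normally the SAC jar shipped in the lib folder, though your setup may differ) is probably not on the build path.

```java
public class ClasspathCheck {
    /** Hypothetical helper: returns true if the named class can be located by the class loader. */
    public static boolean isClassPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the exception from the question
        String name = "org.w3c.css.sac.ErrorHandler";
        System.out.println(name + (isClassPresent(name) ? " found" : " NOT found on classpath"));
    }
}
```

Run it with the same classpath as WebClientTest. If the class is reported missing, add every jar from the HtmlUnit lib folder to the project rather than only the main htmlunit jar, then run the check again.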