Open Source Crawlers in Java
If you need to crawl the web to gather information, there are a few open source Java crawlers worth a look; they are listed below.
Of these, I would recommend Smart and Simple Web Crawler: https://crawler.dev.java.net/
Why? Because I have used it.
It is a fairly simple crawler that serves the purpose: it loads a page and parses it.
It fires appropriate events while crawling (i.e. it supports an event model, so your code can react to what the crawler finds).
It has HTML and HTTP parsing capabilities.
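To make the load, parse, and fire-events flow concrete, here is a minimal sketch in Java. It is not the API of Smart and Simple Web Crawler or of any other crawler mentioned in this post. The names CrawlListener and EventCrawlerSketch and the example.test URLs are made up for illustration, and the page loader is an in-memory map so the example runs without network access.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical listener interface; not taken from any real crawler library. */
interface CrawlListener {
    void pageLoaded(String url, String html);
    void linkFound(String fromUrl, String toUrl);
}

/** Breadth-first crawler sketch: load a page, parse it, fire events. */
class EventCrawlerSketch {
    // Naive href extraction (double-quoted values only); a real crawler
    // would use a proper HTML parser and resolve relative links.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Assumption: the loader maps a URL to its HTML, or null if unavailable.
    private final Function<String, String> loader;
    private final List<CrawlListener> listeners = new ArrayList<>();

    EventCrawlerSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    void addListener(CrawlListener listener) {
        listeners.add(listener);
    }

    /** Crawls from startUrl, visiting at most maxPages pages; returns them in visit order. */
    List<String> crawl(String startUrl, int maxPages) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Queue<String> frontier = new ArrayDeque<>();
        frontier.add(startUrl);
        seen.add(startUrl);
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.remove();
            String html = loader.apply(url);          // load the page
            if (html == null) {
                continue;                             // skip pages that cannot be loaded
            }
            visited.add(url);
            for (CrawlListener l : listeners) {
                l.pageLoaded(url, html);              // event: page loaded
            }
            Matcher m = HREF.matcher(html);           // parse the page
            while (m.find()) {
                String link = m.group(1);
                for (CrawlListener l : listeners) {
                    l.linkFound(url, link);           // event: link found
                }
                if (seen.add(link)) {
                    frontier.add(link);               // queue each link only once
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // In-memory "web" so the sketch runs offline.
        Map<String, String> pages = Map.of(
                "http://example.test/", "<a href=\"http://example.test/a\">A</a>",
                "http://example.test/a", "<a href=\"http://example.test/\">home</a>");
        EventCrawlerSketch crawler = new EventCrawlerSketch(pages::get);
        crawler.addListener(new CrawlListener() {
            public void pageLoaded(String url, String html) {
                System.out.println("loaded " + url);
            }
            public void linkFound(String from, String to) {
                System.out.println("  link " + from + " -> " + to);
            }
        });
        crawler.crawl("http://example.test/", 10);
    }
}
```

To crawl real sites you would swap the map-backed loader for one that fetches over HTTP; the listener code would stay the same, which is the main appeal of an event model.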
- Smart and Simple Web Crawler - https://crawler.dev.java.net/
- WebSPHINX - http://www.cs.cmu.edu/~rcm/websphinx/
- Archive-Crawler - http://archive-crawler.sourceforge.net