A multithreaded crawling library in Java

Wednesday, 8 April 2009 11:21 by romeosa

This is a simple multithreaded crawling library I developed (consider it a beta version, but it should work fine).
If you need the source code, let me know.
If you want to send feedback, feel free to post a comment.

Here is an example of how to use the library:
1) the library dispatches an event each time a new document is loaded, so register a listener for this event
2) set the maximum number of concurrent threads
3) start crawling

 

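import java.net.MalformedURLException;
import java.net.URL;
// MultiThreadCrawler, DocumentCrawledListener and DocumentCrawledEvent come from RomeosaCrawler.jar

MultiThreadCrawler crawler = new MultiThreadCrawler();

// 1) Listen for the event dispatched each time a document is loaded
crawler.addDocumentCrawledEventListener(new DocumentCrawledListener() {
    public void documentCrawled(DocumentCrawledEvent e) {
        // Print the URL of the document that was just crawled
        System.out.println("" + e.getDocument().getUrl());
        // Uncomment to also print how many URLs are still queued:
        //System.out.println("" + ((MultiThreadCrawler)e.getSource()).getRemainingUrlsCount());
    }
});

try {
    // 2) Set the maximum number of concurrent threads
    crawler.setMaxConcurrentThreads(48);
    // 3) Start crawling from the site's full URL, including the http:// prefix
    crawler.startCrawling(new URL("FULL_URL_OF_THE_SITE(WITH_HTTP)"));
} catch (MalformedURLException ex) {
    // The start URL is not valid: handle it here (for example, log it)
}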

Finally, you can download the library here: RomeosaCrawler.jar (23.68 KB)
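
If you are curious how a crawler with a listener and a thread limit can be put together, here is a minimal sketch using only the standard library. It is not the code inside RomeosaCrawler.jar: the class CrawlerSketch, the CrawlListener interface and the linkFetcher function are all names made up for this illustration, and the "site" is an in-memory map instead of real HTTP requests, so the example runs without a network.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.function.Function;

/** Hypothetical sketch of a listener-based crawler; NOT the RomeosaCrawler implementation. */
class CrawlerSketch {

    /** Called once for every crawled URL (plays the role of a "document crawled" listener). */
    interface CrawlListener {
        void documentCrawled(String url);
    }

    // Assumption: fetching a page is abstracted as "URL -> list of outgoing links".
    private final Function<String, List<String>> linkFetcher;
    private final List<CrawlListener> listeners = new CopyOnWriteArrayList<>();
    private final Set<String> visited = ConcurrentHashMap.newKeySet();

    CrawlerSketch(Function<String, List<String>> linkFetcher) {
        this.linkFetcher = linkFetcher;
    }

    void addListener(CrawlListener listener) {
        listeners.add(listener);
    }

    /** Crawls everything reachable from startUrl with at most maxThreads workers; blocks until done. */
    void crawl(String startUrl, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        Phaser pending = new Phaser(1);      // party 1 is the calling thread
        submit(startUrl, pool, pending);
        pending.arriveAndAwaitAdvance();     // wait until every submitted task has finished
        pool.shutdown();
    }

    private void submit(String url, ExecutorService pool, Phaser pending) {
        if (!visited.add(url)) {
            return;                          // this URL was already scheduled
        }
        pending.register();                  // one more task in flight
        pool.execute(() -> {
            try {
                List<String> links = linkFetcher.apply(url);
                for (CrawlListener listener : listeners) {
                    listener.documentCrawled(url);
                }
                for (String link : links) {
                    submit(link, pool, pending);   // children register before this task arrives
                }
            } finally {
                pending.arriveAndDeregister();     // this task is done
            }
        });
    }

    public static void main(String[] args) {
        // A tiny fake site: each URL maps to the links found on that page.
        Map<String, List<String>> site = Map.of(
            "http://example.com/", List.of("http://example.com/a", "http://example.com/b"),
            "http://example.com/a", List.of("http://example.com/b"),
            "http://example.com/b", List.of("http://example.com/"));
        CrawlerSketch crawler = new CrawlerSketch(url -> site.getOrDefault(url, List.of()));
        crawler.addListener(url -> System.out.println("crawled " + url));
        crawler.crawl("http://example.com/", 4);
    }
}
```

The fixed thread pool caps the number of concurrent workers, the visited set makes sure each URL is crawled only once, and the Phaser lets the caller wait until no task is left. The order in which the URLs are printed can change from run to run, since the pages are processed in parallel.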


Use your USB pen drive to encrypt your data with TrueCrypt

Saturday, 14 February 2009 06:01 by romeosa

Have you ever wanted to access your data using your USB pen drive as the "password"? In this post I'll explain how to do it.

What you need


1) a USB pen drive
2) TrueCrypt (http://www.truecrypt.org)
3) see below

What to do, part 1: create a TrueCrypt volume with a keyfile


1) Go to http://www.truecrypt.org and press Download

 

 

2) Download the Vista/XP version


3) Click on TrueCrypt Setup 6.1a



More...


Let's start

Saturday, 7 February 2009 06:32 by romeosa

Today I will start a new blog about software development. 

I'm a software engineer and I want to share some stuff with the community, so... let's start!


Categories:   Off Topic