Posts Tagged ‘Hadoop’

Announcing Dutch Lucene User Group

August 26th, 2009 by Uri Boness
(http://blog.jteam.nl/2009/08/26/announcing_lucene_user_group/)

In the last 3 years we’ve witnessed the rise of open source enterprise search. Of course it was always there, and Apache Lucene in particular has been around since, well… the previous century. But in the last 3 years the interest in this area has grown dramatically, and the install/user base of the different Lucene-related projects (Lucene Java and Solr in particular) has grown at an amazing rate. Today, the Lucene ecosystem is booming: there is high demand for expertise in this field, yet supply remains relatively low. The Lucene / Solr mailing lists are flooded with hundreds of questions each week, and the need to share knowledge is evident.

Read the rest of this entry »

Introduction to Hadoop

August 4th, 2009 by Martijn van Groningen
(http://blog.jteam.nl/2009/08/04/introduction-to-hadoop/)

Recently I was playing around with Hadoop, and after a while I realized that it is a great technology. Hadoop allows you to write and run your application in a distributed manner and process large amounts of data with it. It consists of a MapReduce implementation and a distributed file system. Personally I did not have any experience with distributed computing beforehand, but I found MapReduce quite easy to understand.

In this blog post I will give an introduction to Hadoop by showing a relatively simple MapReduce application. This application will count the unique tokens inside text files. With this example I will try to explain how Hadoop works. Before we start creating our example application, we need to know the basics of MapReduce itself.
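
As a rough illustration of the MapReduce idea behind such a token counter, here is a minimal in-memory sketch in plain Python. It does not use Hadoop or its API. The function names (map_phase, shuffle, reduce_phase, count_tokens) are made up for this illustration, and it assumes that "counting the unique tokens" means counting how often each distinct, lowercased, whitespace-separated token occurs.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (token, 1) pair for every whitespace-separated token."""
    for token in line.split():
        yield token.lower(), 1


def shuffle(pairs):
    """Group emitted values by key, standing in for what a MapReduce
    framework does between the map and reduce steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(token, counts):
    """Reduce step: sum all the 1s emitted for a single token."""
    return token, sum(counts)


def count_tokens(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(token, counts) for token, counts in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read lines from text files.
    sample = ["the quick brown fox", "the lazy dog"]
    print(count_tokens(sample))
```

In a real distributed setting the map and reduce steps would run on different machines and the grouping would be handled by the framework; here everything runs in one process purely to show the data flow.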

Read the rest of this entry »