In my previous blog, Searching your Java CMS using Apache Solr: Introduction, I looked at how to synchronize the information in a Java CMS with a Solr index. This blog is an introduction to the Lucene Connectors Framework, a crawler framework I will use to solve the problem of making the information from a Java CMS search-able using Solr. I will show you how to build, deploy and get it running as a web crawler. In part 2 of this introduction I will extend LCF with a new Connector.
The Lucene Connectors Framework, an incubator project at Apache, provides a framework for connecting a source content repository to target repositories or indexes, such as Apache Solr. Last month the Lucene Connector Framework published their first build-able sources.
Read the rest of this entry »