Apache Solr 4 Cookbook Review

Summary

Apache Solr 4 Cookbook was published by Packt Publishing Ltd. in January 2013.  It is the second cookbook published for Apache Solr, following Apache Solr 3.1 Cookbook.  Packt Publishing gave me the opportunity to review the book and publish my comments with an unbiased opinion, so here I review Apache Solr 4 Cookbook and what I expected from it.  The book details are as follows:

Book Details

Title : Apache Solr 4 Cookbook

Language : English

Paperback : 328 pages [ 235mm x 191mm ]

Release Date : January 2013

ISBN : 1782161325

ISBN 13 : 9781782161325

Author(s) : Rafał Kuć


Review: Apache Solr 4 Cookbook

Let’s see what we get from the cookbook:

  • Solr installation and its configuration for Jetty and Apache Tomcat
  • Standalone ZooKeeper installation and its configuration
  • The capabilities of DataImportHandler (DIH) and its configuration
  • Document language detection configuration and library references
  • DirectSolrSpellChecker: configuration of the new spellchecker class
  • XML and HTML extraction and indexing using Nutch and Tika
  • Various approaches to querying Solr index data
  • Various faceting capabilities and how to query them
  • Basic configuration of Solr’s document, query, and filter caches
  • SolrCloud coverage, starting in Chapter 7
  • Result highlighting configuration
  • Basic problems and how to handle them (Chapter 9)
  • Commonly expected data-querying scenarios (Appendix)

We can learn the above concepts and their configuration from Apache Solr 4 Cookbook.  I found this book a good starting point for Solr beginners.  Consider the meaning of “cookbook” from an information technology perspective:

Cookbook: involving or using step-by-step procedures whose rationale is usually not explained

By that definition, Apache Solr 4 Cookbook serves beginners well; however, a cookbook is typically meant for intermediate to experienced users.



Let’s move on to what could have been covered in the cookbook:

  • First of all, Solr 4 is also known as SolrCloud; covering SolrCloud’s capabilities and functionality in more depth would add value to the cookbook
  • Recipes about new functionality introduced in Solr 4, with configurations and recommendations, for example Solr 4 vs. Solr 3.x
  • Recipes discussing configurations that are deprecated or discontinued in Solr 4
  • Recipes on the do’s and don’ts of SolrCloud configuration, with recommendations
  • Chapter 9, ‘Dealing with Problems’, deserves a closer look.  The problems it covers have been known since day one.  It doesn’t talk about SolrCloud problems and how to deal with them, which is what is expected in this cookbook!
    • For example: what could happen when turning soft commit on or off, the problems of deploying multiple collections on the same ZooKeeper ensemble, how to deal with SolrCloud shard distribution and replication numbers, how to deal with problems while resizing a SolrCloud cluster, etc.
  • How could it have been done?
    • Taking an example: Solr 4 introduces DirectSolrSpellChecker.  It uses the main index to provide spelling suggestions and does not need to be rebuilt after every commit.  Describing this new capability and its configuration would hit the spot for the Solr community
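
To make the DirectSolrSpellChecker example a bit more concrete, here is a minimal Python sketch of the kind of recipe I have in mind.  It only builds a spellcheck request URL and does not contact a server.  The host, the collection1 core name, the /spell handler path, and the build_spellcheck_url helper are my assumptions for illustration; they are not taken from the book, and your setup may differ.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url, text, count=5):
    """Build a Solr spellcheck query URL (hypothetical helper, illustration only)."""
    params = {
        "q": text,
        "spellcheck": "true",        # turn the spellcheck component on for this request
        "spellcheck.q": text,        # the text to check
        "spellcheck.count": count,   # maximum number of suggestions per term
        "spellcheck.collate": "true",
        "wt": "json",
        # No spellcheck.build parameter here: DirectSolrSpellChecker reads the
        # main index, so there is no separate spelling index to rebuild.
    }
    return base_url + "?" + urlencode(params)


# Assumed host, core name, and handler path; adjust to your installation.
url = build_spellcheck_url("http://localhost:8983/solr/collection1/spell", "solr cokbook")
print(url)
```

A recipe in the book could pair a request like this with the matching spellchecker definition in solrconfig.xml and explain what changes compared with the index-based spellchecker in Solr 3.x.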

Covering the points above would surely hit the spot for the Solr community; I believe the community will agree!


Conclusion

Apache Solr 4 Cookbook covers Solr configuration that is mostly already known from previous versions of Solr.  When a new version hits the market, a book should talk about the new functionality and improvements, which gives users and the community a precise idea of Solr 4, aka SolrCloud.

I would suggest the title ‘Apache Solr 4 Beginner’s Cookbook’.

Advanced users? As always, dig for yourself or just poke around on the internet, for example at solr.pl 🙂