BIG DATA SEARCH USING SOLR & HADOOP

Recently, Kreara successfully implemented a Solr-based search server for one of our clients in the UK. The existing search server ran on a home-grown, custom-built search engine. In the company's early days, that engine scaled well with smaller datasets and few filters. As the data grew exponentially to millions of records, scalability and index caching started choking and became a hurdle for search performance.

The client currently has 50 million records in its search server, and that number is expected to grow by 100 to 200 million records within the next couple of years. The existing search server could not handle that scale, so it was decided to resolve the issue with a Solr search server. The indexing process was distributed across parallel nodes with redundant data using Hadoop and MapReduce. The result was amazing: where the old search engine had a query response time of 100-250 ms, the Solr search server handled the same queries in 10-30 ms without caching. This proof of concept (POC) was done on a standalone search server.
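To make the idea of distributing indexing across parallel nodes with redundant data a little more concrete, here is a minimal sketch in plain Python. It is not the code used in the project and does not call Hadoop; it only imitates the map and reduce steps in memory. The shard count, replica count, record fields and all function names are assumptions made up for illustration.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical number of index shards
REPLICAS = 2    # hypothetical number of copies kept of each shard


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Pick a shard from a stable hash of the document id."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def map_phase(records):
    """Map step: emit one (shard, record) pair per input record."""
    for record in records:
        yield shard_for(record["id"]), record


def reduce_phase(pairs):
    """Reduce step: collect the records of each shard into one index batch."""
    batches = defaultdict(list)
    for shard, record in pairs:
        batches[shard].append(record)
    return dict(batches)


def replica_nodes(shard, num_nodes, replicas=REPLICAS):
    """Nodes holding a copy of the shard: consecutive nodes, wrapping around."""
    return [(shard + i) % num_nodes for i in range(replicas)]


# Hypothetical sample data standing in for the client's records.
records = [{"id": f"rec-{n}", "title": f"Record {n}"} for n in range(10)]
batches = reduce_phase(map_phase(records))

for shard, batch in sorted(batches.items()):
    print(shard, replica_nodes(shard, num_nodes=4), [r["id"] for r in batch])
```

Because the shard is derived from a hash of the id, the same record always lands in the same batch, so each batch can be indexed independently and in parallel.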

The advantages of a Big Data search server using Solr and Hadoop include:

  • The power to scale out on commodity hardware.
  • SolrCloud makes scaling out easier and faster.
  • The Solr search server can avoid batch processing and manual intervention to update the indexes.
  • The Solr search server works in near real time.
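As a rough illustration of the last two points, the sketch below builds an update request that asks Solr to make new documents searchable within a short time window, using Solr's `commitWithin` update parameter, instead of waiting for a batch re-index. The host, the collection name `records`, the document fields and the helper name `build_update_request` are assumptions for this example, not details from the client's setup. The request is only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url, collection, docs, commit_within_ms=1000):
    """Build a JSON update request; Solr commits within the given milliseconds."""
    params = urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url}/solr/{collection}/update?{params}"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, collection and document.
req = build_update_request(
    "http://localhost:8983",
    "records",
    [{"id": "rec-1", "title": "Example record"}],
)
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` needs a running Solr instance, so that step is left out here.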
