Other Big Data Tools and Technologies | 185
named Doug Cutting. There is a hosted Lucene service on Microsoft Windows Azure, which
means that Lucene can be utilized as a cloud service as well.
7.6.1 Lucene in Search Applications
Lucene is a simple yet powerful Java-based search library. It can be used in any application to
add search capability. Lucene is a scalable, high-performance library used to index and search
virtually any kind of text. The Lucene library provides the core operations required by any
search application, such as indexing and searching.
If we have a web portal with a huge volume of data, then we will most probably require a
search engine in the portal to extract relevant information from that pool of data. Lucene
can serve as the heart of such a search application, providing the vital operations pertaining
to indexing and searching.
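The two core operations named above, indexing and searching, can be illustrated with a toy inverted index. This is a minimal sketch in Python and does not use Lucene's API; the class name TinyIndex, the tokenizer, and the sample documents are all assumptions made for illustration. It only shows the general idea of mapping each term to the documents that contain it.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index (hypothetical, not Lucene): token -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.documents = {}

    def add(self, doc_id, text):
        """Indexing step: record which tokens occur in the document."""
        self.documents[doc_id] = text
        for token in tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, query):
        """Searching step: return sorted ids of documents containing every query token."""
        tokens = tokenize(query)
        if not tokens:
            return []
        # .get avoids creating empty entries for unknown tokens
        result = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self.postings.get(token, set())
        return sorted(result)


index = TinyIndex()
index.add(1, "Lucene is a Java-based search library")
index.add(2, "Solr wraps the Lucene Java API")
index.add(3, "Hadoop stores large volumes of data")

print(index.search("lucene java"))  # [1, 2]
print(index.search("hadoop"))       # [3]
```

A real search library adds much more on top of this idea, such as text analysis, relevance scoring and on-disk index storage.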
7.6.2 Features of Apache Solr
Solr is a tool that wraps Lucene’s Java API. Therefore, using Solr, all the features of
Lucene can be leveraged.
Some of the most prominent features of Solr are listed below.
Full Text Search: Solr provides all the capabilities needed for a full text search, such as
tokens, phrases, spell check, wildcard and auto-complete.
Flexible and Extensible: By extending the Java classes and configuring accordingly, the
components of Solr can be easily customized.
Highly Scalable: When Solr is used with Hadoop, its capacity can be scaled by adding replicas.
RESTful APIs for Web Services: To communicate with Solr, it is not mandatory to have Java
programming skills. Instead, we can use RESTful services to communicate with it. We submit
documents to Solr in formats such as XML, JSON and CSV, and get results in the same
formats.
Enterprise Ready: Depending on the needs of the organization, Solr can be deployed in any
kind of system (big or small), such as standalone, distributed, cloud, etc.
NoSQL Database: Solr can also be used as a big data scale NoSQL database in which search
tasks are distributed across a cluster.
Admin Interface: Solr provides an easy-to-use, feature-rich user interface through which we
can perform common tasks, such as managing logs and adding, deleting, updating
and searching documents.
Text-centric and Sorted by Relevance: Solr is mostly used to search text documents, and the
results are delivered in order of their relevance to the user’s query.
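The RESTful APIs feature listed above can be sketched in Python without any Solr-specific client. The example below only builds the requests and does not send them. The host and port (a local Solr instance), the core name "articles", and the field names "id" and "title" are assumptions made for illustration; the /select and /update handlers and the q, rows and wt parameters are standard Solr ones.

```python
import json
from urllib.parse import urlencode

# Assumed values for illustration only: a local Solr instance and a core named "articles".
SOLR_BASE = "http://localhost:8983/solr"
CORE = "articles"


def build_select_url(query, rows=10, fmt="json"):
    """Build the URL for a query against Solr's /select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": fmt})
    return f"{SOLR_BASE}/{CORE}/select?{params}"


def build_update_request(docs):
    """Return (url, headers, body) for posting JSON documents to the /update handler."""
    url = f"{SOLR_BASE}/{CORE}/update?commit=true"
    headers = {"Content-Type": "application/json"}
    body = json.dumps(docs).encode("utf-8")
    return url, headers, body


select_url = build_select_url("title:lucene", rows=5)
update_url, update_headers, update_body = build_update_request(
    [{"id": "1", "title": "Lucene in Search Applications"}]
)
print(select_url)
print(update_url)
```

The resulting URL and body could then be sent with any HTTP client, from any programming language, which is why Java skills are not required on the client side.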
Unlike Lucene, Solr does not require the programmer to have Java programming skills;
knowledge of working with XML is sufficient. It provides a ready-to-deploy service for
building a search box featuring autocomplete, which Lucene does not provide. Using Solr,
we can scale, distribute, and manage indexes for large-scale (Big Data)
applications.
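To make the autocomplete idea concrete, here is a toy prefix suggester in Python. It is not how Solr implements its suggestions; the function name suggest and the sample terms are hypothetical. It simply looks up a typed prefix in a sorted list of indexed terms.

```python
import bisect


def suggest(sorted_terms, prefix, limit=5):
    """Return up to `limit` terms from a sorted list that start with `prefix`."""
    prefix = prefix.lower()
    # Binary search for the first term that is >= the prefix
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
        if len(matches) == limit:
            break
    return matches


terms = sorted(["hadoop", "solr", "search", "scale", "lucene", "schema"])
print(suggest(terms, "s"))   # ['scale', 'schema', 'search', 'solr']
print(suggest(terms, "lu"))  # ['lucene']
```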