Contents

About the Author

About the Technical Reviewer

Acknowledgments

Introduction

Chapter 1: Apache Solr: An Introduction

Overview

Inside Solr

What Makes Apache Solr So Popular

Major Building Blocks

History

What’s New in Solr 5.x

Beyond Search

Solr vs. Other Options

Relational Databases

Elasticsearch

Related Technologies

Summary

Resources

Chapter 2: Solr Setup and Administration

Stand-Alone Server

Prerequisites

Download

Terminology

General Terminology

SolrCloud Terminology

Important Configuration Files

Directory Structure

Solr Installation

Solr Home

Hands-On Exercise

Start Solr

Create a Core

Index Some Data

Search for Results

Solr Script

Starting Solr

Using Solr Help

Stopping Solr

Restarting Solr

Determining Solr Status

Configuring Solr Start

Admin Web Interface

Core Management

Config Sets

Create Configset

Create Core

Core Status

Unload Core

Delete Core

Core Rename

Core Swap

Core Split

Index Backup

Index Restore

Instance Management

Setting Solr Home

Memory Management

Log Management

Common Exceptions

OutOfMemoryError—Java Heap Space

OutOfMemoryError—PermGen Space

TooManyOpenFiles

UnSupportedClassVersionException

Summary

Chapter 3: Information Retrieval

Introduction to Information Retrieval

Search Engines

Data and Its Categorization

Structured

Unstructured

Semistructured

Content Extraction

Text Processing

Cleansing and Normalization

Enrichment

Metadata Generation

Inverted Index

Retrieval Models

Boolean Model

Vector Space Model

Probabilistic Model

Language Model

Information Retrieval Process

Plan

Execute

Evaluate

Summary

Chapter 4: Schema Design and Text Analysis

Schema Design

Documents

schema.xml File

Fields

fieldType

copyField

Define the Unique Key

Dynamic Fields

defaultSearchField

solrQueryParser

Similarity

Text Analysis

Tokens

Terms

Analyzers

Analysis Phases

Analysis Tools

Analyzer Components

Common Text Analysis Techniques

Going Schemaless

What Makes Solr Schemaless

Configuration

Limitations

REST API for Managing Schema

Configuration

REST Endpoints

Other Managed Resources

solrconfig.xml File

Frequently Asked Questions

How do I handle the exception indicating that the _version_ field must exist in the schema?

Why is my schema change not reflected in Solr?

I have created a core in Solr 5.0, but schema.xml is missing. Where can I find it?

Summary

Chapter 5: Indexing Data

Indexing Tools

Post Script

SimplePostTool

curl

SolrJ Java Library

Other Libraries

Indexing Process

UpdateRequestHandler

UpdateRequestProcessorChain

UpdateRequestProcessor vs. Analyzer/Tokenizer

Indexing Operations

XML Documents

JSON Documents

CSV Documents

Index Rich Documents

DataImportHandler

Document Preprocessing

Language Detection

Generate Unique ID

Deduplication

Document Expiration

Indexing Performance

Custom Components

Custom UpdateRequestProcessor

Frequently Occurring Problems

Copying Multiple Fields to a Single-Valued Field

Summary

Chapter 6: Searching Data

Search Basics

Prerequisites

Solr Search Process

SearchHandler

SearchComponent

QueryParser

QueryResponseWriter

Solr Query

Default Query

Phrase Query

Proximity Query

Fuzzy Query

Wildcard Query

Range Query

Function Query

Filter Query

Query Boosting

Global Query Parameters

Query Parsers

Standard Query Parser

DisMax Query Parser

eDisMax Query Parser

JSON Request API

Customizing Solr

Custom SearchComponent

Sample Component

Frequently Asked Questions

I have used KeywordTokenizerFactory in the fieldType definition, so why is my query string getting tokenized on whitespace?

How can I find all the documents that contain no value?

How can I apply a negative boost on terms?

What are the special characters in a query string? How should they be handled?

Summary

Chapter 7: Searching Data: Part 2

Local Parameters

Syntax

Example

Result Grouping

Prerequisites

Request Parameters

Example

Statistics

Request Parameters

Supported Methods

LocalParams

Example

Faceting

Prerequisites

Syntax

Example

Faceting Types

Reranking Query

Request Parameters

Example

Join Query

Limitations

Example

Block Join

Prerequisites

Example

Function Query

Prerequisites

Usage

Function Categories

Example

Caution

Custom Function Query

Referencing an External File

Usage

Summary

Chapter 8: Solr Scoring

Introduction to Solr Scoring

Default Scoring

Implementation

Scoring Factors

Scoring Formula

Limitations

Explain Query

Alternative Scoring Models

BM25Similarity

DFRSimilarity

Other Similarity Measures

Per Field Similarity

Custom Similarity

Summary

Chapter 9: Additional Features

Sponsored Search

Usage

Spell-Checking

Generic Parameters

Implementations

How It Works

Usage

Autocomplete

Traditional Approach

SuggestComponent

Document Similarity

Prerequisites

Implementations

Summary

Chapter 10: Traditional Scaling and SolrCloud

Stand-Alone Mode

Sharding

Master-Slave Architecture

Master

Slave

Shards with Master-Slave

SolrCloud

Understanding the Terminology

Starting SolrCloud

Restarting a Node

Creating a Collection

Uploading to ZooKeeper

Deleting a Collection

Indexing a Document

Load Balancing

Document Routing

Working with a Transaction Log

Performing a Shard Health Check

Querying Results

Performing a Recovery

Shard Splitting

Adding a Replica

ZooKeeper

Frequently Asked Questions

Why is the size of my data/tlog directory growing drastically? How can I handle that?

Can I totally disable transaction logs? What would be the impact?

I have recently migrated from traditional architecture to SolrCloud. Is there anything that I should be careful of and not do in SolrCloud?

I am migrating to SolrCloud, but it fails to upload the configurations to ZooKeeper. What could be the reason?

Summary

Chapter 11: Semantic Search

Limitations of Keyword Systems

Semantic Search

Tools

OpenNLP

Apache UIMA

Apache Stanbol

Techniques Applied

Part-of-Speech Tagging

Solr Plug-in for POS Tagging

Named-Entity Extraction

Using Rules and Regex

Using a Dictionary or Gazetteer

Using a Trained Model

Semantic Enrichment

Synonym Expansion

WordNet

Solr Plug-in for Synonym Expansion

Summary

Index
