Apache Solr for Indexing Data
by Anshul Johri, Sachin Handiekar
Table of Contents
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Started
Overview and installation of Solr
Installing Solr in OS X (Mac)
Running Solr
Installing Solr in Windows
Installing Solr on Linux
The Solr architecture and directory structure
Solr directory structure
Cores in Solr (Multicore Solr)
Summary
2. Understanding Analyzers, Tokenizers, and Filters
Introducing analyzers
Analysis phases
Tokenizers
Standard tokenizer
Keyword tokenizer
Lowercase tokenizer
N-gram tokenizer
Filters
Lowercase filter
Synonym filter
Porter stem filter
Running your analyzer
Summary
3. Indexing Data
Indexing data in Solr
Introducing field types
Defining fields
Defining a unique key
Copy fields and dynamic fields
Building our musicCatalogue example
Using the Solr Admin UI
Facet searching
Summary
4. Indexing Data – The Basic Technique and Using Index Handlers
Inserting data into Solr
Configuring UpdateRequestHandler
Indexing documents using XML
Adding and updating documents
Deleting a document
Indexing documents using JSON
Adding a single document
Adding multiple JSON documents
Sequential JSON update commands
Indexing updates using CSV
Summary
5. Indexing Data with the Help of Structured Datasources – Using DIH
Indexing data from MySQL
Configuring datasource
DIH commands
Indexing data using XPath
Summary
6. Indexing Data Using Apache Tika
Introducing Apache Tika
Configuring Apache Tika in Solr
Indexing PDF and Word documents
Summary
7. Apache Nutch
Introducing Apache Nutch
Installing Apache Nutch
Configuring Solr with Nutch
Summary
8. Commits, Real-Time Index Optimizations, and Atomic Updates
Understanding soft commit, optimize, and hard commit
Using atomic updates in Solr
Using RealTime Get
Summary
9. Advanced Topics – Multilanguage, Deduplication, and Others
Multilanguage indexing
Removing duplicate documents (deduplication)
Content streaming
UIMA integration with Solr
Summary
10. Distributed Indexing
Setting up SolrCloud
The collections API
Updating configuration files
Distributed indexing and searching
Summary
11. Case Study of Using Solr in E-Commerce
Creating an AutoSuggest feature
Facet navigation
Search filtering and sorting
Relevancy boosting
Summary
Index