Chapter 2, Indexing with Local PDF Files

Pop quiz

Q1

1

False

2

False

3

True

Q2

1

False: You can, for example, put administrative metadata values in a stored field so that they are saved and returned in the results, without ever needing to search on them.

2

True: This is the reason to use an indexed field.

3

False: A field must be stored to be returned in the results.

Q3

1

False: This query will simply delete every document.

2

False: The syntax is not correct.

3

True: This is the correct syntax.
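
As a rough illustration of the delete syntax these answers refer to, the sketch below builds the two XML delete messages Solr accepts on its update handler: delete by id and delete by query. The quiz options themselves are not reproduced here, so the helper names and the document id `doc-1` are assumptions made for this example only. The match-all query `*:*` is the form that removes every document.

```python
import xml.etree.ElementTree as ET


def delete_by_id(doc_id: str) -> str:
    """Build an XML message that deletes a single document by its unique key."""
    root = ET.Element("delete")
    ET.SubElement(root, "id").text = doc_id
    return ET.tostring(root, encoding="unicode")


def delete_by_query(query: str) -> str:
    """Build an XML message that deletes every document matching the query."""
    root = ET.Element("delete")
    ET.SubElement(root, "query").text = query
    return ET.tostring(root, encoding="unicode")


# "doc-1" is a hypothetical id; "*:*" matches (and so deletes) all documents.
print(delete_by_id("doc-1"))
print(delete_by_query("*:*"))
```

The messages only take effect once they are posted to the update handler and followed by a commit.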

Q4

1

False: This particular codec uses a binary format only partially, and it exposes most of the data in plain text.

2

True: Looking at the saved plain-text structure, we can recognize the internal structure of an inverted index and get an idea of how it is built.

3

True: The values are saved as plain text, so they are easy to read.

Q5

1

False: The saved files reflect the changes in the index.

2

False: What we consider "a word" can be composed of one or more tokens, depending on the chosen text analysis chain. Every token is saved as a single term.

3

True: Every term will be saved and updated with its reference.
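
The relationship between words, tokens, and terms can be sketched with a toy inverted index. This is not how Lucene stores its data; the `analyze` function is a stand-in for a real analysis chain (here it simply lowercases the text and splits on non-alphanumeric characters), and the sample documents are invented for the example.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analysis chain: lowercase, then split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map every term to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            # Each token is stored as a term, with a reference to its document.
            index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in sorted(index.items())}


docs = {1: "Wi-Fi setup", 2: "setup guide"}
index = build_inverted_index(docs)

# The single word "Wi-Fi" yields two tokens, hence two separate terms.
print(analyze("Wi-Fi"))
print(index)
```

With a different analysis chain (for instance one that keeps hyphenated words together), the same text would produce a different set of terms.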

Q6

1

True: The number of segments should vary depending on the action you perform on the index. Note that in some circumstances the number of segments will not vary at all, for example when you ask to clean an index that is already empty. Even then, if you look at the last modification time, you will see that the files have been updated.

2

False: Even when an index is cleaned, not all segment files are deleted: there will always be at least one file that represents a created index.

3

False: See the previous answers. Furthermore, the core/data folder can contain other files needed by specific components, such as a compiled dictionary for spell checking.

Q7

1

True

2

True

3

False: This is partially true, as we can use DataImportHandler and connect it to a specific handler, but we would then change the configuration of the DataImportHandler itself, not that of an update handler.

Q8

1

True

2

False: We can use the Tika configuration.

3

False: We can change the configuration, but we can also send the added metadata by appending a parameter to the URL.
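
A minimal sketch of the URL approach, assuming the answer refers to the `literal.<fieldname>` request parameters accepted by Solr's extracting request handler. The host, the core name `pdfs`, the field name `author`, and the helper `build_extract_url` are hypothetical and used only for illustration.

```python
from urllib.parse import urlencode


def build_extract_url(base_url, doc_id, extra_metadata):
    """Append literal.<field> parameters carrying added metadata to an extract URL."""
    params = {"literal.id": doc_id, "commit": "true"}
    for field, value in extra_metadata.items():
        # Each extra metadata value travels as its own literal.<field> parameter.
        params[f"literal.{field}"] = value
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and values; adapt them to your own core and schema.
url = build_extract_url(
    "http://localhost:8983/solr/pdfs/update/extract",
    "doc-1",
    {"author": "Jane Doe"},
)
print(url)
```

The PDF file itself would still be sent as the body of the request; only the additional metadata is carried in the query string.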
