Analyzing the entities that we want to index

In this section, we will start analyzing our data and define the basic fields of a logical entity, which we will use when writing the documents to be indexed.

The downloaded files are RDF/XML (an XML serialization of RDF descriptions), but for our purpose we don't need to think too much about RDF itself. If you open them with a text editor, you will find that they contain the same kinds of metadata for every resource, so you can easily find the corresponding DBpedia page. As seen before, most of this metadata follows best practices and standard vocabularies, such as Dublin Core, which are designed for sharing the representation of resources and can be indexed almost directly. Starting from that, we can decide how to describe our paintings for our searches, and then choose the basic elements we need to construct our basic example core.
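To get a feel for how little RDF knowledge is needed, here is a minimal Python sketch that reads one resource description and pulls out a few values. The sample XML is hand-written for illustration and is not copied from the downloaded files; the choice of `rdfs:label` and `dcterms:subject` as properties, and the function name `describe_resources`, are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the sample below.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dcterms": "http://purl.org/dc/terms/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical, hand-written sample; real files contain many more properties.
SAMPLE_RDF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Mona_Lisa">
    <rdfs:label xml:lang="en">Mona Lisa</rdfs:label>
    <dcterms:subject rdf:resource="http://dbpedia.org/resource/Category:Leonardo_da_Vinci"/>
  </rdf:Description>
</rdf:RDF>
"""


def describe_resources(rdf_xml):
    """Return one plain dict per rdf:Description element."""
    root = ET.fromstring(rdf_xml.strip())
    resources = []
    for desc in root.findall("rdf:Description", NS):
        resources.append({
            # The rdf:about attribute identifies the resource.
            "uri": desc.get(RDF_ABOUT),
            # First label found, or None if the resource has no label.
            "label": desc.findtext("rdfs:label", default=None, namespaces=NS),
            # Every Dublin Core subject the resource points to.
            "subjects": [s.get(RDF_RESOURCE)
                         for s in desc.findall("dcterms:subject", NS)],
        })
    return resources


if __name__ == "__main__":
    for resource in describe_resources(SAMPLE_RDF):
        print(resource)
```

Treating the file as ordinary XML like this is enough for a first look at the data; a dedicated RDF library would be a better fit if you need to follow links between resources.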

You can look at the sketch schema shown in the following diagram. It's simple to start with a painting entity, which you can think of as a box containing some fields:

[Figure: Analyzing the entities that we want to index]

The elements cited are inspired by some of the usual metadata we intuitively expect and are able to find in most cases in the downloaded files.

I strongly suggest that you sketch a schema like the one in the previous image when you are about to start writing your own configuration. This makes things clearer than starting to code directly, and it also helps us talk with each other, sharing and understanding an emerging design.

From this collection of ideas for important elements, we can start isolating some essential fields (let's look at things directly from the Solr perspective), and once the new Solr core is running for the first time, we can add new specific fields and configurations.

Analyzing the first entity – Painting

To represent our Painting entity, define a simple Solr document with the following fields:

Field            Example
uri              http://dbpedia.org/page/Mona_lisa
title            Mona Lisa
artist           Leonardo Da Vinci
museum           Louvre
city             Paris
year             ~1500
wikipedia_link   http://en.wikipedia.org/wiki/Mona_Lisa

We have adopted only a few fields, although there could be many more; in this particular case, we have selected those that seem the easiest and most recognizable for us to explore.
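As a rough illustration, the Painting document described above can be written as a plain Python dictionary and serialized to JSON. This is only a sketch: the helper name `to_solr_update_payload` is hypothetical, the year is kept as a string because the example value is approximate, and it assumes a core whose schema already defines these seven fields.

```python
import json

# The Painting entity from this section, one key per field.
painting = {
    "uri": "http://dbpedia.org/page/Mona_lisa",
    "title": "Mona Lisa",
    "artist": "Leonardo Da Vinci",
    "museum": "Louvre",
    "city": "Paris",
    "year": "~1500",  # kept as text: the example value is approximate
    "wikipedia_link": "http://en.wikipedia.org/wiki/Mona_Lisa",
}


def to_solr_update_payload(docs):
    """Serialize a list of documents as a JSON array.

    Hypothetical helper: a JSON array of documents is one of the shapes
    that Solr's JSON update handler accepts.
    """
    return json.dumps(docs, indent=2)


if __name__ == "__main__":
    print(to_solr_update_payload([painting]))
```

Writing the document out this way before touching any configuration makes it easy to check that every field in the sketch schema has a sensible example value.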
