Chapter 7. Information Architecture: Organizing Your Wiki

The objective of information architects is to make information easy to find. They do this in two ways. The first way is by organizing a site, usually in a hierarchical fashion, and using that organizational structure to create a system of navigation that enables users to drill down into the content by following links. The second way is through search engines. Wikis are organized differently than other websites, and in this chapter you will learn how to organize your content on MediaWiki so that users can quickly and easily find the information they are looking for.

How Users Find Information

Typical Wikipedia users find the page they are looking for by searching for it by title. In effect, users are guessing the name of the article, because the default search is a title search that, when it finds a match, takes the user directly to the page. Wiki pages can also be grouped into categories, which enables users to browse the site in order to find the content they are looking for.

Site Navigation

The default MonoBook skin provides a navigation box in the left column of the wiki. The links to the community portal, current events, help, and donations all point to pages that do not exist when the wiki is first set up. The other links, recent changes and random page, are links to special pages. You can either create the pages that are being linked to or remove the links from the list. For now, it is worthwhile to take a look at the site navigation links themselves and see what they do (see Figure 7-1).

Figure 7.1. Site navigation links

The term "navigation" can be somewhat misleading. When you customize your wiki, it is recommended that you change the title of this box (the next chapter will show you how) to something like "Related Information" or "About This Wiki," or something more descriptive, because you do not actually use the navigation box to navigate the site. Most true navigation takes place in the search box, which you will learn about next.

Search

True user navigation—the way users actually find the information they are looking for and navigate to it—is handled by the Search box. There are two button options in the search box that provide two different kinds of results. The Go button searches for a page that matches (i.e., has the same title as) the text entered into the Search box. The Search button searches page titles and page body text for any occurrence of the search term(s).

  • Go: When you enter a search term and select Go, MediaWiki checks to see whether a page by that title exists. If the page does exist, the user is redirected to that page directly. The page title has to be an exact match. If a page with that page title does not exist, the user is taken to a Search Results page (see Figure 7-2), as if they had pressed the Search button instead.

  • Search: When a user clicks on the Search button, the search results are displayed on a page that is divided into two sections: Article title matches and Page text matches. If the terms being searched for can be found in the title but aren't an exact match for the title, then those pages are listed under Article title matches. If the terms are in the body text of the article or page, then those pages are listed under Page text matches.

Figure 7.2. The Search Results page
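The difference between the two buttons can be sketched in a few lines of Python. This is only an illustrative model, not MediaWiki's actual code: the `pages` dictionary and the function names `go` and `search` are hypothetical, and matching is simplified to a case-insensitive substring test.

```python
# Illustrative model only; MediaWiki's real search works differently.
# Hypothetical wiki content: page title -> body text.
pages = {
    "College Basketball": "Teams, conferences, and tournaments.",
    "Pro Basketball": "Professional leagues, including college draft picks.",
    "Sports": "An overview of college and pro sports.",
}

def search(term):
    """Return (article title matches, page text matches) for a term."""
    needle = term.lower()
    title_matches = [t for t in pages if needle in t.lower()]
    text_matches = [t for t, body in pages.items() if needle in body.lower()]
    return title_matches, text_matches

def go(term):
    """Jump straight to an exactly titled page, else fall back to search()."""
    if term in pages:
        return ("page", term)
    return ("results", search(term))
```

Calling `go("Sports")` lands on the page itself, whereas `go("college")` has no exact title match and therefore returns the same two lists that `search("college")` would.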

Search Preferences

Users can set certain search preferences that affect how search results are returned. The options are listed under the Search tab in the User Preferences page. The following items can be set:

  • Hits per page: If the user performs a search that returns a large number of results, MediaWiki will page through the results, rather than display them all at once. This setting determines how many results are displayed per page. The default value is 20.

  • Lines per hit: When the search results are returned, each line on which the search term is found is displayed. This setting limits the number of lines that are displayed for each item returned in the search results. The default value is 5.

  • Context per line: When a line is displayed with a search term, MediaWiki displays some of the text around the search term in order to provide context for how the term is being used on the page. You can add to or subtract from the amount of surrounding text that is displayed by setting this value. The default value is 50 characters.

  • Default namespaces searched: On the preferences page, the complete list of namespaces is shown. By default, only the Main namespace is checked, which means that the search only applies to articles contained in the Main namespace. Users can change their preferences so that other namespaces are included as well.
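To see how the first three settings interact, consider the following Python sketch. It is a simplified model rather than MediaWiki's implementation; the function names `paginate` and `snippets` are hypothetical, and I am assuming that the context value applies to each side of the search term.

```python
def paginate(results, page, hits_per_page=20):
    """Return the slice of results shown on a 1-based page number."""
    start = (page - 1) * hits_per_page
    return results[start:start + hits_per_page]

def snippets(text, term, lines_per_hit=5, context=50):
    """Return up to lines_per_hit excerpts from lines containing the term.

    Each excerpt keeps `context` characters on either side of the first
    occurrence of the term in that line (an assumption of this sketch).
    """
    found = []
    needle = term.lower()
    for line in text.splitlines():
        pos = line.lower().find(needle)
        if pos == -1:
            continue  # the term does not appear on this line
        start = max(0, pos - context)
        end = pos + len(term) + context
        found.append(line[start:end])
        if len(found) == lines_per_hit:
            break
    return found
```

With the defaults shown, a search that returns 45 hits is spread over three pages, and each hit shows at most five excerpts.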

Search Options

MediaWiki implements the site's search feature by default, using the database that was selected during installation, either MySQL or PostgreSQL. While these options work well for many sites, they do present some limitations, especially for large sites.

For example, the MySQL full-text search feature does not scale particularly well because the full-text search indexes are stored in memory (which makes the search very fast when the indexes are not too large). MySQL also has limited features in terms of how the search results are returned. The ranking of search results is determined only by how often a word appears in a document, and it does not calculate things such as word distance, which reflects how far apart search terms appear in a given document.
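The distinction between ranking by frequency and ranking by word distance is easier to see with a small example. The following Python sketch is not how MySQL or any particular search engine computes its scores; the function names are hypothetical and the tokenization is deliberately naive (words are split on whitespace and punctuation is not stripped).

```python
def term_frequency(text, term):
    """Count how often a term appears in a document."""
    return text.lower().split().count(term.lower())

def word_distance(text, first, second):
    """Smallest gap, in word positions, between two terms (None if absent)."""
    words = text.lower().split()
    a = [i for i, w in enumerate(words) if w == first.lower()]
    b = [i for i, w in enumerate(words) if w == second.lower()]
    if not a or not b:
        return None
    return min(abs(i - j) for i in a for j in b)
```

A document that mentions "college" and "basketball" side by side has a word distance of 1 for that pair, while a document in which the two words are far apart scores the same on frequency but is probably less relevant to a search for "college basketball."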

Using Google as Your Search Engine

Because of these limitations, you can decide not to use the default search engines in your wiki. The simplest change to make is to disable MediaWiki's text search by inserting the following text into the LocalSettings.php page:

$wgDisableTextSearch = true;

With this value set, MediaWiki defaults to using Google as the site search, as shown in Figure 7-3. This works well as long as Google has indexed your site. The downside is that you cannot control if, or how often, Google will index your site.

Figure 7.3. Default Google search when text search is disabled

Using an External Search Engine

You also have the option to use any arbitrary external search engine. To do so, in addition to disabling the text search, you also have to tell MediaWiki the URL of the external search engine, as shown here (in LocalSettings.php):

$wgForwardSearchUrl = 'http://www.google.com/search?q=$1&domains=http://choate.info&sitesearch=http://choate.info&ie=utf-8&oe=utf-8';

Note that some examples on MediaWiki.org mistakenly refer to $wgSearchForwardUrl—don't let that confuse you. It should be $wgForwardSearchUrl.

When you forward the search URL, you are sending the search to an entirely different server. In the example above, I'm using a search URL for Google, telling it to search my domain choate.info. The $1 value in the query string of the URL will be replaced with the search terms the user entered into the search form. Then the user will be taken directly to the Google site for the search results. If you search for the term "wiki," then the following URL is used:

http://www.google.com/search?q=wiki&domains=http://choate.info&sitesearch=http://choate.info&ie=utf-8&oe=utf-8

Note how $1 has been replaced by the word wiki.
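The substitution itself is simple enough to sketch in Python. The function name `forward_url` is hypothetical, and URL-encoding the search terms (so that a space becomes a plus sign, for example) is my assumption about what a sensible implementation does, not a statement about MediaWiki's internals.

```python
from urllib.parse import quote_plus

# The URL template from the example above; "$1" marks where the terms go.
FORWARD_URL = (
    "http://www.google.com/search?q=$1&domains=http://choate.info"
    "&sitesearch=http://choate.info&ie=utf-8&oe=utf-8"
)

def forward_url(terms, template=FORWARD_URL):
    """Substitute URL-encoded search terms for the $1 placeholder."""
    return template.replace("$1", quote_plus(terms))
```

Calling `forward_url("wiki")` produces the URL shown earlier, and a multiword search such as "wiki gardening" is encoded so that it remains a single query-string value.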

Apache Lucene Search

It is also possible to use Apache Lucene as the full-text search engine. Wikipedia uses Lucene because it is optimized for full-text searching and offers the scalability to accommodate such a large site. Relative to MySQL, Lucene can handle full-text search requests much more efficiently. The implementation used by Wikipedia is Lucene.NET, a .NET port of the original Java Lucene, with code written in C#. The details of using Lucene go beyond the scope of this book, but you can find more detailed information at www.mediawiki.org/wiki/Lucene.

Category Pages

MediaWiki uses categories to organize pages and group similar pages together. Pages that are grouped into a category are listed on that category's page. Users can go to a special page, Special:Categories, in order to browse through the pages of the wiki based upon the categories to which they belong. Categories can also be arranged hierarchically, so that a more complex navigation scheme can be developed, with categories and subcategories.

Adding a Page to a Category

Similar to wikilinks, you can create category pages simply by embedding a category link into a page. When you create a category this way, the page containing the category link is automatically added to the category.

[[Category: My Category]]

Regardless of where you enter the category tag on the page, it is displayed at the bottom of the page so that users can see to which categories a page refers, as shown in Figure 7-4, and follow a link to the category page itself to see other pages in the same category, as shown in Figure 7-5.

Figure 7.4. Category links appear at the bottom of the page

Figure 7.5. Category pages display an alphabetized list of pages for that category

Creating Categories

You can also create a category page directly without automatically adding a page to the category by using the following syntax:

[[:Category: My Category]]

When you create a category this way, a link to the category appears on the page where you entered the text, but the page with the link isn't added to the category. The category page is created, so you can follow that link to the category page, but you will see that there are no pages in the category.

Linking to Category Pages Using Alternate Text

You can also use this syntax to create a link to a category page that displays alternate text. The following tag links to a category page called Help:Basketball, but only displays the word "Basketball" in the link.

[[:Category:Help:Basketball|Basketball]]

Sorting Categories

You can control how pages are sorted on category pages. This is useful when you are adding a page that is not in the default namespace to a category. For example, if you have a page in the Help namespace called Help:Basketball and you want to place that page in the Sports category, it normally would be listed under H for Help:Basketball. If, however, you want it listed in the B section, then you could create the category link like this:

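[[Category:Sports|Basketball]]

In a category link, the text after the pipe is a sort key, not link text. The category link at the bottom of the page will still say Sports, and the page is still listed on the category page as Help:Basketball. The sort key only affects where the item is listed alphabetically on the category page.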
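If it helps to think of this in terms of code, the following Python sketch models how a sort key decides which letter heading a page appears under. It is an illustration only; the function name `category_listing` and the data layout are hypothetical, not part of MediaWiki.

```python
from collections import defaultdict

def category_listing(members):
    """Group page titles under the first letter of their sort key.

    `members` is a list of (title, sort_key) pairs. A sort_key of None
    means the full page title is used as the key, which is the default.
    """
    sections = defaultdict(list)
    for title, sort_key in members:
        key = sort_key if sort_key is not None else title
        sections[key[0].upper()].append((key, title))
    # Order headings alphabetically, and pages by sort key within a heading.
    return {
        letter: [title for _, title in sorted(entries)]
        for letter, entries in sorted(sections.items())
    }
```

Without a sort key, Help:Basketball is filed under H. With the sort key "Basketball," the same title moves to the B section, but it is still displayed as Help:Basketball.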

Editing Category Pages

Category pages can be edited like any other page. Any text that you enter will appear above the list of links to pages in that category.

Subcategories

It is also possible to use categories to create a hierarchy of categories and subcategories. There's no such thing as a subcategory per se in MediaWiki, just categories, but you can organize your pages in a hierarchy by virtue of the fact that category pages themselves can be categorized. A category page that is part of another category is a subcategory.

Suppose you're creating a site about sports and you want to include articles about the following topics:

  • Sports

  • College Sports

  • Pro Sports

  • College Basketball

  • Pro Basketball

  • College Football

  • Pro Football

MediaWiki employs a flat hierarchy, and each of these pages will be in the default namespace. One way a user can find these pages is to type the phrase "College Basketball" in the Search field and click the Go button. If the user were looking for information about a particular college basketball player, however, she would need to enter the player's name in the Search field and click the Search button to find any page with the player's name in it.

If the player has a common name, the results might include a list of pages that contain information about other people who share the same name. One way to make the search more efficient would be to group pages about a similar subject together. For instance, it might be a good idea to group all articles about college sports together, all pages about pro sports together, and so on. You might also want to group all articles about college basketball together, and so on. If you were really ambitious, you also might divide college basketball into men's college basketball and women's college basketball, and then separate those into divisions, and then individual teams. All of that is possible with categories.

Whether it is advisable to go into that much detail is another question. There is a delicate balancing act one must perform when categorizing pages. Too much categorization and the site becomes confusing and difficult to maintain. Too little categorization and users might find it hard to find the information they are looking for. There is no hard-and-fast rule, but it is possible to borrow some rules from more traditional content management systems and apply them here.

The hierarchy should be no more than three levels deep, and each category should have only five to seven subcategories. Of course, it's not always possible to fit within these parameters, but they are useful rules of thumb that you can use to help gauge the complexity of your site and spot potential usability issues.
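These two rules of thumb are easy to check mechanically if you keep a simple outline of your hierarchy. The following Python sketch uses a hypothetical dictionary, `subcategories`, based on the sports example; it is not generated by MediaWiki, and it assumes that no category is its own ancestor.

```python
# Hypothetical outline of the sports hierarchy: each entry maps a
# category to the items grouped directly beneath it.
subcategories = {
    "Sports": ["College Sports", "Pro Sports"],
    "College Sports": ["College Basketball", "College Football"],
    "Pro Sports": ["Pro Basketball", "Pro Football"],
}

def depth(category, tree):
    """Number of levels from this category down to its deepest descendant."""
    children = tree.get(category, [])
    if not children:
        return 1
    return 1 + max(depth(child, tree) for child in children)

def too_broad(tree, limit=7):
    """Categories with more direct subcategories than the limit allows."""
    return [name for name, children in tree.items() if len(children) > limit]
```

For the outline shown, `depth("Sports", subcategories)` is 3 and `too_broad(subcategories)` is empty, so the example site stays within both guidelines.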

Conceptually, you can group the pages of the example site as shown in Figure 7-6.

Figure 7.6. The sports wiki hierarchy

Figure 7-6 simply shows how I chose to categorize these pages conceptually. It's easy to understand that Sports is the root node of the tree, that College Sports and Pro Sports are subcategories, and so on. If you want to implement this hierarchy in MediaWiki using categories, you first need to determine the nature of the Sports page, and the College Sports and Pro Sports pages.

Category pages are in a different namespace than article pages, which means that a category page can share the same title as an article page. The question you need to answer is whether these pages are category pages, or whether you will have both category pages and article pages that share the same title. In other words, should you have an article page and a category page called Sports, or just a category page called Sports?

When you are developing a traditional website using traditional content management software, you usually develop the taxonomy first, and then the content is developed to go into the pre-defined categories. Wikis, conversely, often start with the articles first, and only later are the articles added to categories. This is an important difference between a taxonomy and a folksonomy: Folksonomies are created by users after the content has been created. You may find that you already have article pages called Sports, College Sports, Pro Sports, and so on. In that case, you have the option of creating category pages with the same names.

In our example, the College Sports article is in the College Sports category, and the College Sports category is a subcategory of Sports. If, however, you simply have the articles College Basketball, College Football, Pro Basketball, and Pro Football, then you can choose to use category pages exclusively for College Sports and Pro Sports, and so on.

Multi-Faceted Categories

The sports example reveals one of the common problems that arise when trying to categorize information: There is often more than one sensible way to do it. The previous example used college sports and pro sports as subcategories of sports, but you could just as easily decide to put basketball and football under sports. While people could argue the point both ways, very often the ultimate decision about how to categorize things falls to the whim or personal preference of the categorizer.

A user of your wiki may have a different preference, or may conceptualize the topic differently than you do. Wouldn't it be nice to be able to organize your content into multi-faceted hierarchies, letting pages exist in different parts of your taxonomy? With MediaWiki, you can. Because any page can be in multiple categories, it's possible to create rather complex, multi-faceted hierarchies.

In order to address the sports problem, all you need to do is add a new category called Category: Basketball to the college basketball and pro basketball pages, and one called Category: Football to the college and pro football pages. Figure 7-7 shows how the hierarchy looks now (the football pages were omitted for clarity). As you can see, a user can now navigate to basketball-related pages in two different ways: through either the college sports or the pro sports categories, or through the basketball category.

Figure 7.7. Multi-faceted categories
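One way to picture a multi-faceted hierarchy is as a set of alternative routes to the same page. The following Python sketch lists those routes. The `parents` dictionary is a hypothetical model of the basketball pages, and placing the Basketball category directly under Sports is my assumption for the sake of the example.

```python
# Hypothetical model: every page or category maps to the categories it is in.
parents = {
    "College Basketball": ["College Sports", "Basketball"],
    "Pro Basketball": ["Pro Sports", "Basketball"],
    "College Sports": ["Sports"],
    "Pro Sports": ["Sports"],
    "Basketball": ["Sports"],
}

def paths_to(page, root="Sports"):
    """List every chain of categories leading from root down to page."""
    if page == root:
        return [[root]]
    found = []
    for parent in parents.get(page, []):
        # Extend each route to the parent with the page itself.
        for path in paths_to(parent, root):
            found.append(path + [page])
    return found
```

Here `paths_to("College Basketball")` returns two routes, one through College Sports and one through Basketball, which is exactly what a user browsing the categories would experience.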

Categories as Folksonomies

In the previous sections, you learned how to organize your content into meaningful hierarchies. There's one important detail (or caveat) you should be aware of: Anybody can add a page to any hierarchy. In other words, one user may categorize pages in one way, while another chooses an entirely different approach. This decentralized categorization is often referred to as a folksonomy, in contrast to a taxonomy, which is a hierarchy of relationships developed centrally, usually by specialists.

You may be familiar with sites like Flickr that allow users to add tags to pages. Conceptually speaking, a tag and a category are very similar—they are both keywords that are used to describe or group a page into some conceptual category or topic. The advantage to this approach is that with many people adding pages to different categories, multiple points of view are represented by the links. What makes sense to one person may not have occurred to another person, and this open-ended approach makes it possible to discover unexpected connections between pages.

Improving Findability

I've already mentioned the importance of using simple, clear page titles to help users find the information they are looking for. Other useful tactics can be employed as well to improve the user experience. Wikipedia has a Manual of Style that establishes consistent ways of formatting pages and other conventions that improve the search experience (see http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style).

Redirects and Synonyms

Clearly, there are often many words or phrases that describe the same thing. The best way to address this with your wiki is with the use of redirects. You should select one name per topic, and all variants of that name or phrase should be redirect pages that point to the canonical version. Redirects are discussed in Chapter 6.

Disambiguation Pages

You might run into another problem as well, which results from the fact that one word or phrase may refer to different things. In this case, you can use what is called a disambiguation page. A disambiguation page isn't a formal page type in MediaWiki. Rather, it's a standard practice that is used to address this problem. If you have a term that applies to multiple topics, you can create a page with that term and then link from there to the other terms. You can use a template to ensure standard formatting of the disambiguation page.

Note

You can read about templates in Chapter 8.

You can also list the disambiguation page on the MediaWiki:Disambiguationspage page, and this will ensure that the page is listed on the Special:Disambiguations page. You can view Wikipedia's version of this page at http://en.wikipedia.org/wiki/MediaWiki:Disambiguationspage, where you will see a list of templates, which are discussed in the next chapter.

Wiki Gardening

While it is comforting to think of a wiki as an organic process, with order arising out of chaos naturally, without human intervention or irritating authoritarianism on the part of some consulting taxonomist, it is also naive. Don't get me wrong: Letting the organizational structure of your wiki develop naturally is a good thing, but what emerges needs to be tended in order for it to thrive. Every wiki needs a gardener—someone to pull the weeds, water the plants, and occasionally move a plant from one bed to another.

Several special pages help with the wiki gardening task:

  • Uncategorized pages (Special:Uncategorizedpages): These are pages that do not have any categories assigned. Use this page to ensure that all pages are categorized.

  • Uncategorized images (Special:Uncategorizedimages): These are image (or file) pages that have not been categorized.

  • Uncategorized categories (Special:Uncategorizedcategories): Category pages themselves can be categorized. This is useful when building a relatively deep hierarchical structure for your wiki. This special page lists all the category pages that have not been assigned to a category.

  • Unused categories (Special:Unusedcategories): These are category pages with no pages in them.

  • Unused files (Special:Unusedfiles): These are files (usually images) that have been uploaded but that are not being linked to.

  • Wanted categories (Special:Wantedcategories): This page returns a list of categories that are in use but whose category pages do not have any content in them. Having no content does not mean that there are no pages in the category; it means that the category page has not been edited and no explanatory content has been added.

  • Wanted pages (Special:Wantedpages): These are pages for which a wiki link exists but that have not been edited, so they have no content.

  • Dead-end pages (Special:Deadendpages): These pages contain no links to other pages in the wiki.

  • Long pages and short pages (Special:Longpages, Special:Shortpages): Both of these pages function the same way, returning a list of pages ordered either from the smallest to the largest (for short pages) or from the largest to the smallest (for long pages), as measured in bytes. When a page is getting too long, it is often a good idea to break it up into two or more smaller pages. Short pages are often candidates for expansion.
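Several of these reports boil down to simple set comparisons. The following Python sketch shows the idea using a hypothetical snapshot of a tiny wiki. The data structures and function names are invented for illustration, and MediaWiki itself produces these lists from its database rather than in this way.

```python
# Hypothetical snapshot: each page lists its outgoing wikilinks and the
# categories it has been added to.
pages = {
    "Sports": {"links": ["College Basketball", "Pro Hockey"], "categories": ["Sports"]},
    "College Basketball": {"links": [], "categories": ["College Sports"]},
    "Sandbox": {"links": ["Sports"], "categories": []},
}
# Category pages that someone has actually created and edited.
category_pages = {"Sports", "Olympics"}

def uncategorized(pages):
    """Pages that have not been added to any category."""
    return sorted(t for t, p in pages.items() if not p["categories"])

def dead_ends(pages):
    """Pages that contain no links to other pages."""
    return sorted(t for t, p in pages.items() if not p["links"])

def wanted_pages(pages):
    """Link targets that do not exist as pages yet."""
    linked = {link for p in pages.values() for link in p["links"]}
    return sorted(linked - set(pages))

def used_categories(pages):
    """Every category that at least one page has been added to."""
    return {c for p in pages.values() for c in p["categories"]}

def unused_categories(pages, category_pages):
    """Category pages that exist but have no members."""
    return sorted(category_pages - used_categories(pages))

def wanted_categories(pages, category_pages):
    """Categories in use whose category page has not been created."""
    return sorted(used_categories(pages) - category_pages)
```

In this snapshot, Sandbox is uncategorized, College Basketball is a dead end, Pro Hockey is a wanted page, Olympics is an unused category, and College Sports is a wanted category.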

Summary

In this chapter, you learned how to organize content on your wiki using categories. You also learned how to customize the search engine used by MediaWiki. In the next chapter, you will learn about magic words, templates, and skins. This will enable you to customize the look and feel of your wiki, and add more complex content to your pages.
