As with any application, when we start developing our wiki application we must first take a few steps to create a data model that can act as a starting point for the development:
The image shows the different entities and relations we can identify in our wiki application. Note that like in the books application, a User is a separate entity that is distinct from any user in, for example, a password database.
In the diagram, we have illustrated the fact that a Topic
may have more than one Page
while a Page
refers to a single User in a rather informal way by representing Page as a stack of rectangles and User
as a single rectangle. In this manner, we can grasp the most relevant aspects of the relations at a glance. When we want to show more relations or relations with different characteristics, it might be a good idea to use more formal methods and tools. A good starting point is the Wikipedia entry on UML: http://en.wikipedia.org/wiki/Unified_Modelling_Language.
With the entities and relations in our data model identified, we can have a look at their specific qualities.
The basic entity in a wiki is a Topic
. A topic, in this context, is basically a title that describes what this topic is about. A topic has any number of associated Pages. Each instance of a Page
represents a revision; the most recent revision is the current version of a topic. Each time a topic is edited, a new revision is stored in the database. This way, we can simply revert to an earlier version if we made a mistake or compare the contents of two revisions. To simplify identifying revisions, each revision has a modification date. We also maintain a relation between the Page
and the User
that modified that Page
.
In the wiki application that we will develop, it is also possible to associate any number of tags with a topic. A Tag
entity consists simply of a tag
attribute. The important part is the relation that exists between the Topic
entity and the Tag
entity.
Like a Tag
, a Word
entity consists of a single attribute. Again, the important bit is the relation, this time, between a Topic
and any number of Words
. We will maintain this relation to reflect the words used in the current versions (that is, the last revision of a Page)
of a Topic
. This will allow for fairly responsive full text search facilities.
The final entity we encounter is the Image
entity. We will use this to store images alongside the pages with text. We do not define any relation between topics and images. Images might be referred to in the text of the topic, but besides this textual reference, we do not maintain a formal relation. If we would like to maintain such a relation, we would be forced to scan for image references each time a new revision of a page was stored, and probably we would need to signal something if a reference attempt was made to a non-existing image. In this case, we choose to ignore this: references to images that do not exist in the database will simply show nothing:
from entity import Entity from relation import Relation class User(Entity): pass class Topic(Entity): pass class Page(Entity): pass class Tag(Entity): pass class Word(Entity): pass class Image(Entity): pass class UserPage(Relation): pass class TopicPage(Relation): pass class TopicTag(Relation): pass class ImagePage(Relation): pass class TopicWord(Relation): pass def threadinit(db): User.threadinit(db) Topic.threadinit(db) Page.threadinit(db) Tag.threadinit(db) Word.threadinit(db) Image.threadinit(db) UserPage.threadinit(db) TopicPage.threadinit(db) TopicTag.threadinit(db) ImagePage.threadinit(db) TopicWord.threadinit(db) def inittable(): User.inittable(userid="unique not null") Topic.inittable(title="unique not null") Page.inittable(content="", modified="not null default CURRENT_TIMESTAMP") Tag.inittable(tag="unique not null") Word.inittable(word="unique not null") Image.inittable(type="",data="blob",title="", modified="not null default CURRENT_TIMESTAMP", description="") UserPage.inittable(User,Page) TopicPage.inittable(Topic,Page) TopicTag.inittable(Topic,Tag) TopicWord.inittable(Topic,Word)
Because we can reuse the entity and relation modules we developed earlier, the actual implementation of the database layer is straightforward (full code is available as wikidb.py)
. After importing both modules, we first define a subclass of Entity
for each entity we identified in our data model. All these classes are used as is, so they have only a pass
statement as their body.
Likewise, we define a subclass of Relation
for each relation we need to implement in our wiki application.
All these Entity
and Relation
subclasses still need the initialization code to be called once each time the application starts and that is where the convenience function initdb()
comes in. It bundles the initialization code for each entity and relation (highlighted).
Many entities we define here are simple but a few warrant a closer inspection. The Page
entity contains a modified
column that has a non null
constraint. It also has a default: CURRENT_TIMESTAMP
(highlighted). This default is SQLite specific (other database engines will have other ways of specifying such a default) and will initialize the modified column to the current date and time if we create a new Page
record without explicitly setting a value.
The Image
entity also has a definition that is a little bit different: its data
column is explicitly defined to have a blob
affinity. This will enable us to store binary data without any problem in this table, something we need to store and retrieve the binary data contained in an image. Of course, SQLite will happily store anything we pass it in this column, but if we pass it an array of bytes (not a string that is), that array is stored as is.
18.222.114.28