Time for action examining the HTML markup

Let's have a look at how the HTML markup produced by the index() method looks:

<table class="entitylist" start="0" page="10">
	<thead>
		<tr>
				<th class="sorted-desc">id</th>
				<th class="sorted-asc">Size</th>
		</tr>
	</thead>
	<tbody>
		<tr id="86"><td>86</td><td>7702</td></tr>
		<tr id="14"><td>14</td><td>12331</td></tr>
		<tr id="72"><td>72</td><td>17013</td></tr>
		<tr id="7"><td>7</td><td>26236</td></tr>
		<tr id="12"><td>12</td><td>48481</td></tr>
		<tr id="10"><td>10</td><td>63060</td></tr>
		<tr id="15"><td>15</td><td>64824</td></tr>
		<tr id="85"><td>85</td><td>69352</td></tr>
		<tr id="8"><td>8</td><td>84442</td></tr>
		<tr id="53"><td>53</td><td>94749</td></tr>
	</tbody>
</table>
<form method="GET" action=".">
	<div class="buttonbar">
		<input name="start" type="hidden" value="0">
		<input name="sortorder" type="hidden" value="n,asc">
		<input name="sortorder" type="hidden" value="id,desc">
		<input name="cacheid" type="hidden"
				value="57ec8e0a53e34d428b67dbe0c7df6909">
		<p class="info">items 1-10/100</p>
		<button name="first" type="submit">First</button>
		<button name="previous" type="submit">Previous</button>
		<button name="next" type="submit">Next</button>
		<button name="last" type="submit">Last</button>
		<button name="clear" type="submit">Clear</button>
	</div>
</form>
<form method="GET" action="add">
	<button type="submit">Add new</button>
</form>

Apart from the actual table, we have a<form> element with quite a number of<button> and<input> elements, albeit that most have their type attribute set to hidden.

The<form> element has an action attribute "." (a single dot), which will cause all the information in the form to be submitted to the same URL that originated this form, so the data will be processed by the same index() method we are now examining. A submit is triggered when any of the<button> elements with a type attribute equal to submit is clicked, in which case, not only the<input> elements are sent, but also the name of the button that was clicked.

Note

Note that any<input> element that has to be sent to the server should have a name attribute. Omitting the name attribute will cause it to be missed out.<input> elements with type equal to hidden are sent as well if they have a name attribute. Hidden<input> elements are not displayed, but do play an important role in keeping essential information associated with a form together.

The first hidden<input> element in the form stores the start index of the items currently displayed in the table. By adding it as a hidden element, we can calculate which items to show when we take action when the Next or Previous button is clicked.

We also want to remember if and how the items are sorted. Therefore, we include a number of hidden input elements with a name attribute equal to sortorder, each having a value consisting of a column name and a sort direction separated by a comma.

When a form is submitted, input elements with the same name are added in order as arguments to the action URL and CherryPy will recognize this pattern and convert them to a list of values. In this example, the index() method of the Browse class receives this list as its sortorder argument. Any pattern values are present as hidden<input> elements as well and processed in an identical way.

The form also contains an info class<p> element, that contains information on the number of items and the items actually shown on the current page. The final part of the form is a collection of submit buttons.

What just happened?

The index() method may be called with no arguments at all or with any or all contents of the form it displays. If the client-side JavaScript code wants to call it asynchronously while preventing the browser from caching it, it may even pass an _ (underscore) argument with a random value, which will be ignored.

The rest of the arguments are relevant and checked for sanity before being acted upon.

We want the sortorder variable to contain a list of tuples, each consisting of a column name and a sort direction, but the values of the input elements are simply interpreted as strings by CherryPy, so we have to convert this list of strings to a list of tuples by splitting those strings on the comma separator. We neither check for the validity of the column names, nor for that of the sort directions because that will be done by the code doing the actual work.

The pattern variable is treated in a similar way, but because we may want to filter on values containing commas, we cannot simply use the split() method here, but have to pass it a limit of 1 to restrict its splitting to the first comma it encounters.

Next, we pass the sortorder and pattern variables to the listids() class method of the entity we stored with this Browse instance. It will return a list of IDs of instances that match the pattern criteria (or all instances if no patterns are specified) sorted in the correct order. Note that since the number of instances might be huge, we do not use the list() method here because converting all IDs to entity instances at once might render the application unresponsive. We will just convert those IDs to instances that we will actually show on the page, based on the start and page variables.

To calculate the new start index, we will have to check if we act upon one of the paging buttons (highlighted) and add or subtract a page length if we are acting on a click on the Next or Previous button. We set the start index to 0 if the First button was clicked. If the Last button was clicked, we set the start index to the number of items minus the length of the page. If any of these calculations result in a start index that is less than zero, we set it to zero.

The next step is to produce the actual output, yielding one line at a time, beginning with the<table> element. Our table consists of a head and a body, the head consisting of a single row of<th> elements, each containing either the display name of the column we are showing if it represents an attribute of the entity, or the name of the class if it represents a related entity. Any sort order associated with this column is represented in its class attribute, so we may use CSS to make this visible to the user.

To display the rows in the body of the table, we convert the relevant IDs in the selection to actual entities (highlighted) and generate<td> elements for each attribute. If the column refers to related entities, their primary attributes are displayed, each related entity encapsulated in its own<span> element. The latter will enable us to associate relevant actions with each individual item shown, for example, displaying it in full when it is clicked.

The final long list of yield statements is used to produce the form with its many hidden input elements, each recording the arguments that were passed to the index() method.

Caching

The bulk of the activity when browsing through lists in a typical application is paging forward and backward. If we would need to retrieve the full list of entities each time we forward a single page, the application might feel sluggish if the list was huge or the sorting and filtering was complicated. It might therefore be sensible to implement some sort of caching scheme.

There are a couple of things to consider though:

  • Our CherryPy applications are multithreading, so we should be aware of that, especially when storing things in a cache, as we don't want threads to trash the shared cache.
  • We have to devise some scheme to limit the number of cached items as resources are limited.
  • We must overcome the limitations of the statelessness of the HTTP protocol: each time the client issues a request. This request should contain all necessary information to determine if we have something cached for him available and of course we have to understand that each request may be served by a different thread.

These requirements can be satisfied if we change the line in the index() that retrieves the matching IDs into the following few lines:

Chapter7/browse.py

			if not (next is None and previous is None
								and first is None and last is None):
					cacheid=self.iscached(cacheid,sortorder,pattern)
			else:
					cacheid=None
			if cacheid is None:
					ids = self.entity.listids(
							pattern=pattern,sortorder=sortorder)
					cacheid = self.storeincache(ids,sortorder,pattern)
			else:
					ids = self.getfromcache(cacheid,sortorder,pattern)
					if ids == None:
							ids = self.entity.listids(
									pattern=pattern,sortorder=sortorder)
							cacheid = self.storeincache(ids,sortorder, 
pattern)

Because we will store a unique cacheid in a hidden<input> element, it will be passed as an argument when the form is submitted. We use this cachid together with the sortorder and pattern arguments to check whether a previously retrieved list of IDs is present in the cache with the iscached() method. Passing the sortorder and pattern arguments will enable the iscached() method to determine if these are changed and invalidate the cache entry.

iscached() will return the cacheid if it exists in the cache or None if it doesn't. iscached() will also return None if the cacheid does exist but the sortorder or pattern arguments were changed.

Next, we check if the cacheid is None. This may seem redundant, but if index() was called for the first time (without arguments, that is) none of the submit button arguments would be present and we wouldn't have checked the cache.

Note

This is intended: if we would, at a later point, revisit this list, we would want a fresh set of items, not some old cached ones. After all, the contents of the database might have changed.

If the cacheid is None we retrieve a fresh list of IDs and store it in the cache together with the sortorder and pattern arguments. The storeincache() method will return a freshly minted cacheid for us to store in the hidden<input> element.

If the cacheid was not None, we use the getfromcache() method to retrieve the list of IDs from the cache. We check the returned value because between our checking for the existence of the key in the cache and retrieving the associated data, the cache might have been purged, in which case, we still call the listids() method.

The implementation of the iscached(), getfromcache(), and storeincache() method takes care of all the thread safety issues:

			def chash(self,cacheid,sortorder,pattern):
					return cacheid + '-' + hex(hash(str(sortorder))) + '-' + 
hex(hash(str(pattern)))
			def iscached(self,cacheid,sortorder,pattern):
					h=self.chash(cacheid,sortorder,pattern)
					t=False
					with self.cachelock:
							t = h in self.cache
							if t :
									self.cache[h]=(time(),self.cache[h][1])
					return cacheid if t else None
			def cleancache(self):
					t={}
					with self.cachelock:
							t={v[0]:k for k,v in self.cache.items()}
					if len(t) == 0 :
							return
					limit = time()
					oldest = limit
					limit -= 3600
					key=None
					for tt,k in t.items():
							if tt<limit:
									with self.cachelock:
											del self.cache[k]
							else:
								if tt<oldest:
										oldest = tt
										key = k
					if key:
							with self.cachelock:
								del self.cache[key]
			def storeincache(self,ids,sortorder,pattern):
					cacheid=uuid().hex
					h=self.chash(cacheid,sortorder,pattern)
					if len(self.cache)>self.cachesize :
							self.cleancache()
					with self.cachelock:
							self.cache[h]=(time(),ids)
					return cacheid
			def getfromcache(self,cacheid,sortorder,pattern):
					ids=None
					h=self.chash(cacheid,sortorder,pattern)
					with self.cachelock:
							try:
									ids=self.cache[h][1]
							except KeyError:
									pass
					return ids

All methods use the chash() method to create a unique key from the cacheid and the sortorder and pattern arguments. iscached() waits until it acquires a lock to check if this unique value is present in the cache. If it is, it updates the associated value, a tuple consisting of a timestamp and a list of IDs. By updating this timestamp here, we reduce the chance that this item is purged from the cache between the check for existence and the actual retrieval.

The getfromcache() method creates a unique key with the chash() method in the same way iscached() did and waits to acquire the lock before it uses the key to retrieve the value from the cache. If this fails, a KeyError will be raised that will be caught, causing the None value to be returned as that was what the IDs variable was initialized to.

The storeincache() method first creates a new cacheid using one of the uuid() functions from Python's uuid module, essentially creating a random string of hexadecimal characters. Together with the sortorder and pattern arguments, this new cacheid is used to generate a unique key.

Before we store the list of IDs in the cache, we check whether there is any space left by comparing the number of keys in the cache to the maximum length we are prepared to accept. If there isn't any room left, we make room by calling the cleancache() method that will remove any entries that are too old. We then store the IDs together with a time stamp after acquiring a lock and return the cacheid just generated.

The final cog in our caching machinery is the cleancache() method. After requiring a lock, a reverse map is built, mapping timestamps to keys. If this map holds any items, we use it to locate any key that is older than an hour. Those are deleted after acquiring a lock.

Note

The whole business with acquiring a lock and releasing it as quick as possible instead of acquiring the lock and doing all the cache-related business in one go ensures that other threads accessing the cache do not have to wait very long, which keeps the whole application responsive.

If the age of an entry is less than an hour, we keep notes to see which of the remaining ones is the oldest to remove that one at the end. This way, we ensure that we always retire at least one entry, even if there aren't any really old ones.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.139.67.5