Let's have a look at how the HTML markup produced by the index()
method looks:
<table class="entitylist" start="0" page="10"> <thead> <tr> <th class="sorted-desc">id</th> <th class="sorted-asc">Size</th> </tr> </thead> <tbody> <tr id="86"><td>86</td><td>7702</td></tr> <tr id="14"><td>14</td><td>12331</td></tr> <tr id="72"><td>72</td><td>17013</td></tr> <tr id="7"><td>7</td><td>26236</td></tr> <tr id="12"><td>12</td><td>48481</td></tr> <tr id="10"><td>10</td><td>63060</td></tr> <tr id="15"><td>15</td><td>64824</td></tr> <tr id="85"><td>85</td><td>69352</td></tr> <tr id="8"><td>8</td><td>84442</td></tr> <tr id="53"><td>53</td><td>94749</td></tr> </tbody> </table> <form method="GET" action="."> <div class="buttonbar"> <input name="start" type="hidden" value="0"> <input name="sortorder" type="hidden" value="n,asc"> <input name="sortorder" type="hidden" value="id,desc"> <input name="cacheid" type="hidden" value="57ec8e0a53e34d428b67dbe0c7df6909"> <p class="info">items 1-10/100</p> <button name="first" type="submit">First</button> <button name="previous" type="submit">Previous</button> <button name="next" type="submit">Next</button> <button name="last" type="submit">Last</button> <button name="clear" type="submit">Clear</button> </div> </form> <form method="GET" action="add"> <button type="submit">Add new</button> </form>
Apart from the actual table, we have a<form>
element with quite a number of<button>
and<input>
elements, albeit that most have their type attribute set to hidden.
The<form>
element has an action attribute "." (a single dot), which will cause all the information in the form to be submitted to the same URL that originated this form, so the data will be processed by the same index()
method we are now examining. A submit is triggered when any of the<button>
elements with a type
attribute equal to submit
is clicked, in which case, not only the<input>
elements are sent, but also the name of the button that was clicked.
Note that any<input>
element that has to be sent to the server should have a name
attribute. Omitting the name
attribute will cause it to be missed out.<input>
elements with type
equal to hidden
are sent as well if they have a name
attribute. Hidden<input>
elements are not displayed, but do play an important role in keeping essential information associated with a form together.
The first hidden<input>
element in the form stores the start index of the items currently displayed in the table. By adding it as a hidden element, we can calculate which items to show when we take action when the Next or Previous button is clicked.
We also want to remember if and how the items are sorted. Therefore, we include a number of hidden input elements with a name
attribute equal to sortorder
, each having a value consisting of a column name and a sort direction separated by a comma.
When a form is submitted, input elements with the same name are added in order as arguments to the action
URL and CherryPy will recognize this pattern and convert them to a list of values. In this example, the index()
method of the Browse
class receives this list as its sortorder
argument. Any pattern
values are present as hidden<input>
elements as well and processed in an identical way.
The form also contains an info
class<p>
element, that contains information on the number of items and the items actually shown on the current page. The final part of the form is a collection of submit buttons.
The index()
method may be called with no arguments at all or with any or all contents of the form it displays. If the client-side JavaScript code wants to call it asynchronously while preventing the browser from caching it, it may even pass an _
(underscore) argument with a random value, which will be ignored.
The rest of the arguments are relevant and checked for sanity before being acted upon.
We want the sortorder
variable to contain a list of tuples, each consisting of a column name and a sort direction, but the values of the input elements are simply interpreted as strings by CherryPy, so we have to convert this list of strings to a list of tuples by splitting those strings on the comma separator. We neither check for the validity of the column names, nor for that of the sort directions because that will be done by the code doing the actual work.
The pattern
variable is treated in a similar way, but because we may want to filter on values containing commas, we cannot simply use the split()
method here, but have to pass it a limit of 1 to restrict its splitting to the first comma it encounters.
Next, we pass the sortorder
and pattern
variables to the listids()
class method of the entity we stored with this Browse
instance. It will return a list of IDs of instances that match the pattern
criteria (or all instances if no patterns are specified) sorted in the correct order. Note that since the number of instances might be huge, we do not use the list()
method here because converting all IDs to entity instances at once might render the application unresponsive. We will just convert those IDs to instances that we will actually show on the page, based on the start and page variables.
To calculate the new start index, we will have to check if we act upon one of the paging buttons (highlighted) and add or subtract a page length if we are acting on a click on the Next or Previous button. We set the start index to 0 if the First button was clicked. If the Last button was clicked, we set the start index to the number of items minus the length of the page. If any of these calculations result in a start index that is less than zero, we set it to zero.
The next step is to produce the actual output, yielding one line at a time, beginning with the<table>
element. Our table consists of a head and a body, the head consisting of a single row of<th>
elements, each containing either the display name of the column we are showing if it represents an attribute of the entity, or the name of the class if it represents a related entity. Any sort order associated with this column is represented in its class
attribute, so we may use CSS to make this visible to the user.
To display the rows in the body of the table, we convert the relevant IDs in the selection to actual entities (highlighted) and generate<td>
elements for each attribute. If the column refers to related entities, their primary attributes are displayed, each related entity encapsulated in its own<span>
element. The latter will enable us to associate relevant actions with each individual item shown, for example, displaying it in full when it is clicked.
The final long list of yield
statements is used to produce the form with its many hidden input elements, each recording the arguments that were passed to the index()
method.
The bulk of the activity when browsing through lists in a typical application is paging forward and backward. If we would need to retrieve the full list of entities each time we forward a single page, the application might feel sluggish if the list was huge or the sorting and filtering was complicated. It might therefore be sensible to implement some sort of caching scheme.
There are a couple of things to consider though:
These requirements can be satisfied if we change the line in the index()
that retrieves the matching IDs into the following few lines:
Chapter7/browse.py
if not (next is None and previous is None and first is None and last is None): cacheid=self.iscached(cacheid,sortorder,pattern) else: cacheid=None if cacheid is None: ids = self.entity.listids( pattern=pattern,sortorder=sortorder) cacheid = self.storeincache(ids,sortorder,pattern) else: ids = self.getfromcache(cacheid,sortorder,pattern) if ids == None: ids = self.entity.listids( pattern=pattern,sortorder=sortorder) cacheid = self.storeincache(ids,sortorder, pattern)
Because we will store a unique cacheid
in a hidden<input>
element, it will be passed as an argument when the form is submitted. We use this cachid
together with the sortorder
and pattern
arguments to check whether a previously retrieved list of IDs is present in the cache with the iscached()
method. Passing the sortorder
and pattern
arguments will enable the iscached()
method to determine if these are changed and invalidate the cache entry.
iscached()
will return the cacheid
if it exists in the cache or None
if it doesn't. iscached()
will also return None
if the cacheid
does exist but the sortorder
or pattern
arguments were changed.
Next, we check if the cacheid
is None
. This may seem redundant, but if index()
was called for the first time (without arguments, that is) none of the submit button arguments would be present and we wouldn't have checked the cache.
This is intended: if we would, at a later point, revisit this list, we would want a fresh set of items, not some old cached ones. After all, the contents of the database might have changed.
If the cacheid
is None
we retrieve a fresh list of IDs and store it in the cache together with the sortorder
and pattern
arguments. The storeincache()
method will return a freshly minted cacheid
for us to store in the hidden<input>
element.
If the cacheid
was not None
, we use the getfromcache()
method to retrieve the list of IDs from the cache. We check the returned value because between our checking for the existence of the key in the cache and retrieving the associated data, the cache might have been purged, in which case, we still call the listids()
method.
The implementation of the iscached(), getfromcache()
, and storeincache()
method takes care of all the thread safety issues:
def chash(self,cacheid,sortorder,pattern): return cacheid + '-' + hex(hash(str(sortorder))) + '-' + hex(hash(str(pattern))) def iscached(self,cacheid,sortorder,pattern): h=self.chash(cacheid,sortorder,pattern) t=False with self.cachelock: t = h in self.cache if t : self.cache[h]=(time(),self.cache[h][1]) return cacheid if t else None def cleancache(self): t={} with self.cachelock: t={v[0]:k for k,v in self.cache.items()} if len(t) == 0 : return limit = time() oldest = limit limit -= 3600 key=None for tt,k in t.items(): if tt<limit: with self.cachelock: del self.cache[k] else: if tt<oldest: oldest = tt key = k if key: with self.cachelock: del self.cache[key] def storeincache(self,ids,sortorder,pattern): cacheid=uuid().hex h=self.chash(cacheid,sortorder,pattern) if len(self.cache)>self.cachesize : self.cleancache() with self.cachelock: self.cache[h]=(time(),ids) return cacheid def getfromcache(self,cacheid,sortorder,pattern): ids=None h=self.chash(cacheid,sortorder,pattern) with self.cachelock: try: ids=self.cache[h][1] except KeyError: pass return ids
All methods use the chash()
method to create a unique key from the cacheid
and the sortorder
and pattern
arguments. iscached()
waits until it acquires a lock to check if this unique value is present in the cache. If it is, it updates the associated value, a tuple consisting of a timestamp and a list of IDs. By updating this timestamp here, we reduce the chance that this item is purged from the cache between the check for existence and the actual retrieval.
The getfromcache()
method creates a unique key with the chash()
method in the same way iscached()
did and waits to acquire the lock before it uses the key to retrieve the value from the cache. If this fails, a KeyError
will be raised that will be caught, causing the None
value to be returned as that was what the IDs variable was initialized to.
The storeincache()
method first creates a new cacheid
using one of the uuid()
functions from Python's uuid
module, essentially creating a random string of hexadecimal characters. Together with the sortorder
and pattern
arguments, this new cacheid
is used to generate a unique key.
Before we store the list of IDs in the cache, we check whether there is any space left by comparing the number of keys in the cache to the maximum length we are prepared to accept. If there isn't any room left, we make room by calling the cleancache()
method that will remove any entries that are too old. We then store the IDs together with a time stamp after acquiring a lock and return the cacheid
just generated.
The final cog in our caching machinery is the cleancache()
method. After requiring a lock, a reverse map is built, mapping timestamps to keys. If this map holds any items, we use it to locate any key that is older than an hour. Those are deleted after acquiring a lock.
The whole business with acquiring a lock and releasing it as quick as possible instead of acquiring the lock and doing all the cache-related business in one go ensures that other threads accessing the cache do not have to wait very long, which keeps the whole application responsive.
If the age of an entry is less than an hour, we keep notes to see which of the remaining ones is the oldest to remove that one at the end. This way, we ensure that we always retire at least one entry, even if there aren't any really old ones.
3.139.67.5