Implementing caching strategies

There are a number of ways to decide when to create and when to expire cache items, so we'll look at one of the simpler and faster methods. If you are interested in developing this further, however, you might consider other caching strategies, some of which can improve resource usage and performance.

Using Least Recently Used

One common tactic for keeping a cache within its allocated resources (disk space, memory) is the Least Recently Used (LRU) system of cache expiration. In this model, the cache management system uses the last access time of each entry (its creation or update) to remove the oldest entry in the list when space is needed.
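As a minimal sketch of the model (not the file-based mechanism we build later in this chapter, and with all names being illustrative), an LRU cache can be built in Go from a map and a doubly linked list, letting container/list do the recency bookkeeping:

import "container/list"

// lruCache is an illustrative fixed-capacity LRU cache.
type lruCache struct {
  capacity int
  order    *list.List               // front = most recently used
  items    map[string]*list.Element // key -> list element
}

type entry struct {
  key   string
  value string
}

func newLRU(capacity int) *lruCache {
  return &lruCache{
    capacity: capacity,
    order:    list.New(),
    items:    make(map[string]*list.Element),
  }
}

// Get returns a value and marks it as most recently used.
func (c *lruCache) Get(key string) (string, bool) {
  el, ok := c.items[key]
  if !ok {
    return "", false
  }
  c.order.MoveToFront(el)
  return el.Value.(*entry).value, true
}

// Set inserts or updates a value, evicting the least recently
// used entry once capacity is exceeded.
func (c *lruCache) Set(key, value string) {
  if el, ok := c.items[key]; ok {
    el.Value.(*entry).value = value
    c.order.MoveToFront(el)
    return
  }
  c.items[key] = c.order.PushFront(&entry{key, value})
  if c.order.Len() > c.capacity {
    oldest := c.order.Back()
    c.order.Remove(oldest)
    delete(c.items, oldest.Value.(*entry).key)
  }
}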

This has a number of benefits for performance. First, if we assume that the most recently created or updated cache entries are the most popular ones, we can remove the entries that are not being accessed, freeing up resources for existing and new entries that are likely to be accessed far more frequently.

This is a fair assumption, provided the resources allocated for caching are not inconsequential. If you have a large volume for file cache or a lot of memory for memcache, the oldest entries, in terms of last access, are quite likely not being utilized with great frequency.

There is a related, more granular strategy called Least Frequently Used (LFU), which maintains strict statistics on the usage of the cache entries themselves. This removes the need for assumptions about the cache data, but it also adds overhead for maintaining those statistics.
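To make the contrast concrete, here is an illustrative sketch of LFU bookkeeping (all names are hypothetical); note the per-entry counter that must be updated on every single read, which is exactly the overhead mentioned previously:

// lfuCache is an illustrative LFU cache.
type lfuCache struct {
  capacity int
  values   map[string]string
  hits     map[string]int // usage statistics per key
}

func newLFU(capacity int) *lfuCache {
  return &lfuCache{
    capacity: capacity,
    values:   make(map[string]string),
    hits:     make(map[string]int),
  }
}

func (c *lfuCache) Get(key string) (string, bool) {
  v, ok := c.values[key]
  if ok {
    c.hits[key]++ // statistics maintenance on every read
  }
  return v, ok
}

func (c *lfuCache) Set(key, value string) {
  if _, exists := c.values[key]; !exists && len(c.values) >= c.capacity {
    // Evict the least frequently used key.
    var coldest string
    lowest := -1
    for k := range c.values {
      if lowest == -1 || c.hits[k] < lowest {
        coldest, lowest = k, c.hits[k]
      }
    }
    delete(c.values, coldest)
    delete(c.hits, coldest)
  }
  c.values[key] = value
}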

For our demonstrations here, we will be using LRU.

Caching by file

Our first approach is probably best described as the classical one for caching, though it is not without issues. We'll utilize the disk to create file-based caches for individual endpoints, both API and Web.

So what are the issues associated with caching in the filesystem? Well, earlier in the chapter, we mentioned that disk can introduce its own bottleneck. Here, we're making a trade-off: we protect access to our database at the risk of running into contention on disk I/O.

This gets particularly complicated if our cache directory grows very large, at which point we introduce yet more file access issues.

Another downside is that we have to manage our cache ourselves; the filesystem is not ephemeral, but our available space is finite. We'll need to expire cache files by hand, which introduces another round of maintenance and another point of failure.
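As a hedged sketch of what that maintenance might look like, the following illustrative helper (the name pruneCache and the size budget are ours, not part of the application) removes the least recently modified files once a cache directory exceeds a byte budget, using modification time as a stand-in for last use in the spirit of LRU:

import (
  "io/ioutil"
  "os"
  "path/filepath"
  "sort"
)

// pruneCache removes the least recently modified files in dir until
// the directory's total size drops below maxBytes.
func pruneCache(dir string, maxBytes int64) error {
  files, err := ioutil.ReadDir(dir)
  if err != nil {
    return err
  }

  var total int64
  for _, f := range files {
    total += f.Size()
  }

  // Oldest modification time first, mirroring the LRU idea.
  sort.Slice(files, func(i, j int) bool {
    return files[i].ModTime().Before(files[j].ModTime())
  })

  for _, f := range files {
    if total <= maxBytes {
      break
    }
    if err := os.Remove(filepath.Join(dir, f.Name())); err != nil {
      return err
    }
    total -= f.Size()
  }
  return nil
}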

All that said, it's still a useful exercise and can still be utilized if you're willing to take on some of the potential pitfalls:

package cache

import (
  "bytes"
  "html/template"
  "io/ioutil"
  "os"
  "strings"
  "time"
)

const (
  // Location is the directory where cache files are written.
  Location = "/var/cache/"
)

// CacheItem represents a single file-backed cache entry.
type CacheItem struct {
  TTL time.Duration
  Key string
}

// NewCache builds a CacheItem keyed on an endpoint and its query
// parameters; we'll flesh this out shortly.
func NewCache(endpoint string, params ...string) CacheItem {
  return CacheItem{}
}

// Get reports whether a valid cache file exists and, if so,
// returns its contents.
func (c CacheItem) Get() (bool, string) {
  return true, ""
}

// Set renders the page into the cache file.
func (c CacheItem) Set(t *template.Template, data interface{}) bool {
  return true
}

// Clear removes the cache file.
func (c CacheItem) Clear() bool {
  return true
}

This sets the stage to do a few things, such as create unique keys based on an endpoint and query parameters, check for the existence of a cache file, and if it does not exist, get the requested data as per normal.

In our application, we can implement this simply. Let's put a file caching layer in front of our /page endpoint as shown:

func ServePage(w http.ResponseWriter, r *http.Request) {
  vars := mux.Vars(r)
  pageGUID := vars["guid"]
  thisPage := Page{}
  cached := cache.NewCache("page", pageGUID)

The preceding code creates a new CacheItem. We utilize the variadic params to generate a reference filename:

func NewCache(endpoint string, params ...string) CacheItem {
  // Build a unique cache filename from the endpoint and its parameters.
  cacheName := endpoint + "_" + strings.Join(params, "_")
  // The TTL here is an arbitrary default, chosen for illustration.
  c := CacheItem{Key: Location + cacheName, TTL: 4 * time.Hour}
  return c
}

When we have a CacheItem object, we can check it using the Get() method, which returns true if the cache is still valid and false otherwise. We utilize filesystem information to determine whether a cache item is within its valid time-to-live:

  valid, cachedData := cached.Get()
  if valid {
    // Serve the previously rendered page straight from the cache.
    fmt.Fprint(w, cachedData)
    return
  }

Inside the Get() method, we check that the cache file exists and that it has been updated within the set TTL:

func (c CacheItem) Get() (bool, string) {

  stats, err := os.Stat(c.Key)
  if err != nil {
    return false, ""
  }

  // Compare the file's age against the item's time-to-live.
  age := time.Since(stats.ModTime())
  if age <= c.TTL {
    cache, err := ioutil.ReadFile(c.Key)
    if err != nil {
      return false, ""
    }
    return true, string(cache)
  }
  return false, ""
}

If the cache file exists and is within its TTL, we return true along with the file's contents; otherwise, we allow a passthrough to the usual page retrieval and generation. At the tail of that, we can set the cache data:

  t, _ := template.ParseFiles("templates/blog.html")
  cached.Set(t, thisPage)
  t.Execute(w, thisPage)

We then save the rendered page as follows:

func (c CacheItem) Set(t *template.Template, data interface{}) bool {
  // Render the template into a buffer, then persist it to the cache file.
  var buf bytes.Buffer
  if err := t.Execute(&buf, data); err != nil {
    return false
  }
  err := ioutil.WriteFile(c.Key, buf.Bytes(), 0644)
  return err == nil
}

This method renders the page once and writes the result to our cache file.
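For completeness, the Clear() method we stubbed out earlier can simply remove the file, forcing regeneration on the next request:

func (c CacheItem) Clear() bool {
  // Delete the cache file; a subsequent Get() will report a miss.
  err := os.Remove(c.Key)
  return err == nil
}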

We now have a working system that takes individual endpoints and their innumerable query parameters and creates a file-based cache, ultimately preventing unnecessary queries to our database when the data has not changed.

In practice, we'd want to limit this to mostly read-based pages and avoid putting blind caching on any write or update endpoints, particularly on our API.
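For example, inside a handler that serves multiple HTTP methods, we might consult the cache only for GET requests; a minimal sketch of such a guard:

  // Only read requests are served from the cache; writes and
  // updates always pass through to the database.
  if r.Method == "GET" {
    if valid, cachedData := cached.Get(); valid {
      fmt.Fprint(w, cachedData)
      return
    }
  }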

Caching in memory

Just as filesystem caching became far more palatable as storage prices plummeted, we've seen a similar move in RAM, trailing just behind hard storage. The big advantage here is speed; caching in memory can be insanely fast, for obvious reasons.

Memcache, and its distributed sibling Memcached, evolved out of Brad Fitzpatrick's need for light, super-fast caching for LiveJournal, his proto-social network. If that name feels familiar, it's because Brad now works at Google and is a serious contributor to the Go language itself.

As a drop-in replacement for our file caching system, Memcached works similarly. The only major change is in our key lookups, which go against working memory instead of doing file checks.

Note

To use memcache with Go, get Brad Fitzpatrick's client library at godoc.org/github.com/bradfitz/gomemcache/memcache and install it using the go get command.
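As a minimal sketch of how the same Get/Set flow looks against memcache using that client (the server address, key, value, and TTL here are all assumptions for illustration):

import (
  "github.com/bradfitz/gomemcache/memcache"
)

func memcacheExample() {
  // Connect to a local memcached instance (address assumed).
  mc := memcache.New("127.0.0.1:11211")

  // Set replaces our file write; Expiration is the TTL in seconds.
  mc.Set(&memcache.Item{
    Key:        "page_hello-world",
    Value:      []byte("<h1>Hello, world</h1>"),
    Expiration: 3600,
  })

  // Get replaces our os.Stat/ioutil.ReadFile check; a cache miss
  // is reported as memcache.ErrCacheMiss.
  it, err := mc.Get("page_hello-world")
  if err == memcache.ErrCacheMiss {
    // Fall through to the database and regenerate the page.
  } else if err == nil {
    _ = it.Value // the cached, rendered page
  }
}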
