How it works...

If the HTML content in the database is valid, placing the following code in the template retrieves the first media tag from the content field of the object; if no media is found, an empty string is returned:

{% load utility_tags %}
{{ object.content|first_media }}

Regular expressions are a powerful tool for searching for and replacing patterns of text. First, we define lists of all the supported media tag names, split into two groups: those that have both opening and closing tags (MEDIA_CLOSED_TAGS), and those that are self-closed (MEDIA_SINGLE_TAGS). From these lists, we generate the compiled regular expression, MEDIA_TAGS_REGEX. In this case, we search for all the possible media tags, allowing for them to occur across multiple lines.
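As a rough illustration of the construction described above, the following sketch builds one pattern from the two lists. The tag names and the constant names come from the text; the exact way the pattern is assembled, and the helper name find_first_media, are assumptions made for this example rather than the recipe's actual code.

```python
import re

# Tag names as listed in the text; how they are joined is an assumption.
MEDIA_CLOSED_TAGS = ["figure", "object", "video", "audio", "iframe", "picture"]
MEDIA_SINGLE_TAGS = ["img", "embed"]

# One alternative per tag, joined with the pipe symbol.
closed = "|".join(rf"<{tag}[\S\s]+?</{tag}>" for tag in MEDIA_CLOSED_TAGS)
single = "|".join(rf"<{tag}[^>]+>" for tag in MEDIA_SINGLE_TAGS)
MEDIA_TAGS_REGEX = re.compile(f"{closed}|{single}", re.MULTILINE)


def find_first_media(content):
    """Return the first media tag found in content, or "" if there is none."""
    match = MEDIA_TAGS_REGEX.search(content)
    return match.group() if match else ""
```

Because search() returns the leftmost match, a figure that wraps an image is returned as a whole rather than only the inner img tag.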

Let's see how this regular expression works, as follows:

Alternating patterns are separated by the pipe (|) symbol.
There are two groups within the patterns—first of all, those with both opening and closing normal tags (<figure>, <object>, <video>, <audio>, <iframe>, and <picture>), and then one final pattern for what are called self-closing or void tags (<img> and <embed>).
For the possibly multiline normal tags, we use the [\S\s]+? pattern, which matches any symbol (including a newline) at least once, but as few times as possible, until we find the string that goes after it. Therefore, <figure[\S\s]+?</figure> searches for the start of the <figure> tag and everything after it, until it finds the closing </figure> tag.
Similarly, with the [^>]+ pattern for self-closing tags, we search for any symbol except the right-angle bracket (possibly better known as a greater than symbol, that is to say, >) at least once and as many times as possible, until we encounter such a bracket indicating the closure of the tag.
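The difference between the lazy and greedy quantifiers described in the list above can be checked with a small standalone experiment. The sample markup and the variable names here are made up for illustration.

```python
import re

html = "<video>one</video>\n<p>text</p>\n<video>two</video>"

# Lazy: stops at the first closing tag it can reach.
lazy = re.search(r"<video[\S\s]+?</video>", html).group()

# Greedy: runs on to the last closing tag in the string.
greedy = re.search(r"<video[\S\s]+</video>", html).group()

# [^>]+ cannot cross a right-angle bracket, so it ends with the tag itself.
single = re.search(r"<img[^>]+>", '<img src="a.png" alt="A"> <b>bold</b>').group()
```

Here lazy holds only the first video element, greedy swallows everything up to the second closing tag, and single holds just the img tag.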

The re.MULTILINE flag is passed when the pattern is compiled. Strictly speaking, this flag only changes how the ^ and $ anchors behave; it is the [\S\s] character class, which matches any character including newlines, that allows matches to span multiple lines in the content. Then, in the filter, we perform a search using this regular expression pattern. By default, in Django, the result of any filter will show the <, >, and & symbols escaped as the &lt;, &gt;, and &amp; entities, respectively. In this case, however, we use the mark_safe() function to indicate that the result is safe and HTML-ready, so that it will be rendered without escaping. Because the originating content is user input, we do this instead of passing is_safe=True when registering the filter, as we need to explicitly certify that the markup is safe.
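To make the effect of escaping concrete, here is a simplified model of the idea using only the Python standard library. It is not Django's implementation: SafeMarkup and render_value are hypothetical names invented for this sketch, and html.escape stands in for the template engine's autoescaping.

```python
from html import escape


class SafeMarkup(str):
    """Hypothetical marker type: a string that should not be escaped again."""


def render_value(value):
    """Escape plain strings; pass explicitly marked strings through unchanged."""
    if isinstance(value, SafeMarkup):
        return str(value)
    return escape(value)


tag = '<img src="a.png">'
escaped = render_value(tag)                # shown as literal text on the page
unescaped = render_value(SafeMarkup(tag))  # rendered as an actual image tag
```

In this model, an unmarked string comes out as &lt;img src=&quot;a.png&quot;&gt;, while the marked string is left untouched. Marking a value is an explicit statement of trust, so it belongs only where the markup has been checked.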
