Chapter 8

Looking Ahead

Section Editors: Chris Lennon and Clyde Smith

We conclude with a section discussing where things are likely headed in the area of media workflow. This is an area undergoing constant change, so once the reader has a good grasp of the entire ecosystem, understanding where things are going becomes the most important thing. We address industry trends, as well as future specifications and standards that will help make the media workflow puzzle a much easier one to put together.

We have enlisted industry experts Al Kovalick and Stan Moote to peer into their crystal balls and tell us what they see.

Looking Deep into the Future: Al Kovalick

With 100+ years of SMPTE's technology and standards developments behind us, it is fitting to look ahead and ask, "What could the future look like for video technology and facility infrastructure?" On July 1, 1941 the first commercial TV stations, WNBT and WCBW, made use of the new NTSC video and over-the-air standards to start broadcasting in New York City. The video signal was analog, 525-line, interlaced, black & white, and ran at exactly 60 (not 59.94…) fields per second. In 2020 we have all-digital, UHD 4K/8K progressive images with high dynamic range, wide color gamut, and high frame rate choices. The image quality is several orders of magnitude improved from the early days. Consider too the primitive systems for analog video production in 1941 compared to the all-digital, IT-based systems of today.

Over this 80-year period, many notable improvements would have been impossible to predict. So, is it a "fool's errand" to predict what video technology and infrastructure design may look like in 2036, 15 years hence? Observe what the famous quantum physicist Niels Bohr (1885–1962) said: "Prediction is very difficult, especially if it's about the future." He was likely referring to a Danish proverb from around 1948.

Interestingly, Mr. Bohr also said, "Technology has advanced more in the last thirty years than in the previous two thousand. The exponential increase in advancement will only continue."

If this was stated in about 1960 or earlier, then his perspective starts from 1930 or sooner. Despite how difficult it is to accurately forecast even ten seconds in advance for some events, predicting the future state of a technology stands a chance if we can extend relevant historical trends. Although Mr. Bohr's first quote is slightly pessimistic, his second quote is optimistic and supports extending likely exponential trends to obtain realistic forecasts. Of course, there are many reasons why relying on brute-force extrapolation often fails. Bottom line: there is merit in the approach if applied sensibly.

So, the approach in this section is to identify several current technology trends and extend them 15 years into the future. Due caution is applied, with acknowledgment of expected plateaus for some trends. Why 15 years? It is a reachable goal given the current momentum and advances in areas that impact video technology and infrastructure.

The Granddaddy of Exponential Laws in Electronics

The most famous exponential curve in electronics is Moore's Law. This law states "The number of transistors on an integrated circuit (IC) would double every year [starting from 1959]." It was postulated (not called Moore's Law at the time) in a 1965 paper in Electronics Magazine by Gordon Moore, then of Fairchild Semiconductor and later a co-founder of Intel (Figure 8.1). There is no unanimous consensus on who invented the IC. Certainly Jack Kilby (Texas Instruments), Robert Noyce (Fairchild Semiconductor, later a co-founder of Intel), and Kurt Lehovec (Sprague Electric) all played vital roles, and Mr. Kilby received the Nobel Prize in Physics (2000) for his work.

The first data point on the graph in Figure 8.1 was likely for Mr. Kilby's IC, a one-transistor, multiple-resistor, phase-shift sinusoidal oscillator. Six years later, in 1965, there were 2^6 (64) transistors on an IC. Mr. Moore's great idea was to extend the data trend with the dotted line heading to 2^16 in 1975, a growth that was a doubling of transistor density every year. This author has extended the original graph to 2016, indicating 2^35 (about 34 billion) transistors per IC. The original growth rate has slowed down to what is now about a doubling every 2.5 years, hence the different slope of the line heading towards 2016. The Altera/Intel Stratix 10 FPGA has approximately 30 billion transistors [Ref 1], so it's in the ballpark of 2^35.
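As a quick, illustrative check of that arithmetic, here is a minimal sketch in Python. The dates and counts are those quoted above; the projection helper simply assumes a fixed doubling period, which is an assumption, not a property of the original graph.

```python
import math

def project(count, years, doubling_period_years):
    """Project a transistor count forward assuming a fixed doubling period."""
    return count * 2 ** (years / doubling_period_years)

# 1965: about 2^6 = 64 transistors per IC; doubling every year reaches 2^16 by 1975.
print(f"1975 estimate: {project(64, 10, 1.0):,.0f} transistors")   # 65,536 = 2^16

# Reaching ~2^35 (about 34 billion) by 2016 implies an average doubling period
# of a little over two years across the 1975-2016 span.
implied = (2016 - 1975) / math.log2(2**35 / 2**16)
print(f"implied average doubling period, 1975-2016: {implied:.1f} years")  # ~2.2
```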

The fact that Moore's Law has remained true, albeit with some slope modification, for about 57 years is amazing and provides confidence in extending its predictions year after year. Many thoughtful researchers continue to predict the end of Moore's Law, but so far it has not been derailed. There is a joke about the projected failure of the Law: the number of people predicting the end of Moore's Law doubles every two years.

Figure 8.1 Moore’s Law from 1959 to 2016

Of course, someday it will end. In the insightful paper Universal Limits on Computation [Ref 2], the authors put a limit on the total number of bits that can be processed in a theoretical computer at 1.35×10^120. This fantastic number in turn requires an end to Moore's Law sometime around the year 2600. That is an unreachable upper limit; in reality the end will come considerably sooner, possibly near 2022 [Ref 3]. This topic is explored in more detail later in the chapter. Next, an appreciation for the power of exponential growth.

The Power of the Exponential

It is easy for humans to appreciate linear growth, but exponential thinking challenges our logic. See Figure 8.2.

If a person has a step distance of 2.5' then after 30 steps, they have progressed 75 feet. This is easy to grasp (line 'A'). On the other hand, if a person with "very long legs" makes each step double the previous one, then after 30 steps they have walked on the order of a million miles (line 'B'). What an incredible difference with the same number of steps. This is the power of exponential growth and explains Moore's Law going from 1 to ~2^35 transistors per IC in just 57 years.
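A minimal sketch of that walk arithmetic follows. The 2.5-foot step and 30-step count come from the text; whether the total lands nearer half a million or a full million miles depends on whether the first step is counted as a doubling, and the contrast with 75 feet is the point either way.

```python
STEP_FT = 2.5
FEET_PER_MILE = 5280

# Line 'A': thirty constant 2.5-foot steps
linear_ft = 30 * STEP_FT                                   # 75 feet

# Line 'B': each step doubles the previous one (2.5, 5, 10, ...)
exponential_ft = sum(STEP_FT * 2**k for k in range(30))    # 2.5 * (2^30 - 1)

print(f"linear walk:      {linear_ft:.0f} ft")
print(f"exponential walk: {exponential_ft / FEET_PER_MILE:,.0f} miles")  # ~508,000
```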

Note also line ‘C’: This shows a curve starting as exponential but slowing down and coming to an end or plateau. Adults who experienced the COVID-19 pandemic in 2020 have a very real understanding of exponential infection-growth and the meaning of “flattening the curve” as the growth tapers off. There are many examples of (sometimes fast) rising technologies sidelined due to competing choices. Instances are1:

  • Postal/Telegraph/Phone

  • Analog/Digital (SDI)

  • Slide rule/Computer

  • Vinyl/Tape/CD/Online

  • Point-to-point/Internet

  • CRT/LCD

  • DVD/Netflix

  • Videotape/Servers

  • Custom AV/IP-COTS

  • Facility data center/Cloud

Figure 8.2 Linear Compared to Exponential Growth

A classic example is the rise and fall of the audio CD. From about 1985 to 2000, units shipped grew about 32×, a doubling every three years on average [Ref 4]. After 2000 the CD purchase rate slowed, and by 2018 the format was nearly dead, replaced by online music choices. The CD example is typical: technologies follow the "slow-exponential-slow" path as in 'C.' This is often called the S-curve life cycle [Ref 5]. Typically, one S-curve replaces the previous one, as has occurred for the vinyl/tape/CD/online timeline. It's important to understand the concept of the S-curve when making forecasts.

By way of illustration, Figure 8.3 shows a simplified evolution of video image quality over time. The S-curves are shown “in series” but in reality, two or more will overlap as one method loses favor and another slowly replaces it. From the ‘black & white’ era2 (1941) to the HDR+ era (2015 and beyond) image quality has improved many orders of magnitude following an exponential path of S-curves. The Y axis is labeled Progress and for the general case can represent some assigned “metric of improvement” (e.g., image resolution, bit depth, colorimetry, etc.).
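For readers who want a concrete model, the S-curve is commonly described with a logistic function. The short sketch below is purely illustrative; the midpoint and rate parameters are made up rather than taken from Figure 8.3.

```python
import math

def s_curve(t, ceiling=1.0, midpoint=0.0, rate=1.0):
    """Logistic S-curve: near-exponential growth early on, a plateau at `ceiling` later."""
    return ceiling / (1 + math.exp(-rate * (t - midpoint)))

# Early values grow almost exponentially; late values flatten toward the plateau.
for t in (-6, -3, 0, 3, 6):
    print(f"t = {t:+d}   progress = {s_curve(t):.3f}")
```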

With this background, let’s apply these ideas to forecasting the state of video technology and infrastructure in 2036. The next sections of this chapter will consider what future growth looks like for compute, storage, networking, and related video infrastructure.

The Evolution of Infrastructure Elements

The three key elements that comprise media systems infrastructure are storage, compute, and networking (abbreviated here as S-C-N). Each trend will be extended into the future, along with analysis of the basis for the prediction. The results developed here are applicable to any size of media system.

Figure 8.3 The Exponential and the S-Curves of Evolution for Video

Storage Capacity

In this case the approach is to study the existing trends for storage and then extrapolate to about 2036. Over the period 1956–2014 the average Compound Annual Growth Rate (CAGR) for Hard Disk Drive (HDD) areal bit density was 41%, nearly identical to the Moore's Law rate for semiconductor devices over the same period [Ref 6].

For this analysis, a “conservative” future CAGR of 26% (about 2× every 3 years) is used to predict disk capacity in 2036. Figure 8.4 shows this growth starting from 4TB, a common capacity in 2016. In 2037 the estimated hard-disk capacity is about 512TB. Of course, this may seem unlikely but only if you bet against exponential growth. On the other hand, it can be argued that a constant 26% CAGR will not last for 20 years. Nonetheless, new storage methods are being invented now that could possibly fill the gap if one occurs. Remember the S-curve. In 2020, 14 TB drives are commercially available.
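A minimal sketch of the extrapolation behind Figure 8.4 is shown below. The 4 TB starting point and 26% CAGR are the chapter's figures; the helper function is simply compound growth.

```python
def cagr_project(start_value, cagr, years):
    """Compound a starting value forward at a constant annual growth rate."""
    return start_value * (1 + cagr) ** years

# 4 TB in 2016, growing at 26% CAGR (about 2x every 3 years, since 1.26^3 ~= 2)
for year in (2022, 2028, 2037):
    tb = cagr_project(4, 0.26, year - 2016)
    print(f"{year}: ~{tb:,.0f} TB per drive")
# The 2037 value lands near 512 TB, matching the estimate in the text.
```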

Figure 8.4 Predicting Hard-Disk Storage Capacity

As the physical universe goes, so goes the digital universe; it is expanding rapidly. In 2020 it contains nearly as many digital bits as there are stars in the universe. It is doubling in size every two years, and by the end of 2020 the digital universe of data we generate will reach 44 zettabytes, or 44 trillion gigabytes [paraphrased from Ref 7]. Some of this is temporary data and will be erased after its lifecycle ends. With the death of videotape, most professional video is now stored on hard disk or LTO (or similar) tape for archive. So even with the projected large growth of storage capacity, it seems there will be ample data to fill the devices.

Compute Performance

Next, the future state of compute power, measured by the number of transistors per IC, is evaluated. There are other metrics but this one will be used given its time-honored status. Naturally, following Moore’s Law is a good guide. The current growth rate is closer to 2× every 2.5 years, according to Intel CEO Brian Krzanich [Ref 8]. This is considerably less than the 2× every year that the Law started with in 1965. For our analysis we will use 2× every 3 years as with the storage case. This also applies to the Graphics Processing Unit (GPU), very useful for video processing and machine learning computation.

Modern CPUs have multiple compute cores per IC and transistor count is divided among the cores. For our purposes, the Intel 24-core Xeon Broadwell-E5 (22 active to increase yield) will be the starting benchmark with about 7.2 billion transistors. Applying a CAGR of 26%, Figure 8.5 predicts the number of cores per IC in 2037. The Y axis unit is “CoreX” and is the multiplier for the base benchmark.

Figure 8.5 Predicting Compute Core Increase

With a CoreX value of 128 in 2037, this implies that the future CPU will have 3,072 cores (maybe 10% or more inactive due to chip yield) with about 922 billion transistors. As pointed out earlier and in Ref [3], it is unlikely that Moore's Law will continue unabated until 2036, even at the slower 2× per 3 years. However, the same reference also describes research to replace the traditional CMOS transistor and keep the Law alive. One way to increase the core count is to connect "chiplets," each with say N cores. A chiplet is an IC block that has been specifically designed to integrate with other similar chiplets to form larger, more complex chips. So, K chiplets (N cores each) interconnected will yield K×N useful compute cores.
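The same compound-growth arithmetic drives Figure 8.5. Here is a short sketch using the 24-core, 7.2-billion-transistor benchmark and 26% CAGR quoted above; the chiplet split at the end is a hypothetical example of the K×N relationship, not a figure from the text.

```python
BASE_CORES = 24              # Xeon Broadwell-E5 benchmark (22 enabled for yield)
BASE_TRANSISTORS = 7.2e9
CAGR = 0.26

core_x = (1 + CAGR) ** (2037 - 2016)                          # ~128x by 2037
print(f"CoreX multiplier: ~{core_x:.0f}")
print(f"cores:            ~{BASE_CORES * core_x:,.0f}")        # about 3,072 (24 x 128)
print(f"transistors:      ~{BASE_TRANSISTORS * core_x:.3g}")   # ~9.2e11 (~922 billion)

# Chiplet view: K chiplets of N cores each yield K*N usable compute cores
K, N = 48, 64                # hypothetical split chosen only for illustration
print(f"{K} chiplets x {N} cores = {K * N} cores")
```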

Thus, Moore’s Law may indeed slow down considerably or even abruptly stop within the next 6–7 years. Or, it may not. Humanity has an insatiable appetite for compute power and researchers are doing their best to ensure that performance increases year after year.

Even if the transistor as we know it comes to a dead end, no doubt something will replace it as a “binary switch,” the foundation of computing. For this chapter it is assumed, somehow, the Law continues even if recrafted as Version 2.0.

Software Powers Media Workflows

What about the software that executes on 3,072 cores? It is beyond the scope of this chapter to investigate the state of software in 15 years. However, there is little doubt that methods will exist to leverage this enormous compute power.

With the advent of Siri, Cortana, Google Assistant, Alexa, and others, "digital assistants" using software-based artificial intelligence (implemented with machine learning (ML) techniques) are mainstream. Many companies at NAB 2019 showcased products using ML, leveraging services from Google Cloud Platform, Amazon Web Services (AWS), IBM Watson, and Microsoft Azure Cognitive Services. The range of applications is impressive, from simple metadata analysis to natural language processing (NLP) to automated program editing.

Accuracy and usefulness are improving daily. It's not too difficult to imagine AI assistants helping to run the media infrastructure of the future. Will an AI assistant ever manage and control the workflows necessary to run a media business? Consider this conclusion from a survey of 550 AI experts [Ref 9]:

Results reveal a view among [550] experts that AI systems will probably (over 50%) reach overall human ability by 2040–50, and very likely (with 90% probability) by 2075.

Sometime after 2036 AI assistants may well run most operational aspects of the media enterprise. The collective prediction of experts in the field indicates a strong probability by 2075, about 40 years short of the 200-year anniversary of SMPTE.

Does this sound a little too fanciful? In his famous 1950 paper [Ref 10], Alan Turing predicted:

I believe that in about fifty years’ time [~2000] it will be possible to program computers … to play the imitation game [question-answer] so well that an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning.

Basically, for a machine to pass the Turing test, a questioner cannot distinguish a machine from a human more than 70% of the time during a 5-minute conversation. Mr. Turing's predicted timeframe was not far off, given the February 2011 winning display of knowledge3 by IBM's Watson on the TV program Jeopardy! [Ref 11].

Figure 8.6 Networking Router Performance

Networking Capability and Performance

The final element in the S-C-N triumvirate is networking. This includes individual routers/switches and their interconnected meshes. The analysis here considers the standalone router/switch. Its capability may be measured by the four gauges listed in Figure 8.6. Routers with hundreds of 100G and 400G ports are common for universal leaf-and-spine designs in 2020.

Methods to stream real-time A/V essence over Ethernet/IP are replacing the Serial Digital Interface (SDI) and AES3 (audio) methods in use today. The transition to IP is well underway utilizing the SMPTE ST2110-xx standards suite. A network’s performance (lossless transport, non-blocking switching, massive data throughput) and features will be heavily relied upon as the basis of the media facility, large and small [Ref 12].

A key gauge of "Routing Performance" is data throughput (e.g., packets per second) and the ability to make deep-packet-inspection decisions. This metric has followed Moore's Law since packet switching is based on silicon integrated circuits. A second gauge is the number of ports and their rate (e.g., 1, 10, 25, 40, 100 Gb/s). Historically, link rates start with the IEEE-standardized rate of 10 Mb/s in 1983. The 1 Tb/s link is expected in about 2025 (Ethernet Alliance). Hence, the link rate will have grown by about 100,000× in ~40 years. This is an end-to-end CAGR of about 33%, surpassing Moore's Law in a sense4, although the growth is not cleanly log-linear due to the uneven date spacing of each new IEEE standard.
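A quick check of that link-rate arithmetic, computing the implied CAGR from the two endpoints cited above (the exact percentage shifts a point or two depending on which end year is assumed):

```python
start_rate, start_year = 10e6, 1983      # 10 Mb/s, first IEEE-standardized Ethernet rate
end_rate, end_year = 1e12, 2025          # 1 Tb/s expected (Ethernet Alliance)

growth = end_rate / start_rate           # 100,000x
years = end_year - start_year            # ~42 years
cagr = growth ** (1 / years) - 1

print(f"growth: {growth:,.0f}x over {years} years")
print(f"implied CAGR: {cagr:.1%}")       # roughly 31-33%
```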

Note that Ethernet rates are closely tied to transistor speed and not chip density. An Ethernet transceiver has very few transistors compared to say a CPU. Plus, Ethernet can use “parallel lanes” and this multiplies the native link rate by the number of lanes. With photonic wavelength division multiplexing the limit of “link” speed may reach 6.2 Tb/s [Ref 13]. Extremely high-speed Ethernet links find application in generic data trunking and other concentrated connectivity uses.

The "Scale" gauge in the figure is applied when individual routers/switches are clustered into a network. The Internet is at one end of the scale. For professional media networking, the leaf-spine architecture is gaining popularity and will be a common method [Ref 14] to route AV streams in a facility, much as SDI switches do today. Leaf-spine configurations enabling thousands of HD/UHD routed video-over-IP streams are available from some vendors in 2020.

The final gauge in the figure is “Control.” For most data traffic, standard routing protocols and methods suffice to steer packets across a network. However, for IP video streams in a studio/venue environment, more deterministic route control is necessary. Streams must be transported using lossless (no packets dropped) paths with guaranteed rates and very low latency end-to-end. Software-defined Networking (SDN) methods are ideal to meet the deterministic routing needs for media transport. SDN is not required for all use cases but many will benefit.

Products using SDN techniques to route IP multicast flows across switches and, in some cases, support frame-accurate video switching performance [Ref 15] are common. Most modern switches support some form of SDN control, and we should expect the control dynamics to improve over time with more deep packet inspection ability. One can imagine video stream switching occurring on a frame boundary within the router itself in the future.

Using past developments in networks as a guide, expect links, meshes, and the standalone router/switch to improve markedly in the future. There is no one metric that can capture all the improvements that are likely to occur, but the performance/capability will likely increase exponentially over the next 15 years and beyond.

Infrastructure Rides the Exponential

Media systems’ infrastructure is composed of many elements and the S-C-N triplet has the biggest share. As shown, these three have been riding exponential growth curves and will likely continue for a time possibly in the classic S-curve fashion. Figure 8.3 provides an example of how one S-curve is replaced by another over time.

What might the future of the media infrastructure look like? One view is illustrated in Figure 8.7.

The Analog and Digital S-Curves

It all started in the 1940s with the advent of analog television. All production was either live or on film until the videotape recorder was invented by AMPEX in 1956. In about 1990 digital SDI transport came into being and slowly replaced analog transport. SDI has improved from a starting5 rate of 270 Mb/s (SMPTE ST 259) to a single-link data rate of 24 Gb/s (SMPTE ST 2083-20) in 2019. This is an 88× data-rate growth in about 28 years, needed to support ever-improving video resolutions and increasing frame rates. Video multiplexed across dual and quad links increases the aggregate transport rate at the cost of system complexity (e.g., quad-link ST 2083-22 at 96 Gb/s). However, the next S-curve, labeled "COTS/IP," is gaining steam, eventually to replace SDI and other "AV custom" links and equipment.

Figure 8.7 Exponential S-Curves of Media Infrastructure Progress

The COTS/IP S-Curve

COTS is commercial-off-the-shelf IT equipment including the commodity versions of S-C-N. The intent is to use these elements to ingest, process, store, and distribute media. The IP term indicates the move towards IP/Ethernet transport to carry A/V and metadata streams. This S-curve is mature for file-based workflows and at the emergent stage for real-time IP video streaming. Both vendors and end-users appreciate the business and technical benefits of COTS/IP infrastructure compared to bespoke AV systems [Ref 16].

The SDMI S-Curve

The next S-curve taking shape is labeled Software-defined Media Infrastructure (SDMI)6. Common themes of software-defined (SD) are:

  • Providing “resource services” that are independent of the hardware

  • Programmability of behavior

  • Dynamic resource control/management

Using APIs, controllers allocate each S-C-N resource to match workloads. If 5 TB of additional storage is needed to execute a given workflow, controllers allocate it, if available, without effort from an administrator. The same applies to compute and networking resources in their respective ways (a minimal sketch of this allocation pattern follows the list below). The three models are:

  • SD Compute using virtualization and containers

  • SDS for storage

  • SDN for networking
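As promised above, here is a minimal sketch of the API-driven allocation pattern. The class and method names are hypothetical and exist only to illustrate the software-defined idea; they are not an actual SDMI or vendor API.

```python
from dataclasses import dataclass

@dataclass
class ResourcePool:
    """One software-defined resource pool (storage TB, compute cores, or network Gb/s)."""
    name: str
    capacity: float
    allocated: float = 0.0

    def allocate(self, amount: float) -> bool:
        """Grant the request if capacity remains; no administrator intervention needed."""
        if self.allocated + amount <= self.capacity:
            self.allocated += amount
            return True
        return False

# Hypothetical pools for the S-C-N triplet
storage = ResourcePool("storage-TB", capacity=500)
compute = ResourcePool("compute-cores", capacity=3072)
network = ResourcePool("network-Gbps", capacity=400)

# A workflow needing 5 TB of additional storage simply asks the controller for it
if storage.allocate(5):
    print(f"5 TB granted; {storage.capacity - storage.allocated:.0f} TB remaining")
```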

The full SD-based facility offers resource agility and programmable workflows. The level of user programming may become very advanced as in, “Create a new channel with these features….” SD-based infrastructure enables machine automation of workflows not possible with “just a collection of IT hardware.” Of course, SD methods rely on commodity IT equipment but importantly the software-based resource control features give this model its benefits.

How does the SDMI differ from a generic SD-based system? Broadcast, venue events, and post media workflows require special tunings of the infrastructure that non-optimized systems may not support. Here are a few:

  • Real-time media transport, point-to-multipoint streaming

  • Lossless transport, very low end-to-end latency

  • Precise time and AV sync support based on IEEE 1588v2 Precision Time Protocol

  • Other media transport and processing features

Figure 8.8 shows an SDMI integrated with a traditional media system and any required non-IT media devices (e.g., cameras, mics, etc.). For this case the SDMI is assumed to be located in a private data center (e.g., a broadcast operations facility). The scale of the SDMI compared to the traditional system will vary depending on many factors. Some systems will be weighted towards the traditional and others towards SDMI. Either way, traditional AV, SDMI (on-premises/private), and public cloud hybrid configurations will work together in harmony.

According to David Floyer, co-founder and CTO of Wikibon, "Software-led infrastructure is a game-changer for businesses and organizations, on the same scale as the Internet was in 1995." Time will tell if this is hyperbole or an accurate prediction.

Importantly, SDMI techniques can leverage the public cloud; it is inherently software-defined. Operators may have less control over some performance metrics when going public, but all file-based and many real-time AV workflows can certainly be implemented.

Figure 8.8 A Hybrid SDMI and Traditional Media System

The Public Cloud S-Curve

The next S-curve in Figure 8.7 is labeled "Full public cloud, SaaS, XaaS." This is a public-cloud-centric approach to executing media workflows. A few of the notable players are Google Cloud Platform, Amazon Web Services (AWS), and Microsoft Azure. All services are based on mature SD principles at web scale, and services are typically sold by the hour.

Software-as-a-Service (SaaS, apps running in browsers) is already becoming an important aspect of many media workflows. XaaS represents "anything-as-a-Service" and implies the wide range of services from public cloud providers and their partners. Examples include Desktop-as-a-Service (DaaS), Disaster Recovery-as-a-Service (DRaaS), IDentity-as-a-Service (IDaaS), Infrastructure-as-a-Service (IaaS), Video Encoding-as-a-Service (VEaaS), ML-as-a-Service (MLaaS), and many others. These services will find use when building some media workflows. At the API level, SMPTE is supporting "Microservices for Media," an ongoing effort to define common services for the media enterprise.

A few media vendors are building cloud-based workflow products and offer them as solutions-for-hire in some fashion. One example of this is the CLEAR software suite for broadcasters from Prime Focus Technologies [Ref 17]. This is a good example of a cloud-native media product that can work either alone or in a hybrid environment.

Expect cloud-based media solutions to mature and follow the cloud’s growth patterns. It is inevitable that complete broadcast operations will be based on public clouds. The worries of security and reliability are already fading as the major cloud providers continue to show that they offer world-class security and high-availability services.

Other usage concerns are (1) access connection reliability and bandwidth and (2) limited real-time performance specs. For the first concern, access rates are ever increasing, and most cloud providers offer private 10 Gb/s (and greater) connectivity directly to their cloud, bypassing the Internet. For the second concern, there are methods to reliably transport and process real-time video despite some uncertainty about compute performance metrics.

The public cloud is only about 15 years old, measured from Amazon's first IaaS service offerings in 2006. Cloud adoption is growing exponentially, with Amazon's AWS reporting an average CAGR (revenue) of ~37% over the past four years. This is quite large and not representative of all cloud service suppliers. Respected research firm IDC stated in January 2016 [Ref 18]:

We expect worldwide spending on public cloud services will grow at a 19.4% CAGR – almost six times the rate of overall IT spending growth – from nearly $70 billion in 2015 to more than $141 billion in 2019.

From a different perspective, Statista reports that spending on public cloud IaaS hardware and software was $25B in 2015, with forecast growth to $161B in 2023. This is a forecast CAGR of ~26% over the eight-year period [Ref 19].

The public cloud is certainly on the exponential growth part of its S-curve. However, many designers and systems integrators of media workflows are conservative and taking a wait-and-see approach. On balance, the overall benefits the cloud offers are too good to pass up despite the concerns some have. Cloud trust, security, features, reliability, performance, and ROI are all improving. Will all media operations be public cloud based in 2036? Highly unlikely. There will always be at least some use cases for local, on premise/truck, workflows. However, clever designers will be looking for reasons to use the cloud rather than reasons not to.

Contemporaneous S-Curves

Figure 8.7 shows four S-curves having influence during the same time period, centered around 2016. Media workflow infrastructures are being built in 2020 based on one or more of these. In time, "Digital AV" and even "COTS/IP" will become less important as "SDMI" and "Full public cloud" take more market share. What is the next S-curve? Quantum entanglement for media transport? Quantum compute clouds? Time will tell.

Final Words

These predictions are a "best guess," using historical trends with nonaggressive extrapolation coupled with the thoughtful predictions of other researchers. No trend lasts forever, so using S-curves provides a method to replace a dying trend with a likely successor. This approach has been successful in understanding the progress of media systems methodologies over time (Figure 8.7). Of course, some of the conclusions in this chapter will fall short while others may surprise us and go beyond the extrapolated value. Gordon Moore in 1965 never imagined that his eponymous Law would be so long-lasting. It's not dead yet.

Albert Allen Bartlett, Professor of Physics (1923–2013) said, "The greatest shortcoming of the human race is our inability to understand the exponential function." This seems an exaggeration but there is truth to his words because humans have a difficult time thinking non-linearly. If we assume that storage, compute, and networking growth metrics are all following a corresponding steep exponential then the future of the media enterprise over the next 15 years will look considerably different from that of today.

Will the conclusions presented here help you make better planning decisions? As an end-user, the discussion should improve your ability to see a little farther down the road. If you are a vendor of media solutions, start thinking about "designing for the cloud." Many hardware devices need replacement in < 5 years, so there will likely be 3–4 generations of on-premises equipment over 15 years; time enough for big changes. It is also true that the farther out one looks, the less relevant the predictions may be. Nonetheless, it is insightful to consider the future of the media enterprise, taken with the proverbial pinch of salt.

Source Note: The materials for this chapter are based on a talk given by the author at the SMPTE Annual Technical Conference, October 2016.

The Future Is Real

Stan Moote, CTO – IABM

Back in 2008 I was encouraging the broadcast industry to consider cloud activities. This was a huge struggle. Being one who has always embraced technology and new business tactics, I grappled with trying to understand the reluctance to use any sort of cloud activity for broadcast. My gut feel was broadcasters didn’t want to lose control. Control of what, I thought. Was it simple, plain control over every aspect of their day-to-day activities? After all, broadcasting is a unique animal, not only filled with niche, specialized equipment, but also with special, distinctive workflows. On top of these special workflows, it seemed there were always multiple exceptions too. For example, an extra pre-roll for the evening news – but only for a specific time zone; a local cut-in perhaps only once or twice a year; or even something as simple as a customized voice over to meet some regional legal requirement.

These "specials" created a whole mixture of homegrown and custom "products," often used in only a single facility. Just one, two, or perhaps three technical types would create a company to support a facility filled with "specials." Chief Engineers were free to do practically anything they desired to keep operations happy – often with little discussion of "Why are we really doing this?" or "Are there other options?" It wasn't the Chief Engineer's place to question, only to dream up ways to serve operations. Sometimes this even involved improving upon the concept with yet more complexities, creating the need for historical knowledge to keep the plant running.

In came the concept of central-casting. After all, why should each station have its own master control with operators? Now that was more than terrifying – how could that possibly work under so many special conditions? Quickly the last-mile connectivity became the "excuse" not to central-cast. This was because the larger playout centers could get the programs to the downtown core switching facility, but no further without bearing horrendous costs to make those last-mile connections. The second excuse was localized ads and bumpers. How could the station create local material and use it for spot playouts? Local news/sports/weather was easy to handle, as it would typically be a full hour, so the central-casting systems simply were not used during this time. The ad salespeople screamed – "How can I sell a last-minute ad, get it to the central-casting facility and change the rundown in minutes?"

Basically, central-casting was shelved in the late 1990s, yet a few brave souls practically (and secretly) figured it out. It wasn't so much about having local insertion; it was more about understanding how to get content, and control of the content, back to the central-casting operation in a timely manner. The point was to maintain control, and most operational people could not accept that this was possible.

Almost invisibly, central-casting started to creep in, mainly due to the cost savings that it could provide. Along with these directives to save operating costs and improve the bottom line, local productions started to disappear. Not only was it cheaper to “buy-in” programming, the change in technology to high definition meant new capital equipment retrofits to shoot and produce local shows. Sure, this was fine for the cash cows of local news and sports, but not for full studio productions.

As I was travelling around promoting the switch to digital, it quickly became apparent to me that pretty much everywhere worldwide, facilities were playing reruns of just about every syndicated program, voice-dubbed and subtitled for the local market. They were clearly suffering from a shortage of programs. I could see them buying in third- and fourth-tier programs from various countries just to fill out their playlists.

I thought, why couldn't all these empty local studios around the world start shooting productions again? Well, the economics of the local markets couldn't support this, yet everyone wanted local programming – they just wouldn't pay enough for it through local advertising. The productions needed a broader audience. It was also doubtful that viewers in similar local markets would be interested in watching programs from other markets, so it was clear that to be successful, these needed to be sold throughout the world. With their skinny margins, third-party distribution consolidators could not be used. They needed to sell direct, similar to how public and state broadcasters distribute their programs.

I was still tackling why broadcasters were so strongly resisting cloud technology. Each one seemed to have different concerns; however, it all came down to fear. Not fear of the unknown; it was fear of something they couldn't touch, feel, or for that matter put their hands around: fear of the intangible. That's right, an apprehension about putting all their eggs in one basket, having no idea where that basket was actually situated and, to top it off, having it run by a bunch of IT nerds with no idea of the concepts of the broadcast business. Truth be known, these IT nerds understand five 9s of reliability, probably in much more detail than broadcasters do. They simply have a different definition of downtime and maintenance schedules because they don't work in a "live" environment. For movies and very short videos, playout from the cloud was very marginal due to the lack of Internet speed, which in turn led to buffering. This was accepted as the norm, however certainly not for television. Broadcasters were blind to the fact that if they started working in the cloud, they would be riding the technology curve: it would become adequate for many of their needs, and they would be ahead of the competition. But how to convince them?

Taking the fact that the world needed more programs and the fear about cloud, I came up with a concept called Cloud Broadcasting in 2010. Simply put, any local program could be put up into the cloud as a file and sold to as many world markets as possible. Files didn’t care about live or buffering concerns when being transferred. A cloud service simply needed to be reliable, cater worldwide, and be cost effective.

Rights management is always an issue; however, given that the programs were original local productions, rights issues were a no-brainer. So, at this point in time, focus on selling programs as files; live streaming would come later. Simply use the cloud as a tool to house and sell programs worldwide. Why not? This has been working well for e-Books, and audio books too!

Corporate LANs were no longer special, just the norm and typically running as private clouds to house storage of data files. Similar to central-casting, cloud and data centers started to creep into our industry. The mere fact of people seeing the term Cloud and Broadcasting together meant it had to be possible, without the risk of fear. This is one of the cool bits about many people in our industry: tying a few key words together unquestionably conjures up different ideas from people working in different areas of the workflow chain. Operations thought – perhaps cloud is practical for me. Chief Engineers started to tinker with other concepts like playout and IP backhauls. Moore’s Law was on everyone’s side. Yes, some early adopters had issues, but they learned and moved on beyond file based to actual live streams – and today even with HDR 4K!

So how does this relate to workflows in the future? The assumption is the younger generation has this all figured out, so I started to question various people square in the middle of the production process to get some insight into what they thought the future would hold. This was a real eye-opening experience for me, being the kind of person who embraces technology and figures out how to take advantage of it for the good of the industry. More often than not this "younger generation" could not see beyond their current workflows and job. Let me give you an example. I asked a 20-something color corrector how he saw the future of his job and work environment. He was adamant nothing would change. He would come in to work, have a bunch of files, and do his job: same equipment, same workflow, same outputs. I explained to him how color correction had changed over the decades, going from expensive suites to a single computer. He was not even aware of these changes. He never saw changes. He is trapped in what I call a technology warp. Think about it. His smartphone gets replaced bi-annually because it is cool. Not because it is faster or provides new services; strictly because it is cooler, looks hot, and perhaps has a better camera with multiple lenses and a second one for selfies. Did his workflow change on the smartphone? For sure it did. But he just accepts this, doesn't even think about the differences – that is his norm.

The more seasoned people whine about new features, reminisce about the past, and appreciate what the future will hold. Think about the future of acquisition. Cameras are everywhere. Cheap 4K cameras proliferate outside of our industry. Just like cloud, they were not mature, had cheap lenses, and were pooh-poohed as being of no use. Take any reality program today: they use dozens of them. Sure, they may not genlock or have proper timecode, but they do work to capture content. And this they do in such a unique, awesome way that workflows and production techniques quickly changed to accommodate the shortfalls of these cameras. Cheap meant economical, inexpensive, and cost-effective, and by no means crap or unreliable, low quality.

As for DSLRs, future-wise I don't see them cutting it anymore. They are clumsy and not ergonomically friendly for video shoots. Sure, for some fixed tripod work they are great, but that's about it. Audio workflows and DSLRs don't mix. Beyond large productions, it is also about being able to do as much as you can at the time of shooting. This may even mean that audio mixers and selectors will be built right into the cameras. I believe the key to the future is to do as much work up front as possible.

Lens sets are constantly getting lighter and smaller, which may appear to contradict my previous statement. This keeps a good balance between the cheap 4K devices and professional cameras that maintain a practical form factor for seasoned camera people to use. Beyond the constant improvement in lenses, stabilization, and sensors, I do see two evolving technologies propagating into cameras: blockchain and machine learning.

Beyond currency, blockchain within our industry is considered as a way to firm up distribution rights, payments, and contracts. The way it does this is with multiple digital ledgers that can’t be corrupted. Why would we want this with acquisition? Simple – everyone is struggling with the reality of fake news. Suppose during a shoot the camera logged into the blockchain ledger system with data such as GPS coordinates, check codes on the essence – perhaps even enough metadata to be used by recognition engines for people, locations, and objects. The essence would not be in the blockchain; however, enough defining information would be there to prove where and when it was shot. When the question comes up about this being real or not, going back to the blockchain ledgers with strict metadata controls will give proof of the pertinent details. The same goes for audio given the march of newer technologies that can create and alter words, sentences, and lips with the intent to mislead the public.
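A minimal sketch of the provenance idea described above: hash the essence, bundle the hash with shoot metadata, and log the record to a ledger. The record layout and the in-memory "ledger" here are hypothetical, and the clip path is a placeholder; a real system would anchor the record in an actual blockchain service.

```python
import hashlib, json, time

def provenance_record(essence_path: str, gps: tuple, camera_id: str) -> dict:
    """Build a tamper-evident record for a clip: the essence stays out of the
    ledger; only its hash and defining metadata go in."""
    sha256 = hashlib.sha256()
    with open(essence_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha256.update(chunk)
    return {
        "essence_sha256": sha256.hexdigest(),
        "gps": gps,
        "camera_id": camera_id,
        "captured_utc": time.time(),
    }

# Hypothetical in-memory "ledger"; a real deployment would submit the record
# to a blockchain service so it cannot later be altered.
ledger = []
record = provenance_record("clip0001.mxf",          # placeholder clip path
                           gps=(45.5017, -73.5673), # example coordinates
                           camera_id="CAM-07")
ledger.append(json.dumps(record, sort_keys=True))
```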

There was a lot of hesitation about SaaS when it comes to production. Companies that rely on remote editors and creatives have been well into the cloud now. The confusing part is there have been stacks and stacks of drives on editors’ desks until we hit the ultimate pivotal change within the whole industry – COVID-19. Suddenly cloud was recognized as not only a completely acceptable alternative to on-site activities, but in many cases as a preferred approach.

The industry quickly learned that in the future, it is a must to have direct uploading of all proxies from the camera into the cloud while shooting, followed by the complete footage – perhaps with some mezzanine compression – into the cloud, allowing instant access for all parts of the workflow. As for raw sensor data, it will be some time before this is practical to use in all but higher-end productions. Again, with the appropriate smarts within the camera, raw sensor data won't be necessary for many parts of the workflow.

Machine learning fits in two completely separate paths within acquisition. The first is strictly to assist with shooting. We can program a camera to make preset zooms, etc., but it will become far more intelligent. Think of this: the camera learns from the talent's facial expressions and vocal subtext when to zoom in or out, soften focus, and for that matter pan. It can even be taught that, for that specific talent, there are rules set out by the producer – like always keeping sharp focus on a facial detail such as a dark mole or chin dimples. Through learning, the camera would be able to look after depth of field automatically. Some would think this takes away from the creative gift camera people have. I maintain this frees them up to be creative in ways they have never considered before.

Additionally, we already are seeing how machine learning (ML) and artificial intelligence (AI) are being used to search through archives and help in finding the appropriate clips from both current and past shoots. Why not take this to the next level? Consider if the camera could be taught these search criteria. This could directly help assure that the correct details – scene understanding, emotions, and situation – are shot, saving on reshoots and on searching through endless amounts of footage.

With AI and ML being used in the production process and the availability of seemingly boundless image resolutions, we can have cameras everywhere. We already have microphones everywhere – so why not cameras everywhere too? To understand the future of imagery I always look at the past and present of audio. This goes for video too. Radio had solid-state cart machines; when technology caught up, we had video still stores and clip servers. Sporting venues have microphones everywhere; now we have point-of-view (POV) cameras everywhere. This will go well beyond special events into day-to-day shoots.

This brings us back to the expensive loop discussed previously in this book, "we will fix it in post." I certainly learned that the better thought out a shoot, the more cost-effective, in both dollars and time, it was to complete the production. By having dozens of cameras, the concept of "we will fix it in post" turns into "we will create it in post." This may be true – however I do caution this could turn into people having a "remix" mentality in an attempt to make a profit rather than be creative. Hmmm – the director's cut, the producer's cut, each actor's cut, perhaps even the chief grip's cut with different camera angles? There are so many moving parts, components, and workflows during a full production that it is hard to consider this as being "simplified" – or is it? The Internet of Things (IoT) will come to the rescue. With everything completely connected, the camera can command the lights to change to achieve the proper depth of field while maintaining the appropriate shutter to match the producer's dream. For large studio productions, green screens will be pretty much completely taken over by massive LED screens. LED sets will also provide a good percentage of the set's lighting. This again ties directly into AI/ML techniques being sympathetic to all the requirements for the cameras on the set.

The real question is, will systems be taught through AI and ML to grab content as it is being shot, taking it through the complete production process including important look and feel like color correction/grading to generate the complete scene on the spot? Perhaps even adding in compositing and creating appropriate effects too. No doubt it will happen as this is all supported by cloud stuff too.

Our format wars are no longer 4:3 or 16:9, and in the future not about 1080 vs 4K vs 8K. It will become more a question of how immersive the experience is. Immersive is typically thought of as virtual reality (VR) with goggles; in a theater, it is classically all about sound. Well, VR may be thought of as creating the ultimate immersive experience, but the potential to stimulate people's senses and draw them deep into the media production will become a key differentiator from production to production. Immersion may be as simple as sitting close to a 70"-plus television. This leads us again into multiple production formats. Can you letterbox immersion? Of course not; however, you can take an immersive production that is meant for sitting close to your large-screen TV and turn it into a non-immersive experience. This conjures up the concept of having, rather than safe-action-area markings in the viewfinder, a non-immersive safe-area marking. It becomes a concern about pan and scan too. With sound down-mixing? Or not? This is exactly where AI and EDIDs fit into our future world.

EDID (Extended Display Identification Data) was developed in the mid-1990s to let the graphics card know the details about the connected monitor, such as pixel details, sync levels, pedestal, gamma, scan rates, etc. In the web world, servers know your browser details, and the web designer works on the assumption that they are in control of every pixel you see. In our world, just as a streaming engine knows the type of mobile or handheld device it is streaming to, AI could take this to the next level, knowing the viewing/sound devices and even the viewer's likes, dislikes, and preferences too!

I dare say some of this is happening at some level today. It will become more and more prevalent in the near future. This is all about us learning how to take advantage of the new tools available, to learn, and create new skills while tying this back to our roots of producing and distributing great content.

Getting back to cloud workflows and direct uploads from the camera, well-tuned business and production workflows can start to use content the second it is shot. Social media and web posts will build excitement about a project even during the shoot under strict controls of the content owners with assistance from blockchain. Near live interviews can be shot, approved, and posted as just part of the standard workflow process. Being in the cloud, this eliminates the geographical and country boundaries allowing the workflow and posts to be tailored towards the local market in a flash.

This may all sound simplistic; the stumbling block won't be connection bandwidth, it will be security. We will suffer a wave of cyber panics, so simply assume this and prepare for it. Don't assume only hackers are looking to steal your content. They will always find the weakest link to meet their own personal goals. The secret will be to change the mind-set so that, in the future, security is everyone's responsibility. We will lose some speed and flexibility in the short term. In the long term it will become part of our daily routine and we will not even notice. The question isn't: Can you prepare your organization for this? The question is: When can you prepare your organization and workflows for this eventuality? Cyber-security has to become second nature for all involved in the process. Organizations need to teach AI machines to look for the huge blips we typically have in our network behaviors that are more event-driven (like breaking news) rather than business-driven (like quarter-end activities).

Within telco environments, self-healing networks have been the norm. The same goes for many public cloud operations. Until COVID-19 hit, disaster recovery (DR) was all about second sites for playout, storage, etc. – lessening the operational and profit hit due to earthquakes, hurricanes, and major network/power outages. Future-wise, less effort will go into having redundant chains and more into using AI/ML to sense faults and self-heal over dispersed geographies – all without human touch. Low-Earth orbit (LEO) satellites will carry the control data required for switch and self-heal operations without the need for reliable terrestrial network connections.

At the beginning of this section I was droning on about the hundreds of customized workflows within our industry. The trend was to attempt to limit customization as it is expensive. I call this an attempt because IABM research is showing a growth in BIY (build-it-yourself) solutions by end users. We will see fewer and fewer unique operational workflows, and the balance between linear and on-demand will continue to settle down. The trend will be to keep adding unique natures and options into the creative process. This is exactly how tools like IMF will take on a stronger role in the future, with various enhancements thanks to AI. There is a constant drive to harmonize shooting and production to deliver content in different formats (HD-UHD, SDR-HDR, WCG, object-based audio, etc.) and via different delivery methods. These cost-saving drivers will be the enabler for customizable, multi-dimensional "SMART-Creative" productions. Just as color correctors changed post, expect new tools to make a revolutionary change to the business.

As our world becomes more connected, keeping track of which time zones your co-workers are in will be more and more critical for productivity. Technology can't put us all in the same time zone; however, it certainly can assist with it. Let's accept the inevitable: keep our minds, scheduling apps, and timecode too set to UTC (Coordinated Universal Time), just like the Internet and air traffic controllers do. If an on-location shoot is out by 16 hours, prepare your personal environment, sleep, and meals to line up with the talent's time zone. Keep in mind two points: our bodies need "solar energy" too, so schedule this accordingly, and select a single UTC time to jam-sync all your timecode operations every day (include leap-second changes at the same time – currently about every 18 months).
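As a small illustration of the UTC habit suggested above, using Python's standard datetime module; the UTC+8 shoot location is a hypothetical example standing in for the 16-hour gap mentioned in the text.

```python
from datetime import datetime, timedelta, timezone

# Keep schedules and timecode anchored to UTC
now_utc = datetime.now(timezone.utc)
print("UTC now:         ", now_utc.isoformat(timespec="seconds"))

# A hypothetical shoot location at UTC+8 while you sit at UTC-8: a 16-hour gap,
# yet both clocks describe the same UTC instant.
shoot_tz = timezone(timedelta(hours=8), "shoot-location")
print("Shoot local time:", now_utc.astimezone(shoot_tz).isoformat(timespec="seconds"))
```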

Will we as an industry become more risk averse? Absolutely! Learning how to quickly parse out fads versus successful ventures; accepting and spinning up new workflows/business models due to pandemic, artistic, social, and economic adaptations; and understanding which are the right technologies to match your goals and aspirations – all of this needs to be assisted by a culture transformation involving everyone in the industry. We can't simply be given new tools to use and accept this as the norm; everyone needs to jump in to generate new ideas, strategies, and creative techniques, just like the industry pioneers have done, as unmistakably explained in this book!

Notes

  1. The two examples in red text are transitions in progress.

  2. Not discussed here is the "mechanical age" of television, beginning in earnest in 1924–1926 with J.L. Baird's and C.F. Jenkins's public demonstrations of capture, transmission, and display of moving images.

  3. While Watson did not officially pass the Turing Test, it showed remarkable progress towards doing so.

  4. Using Moore's Law to evaluate Ethernet rate growth should be applied loosely since transistor count is not the metric but rather speed.

  5. Other rates are supported in ST 259, including 360, 143, and 177 Mbit/s, but 270 is the most common in use.

  6. Some of the material in this section is based on the article "Software-defined Media Infrastructures," Al Kovalick, SMPTE Journal, Centennial Issue 6, August 2016.

References

  1. Altera's 30 Billion Transistor FPGA, http://www.gazettabyte.com/home/2015/6/28/alteras-30-billion-transistor-fpga.html.

  2. Universal Limits on Computation, L. M. Krauss & G. D. Starkman, May 10, 2004, https://arxiv.org/pdf/astro-ph/0404510v2.pdf.

  3. After Moore's Law, The Economist, March 12, 2016, http://www.economist.com/technology-quarterly/2016-03-12/after-moores-law.

  4. Blame It On The CD, Jacob's Media Strategies, March 9, 2016, http://jacobsmedia.com/blame-it-on-the-cd/.

  5. Technology Life Cycle, Wikipedia, https://en.wikipedia.org/wiki/Technology_life_cycle.

  6. HDD Areal Density Reaches 1 terabit/sq-in, Computer History Museum, http://www.computerhistory.org/storageengine/hdd-areal-density-reaches-1-terabit-sq-in.

  7. The Digital Universe of Opportunities, IDC, April 2014, http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm.

  8. Intel Chief Raises Doubt over Moore's Law, Financial Times, July 15, 2015, https://www.ft.com/content/36b722bc-2b49-11e5-8613-e7aedbb7bdb7.

  9. Future Progress in Artificial Intelligence: A Survey of Expert Opinion, Vincent C. Müller & Nick Bostrom, Oxford University, http://www.nickbostrom.com/papers/survey.pdf.

  10. Computing Machinery and Intelligence, A. M. Turing, Mind, Vol. 59, No. 236 (October 1950), pp. 433–460.

  11. Computer Wins on 'Jeopardy!': Trivial, It's Not, J. Markoff, February 16, 2011, New York Times.

  12. Design Elements for Core IP Media Infrastructure, Al Kovalick, SMPTE Motion Imaging Journal, Vol. 125, No. 2 (2016), pp. 16–23.

  13. Ethernet Roadmap 2015, Ethernet Alliance, http://www.ethernetalliance.org/wp-content/uploads/2015/03/Front-of-Map-04-28-15.jpg.

  14. Journey of 9's – High Availability for IP Based Production Systems, Pradeep Kathail, Charles Meyer, SMPTE Annual Technical Conference and Exhibition, SMPTE 2015, DOI: 10.5594/M001634.

  15. Software-Defined Networking, http://en.wikipedia.org/wiki/Software-defined_networking.

  16. Video Systems in an IT Environment (2nd Ed., 2009), Al Kovalick, Focal Press.

  17. http://www.primefocustechnologies.com/sites/default/files/files-uploded/PFT_Whitepaper_TVTechnology.pdf.

  18. Worldwide Public Cloud Services Spending Forecast to Double by 2019, According to IDC, https://www.idc.com/getdoc.jsp?containerId=prUS40960516.

  19. Public Cloud Infrastructure as a Service (IaaS) Hardware and Software Spending, Statista, 2016, https://www.statista.com/statistics/507952/worldwide-public-cloud-infrastructure-hardware-and-software-spending-by-segment/.
