The Imaging Chain and the Internet

Consider what taking a simple photograph meant 25 years ago. You had film cameras, negatives, darkrooms, emulsions, and paper. It would take days to produce a picture you could show to someone. Now consider what taking a photograph means today. With a smartphone that fits in a pocket, we can capture a photo, edit it, view it, and show it to anyone, all in as little as a few seconds. These fantastic devices are simultaneously capable of capturing high-resolution visuals, displaying them, and immediately distributing them to a billion other devices capable of the exact same things. It is now actually conceivable that a photo you take one moment could be seen by over half of the entire human population a moment later. This power in visual expression has never existed before.

We can create and express visual media so incredibly fast that new behavioral patterns are emerging. Images and videos have come to inform our every purchasing decision, our selection of potential mates, and even our day-to-day communications with each other. They are used to meaningfully represent thought, emotion, intention, status, and presence. Visual media has made every interaction on the Internet faster, especially as the Internet has expanded further and further beyond the desktop. Consequently, nearly every mobile application today is some combination of scrolling up, scrolling down, swiping left, swiping right, tapping, or snapping image or video content.

These emergent behaviors have created an explosion in the amount of visual information traversing the Internet. Image and video data constitute the vast majority of all bandwidth used. Device makers, browser vendors, and network engineers have had to rapidly innovate to keep pace with this influx of content and the new markets that this content is creating. Higher-resolution cameras, higher-resolution displays, new ultra-compressible file formats, and new standards for delivering visual media are being introduced every week. Making all of these technologies fit together is causing rapid iteration in a technology stack known as the “imaging chain.”

The imaging chain is the sum total of all processes that factor into capturing a subject visually and communicating it back to the human visual cortex. It is a metric ton of intersecting sciences that we as consumers take incredible advantage of without ever even noticing. The relatively recent transition to digital imaging has had the consequence of abstracting most of the imaging science away from us. The average consumer does not need to consider the optics of their camera, the responsiveness of its capturing sensors, the nuances of the file format the data is stored in, the consequences of the compression algorithms used, the performance of the uplink, or the resolutions of the audience’s screens when posting a photo to Instagram.

As technology advances, new uses for visual media are discovered and the imaging chain is forced to adapt to keep pace. A good example is the rise of cellphone photos such as “selfies” and “snaps”. Given the quick and disposable nature of cellphone photos, few people invest much sophistication in preparing their shots. This often leads to poor composition, tilted horizon lines, and blurry focus. To help stabilize and orient these kinds of images, accelerometers and gyroscopes are added to smartphone cameras. Signals from these sensors are fed through algorithms to help make sense of the captured image data (a concept known as computational photography). This has the combined effect of reducing motion blur, correcting horizon lines, and fixing other deficiencies that occur when “shooting from the hip.”
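
As a rough illustration of how a sensor signal can correct a horizon line, here is a minimal Python sketch. It is not taken from any particular phone's pipeline. The function names are hypothetical, and it assumes the phone is held roughly still (so the accelerometer reads only gravity) and that an upright portrait phone reports gravity along its +y axis.

```python
import math


def roll_from_gravity(ax: float, ay: float) -> float:
    """Estimate how far the camera is tilted, in degrees.

    Assumption: the device is nearly still, so (ax, ay) is the gravity
    vector projected onto the screen plane, and an upright portrait
    phone reads (0, +g). Real devices may use a different sign convention.
    """
    return math.degrees(math.atan2(ax, ay))


def horizon_correction(ax: float, ay: float) -> float:
    """Angle, in degrees, to rotate the image so the horizon is level."""
    return -roll_from_gravity(ax, ay)


if __name__ == "__main__":
    # Hypothetical reading from a phone tilted a few degrees off upright.
    tilt = roll_from_gravity(0.85, 9.77)
    print(f"estimated tilt: {tilt:.1f} degrees")
    print(f"rotate image by: {horizon_correction(0.85, 9.77):.1f} degrees")
```

A production pipeline would also fuse gyroscope data and filter out hand shake, but the core idea is the same: a new sensor signal tells the software which way is down.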

The unanticipated consequence of adding accelerometers and gyroscopes to smartphones is that they emit new signals to the imaging chain that have sparked new possibilities. These new signals enable developers to orient visual media to the device, fixing the aforementioned problems. More interestingly though, these signals also enable the reciprocal capability of orienting the device within the visual media. By strapping one of these devices to your face, you can feed your own viewing perspective into the imaging chain. This is how “virtual reality” has developed in this technology cycle. The subject matter may be a digital projection of a fictional world, but the stack driving it is no different than the one used to quickly orient your selfies. Virtual reality is a new output of the modern imaging chain that only became possible with new signals for position and orientation.

Whenever a new signal is introduced, it amplifies the possibilities of the entire imaging chain. GIFs, Vines, Snapchats, Periscopes, Cinemagrams, Twitch streams, 3D video, virtual reality, and augmented reality are all different exercises of the same fundamental imaging chain, with a variety of new signals playing a role in how they came into existence. There are going to be more signals introduced as our devices increase in capability and our uses for visual media expand. The reality is that the Internet will increasingly operate in service to visual media until the imaging chain and the Internet become indistinguishable. Developing technologies that understand and respond to signals feeding into the imaging chain will be the greatest area of exploration and value creation in Internet technology for the next decade.

 