Transforming images into matrix or tensor objects of various libraries

In most cases, images are stored in computer memory in an interleaved format: all the values of one pixel are stored together, and the pixels follow one another in linear order, row by row. Each pixel consists of several numbers that together represent a color. For example, in the RGB format, each pixel is three values placed side by side. So, in memory, a 4 x 4 image has the following layout:

rgb rgb rgb rgb
rgb rgb rgb rgb
rgb rgb rgb rgb
rgb rgb rgb rgb
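
To make this layout concrete, here is a minimal sketch that computes where a single channel value sits in an interleaved buffer. It assumes row-major order and three channels per pixel; the function name interleaved_offset and the tagged tuples used as sample data are illustrative only and do not come from any particular library.

```python
def interleaved_offset(row, col, channel, width, channels=3):
    """Return the index of one channel value in a row-major interleaved buffer."""
    return (row * width + col) * channels + channel


width, height = 4, 4

# Build a hypothetical 4 x 4 RGB image. Each entry is tagged with its channel
# name and pixel coordinates so that a lookup is easy to check by eye.
pixels = []
for row in range(height):
    for col in range(width):
        pixels += [("r", row, col), ("g", row, col), ("b", row, col)]

# The green value of the pixel in row 2, column 1.
offset = interleaved_offset(2, 1, 1, width)
print(offset, pixels[offset])
```

The three values of a pixel always occupy neighboring indices, which is why this layout is convenient for per-pixel operations.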

For image processing libraries, this layout is not a problem, but many machine learning algorithms require a different ordering. For example, neural networks commonly expect each image channel to be stored as a separate plane, with the planes following one another. The following example shows how such a layout is usually placed in memory:

r r r r   g g g g   b b b b
r r r r   g g g g   b b b b
r r r r   g g g g   b b b b
r r r r   g g g g   b b b b

So, we often need to deinterleave an image before passing it to a machine learning algorithm.
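
Deinterleaving can be sketched in a few lines. The example below assumes a flat buffer of three-channel pixels and uses plain Python lists for clarity; the deinterleave function is a hypothetical helper, and in practice we would normally rely on the equivalent routine of the image or tensor library in use.

```python
def deinterleave(buffer, channels=3):
    """Convert interleaved data (rgb rgb ...) into planar data (rr... gg... bb...)."""
    planar = []
    for c in range(channels):
        # Every `channels`-th value, starting at offset c, belongs to channel c.
        planar.extend(buffer[c::channels])
    return planar


# A hypothetical 2 x 2 RGB image: red values are 10..13, green 20..23, blue 30..33.
interleaved = [10, 20, 30, 11, 21, 31, 12, 22, 32, 13, 23, 33]

planar = deinterleave(interleaved)
print(planar)  # [10, 11, 12, 13, 20, 21, 22, 23, 30, 31, 32, 33]
```

After the conversion, each channel occupies one contiguous block of width x height values, so channel c starts at index c * width * height.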

Moreover, we usually need to convert the data type of the color values too. For example, OpenCV library users often use floating-point formats, which preserve more color information during image transformations and processing routines. In the opposite case, an image stores each color channel in an 8-bit type (256 possible values), and we then need to convert it to a floating-point type. So, in many cases, we need to convert the underlying data type to one more suitable for our needs.
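
The following sketch illustrates such a conversion for a single channel. Scaling 8-bit values into the range [0, 1] is an assumption made here because it is a common convention; the text above does not prescribe a particular range, and the helper names to_float and to_uint8 are hypothetical.

```python
def to_float(values, scale=255.0):
    """Map 8-bit channel values (0..255) to floats in [0, 1] (assumed convention)."""
    return [v / scale for v in values]


def to_uint8(values, scale=255.0):
    """Map floats back to 8-bit values, rounding and clamping to 0..255."""
    return [min(255, max(0, round(v * scale))) for v in values]


channel = bytes([0, 64, 128, 255])  # four hypothetical 8-bit samples

as_float = to_float(channel)
restored = to_uint8(as_float)

print(as_float)
print(restored)
```

Clamping matters on the way back because intermediate floating-point processing can push values slightly outside the representable 8-bit range.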
