So much for our own account of these things. But in a more fitting place we shall attempt to show by quotations from the ancients, what others have said.
—Eusebius
We postulate volume projection metrics as CAM neurons, a biologically plausible model for clusters of low-level neurons describing an object region, which take input from image regions assembled in the LGN. CAM neurons produce low-level features to feed into the visual cortex V1–Vn regions, following the Hubel and Wiesel model. A CAM neuron is modeled as a three-input, one-output neuron, which composes an output address from its three inputs. The output address is referred to as a CAM feature, which enables volume projection metrics. The CAM neuron encodes the pixel inputs provided to the neuron, and the output is a concatenation of all three inputs into an address (see Figure 6.1). The CAM address is an entry in a content-addressable memory (CAM). The CAM address is the feature. The number of CAM features found per genome is the metric.
The CAM address feature can be understood and visualized as a volumetric projection metric, where the CAM address is decomposed into (x,y,z) axis values used to record feature presence in a volume, as shown in the upper right of Figure 6.1. The CAM address is a volumetric projection metric.
Each input to the CAM neuron represents a low-level magno or parvo feature from triples, or a 3x1 matrix of adjacent pixels from oriented lines within LGN images at 0, 45, 90, and 135 degree orientations, and is assembled into a CAM address as shown in Figure 6.2. For some input spaces, the 3x1 triple is assembled from oriented lines, and for other input spaces, the 3x1 is assembled from Z columns from a single RGB pixel, as explained in the next section and summarized in Table 6.1 in the “CAM Feature Spaces” section.
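The four oriented triples can be sketched as follows. This is our own illustration, not the VGM implementation: the (row, col) indexing convention and the function name `oriented_triples` are assumptions.

```python
# Hypothetical sketch: extract the four oriented 3x1 triples (A=0, B=90,
# C=135, D=45 degrees) passing through the center of a 3x3 neighborhood.
# Indexing is (row, col); this convention is assumed, not the book's code.

def oriented_triples(n):
    """n is a 3x3 list-of-lists of pixel values; returns the four
    [p-1, p, p+1] triples through the center pixel n[1][1]."""
    return {
        "A_0":   [n[1][0], n[1][1], n[1][2]],   # horizontal line, 0 degrees
        "B_90":  [n[0][1], n[1][1], n[2][1]],   # vertical line, 90 degrees
        "C_135": [n[0][0], n[1][1], n[2][2]],   # top-left to bottom-right
        "D_45":  [n[2][0], n[1][1], n[0][2]],   # bottom-left to top-right
    }
```

Each triple then becomes the three inputs of one CAM neuron for that orientation.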
As shown in Figure 6.2, the 8-bit quantized volume projection space is contained in four 16M feature segments (2^24 addresses), one for each orientation A, B, C, and D. Quantization spaces of 8, 5, 4, 3, and 2 bits are supported (see Figure 6.6 for volume renderings).
Why use 3x1 features instead of 3x3 or some other shape? Here we provide some discussion, trade-offs, and future plans.
As shown in Figure 6.2, the CAM feature memory address is created by extracting four oriented 3x1 lines into four separate features oriented at 0, 45, 90, and 135 degrees; the pixel values comprise the CAM address feature. The idea is to represent each visual pixel impression as a memory feature and preserve all visual information. The set of 3-byte concatenated memory address representations shown in Figure 6.2 is a simple way to preserve all the information in the 3x3 regions. Why not 3x3? A 9-byte (72-bit) CAM address using all nine pixels from a 3x3 pixel region was originally considered, and seems like a good idea, but 72-bit addressing is impractical for desktop computers as shown in Figure 6.3. While it is possible to reduce the pixel resolution to 5 bits and still retain good accuracy, the resulting 3x3 memory space of 2^45 addresses is about 35 terabytes (35,184,372,088,832), which is still too large for common computers today. Even a 4-bit quantization (2^36) yields a 64GB address space (68,719,476,736 addresses), and the resolution would be too low for many metrics. Intel provides a 128-bit set of math operations, but even so the memory address space is 32GB for current XEON processors (which apparently use only 34 address lines).
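The address-space arithmetic above can be checked directly; this is only a back-of-envelope sketch of the sizes discussed:

```python
# Address-space sizes for the candidate CAM address layouts.
addresses_3x3_5bit = 2 ** (9 * 5)   # nine 5-bit pixels -> 45-bit address
addresses_3x3_4bit = 2 ** (9 * 4)   # nine 4-bit pixels -> 36-bit address
addresses_3x1_8bit = 2 ** (3 * 8)   # three 8-bit pixels -> 24-bit address

# 2^45 = 35,184,372,088,832 (~35 TB), 2^36 = 68,719,476,736 (64GB),
# 2^24 = 16,777,216 (the 16M feature segments of Figure 6.2)
```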
The current VGM implementation uses a trade-off to segment the address space into four oriented 3x1 regions, as shown in Figure 6.2. Most desktop computers using 32-bit and 64-bit memory addressing with commercial operating systems support at least 2GB of address space per process, so 24-bit addresses are fine (8 bits x 3). NOTE: For practical reasons, desktop computers and operating systems do not use all 64 bits of the CPU address lines to map against a contiguous 64-bit addressed memory space.
However, in the future when computers provide much larger address spaces, it is desirable to add a larger CAM neuron type into the system, capable of 72-bit addressing using a 3x3 kernel to compute the address. Or eventually perhaps even a 5x5 kernel region to produce a 25-byte address with 200 bits.
CAM addresses describe a primitive combined color, texture, and shape metric in an oriented quantization space, implemented as variable bit precision values. The quantization space provides a form of blur-sharp and scale invariance analogous to an image pyramid as used with SIFT, ORB, and other feature descriptors [1]. Thus, the content of the feature forms the CAM address.
Note that there are several types of input spaces taken by CAM neurons; to describe it differently, there are several types of CAM neurons in the VGM metrics spaces as shown in Figure 6.4:
Table 6.1: The CAM Neuron Input Spaces

CAM Neuron Input Spaces | # Input Features for Each Space
3x1 matrix of adjacent values [p-1, p, p+1] from oriented lines | Spaces: R, G, B, I (4 orientations ∗ 5 quantizations per space)
  R 3x1 [p-1,p,p+1] [A_0, B_90, C_135, D_45] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
  G 3x1 [p-1,p,p+1] [A_0, B_90, C_135, D_45] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
  B 3x1 [p-1,p,p+1] [A_0, B_90, C_135, D_45] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
  I 3x1 [p-1,p,p+1] [A_0, B_90, C_135, D_45] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
1x3 matrix Z-column from the components of single pixels, no orientation | Spaces: RGB, LBP, MIN, MAX, AVE (5 quantizations per space)
  RGB -> 3x1 [R, G, B] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
  LBP -> 3x1 [R_LBP, G_LBP, B_LBP] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
  RANK-MIN -> 3x1 [R_MIN, G_MIN, B_MIN] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
  RANK-AVE -> 3x1 [R_AVE, G_AVE, B_AVE] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
  RANK-MAX -> 3x1 [R_MAX, G_MAX, B_MAX] [8-bit, 5-bit, 4-bit, 3-bit, 2-bit]
All the CAM addresses in a genome feed into a set of summary CAM neural clusters, which record all CAM features from each input space (Figure 6.4) into a set of 3D histogram volumes, summing the occurrence of each feature in the genome for each input space (see Figure 6.5).
As shown in Figure 6.5, the CAM neurons feed into a CAM neural cluster to sum all the CAM features in the genome—one cluster for each specific metric input space. Since there are 25 CAM input spaces (Figure 6.4), there are 25 corresponding CAM neural clusters per genome, one per each of the five pre-processed images (raw, sharp, retinex, histeq, blur), for a total of 125 CAM cluster neurons per genome. For well-segmented genomes representing homogeneous bounded regions, the CAM neural clusters are regular shapes centered about the axis, usually with very few outliers. In other words, the feature counts are concentrated in a smaller area revealing similar features, rather than spread out in the volume revealing something more like a noise distribution of unlike features.
The magnitude (corresponding to size) of the CAM cluster neuron emulates biologically plausible neural growth [1] each time a CAM neuron feature impression increments the corresponding (x,y,z) cell in the CAM neural cluster. The CAM cluster neuron is a memory device. Each cluster represents related features from an input space. The size of each neuron follows plausible neuroscience findings and is determined by (1) how often the visual impression is observed and (2) the number of neural connections. Thus, CAM cluster neuron size is a function of the frequency with which a visual impression is observed.
As an alternative to the 3x1 pixel mappings to generate the CAM cluster addresses, the VGM supports various other methods as discussed in Table 6.1; for example, RGB volume clustering uses each RGB pixel component to compose an (x,y,z) address by assigning x = R, y = G, z = B, so for each pixel in the genome we increment the neural cluster at that address.
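A minimal sketch of the RGB Z-column accumulation, assuming 8-bit (R, G, B) tuples as input; the function name and the sparse-dictionary volume representation are our own illustration, not the VGM code:

```python
from collections import defaultdict

def accumulate_rgb_volume(pixels):
    """pixels: iterable of (R, G, B) tuples from one genome region.
    Returns a sparse 3D histogram keyed by (x, y, z) = (R, G, B)."""
    volume = defaultdict(int)
    for r, g, b in pixels:
        volume[(r, g, b)] += 1   # one impression per pixel
    return volume
```

A dense 256x256x256 array would serve equally well; the sparse form simply reflects that most cells stay empty for natural images.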
CAM neural clusters can be rendered as a simple volume rendering as shown in Figure 6.6. The volume is the feature; another way to say it is the neural memory is the feature. CAM neural clusters are used for correspondence using various distance functions discussed in this chapter. The number of times a CAM feature is discovered over the entire genome region is recorded or summed in the volume, so the volume projection is a feature metric.
The metric projection concept is often employed in statistics; for example, the support vector machine (SVM) approach of representing metrics in higher-dimensional spaces is commonly used to find a better correspondence (see Vapnik [77][78][79][80], and also [1]). Likewise, we find insights via multivariate volumetric projection metrics. The basic volume projection is based on simple Hubel and Wiesel style edge information over RGBI + LBP color spaces taken within a quantization space, as discussed below, emulating varying levels of detail across the low levels of the magno and parvo pathways.
As shown in Figure 6.6, the volumetric projection metrics contain a range of color, shape, and texture metrics. The false coloring in the renderings represents impression counts (magnitude) across the genome for each CAM feature, so the volume rendering is a 4D representation. The volume renderings in Figure 6.6 are surface renderings in this case, obscuring the volume internals, and use a color map to represent magnitude at each voxel. Other volume rendering styles can be used to view the internals of the volume as shown later in this chapter. Volume metrics are often rendered using the familiar 3D scatter-plot for data visualization (see [81][82][83]). However, we use volumetric projections as a native metric for feature representation and correspondence. Several distance functions have proven to be useful as discussed later in this chapter. Note that the CAM neural clusters are accessed by the visual processing centers V1–Vn of the visual cortex and used for correspondence as texture, shape, and color features.
Quantization space pyramids are used to represent visual data at various levels of detail and can be used to perform first-order comparisons of genome metrics to narrow down the best matches within the genome space by using progressively more bits to increase resolution. We use 8-bit, 5-bit, 4-bit, 3-bit, and 2-bit quantization. The bit-level quantization simulates a form of attentional level of detail, which is biologically plausible [1].
As shown in Figure 6.7, different bit resolutions per color yield different levels of detail, and quantization to 6 or 7 bits seems unnecessary, given that 8-bit quantization results are perceptually close to 5-bit results. Based on testing, we have found that 8-bit and 5-bit quantization yield similar correspondence results, so 5 bits are used for some of the color and volume metrics, while 8-bit color is better suited for many metrics. For color, using 5 bits instead of 8 coalesces similar colors into a common color, which is desirable for some metrics. The quantization input to the CAM neuron shown earlier in Figure 6.1 can be used to shape the memory address by masking each pixel, coalescing similar memory addresses, which focuses and groups similar features together. Even so, the full 8-bit resolution is still preserved in the genome and used when needed for various metrics.
For each image, a strand containing a summary of 2-bit quantizations of CAM clusters for each genome can be created to assist in optimizing correspondence, similar to an image pyramid used by SIFT at various resolutions [1], where SIFT correspondence is measured across the image pyramid to find features even when the image scale changes. In an analogous manner, 2-bit quantized CAM neural cluster features can be created for each genome in 128 bits, which can be evaluated natively in most Intel processor instruction sets today. Therefore, by searching 128-bit strands, it is possible to quickly narrow down candidate target genomes to follow up with higher-resolution correspondence at 8- or 5-bit quantization. Using quantization spaces larger than 2 bits exceeds 128 bits, is more complicated, and is not supported natively in the CPU instruction set. We illustrate the concept with the example below.
Imagine we use a 2-bit (four unique values) resolution CAM volume, with 4x4x4 = 64 cells in the (x,y,z) volume. We reduce the resolution of each cell counter to 2 bits, scaling the input magnitudes using floats and clamping to the range 0..3. Then the total number of unique 2-bit genomes is 4^64 = 2^128, or about 3.4 x 10^38.
*By coincidence, the Intel XEON processor provides 128-bit arithmetic, so a 2-bit quantized genome fits into a single 128-bit register.
So, an address composed of all 64 counters in a 2-bit quantized volume, each a 2-bit counter in base 4 (0,1,2,3), can be represented in a 128-bit value and compared in a 128-bit Intel ALU register.
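A sketch of the packing and comparison, using Python arbitrary-precision integers to stand in for the 128-bit registers; the function names `pack_strand` and `strand_distance` are hypothetical, not the VGM API:

```python
def pack_strand(counters):
    """Pack 64 base-4 cell counters (each clamped to 0..3) into one
    128-bit integer 'strand', 2 bits per counter."""
    assert len(counters) == 64
    strand = 0
    for i, c in enumerate(counters):
        strand |= (min(c, 3) & 0x3) << (2 * i)
    return strand  # fits in 128 bits

def strand_distance(a, b):
    """Count how many of the 64 2-bit counters differ between strands."""
    diff = a ^ b
    return sum(1 for i in range(64) if (diff >> (2 * i)) & 0x3)
```

On real hardware the XOR and population count would run directly on 128-bit SIMD registers; the Python version only illustrates the layout.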
Typically, a 20MP image sequences into perhaps 3,000 unique genome regions, so a strand for each 20MP image would contain a set of 3,000 2-bit quantized genomes, each of 128 bits or 16 bytes, which is supported by the current Intel architecture.
In this section we provide some discussion on the details of volume projection metrics, including the definitions, distance functions used, and memory size requirements.
Each time a CAM feature address is detected in the image, the count for the address is incremented in the CAM neural cluster volume, corresponding to feature commonality. The method for computing the feature addresses and counts is simple and relies on the quantization input value, an 8-bit hexadecimal mask; for 5-bit quantization the mask is 0xF8 (binary 1111 1000). Each pixel value in the address is bit-masked into the desired quantization space to ignore the bottom three bits.
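A sketch of the masking and address composition just described, assuming 8-bit input pixels and the 5-bit mask 0xF8; the helper name `cam_address` and the sparse-dictionary cluster are our own illustration, not the VGM code:

```python
from collections import defaultdict

QUANT_MASK_5BIT = 0xF8  # binary 1111 1000: drop the bottom three bits

def cam_address(p0, p1, p2, mask=QUANT_MASK_5BIT):
    """Concatenate three masked 8-bit pixels into one 24-bit CAM address."""
    return ((p0 & mask) << 16) | ((p1 & mask) << 8) | (p2 & mask)

cluster = defaultdict(int)              # sparse CAM neural cluster volume
for triple in [(0x12, 0x34, 0x56), (0x17, 0x33, 0x51)]:
    cluster[cam_address(*triple)] += 1  # similar triples coalesce
```

Note how the two example triples, which differ only in their low bits, map to the same masked address 0x103050, illustrating how quantization groups similar features together.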
CAM neurons are built for each of the five types of images (raw, sharp, blur, retinex, histeq) and at five quantization levels (8, 5, 4, 3, 2 bits). So, taking the 25 different types of CAM neurons for each image as illustrated previously in Figure 6.4, there are a total of 5 ∗ 25 = 125 CAM neuron types per genome.
Currently, we define a set of 25 distance metrics for CAM features as shown in Table 6.2. Note that some of the metric functions are volume intersection metrics, and others are total volume metrics. The intersection metrics m_i are computed if and only if both volume values are nonzero (i.e., an impression count exists in both volumes at the same (x,y,z) coordinate, so the volumes intersect), while the volume total metrics m_t are computed over the entire volumes regardless of full or empty cells.
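In our notation (not necessarily the notation of Table 6.2), the two metric families over volumes V_1 and V_2 can be sketched as follows, where c ranges over the (x,y,z) cells and f is the particular distance function:

```latex
% Intersection metrics use only cells occupied in both volumes;
% total metrics run over every cell in the quantized volume.
m_i = f\big(\{\, (V_1(c), V_2(c)) : V_1(c) \neq 0 \ \wedge\ V_2(c) \neq 0 \,\}\big)
\qquad
m_t = f\big(\{\, (V_1(c), V_2(c)) : c \in \text{volume} \,\}\big)
```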
Table 6.2: The volume metrics and distance functions
We apply the volumetric metrics primarily to texture and shape features as discussed in Chapters 8 and 9. However, the volumetric projections do contain color information as well, so the volumetric projections combine shape, texture, color, and quantization.
Recall that a genome is a 2D segmented region of pixels, typically containing a few thousand pixels. Each genome is computed for each type of possible input image from the eye/retinal model for color channel and pre-processing, and also over either an orientation space or a volume space. Finally, each of the metric combinations discussed in the preceding section is computed within a quantization space. Here is a laborious illustration of the memory requirements for volumetric CAM features.
PARVO feature spaces
4C_P = 4 color channels: red, green, blue, intensity
5I_P = 5 pre-processed versions of each image: raw, sharp, retinex, histeq, blur
4O_P = 4 genome orientations: 0, 90, 135, 45 degrees
5Q_P = 5 quantization channels: 8, 5, 4, 3, 2 bits
Parvo oriented spaces: 4C_P ∗ 5I_P ∗ 4O_P ∗ 5Q_P = 4000
5Z_P = 5 Z-column spaces, *unoriented: RAW, RANK-MIN, RANK-MAX, RANK-AVE, LBP
Parvo Z-column spaces: 4C_P ∗ 5I_P ∗ 5Q_P ∗ 5Z_P = 500
MAGNO feature spaces
1C_M = 1 color channel: luma (intensity)
5I_M = 5 pre-processed versions of each image: raw, sharp, retinex, histeq, blur
4O_M = 4 genome orientations: 0, 90, 135, 45 degrees
5Q_M = 5 quantization channels: 8, 5, 4, 3, 2 bits
Magno oriented spaces: 1C_M ∗ 5I_M ∗ 4O_M ∗ 5Q_M = 1000
5Z_M = 5 Z-column spaces, *unoriented: RAW, RANK-MIN, RANK-MAX, RANK-AVE, LBP
Magno Z-column spaces: 1C_M ∗ 5I_M ∗ 5Q_M ∗ 5Z_M = 125
Total feature spaces = total CAM neural clusters
4000 + 500 + 1000 + 125 = 5625
*Clusters can be compared 29 ways via distance metrics; see Table 6.2.
Each quantization space determines the amount of memory required to contain the volume metric space, since the (x,y,z) coordinate range is determined by the quantization (see Figure 6.5 earlier in the chapter). Since each of the CAM neural cluster volumes consume a different amount of memory based on the quantization space, the total memory required to store all CAM feature volumes for a given image is worked out here for a 20MP image.
*NOTE: each volume cell is a 4-byte counter, range 0 - 0xffffffff (4,294,967,295)
Parvo individual volumes
4000/5 = 800 oriented spaces
500/5 = 100 Z-column spaces
800 + 100 = 900 spaces at each quantization (8,5,4,3,2)
Magno individual volumes
1000/5 = 200 oriented spaces
125/5 = 25 Z-column spaces
200 + 25 = 225 spaces at each quantization (8,5,4,3,2)
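The per-volume memory arithmetic can be sketched as follows, assuming (2^q)^3 cells of 4 bytes each per q-bit volume (per the note above) and the 900 parvo / 225 magno spaces per quantization; the function name is our own:

```python
def volume_bytes(q_bits, n_spaces):
    """Bytes needed for n_spaces volumes at q-bit quantization:
    (2^q)^3 cells, 4-byte counter per cell."""
    cells = (2 ** q_bits) ** 3
    return cells * 4 * n_spaces

parvo_8bit = volume_bytes(8, 900)   # ~60GB per genome at 8 bits
parvo_5bit = volume_bytes(5, 900)   # ~118MB per genome at 5 bits
magno_5bit = volume_bytes(5, 225)   # magno spaces are a quarter the count
```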
For practical reasons, the individual volume projection spaces are kept in files stored with lossless compression, which usually provides a 100:1 compression ratio, since most of the volume data is zeros and compresses very well. The volume projection spaces are loaded into memory on demand from files and uncompressed for correspondence. Assuming a 4000x3000 12Mpixel image is sequenced into 2,000 genome regions, total volumetric space storage is shown below. Note that 8-bit quantization spaces are computed for base metric computations, but not stored, reducing storage. So, total compressed volumetric storage space is usually 1–2GB per 12Mpixel image.
Parvo volume memory structure totals = ~120.27TB : 1–2GB actually stored (compressed)
8-bit quantization: 2000 ∗ 60GB = 120TB (not stored, to save space)
5-bit quantization: 2000 ∗ 118MB = 236GB
4-bit quantization: 2000 ∗ 15MB = 30GB
3-bit quantization: 2000 ∗ 2MB = 4GB
2-bit quantization: 2000 ∗ 230KB = 460MB
Magno volume memory structure totals = ~26TB : ~10MB actually stored (compressed)
8-bit quantization: 2000 ∗ 13GB = 26TB (not stored, to save space)
5-bit quantization: 2000 ∗ 26MB = 52GB
4-bit quantization: 2000 ∗ 3MB = 6GB
3-bit quantization: 2000 ∗ 410KB = 820MB
2-bit quantization: 2000 ∗ 51KB = 102MB
The magno and parvo feature tiles are composed of groups of low-level 3x1 pixel gradient information, following the basic Hubel and Wiesel observations. Tiles are low-level features, summed into the higher-level CAM neural clusters discussed earlier in this chapter. We note that Hubel and Wiesel define primal shapes [1] for the lowest-level receptive fields as oriented edge-like features, which CAM neurons model in a similar manner as edge orientations A, B, C, and D. Higher-level feature shapes are represented in strands of segmented regions resembling corners, blobs, and arbitrarily shaped regions. The primal features are recorded over time by experiential learning (see [1, ref. 552]). Magno and parvo tiles are illustrated in Figures 6.8 and 6.9.
The parvo CAM features are computed from five input image spaces: raw, sharpened, blurred, local contrast enhanced, and global contrast enhanced, broken into 3 RGB channels, for a total of 15 input images composed into the four CAM orientations A, B, C, D for each RGB color. Figure 6.9 shows magno luminance channel input from the five input image spaces at the four magno CAM orientations A, B, C, D.
It should be noted that for realistic images, the finer the quantization (the more bits), the more sparsely the volume will be populated. For example, at 8-bit quantization most of the volume will be empty, with features clustered around the center axis, while at 2-bit quantization most of the volume will be populated, though likely still most densely around the center axis. Several volume renderings are provided to illustrate the point in the next section, “Quantized Volume Projection Metric Renderings.”
Note that maximally or widely diverging adjacent pixel values do not often occur in natural images; rather, adjacent pixels are usually close together in value. Widely diverging adjacent pixel values are more characteristic of very sharp edge transitions, noise, and saturation effects; moderate divergence corresponds to texture; and no divergence corresponds to no texture, i.e., a flat surface. So the extremes of the volume address space will likely never be populated for visual genome features of natural images, which resemble sparse volumetric shapes clustered about the center axis.
The Visual Genome Project will be able to determine the most popular CAM clusters by sequencing millions of images and recording a master volume for all CAM neural clusters to record all known genomes. Or for a specific application domain, a master volume can be recorded as well.
The following volume renderings are made using the ImageJ Fiji Volume Viewer Plugin (http://fiji.sc/Volume_Viewer) to illustrate the detail provided by different quantization spaces. The renderings that are primarily reddish hues are surface renderings that include opaque surface lighting and shading but do not show internal details of the distributions. The bluish renderings do not use lighting and shading but rather use transparency effects to reveal the internal details of the volumes. The false coloring represents CAM neuron feature count.
The following renderings are provided:
Fig. 6.10. Renderings from a high-texture genome region from the Sequoias image
Fig. 6.11. Renderings from a medium-texture image of kids playing in a room
Fig. 6.12. An ambiguous rendering made without ignoring the 0-values in the addresses
Fig. 6.13. 4-bit quantization renderings of F18 ceiling region
Fig. 6.14. 5-bit quantization renderings of F18 ceiling region
Fig. 6.15. 8-bit quantization renderings of F18 ceiling region
Fig. 6.16. 8-bit rendering of F18 ceiling region under bright light, showing saturation effects in the addresses which bleed outside the volume
In this chapter we discussed volume projection metrics, which are rendered into an (x,y,z) volume for purposes of visualization and correspondence, and represent CAM neuron clusters. We discussed how CAM neurons implement CAM address features within the 125 input spaces including RGB, pre-processed images, and low-level 3x1 Hubel & Wiesel style primal gradient features. The concept of clustering CAM features summed into CAM neural clusters was discussed in detail, as well as the associated volume metrics and distance functions available in the synthetic model. Details on memory size for all the volumetric memory features were enumerated, along with some discussion on the trade-offs for creating volumetric metric structures from different sized micro regions. Finally, volume projections were presented as volume renderings across a representative range of quantization spaces to provide insight.