Chapter 11
Applications, Training, Results

There is nothing so powerful as truth—and often nothing so strange.

—Daniel Webster

Overview

This chapter walks through the process of training a test application, using an interactive training scenario to provide insight into genome metric comparisons and scoring results. The process of genome selection, strand building, genome compare testing, metric selection, metric scoring, and reinforcement learning is outlined at a high level. This chapter does not provide a complete working application as source code, but instead explores low-level details using VGM command line tools to develop intuition about VGM internals.

The key section in this chapter for understanding metrics and scoring for the genome compare tests is “Selected Uniform Baseline Test Metrics” in the “Test Genomes and Correspondence Results” section, where a uniform baseline set of thousands of color, shape, and texture metrics, used for all genome compare tests, is explained. By examining the detailed test results in the scoring tables for each genome compare, insight is developed for metric selection, qualifier metrics tuning, classifier design, and the various learning methods discussed in Chapter 4.

This chapter illustrates how each reference genome is a separate ground truth, requiring specific metrics discovered via reinforcement learning, as discussed in Chapter 4. The volume learning model leverages reinforcement learning, beginning from the autolearning hull thresholds, to establish an independent classifier for each metric in each space for each genome. VGM allows for classifier learning by discovering optimal sets of metrics to combine and tune as a group into a structured classifier for each genome.

The VGM command line tools are used to illustrate an interactive test application development process, including genome selection and training (see Chapter 5 for tool details). Problems and work-arounds are explored. The test application is designed to locate a squirrel in a 12MP test image (see Figure 11.1). Reinforcement learning considerations are highlighted for a few robustness criteria such as rotation and lighting invariance. Detailed metric correspondence results are provided.

In addition to the test application, three unit tests are presented and executed to show how pairs of genomes compare against human expectations. The three unit test categories are (1) genome pairs which look like matches, (2) genome pairs which appear to be close matches and might spoof the metrics comparisons, and (3) genomes which are clearly nonmatches. Summary test tables are provided containing the final unit test scores, which reveal anomalies in how humans compare genomes, why ground truth data selection is critical, and why reinforcement learning and metrics tuning are needed to build a good set of baseline metrics for scoring within a suitable classifier structure.

Major topics covered in this chapter include:

Test application outline

Background segmentations, problems, and work-arounds

Genome selection, strands, and training

Baseline genome metrics selection from CSV agents

Structured classification

Testing and reinforcement learning

Nine selected test genomes to illustrate genome compare metrics

Selected metrics (Color, Texture, Shape)

Scoring strategies

Unit tests showing first order metrics for MATCH, NOMATCH, and CLOSE MATCH

Sample high level agent code

Figure 11.1: (Left) 12MP 4000x3000 test image containing several squirrel instances, (center) 400x300 squirrel region from 12MP image, and (right) squirrel head downsampled 10x to 40x30 (as a DNN might see it after downsampling the entire training image from 4000x3000 to 400x300).

Test Application Outline

The test image in Figure 11.1 (left) includes several views of a squirrel, with different pre-processing applied to each squirrel, so the squirrels are all slightly different. A set of test genomes is extracted from the test image for genome comparisons; these genomes are identified later in this chapter. VGM operates on full resolution images, and no downsampling of images is required, so all pixel detail is captured in the model.

Here is an outline and discussion of the test application parameters, with some comparison of VGM training to DNN training:

Image resolution: 4000x3000 (12MP image). Note that the VGM is designed to fully support common image sizes such as 5MP and 12MP, or larger. Note that a DNN pipeline typically operates on a fixed size such as 300x300 pixel images; therefore, this test application could not be performed using a DNN (see the Figure 11.1 caption).

Robustness criteria: scale, rotation, occlusion, contrast, and lighting variations are illustrated in the compare metrics. The training set contains several identical squirrels, except that each separate squirrel instance is modified using image pre-processing to change rotation, scale, and lighting to test the invariance criteria. DNN training protocols add training images expressing the desired invariance criteria.

Interactive training: interactive training sessions are explained, where the trainer selects the specific squirrel genome features to compose into a strand object, similar to the way a person would learn from an expert. (Note that an agent can also select genomes and build strands, with no intervention.) By contrast, DNN training protocols learn the entire image presented, learning both intended and unintended features, which contributes to DNN model brittleness.

Rapid training: VGM can be trained quickly and interactively from selected features in a single image, and fine-tuned during training using additional test images or other images. VGM training is unlike DNN training, since DNN training requires epochs of training iterations (perhaps billions) as well as a costly-to-assemble set of perhaps tens to hundreds of thousands of hand-selected (or machine-selected and then hand-selected) training images of varying usefulness.

CSV agent reinforcement learning: the best metric comparison scores for each genome comparison are learned and stored in correspondence signature vectors (CSVs) by CSV agents, using the autolearning hull thresholds yielding first order metric comparisons. Reinforcement learning and training is then applied to optimize the metrics.
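
To make the CSV idea above concrete, here is a minimal C++ sketch (not the actual VGM API; the struct and function names are hypothetical) of how a correspondence signature vector might record first order metric comparison scores and flag the ones that fall inside an assumed autolearning hull threshold:

    #include <string>
    #include <vector>

    // Hypothetical record of one first order metric comparison.
    struct CsvEntry {
        std::string metric_name;     // e.g. "centroid_8", "haralick_contrast"
        double      score;           // normalized distance: 0.0 = perfect match
        double      hull_threshold;  // assumed autolearning hull threshold (~1.0)
        bool first_order_match() const { return score < hull_threshold; }
    };

    // Hypothetical correspondence signature vector for one genome compare.
    struct Csv {
        std::vector<CsvEntry> entries;
        // Keep only the metrics that already agree with the match threshold;
        // these become candidates for reinforcement and weight tuning.
        std::vector<CsvEntry> candidates() const {
            std::vector<CsvEntry> out;
            for (const auto& e : entries)
                if (e.first_order_match()) out.push_back(e);
            return out;
        }
    };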

Here are the key topics in the learning process covered in subsequent sections of this chapter:

Strands and genome segmentations: selecting genomes to build strands, problems, and work-arounds

Testing and interactive reinforcement learning: CSV agents, finding the best metrics, classifier design, metrics tuning

Test genomes and correspondence results: compare test genomes; evaluate genome comparisons using baseline metrics

Strands and Genome Segmentations

As shown in Figure 11.2, note that the parvo segmentations (left) are intended to be smaller and more uniform in size, simulating the AR saccadic size, and the magno segmentations (right) are intended to include much larger segmented regions as well as blockier segmentation boundaries. The jSLIC segmentation produced 1,418 total parvo scale genome regions, but only 1,246 were kept in the model, since 172 were culled because they were smaller than the AR size. See Chapter 2 regarding saccadic dithering and segmentation details. The morpho segmentation produced 551 magno scale features; 9 were culled as too large (> 640x480 pixels) and 75 were culled as smaller than the AR, yielding 467 modeled genome features. By using both the parvo and magno scale features, extra invariance is incorporated into the model. Using several additional segmentations combined would be richer and beneficial, but for brevity only two are used.

Figure 11.2: (Left) jSLIC parvo resolution color segmentation into uniform sizes and (right) morpho segmentation into nonuniform sizes.
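
The culling step described above can be pictured with the short C++ sketch below (a standalone illustration with assumed names, not VGM code): regions smaller than the AR, or larger than an upper bound such as 640x480 for the magno scale, are dropped before modeling.

    #include <vector>

    struct Region { int width; int height; };   // stand-in for a segmented region

    // Keep only regions between the AR size and an upper size bound,
    // mirroring the culling counts reported for the parvo and magno sets.
    std::vector<Region> cull_regions(const std::vector<Region>& in,
                                     int ar_w, int ar_h, int max_w, int max_h) {
        std::vector<Region> kept;
        for (const auto& r : in) {
            bool too_small = (r.width < ar_w  || r.height < ar_h);
            bool too_large = (r.width > max_w || r.height > max_h);
            if (!too_small && !too_large) kept.push_back(r);
        }
        return kept;
    }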

Resulting genome segmentations of the squirrel are shown in Figure 11.3. Note that the parvo features are generally smaller and more uniform in size, and the magno features contain larger and nonuniform sizes, which is one purpose for keeping both types of features. No segmentation is optimal. Comparing nonuniform sized regions (i.e. a small region compared to a large region) is performed in the MCC classifiers by normalizing each metric to the region size prior to comparison. Based on current test results, a few metric comparisons seem to work a little better when the region sizes are about the same; however, region size sensitivity is generally not an issue.
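
The size normalization idea can be sketched as below (an assumption-level illustration; dividing by region area is one plausible normalization, and the names are hypothetical):

    #include <algorithm>
    #include <cmath>

    // Hypothetical size-normalized metric compare used when region areas differ.
    // Returns a relative distance where 0.0 indicates a perfect match.
    double compare_size_normalized(double metric_a, double area_a,
                                   double metric_b, double area_b) {
        double na = metric_a / area_a;   // per-pixel value for region A
        double nb = metric_b / area_b;   // per-pixel value for region B
        double denom = std::max(std::abs(na), std::abs(nb));
        return denom > 0.0 ? std::abs(na - nb) / denom : 0.0;
    }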

Figure 11.3: Two pairs of segmentations shown top and bottom. jSLIC hi-res segmentation (left of the pairs) from 4000x3000 RGB image vs. morpho lo-res segmentation (right of the pairs) from 800x600 5:1 downsampled image. See Chapter 2 for details.

Building Strands

An interactive training process is performed using the vgv command line tool discussed in Chapter 5, which displays the test image and allows for various interactive training options. Using vgv, a parvo strand and a magno strand are built to define the squirrel features, as well as define additional test features. To learn the best features, a default CSV agent is used to create and record comparison metrics in a CSV signature vector to allow the trainer to interactively learn preferred metrics. Not shown is the process of using the master learning controller, discussed in Chapter 5, to locate the best metrics and auto-generate C++ code to implement the CSV agent learnings in a custom agent.

For this example, we build a parvo strand as shown in Figure 11.4 and a magno strand as shown in Figure 11.5 to illustrate segmentation differences. By using both a parvo strand and a magno strand of the same object, robustness can be increased. Note that the magno strand includes coarser level features since the morpho segmentation is based on the 5:1 reduced resolution magno images, compared to the full resolution parvo images. Recall from Chapter 2 that the VGM models the magno pathway as the fast 5:1 subsampled low-resolution luma motion tracking channel, and the parvo channel is the slower responding high-resolution channel for RGB color using saccadic dithering for fine detail.

The following sections build strands from finer parvo features and coarser magno features, with some discussion about each strand. Note that during training, there can be problems building strands which manifest as shown in our examples. We discuss problems and work-arounds after both strands are built.

Parvo Strand Example

Figure 11.4: Parvo strand, smaller genome regions, smoother boundaries. This is the reference strand for this chapter exercise.

The commands used to build the parvo strands in Figure 11.4 are shown below. Details on the command vgv CREATE_STRAND are provided in Chapter 5.

Magno Strand Example

Figure 11.5: Magno strand from the morpho coarse segmentation; blocky region perimeter boundaries are due to the 5:1 subsampling of the magno images. Also note the back segmentation is wrong and includes part of the wall.

The commands used to build and show the magno strand in Figure 11.5 are shown below. Details on the command vgv CREATE_STRAND are provided in Chapter 5.

The magno strand in Figure 11.5 only contains three genomes, each coarser in size and shape compared to the nine finer grained genomes for the parvo strand in Figure 11.4. Also note that the squirrel back genome in Figure 11.5 contains part of the stucco wall—not good. A better strand will need to be built from a different segmentation. See the next section for discussion.

Discussion on Segmentation Problems and Work-arounds

Figure 11.5 illustrates the types of problems that can occur with image segmentation, which manifest as ill-defined genome region boundaries including wrong colors and textures, resulting in anomalous metrics and suboptimal genome region correspondence. As shown in Figure 11.5, the magno squirrel back+front_leg region is segmented incorrectly to include part of the stucco wall. What to do? Some discussion and work-arounds are given here to deal with segmentation issues as follows:

  1. Use a different LGN input image segmentation (raw, sharp, retinex, blur, histeq) or different color space component (Leveled, HLS_S, . . .)
  2. Use a different segmentation method or parameters (jSLIC, morpho, . . .)
  3. Use multiple segmentations and perhaps create multiple strands; do not rely on a single segmentation
  4. Combinations of all of the above; learn and infer during training.

Future VGM releases will enhance the LGN model to provide as many of the above work-around options as possible to improve segmentation learning. The current VGM eye/LGN segmentation pipeline is still primitive but emulates how humans scan a scene by iterating around the scene, changing LGN image pre-processing parameters and resegmenting the scene in real-time. So for purposes of LGN emulation, it is advantageous to use more than one LGN image pre-processing method and more than one segmentation method, based on the analysis of the image metrics, to emulate the human visual system. Then strands can be built from the optimal segmentations.

Strand Alternatives: Single-image vs. Multi-image

There are two basic types of strands: single-image or multi-image. A default single-image strand collects genomes from a single-image segmentation, and a multi-image strand includes genomes from multiple-image segmentations, which may be desirable for some applications. Since the interactive training vgv app works on a single image at a time, we use single-image strands here. However, it is also possible to collect a set of independent genomes, and the agent can perform genome-by-genome correspondence and manage the scoring results.

For the current test exercise, we emulate agent-managed strands for genome-by-genome comparison, rather than using the strand set structure and shape correspondence functions provided in the VGM, since the aim is to provide more insight into the correspondence process. The strand correspondence functions are discussed in Chapter 8.

Testing and Interactive Reinforcement Learning

In this section we provide metrics comparison results between selected genome reference/target pairs in Figure 11.7 below. For each reference genome we illustrate the process of collecting and sorting genome-by-genome correspondence results to emulate the strand convenience function match__strand_set_metrics ( ... ) discussed in Chapter 8. By showing the metrics for selected genome compares, low-level details and intuition about genome correspondence are presented. For brevity, we ignore the strand metrics for angle, ellipse shape, and Fourier descriptor, which are provided in the MCC shape analysis function match__strand_shape_metrics ( ... ), and instead focus on just the set metrics and correspondence scores.
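
The genome-by-genome emulation amounts to comparing every reference genome to every target genome and sorting the scores, roughly as in the C++ sketch below (hypothetical types and distance function; the real convenience functions are described in Chapter 8):

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    struct Genome { int id; };                              // stand-in for a VGM genome
    struct Score  { int ref_id; int tgt_id; double value; };

    // Placeholder distance; in VGM this would be an MCC/CSV agent comparison.
    double compare_genomes(const Genome& ref, const Genome& tgt) {
        return static_cast<double>(std::abs(ref.id - tgt.id));
    }

    std::vector<Score> correspondence_table(const std::vector<Genome>& refs,
                                            const std::vector<Genome>& tgts) {
        std::vector<Score> table;
        for (const auto& r : refs)
            for (const auto& t : tgts)
                table.push_back({r.id, t.id, compare_genomes(r, t)});
        // Best (lowest) scores first, emulating the sorted strand set results.
        std::sort(table.begin(), table.end(),
                  [](const Score& a, const Score& b) { return a.value < b.value; });
        return table;
    }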

Hierarchical Parallel Ensemble Classifier

For these tests, a structured classifier is constructed, as shown in Figure 11.6, using a hierarchical ensemble of parallel MCCs to compute correspondence using a group of CSV agents operating in an ensemble, where each CSV agent calls several MCC classifiers and performs heuristics and weighting based on the MATCH_CRITERIA and the OVERRIDES parameters (as discussed in Chapter 4). Each CSV agent uses an internal hierarchical classifier to select and weight a targeted set of metrics. The CSV agent MR_SMITH is called six times in succession using different match criteria parameters for each of the six agent invocations: MATCH_NORMAL, MATCH_STRICT, MATCH_RELAXED, MATCH_HUNT, MATCH_PREJUDICE, MATCH_BOOSTED. The end result is a larger set of metrics which are further reduced by a learning agent to a set of final classification metrics, shown in the following sections.

Figure 11.6: The hierarchical ensemble classifier used for the tests.
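
The ensemble call pattern can be sketched as follows (the enum values mirror the match criteria named above; the agent interface itself is hypothetical):

    #include <vector>

    enum class MatchCriteria { NORMAL, STRICT, RELAXED, HUNT, PREJUDICE, BOOSTED };

    // Hypothetical CSV agent interface; a real agent would call several MCC
    // classifiers and apply heuristics and weights per the match criteria and overrides.
    struct CsvAgent {
        std::vector<double> run(MatchCriteria) const { return {}; }  // placeholder
    };

    // Invoke one agent (e.g. MR_SMITH) six times and pool the scores, producing
    // the larger metric set that a learning agent later reduces.
    std::vector<double> run_ensemble(const CsvAgent& agent) {
        const MatchCriteria all[] = { MatchCriteria::NORMAL,    MatchCriteria::STRICT,
                                      MatchCriteria::RELAXED,   MatchCriteria::HUNT,
                                      MatchCriteria::PREJUDICE, MatchCriteria::BOOSTED };
        std::vector<double> pooled;
        for (auto mc : all) {
            auto scores = agent.run(mc);
            pooled.insert(pooled.end(), scores.begin(), scores.end());
        }
        return pooled;
    }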

Reinforcement Learning Process

In this chapter, we go through a manual step-by-step process illustrating how reinforcement learning works inside the CSV agent functions. This approach builds intuition and reveals the challenges of learning the best metrics in the possible metric spaces. The CSV agents are used to collect metrics and sort the results, which we present in tables. Each CSV agent calls several MCC functions, performs some qualifier metric learning, as well as sorting the metric comparisons into MIN, AVE, and MAX bins for further data analysis. Please refer to Chapter 4 for a complete discussion of the VGM learning concepts including CSVs, qualifier learning, autolearning hulls, hull learning, and classifier family learning.

The approach for VGM reinforcement learning is to qualify and tune the selected metrics to correspond to the ground truth, so if metrics agree with the ground truth, they are selected and then further tuned. Obviously bad ground truth data will lead to bad classifier development. The learning process followed by the master learning controller reinforces the metric selection and sorting criteria to match ground truth, and involves interactive genome selection to build the reference and target strands, followed by an automated test loop to reinforce the metrics. We manually illustrate the MLC steps in this chapter as an exercise, instead of using the MLC, to provide intuition and low-level details.

In the examples for this chapter, the reference genome metrics are the ground truth. For VGM training as well as general training protocols, the reference genomes chosen are critical to establish ground truth assumptions. A human visual evaluation may not match a machine metric comparison, indicating that selected ground truth may not be a good sample to begin with, or that the metric is unreliable. The VGM learning process evaluates each reference genome metric against target test genome metrics, selecting the best scoring metrics, adjusting metric comparison weights, and reevaluating genome comparisons to learn and reinforce the best metrics to match ground truth. This is referred to as VGM classifier learning. Ground truth, as usual, is critical.
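
A highly simplified sketch of one pass of that loop is shown below (assumed names and update factors, not VGM code): metrics whose first order scores agree with the ground truth label are rewarded, the rest are penalized, and the comparison is rescored with the new weights.

    #include <vector>

    struct MetricResult { double score; double weight; bool agrees_with_truth; };

    // One reinforcement pass: reward metrics that agree with ground truth,
    // penalize the rest, then recompute the weighted correspondence score.
    double reinforce_and_rescore(std::vector<MetricResult>& metrics,
                                 double reward = 1.1, double penalty = 0.9) {
        double weighted_sum = 0.0, weight_total = 0.0;
        for (auto& m : metrics) {
            m.weight *= m.agrees_with_truth ? reward : penalty;
            weighted_sum += m.score * m.weight;
            weight_total += m.weight;
        }
        return weight_total > 0.0 ? weighted_sum / weight_total : 0.0;
    }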

The master learning controller (MLC) can automatically find the best genomes to match the reference genome ground truth assumption by comparing against various target genomes and evaluating the first order metric comparison scores. However, in this chapter the MLC process is manually emulated to provide low-level details. The MLC process works as follows:

Note that various VGM reinforcement learning models can be defined in separate agents, and later agents can be retrained to produce derivative agents, allowing for a family of related agents to exist and operate in parallel. The default CSV agents can be used as starting points.

In the next sections we illustrate only the initial stages of the reinforcement learning process, step by step, by examining low-level details for a few select genome comparison metrics, showing how the agent may call MCC functions and CSV agents and then sort through correspondence scores to collect and reinforce the learnings.

Test Genomes and Correspondence Results

As shown in Figure 11.7, nine test genomes have been selected from Figure 11.1. Pairs of test genomes are assigned as target and reference genomes for correspondence tests. For each of the nine test genomes, a uniform set of base genome metrics are defined for texture, shape, and color. Most of the test metrics are computed in RGB, HLS, and color leveled spaces, while some metrics such as Haralick and SDMX texture are computed over luma only. Genome comparison details are covered in subsequent sections.

Figure 11.7: Parvo genome masks, shown as RGB masks and luma masks, used for metrics comparison illustrations.

*Note on object vs. category or class learning: This chapter is concerned with specific object recognition examples as shown in Figure 11.7. VGM is primarily designed to learn a specific object, a genome, or strand of genomes, rather than learn a category of similar objects as measured by tests such as ImageNet [1] which contain tens of thousands of training samples for each class, some of which are misleading and suboptimal as training samples. VGM learning relies on optimal ground truth. However, VGM can learn categories by first learning a smaller selected set of category reference objects and then extrapolating targets to category reference objects via measuring group distance in custom agents. See Chapter 4 on object learning vs. category learning.

Selected Uniform Baseline Test Metrics

A uniform set of baseline test metrics are collected into CSVs for each reference/target genome comparison. The idea is to present a uniform set of metrics for developing intuition. But for a real application, perhaps independent sets of metrics should be used as needed to provide learned and tuned ground truth metrics for each reference genome. In the following sections, the uniform baseline metrics are presented spanning the selected (T) texture, (S) shape and (C) color metrics. No glyph metrics are used for these tests.

Thousands of metric scores are computed for each genome compare by ten CSV agents used in an ensemble, spanning (C) color spaces, (T) texture spaces, (S) shape spaces, and quantization spaces. However, only a uniform baseline subset of the scores are used as a group to compute the final AVE and MEDIAN score for the genome compare, highlighted in yellow in the scoring tables below. Further tuning by agents would include culling the baseline metric set to use only the best metrics for the particular reference genome ground truth. Of course, other scoring strategies can be used such as deriving the average from only the first or second quartiles, rather than using all quartiles as shown herein.
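
The scoring described above reduces to simple statistics over the selected baseline scores; a standalone C++ sketch of the AVE, MEDIAN, and first-quartile strategies is shown below (helper names are assumptions, not the VGM API). For a genome compare, these would be applied only to the uniform baseline subset of scores.

    #include <algorithm>
    #include <numeric>
    #include <vector>

    double average(const std::vector<double>& v) {
        return v.empty() ? 0.0 : std::accumulate(v.begin(), v.end(), 0.0) / v.size();
    }

    double median(std::vector<double> v) {
        if (v.empty()) return 0.0;
        std::sort(v.begin(), v.end());
        size_t n = v.size();
        return (n % 2) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
    }

    // Alternative strategy: average only the best (first) quartile of scores.
    double first_quartile_average(std::vector<double> v) {
        if (v.empty()) return 0.0;
        std::sort(v.begin(), v.end());
        size_t q = std::max<size_t>(1, v.size() / 4);
        return std::accumulate(v.begin(), v.begin() + q, 0.0) / q;
    }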

After each test genome pair is compared, the results are summarized in tables. Finally, the summary results for all tests are discussed and summarized in Tables 11.10, 11.11, and 11.12.

Uniform Texture (T) Base Metrics

The SDMX and Haralick base texture metrics for each test genome (top row of each table) are shown in Tables 11.1 and 11.2. Base metrics of each reference/target genome are compared together for scoring. In addition to the Haralick and SDMX metrics, selected volume distance texture metrics can also be used, as discussed in Chapter 6. Each base metric in the SDMX and Haralick tables below is computed as the average value of the four 0-, 45-, 90-, and 135-degree orientations of SDMs in luma images.

Table 11.1: SDMX base metrics for test genomes (luma raw)

Table 11.2: Haralick base metrics for test genomes (luma raw)

Uniform Shape (S) Base Metrics

In the base metrics displayed in Table 11.3, centroid shape metrics are shown, taken from the raw, sharp, and retinex images as volume projections (CAM neural clusters), as discussed in Chapter 8. Also see Chapter 6 for details on volume projections. The centroid is usually a reliable metric for a quick correspondence check.

Each [0,45,90,135] base centroid metric in Table 11.3 is computed as the cumulative average value of all volume projection metrics in [RAW, SHARP, RETINEX] images and each color space component [R,G,B,L, HLS_S].

Table 11.3: Centroid base metrics for the nine test genomes

Uniform Color (C) Base Metrics

The selected color base metrics for each test genome are listed and named in the rows of Table 11.4. The 15 uniform color metrics are computed across all available color spaces and color-leveled spaces by the CSV agents, and the average value is used for metric comparisons. Color metric details are provided as color visualizations below for each test genome: Figure 11.8 shows 5-bit and 4-bit color popularity methods, and Figure 11.9 shows the popularity colors converted into standard color histograms.

Table 11.4: 8-bit Color space metrics for test genomes

Each of the color metrics is computed a total of 24 times by each CSV agent, represented as three groups [RAW,SHARP,RETINEX] in each of eight color leveled spaces. Each CSV agent is invoked six times using different match criteria to build up the classification metrics set. The color visualizations in Figures 11.8 and 11.9 contain a lot of information and require some study to understand (see Chapter 7, Figures 7.10a and 7.11, for details on interpretation).
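
The 24-way computation can be pictured as a nested loop over the three image pre-processing groups and the eight color leveled spaces, as in this sketch (the metric callback and names are hypothetical):

    #include <functional>
    #include <string>
    #include <vector>

    // Hypothetical callback computing one color metric for a given
    // pre-processed image group and color leveled space.
    using ColorMetricFn = std::function<double(const std::string& image_group,
                                               int leveled_space)>;

    // 3 image groups x 8 color leveled spaces = 24 values per metric per agent;
    // the average is what gets compared between reference and target genomes.
    double average_color_metric(const ColorMetricFn& metric) {
        const std::vector<std::string> image_groups = {"RAW", "SHARP", "RETINEX"};
        const int num_leveled_spaces = 8;
        double sum = 0.0;
        int count = 0;
        for (const auto& img : image_groups)
            for (int space = 0; space < num_leveled_spaces; ++space, ++count)
                sum += metric(img, space);
        return sum / count;
    }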

Figure 11.8: 5-bit MEDIAN_CUT and 5-bit POPULARITY_COLORS for each test genome. Each image contains 24 lines, each representing separate color spaces as per the details provided in Figure 7.10.
Figure 11.9: Standard color histograms for each test genome. The histogram contains 24 lines representing separate color spaces as per the details provided in Figure 7.11.

Note that traces of unexpected greenish color are visible in some of the standard histograms in Figure 11.9 (see also Figure 11.8). This is due to a combination of (1) the LGN color space representation, (2) a poor segmentation, and (3) standard color distance conversions. The popularity colors distance functions can filter out the uncommon greenish color, by (1) ignoring the least popular colors (i.e. use the 64 most popular colors) to maximize the use of the most popular colors, or (2) using all 256 popularity colors, thereby diminishing the effect of the uncommon colors. In addition, the color region metric algorithms, as described in Chapter 7, contain several options to filter and focus on specific popularity or standard colors metric attributes.
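
A sketch of the first filtering option is shown below (an assumption-level illustration: the palette struct, nearest-color matching, and names are hypothetical). The distance is accumulated only over the K most popular reference colors, so rare colors such as the stray greenish entries contribute nothing.

    #include <algorithm>
    #include <cmath>
    #include <vector>

    struct PaletteColor { double r, g, b; double popularity; };

    // Compare two popularity palettes using only the top_k most popular reference
    // colors, matching each to its nearest color in the target palette.
    double popularity_distance_topk(std::vector<PaletteColor> ref,
                                    const std::vector<PaletteColor>& tgt,
                                    size_t top_k = 64) {
        std::sort(ref.begin(), ref.end(),
                  [](const PaletteColor& a, const PaletteColor& b) {
                      return a.popularity > b.popularity;
                  });
        if (ref.size() > top_k) ref.resize(top_k);
        double total = 0.0;
        for (const auto& rc : ref) {
            double best = 1e30;   // distance to the closest target color
            for (const auto& tc : tgt) {
                double d = std::sqrt((rc.r - tc.r) * (rc.r - tc.r) +
                                     (rc.g - tc.g) * (rc.g - tc.g) +
                                     (rc.b - tc.b) * (rc.b - tc.b));
                best = std::min(best, d);
            }
            total += best;
        }
        return ref.empty() ? 0.0 : total / ref.size();
    }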

Test Genome Pairs

In the following sections we provide low-level details for five genome comparisons between selected test genomes to illustrate the metrics and develop intuition. The metrics are collected across each of the ten different CSV agents discussed in Chapter 4. Note that each agent is designed to favor a different set of metrics as specified in MCC parameters to tune heuristics and weighting. Therefore, each agent can compute hundreds to thousands of different metrics, and each MCC function is optimized for different metric space criteria. The five genome comparisons are:

Compare leaf : head (lo-res) genomes

Compare front squirrel : stucco genomes

Compare rotated back : brush genomes

Compare enhanced back : rotated back genomes

Compare left head : right head genomes

For each genome comparison, the uniform metrics set is computed, and summary tables are provided for each score. Metrics are computed across the CST spaces for selected color, shape, and texture. The uniform set of metrics is highlighted in yellow in the tables, since not all the metrics in the tables are selected for use in the final scoring. The testing and scoring results are discussed after the test sections.

Compare Leaf : Head (Lo-res) Genomes

This section provides genome compare results for the test metrics on the leaf and head (lo-res) genomes in Figure 11.10. A total of 5,427 selected metrics were computed and compared by the CSV agents and summarized in Table 11.5. NOTE: Some metrics need qualification and tuning, such as dens_8, locus_mean_density.

TOTAL SCORE : 1.438 NOMATCH(MEDIAN)

Figure 11.10: Leaf:head (lo-res) genome comparison.

Table 11.5: Uniform metrics comparison scores for leaf:head genomes in Figure 11.10

Compare Front Squirrel : Stucco Genomes

This section provides genome compare results for the test metrics on the squirrel and stucco genomes in Figure 11.11. A total of 5,427 selected metrics were computed and compared by the CSV agents and summarized in Table 11.6. NOTE: Some metrics need qualification and tuning, such as dens_8.

CORRESPONDENCE SCORE : 15.57 NOMATCH (MEDIAN)

Figure 11.11: Front squirrel : stucco genome comparison.

Table 11.6: Uniform metrics comparison scores for squirrel:stucco genomes in Figure 11.11

Compare Rotated Back : Brush Genomes

This section provides genome compare results for the test metrics on the rotated back and brush genomes in Figure 11.12. A total of 5,427 selected metrics were computed and compared by the CSV agents and summarized in Table 11.7. NOTE: Some metrics need qualification and tuning, such as Proportional_SAD8, dens_8, locus_mean_density. NOTE: Haralick and SDMX are computed in LUMA only for this example. Computing also in R,G,B,HLS_S and taking the AVE would add resilience to the metrics.

CORRESPONDENCE SCORE : 1.774 NOMATCH (MEDIAN)

Figure 11.12: Rotated back:brush genome comparison.

Table 11.7: Uniform metrics comparison scores for back:brush genomes in Figure 11.12

Compare Enhanced Back : Rotated Back Genomes

This section provides genome compare results for the test metrics on the enhanced back and rotated back genomes in Figure 11.13. A total of 5,427 selected metrics were computed and compared by the CSV agents, and summarized in Table 11.8. NOTE: Several metrics need qualification and tuning and are slightly above 1.0, the match threshold, such as linearity_strength.* This is a case where tuning and second order metric training is needed, since the total score is so close to the 1.0 threshold.

*Haralick and SDMX are computed in LUMA only for this example. Computing also in R,G,B,HLS_S and taking the AVE would add resilience to the metrics.

CORRESPONDENCE SCORE: 0.9457 MATCH (MEDIAN)

Figure 11.13: Enhanced back: rotated back genome comparison.

Table 11.8: Uniform metrics comparison scores for enhanced back:rotated back genomes in Figure 11.13

Compare Left Head : Right Head Genomes

This section provides genome compare results for the test metrics on the left head and right head genomes in Figure 11.14. NOTE: these genomes are mirrored images of each other and should be nearly identical. A total of 5,427 selected metrics were computed and compared by the CSV agents, and summarized in Table 11.9. NOTE: Several metrics are perfect matches, such as centroid8 and Haralick CORR. As usual, the centroid metric works very well.

CORRESPONDENCE SCORE: 0.1933 MATCH (MEDIAN)

Figure 11.14: Left head: right head genome comparison.

Table 11.9: Uniform metrics comparison scores for left head:right head genomes in Figure 11.14

Test Genome Correspondence Scoring Results

All compare metrics are weighted at 1.0 unless otherwise noted; these are raw first order metric scores without qualification and weight tuning applied. The metric correspondence scores all come out at reasonable levels, and some scores may be suitable as is with no additional tuning or learning beyond the built-in autolearning hull learning applied to all first order metric comparisons by the MCC functions. As expected, the left and right mirrored squirrel heads show the best correspondence and the stucco and squirrel head show the worst correspondence.

Note that not all the uniform compare metrics are useful in each case. Some are more reliable than others for specific test genomes, and as expected the optimal metrics must be selected and tuned using reinforcement learning for best results. The uniform base metrics used for scoring are summarized in Tables 11.10 and 11.11 below, including agent override criteria details.

Table 11.10: Uniform set of metrics for the test genome correspondence scoring

Metric Category | Metric Spaces Computed Per Category | Agent Overrides
Color metrics | Proportional_SAD5; Sl_contrast_JensenShannon8; Popularity5_CLOSEST | MR_SMITH BOOSTED; MR_SMITH NORMAL; MR_SMITH NORMAL
Volume shape metrics | Centroid_8; Density_8 | AGENT_RGBVOL_RAW; AGENT_RGBVOL_MIN
Volume texture metrics | RGB VOL AVE of (0,45,90,135) genomes; R,G,B,L,HLS_S AVE of (0,45,90,135) genomes | AGENT_RGBVOL_RAW; AGENT_RGBVOL_AVE
SDMX metrics | SDMX AVE locus_mean_density; SDMX AVE linearity_strength | MR_SMITH NORMAL; MR_SMITH NORMAL
Haralick metrics | HARALICK CONTRAST (scaled uniformly using a weight of 10.0) | MR_SMITH NORMAL

Table 11.11: Total base metrics computed for each test genome

Metric Category Total Metrics Metric Spaces Computed Per Category
Color metrics 1,800 15 metrics, 5 color space components [R,G,B,L,HLS_S], 4 color levels, 5 images [RAW,SHARP,RETINEX,HISTEQ,BLUR], 6 agents: total metrics 1800 = 15x5x4x6
Volume shape metrics 2,250 9 metrics, 5 color space components [R,G,B,L,HLS_S], 5 images [RAW,SHARP,RETINEX,HISTEQ,BLUR], 10 agents, total metrics: 9x5x5x10 = 2,250
Volume texture metrics 1,200 4 metrics, 5 color space components [R,G,B,L,HLS_S], 5 images [RAW,SHARP,RETINEX,HISTEQ,BLUR], 12 agents, total metrics: 4x5x5x12 = 1,200
SDMX metrics 165 11 metrics, 5 angles, 3 images [RAW,SHARP,RETINEX], 1 color LUMA: total metrics: 11x5x3x1 = 165
Haralick metrics 12 4 metrics, 3 images [RAW,SHARP,RETINEX], 1 color LUMA, total metrics: 4x3x1=12
ALL TOTAL 5,427 Total of CSV metric comparisons made for each genome compare

A total of 5,427 base metrics are computed for each genome comparison by the CSV agents, as shown in Table 11.11, and then compared using various distance functions as discussed in Chapters 6-10. The final correspondence scores are summarized in Table 11.12, showing that the uniform set of metrics used in this case does in fact produce useful correspondence. The CSV agents used to collect the test results include built-in combination classification hierarchies as well as metrics qualifier pairs and weight adjustments (see Chapter 4 and the open source code for the CSV agents for details).

In an application, using a uniform set of metrics for all genome comparisons is usually not a good idea, since each genome still corresponds best to specific learned metrics and tuning. In other words, each reference genome is a separate ground truth, and the best metrics to describe each reference genome must be learned, as well as adding qualifier metrics tuning using trusted and dependent metrics as discussed in Chapter 4. Therefore, each genome is optimized for correspondence by learning its own classifier.

Scoring Results Discussion

Scoring results for all tests are about as expected. Table 11.12 summarizes the final scores. Note that even using only first order metric scores for the tests, the results are good, and further training and reinforcement learning would improve the scoring. Note that the left/right head genome mirrored pair are expected to have the best cumulative scores, as observed in Table 11.12, while the enhanced back and rotated back are very similar and do match, revealing robustness to rotation and color similarity. The other genome scores show nonmatches as expected. For scoring, the MEDIAN may be preferred instead of the AVE if the scores are not in a uniform distribution or range, as is the case for the front squirrel/stucco compare scores (see Table 11.12).

Table 11.12: Test correspondence scores (< 1.0 = MATCH, 0 = PERFECT MATCH)

*NOTE: The adjusted score is the average of AVE and MEDIAN. Other adjusted score methods are performed by each CSV agent.

To summarize, the test metric correspondence scores in Table 11.12 are first order metrics produced by the CSV agents and MCC functions using default agent learning criteria and default overrides. No second order tuned weights are applied to the scores. Also note that the test scores in the preceding individual genome tests show that not all the metrics are useful in a raw first order state; they need qualifier pairing and tuning to incorporate into an optimal learned classifier.

Scoring Strategies and Scoring Criteria

The master learning controller (MLC) would select a set of metrics similar to those in Table 11.4 and then create qualifier metrics and dependent metric tuning weights for a series of tests. As discussed in the “VGM Classifier Learning” section of Chapter 4, we can learn the best scoring feature metrics to reserve as trusted metrics and then qualify and weight dependent metrics based on the strength of the trusted metric scores. Using qualifier metrics to tune dependent metrics is similar to the boosting strategy used in HAAR feature hierarchies (see [168]).
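
A minimal sketch of the qualifier idea (hypothetical names and factors): when a trusted metric scores strongly, the weights of its dependent metrics are boosted; when it scores weakly, they are suppressed, loosely analogous to a boosting cascade.

    #include <vector>

    struct Metric { double score; double weight; };

    // Gate the dependent metric weights by the strength of a trusted qualifier
    // metric (score < 1.0 means the qualifier indicates a match).
    void apply_qualifier(const Metric& trusted, std::vector<Metric>& dependents,
                         double boost = 1.5, double suppress = 0.5) {
        double factor = (trusted.score < 1.0) ? boost : suppress;
        for (auto& d : dependents)
            d.weight *= factor;
    }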

However, scoring criteria are developed using many strategies, none being a perfect strategy. If the trainer decides that the sky is red, and the metrics indicate that the sky is blue, the metrics will be adjusted to match ground truth (“No, the sky should be red, it just looks blue; tune the metrics to make the sky look red, minimize this metric contribution in the final classifier, and adjust the training set and test set to make it appear that the sky is actually red.”). This is the dilemma of choosing ground truth, training sets, test sets, metrics, distance functions, weight tunings, and classifier algorithms.

Unit Test for First Order Metric Evaluations

This section contains the results for three unit tests and is especially useful for understanding genome ground truth selection and scoring issues. The unit tests compute correspondence between visually selected genome pairs and are intended to explore cases where the human trainer has uncertainty (a close match or maybe a spoof), as well as cases where the human trainer expects a match or nonmatch. The tests provide only first order comparisons against a uniform set of metrics that are not learned or tuned. The unit tests include a few hundred genome compare examples in total, which is valuable for building intuition about MCC functions.

Unit Test Groups

The test groups are organized by creating named test strands containing a pair of genomes to compare as follows:

GROUP 1: MATCH_TEST: The two genomes look like a match, based on a quick visual check. The metric compare scores in this test set do not all agree with the visual expectation, and this is intended to illustrate how human and machine comparisons differ.

GROUP 2: NOMATCH_TEST: The two genomes do not match, based on a quick visual check. This is expected to be the easiest test group to pass with no false positives.

GROUP 3: CLOSE_TEST: The genomes are chosen to look similar and are deliberately selected to be hard to differentiate and cause spoofing. This test group is the largest, allowing the metrics to be analyzed to understand details.

For the unit tests, the test pairs reflect human judgment without the benefit of any training and learning. Therefore, the test pairs are sometimes incorrectly scored, which is intended to demonstrate the need for reinforcement learning and metrics tuning. Scoring problems may be due to (1) arbitrary and unlearned metrics being applied or (2) simply human error in selecting test pairs.

Unit Test Scoring Methodology

The scoring method takes the average values from a small, uniform set of metrics stored in CSVs for shape, color, and texture, as collected by the ./vgv RUN_COMPARE_TEST command output; a sample is shown below. The metrics in the CSV are not tuned or reinforced as optimal metrics, and as a result the average value scoring method used is not good and will be skewed away from a more optimal tuned metric score. The unit tests deliberately use nonoptimal metrics to force the reader to examine correspondence problems and analyze each metric manually, since this exercise is intended for building intuition about metric selection, tuning, and learning. The complete unit test image set and batch unit test commands are available in the developer resources in the open source code library, including an Excel spreadsheet with all metric compare scores as shown in the ./vgv RUN_COMPARE_TEST output below.

Each unit test is run using the vgv tool (the syntax is shown below), where the file name parameter at the end of each vgv command line (TEST_MATCH.txt, TEST_NOMATCH.txt, TEST_CLOSE.txt) contains the list of strand files, each encoding a chosen reference and target genome pair. The strand reference and target genome pairs can be seen as line-connected genomes in the test images shown in the unit tests in Figures 11.15, 11.16, and 11.17.

The final score for each genome pair compare is computed by the CSV agent agent_mr_smith() using a generic combination of about fifty CTS metrics, heavily weighted to use CAM clusters from volume projection metrics. Without a learned classifier, metric qualification, or tuning, the scoring results are not expected to be optimal and only show how well (or poorly) the autolearning hull thresholds actually work for each metric compare.

For example, the compare metrics produced by agent_mr_smith() with the NORMAL criteria are shown below for a genome comparison between the F18 side (gray) and the blue sky by the giant Sequoia trees (see Figure 11.17). The final average scores are shown at the end of the test output, as a summary AVE score of the color, shape, and texture scores. The independent AVE scores for color, shape, and texture are listed as well, showing that the color score of 6.173650 (well above the 1.0 match threshold) is an obvious indication that the two genomes do not match.


As shown in the test output above, the volume projection metrics (CAM neural clusters) show quite a bit of variability according to the orientation of the genome comparison (A, B, C, or D). Using the average orientation score does not seem to be very helpful. Rather, simply selecting the best scoring orientation from each CAM comparison and culling the rest would make sense as a metric selection criterion, followed by a reinforcement learning phase. The color and texture metrics show similar scoring results, illustrating how metric selection must be learned for optimal correspondence.
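
That selection rule is trivial to express; assuming each CAM comparison yields four orientation scores (A, B, C, D), only the best (lowest) one is kept and the other three are culled, as in this sketch:

    #include <algorithm>
    #include <array>

    // Four orientation scores from one CAM (volume projection) comparison: A, B, C, D.
    using OrientationScores = std::array<double, 4>;

    // Keep only the best (lowest) orientation score; the other three are culled
    // before the reinforcement learning phase.
    double best_orientation_score(const OrientationScores& s) {
        return *std::min_element(s.begin(), s.end());
    }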

MATCH Unit Test Group Results

The MATCH test genome pairs are connected by lines, as shown in Figure 11.15. Each MATCH pair is visually selected to be a reasonable genome match. The final AVE scores for each genome compare are collected as shown in Table 11.13 and reflect no metric tuning or reinforcement learning. The results show many, but not all, pairs scoring as matches (< 1.0), which is to be expected from this quick visual pairing without training. Overall, the scoring results generally validate the basic assumptions of the VGM volume learning method.

Figure 11.15: MATCH test genome pairs connected by lines

Table 11.13: MATCH scores from genome pairs in Figure 11.15

STRAND SCORE
MATCH_F18_boxes.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.095470
MATCH_F18_shade.strand AGENT_SCORE_NO_MATCH :: 5.755328
MATCH_Guadalupe_flows.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.314187
MATCH_birch_branch.strand AGENT_SCORE_NO_MATCH :: 5.163757
MATCH_birch_branch_shadows.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.260213
MATCH_birch_branches2.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.727441
MATCH_blue_sky.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.665718
MATCH_duck_back_scaled.strand AGENT_SCORE_NO_MATCH :: 2.265636
MATCH_eves_shade.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.540162
MATCH_foliage_shaded.strand AGENT_SCORE_NO_MATCH :: 4.018654
MATCH_foliage_shaded2.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.143295
MATCH_mountain_high.strand AGENT_SCORE_NO_MATCH :: 5.951056
MATCH_mountain_top.strand AGENT_SCORE_NO_MATCH :: 3.759139
MATCH_mulch_shadow.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.443319
MATCH_red_leaves2.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.781485
MATCH_rose_leaves_variegated.strand AGENT_SCORE_NO_MATCH :: 3.697482
MATCH_sequoia_dark_shade.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.330170
MATCH_squirrel_back_shade.strand AGENT_SCORE_NO_MATCH :: 5.405850
MATCH_squirrel_body_brightness.strand AGENT_SCORE_NO_MATCH :: 3.333360
MATCH_squirrel_body_lightness.strand AGENT_SCORE_NO_MATCH :: 3.333360
MATCH_squirrel_head_scaled.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.666846
MATCH_squirrel_head_scaled2.strand AGENT_SCORE_NO_MATCH :: 2.586894
MATCH_squirrel_tail_scaled.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.726419
MATCH_stucco_blurred.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.433831
MATCH_stucco_brown2.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.194627
MATCH_stucco_shadows2.strand AGENT_SCORE_NO_MATCH :: 3.650200
MATCH_white_blurry_stucco.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.495373
MATCH_white_stucco_far.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.360789
MATCH_white_stucco_shadows.strand AGENT_SCORE_NO_MATCH :: 2.027036
MATCH_window_shades_dark.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.664059
MATCH_window_shades_scaled.strand AGENT_SCORE_NO_MATCH :: 3.091924
MATCH_window_shades_scaled_dark2.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.213476
MATCH_F18_side.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.592661
MATCH_bush.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.930191
MATCH_dark_squrrel.strand AGENT_SCORE_NO_MATCH :: 2.202091
MATCH_dark_stream.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.669367
MATCH_dark_window_shades.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.664784
MATCH_green_stucco.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.662674
MATCH_light_brown_stucco.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.723992
MATCH_light_squirrel.strand AGENT_SCORE_NO_MATCH :: 2.023359
MATCH_red_leaves.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.449019
MATCH_rose_foliage2.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.212059
MATCH_scaled_squirrel_tails.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.342239
MATCH_sequoia_foliage.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.481594
MATCH_white_stucco.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.463335
MATCH_mulch_patch.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.482750
MATCH_F18_black_box.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.139320
MATCH_F18_dashboard.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.603240
MATCH_F18_side.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.254288
MATCH_F18_topside.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.835994
MATCH_Guadalupe_river.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.205179
MATCH_black_rail.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.314160
MATCH_blue_sky.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.665718
MATCH_blurry_white_stucco.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.418017
MATCH_branch_pieces.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.795241
MATCH_bricks.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.478810
MATCH_brown_branch.strand AGENT_SCORE_NO_MATCH :: 2.095601
MATCH_brown_stucco.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.495838
MATCH_dark_bird_parts.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.628077
MATCH_dark_branch_pieces.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.800275
MATCH_dark_squirrel_sides.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.242783
MATCH_dark_window_shade.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.301873
MATCH_distant_forest.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.873285
MATCH_dove_bird.strand AGENT_SCORE_NO_MATCH :: 3.760651
MATCH_drainpipe.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.275193
MATCH_drainpipe_sections.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.514497
MATCH_duck_back_scaled.strand AGENT_SCORE_NO_MATCH :: 2.265636
MATCH_forest.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.260056
MATCH_green_foliage_scaled.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.768534
MATCH_green_stucco.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.450392
MATCH_gutter_sections.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.697693
MATCH_gutters.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.762837
MATCH_red_dark_leaves.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.673092
MATCH_red_leaf.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.323145
MATCH_red_leave_blurry.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.928191
MATCH_reddish_leaves.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.781485
MATCH_rose_leaves.strand AGENT_SCORE_NO_MATCH :: 2.157976
MATCH_rose_leaves1.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.284253
MATCH_rose_stem_and_leaves.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.760716
MATCH_scaled_sparrow_back.strand AGENT_SCORE_NO_MATCH :: 2.819345
MATCH_scaled_stucco.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.664633
MATCH_sequoia_foliage.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.501199
MATCH_sequoia_trunk.strand AGENT_SCORE_NO_MATCH :: 5.602278
MATCH_sequoia_trunk_parts.strand AGENT_SCORE_NO_MATCH :: 5.395199
MATCH_shrub_dead_leaves.strand AGENT_SCORE_NO_MATCH :: 3.202691
MATCH_shrub_leaves.strand AGENT_SCORE_NO_MATCH :: 2.288919
MATCH_shrub_pieces.strand AGENT_SCORE_NO_MATCH :: 2.206612
MATCH_sky_pieces.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.500165
MATCH_squirrel_front_leg.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.825025
MATCH_squirrel_head_scaled.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.661087
MATCH_squirrel_heads_scaled.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.796902
MATCH_squirrel_side_patch.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.879493
MATCH_squirrel_tail.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.970856
MATCH_squirrel_thigh.strand AGENT_SCORE_NO_MATCH :: 2.668796
MATCH_squirrel_thigh2.strand AGENT_SCORE_NO_MATCH :: 2.000557
MATCH_textured_stucco.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.046600
MATCH_thin_plant.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.950717
Plot of Table 11.13 MATCH scores: AVE 1.67 MEDIAN 1.260213

NOMATCH Unit Test Group Results

The NOMATCH test genome pairs shown in Figure 11.16 are selected to be visually obvious nonmatches for false positive testing. The final AVE scores are collected as shown in Table 11.14, reflecting no metric tuning or reinforcement learning. The results show zero matches scoring < 1.0, which is to be expected, and generally validate the basic assumptions of the VGM volume learning method.

Figure 11.16: NOMATCH test genome pairs connected by lines.

Table 11.14: NOMATCH scores from genome pairs in Figure 11.16

STRAND SCORE
NOMATCH_Guadalupe_and_gray_sweater.strand AGENT_SCORE_NO_MATCH :: 2.919924
NOMATCH_brick_and_stucco.strand AGENT_SCORE_NO_MATCH :: 4.523128
NOMATCH_ceiling_and_gray_sweater.strand AGENT_SCORE_NO_MATCH :: 4.487708
NOMATCH_clouds_and_leave.strand AGENT_SCORE_NO_MATCH :: 6.778410
NOMATCH_dark_bird_and_dark_foliage.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.593241
NOMATCH_dark_bird_and_dark_windfow_shades.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.664686
NOMATCH_dark_rail_and_F18_side.strand AGENT_SCORE_NO_MATCH :: 15.757048
NOMATCH_dove_and_sky.strand AGENT_SCORE_NO_MATCH :: 4.772722
NOMATCH_duck_back_and_gray_sweater.strand AGENT_SCORE_NO_MATCH :: 5.107744
NOMATCH_face_and_F18_side.strand AGENT_SCORE_NO_MATCH :: 3.852120
NOMATCH_face_and_dark_window_shades.strand AGENT_SCORE_NO_MATCH :: 5.926214
NOMATCH_face_and_dead_leaves.strand AGENT_SCORE_NO_MATCH :: 7.410056
NOMATCH_face_and_reddish_blurry_leaf.strand AGENT_SCORE_NO_MATCH :: 4.493346
NOMATCH_face_and_stucco.strand AGENT_SCORE_NO_MATCH :: 4.224622
NOMATCH_gray_sweater_and_F18_blackbox.strand AGENT_SCORE_NO_MATCH :: 3.060954
NOMATCH_gray_sweater_and_F18_side.strand AGENT_SCORE_NO_MATCH :: 3.527554
NOMATCH_red_blurred_leaf_and_stucco.strand AGENT_SCORE_NO_MATCH :: 3.489793
NOMATCH_red_leaf_and_foliage.strand AGENT_SCORE_NO_MATCH :: 3.118031
NOMATCH_red_leaf_and_squirrel_back.strand AGENT_SCORE_NO_MATCH :: 4.042901
NOMATCH_reddish_leave_and_mulch.strand AGENT_SCORE_NO_MATCH :: 3.805251
NOMATCH_sky_and_branch.strand AGENT_SCORE_NO_MATCH :: 3.920162
NOMATCH_squirrel_and_bird.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.440115
NOMATCH_squirrel_and_brown_wood.strand AGENT_SCORE_NO_MATCH :: 2.336200
NOMATCH_stucco_and_bricks.strand AGENT_SCORE_NO_MATCH :: 2.205192
NOMATCH_yellow_and_reddish_leaves.strand AGENT_SCORE_NO_MATCH :: 6.421245
NOMTACH_stucco_and_squirrel.strand AGENT_SCORE_NO_MATCH :: 9.101189
NOMATCH_F18_metals_dark_light.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.677038
NOMATCH_F18_metals_highlights.strand AGENT_SCORE_NO_MATCH :: 2.416763
NOMATCH_F18_and_sequoias.strand AGENT_SCORE_NO_MATCH :: 5.383193
NOMATCH_bird_and_rail.strand AGENT_SCORE_NO_MATCH :: 3.503938
NOMATCH_bird_and_red_foliage.strand AGENT_SCORE_NO_MATCH :: 5.066473
NOMATCH_blue_sky_and_white_sky.strand AGENT_SCORE_NO_MATCH :: 4.240115
NOMATCH_dark_stucco_and_squirrel.strand AGENT_SCORE_NO_MATCH :: 2.427723
NOMATCH_dark_window_shade_and_white_stucco.strand AGENT_SCORE_NO_MATCH :: 15.187078
NOMATCH_green_stucco_and_dark_bird.strand AGENT_SCORE_NO_MATCH :: 6.916977
NOMATCH_green_stucco_and_foliage.strand AGENT_SCORE_NO_MATCH :: 5.197909
NOMATCH_red_foliage_and_squirrel.strand AGENT_SCORE_NO_MATCH :: 3.375886
NOMATCH_roof_eves_and_bird.strand AGENT_SCORE_NO_MATCH :: 2.629679
NOMATCH_roof_eves_and_sky.strand AGENT_SCORE_NO_MATCH :: 3.616902
NOMATCH_rose_leave_and_squirrel.strand AGENT_SCORE_NO_MATCH :: 11.977369
NOMATCH_squirrel_and_F18.strand AGENT_SCORE_NO_MATCH :: 4.047519
Plot of Table 11.14 NOMATCH scores : AVE 4.82 MEDIAN 4.042901

CLOSE Unit Test Group Results

The CLOSE test genome pairs shown in Figure 11.17 are selected to be visually close to matches, but are intended to fool the scoring and expose metric anomalies for further study. The final AVE scores are collected as shown in Table 11.15, reflecting no metric tuning or reinforcement learning. The results show only four matches scoring < 1.0, with a few other scores close to 1.0, validating the intent of the test.

Figure 11.17: CLOSE test genome pairs connected by lines.

Table 11.15: CLOSE scores from genome pairs in Figure 11.17

STRAND SCORE
CLOSE_F18_side_and_sky.strand AGENT_SCORE_NO_MATCH :: 3.722243
CLOSE_F18_white_gear.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.386598
CLOSE_Guadalupe_and_F18_side.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.614165
CLOSE_bark_mulch.strand AGENT_SCORE_NO_MATCH :: 5.660834
CLOSE_bird_back.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.135999
CLOSE_branch_and_dark_dove.strand AGENT_SCORE_NO_MATCH :: 7.895787
CLOSE_brown_bird.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.812419
CLOSE_brown_bird_and_mulch.strand AGENT_SCORE_NO_MATCH :: 3.610680
CLOSE_cream_branch.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.672800
CLOSE_dark_bush_and_dark_sequoia.strand AGENT_SCORE_NO_MATCH :: 9.356194
CLOSE_dark_rail_and_F18_black_box.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.727999
CLOSE_dark_window_shade_and_F18_black_box.strand AGENT_SCORE_NO_MATCH :: 3.563392
CLOSE_ducks_scaled_in_Guadalupe.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.049226
CLOSE_face_and_mulch.strand AGENT_SCORE_NO_MATCH :: 3.206767
CLOSE_face_and_sequoia_trunk.strand AGENT_SCORE_NO_MATCH :: 3.060082
CLOSE_foresst_and_shrub.strand AGENT_SCORE_NO_MATCH :: 4.620559
CLOSE_green_stucco_and_forest.strand AGENT_SCORE_NO_MATCH :: 3.367237
CLOSE_gutter_and_drainpipe.strand AGENT_SCORE_POSSIBLE_MATCH :: 1.217596
CLOSE_mulch_and_sequoia_trunk.strand AGENT_SCORE_NO_MATCH :: 5.072832
CLOSE_rose_old_new.strand AGENT_SCORE_NO_MATCH :: 2.162367
CLOSE_scaled_shaded_squirrels.strand AGENT_SCORE_NO_MATCH :: 6.919095
CLOSE_sequoia_and_rose_leaf.strand AGENT_SCORE_NO_MATCH :: 6.267932
CLOSE_sequoia_and_shrub.strand AGENT_SCORE_NO_MATCH :: 4.599521
CLOSE_shaded_scaled_blurred_squirrel_side.strand AGENT_SCORE_NO_MATCH :: 3.333360
CLOSE_shrubbery.strand AGENT_SCORE_NO_MATCH :: 3.463401
CLOSE_squirrel_and_brown_bird.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.787083
CLOSE_squirrel_and_brown_bird_back.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.400416
CLOSE_squirrels_shaded_blurred.strand AGENT_SCORE_NO_MATCH :: 4.996769
CLOSE_stucco_white_brown.strand AGENT_SCORE_NO_MATCH :: 3.847540
CLOSE_yellow_leaves.strand AGENT_SCORE_NO_MATCH :: 2.915234
CLOSE_yellowish_leaves.strand AGENT_SCORE_NO_MATCH :: 5.967200
CLOSE_F18_metal_regions.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.487417
CLOSE_F18_metals.strand AGENT_SCORE_NO_MATCH :: 4.164549
CLOSE_F18_shaded_regions.strand AGENT_SCORE_NO_MATCH :: 4.874395
CLOSE_F18_white_metal_red_trim_bad_segmentation.strand AGENT_SCORE_NO_MATCH :: 2.844012
CLOSE_Guadalupe_river.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.797324
CLOSE_birch_branch_brightness.strand AGENT_SCORE_NO_MATCH :: 6.223036
CLOSE_bird_and_bush_stem.strand AGENT_SCORE_NO_MATCH :: 2.746322
CLOSE_bush_trunk_and_birch_trunk.strand AGENT_SCORE_NO_MATCH :: 2.124699
CLOSE_ceiling_and_stucco.strand AGENT_SCORE_NO_MATCH :: 3.915943
CLOSE_ceiling_bright_and_stucco.strand AGENT_SCORE_NO_MATCH :: 3.649400
CLOSE_dark_squirrel2.strand AGENT_SCORE_NO_MATCH :: 3.318282
CLOSE_dead_leaves_and_mulch3.strand AGENT_SCORE_NO_MATCH :: 2.167940
CLOSE_different_green_leaves.strand AGENT_SCORE_NO_MATCH :: 3.998639
CLOSE_duck_back.strand AGENT_SCORE_NO_MATCH :: 2.106557
CLOSE_foliage_and_foliage.strand AGENT_SCORE_NO_MATCH :: 3.204379
CLOSE_foliage_and_rose_leaf.strand AGENT_SCORE_NO_MATCH :: 9.235049
CLOSE_foliage_and_rose_leaves2.strand AGENT_SCORE_NO_MATCH :: 2.751150
CLOSE_foliage_shaded.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.633926
CLOSE_light_squirrel2.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.403573
CLOSE_pole_and_F18box.strand AGENT_SCORE_NO_MATCH :: 3.559419
CLOSE_red_leaves_different.strand AGENT_SCORE_NO_MATCH :: 3.834274
CLOSE_sequoia_shade_scale.strand AGENT_SCORE_NO_MATCH :: 3.866813
CLOSE_sequoia_shaded.strand AGENT_SCORE_NO_MATCH :: 3.962010
CLOSE_squirrel_head_and_thigh.strand AGENT_SCORE_NO_MATCH :: 2.639876
CLOSE_squirrel_saddle_and_tail.strand AGENT_SCORE_NO_MATCH :: 5.853207
CLOSE_squirrel_tail_and_breast.strand AGENT_SCORE_NO_MATCH :: 3.888022
CLOSE_stucco_brown_white2.strand AGENT_SCORE_NO_MATCH :: 4.371788
CLOSE_F18_and_sky.strand AGENT_SCORE_NO_MATCH :: 3.722243
CLOSE_bird_and_suirrel.strand AGENT_SCORE_NO_MATCH :: 2.865686
CLOSE_branch_and_bird.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.643073
CLOSE_brick_and_dark_stucco.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.994665
CLOSE_bush_and_rose_foliage.strand AGENT_SCORE_NO_MATCH :: 4.572423
CLOSE_dead_leaves_and_mulch.strand AGENT_SCORE_EXCELLENT_MATCH :: 0.980117
CLOSE_fall_leaves.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.883045
CLOSE_metal_rail_and_box.strand AGENT_SCORE_NO_MATCH :: 2.105480
CLOSE_mulch_and_squirrel.strand AGENT_SCORE_NO_MATCH :: 3.632143
CLOSE_dark_eves_and_stucco.strand AGENT_SCORE_NO_MATCH :: 5.713335
CLOSE_rose_and_sequoia_foliage.strand AGENT_SCORE_NO_MATCH :: 2.994992
CLOSE_rose_and_sequoia_foliage2.strand AGENT_SCORE_NO_MATCH :: 4.101783
CLOSE_sequoia_branches_and_sky.strand AGENT_SCORE_UNLIKELY_MATCH :: 1.442477
CLOSE_sequoia_trunk_and_mulch.strand AGENT_SCORE_NO_MATCH :: 7.817751
CLOSE_sequoia_trunk_and_squirrel_tail.strand AGENT_SCORE_NO_MATCH :: 3.968621
CLOSE_yellowish_and_reddish_leaves.strand AGENT_SCORE_NO_MATCH :: 8.695245
Plot of Table 11.15 CLOSE scores : AVE 3.57 MEDIAN 3.367237

Agent Coding

Basic boilerplate agent code is provided below, illustrating one method to write a basic agent to: register a custom agent, receive sequencer pipeline callbacks, receive correspondence callbacks, and start and stop the pipeline. This section is intended to be brief, since the VGM open source code library contains all the information and examples needed to write agent code. See Chapter 12 for details on open source code.
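
Since the listing itself is not reproduced in this extract, the following is only a schematic C++ sketch of what such boilerplate might look like; every type and function name below is hypothetical, and the real interfaces are defined in the VGM open source code library (see Chapter 12).

    #include <cstdio>

    // Hypothetical callback hooks a custom agent might implement.
    struct MyAgent {
        // Called once per sequencer pipeline stage.
        void on_pipeline_stage(int stage) { std::printf("stage %d\n", stage); }
        // Called when a genome correspondence result is available.
        void on_correspondence(int ref_id, int tgt_id, double score) {
            std::printf("ref %d : tgt %d -> %f\n", ref_id, tgt_id, score);
        }
    };

    // Hypothetical registration and pipeline control flow.
    int main() {
        MyAgent agent;
        // register_agent(agent);      // register the custom agent with the framework
        // start_pipeline();           // run the sequencer; callbacks fire per stage
        // stop_pipeline();            // shut down when correspondence is complete
        agent.on_pipeline_stage(0);    // stand-in calls so the sketch is runnable
        agent.on_correspondence(1, 2, 0.5);
        return 0;
    }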



Summary

This chapter lays the foundation for VGM application development, using interactive training to learn strands of visual genome features of a squirrel and other test set genomes. The training process illustrates how each reference genome is a separate ground truth item, with its own learned classifier using the autolearning hull and the best learned metrics, which should be tuned via reinforcement learning using qualifier metrics to learn a structured classifier. To illustrate the process and build intuition, a small set of test genomes is selected using the interactive vgv tool, and then CSV agents are used to collect and compare metrics between test genomes. Reinforcement learning details are illustrated by walking through the process of evaluating genome comparison scores, with discussion along the way regarding scoring criteria and scoring strategies. A set of three unit tests is executed to evaluate cases where the genomes (1) visually match, (2) might match, and (3) do not match. Finally, brief sample agent code is provided as an example of how to run through the VGM pipeline.
