Chapter 12
Visual Genome Project

What is seen was made from things that are not visible.

—Paul of Tarsus

Overview

This book describes the first release of the synthetic vision model and the VGM, pointing the way toward a visual genome project. During early software development and preparation for this book, several areas for future work were identified; they are listed later in this chapter. Some of these areas are expected to become trends in synthetic vision and machine learning research, as reflected in the topics of published papers; other items are already under development for the next version of the VGM.

Improvements in future synthetic vision systems are expected to come from foundational advances in several key areas, including:

AI and learning model improvements

image segmentation accuracy and parameter learning

complementary segmentation methods, including real-time micro-segmentations

re-segmentations of interesting regions in multiple image and color spaces

developing better distance functions and learning distance function parameters

learning complementary distance functions

learning feature matching thresholds for scoring

learning the best features for a specific classifier or set of classifiers

automating the code generation and parameterization of classifiers

and learning multi-level classifier group structures instead of relying on a single classifier

To conclude the chapter, we identify specific areas where the VGM will be enhanced and explain how engineers and scientists can get involved in the project through software development.

VGM Model and API Futures

Enhancements are currently being made to the VGM API, with several more planned for a future version. The items under development are listed below.

Multiple segmentation API: the most critical features include segmentation learning to identify the best parameters and region sizes, as well as targeted segmentations to support a range of image resolutions.

Dynamic segmentation API: an interactive tool interface providing several choices of genome segmentations around a feature of interest, similar to a segmentation display and picker.

Image space permutation API: allow selection of multiple genome cross-compare permutations in various image spaces, such as reference:[raw, sharp, retinex, histeq, blur] against target:[raw, sharp, retinex, histeq, blur], to find the optimal correspondence. This will be a high-level API which calls the MCC functions several times with various permutation orderings and parameterizations and collects the results into a CSV file (a sketch of this permutation pattern follows this list).

Color space permutation API: provide additional API support to agents for the selection of multiple color space compare permutations: reference[color component, color_space, leveled-space]: target[color component, color_space, leveled-space]. This will be a high-level API which calls the MCC functions several times with a set of permutation orderings and parameterizations, and collects the results into a CSV.

Expanded volume processing spaces: provide parameter options for additional sharp, retinex, blur, and histeq volumes over a sliding range of strength, such as raw → soft_sharpen → medium_sharpen → heavy_sharpen.

Chain metric qualification API: start from selected reliable metrics as qualifier metrics, creating chains of candidate dependent metrics, similar to the boosting methods in AdaBoost [1] (a sketch follows this list).

Expanded autolearning hull families: provide additional autolearning hull defaults to cover image permutations, metric spaces, and invariance criteria. Hull selection will be parameterized into the MCC functions, with parameterized hull family code generation.

Expanded scoring API: provide API support in the MCC functions for a family of autolearning hull computations tuned for each type of metric, taking better account of the metric range and using separate algorithms for volume hulls, color hulls, Haralick hulls, and SDMX hulls. A default softmax classifier will also be included (a softmax sketch follows this list).

Color autolearning hull filtering: filter out unwanted colors due to poor segmentations by reducing the hull to the most popular colors.

Qualifier metric selection overrides API: a learning method to pair and weight dependent metrics against trusted metrics.

Low-bit resolution VDNA histogram API tool: for grouping and plotting metric groups to visualize gross metric distances for selected metrics in an image or a group of images.

Alignment spaces specification API: provide additional MCC parameters for 3D volume comparison alignment of target to reference volume, using 3D centroids as the alignment point (a centroid alignment sketch follows this list). Besides centroid alignment, other alignments will be added, such as sliding volume alignment (similar to the sliding color metrics). Additional 2D alignment spaces will be added to align target to reference values within color leveling alignment spaces and other alignment markers, such as min, max, ave, and centroid. (Note: centroid marker alignment is currently supported for the MCC volume-related functions via a global convenience function.) A new alignment space model will also be added for squinting, using a common contrast remapping method.
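As a rough illustration of the permutation pattern behind the image space and color space permutation APIs, the following Python sketch enumerates every reference:target image space pairing and collects the scores into a CSV file. The space names, the mcc_compare stand-in, and the CSV layout are assumptions for illustration only; the actual VGM MCC functions and identifiers may differ.

    import csv
    import itertools

    # Hypothetical image space names taken from the list above; the real
    # VGM identifiers may differ.
    IMAGE_SPACES = ["raw", "sharp", "retinex", "histeq", "blur"]

    def mcc_compare(reference, target, ref_space, tgt_space):
        # Stand-in for the real MCC cross-compare call; returns a dummy
        # deterministic score so the sketch runs end to end.
        return abs(hash((reference, target, ref_space, tgt_space))) % 1000 / 1000.0

    def permute_image_spaces(reference, target, csv_path):
        # Run every reference:target image space permutation and collect
        # the scores into a CSV file, as the permutation APIs describe.
        with open(csv_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["ref_space", "tgt_space", "score"])
            for ref_space, tgt_space in itertools.product(IMAGE_SPACES, repeat=2):
                score = mcc_compare(reference, target, ref_space, tgt_space)
                writer.writerow([ref_space, tgt_space, score])

    permute_image_spaces("ref.png", "tgt.png", "permutations.csv")

The same loop structure applies to the color space permutation API; only the permuted dimensions (color component, color space, leveled space) change.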
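The chain metric qualification idea can be sketched in a few lines: a trusted qualifier metric gates a chain of weighted dependent metrics, loosely in the spirit of AdaBoost-style staged scoring [1]. The chain_score function and the metric names, thresholds, and weights below are hypothetical, not the VGM API.

    def chain_score(metrics, chain):
        # 'metrics' maps metric names to scores in [0, 1]; 'chain' is an
        # ordered list of (metric_name, threshold, weight) triples, with
        # the trusted qualifier metric first.
        qualifier_name, qualifier_threshold, _ = chain[0]
        if metrics[qualifier_name] < qualifier_threshold:
            return 0.0  # qualifier failed: reject without evaluating dependents
        total = 0.0
        for name, threshold, weight in chain:
            if metrics[name] >= threshold:
                total += weight
        return total

    scores = {"volume_corr": 0.92, "color_hist": 0.71, "haralick": 0.55}
    chain = [("volume_corr", 0.8, 0.5), ("color_hist", 0.6, 0.3), ("haralick", 0.6, 0.2)]
    print(chain_score(scores, chain))  # 0.8: qualifier and color_hist pass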
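For the default softmax classifier mentioned under the expanded scoring API, a minimal, numerically stable softmax looks as follows; the per-class hull scores in the example are invented for illustration.

    import math

    def softmax(scores):
        # Numerically stable softmax: shift by the max score before
        # exponentiating, then normalize to a probability distribution.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    # Example: per-class hull scores mapped to class probabilities.
    print(softmax([2.0, 1.0, 0.1]))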
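Centroid alignment of a target volume to a reference volume can be sketched as follows, assuming volumes stored as 3D NumPy arrays; the centroid and centroid_align_offset helpers are illustrative, not VGM functions.

    import numpy as np

    def centroid(volume):
        # Intensity-weighted 3D centroid of a volume, in (z, y, x) order.
        coords = np.indices(volume.shape)
        total = volume.sum()
        return np.array([(axis * volume).sum() / total for axis in coords])

    def centroid_align_offset(reference, target):
        # Offset that moves the target volume's centroid onto the
        # reference volume's centroid prior to a 3D comparison.
        return centroid(reference) - centroid(target)

    ref = np.zeros((8, 8, 8)); ref[2:5, 2:5, 2:5] = 1.0
    tgt = np.zeros((8, 8, 8)); tgt[4:7, 4:7, 4:7] = 1.0
    print(centroid_align_offset(ref, tgt))  # [-2. -2. -2.]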

VGM Cloud Server, API, and iOS App

The VGM open source can be ported and hosted on a cloud server for interfacing to IoT devices, phones, tablets, or other remote devices. The entire VGM can be hosted on any suitable Linux device with adequate compute power. Porting considerations include the size of the VGM feature model used (i.e., the VGM model profile level of detail) and the compute and memory resources needed to meet performance targets. Multicore hosts are recommended for acceleration; the more cores, the better.

A VGM cloud server is available from krigresearch.com, including a remote cloud API. The cloud is designed for cooperative work and contains all the registries for images and model files, agents, and controllers discussed in Chapter 5. A remote device can rely on the cloud server for storage and computations for supplied images.

A simple cloud-based app is available for Apple computers, allowing a nonprogrammer to interactively register images, build and train strands, learn classifiers, and search for strands in other registered images. A basic cloud API using SOAP/REST over HTTP is also available with the app in order to build simple web-based applications (a hypothetical usage sketch appears below). Also, the basic command-line tools described in Chapter 5 are available as an iOS app, which runs on an iMac with at least four cores (the more cores, the better).
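As a hypothetical example of a REST-style call against such a cloud API, the sketch below registers an image and retrieves a server-assigned identifier. The base URL, route, and field names are placeholders, not the actual VGM cloud API; consult the VGM cloud API documentation for the real routes and payloads.

    import requests

    # Placeholder host and route; the actual endpoints are defined in the
    # VGM cloud API documentation.
    BASE_URL = "https://vgm.example.invalid/api"

    def register_image(image_path):
        # POST an image to the (hypothetical) registry route and return
        # the server-assigned image identifier.
        with open(image_path, "rb") as f:
            resp = requests.post(BASE_URL + "/images", files={"image": f})
        resp.raise_for_status()
        return resp.json()["image_id"]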

Licensing, Sponsors, and Partners

The open-source code contains additional documentation and source code examples. Contact http://krigresearch.com to obtain open source code license information, as well as VGM Cloud Server, API, and App licensing information. Potential sponsors and partners for the visual genome project are encouraged to contact Krig Research.
