[1] Moravec HP. Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover. Department of Computer Science, Stanford University; 1980.
[2] Harris C, Stephens M. A combined corner and edge detector. In: Alvey Vision Conference, Manchester, UK; 1988;vol. 15:50.
[3] Rohr K. Localization properties of direct corner detectors. J. Math. Imaging Vision. 1994;4(2):139–150.
[4] Tomasi C, Kanade T. Detection and Tracking of Point Features. Citeseer; 1991.
[5] Shi J, Tomasi C. Good features to track. In: IEEE International Conference on Computer Vision and Pattern Recognition; 1994:593–600.
[6] Kenney CS, Zuliani M, Manjunath BS. An axiomatic approach to corner detection. In: IEEE International Conference on Computer Vision and Pattern Recognition; 2005;vol. 1:191–197.
[7] Schmid C, Mohr R, Bauckhage C. Comparing and evaluating interest points. In: IEEE International Conference on Computer Vision; 1998:230–235.
[8] Lowe DG. Object recognition from local scale-invariant features. In: IEEE International Conference on Computer Vision, Corfu, Greece; 1999:1150–1157.
[9] Lowe DG. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision. 2004;60(2):91–110.
[10] Mikolajczyk K, Schmid C. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 2005;27(10):1615–1630.
[11] Hubel DH, Wiesel TN. Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 1962;160(1):106–154.
[12] Marr D. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. New York, NY, USA: Henry Holt and Co., Inc.; 1982.
[13] Zhu SC, Wu YN, Mumford D. Minimax entropy principle and its application to texture modeling. Neural Comput. 1997;9(8):1627–1660.
[14] Poggio T, Girosi F. Networks for approximation and learning. Proc. IEEE. 1990;78(9):1481–1497.
[15] Li F-F, Perona P. A Bayesian hierarchical model for learning natural scene categories. In: IEEE International Conference on Computer Vision and Pattern Recognition, San Diego, USA; 2005.
[16] Nister D, Stewenius H. Scalable recognition with a vocabulary tree. In: IEEE International Conference on Computer Vision and Pattern Recognition, New York, USA; 2006.
[17] Xie X, Lu L, Jia M, Li H, Seide F, Ma W-Y. Mobile search with multimodal queries. Proc. IEEE. 2008;96(4):589–601.
[18] Ke Y, Sukthankar R. PCA-SIFT: a more distinctive representation for local image descriptors. In: IEEE International Conference on Computer Vision and Pattern Recognition, Washington, DC, USA; 2004:506–513.
[19] Belongie S, Malik J, Puzicha J. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 2002;24(4):509–522.
[20] Lazebnik S, Schmid C, Ponce J. A sparse texture representation using local affine regions. IEEE Trans. Pattern Anal. Mach. Intell. 2005;27(8):1265–1278.
[21] Brown M, Szeliski R, Winder S. Multi-image matching using multi-scale oriented patches. In: IEEE International Conference on Computer Vision and Pattern Recognition, San Diego, USA; 2005:510–517.
[22] Winder S, Brown M. Learning local image descriptors. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[23] Hua G, Brown M, Winder S. Discriminant embedding for local image descriptors. In: IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil; 2007.
[24] Philbin J, Chum O, Isard M, Sivic J, Zisserman A. Object retrieval with large vocabularies and fast spatial matching. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[25] Indyk P, Thaper N. Fast image retrieval via embeddings. In: 3rd International Workshop on Statistical and Computational Theories of Vision, Nice, France; 2003:1–15.
[26] Jegou H, Douze M, Schmid C. Hamming embedding and weak geometric consistency for large scale image search. In: European Conference on Computer Vision, Marseille, France, Springer; 2008:304–317.
[27] Schindler G, Brown M, Szeliski R. City-scale location recognition. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[28] Salton G, Buckley C. Term-Weighting Approaches in Automatic Text Retrieval. San Francisco, USA: Morgan Kaufmann Publishers, Inc.; 1988.
[29] Matas J, Chum O, Urban M, Pajdla T. Robust wide-baseline stereo from maximally stable extremal regions. Image Vision Comput. 2004;22(10):761–767.
[30] Sivic J, Zisserman A. Video Google: a text retrieval approach to object matching in videos. In: IEEE International Conference on Computer Vision, Nice, France; 2003:1470–1477.
[31] Jurie F, Triggs B. Creating efficient codebooks for visual recognition. In: IEEE International Conference on Computer Vision, Beijing, China; 2005:604–610.
[32] Yang J, Jiang Y, Hauptmann AG, Ngo C-W. Evaluating bag-of-visual-words representations in scene classification. In: ACM Multimedia Information Retrieval Conference, Augsburg, Germany; 2007:197–206.
[33] Wang L. Toward a discriminative codebook: codeword selection across multi-resolution. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[34] Leung T, Malik J. Representing and recognizing the visual appearance of materials using 3-D textons. Int. J. Comput. Vision. 2001;43(1):29–44.
[35] Jegou H, Harzallah H, Schmid C. A contextual dissimilarity measure for accurate and efficient image search. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[36] MacKay DJC. Information Theory, Inference and Learning Algorithms. Cambridge, United Kingdom: Cambridge University Press; 2003.
[37] Comaniciu D, Meer P. Mean Shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002;24(5):603–619.
[38] Basu S, Bilenko M, Mooney RJ. A probabilistic framework for semi-supervised clustering. In: ACM Conference on Knowledge and Data Discovery, Seattle, USA; 2004:59–68.
[39] Mairal J, Bach F, Ponce J, Sapiro G, Zisserman A. Supervised dictionary learning. Advances in Neural Information Processing Systems. Vancouver, Canada: Neural Information Processing Systems Foundation; 2007 pp. 481–488.
[40] Lazebnik S, Raginsky M. Supervised learning of quantizer codebooks by information loss minimization. IEEE Trans. Pattern Anal. Mach. Intell. 2009;31(7):1294–1309.
[41] Moosmann F, Triggs B, Jurie F. Fast discriminative visual codebooks using randomized clustering forests. Advances in Neural Information Processing Systems. Vancouver, Canada: Neural Information Processing Systems Foundation; 2006 pp. 481–488.
[42] Perronnin F, Dance C, Csurka G, Bressan M. Adapted vocabularies for generic visual categorization. In: European Conference on Computer Vision, Graz, Austria, Springer; 2006:464–475.
[43] Zhang J, Marszalek M, Lazebnik S, Schmid C. Local features and kernels for classification of texture and object categories: a comprehensive study. Int. J. Comput. Vision. 2007;73(2):213–238.
[44] Liu J, Yang Y, Shah M. Learning semantic visual vocabularies using diffusion distance. In: IEEE International Conference on Computer Vision and Pattern Recognition, Miami, USA; 2009.
[45] Kohonen T. Learning vector quantization for pattern recognition, Technical Report, TKK-F-A601. Helsinki Institute of Technology; 1996.
[46] Kohonen T. Self-Organizing Maps. third ed. Cambridge, United Kingdom: Springer; 2000.
[47] Rao A, Miller D, Rose K, Gersho A. A generalized VQ method for combined compression and estimation. In: IEEE International Conference on Acoustics, Speech and Signal Processing, Toulouse, France; 1996:2032–2035.
[48] Leibe B, Leonardis A, Schiele B. Combined object categorization and segmentation with an implicit shape model. In: European Conference on Computer Vision, Prague, Czech Republic, Springer; 2004:17–23.
[49] Agarwal S, Roth D. Learning a sparse representation for object detection. In: European Conference on Computer Vision, Copenhagen, Denmark, Springer; 2002:97–101.
[50] Bosch A, Zisserman A, Munoz X. Scene classification using a hybrid generative/discriminative approach. IEEE Trans. Pattern Anal. Mach. Intell. 2008;30(4):712–727.
[51] Gionis A, Indyk P, Motwani R. Similarity search in high dimensions via hashing. In: International Conference on Very Large Data Bases, Edinburgh, Scotland, Morgan Kaufmann; 1999:518–529.
[52] Shakhnarovich G, Darrell T, Indyk P. Nearest-Neighbor Methods in Learning and Vision: Theory and Practice. Cambridge, Massachusetts: MIT Press; 2006.
[53] Shakhnarovich G, Viola P, Darrell T. Fast pose estimation with parameter-sensitive hashing. In: International Conference on Computer Vision, Nice, France; 2003:750–757.
[54] Torralba A, Weiss Y, Fergus R. Small codes and large databases of images for object recognition. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Anchorage, United States; 2008.
[55] Weiss Y, Torralba A, Fergus R. Spectral hashing. Advances in Neural Information Processing Systems. Vancouver, Canada: MIT Press; 2008.
[56] Kulis B, Grauman K. Kernelized locality-sensitive hashing for scalable image search. In: IEEE International Conference on Computer Vision, Kyoto, Japan; 2009.
[57] Raginsky M, Lazebnik S. Locality-sensitive binary codes from shift-invariant kernels. Advances in Neural Information Processing Systems. Vancouver, Canada: MIT Press; 2009.
[58] Beis J, Lowe D. Indexing without invariants in 3D object recognition. IEEE Trans. Pattern Anal. Mach. Intell. 1999;21(10):1000–1015.
[59] Arya S, Mount D, Netanyahu N, Silverman R, Wu A. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions. J. ACM. 1998;45(6):891–923.
[60] Liang L, Liu C, Xu Y, Guo B, Shum H. Real-time texture synthesis by patch-based sampling. ACM Trans. Graph. 2001;20(3):127–150.
[61] Hjaltason G, Samet H. Index-driven similarity search in metric spaces. ACM Trans. Database Syst. 2003;28(4):517–580.
[62] Nene S, Nayar S. A simple algorithm for nearest neighbor search in high dimensions. IEEE Trans. Pattern Anal. Mach. Intell. 1997;19(9):989–1003.
[63] Grauman K, Darrell T. Approximate correspondences in high dimensions. Advances in Neural Information Processing Systems. Vancouver, Canada: Neural Information Processing Systems Foundation; 2007 pp. 481–488.
[64] Muja M, Lowe D. Fast approximate nearest neighbors with automatic algorithm configuration. In: IEEE International Conference on Computer Vision Theory and Applications, Lisbon, Portugal; 2009.
[65] Bay H, Tuytelaars T, Van Gool L. SURF: speeded up robust features. In: European Conference on Computer Vision, Graz, Austria, Springer; 2006:404–417.
[66] Csurka G, Bray C, Dance C, Fan L. Visual categorization with bags of keypoints. In: European Conference on Computer Vision, Workshop on Statistical Learning in Computer Vision, Prague, Czech Republic, Springer; 2004:1–22.
[67] Fergus R, Perona P, Zisserman A. Object class recognition by unsupervised scale-invariant learning. In: IEEE International Conference on Computer Vision and Pattern Recognition, Madison, USA; 2003:264–271.
[68] Crandall D, Felzenszwalb P, Huttenlocher D. Spatial priors for part-based recognition using statistical models. In: IEEE International Conference on Computer Vision and Pattern Recognition, San Diego, USA; 2005:10–17.
[69] Sivic J, Zisserman A. Video data mining using configurations of viewpoint invariant regions. In: IEEE International Conference on Computer Vision and Pattern Recognition, Washington, DC, USA; 2004:488–495.
[70] Quack T, Ferrari V, Van Gool L. Video mining with frequent item set configurations. In: International Conference on Content-Based Image and Video Retrieval, Tempe, USA, Springer; 2006:360–369.
[71] Yuan J, Wu Y, Yang M. Discovery of collocation patterns: from visual words to phrase. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[72] Quack T, Ferrari V, Van Gool L. Efficient mining of frequent and distinctive feature configurations. In: IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil; 2007.
[73] Fischler MA, Bolles RC. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM. 1981;24(6):381–395.
[74] Li T, Mei T, Kweon I-S, Hua X-S. Contextual bag-of-words for visual categorization. IEEE Trans. Circuits Syst. Video Technol. 2011;21(4):381–392.
[75] Wu Z, Ke Q, Isard M, Sun J. Bundling features for large scale partial-duplicate web image search. In: IEEE International Conference on Computer Vision and Pattern Recognition, Miami, United States; 2009.
[76] Brin S, Page L. The anatomy of a large-scale hypertextual (web) search engine. In: International World Wide Web Conference; 1998.
[77] Hofmann T. Probabilistic latent semantic indexing. In: ACM International Conference on Information Retrieval; 1999:50–57.
[78] Blei D, Ng AY, Jordan M. Latent Dirichlet allocation. J. Mach. Learn. Res. 2003;3:993–1022.
[79] Harris C, Stephens M. A combined corner and edge detector. In: Alvey Vision Conference, Manchester, UK; 1988:147–152.
[80] Mikolajczyk K, Schmid C. Indexing based on scale invariant interest points. In: IEEE International Conference on Computer Vision, Vancouver, Canada; 2001:525–531.
[81] Mikolajczyk K, Schmid C. Scale and affine invariant interest point detectors. Int. J. Comput. Vision. 2004;60(1):63–86.
[82] Hubel D. Eye, Brain and Vision. New York: Scientific American Library; 1995.
[83] Gazzaniga M, Ivry R, Mangun G. Cognitive Neuroscience: The Biology of the Mind. second ed. New York: W.W. Norton; 2002.
[84] Viola P, Jones M. Rapid object detection using a boosted cascade of simple features. In: IEEE International Conference on Computer Vision and Pattern Recognition, Hawaii, USA; 2001:511–518.
[85] Lin H, Si J, Abousleman GP. Dynamic point selection in image mosaicking. Opt. Eng. 2006;45(3):030501-2–030501-3.
[86] Paletta L, Fritz G, Seifert C. Q-learning of sequential attention for visual object recognition from informative local descriptors. In: International Conference on Machine Learning, Bonn, Germany, International Machine Learning Society; 2005:649–656.
[87] Lazebnik S, Schmid C, Ponce J. Semi-local affine parts for object recognition. In: British Machine Vision Conference, London, United Kingdom, British Machine Vision Association; 2004:959–968.
[88] Bruckstein A, Rivlin E, Weiss I. Scale space semi-local invariants. Image Vision Comput. 1997;15(5):335–344.
[89] Bileschi S, Wolf L. Image representations beyond histograms of gradients: the role of gestalt descriptors. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[90] Torralba A, Oliva A. Contextual guidance of attention in natural scenes: the role of global features on object search. Psychol. Rev. 2006;113(4):766–786.
[91] Torralba A, Murphy KP, Freeman WT. Contextual models for object detection using boosted random fields. Advances in Neural Information Processing Systems. Vancouver, Canada: Neural Information Processing Systems Foundation; 2004 pp. 1401–1408.
[92] Torralba A. Contextual priming for object detection. Int. J. Comput. Vision. 2003;53(2):169–191.
[93] Jegou H, Schmid C, Harzallah H, Verbeek J. Accurate image search using the contextual dissimilarity measure. IEEE Trans. Pattern Anal. Mach. Intell. 2010;32(1):2–11.
[94] Itti L, Koch C, Niebur E. A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 1998;20(11):1254–1259.
[95] Serre T, Wolf L, Bileschi S, Riesenhuber M, Poggio T. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell. 2007;29(3):411–426.
[96] Hou X, Zhang L. Saliency detection: a spectral residual approach. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[97] Jamieson M, Dickinson S, Stevenson S, Wachsmuth S. Using language to drive the perceptual grouping of local image features. In: IEEE International Conference on Computer Vision and Pattern Recognition, New York, USA; 2006.
[98] Fukunaga K. Introduction to Statistical Pattern Recognition. second ed. Boston, MA, USA: Academic Press; 1990.
[99] Huang Y, Shekhar S, Xiong H. Discovering collocation patterns from spatial data sets: a general approach. IEEE Trans. Knowl. Data Eng. 2004;16(12):1472–1485.
[100] Mikolajczyk K, Tuytelaars T, Schmid C, Zisserman A, Matas J, Schaffalitzky F, et al. A comparison of affine region detectors. Int. J. Comput. Vision. 2006;65(1–2):43–72.
[101] Cheng Y. Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 1995;17(8):790–799.
[102] Dundar M, Bi J. Joint optimization of cascaded classifiers for computer aided detection. In: IEEE International Conference on Computer Vision and Pattern Recognition, Minneapolis, USA; 2007.
[103] Torralba A. WordNet Structure in LabelMe. Available from: http://people.csail.mit.edu/torralba/research/LabelMe/wordnet/test.html.
[104] Path Labeling Correspondence Dataset Released in CVPR 2010. Towards semantic embedding in visual vocabulary. Available from: http://vilab.hit.edu.cn/~rrji/index_files/SemanticEmbedding.htm.
[105] Fellbaum C. WordNet: An Electronic Lexical Database. Massachusetts, USA: MIT Press; 1998.
[106] Pedersen T, Patwardhan S, Michelizzi J. WordNet::Similarity: measuring the relatedness of concepts. In: Association for the Advancement of Artificial Intelligence Conference, San Jose, USA, Association for the Advancement of Artificial Intelligence; 2004:1024–1025.
[107] Li W, Sun M. Automatic Image Annotation Based on WordNet and Hierarchical Ensemble. Springer, Computational Linguistics and Intelligent Text Processing; 2006.
[108] Geman S, Geman D. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 1984;6:721–741.
[109] Hammersley JM, Clifford P. Markov fields on finite graphs and lattices. Unpublished manuscript; 1971.
[110] Winn J, Criminisi A, Minka T. Object categorization by learned universal visual dictionary. In: IEEE International Conference on Computer Vision, Beijing, China; 2005.
[111] PASCAL. PASCAL VOC database. Available from: http://www.PASCAL-network.org/challenges/VOC/.
[112] Chen D, Tsai S, Chandrasekhar V, Takacs G, Singh J, Girod B. Tree histogram coding for mobile image matching. DCC. 2009.
[113] Chandrasekhar V, Takacs G, Chen D, Tsai S, Grzeszczuk R, Girod B. CHoG: compressed histogram of gradients, a low bit-rate feature descriptor. CVPR. 2009.
[114] Li F-F, Perona P. A Bayesian hierarchical model for learning natural scene categories. CVPR. 2005.
[115] Bosch A, Zisserman A, Munoz X. Scene classification using a hybrid generative/discriminative approach. PAMI. 2008.
[116] Weiss Y, Torralba A, Fergus R. Spectral hashing. NIPS. 2008.
[117] Hofmann T. Unsupervised learning by probabilistic latent semantic analysis. ML Journal. 2001.
[118] Yang J, Yu K, Gong Y, Huang T. Linear spatial pyramid matching using sparse coding for image classification. CVPR. 2009.
[119] Fergus R, Perona P, Zisserman A. A sparse object category model for efficient learning and exhaustive recognition. CVPR. 2005.
[120] Ji R, Duan L-Y, Chen J, Yao H, Yuan J, Rui Y, Gao W. Location discriminative vocabulary coding for mobile landmark search. IJCV. 2011.
[121] Snavely N, Seitz SM, Szeliski R. PhotoTourism: exploring photo collections in 3D. SIGGRAPH. 2006.
[122] Agrawal R, Imielinski T, Swami AN. Mining association rules between sets of items in large databases. In: ACM Conference on Management of Data, Barcelona, Spain; 1993:207–216.
[123] Ji R, Duan L-Y, Chen J, Gao W. Towards compact topical descriptor. CVPR. 2012.
[124] Tibshirani R. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society. 1996.
[125] Jegou H, Douze M, Schmid C, Perez P. Aggregating local descriptors into a compact image representation. CVPR. 2010.
[126] Winder S, Brown M. Learning local image descriptors. CVPR. 2007.
[127] Salton G, Wong A, Yang CS. A vector space model for automatic indexing. Commun. ACM. 1975;18(11):613–620.
[128] Yang J, Hauptmann A. A text categorization approach to video scene classification using keypoint features. CMU Technical Report; 2006.
[129] Mitra M, Buckley C, Cardie C, Singhal A. An analysis of statistical and syntactic phrases. In: Recherche d’Information Assistée par Ordinateur, New York, USA; 1997:200–217.
[130] ETH-Zurich. Zurich building image database. Available from: http://www.vision.ee.ethz.ch/showroom/zubud/index.en.html.
[131] Shao H, Svoboda T, Ferrari V, Tuytelaars T, Van Gool L. Fast indexing for image retrieval based on local appearance with re-ranking. In: IEEE International Conference on Image Processing, Barcelona, Spain; 2003:737–740.