A note on the digital index

A link in an index entry is displayed as the title of the section in which that entry appears. Because some sections have multiple index markers, it is not unusual for an entry to have several links to the same section. Clicking on any link will take you directly to the place in the text where the marker appears.
A abort() method, HBaseAdmin class, Basic Operations Abortable interface, Basic Operations Accept header, switching REST formats, Supported formats , JSON (application/json) , Protocol Buffer (application/x-protobuf) access control Bigtable column families for, HBase Versus Bigtable coprocessors for, Introduction to Coprocessors ACID properties, The Problem with Relational Database Systems add() method, Bytes class, The Bytes Class add() method, Put class, Single Puts addColumn() method, Get class, Single Gets addColumn() method, HBaseAdmin class, Schema Operations addColumn() method, Increment class, Multiple Counters addColumn() method, Scan class, Introduction addFamily() method, Get class, Single Gets addFamily() method, HTableDescriptor class, Table Properties addFamily() method, Scan class, Introduction , Client API: Best Practices add_peer command, HBase Shell, Replication alter command, HBase Shell, Data definition Amazon data requirements of, The Dawn of Big Data S3 (Simple Storage Service), S3 –S3 Apache Avro (see Avro) Apache binary release for HBase, Apache Binary Release –Apache Binary Release Apache HBase (see HBase) Apache Hive (see Hive) Apache Lucene, Search Integration , Search Integration Apache Maven (see Maven) Apache Pig (see Pig) Apache Solr, Search Integration Apache Whirr, deployment using, Apache Whirr –Apache Whirr Apache ZooKeeper (see ZooKeeper) API (see client API) append feature, for durability, Durability append() method, HLog class, HLog Class architecture, storage (see storage architecture) assign command, HBase Shell, Tools assign() method, HBaseAdmin class, Cluster Operations AssignmentManager class, The Region Life Cycle AsyncHBase client, Other Clients atomic read-modify-write, Dimensions compare-and-delete operations, Atomic compare-and-delete –Atomic compare-and-delete compare-and-set, for put operations, Atomic compare-and-set –Atomic compare-and-set per-row basis for, Tables, Rows, Columns, and Cells , Storage API , General 
Notes row locks for, Row Locks for WAL edits, WALEdit Class auto-sharding, Auto-Sharding –Auto-Sharding Avro, Introduction to REST, Thrift, and Avro –Introduction to REST, Thrift, and Avro , Avro –Operation documentation for, Operation installing, Installation port used by, Operation schema compilers for, Avro schema used by, Advanced Schemas starting server for, Operation stopping, Operation B B+ trees, B+ Trees –B+ Trees backup masters, adding, Adding a local backup master , Adding a backup master –Adding a backup master balancer, Load Balancing –Load Balancing , Node Decommissioning balancer command, HBase Shell, Tools , Load Balancing balancer() method, HBaseAdmin class, Cluster Operations , Load Balancing balanceSwitch() method, HBaseAdmin class, Cluster Operations , Load Balancing balance_switch command, HBase Shell, Tools , Load Balancing , Node Decommissioning base64 command, XML (text/xml) Base64 encoding, with REST, XML (text/xml) , JSON (application/json) BaseEndpointCoprocessor class, The BaseEndpointCoprocessor class –The BaseEndpointCoprocessor class BaseMasterObserver class, The BaseMasterObserver class –The BaseMasterObserver class BaseRegionObserver class, The BaseRegionObserver class –The BaseRegionObserver class Batch class, The CoprocessorProtocol interface , The BaseEndpointCoprocessor class batch clients, Batch Clients batch operations for scans, Caching Versus Batching –Caching Versus Batching , Custom Filters on tables, Batch Operations –Batch Operations batch() method, HTable class, Batch Operations –Batch Operations , Introduction to Counters Bigtable storage architecture, Backdrop , Summary , Nomenclature , HBase Versus Bigtable –HBase Versus Bigtable “Bigtable: A Distributed
Storage System for Structured Data” (paper, by Google), Preface , Backdrop bin directory, Apache Binary Release BinaryComparator class, Comparators BinaryPrefixComparator class, Comparators binarySearch() method, Bytes class, The Bytes Class bioinformatics, data requirements of, The Dawn of Big Data BitComparator class, Comparators block cache, Column Families Bloom filters affecting, Bloom Filters controlling use of, Single Gets , Introduction , Client API: Best Practices enabling and disabling, Column Families metrics for, Region Server Metrics settings for, Configuration block replication, MapReduce Locality –MapReduce Locality blocks, HFile Format –HFile Format compressing, HFile Format size of, Column Families , HFile Format Bloom filters, Column Families , Bloom Filters –Bloom Filters bypass() method, ObserverContext class, The ObserverContext class Bytes class, Single Puts , Single Gets , The Bytes Class –The Bytes Class C caching, Caching Versus Batching (see also block cache; Memcached) regions, The HTable Utility Methods for scan operations, Caching Versus Batching –Caching Versus Batching , Client API: Best Practices , HBase Configuration Properties Cacti server, JMXToolkit on, JMX Remote API call() method, Batch class, The CoprocessorProtocol interface CAP (consistency, availability, and partition tolerance)
theorem, Nonrelational Database Systems, Not-Only SQL or NoSQL? CAS (compare-and-set) for delete operations, Atomic compare-and-delete for put operations, Atomic compare-and-set –Atomic compare-and-set CaS (core aggregation switch), Networking Cascading, Cascading –Cascading causal consistency, Nonrelational Database Systems, Not-Only SQL or NoSQL? CDH3 Hadoop distribution, Hadoop , Cloudera’s Distribution Including Apache Hadoop –Cloudera’s Distribution Including Apache Hadoop cells, Tables, Rows, Columns, and Cells –Tables, Rows, Columns, and Cells timestamp for (see versioning) cellular services, data requirements of, The Dawn of Big Data CentOS, Operating system checkAndDelete() method, HTable class, Atomic compare-and-delete –Atomic compare-and-delete checkAndPut() method, HTable class, Atomic compare-and-set –Atomic compare-and-set checkHBaseAvailable() method, HBaseAdmin
class, Cluster Operations checkTableModifiable() method, MasterServices
class, The MasterCoprocessorEnvironment class Chef, deployment using, Puppet and Chef CLASSPATH variable, Client Configuration clearRegionCache() method, HTable class, The HTable Utility Methods client API, Storage API –Storage API , General Notes batch operations, Batch Operations –Batch Operations byte conversion operations, The Bytes Class –The Bytes Class connection handling, Connection Handling –Connection Handling coprocessors, Coprocessors –The BaseEndpointCoprocessor class counters, Counters –Multiple Counters delete method, Delete Method –Atomic compare-and-delete filters, Filters –Filters Summary get method, Get Method –Related retrieval methods HTablePool class, HTablePool –HTablePool put method, Put Method –Atomic compare-and-set row locks, Row Locks –Row Locks scan operations, Scans –Caching Versus Batching utility methods, The HTable Utility Methods –The HTable Utility Methods client library, Implementation client-managed search integration, Search Integration client-managed secondary indexes, Secondary Indexes client-side write buffer (see write buffer) clients, Introduction to REST, Thrift, and Avro –Introduction to REST, Thrift, and Avro (see also HBase Shell; web-based UI for HBase) batch, Batch Clients –Cascading configuration for, Client Configuration interactive, Interactive Clients –Other Clients Clojure-based MapReduce API, Clojure close() method, HBaseAdmin class, Basic Operations close() method, HTable class, The HTable Utility Methods close() method, ResultScanner class, The ResultScanner Class closeRegion() method, HBaseAdmin class, Cluster Operations closeTablePool() method, HTablePool class, HTablePool close_region command, HBase Shell, Tools Cloudera’s Distribution including Apache Hadoop, Cloudera’s Distribution Including Apache Hadoop –Cloudera’s Distribution Including Apache Hadoop CloudStore filesystem, Other Filesystems cluster monitoring (see monitoring systems) operations on, Cluster Operations –Cluster Operations shutting down, 
Cluster Operations starting, Quick-Start Guide , Running and Confirming Your Installation status information for, Web-based UI Introduction , Cluster Status Information –Cluster Status Information , Main page –Main page status of, Cluster Operations stopping, Quick-Start Guide , Stopping the Cluster two, coexisting, Coexisting Clusters –Coexisting Clusters ClusterStatus class, Cluster Status Information –Cluster Status Information , General CMS (Concurrent Mark-Sweep Collector), Garbage Collection Tuning Codd’s 12 rules, The Dawn of Big Data column families, Tables, Rows, Columns, and Cells , Table Properties , Concepts –Concepts adding, Schema Operations block cache for, Column Families block size for, Column Families Bloom filters for, Column Families compression for, Column Families deleting, Schema Operations , Data manipulation in-memory blocks for, Column Families maximum number of versions for, Column Families modifying structure of, Schema Operations name for, Column Families , Column Families , Column Families replication scope for, Column Families time-to-live (TTL) for, Column Families column family descriptors, Column Families –Column Families , Schema Operations column keys, Concepts , Time-Ordered Relations –Time-Ordered Relations column qualifiers, Column Families , Concepts column-oriented databases, The Dawn of Big Data ColumnCountGetFilter class, ColumnCountGetFilter , Filters Summary ColumnPaginationFilter class, ColumnPaginationFilter –ColumnPaginationFilter , Filters Summary , Pagination ColumnPrefixFilter class, ColumnPrefixFilter , Filters Summary columns, Tables, Rows, Columns, and Cells –Tables, Rows, Columns, and Cells commas, in HBase Shell, Commands commit log (see WAL) commodity hardware, Hardware compact command, HBase Shell, Tools compact() method, HBaseAdmin class, Cluster Operations compacting collections, reducing, Memstore-Local Allocation Buffer compaction, Implementation , Compactions –Compactions major compaction, 
Implementation , Compactions , Enabling Compression managed, with splitting, Managed Splitting –Managed Splitting metrics for, Region Server Metrics minor compaction, Implementation , Compactions performing, Cluster Operations , Cluster Operations , Tools , User Table page properties for, HBase Configuration Properties , HBase Configuration Properties compaction.dir file, Region-level files comparators, for filters, Comparators –Comparators CompareFilter class, The filter hierarchy , Comparison Filters compareTo() method, Bytes class, The Bytes Class comparison filters, Comparison Filters –DependentColumnFilter comparison operators, for filters, Comparison operators complete() method, ObserverContext
class, The ObserverContext class completebulkload tool, Bulk load procedure , Using the completebulkload Tool CompositeContext class, Contexts, Records, and Metrics compression, Dimensions , Compression –Enabling Compression algorithms for, Available Codecs –GZIP for column
families, Column Families enabling, Enabling Compression –Enabling Compression settings for, Compression verifying installation of, Verifying Installation –Startup check CompressionTest tool, Compression test tool Concurrent Mark-Sweep Collector (CMS), Garbage Collection Tuning concurrent mode failure, Garbage Collection Tuning conf directory, Apache Binary Release configuration, Configuration –Client Configuration accessing from client code, Single Puts , The HTable Utility Methods caching, Caching Versus Batching client-side write buffer, Client-side write buffer clients, Client Configuration coexisting clusters, Coexisting Clusters , Coexisting Clusters coprocessors enabling, The BaseRegionObserver class loading, Loading from the configuration –Loading from the configuration data directory, Quick-Start Guide file descriptor limits, File handles and process limits fully distributed mode, Fully distributed mode garbage collection, Garbage Collection Tuning HBase Shell, Basics Java, Java , Run Modes lock timeout, Row Locks performance tuning, Configuration –Configuration ports, for web-based UI, Master UI properties, list of, HBase Configuration Properties –HBase Configuration Properties pseudodistributed mode, Pseudodistributed mode replication, Replication swapping, Swappiness ZooKeeper, ZooKeeper setup , ZooKeeper setup , Using the existing ZooKeeper ensemble , Configuration Configuration class, Single Puts configureIncrementalLoad() method,
HFileOutputFormat class, Bulk load procedure connection handling, Connection Handling –Connection Handling consistency models, Nonrelational Database Systems, Not-Only SQL or NoSQL? , Dimensions (see also CAP theorem) constructors, parameterless, Tables contact information for this book, How to Contact Us containsColumn() method, Result class, The Result class Content-Type header, switching REST formats in, Supported formats conventions used in this book, Conventions Used in This Book Coprocessor interface, Introduction to Coprocessors –The Coprocessor Class CoprocessorEnvironment class, The Coprocessor Class coprocessorExec() method, HTable class, The CoprocessorProtocol interface CoprocessorProtocol interface, The CoprocessorProtocol interface –The CoprocessorProtocol interface coprocessorProxy() method, HTable class, The CoprocessorProtocol interface coprocessors, Storage API , Coprocessors –The BaseEndpointCoprocessor class endpoint coprocessors, Introduction to Coprocessors , Endpoints –The BaseEndpointCoprocessor class loading, Coprocessor Loading –Loading from the table descriptor observer coprocessors, Introduction to Coprocessors , The RegionObserver Class –The BaseMasterObserver class priority of, The Coprocessor Class search integration using, Search Integration secondary indexes using, Secondary Indexes state of, The Coprocessor Class CopyTable tool, CopyTable Tool –CopyTable Tool core aggregation switch (CaS), Networking .corrupt directory, Root-level files , Log splitting count command, HBase Shell, Data manipulation counters, Counters –Multiple Counters encoding and decoding, Introduction to Counters incrementing, Introduction to Counters –Introduction to Counters , Introduction to Counters , Single Counters , Multiple Counters –Multiple Counters , Data manipulation initializing, Introduction to Counters multiple counters, Multiple Counters –Multiple Counters retrieving, Introduction to Counters , Introduction to Counters , Data manipulation single 
counters, Single Counters –Single Counters CPU requirements for, Servers utilization of, Garbage collection/memory tuning create command, HBase Shell, Quick-Start Guide , Shell Introduction , Data definition , Presplitting Regions create() method, HBaseConfiguration class, Single Puts createAndPrepare() method, ObserverContext
class, The ObserverContext class createRecordReader() method, TableInputFormat
class, Table Splits createTable() method, HBaseAdmin class, Table Operations –Table Operations , Presplitting Regions createTableAsync() method, HBaseAdmin class, Table Operations , Table Operations Crossbow project, The Dawn of Big Data CRUD operations, CRUD Operations –Atomic compare-and-delete delete method, Delete Method –Atomic compare-and-delete get method, Get Method –Related retrieval methods put method, Put Method –Atomic compare-and-set curl command, Operation D data directory, setting, Quick-Start Guide data locality, MapReduce Locality –MapReduce Locality data models, Dimensions database normalization, Tables databases access requirements for, The Dawn of Big Data –The Dawn of Big Data classifying, dimensions for, Dimensions –Dimensions column-oriented (see column-oriented databases) consistency models for, Nonrelational Database Systems, Not-Only SQL or NoSQL? denormalizing, Database (De-)Normalization , Time-Ordered Relations nonrelational (see NoSQL database systems) quantity requirements for, The Dawn of Big Data –The Dawn of Big Data relational (see RDBMS) scalability of, Scalability –Scalability sharding, The Problem with Relational Database Systems , Scalability , Auto-Sharding –Auto-Sharding datanode handlers, Datanode handlers , DataNode connections DDI (Denormalization, Duplication, and Intelligent
Keys), Database (De-)Normalization deadlocks, Dimensions Debian, Operating system debug command, HBase Shell, Basics DEBUG logging level, Changing Logging Levels debugging, Changing Logging Levels (see also troubleshooting) debug mode for, Basics logging level for, Changing Logging Levels text representations of data for, The Result class thread dumps for, Shared Pages decorating filters, Decorating Filters –WhileMatchFilter dedicated filters, Dedicated Filters –RandomRowFilter Delete class, Single Deletes –Single Deletes delete command, HBase Shell, Quick-Start Guide , Data manipulation delete marker, Implementation , Log-Structured Merge-Trees Delete type, KeyValue class, The KeyValue class delete() method, HTable class, Delete Method –Atomic compare-and-delete (see also checkAndDelete() method, HTable class) for multiple operations, List of Deletes –List of Deletes for single operations, Single Deletes –Single Deletes deleteall command, HBase Shell, Data manipulation deleteAllConnections() method, HConnectionManager
class, Connection Handling DeleteColumn type, KeyValue class, The KeyValue class deleteColumn() method, Delete class, Single Deletes deleteColumn() method, HBaseAdmin class, Schema Operations deleteColumns() method, Delete class, Single Deletes deleteConnection() method, HConnectionManager class, Connection Handling DeleteFamily type, KeyValue class, The KeyValue class deleteFamily() method, Delete class, Single Deletes deleteTable() method, HBaseAdmin class, Table Operations Delicious RSS feed, Data Sink Denormalization, Duplication, and Intelligent
Keys (see DDI) DependentColumnFilter class, DependentColumnFilter –DependentColumnFilter , Filters Summary describe command, HBase Shell, Data definition disable command, HBase Shell, Quick-Start Guide , Data definition disableTable() method, HBaseAdmin class, Table Operations disableTableAsync() method, HBaseAdmin class, Table Operations disable_peer command, HBase Shell, Replication disks, requirements for, Servers distcp command, Hadoop, Import and Export Tools distributed mode, Run Modes , Distributed Mode –Using the existing ZooKeeper ensemble adding servers in, Fully distributed cluster –Adding a region server distributions of HBase, Distributions DNS (Domain Name Service), requirements for, Domain Name Service docs directory, Apache Binary Release drop command, HBase Shell, Quick-Start Guide , Data definition durability of data, Durability –Durability dynamic provisioning, for MapReduce, Dynamic Provisioning –Dynamic Provisioning E empty qualifier, Tall-Narrow Versus Flat-Wide Tables enable command, HBase Shell, Data definition enableTable() method, HBaseAdmin class, Table Operations enableTableAsync() method, HBaseAdmin class, Table Operations enable_peer command, HBase Shell, Replication endpoint coprocessors, Introduction to Coprocessors , Endpoints –The BaseEndpointCoprocessor class environmental companies, data requirements of, The Dawn of Big Data EQUAL operator, Comparison operators equals() method, Bytes class, The Bytes Class equals() method, HTableDescriptor class, Table Operations ERD (entity relationship diagram), for Hush, Database (De-)Normalization –Database (De-)Normalization error messages in logfiles, Analyzing the Logs –Analyzing the Logs Ethernet card, requirements for, Servers Evans, Eric (coined “NoSQL”), Nonrelational Database Systems, Not-Only SQL or NoSQL? eventual consistency, Nonrelational Database Systems, Not-Only SQL or NoSQL? “Eventually Consistent”
(article, by Werner Vogels), Nonrelational Database Systems, Not-Only SQL or NoSQL? examples in this book, Building the Examples –Building the Examples (see also Hush (HBase URL Shortener)) building, Building the Examples –Building the Examples location of, Building the Examples permission to use, Using Code Examples running, Building the Examples exists command, HBase Shell, Data definition exists() method, HTable class, Related retrieval methods exit command, HBase Shell, Quick-Start Guide , Basics Export tool, Import and Export Tools –Import and Export Tools ext3 filesystem, Filesystem ext4 filesystem, Filesystem F Facebook data requirements of, The Dawn of Big Data Thrift (see Thrift) failure handling, Dimensions FamilyFilter class, FamilyFilter –FamilyFilter , Filters Summary familySet() method, Get class, Single Gets familySet() method, Increment class, Multiple Counters Fedora, Operating system file handles, File handles and process limits –File handles and process limits , File handles file info blocks, HFile Format FileContext class, Contexts, Records, and Metrics filesystem for HBase, Filesystems for HBase –Other Filesystems for operating system, Filesystem –Filesystem Filter interface, Introduction to Filters –The filter hierarchy , Custom Filters –Custom Filters filterAllRemaining() method, Filter
interface, Custom Filters FilterBase class, The filter hierarchy filterKeyValue() method, Filter interface, Custom Filters FilterList class, FilterList –FilterList , Filters Summary filterRow() method, Filter interface, Custom Filters , Custom Filters filterRowKey() method, Filter interface, Custom Filters filters, Introduction to Filters –Filters Summary Bloom filters, Column Families comparators for, Comparators –Comparators comparison filters, Comparison Filters –DependentColumnFilter comparison operators for, Comparison operators custom, Custom Filters –Custom Filters decorating filters, Decorating Filters –WhileMatchFilter dedicated filters, Dedicated Filters –RandomRowFilter list of, showing features, Filters Summary –Filters Summary multiple, applying to data, FilterList –FilterList financial companies, data requirements of, The Dawn of Big Data FirstKeyOnlyFilter class, FirstKeyOnlyFilter , Filters Summary flush command, HBase Shell, Tools flush() method, HBaseAdmin class, Cluster Operations flushCommits() method, HTable class, Client-side write buffer , Client API: Best Practices fonts used in this book, Conventions Used in This Book for loop, Shell Introduction forMethod() method, Batch class, The BaseEndpointCoprocessor class fully distributed mode, Fully distributed mode –Using the existing ZooKeeper ensemble G Ganglia, Introduction , Ganglia –Usage installing, Installation –HBase-related steps versions of, Ganglia web-based frontend, Usage –Usage web-based frontend for, Ganglia , Ganglia web frontend GangliaContext class, Contexts, Records, and Metrics , HBase-related steps –HBase-related steps garbage collection CPU requirements for, Servers metrics for, JVM Metrics performance tuning for, Garbage Collection Tuning –Garbage Collection Tuning , Garbage collection/memory tuning genomics, data requirements of, The Dawn of Big Data Get class, Single Gets –Single Gets (see also Result class) get command, HBase Shell, Quick-Start Guide , Commands , Data 
manipulation
get operations, Get Method–Related retrieval methods, Read Path–Read Path (see also scan operations)
get() method, HTable class, Single Gets–The Result class
    filters for (see filters)
    list-based, List of Gets–List of Gets
get() method, Put class, Single Puts
getAssignmentManager() method, MasterServices class, The MasterCoprocessorEnvironment class
getAverageLoad() method, ClusterStatus class, Cluster Status Information
getBatch() method, Scan class, Caching Versus Batching
getBlocksize() method, HColumnDescriptor class, Column Families
getBloomFilterType() method, HColumnDescriptor class, Column Families
getBuffer() method, KeyValue class, The KeyValue class
getCacheBlocks() method, Get class, Single Gets
getCacheBlocks() method, Scan class, Introduction
getCaching() method, Scan class, Caching Versus Batching
getClusterId() method, ClusterStatus class, Cluster Status Information
getClusterStatus() method, HBaseAdmin class, Cluster Operations, Cluster Status Information
getColumn() method, Result class, The Result class
getColumnFamilies() method, HTableDescriptor class, Table Properties
getColumnLatest() method, Result class, The Result class
getCompactionCompression() method, HColumnDescriptor class, Column Families
getCompactionCompressionType() method, HColumnDescriptor class, Column Families
getCompactionRequester() method, RegionServerServices class, The RegionCoprocessorEnvironment class
getCompression() method, HColumnDescriptor class, Column Families
getCompressionType() method, HColumnDescriptor class, Column Families
getConfiguration() method, HBaseAdmin class, Basic Operations
getConfiguration() method, HTable class, The HTable Utility Methods
getConnection() method, HBaseAdmin class, Basic Operations
getConnection() method, HConnectionManager class, Connection Handling
getDeadServerNames() method, ClusterStatus class, Cluster Status Information
getDeadServers() method, ClusterStatus class, Cluster Status Information
getEndKeys() method, HTable class, The HTable Utility Methods
getEnvironment() method, ObserverContext class, The ObserverContext class
getExecutorService() method, MasterServices class, The MasterCoprocessorEnvironment class
getFamilies() method, Scan class, Introduction
getFamily() method, HTableDescriptor class, Table Properties
getFamilyMap() method, Delete class, Single Deletes
getFamilyMap() method, Get class, Single Gets
getFamilyMap() method, Increment class, Multiple Counters
getFamilyMap() method, Put class, Single Puts
getFamilyMap() method, Result class, The Result class
getFamilyMap() method, Scan class, Introduction
getFilter() method, Get class, Single Gets
getFilter() method, Scan class, Introduction
getFlushRequester() method, RegionServerServices class, The RegionCoprocessorEnvironment class
getHBaseVersion() method, ClusterStatus class, Cluster Status Information
getHBaseVersion() method, CoprocessorEnvironment class, The Coprocessor Class
getHostAndPort() method, ServerName class, Cluster Status Information
getHostname() method, ServerName class, Cluster Status Information
getInstance() method, CoprocessorEnvironment class, The Coprocessor Class
getKey() method, KeyValue class, The KeyValue class
getLength() method, KeyValue class, The KeyValue class
getLoad() method, ClusterStatus class, Cluster Status Information, Cluster Status Information
getLoad() method, HServerLoad class, Cluster Status Information
getLoadSequence() method, CoprocessorEnvironment class, The Coprocessor Class
getLockId() method, Delete class, Single Deletes
getLockId() method, Get class, Single Gets
getLockId() method, Increment class, Multiple Counters
getLockId() method, Put class, Single Puts
getMap() method, Result class, The Result class
getMaster() method, HBaseAdmin class, Basic Operations
getMasterFileSystem() method, MasterServices class, The MasterCoprocessorEnvironment class
getMasterServices() method, MasterCoprocessorEnvironment class, The MasterCoprocessorEnvironment class
getMaxFileSize() method, HTableDescriptor class, Table Properties
getMaxHeapMB() method, HServerLoad class, Cluster Status Information
getMaxVersions() method, HColumnDescriptor class, Column Families
getMaxVersions() method, Scan class, Introduction
getMemStoreFlushSize() method, HTableDescriptor class, Table Properties
getMemStoreSizeInMB() method, HServerLoad class, Cluster Status Information
getMemStoreSizeMB() method, RegionLoad class, Cluster Status Information
getName() method, HTableDescriptor class, Table Properties
getName() method, RegionLoad class, Cluster Status Information
getNameAsString() method, RegionLoad class, Cluster Status Information
getNoVersionMap() method, Result class, The Result class
getNumberofRegions() method, HServerLoad class, Cluster Status Information
getNumberOfRequests() method, HServerLoad class, Cluster Status Information
getOffset() method, KeyValue class, The KeyValue class
getPort() method, ServerName class, Cluster Status Information
getPriority() method, CoprocessorEnvironment class, The Coprocessor Class
getReadRequestsCount() method, RegionLoad class, Cluster Status Information
getRegion() method, RegionCoprocessorEnvironment class, The RegionCoprocessorEnvironment class
getRegionCachePrefetch() method, HTable class, The HTable Utility Methods
getRegionLocation() method, HTable class, The HTable Utility Methods
getRegionsCount() method, ClusterStatus class, Cluster Status Information
getRegionServerAccounting() method, RegionServerServices class, The RegionCoprocessorEnvironment class
getRegionServerServices() method, RegionCoprocessorEnvironment class, The RegionCoprocessorEnvironment class
getRegionsInfo() method, HTable class, The HTable Utility Methods
getRegionsInTransition() method, ClusterStatus class, Cluster Status Information
getRegionsLoad() method, HServerLoad class, Cluster Status Information
getRequestsCount() method, ClusterStatus class, Cluster Status Information
getRequestsCount() method, RegionLoad class, Cluster Status Information
getRow() method, Delete class, Single Deletes
getRow() method, Get class, Single Gets
getRow() method, Increment class, Multiple Counters
getRow() method, KeyValue class, The KeyValue class
getRow() method, Put class, Single Puts
getRow() method, Result class, The Result class
getRowLock() method, Delete class, Single Deletes
getRowLock() method, Get class, Single Gets
getRowLock() method, Increment class, Multiple Counters
getRowLock() method, Put class, Single Puts
getRowOrBefore() method, HTable class, Related retrieval methods
getRpcMetrics() method, RegionServerServices class, The RegionCoprocessorEnvironment class
getScanner() method, HTable class, Introduction
getScannerCaching() method, HTable class, Caching Versus Batching
getScope() method, HColumnDescriptor class, Column Families
getServerManager() method, MasterServices class, The MasterCoprocessorEnvironment class
getServerName() method, ServerName class, Cluster Status Information
getServers() method, ClusterStatus class, Cluster Status Information, Cluster Status Information
getServersSize() method, ClusterStatus class, Cluster Status Information
getSplits() method, TableInputFormat class, Table Splits
getStartcode() method, ServerName class, Cluster Status Information
getStartEndKeys() method, HTable class, The HTable Utility Methods
getStartKeys() method, HTable class, The HTable Utility Methods
getStartRow() method, Scan class, Introduction
getStorefileIndexSizeInMB() method, HServerLoad class, Cluster Status Information
getStorefileIndexSizeMB() method, RegionLoad class, Cluster Status Information
getStorefiles() method, HServerLoad class, Cluster Status Information
getStorefiles() method, RegionLoad class, Cluster Status Information
getStorefileSizeInMB() method, HServerLoad class, Cluster Status Information
getStorefileSizeMB() method, RegionLoad class, Cluster Status Information
getStores() method, RegionLoad class, Cluster Status Information
getTable() method, CoprocessorEnvironment class, The Coprocessor Class
getTable() method, HTablePool class, HTablePool
getTableDescriptor() method, HBaseAdmin class, Table Operations
getTableDescriptor() method, HTable class, The HTable Utility Methods
getTableName() method, HTable class, The HTable Utility Methods
getters, Table Properties
getTimeRange() method, Get class, Single Gets
getTimeRange() method, Increment class, Multiple Counters
getTimeRange() method, Scan class, Introduction
getTimeStamp() method, Delete class, Single Deletes
getTimeStamp() method, Put class, Single Puts
getUsedHeapMB() method, HServerLoad class, Cluster Status Information
getValue() method, HTableDescriptor class, Table Properties
getValue() method, Result class, The Result class
getVersion() method, ClusterStatus class, Cluster Status Information
getVersion() method, CoprocessorEnvironment class, The Coprocessor Class
getVersion() method, HServerLoad class, Cluster Status Information
getWAL() method, RegionServerServices class, The RegionCoprocessorEnvironment class
getWriteBuffer() method, HTable class, List of Puts
getWriteRequestsCount() method, RegionLoad class, Cluster Status Information
getWriteToWAL() method, Increment class, Multiple Counters
getWriteToWAL() method, Put class, Single Puts
get_counter command, Introduction to Counters
get_counter command, HBase Shell, Data manipulation
GFS (Google File System), Backdrop
Git, requirements for, Building the Examples
GitHub, Building the Examples
Global Biodiversity Information Facility, The Dawn of Big Data
gmetad (Ganglia meta daemon), Ganglia, Ganglia meta daemon–Ganglia meta daemon
gmond (Ganglia monitoring daemon), Ganglia, Ganglia monitoring daemon–Ganglia monitoring daemon
Google
    “Bigtable: A Distributed Storage System for Structured Data” (paper), Backdrop
    data requirements of, The Dawn of Big Data
    file system developed by, Backdrop
    “The Google File System” (paper), Backdrop
    “MapReduce: Simplified Data Processing on Large Clusters” (paper), Backdrop
    Protocol Buffers (see Protocol Buffers)
    “Bigtable: A Distributed Storage System for Structured Data”
(paper), Preface graphing tools, Introduction (see also Ganglia) GREATER operator, Comparison operators GREATER_OR_EQUAL operator, Comparison operators Grunt shell, Pig –Pig GZIP algorithm, Available Codecs , GZIP H Hadoop, The Dawn of Big Data –The Dawn of Big Data building, Hadoop requirements for, Hadoop –Hadoop Hadoop Distributed File System (see HDFS) hadoop-env.sh file, Static Provisioning Hadoop: The Definitive Guide (O’Reilly), Hardware hard drives, requirements for, Servers hardware requirements, Hardware –Networking has() method, Put class, Single Puts hasFamilies() method, Get class, Single Gets hasFamilies() method, Increment class, Multiple Counters hasFamilies() method, Scan class, Introduction hasFamily() method, HTableDescriptor class, Table Properties HAvroBase, Advanced Schemas HBase, Building Blocks –Summary (see also client API; cluster; configuration) building from source, Building from Source compared to Bigtable, HBase Versus Bigtable –HBase Versus Bigtable configuration, Configuration –Client Configuration deployment, Deployment –Puppet and Chef distributed mode, Run Modes , Distributed Mode –Using the existing ZooKeeper ensemble distributions of, Distributions hardware requirements for, Hardware –Networking history of, Backdrop –Backdrop , History –History implementation of, Implementation –Implementation installing, Quick-Start Guide –Quick-Start Guide , Installation Choices –Building from Source nomenclature of, compared to Bigtable, Nomenclature software requirements, Software –Windows standalone mode, Quick-Start Guide , Run Modes , Standalone Mode starting, Quick-Start Guide , Running and Confirming Your Installation stopping, Quick-Start Guide , Stopping the Cluster storage architecture, Storage –KeyValue Format structural units of, Tables, Rows, Columns, and Cells –Auto-Sharding upgrading from previous releases, Upgrade from Previous Releases –Upgrading to HBase 0.92.0 versions of, Road Map –HBase 0.94.0 determining, Cluster Status 
Information in this book, HBase Version metrics for, Info Metrics numbering of, History supported by Hive, Hive web-based UI for, Web-based UI Introduction , Web-based UI –Shared Pages HBase Shell, Quick-Start Guide , Shell Introduction , Shell –Scripting administrative commands, Tools –Tools cluster status, General command syntax, Commands command-line options, Basics commas in, Commands configuration, Basics data definition commands, Data definition –Data definition data manipulation commands, Data manipulation debug mode, Basics exiting, Basics formatting for, Basics help for, Basics , Commands parameters in, Commands quotes in, Commands replication commands, Replication restricting output from, Commands Ruby hashes in, Commands scripting in, Scripting –Scripting starting, Basics version of cluster, General hbase-default.xml file, hbase-site.xml and hbase-default.xml , Single Puts (see also configuration) HBase-DSL client, Other Clients hbase-env.sh file, Configuration , hbase-env.sh , hbase-env.sh (see also configuration) HBase-Runner project, Clojure hbase-site.xml file, ZooKeeper setup , hbase-site.xml and hbase-default.xml , hbase-site.xml , Single Puts , HBase Configuration Properties –HBase Configuration Properties (see also configuration) hbase-webapps directory, Apache Binary Release hbase.balancer.max.balancing property, Load Balancing hbase.balancer.period property, Load Balancing , HBase Configuration Properties hbase.client.keyvalue.maxsize property, HBase Configuration Properties hbase.client.pause property, HBase Configuration Properties hbase.client.retries.number property, Batch Operations , HBase Configuration Properties hbase.client.scanner.caching property, HBase Configuration Properties hbase.client.write.buffer property, Client-side write buffer , HBase Configuration Properties hbase.cluster.distributed property, Fully distributed mode , HBase Configuration Properties hbase.coprocessor.master.classes property, Loading from the configuration 
, Loading from the configuration , HBase Configuration Properties hbase.coprocessor.region.classes
property, Loading from the configuration , HBase Configuration Properties hbase.coprocessor.wal.classes property, Loading from the configuration hbase.defaults.for.version.skip property, HBase Configuration Properties hbase.extendedperiod property, Contexts, Records, and Metrics hbase.hash.type property, HBase Configuration Properties hbase.hlog.split.skip.errors property, Log splitting hbase.hregion.majorcompaction property, Compactions , Presplitting Regions , HBase Configuration Properties hbase.hregion.majorcompaction.jitter property, Compactions hbase.hregion.max.filesize property, Region splits , Managed Splitting , Configuration , HBase Configuration Properties hbase.hregion.memstore.block.multiplier
property, Configuration , HBase Configuration Properties hbase.hregion.memstore.flush.size property, Write Path , Garbage Collection Tuning , HBase Configuration Properties hbase.hregion.memstore.mslab.chunksize property, Memstore-Local Allocation Buffer hbase.hregion.memstore.mslab.enabled property, Memstore-Local Allocation Buffer , HBase Configuration Properties hbase.hregion.memstore.mslab.max.allocation
property, Memstore-Local Allocation Buffer hbase.hregion.preclose.flush.size property, Write Path , HBase Configuration Properties hbase.hstore.blockingStoreFiles property, Configuration , HBase Configuration Properties hbase.hstore.blockingWaitTime property, HBase Configuration Properties hbase.hstore.compaction.max property, Compactions , HBase Configuration Properties hbase.hstore.compaction.max.size property, Compactions hbase.hstore.compaction.min property, Compactions hbase.hstore.compaction.min.size property, Compactions hbase.hstore.compaction.ratio property, Compactions hbase.hstore.compactionThreshold property, Compactions , HBase Configuration Properties hbase.id file, Root-level files hbase.mapreduce.hfileoutputformat.blocksize
property, HBase Configuration Properties hbase.master.cleaner.interval property, Root-level files hbase.master.distributed.log.splitting
property, Log splitting hbase.master.dns.interface property, HBase Configuration Properties hbase.master.dns.nameserver property, HBase Configuration Properties hbase.master.info.bindAddress property, HBase Configuration Properties hbase.master.info.port property, Required Ports , HBase Configuration Properties hbase.master.kerberos.principal property, HBase Configuration Properties hbase.master.keytab.file property, HBase Configuration Properties hbase.master.logcleaner.plugins property, HBase Configuration Properties hbase.master.logcleaner.ttl property, Root-level files , HBase Configuration Properties hbase.master.port property, Required Ports , HBase Configuration Properties hbase.regions.slop property, HBase Configuration Properties hbase.regionserver.class property, HBase Configuration Properties hbase.regionserver.codecs property, Startup check hbase.regionserver.dns.interface property, Domain Name Service , HBase Configuration Properties hbase.regionserver.dns.nameserver property, Domain Name Service , HBase Configuration Properties hbase.regionserver.global.memstore.lowerLimit
property, Configuration , HBase Configuration Properties hbase.regionserver.global.memstore.upperLimit
property, Configuration , HBase Configuration Properties hbase.regionserver.handler.count property, Client-side write buffer , Configuration , HBase Configuration Properties hbase.regionserver.hlog.blocksize property, LogRoller Class hbase.regionserver.hlog.reader.impl property, HBase Configuration Properties hbase.regionserver.hlog.splitlog.writer.threads
property, Log splitting hbase.regionserver.hlog.writer.impl property, HBase Configuration Properties hbase.regionserver.info.bindAddress property, HBase Configuration Properties hbase.regionserver.info.port property, Required Ports , HBase Configuration Properties hbase.regionserver.info.port.auto property, HBase Configuration Properties hbase.regionserver.kerberos.principal property, HBase Configuration Properties hbase.regionserver.keytab.file property, HBase Configuration Properties hbase.regionserver.lease.period property, HBase Configuration Properties hbase.regionserver.logroll.multiplier property, LogRoller Class hbase.regionserver.logroll.period property, LogRoller Class , HBase Configuration Properties hbase.regionserver.maxlogs property, Keeping track of logs , Configuration hbase.regionserver.msginterval property, Cluster Status Information , HBase Configuration Properties hbase.regionserver.nbreservationblocks property, HBase Configuration Properties hbase.regionserver.optionallogflushinterval
property, LogSyncer Class , HBase Configuration Properties hbase.regionserver.port property, Required Ports , HBase Configuration Properties hbase.regionserver.regionSplitLimit property, HBase Configuration Properties hbase.replication property, Replication hbase.rest.port property, HBase Configuration Properties hbase.rest.readonly property, HBase Configuration Properties hbase.rootdir property, Quick-Start Guide , Pseudodistributed mode , HBase Configuration Properties hbase.rpc.engine property, HBase Configuration Properties hbase.server.thread.wakefrequency property, Compactions , HBase Configuration Properties hbase.server.thread.wakefrequency.multiplier
property, Compactions hbase.skip.errors property, Edits recovery hbase.tmp.dir property, HBase Configuration Properties hbase.version file, Root-level files hbase.zookeeper.dns.interface property, HBase Configuration Properties hbase.zookeeper.dns.nameserver property, HBase Configuration Properties hbase.zookeeper.leaderport property, HBase Configuration Properties hbase.zookeeper.peerport property, HBase Configuration Properties hbase.zookeeper.property property prefix, ZooKeeper setup hbase.zookeeper.property.clientPort property, ZooKeeper setup , ZooKeeper setup , Choosing region servers to replicate to , HBase Configuration Properties hbase.zookeeper.property.dataDir property, ZooKeeper setup , HBase Configuration Properties hbase.zookeeper.property.initLimit property, HBase Configuration Properties hbase.zookeeper.property.maxClientCnxns property, HBase Configuration Properties hbase.zookeeper.property.syncLimit property, HBase Configuration Properties hbase.zookeeper.quorum property, ZooKeeper setup , ZooKeeper setup , Client Configuration , Basics , Choosing region servers to replicate to , HBase Configuration Properties HBaseAdmin class, HBaseAdmin –Cluster Status Information HBaseConfiguration class, Single Puts HBaseFsck class, HBase Fsck HBaseHelper class used in examples, Building the Examples HBasene, Search Integration –Search Integration HBaseStorage class, Pig HBASE_CLASSPATH variable, hbase-site.xml and hbase-default.xml HBASE_HEAPSIZE variable, Configuration HBASE_MANAGER_ZK variable, ZooKeeper setup HBASE_MANAGES_ZK variable, Using the existing ZooKeeper ensemble HBASE_OPTS variable, Garbage Collection Tuning HBASE_REGIONSERVER_OPTS variable, Garbage Collection Tuning , Configuration hbck tool, HBase Fsck –HBase Fsck HBql client, Other Clients HColumnDescriptor class, Column Families –Column Families HConnection class, Connection Handling –Connection Handling HConnectionManager class, Connection Handling –Connection Handling HDFS (Hadoop 
Distributed File System), Implementation , Filesystems for HBase –Filesystems for HBase , HDFS –HDFS , Overview –Overview files in, Files –Compactions HFile format for, HFile Format –HFile Format KeyValue format for, KeyValue Format –KeyValue Format requirements for, Distributed Mode starting, Running and Confirming Your Installation version of, metrics for, Info Metrics write path, Write Path –Write Path hdfs-site.xml file, Datanode handlers head() method, Bytes class, The Bytes Class heap for block cache, Configuration generational architecture of, Garbage Collection Tuning memory requirements for, Servers –Servers for memstore, Configuration for Put, determining, Single Puts for scanner leases, The ResultScanner Class settings for, hbase-env.sh , Configuration , Garbage collection/memory tuning status information for, Cluster Status Information , Cluster Status Information , Region Server Metrics , Region Server Metrics , JVM Metrics heapSize() method, Put class, Single Puts help command, HBase Shell, Shell Introduction , Basics , Commands HFile class, HFile Format –HFile Format hfile.block.cache.size property, HBase Configuration Properties HFileOutputFormat class, Bulk load procedure HFiles (see store files) Hive, Hive –Hive command-line interface for, Hive –Hive configuring, Hive documentation for, Hive HBase versions supported, Hive unsupported features, Hive HiveQL, Hive HLog class, Write Path , HLog Class –HLog Class , Replication HLogKey class, HLogKey Class HMasterInterface class, Basic Operations HServerLoad class, Cluster Status Information HTable class, General Notes –General Notes HTableDescriptor class, Loading from the table descriptor , Tables HTableFactory class, HTablePool HTableInterfaceFactory interface, HTablePool HTablePool class, General Notes , HTablePool –HTablePool , Connection Handling Hush (HBase URL Shortener), Hush: The HBase URL Shortener –Hush: The HBase URL Shortener building, Running Hush ERD for, Database (De-)Normalization 
–Database (De-)Normalization HBase schema for, Database (De-)Normalization –Database (De-)Normalization RDBMS implementation of, The Problem with Relational Database Systems –The Problem with Relational Database Systems running, Running Hush schema for, Hush SQL Schema table and column descriptors, modifying, Schema Operations table pools used by, HTablePool I I/O metrics, Region Server Metrics IdentityTableReducer class, Data Source and Sink IHBase (Indexed HBase), Secondary Indexes –Secondary Indexes impedance match, Dimensions Import tool, Import and Export Tools , Import and Export Tools –Import and Export Tools importing data bulk import, Bulk Import –Advanced usage Import tool, Import and Export Tools , Import and Export Tools –Import and Export Tools importtsv tool, Using the importtsv tool ImportTsv.java class, Advanced usage InclusiveStopFilter class, InclusiveStopFilter –InclusiveStopFilter , Filters Summary incr command, Introduction to Counters incr command, HBase Shell, Data manipulation Increment class, Multiple Counters –Multiple Counters increment() method, HTable class, Multiple Counters –Multiple Counters incrementBytes() method, Bytes class, The Bytes Class incrementColumnValue() method, HTable class, Single Counters –Single Counters index blocks, HFile Format Indexed HBase (IHBase), Secondary Indexes –Secondary Indexes Indexed-Transactional HBase (ITHBase) project, Secondary Indexes , Transactions indexes, secondary, Dimensions , Secondary Indexes –Secondary Indexes INFO logging level, Changing Logging Levels InputFormat class, InputFormat –InputFormat Integer value (IV) metric type, Contexts, Records, and Metrics intelligent keys (see DDI) interactive clients, Interactive Clients –Other Clients IOPS (I/O operations per second), Servers IRB, compared to HBase Shell, Shell Introduction isAutoFlush() method, HTable class, Client-side write buffer isBlockCacheEnabled() method, HColumnDescriptor
class, Column Families isDeferredLogFlush() method, HTableDescriptor
class, Table Properties isEmpty() method, Delete class, Single Deletes isEmpty() method, Put class, Single Puts isEmpty() method, Result class, The Result class isInMemory() method, HColumnDescriptor
class, Column Families isLegalFamilyName() method, HColumnDescriptor
class, Column Families isMasterRunning() method, HBaseAdmin class, Basic Operations isReadOnly() method, HTableDescriptor class, Table Properties isStopping() method, RegionServerServices
class, The RegionCoprocessorEnvironment class isTableAvailable() method, HBaseAdmin class, Table Operations isTableDisabled() method, HBaseAdmin class, Table Operations isTableEnabled() method, HBaseAdmin class, Table Operations isTableEnabled() method, HTable class, The HTable Utility Methods is_disabled command, HBase Shell, Data definition is_enabled command, HBase Shell, Data definition ITHBase (Indexed-Transactional HBase) project, Secondary Indexes IV (Integer value) metric type, Contexts, Records, and Metrics J Java client for REST, REST Java client –REST Java client native (see client API) Java Development Kit (JDK), requirements for, Building from Source Java heap (see heap) Java Management Extensions (see JMX) Java Runtime Environment (see JRE) Java, requirements for, Building the Examples , Java –Java Java-based MapReduce API, Native Java JAVA_HOME variable, Java , Run Modes JBOD, Servers JConsole, JConsole –JConsole JDiff, for this book, HBase Version JDK (Java Development Kit), requirements for, Building from Source JMX (Java Management Extensions), Introduction , JMX –JMX Remote API enabling, JMX JConsole for, JConsole –JConsole remote API for, JMX Remote API –JMX Remote API JMXToolkit, JMX Remote API –JMX Remote API , Nagios –Nagios JPA/JPO client, Other Clients JRE (Java Runtime Environment) garbage collection handling by, Garbage Collection Tuning , Garbage Collection Tuning , Garbage Collection Tuning , Memstore-Local Allocation Buffer requirements for, Installation (J)Ruby, in HBase Shell commands, Shell Introduction JRuby client, Other Clients JSON format, with REST, JSON (application/json) –JSON (application/json) JVM metrics, JVM Metrics –JVM Metrics K key structures column keys, Key Design , Concepts field swap and promotion of row key, Time Series Data pagination with, Pagination –Pagination partial key scans with, Partial Key Scans –Partial Key Scans randomization of row key, Time Series Data row keys, Key Design , Concepts salting prefix 
for row key, Time Series Data time series data with, Time Series Data –Time Series Data time-ordered relations with, Time-Ordered Relations –Time-Ordered Relations KeyComparator class, The KeyValue class KeyOnlyFilter class, KeyOnlyFilter , Filters Summary KeyValue array, KeyValue Format –KeyValue Format , Concepts KeyValue class, The KeyValue class –The KeyValue class KFS (Kosmos filesystem) (see CloudStore filesystem) Kimball, Ralph (quotation regarding data assets), The Dawn of Big Data L Lempel-Ziv-Oberhumer (LZO) algorithm, Available Codecs , LZO LESS operator, Comparison operators LESS_OR_EQUAL operator, Comparison operators lib directory, Apache Binary Release libjars, in MapReduce, Dynamic Provisioning limits.conf file, File handles and process limits Linux, Operating system –Operating system list command, HBase Shell, Quick-Start Guide , Data definition list() method, Result class, The Result class listTables() method, HBaseAdmin class, Table Operations load balancing, Dimensions , Load Balancing –Load Balancing , Node Decommissioning load tests, Load Tests –YCSB LoadIncrementalHFiles class, Advanced usage local filesystem, Local –Local locality properties, Implementation lockRow() method, HTable class, Row Locks locks, Dimensions on rows, Single Puts , Single Puts , Single Gets , Single Gets , Single Deletes , Single Deletes , Row Locks –Row Locks , Multiple Counters timeout for, Row Locks Log-Structured Merge-Trees (see LSM-trees) Log-Structured Sort-and-Merge-Maps, Implementation log4j.properties file, log4j.properties , Changing Logging Levels (see also configuration) logfiles, Analyzing the Logs (see also WAL (write-ahead log)) accessing, Shared Pages analyzing, Analyzing the Logs –Analyzing the Logs level of, changing, Basics , Shared Pages , Changing Logging Levels location of, Apache Binary Release , Root-level files properties for, log4j.properties , Changing Logging Levels rolling of, Root-level files –Root-level files logging metrics, JVM 
Metrics LogRoller class, LogRoller Class –LogRoller Class logs directory, Apache Binary Release , Root-level files , Root-level files LogSyncer class, LogSyncer Class –LogSyncer Class Long value (LV) metric type, Contexts, Records, and Metrics LSM-trees, Implementation , Log-Structured Merge-Trees –Log-Structured Merge-Trees Lucene, Search Integration , Search Integration LV (Long value) metric type, Contexts, Records, and Metrics LZO (Lempel-Ziv-Oberhumer) algorithm, Available Codecs , LZO M majorCompact() method, HBaseAdmin class, Cluster Operations , Managed Splitting major_compact command, HBase Shell, Tools , Managed Splitting managed beans (MBeans), JMX Mapper class, Mapper –Mapper mapred package, Classes mapred-site.xml file, Garbage collection/memory tuning MapReduce, Backdrop , Storage API , MapReduce –Clojure , Framework –MapReduce Introduction classes for, Classes –Supporting Classes custom processing for, Custom Processing –Custom Processing data locality, MapReduce Locality –MapReduce Locality dynamic provisioning for, Dynamic Provisioning –Dynamic Provisioning HBase as both data source and sink, Data Source and Sink –Data Source and Sink HBase as data sink for, Data Sink –Data Sink HBase as data source for, Data Source –Data Source libjars, Dynamic Provisioning persisting data, OutputFormat –OutputFormat reading data, Mapper –Mapper shuffling and sorting data, Reducer splitting data, MapReduce Introduction , InputFormat –InputFormat , Table Splits –Table Splits static provisioning for, Static Provisioning –Static Provisioning versions of, Classes mapreduce package, Classes “MapReduce:
Simplified Data Processing on Large Clusters” (paper, by
Google), Backdrop massively parallel processing (MPP) databases, The Dawn of Big Data master server, The Problem with Relational Database Systems , Implementation backup, adding, Adding a backup master communication with, from API, Basic Operations local backup, adding, Adding a local backup master logfiles created by, Analyzing the Logs metrics exposed by, Master Metrics ports for, Required Ports properties for, HBase Configuration Properties –HBase Configuration Properties requirements for, Servers –Servers running tasks on, status of, Main page stopping, Cluster Operations MasterCoprocessorEnvironment class, The MasterCoprocessorEnvironment class –The MasterCoprocessorEnvironment class MasterObserver class, The MasterObserver Class –The BaseMasterObserver class Maven profiles, Dynamic Provisioning –Dynamic Provisioning requirements for, Building the Examples , Building from Source MBeans (managed beans), JMX Memcached, The Problem with Relational Database Systems , Nonrelational Database Systems, Not-Only SQL or NoSQL? memory, Servers (see also heap) requirements for, Servers usage metrics for, JVM Metrics memstore, Implementation , Write Path flush size for, Table Properties flushing, Implementation , State: open , The RegionCoprocessorEnvironment class , Cluster Operations , Log-Structured Merge-Trees , Write Path , Files limits of, Configuration metrics for, Region Server Metrics performance of, Garbage Collection Tuning memstore-local allocation buffer (MSLAB), Memstore-Local Allocation Buffer –Memstore-Local Allocation Buffer .META. 
table, Region Lookups , HBase Fsck MetaComparator class, The KeyValue class MetaKeyComparator class, The KeyValue class metrics (see monitoring systems) MetricsBase class, Contexts, Records, and Metrics MetricsContext interface, Contexts, Records, and Metrics –Contexts, Records, and Metrics MetricsRecord class, Contexts, Records, and Metrics military, data requirements of, The Dawn of Big Data modifyColumn() method, HBaseAdmin class, Schema Operations modifyTable() method, HBaseAdmin class, Table Operations monitoring systems, Introduction –Info Metrics (see also hbck tool; logfiles) Ganglia, Introduction , Ganglia –Usage importance of, Introduction –Introduction info metrics, Info Metrics –Info Metrics JMX, Introduction , JMX –JMX Remote API JVM metrics, JVM Metrics –JVM Metrics master server metrics, Master Metrics metric types, Contexts, Records, and Metrics –Contexts, Records, and Metrics metrics for, The Metrics Framework –Info Metrics Nagios, Nagios –Nagios for prototyping, Introduction region server metrics, Region Server Metrics –Region Server Metrics RPC metrics, RPC Metrics –RPC Metrics types of, Introduction –Introduction move command, HBase Shell, Tools move() method, HBaseAdmin class, Cluster Operations Mozilla Socorro, Time Series Data MPP (massively parallel processing) databases, The Dawn of Big Data MSLAB (memstore-local allocation buffer), Memstore-Local Allocation Buffer –Memstore-Local Allocation Buffer multicast messages, Ganglia monitoring daemon multicore processors, Servers multiversion concurrency control, Row Locks MUST_PASS_ALL operator, FilterList MUST_PASS_ONE operator, FilterList N n-way writes, LogSyncer Class Nagios, Introduction , Nagios –Nagios Narayanan, Arvind (developer, sample data set), Data Sink native Java API (see client API) Network Time Protocol (NTP), Synchronized time networking, hardware requirements for, Networking –Networking new (young) generation of heap, Garbage Collection Tuning next() method, ResultScanner 
class, The ResultScanner Class NoSQL database systems, Nonrelational Database Systems, Not-Only SQL or NoSQL? –Nonrelational Database Systems, Not-Only SQL or NoSQL? NOT_EQUAL operator, Comparison operators NO_OP operator, Comparison operators NTP (Network Time Protocol), Synchronized time NullComparator class, Comparators NullContext class, Contexts, Records, and Metrics NullContextWithUpdateThread class, Contexts, Records, and Metrics number generators, custom versioning for, Custom Versioning numColumns() method, Increment class, Multiple Counters numFamilies() method, Get class, Single Gets numFamilies() method, Increment class, Multiple Counters numFamilies() method, Put class, Single Puts numFamilies() method, Scan class, Introduction O observer coprocessors, Introduction to Coprocessors , The RegionObserver Class –The BaseMasterObserver class ObserverContext class, The ObserverContext class –The ObserverContext class old (tenured) generation of heap, Garbage Collection Tuning oldlogfile.log file, Region-level files oldlogfile.log.old file, Region-level files oldlogs directory, Root-level files OpenPDC project, The Dawn of Big Data OpenSSH, SSH OpenTSDB project, Time Series Data OS (operating system), requirements for, Operating system –Operating system , Windows OutputFormat class, OutputFormat –OutputFormat @Override, for methods, Data Sink P PageFilter class, PageFilter –PageFilter , Filters Summary pagination, Pagination –Pagination Parallel New Collector, Garbage Collection Tuning parameterless constructors, Tables partial key scans, Partial Key Scans –Partial Key Scans partition tolerance, Nonrelational Database Systems, Not-Only SQL or NoSQL? 
PE (Performance Evaluation) tool, Performance Evaluation –Performance Evaluation perf.hfile.block.cache.size property, Configuration performance best practices for, Client API: Best Practices –Client API: Best Practices block replication and, MapReduce Locality –MapReduce Locality load tests for, Load Tests –YCSB seek compared to transfer operations, Log-Structured Merge-Trees tuning compression, Compression –Enabling Compression configuration for, Configuration –Configuration garbage collection, Garbage Collection Tuning –Garbage Collection Tuning load balancing, Load Balancing –Load Balancing managed splitting, Managed Splitting –Managed Splitting memstore-local allocation buffer, Memstore-Local Allocation Buffer –Memstore-Local Allocation Buffer merging regions, Merging Regions –Merging Regions presplitting regions, Presplitting Regions –Presplitting Regions region hotspotting, Region Hotspotting Performance Evaluation (PE) tool, Performance Evaluation –Performance Evaluation Persistent time varying rate (PTVR) metric
rate, Contexts, Records, and Metrics physical models, Dimensions Pig, Pig –Pig Grunt shell for, Pig –Pig installing, Pig Pig Latin query language for, Pig pipelined writes, LogSyncer Class piping commands into HBase Shell, Scripting –Scripting planet-sized web applications, The Dawn of Big Data POM (Project Object Model), Building the Examples pom.xml file, Dynamic Provisioning ports for Avro, Operation required for each server, Required Ports for REST, Operation for Thrift, Operation for web-based UI, Master UI , Adding a local backup master postAddColumn() method, MasterObserver
class, The MasterObserver Class postAssign() method, MasterObserver class, The MasterObserver Class postBalance() method, MasterObserver class, The MasterObserver Class postBalanceSwitch() method, MasterObserver
class, The MasterObserver Class postCheckAndDelete() method, RegionObserver class, Handling client API events postCheckAndPut() method, RegionObserver
class, Handling client API events postCreateTable() method, MasterObserver
class, The MasterObserver Class postDelete() method, RegionObserver class, Handling client API events postDeleteColumn() method, MasterObserver
class, The MasterObserver Class postDeleteTable() method, MasterObserver
class, The MasterObserver Class postDisableTable() method, MasterObserver
class, The MasterObserver Class postEnableTable() method, MasterObserver
class, The MasterObserver Class postExists() method, RegionObserver class, Handling client API events postGet() method, RegionObserver class, Handling client API events postGetClosestRowBefore() method, RegionObserver
class, Handling client API events postIncrement() method, RegionObserver
class, Handling client API events postIncrementColumnValue() method, RegionObserver
class, Handling client API events postModifyColumn() method, MasterObserver
class, The MasterObserver Class postModifyTable() method, MasterObserver
class, The MasterObserver Class postMove() method, MasterObserver class, The MasterObserver Class postOpenDeployTasks() method,
RegionServerServices class, The RegionCoprocessorEnvironment class postPut() method, RegionObserver class, Handling client API events postScannerClose() method, RegionObserver
class, Handling client API events postScannerNext() method, RegionObserver
class, Handling client API events postScannerOpen() method, RegionObserver
class, Handling client API events postUnassign() method, MasterObserver class, The MasterObserver Class power supply unit (PSU), requirements for, Servers preAddColumn() method, MasterObserver class, The MasterObserver Class preAssign() method, MasterObserver class, The MasterObserver Class preBalance() method, MasterObserver class, The MasterObserver Class preBalanceSwitch() method, MasterObserver
class, The MasterObserver Class preCheckAndDelete() method, RegionObserver
class, Handling client API events preCheckAndPut() method, RegionObserver
class, Handling client API events preClose() method, RegionObserver class, State: pending close preCompact() method, RegionObserver class, State: open preCreateTable() method, MasterObserver
class, The MasterObserver Class preDelete() method, RegionObserver class, Handling client API events preDeleteColumn() method, MasterObserver
class, The MasterObserver Class preDeleteTable() method, MasterObserver
class, The MasterObserver Class predicate deletions, Tables, Rows, Columns, and Cells , Log-Structured Merge-Trees predicate pushdown, Introduction to Filters preDisableTable() method, MasterObserver
class, The MasterObserver Class preEnableTable() method, MasterObserver
class, The MasterObserver Class preExists() method, RegionObserver class, Handling client API events PrefixFilter class, PrefixFilter –PrefixFilter , Filters Summary preFlush() method, RegionObserver class, State: open preGet() method, RegionObserver class, Handling client API events preGetClosestRowBefore() method, RegionObserver
class, Handling client API events preIncrement() method, RegionObserver
class, Handling client API events preIncrementColumnValue() method, RegionObserver
class, Handling client API events preModifyColumn() method, MasterObserver
class, The MasterObserver Class preModifyTable() method, MasterObserver
class, The MasterObserver Class preMove() method, MasterObserver class, The MasterObserver Class preOpen() method, RegionObserver class, State: pending open prepare() method, ObserverContext class, The ObserverContext class prePut() method, RegionObserver class, Handling client API events preScannerClose() method, RegionObserver
class, Handling client API events preScannerNext() method, RegionObserver
class, Handling client API events preScannerOpen() method, RegionObserver
class, Handling client API events preShutdown() method, MasterObserver class, The MasterObserver Class preSplit() method, RegionObserver class, State: open preStopMaster() method, MasterObserver
class, The MasterObserver Class preUnassign() method, MasterObserver class, The MasterObserver Class preWALRestore() method, RegionObserver class, State: pending open prewarmRegionCache() method, HTable class, The HTable Utility Methods process limits, File handles and process limits –File handles and process limits processors (see CPU) profiles, Maven, Dynamic Provisioning –Dynamic Provisioning Project Object Model (see POM) properties, for configuration, HBase Configuration Properties –HBase Configuration Properties Protocol Buffers, Introduction to REST, Thrift, and Avro encoding for REST, Protocol Buffer (application/x-protobuf) schema used by, Advanced Schemas pseudodistributed mode, Pseudodistributed mode , Pseudodistributed mode –Adding a local region server PSU (power supply unit), requirements for, Servers PTVR (Persistent time varying rate), Contexts, Records, and Metrics Puppet, deployment using, Puppet and Chef Put class, Single Puts –Single Puts put command, HBase Shell, Quick-Start Guide , Data manipulation Put type, KeyValue class, The KeyValue class put() method, HTable class, Put Method –Atomic compare-and-set (see also checkAndPut() method, HTable class) list-based, List of Puts –List of Puts for multiple operations, Client-side write buffer –List of Puts for single operations, Single Puts –Single Puts putLong() method, Bytes class, The Bytes Class putTable() method, HTablePool class, HTablePool PyHBase client, Other Clients R RAID, Servers RAM (see memory) RandomRowFilter class, RandomRowFilter , Filters Summary range partitions, Auto-Sharding Rate (R) metric type, Contexts, Records, and Metrics raw() method, Result class, The Result class RDBMS (Relational Database Management System) converting to HBase, Database (De-)Normalization –Database (De-)Normalization limitations of, The Dawn of Big Data –The Dawn of Big Data , The Problem with Relational Database Systems –The Problem with Relational Database Systems read-only tables, Table Properties 
read/write performance, Dimensions readFields() method, Writable interface, Tables record IDs, custom versioning for, Custom Versioning RecordReader class, InputFormat recovered.edits directory, Region-level files , Log splitting , Edits recovery Red Hat Enterprise Linux (see RHEL) Red Hat Package Manager (see RPM) Reducer class, Reducer referential integrity, The Problem with Relational Database Systems RegexStringComparator class, Comparators region hotspotting, Region Hotspotting region servers, Auto-Sharding , Implementation adding, Adding a region server for fully distributed mode, Specifying region servers heap for, Garbage collection/memory tuning local, adding, Adding a local region server logfiles created by, Analyzing the Logs metrics exposed by, Region Server Metrics –Region Server Metrics ports for, Required Ports properties for, HBase Configuration Properties –HBase Configuration Properties rolling restart for, Rolling Restarts –Rolling Restarts shutting down, troubleshooting, Stability issues –“Could not obtain block” errors startup check for, Startup check status information for, Web-based UI Introduction , Cluster Status Information , Main page , Region Server UI –Main page stopping, Cluster Operations , Node Decommissioning –Node Decommissioning workloads of, handling, Garbage Collection Tuning RegionCoprocessorEnvironment class, The RegionCoprocessorEnvironment class .regioninfo file, Region-level files RegionLoad class, Cluster Status Information –Cluster Status Information RegionObserver class, The RegionObserver Class –The BaseRegionObserver class regions, Auto-Sharding –Auto-Sharding , Tables assigning to a server, Tools cache for, The HTable Utility Methods closing, Cluster Operations , Tools compacting, Cluster Operations , Tools , User Table page , Compactions –Compactions deploying or undeploying, Cluster Operations files for, Region-level files –Region-level files flushing, Cluster Operations , Tools life-cycle state changes, Handling 
region life-cycle events –State: pending close , The Region Life Cycle listing, User Table page , User Table page lookups for, Region Lookups map of, The HTable Utility Methods merging, Merging Regions –Merging Regions moving to a different server, Cluster Operations , Tools presplitting, Presplitting Regions –Presplitting Regions reassigning to a new server, HBase Fsck size of, increasing, Configuration splitting, Auto-Sharding , Cluster Operations , Tools , User Table page , Region splits –Region splits , Managed Splitting –Managed Splitting status information for, Cluster Status Information , Cluster Status Information –Cluster Status Information in transition, map
of, Cluster Status Information in transition, table
of, Main page unassigning, Tools RegionScanner class, Read Path regionservers file, Specifying region servers , regionserver , regionservers , Script-Based (see also configuration) RegionSplitter utility, Presplitting Regions Relational Database Management System (see RDBMS) remote method invocation (RMI), JMX Remote API remote procedure call (see RPC) RemoteAdmin class, REST Java client RemoteHTable class, REST Java client –REST Java client remove() method, HTableDescriptor class, Table Properties removeFamily() method, HTableDescriptor
class, Table Properties remove_peer command, HBase Shell, Replication replication, Replication –Region server failover , Replication –Replication for column families, Column Families in HBase Shell, Replication Representational State Transfer (see REST) requests, current number of, Cluster Status Information reset() method, Filter interface, Custom Filters REST (Representational State Transfer), Introduction to REST, Thrift, and Avro –Introduction to REST, Thrift, and Avro , REST –REST Java client , HBase Configuration Properties Base64 encoding used in, XML (text/xml) , JSON (application/json) documentation for, Operation formats supported by, Supported formats –Raw binary (application/octet-stream) Java client for, REST Java client –REST Java client JSON format for, JSON (application/json) –JSON (application/json) plain text format for, Plain (text/plain) –Plain (text/plain) port for, Operation Protocol Buffer format for, Protocol Buffer (application/x-protobuf) raw binary format for, Raw binary (application/octet-stream) starting gateway server for, Operation stopping, Operation verifying operation of, Operation XML format for, XML (text/xml) –XML (text/xml) Result class, The Result class –The Result class ResultScanner class, The ResultScanner Class –The ResultScanner Class , Client API: Best Practices RHEL (Red Hat Enterprise Linux), Operating system RMI (remote method invocation), JMX Remote API rolling restarts, Rolling Restarts –Rolling Restarts -ROOT- table, Region Lookups RootComparator class, The KeyValue class RootKeyComparator class, The KeyValue class round-trip time, Client-side write buffer row keys, Tables, Rows, Columns, and Cells –Tables, Rows, Columns, and Cells , Concepts field swap and promotion of, Time Series Data for pagination, Pagination for partial key scans, Partial Key Scans randomization of, Time Series Data salting prefix for, Time Series Data RowComparator class, The KeyValue class RowCountProtocol interface, The 
BaseEndpointCoprocessor class RowFilter class, RowFilter –RowFilter , Filters Summary RowLock class, Single Puts rows, Tables, Rows, Columns, and Cells –Tables, Rows, Columns, and Cells adding, Data manipulation multiple operations, Client-side write buffer –List of Puts single operations, Single Puts –Single Puts batch operations on, Batch Operations –Batch Operations counting, Data manipulation deleting, Data manipulation multiple operations, List of Deletes –List of Deletes single operations, Single Deletes –Single Deletes getting, Data manipulation multiple operations, List of Gets –List of Gets single operations, Single Gets –The Result class locking, Single Puts , Single Puts , Single Gets , Single Gets , Single Deletes , Single Deletes , Row Locks –Row Locks , Multiple Counters scanning, Scans –Caching Versus Batching , Data manipulation RPC (remote procedure call) metrics for, RPC Metrics –RPC Metrics put operations as, Client-side write buffer RPM (Red Hat Package Manager), Operating system Ruby hashes, in HBase Shell, Commands RVComparator class, The KeyValue class S S (String) metric type, Contexts, Records, and Metrics S3 (Simple Storage Service), S3 –S3 Safari Books Online, Safari® Books Online sales, data requirements of, The Dawn of Big Data salting, Time Series Data scalability, Scalability –Scalability Scan class, Introduction –Introduction , Introduction scan command, HBase Shell, Quick-Start Guide , Data manipulation scan operations, Scans –Caching Versus Batching , Read Path (see also get operations) batching, Caching Versus Batching –Caching Versus Batching caching, Caching Versus Batching –Caching Versus Batching leases for, The ResultScanner Class pagination, Pagination –Pagination partial key scans, Partial Key Scans –Partial Key Scans scan() method, HTable class filters for (see filters) schema, Schema Definition –Column Families column families, Column Families –Column Families tables, Tables –Table Properties script-based deployment, 
Script-Based –Script-Based scripting, in HBase Shell, Scripting –Scripting search integration, Search Integration –Search Integration secondary indexes, Dimensions , Secondary Indexes –Secondary Indexes seek operations, compared to transfer
operations, Log-Structured Merge-Trees sequential consistency, Nonrelational Database Systems, Not-Only SQL or NoSQL? ServerName class, Cluster Status Information servers, Servers (see also master server; region servers) adding, Adding Servers –Adding a region server requirements for, Servers –Servers status information for, Cluster Status Information status of, Cluster Status Information –Cluster Status Information setAutoFlush() method, HTable class, Client-side write buffer , Client API: Best Practices setBatch() method, Scan class, Caching Versus Batching setBlockCacheEnabled() method, HColumnDescriptor
class, Column Families setBlockSize() method, HColumnDescriptor
class, Column Families setBloomFilterType() method, HColumnDescriptor
class, Column Families setCacheBlocks() method, Get class, Single Gets setCacheBlocks() method, Scan class, Introduction , Client API: Best Practices setCaching() method, Scan class, Caching Versus Batching , Client API: Best Practices setCompactionCompressionType() method,
HColumnDescriptor class, Column Families setCompressionType() method, HColumnDescriptor
class, Column Families setDeferredLogFlush() method, HTableDescriptor
class, Table Properties setFamilyMap() method, Scan class, Introduction setFilter() method, Get class, Single Gets setFilter() method, Get or Scan class, The filter hierarchy setFilter() method, Scan class, Client API: Best Practices setInMemory() method, HColumnDescriptor
class, Column Families setMaxFileSize() method, HTableDescriptor
class, Table Properties setMaxVersions() method, Get class, Single Gets setMaxVersions() method, HColumnDescriptor
class, Column Families setMaxVersions() method, Scan class, Introduction setMemStoreFlushSize() method, HTableDescriptor
class, Table Properties setReadOnly() method, HTableDescriptor
class, Table Properties setRegionCachePrefetch() method, HTable
class, The HTable Utility Methods setScannerCaching() method, HTable class, Caching Versus Batching setScope() method, HColumnDescriptor class, Column Families setters, Table Properties setTimeRange() method, Get class, Single Gets setTimeRange() method, Increment class, Multiple Counters setTimeRange() method, Scan class, Introduction setTimeStamp() method, Delete class, Single Deletes setTimeStamp() method, Get class, Single Gets setTimeStamp() method, Scan class, Introduction setValue() method, HTableDescriptor class, Loading from the table descriptor , Table Properties setWriteToWAL() method, Increment class, Multiple Counters setWriteToWAL() method, Put class, Single Puts sharding, The Problem with Relational Database Systems , Scalability , Auto-Sharding –Auto-Sharding Shell, HBase (see HBase Shell) shouldBypass() method, ObserverContext
class, The ObserverContext class shouldComplete() method, ObserverContext
class, The ObserverContext class shutdown() method, HBaseAdmin class, Cluster Operations Simple Object Access Protocol (see SOAP) Simple Storage Service (see S3) SingleColumnValueExcludeFilter class, Filters Summary SingleColumnValueFilter class, SingleColumnValueFilter –SingleColumnValueFilter , SingleColumnValueExcludeFilter , Filters Summary size() method, Put class, Single Puts size() method, Result class, The Result class SkipFilter class, SkipFilter –SkipFilter , Filters Summary slave servers, The Problem with Relational Database Systems , Servers –Servers (see also region servers) smart grid, data requirements of, The Dawn of Big Data Snappy algorithm, Available Codecs , Snappy SOAP (Simple Object Access Protocol), Introduction to REST, Thrift, and Avro –Introduction to REST, Thrift, and Avro Socorro, Mozilla, Time Series Data software requirements, Software –Windows , Building from Source Solaris, Operating system Solr, Search Integration sort and merge operations, compared to seek
operations, Log-Structured Merge-Trees speculative execution mode, MapReduce, Table Splits split command, HBase Shell, Tools , Managed Splitting split() method, HBaseAdmin class, Cluster Operations , Managed Splitting split/compaction storms, Managed Splitting SplitAlgorithm interface, Presplitting Regions splitlog directory, Root-level files , Region-level files , Log splitting splits directory, Region-level files , Region splits src directory, Apache Binary Release SSH, requirements for, SSH standalone mode, Run Modes , Standalone Mode for HBase, Quick-Start Guide start key, for partial key scans, Partial Key Scans start() method, Coprocessor interface, The Coprocessor Class start_replication command, HBase Shell, Replication static provisioning, for MapReduce, Static Provisioning –Static Provisioning status command, HBase Shell, Quick-Start Guide , General stop key, for partial key scans, Partial Key Scans stop() method, Coprocessor interface, The Coprocessor Class stopMaster() method, HBaseAdmin class, Cluster Operations stopRegionServer() method, HBaseAdmin class, Cluster Operations stop_replication command, HBase Shell, Replication storage API (see client API) storage architecture, Storage –KeyValue Format accessing data, Log-Structured Merge-Trees , Overview column families, Concepts –Concepts deleting data, Log-Structured Merge-Trees files in, Files –Compactions HFile format, HFile Format –HFile Format KeyValue format, KeyValue Format –KeyValue Format LSM-trees for, Log-Structured Merge-Trees –Log-Structured Merge-Trees read path, Read Path –Read Path tables, Tall-Narrow Versus Flat-Wide Tables WAL (write-ahead log), Write-Ahead Log –Durability writing data, Log-Structured Merge-Trees writing path, Write Path –Write Path storage models, Dimensions store files (HFiles), Tables, Rows, Columns, and Cells , Implementation –Implementation (see also storage architecture) compaction of (see compaction) compression of (see compression) creation of, Overview in 
LSM-trees, Log-Structured Merge-Trees metrics for, Region Server Metrics properties for, HBase Configuration Properties –HBase Configuration Properties status information about, Cluster Status Information , Cluster Status Information stored procedures, The Problem with Relational Database Systems StoreScanner class, Read Path strict consistency, Nonrelational Database Systems, Not-Only SQL or NoSQL? String (S) metric type, Contexts, Records, and Metrics SubstringComparator class, Comparators swapping, configuring, Swappiness synchronized time, Synchronized time sysctl.conf file, File handles and process limits , Swappiness system event metrics, JVM Metrics system requirements, Hardware –Windows system time, synchronized, Synchronized time T tab-separated value (TSV) data, importing, Using the importtsv tool table descriptors, Tables –Table Properties loading coprocessors, Loading from the table descriptor –Loading from the table descriptor modifying, Schema Operations retrieving, Table Operations , Data definition table hotspotting, Region Hotspotting tableExists() method, HBaseAdmin class, Table Operations .tableinfo file, Table-level files TableInputFormat class, InputFormat , Table Splits , Data Source , Data Source and Sink TableMapper class, Mapper TableMapReduceUtil class, Supporting Classes TableOutputCommitter class, OutputFormat TableOutputFormat class, OutputFormat , Data Sink , Data Source and Sink TableRecordReader class, Table Splits TableRecordWriter class, OutputFormat tables, Tables, Rows, Columns, and Cells –Tables, Rows, Columns, and Cells altering structure of, Table Operations , Data definition closing, The HTable Utility Methods compacting, Cluster Operations , Tools , User Table page , Compactions –Compactions copying, CopyTable Tool –CopyTable Tool creating, Quick-Start Guide , Shell Introduction , Table Operations –Table Operations , Data definition deferred log flushing for, Table Properties deleting, Table Operations disabling, Table 
Operations , Data definition dropping, Quick-Start Guide , Data definition enabling, Table Operations , Data definition files for, Table-level files flat-wide layout, Tall-Narrow Versus Flat-Wide Tables flushing, Cluster Operations , Tools keyvalue pairs for, setting, Table Properties listing, Table Operations maximum file size for, Table Properties memstore flush size for, Table Properties name for, The HTable Utility Methods , Tables , Table Properties properties of, Table Properties –Table Properties read-only, Table Properties replication of, Replication –Replication splitting, Cluster Operations , Tools , User Table page , Table Splits –Table Splits status information for, Main page , User Table page –User Table page tall-narrow layout, Tall-Narrow Versus Flat-Wide Tables truncating, Data manipulation tail() method, Bytes class, The Bytes Class tenured (old) generation of heap, Garbage Collection Tuning thread metrics, JVM Metrics Thrift, Introduction to REST, Thrift, and Avro –Introduction to REST, Thrift, and Avro , Thrift –Example: PHP documentation for, Operation installing, Installation –Installation PHP schema compiler for, Example: PHP –Example: PHP port used by, Operation schema compilers for, Example: PHP , Example: PHP schema for, Installation starting server for, Operation stopping, Operation time series data, Time Series Data –Time Series Data Time varying integer (TVI) metric type, Contexts, Records, and Metrics Time varying long (TVL) metric type, Contexts, Records, and Metrics Time varying rate (TVR) metric type, Contexts, Records, and Metrics –Contexts, Records, and Metrics time-ordered, related, data, Time-Ordered Relations –Time-Ordered Relations time-to-live (TTL), Column Families , Log-Structured Merge-Trees , Root-level files , Cleaning logs timestamp, for cells (see versioning) TimestampFilter class, TimestampsFilter –TimestampsFilter TimeStampingFileContext class, Contexts, Records, and Metrics TimestampsFilter class, Filters Summary 
.tmp directory, Region-level files , Region splits toBoolean() method, Bytes class, Single Gets toBytes() method, Bytes class, Single Puts toFloat() method, Bytes class, Single Gets toInt() method, Bytes class, Single Gets toLong() method, Bytes class, Single Gets , The Bytes Class tombstone marker (see delete marker) ToR (top-of-rack) switch, Networking toString() method, Bytes class, Single Gets , List of Deletes toString() method, Result class, The Result class toStringBinary() method, Bytes class, The Bytes Class trailer blocks, HFile Format Transactional HBase project, Transactions transactions, The Problem with Relational Database Systems , Secondary Indexes , Transactions –Transactions transfer operations, compared to seek
operations, Log-Structured Merge-Trees troubleshooting, HBase Fsck (see also debugging) checklist for, Basic setup checklist –“Could not obtain block” errors hbck tool, HBase Fsck –HBase Fsck logfiles, analyzing, Analyzing the Logs –Analyzing the Logs region servers shutting down, Stability issues –“Could not obtain block” errors ZooKeeper problems, ZooKeeper problems –ZooKeeper problems truncate command, HBase Shell, Data manipulation TSV (tab-separated value) data, importing, Using the importtsv tool TTL (time-to-live), Column Families , Log-Structured Merge-Trees , Root-level files , Cleaning logs TVI (Time varying integer) metric type, Contexts, Records, and Metrics TVL (Time varying long) metric type, Contexts, Records, and Metrics TVR (Time varying rate) metric type, Contexts, Records, and Metrics –Contexts, Records, and Metrics U Ubuntu, Operating system , File handles and process limits UDP multicast messages, Ganglia monitoring daemon UDP unicast messages, Ganglia monitoring daemon ulimit setting, File handles unassign command, HBase Shell, Tools unassign() method, HBaseAdmin class, Cluster Operations unicast messages, Ganglia monitoring daemon Unix, Operating system –Operating system Unix epoch, Single Puts Unix time, Single Puts unlockRow() method, HTable class, Row Locks update() method, Batch class, The CoprocessorProtocol interface URL encoding, Plain (text/plain) URLs, shortening (see Hush (HBase URL Shortener)) V value() method, Result class, The Result class ValueFilter class, ValueFilter –ValueFilter , Filters Summary verifyrep tool, Replication version command, HBase Shell, General versioning, Tables, Rows, Columns, and Cells –Tables, Rows, Columns, and Cells , Single Puts –Single Puts , Versioning –Custom Versioning custom, Custom Versioning –Custom Versioning implicit, Implicit Versioning –Implicit Versioning incrementing counters based on, Multiple Counters retrieving timestamp for Get, Single Gets retrieving timestamp for Put, Single Puts 
setting timestamp for Delete, Single Deletes , Single Deletes setting timestamp for Get, Single Gets setting timestamp for Put, Row Locks setting timestamp for Scan, Introduction storage architecture for, Concepts versions of HBase, Road Map –HBase 0.94.0 determining, Cluster Status Information in this book, HBase Version metrics for, Info Metrics numbering of, History supported by Hive, Hive upgrading from previous releases, Upgrade from Previous Releases –Upgrading to HBase 0.92.0 virtual shards, The Problem with Relational Database Systems Vogels, Werner (author, “Eventually Consistent”), Nonrelational Database Systems, Not-Only SQL or NoSQL? W waits (from locking), Dimensions WAL (write-ahead log), Implementation , Write-Ahead Log –Durability (see also logfiles) appending data to, HLog Class –HLog Class deferred flushing for, Table Properties , LogSyncer Class –LogSyncer Class durability of data with, Durability –Durability keys in, HLogKey Class location of, Root-level files –Root-level files number of, decreasing, Configuration recovering edits, Edits recovery replaying, Replay –Edits recovery rolling, LogRoller Class –LogRoller Class splitting, Log splitting –Log splitting writing data to, Write Path WALEdit class, WALEdit Class , Replication , Normal processing WARN logging level, Changing Logging Levels weak consistency, Nonrelational Database Systems, Not-Only SQL or NoSQL? 
web-based companies, data requirements of, The Dawn of Big Data –The Dawn of Big Data web-based UI ports for, Adding a local backup master web-based UI for HBase, Web-based UI –Shared Pages accessing, Master UI cluster information, Main page –Main page logfiles, accessing from web-based UI, Shared Pages logging levels, Shared Pages ports used by, Master UI region server information, Region Server UI –Main page table information, User Table page –User Table page thread dumps, Shared Pages ZooKeeper information, ZooKeeper page website resources Avro server documentation, Operation Bigtable, Backdrop Cascading, Cascading Chef, Puppet and Chef Cloudera’s Distribution including Apache Hadoop, Cloudera’s Distribution Including Apache Hadoop CloudStore, Other Filesystems companies using HBase, list of, Preface Crossbow project, The Dawn of Big Data Delicious RSS feed, Data Sink error messages, Analyzing the Logs ext3 filesystem, Filesystem ext4 filesystem, Filesystem for this book, HBase Version , Building the Examples , How to Contact Us , CRUD Operations GFS (Google File System), Backdrop GitHub, Building the Examples Global Biodiversity Information Facility, The Dawn of Big Data Hadoop, Hadoop HBase, History , Quick-Start Guide , Apache Binary Release HBase-Runner project, Clojure HDFS, Distributed Mode Hive documentation, Hive Java, Java JConsole documentation, JConsole JMXToolkit, JMX Remote API JRE (Java Runtime Environment), Installation (J)Ruby, Shell Introduction Linux file descriptor limit, File handles and process limits MapReduce, Backdrop Mozilla Socorro, Time Series Data NTP, Synchronized time OpenPDC project, The Dawn of Big Data OpenSSH, SSH Puppet, Puppet and Chef REST documentation, Operation Safari Books Online, Safari® Books Online Thrift server documentation, Operation Whirr, Apache Whirr Windows Installation guide, Windows XFS filesystem, Filesystem ZFS filesystem, Filesystem Zookeeper, Using the existing ZooKeeper ensemble webtable, Tables, Rows, 
Columns, and Cells WhileMatchFilter class, WhileMatchFilter –WhileMatchFilter , Filters Summary Whirr, deployment using, Apache Whirr –Apache Whirr White, Tom (author, Hadoop: The Definitive
Guide), Hardware Windows, Windows Writable interface, Tables write buffer, Client-side write buffer –Client-side write buffer concurrent modifications in, HTablePool flushing, Client-side write buffer –Client-side write buffer , List of Puts –List of Puts , HTablePool , Data Sink , Client API: Best Practices size of, HBase Configuration Properties write() method, Writable interface, Tables write-ahead log (see WAL) writeToWAL() method, Put class, Client API: Best Practices X XFS filesystem, Filesystem XML format, with REST, XML (text/xml) –XML (text/xml) -XX:+CMSIncrementalMode
option, Garbage Collection Tuning -XX:CMSInitiatingOccupancyFraction
option, Garbage Collection Tuning -XX:MaxNewSize option, Garbage Collection Tuning -XX:NewSize option, Garbage Collection Tuning -XX:+PrintGCDetails
option, Garbage Collection Tuning -XX:+PrintGCTimeStamps
option, Garbage Collection Tuning -XX:+UseConcMarkSweepGC
option, Garbage Collection Tuning -XX:+UseParNewGC option, Garbage Collection Tuning Z ZFS filesystem, Filesystem Zippy algorithm, Available Codecs , Snappy zk_dump command, HBase Shell, Tools zoo.cfg file, ZooKeeper setup , ZooKeeper setup ZooKeeper, Implementation existing cluster, setting up for HBase, Using the existing ZooKeeper ensemble information about, retrieving, Tools , Main page , ZooKeeper page number of members to run, ZooKeeper setup properties for, HBase Configuration Properties –HBase Configuration Properties role in data access, Overview setup for fully distributed mode, ZooKeeper setup –Using the existing ZooKeeper ensemble sharing connections to, Connection Handling splits tracked by, Region splits starting, Running and Confirming Your Installation timeout for, Configuration for transactions, Transactions troubleshooting, ZooKeeper problems –ZooKeeper problems znodes for, ZooKeeper –ZooKeeper zookeeper.session.timeout property, ZooKeeper setup , JVM Metrics , Configuration , HBase Configuration Properties zookeeper.znode.parent property, ZooKeeper , Choosing region servers to replicate to , HBase Configuration Properties zookeeper.znode.rootserver property, HBase Configuration Properties