access pattern
defining
isolating
access time complexity
ACID, 2nd
ad impressions and clickstream
add_peer
addCallback(), 2nd
addCallbacks()
addColumn()
addFamily()
Aggregate example
alter command
Amazon Web Services.
See AWS.
Amazon, Dynamo
Apache HBase
installing
Apache Whirr
array list, creating
asymptotic notation
async
asynchronous programming
atomic operations
atomicity
and tall tables
attribute
identifying
non-identifying
availability
single namespace
AWS
Elastic Block Store (EBS)
Elastic Compute Cloud (EC2)
Simple Storage Service (S3)
backup
of root directory
using MapReduce jobs
CopyTable
Export/Import
ImportTsv.
See replication.
balance_switch
balancer
base64 utility
BaseEndpointCoprocessor
BaseRegionObserver
Big Data
BinaryComparator
BinaryPrefixComparator
BitComparator
blob, 2nd
block
block cache, maximum heap use
block size, defining
BlockCache
bloom filter
bulk import
Bytes class
Bytes.compareTo()
caching
aggressive
disabling
caching block, disabling
Cacti, collecting metrics with
Cafarella, Mike
Callback chain
CAP theorem
cell, 2nd
names should be short
versioning
centroid
checkAndDelete()
checkAndPut()
Chef, 2nd
client configuration
client, alternatives to Java
asynchbase
JRuby
REST
Thrift
UNIX
cloud
deploying in, pros and cons
deploying with Whirr
Cloudera CDH
installing
Cloudera Manager, 2nd
Cluster SSH
CMS
column family
advanced configurations
aggressive caching
block cache
block size
bloom filters
cell versioning
compression
doesn’t map to relational database
names should be short
TTL
column qualifier, 2nd
names should be short
treating as data
ColumnPrefixFilter
commands
commodity hardware
compact command
compaction
major
and tuning performance
minor
threshold
triggering from the shell
compareAndSet()
CompareFilter, 2nd
CompareOp
completebulkload
compression
and tuning performance
concurrent-mark-and-sweep (CMS)
configuration parameters
configuration.
See HBase.
configuration
connection pool
consistency
content management system.
See CMS.
content, serving
convex hull
convexHull()
coprocessorExec(), 2nd
CoprocessorHost
CopyTable MapReduce job
counter
cross-row transaction
Cutting, Doug
Data Definition Language.
See DDL.
data locality vs. spatial locality
data model, 3rd
column oriented
key-value store
logical
physical
semistructured
sorted map of maps
database, relational vs. non-relational
DDL
de-normalization, 2nd
Deferred
DeferredGroupException
Delete, 2nd
deleteColumn(), 2nd
deleteColumns(), 2nd
deleted record, cleaning up
describe command
design fundamentals of HBase
dfs.client.read.shortcircuit
dfs.datanode.max.xcievers
dfs.support.append
DistanceComparator
distributed file system
consistency
durability, 2nd
Elastic Block Store (EBS)
Elastic Compute Cloud (EC2)
endpoint coprocessor
executing
implementing
client
interface
server, 2nd
scatter-gather algorithm
entity
nesting
entropy()
Errback, 2nd
/etc/security/limits.conf
Export MapReduce job
Facebook
HBase use, 2nd
messages
Facebook Insights
file
open, limit on
splitting across DataNodes
filter, 2nd, 3rd
combining
custom, 2nd
using exceptions in
default no-arg constructor
installing, 2nd
prebundled.
See also RowFilter, PrefixFilter, QualifierFilter, ValueFilter, TimestampFilter, FilterList.
Filter interface, 2nd
methods
filterAllRemaining
FilterBase, 2nd
filterKeyValue
filterKeyValue()
FilterList, 2nd
filterRow
filterRow()
filterRowKey
foreign key
Ganglia, 2nd
GangliaContext
GangliaContext31
garbage collection
concurrent-mark-and-sweep (CMS)
configuration options
logging
occupancy fraction
Parallel New Collector
tuning
Geographic Information Systems.
See GIS.
geohash, 2nd
as rowkey
calculating
character precision
precision
scan, creating from query polygon
truncating
vs. other linearization techniques
GeoHash.getAdjacent()
Geometry
contains(), 2nd
converting GeoHash into
Geometry.getCentroid()
GeometryFactory
Get, 2nd, 3rd, 4th
specificity of
getCause()
getCoords()
getEnvironment()
getRegion()
getTimestamp() method
GIS
custom filter
dimensions
nearest-neighbors query
rowkey, geohash as
spatial index.
See spatial index.
Google coprocessors
Google File System, introduction to
graceful-stop.sh
Guava
GZIP
Hadoop Distributed Filesystem.
See HDFS.
Hadoop MapReduce.
See MapReduce.
hadoop-datajoin
hadoop-hbase
hadoop-hbase-master
hadoop-hbase-regionserver
hadoop-metrics.properties
hardware
choosing
replication
hash function
collisions
HBase Master
hardware
listener port
sample UI
HBase shell, 2nd
$HBASE_HOME
hbase-daemon.sh, 2nd
hbase-daemons.sh
hbase-env.sh
hbase-site.xml
hbase.balancer.period
hbase.client.scanner.caching, 2nd, 3rd
hbase.client.write.buffer
hbase.cluster.distributed
hbase.coprocessor.master.classes
hbase.coprocessor.region.classes, 2nd
hbase.coprocessor.wal.classes
hbase.hregion.majorcompaction, 2nd
hbase.hregion.max.filesize, 2nd, 3rd, 4th
hbase.hregion.memstore.flush.size, 2nd
hbase.hregion.memstore.mslab.enabled, 2nd
hbase.hstore.blockingStoreFiles
hbase.hstore.blockingWaitTime
hbase.hstore.compaction.max
hbase.hstore.compactionThreshold
Hbase.IFace
hbase.mapreduce.hfileoutputformat.blocksize
hbase.master.info.port
hbase.master.port
hbase.regionserver.global.memstore.lowerLimit, 2nd, 3rd
hbase.regionserver.global.memstore.upperLimit, 2nd, 3rd
hbase.regionserver.handler.count, 2nd
hbase.regionserver.optionallogflushinterval
hbase.regionserver.port
hbase.regionserver.regionSplitLimit
hbase.rootdir, 2nd
hbase.server.thread.wakefrequency
hbase.tmp.dir
hbase.zookeeper.quorum, 2nd
HBaseAdmin
HBaseAdmin.createTable()
HBaseClient
HBaseConfiguration
HBaseFsck.
See hbck.
hbck
HColumnDescriptor
HDFS, 2nd, 3rd, 4th, 14th
as underlying storage
block not same as HFile block
blocks, 2nd
column family data
configuration
parameters
consistency
durable sync
HFiles stores on
location
provides HBase with single namespace
read path
replication
write path
heap size
help command
Heuberger, Silvio
HFile
block size
index granularity
loading into table
restricting number of
viewing
hfile.block.cache.size, 2nd
HFileOutputFormat
high availability
Hilbert curve
HLog
flush frequency
viewing
hlog command
hot failover
hot-spotting
HStoreFile, 2nd
HTable constructor
HTableDescriptor
HTableInterface, 2nd
HTablePool, 2nd
I/O load, reducing
ICV, 2nd, 3rd
idempotent, definition of
IdentityTableReducer
IDL
ImmutableBytesWritable
Import MapReduce job
ImportTsv MapReduce job
IncomingDataPoints.rowKeyTemplate()
Increment Column Value.
See ICV.
incremental data, capturing
index
lack of, doesn’t map to relational database
secondary, via coprocessors
indexing
information exchange
Ingest
init script
Interface Definition Language.
See IDL.
InternalScanner
do-while
iostat
isolation
Java Management Extensions.
See JMX.
Java Runtime Environment.
See JRE.
Java, use in HBase
java.awt.geom.Point2D
JMX
Job
JobTracker
join.
See also map-side join; reduce-side join.
join key
joinUninterruptibly()
JRE
JRuby
scripting shell from
JSON, pretty printing
JTS
JTS Topology Suite.
See JTS.
JTSTestBuilder
junction table
latency vs. throughput
latency()
Leaflet
Lily, HBase use
linear scalability
linearization
list command
list_peer
LoadIncrementalHFiles
LongWritable
LRU cache
lsof
LZO
main()
major compaction, timing.
See also compaction, major.
major_compact command
map
map-side join
master process
MasterObserver
registering
Maven archetype
maven-assembly-plugin
MD5
vs. SHA-1
MemStore
maximum heap percentage
maximum size
size, defining
MemStore-Local Allocation Buffer, 2nd
.META. table, 2nd
metadata, separating from data
MetricsContext
Microsoft cloud
migration
MinMaxPriorityQueue
minor compaction, maximum files.
See also compaction, minor.
Mozilla, HBase use
MultiPoint
NameNode
NAS
nearest-neighbors query
nested entities
netstat
network-attached storage.
See NAS.
nextRows()
node
adding
decommissioning
normalization
NoSQL, introduction to
NullComparator
Nutch
observer coprocessor
implementing
installing
modifying schema
MasterObserver
RegionObserver
WALObserver
observer preprocessor, recursion
OLAP, 2nd
OLTP, 2nd
Online Analytical Processing (OLAP).
See OLAP.
Online Transaction Processing (OLTP).
See OLTP.
online vs. offline systems, 2nd
Open Time Series Database.
See OpenTSDB.
open-file limit
OpenStreetMap
OpenTSDB, 2nd
architecture
asynchbase
collecting metrics with
data collection
designing
HBase
high availability
implementing
infrastructure monitoring
linear scalability
overview
querying data
read path
schema
serving queries
storing data
tcollector
three responsibilities
time series data
tsd
tsdb
tsdb-uid
write path
operating system, configuration parameters
Opsware
org.apache.hadoop.hbase.util.RegionSplitter
OutputFormat
parallel
execution
problems
Parallel New Collector
ParseFilter
partitioning
performance
impacts on
testing
PerformanceEvaluation tool
Yahoo! Cloud Serving Benchmark (YCSB)
tuning
mixed workload
random-read-heavy
sequential-read-heavy
write-heavy
tuning dependency systems
file system
hardware
HDFS
network
OS
PerformanceEvaluation tool
placeholder entry
postDeleteTable()
PostGIS
Powerset
PrefixFilter
process, limit on
prototype cluster.
See cluster, prototype.
Puppet, 2nd
push-down predicate.
See filter.
Put
put
Put.get()
QualifierFilter
query polygon
query within.
See within query.
QueryMatch
Rackspace
read path
read pattern, defining
record, writing
reduce
reduce-side join
problems
RegexStringComparator, 2nd
region, 2nd
finding
hot-spotting
maximum number
metadata, missing
missing or extra
overlapping
size, 2nd
splitting from the shell
region balancer
RegionCoprocessorEnvironment
.regioninfo file
RegionObserver
RegionServer
disabling swap
graceful stop
handler count, and tuning performance
hardware
listener port
relational database
concepts
nested entities
RelationsDAO
relationship
in HBase
one-to-many and many-to-many
reliability
and failure resistance
remove_peer
replication
and time synchronization
cyclic
inter-cluster
configuring
managing
testing
types
master-master
master-slave.
See also backup.
reset()
Result, 2nd
ResultScanner
retention policy
ReturnCode
rolling restart
root directory, backing up
-ROOT- table, 3rd
finding
RowFilter
rowkey, 2nd, 3rd
design strategies
cardinality
hashing
I/O considerations
optimizing for reads
optimizing for writes
salting
design, and tuning performance
hashed, benefits of
identifying attributes
importance during table design
MD5
names should be short
partial, generating
placement of information
time series data
rowkey scan pattern
RPC listener
Runa, HBase use
Salesforce, HBase use
salting.
See rowkey, design strategies.
Scan, 2nd, 3rd, 9th
caching
designing for
executing
filter
keys
setCaching().
See filter.
Scan.setCacheBlocks()
scannerGet()
scannerOpen()
scatter-gather algorithm
schema-less database
server side, pushing work to
service, starting on cluster
serving content.
See content, serving.
setCaching()
setCaching(int)
setTimeRange()
SHA-1, vs. MD5
shell command
shutdown()
Simple Storage Service (S3)
skip list
slave node
Snappy
Socorro, HBase use
software, deploying
Solr
spatial index
designing
geohash
as rowkey
calculating
rowkey design
spatial locality, vs. data locality
split command
Stack, Michael
Stamen Design
start script
start_replication
start-hbase.sh
status command
stop script
stop_replication
stop-hbase.sh
store file
StumbleUpon
HBase adoption
MySQL
OpenTSDB
su.pr, HBase use
SubstringComparator
swap behavior, tuning
table name
TableInputFormat
TableMapper
TableMapReduceUtil
TableOutputFormat
TableReducer
TaskTracker
telemetry
temporary directory
Text
TextOutputFormat
Thrift
API
client library
gateway deployment
interacting with tables
scripting shell from
service, launching
throughput vs. latency
time series
aggregation
data management
data, recording
hot spot
metadata auto-complete
reading
Time Series Database.
See TSDB.
Time To Live.
See TTL.
timestamp
TimestampFilter
tombstone record
top (Linux tool)
Trend Micro, HBase use
truncate command
TSDB
TSDB.addPoint()
TsdbQuery
TsdbQuery.createAndSetFilter()
TsdbQuery.findSpans()
TsdbQuery.getScanner()
TsdbQuery.run()
TTL
disabling
tuning.
See performance, tuning.
TwitBase
uberhbck
UI, port
UID
creating
name auto-completion
UniqueId.getOrCreateId()
UniqueId.getSuggestScanner()
UniqueId.suggest()
UNIX, scripting shell from
upgrading
URL shorteners
user model, serving
user-interaction data, capturing
value
ValueFilter, 2nd
VerifyReplication
version
version information
versioning, doesn’t map to relational database
vm.swappiness
WAL, 2nd, 4th
disabling.
See also HLog; MemStore.
WALObserver
web search, canonical problem
well-known text.
See WKT.
WhileMatchFilter
Whirr
within query
client side
filter
WKT
workload, tuning.
See performance, tuning.
WritableByteArrayComparable
write buffer
write path
write pattern, defining
write-ahead log.
See WAL.
Yahoo! Cloud Serving Benchmark (YCSB)
Yahoo!, Hadoop development