Index

Symbols

1 fault tolerant, 38

2PC (two-phase commit), 40, 141–142

3PC (three-phase commit), 40

A

acceleration, 108

ACID (Atomicity, Consistency, Isolation, and Durability), 6, 139–140

address resolution protocol. See ARP

addresses (IP)

Anycast, 102–103

load balancing, 63–65

managing Wackamole, 92

administration, 25, 36

cost effectiveness, 34

high availability, 26

monitoring, 27–30

release cycles, 30–34

speed, 34–36

AFS protocol, priority placement, 85

agents

MTAs, 28

SNMP, 28

aggregation

passive logs for metrics, 187–188

passive sniffing log, 173–174

periodic batch, 171–172

real-time unicast, 172–173

analysis, real-time, 181–183

Anycast, static content, 102–103

Apache

memory resources, 79

Web server, 88

APIs (application programming interfaces), 67

application-layer load balancers, 67–71

applications

horizontal scalability, 5

mod_perl, 202

wwwstat, 186

applying

logging, 197

Spread, 241–245

architecture

data resiliency, 137

design, 23

five nines availability, 26

high availability

costs, 39–40

Foundry ServerIron, 45–49

growth, 55

load balancing, 41–43

maintenance, 40–41

mission-critical systems, 55–60

peer-based, 49–54

site surveys, 45

traditional, 43–44

logging, 176–177

mission-critical environments, 25

cost effectiveness, 34

high availability, 26

management, 36

monitoring, 27–30

release cycles, 30–34

speed, 34–36

production environments, 12–13

scalability, 5, 9–10

decreasing, 7–8

flawed designs, 6–7

need for, 6

real-world design of, 8–9

Spread, 227–228

ARP (address resolution protocol), 63

response packets, 97

spoofing, 96

Atomicity, Consistency, Isolation, and Durability (ACID), 6, 139–140

availability

five nines availability, 26

high availability, 26, 37–38

costs, 39–40

Foundry ServerIron, 45–49

growth, 55

load balancing, 41–43

maintenance, 40–41

mission-critical systems, 55–60

peer-based, 49–54

site surveys, 45

traditional, 43–44

static content, 90

available resource load balancing, 62

avoiding failure

control, 14–15

disaster recovery, 22–23

rapid development, 15–16

unit testing, 16–17

version control, 18–21

production environments, 12–13

stability, 14–15

disaster recovery, 22–23

rapid development, 15–16

unit testing, 16–17

version control, 18–21

B

batches, periodic aggregation, 171–172

binlog, 147

black-box load balancers. See Web switches

browser caching, 83

building high availability architectures, 40–41

C

caches, 134–135

cache-on-demand, 86–87

deploying, 117

invalidation, 134

static content clusters, 83

types of, 107

data, 112–113

distributed, 114–116

integrated/look-aside, 109–112, 122–127

layered/transparent, 107–108

write-thru/write-back, 113–114

Web, 87–88

capacity

horizontal scalability, 5

planning, 72

CARP (Common Address Redundancy Protocol), 49

casual monitors, 177

causal ordering, 228

changeset replication, 145–146

Oracle, 147

selecting, 147

CLF (Common Log Format), 180

clients

spoon-feeding, 108

testing, 223

writing, 220–223

clusters, 50

load balancing, 61–63

IP services, 63–65

web switches, 65–66

load balancing, 73

periodic batch aggregation, 171–172

static content, 82

caching, 83

upper bound, 83

CODA protocol, priority placement, 85

code, consistency, 41

collecting metrics, 191–193

Common Address Redundancy Protocol (CARP), 49

Common Log Format (CLF), 180

communication

group (Spread), 227–228

Spread, 229–230

applying, 241–245

configuring, 231–241

installing, 231

compiling Wackamole, 91

complete reliability, 138

complex scheduling, monitoring, 29

computational reuse, 109

Concurrent Versioning System (CVS), 19

conditions, race, 154

configuration

databases, 189–193

logging, 177

mod_log_spread, 179

spreadlogd, 178–179

logs, 152

MySQL

defining scope, 206–207

selecting, 207–208

technical setup, 200–206

testing, 223–226

troubleshooting, 208–223

PHP extensions, 203–206

Spread, 91, 231–241

Squid, 98

Wackamole, 92–94

connections

least, 62

Spread, 178–179

consistency, code, 41

content distribution

caches, 134–135

data, 112–113

deploying, 117

distributed, 114–116

integrated/look-aside, 109–112, 122–127

layered/transparent caches, 107–108

types of, 107

write-thru/write-back, 113–114

dynamic

serving news sites, 117–125, 127–130

two-tier execution, 130–134

context switching, 77–78

control, 14–15

disaster recovery, 22–23

rapid development, 15–16

unit testing, 16–17

versions, 18–21

cookies, 128–130

costs

architecture, 34

high availability, 39–40

replication, 126

CREATE TABLE DDL, 151

criteria for monitoring systems, 29–30

cross-vendor replication, implementing, 151–166

cull_old_hits function, 216

custom reactions, monitoring, 29

CVS (Concurrent Versioning System), 19

D

daemons, Spread, 229–230

data caches, 112–113

data modification language. See DML

data resiliency, 137

databases

distributed

data resiliency, 137

geographically distributed operations, 139

operational failover, 138

optimizing query performance, 138

overview of, 137

reliability, 138

MySQL

defining scope, 206–207

selecting, 207–208

technical setup, 200–206

testing, 223–226

troubleshooting, 208–223

RDBMS, 113

RRDtool, 188–189

collecting metrics, 191–193

configuring, 189–190

generating graphs, 194–197

replication, 139

master-master, 144

master-slave, 144–147

multimaster, 140–143

scalability, 148, 151

implementing cross-vendor replication, 151–166

implementing same-vendor replication, 166

dblinks, 164

DDL, CREATE TABLE, 151

decreasing scalability, 7–8

defining scope, 206–207

dependencies, services, 30

deploying caches, 117

design, 6–10, 23. See also configuration

development

internal release cycles, 31

rapid, 15–16

differential synchronization, 85

disaster recovery, 22–23

distributed databases

data resiliency, 137

geographically distributed operations, 139

master-master replication, 144

master-slave replication, 144–147

multimaster replication, 140–141

2PC, 141–142

EVS, 142–143

operational failover, 138

optimizing query performance, 138

overview of, 137

reliability, 138

scalability, 148, 151

cross-vendor replication, 151–154, 156–166

same-vendor replication, 166

distribution

caches, 114–116, 134–135

data, 112–113

deploying, 117

integrated/look-aside, 109–112, 122–127

layered/transparent, 107–108

types of, 107

write-thru/write-back, 113–114

dynamic

serving news sites, 117–130

two-tier execution, 130–134

static content, 83–87

DML (data modification language), 145–147

logs, 152

race conditions, 154

replay replication, 152–162

snapshot replication, 163–166

DNS (Domain Name Service)

high availability, 55–60

Round-Trip Times

static content, 101

dot.com bust, effect on large systems, 13–14

dynamic content distribution

serving, 117–125, 127–130

two-tier execution, 130–134

E

effective resource utilization, load balancing, 62

email, high availability, 55–60

emergency releases, 33

environments

mission-critical, 25

cost effectiveness, 34

high availability, 26

management, 36

monitoring, 27–30

release cycles, 30–34

speed, 34–36

production, 12–13

events

logging

architecture, 176–177

configuring, 177

mod_log_spread, 179

optimizing, 175–176

overview of, 169–171

passive sniffing log aggregation, 173–174

periodic batch aggregation, 171–172

real-time unicast aggregation, 172–173

spreadlogd, 178–179

monitoring, 30

evolution of architecture, 8

EVS (extended virtual synchrony), 142–143

exports, revision control, 86

extended virtual synchrony (EVS), 142–143

extensibility, monitoring, 29

extensions, creating PHP, 203–206

external release cycles, 33–34

F

failover, 26, 138

fault tolerance, 38, 90. See also availability

FIFO ordering, 228

files, creating, 92–94

Finagle’s Law, 12

five nines availability, 26

firewalls

high availability, 55–60

OSs (operating systems), 90

flexible notifications, monitoring, 29

flipping snapshots, 163–166

formatting CLF, 180

Foundry ServerIron, 45–49

frameworks, monitoring, 29–30

FreeBSD 4.9, 90

Freevrrpd, 49

front-end load balancers, 66

functions

cull_old_hits, 216

get_current_online_count, 216

get_hit_info, 211

online_init, 218

online_shutdown, 218

sl_find_compare_neighbors, 216

G

generating RRDtool graphs, 194–197

geographically distributed operations, 139

getcounts method, 222

getuserinfo method, 222

get_current_online_count function, 216

get_hit_info function, 211

graphs, generating, 194–197

gratuitous ARPing, 97

groups, Spread communication, 227–228

H

HA (high availability), 37–38

costs, 39–40

load balancing, 41–43

maintenance, 40–41

mission-critical systems, 55–60

peer-based, 49–55

traditional, 43–44

Foundry ServerIron, 45–49

site surveys, 45

Wackamole, 94–98

Web servers, 89

hardware

high availability, 43

horizontal scalability, 5

load balancing, 42

costs, 39–41

load balancing, 41–43

mission-critical systems, 55–60

peer-based, 49–55

traditional, 43–44

Foundry ServerIron, 45–49

site surveys, 45

high performance computing (HPC) systems, 72

Hitdate index, 215

horizontal scalability, 5. See also scalability

Hot Standby Routing Protocol (HSRP), 45

hot-standby, 138

HPC (high performance computing) systems, 72

HSRP (Hot Standby Routing Protocol), 45

http acceleration mode, Squid Web servers, 89

HTTPS (secure hypertext transport protocol), 64

I–J

image serving, 99–101

implementation

cross-vendor replication, 151–166

flipping snapshots, 163–166

high availability, 40–41

monitoring, 27–28

same-vendor replication, 166

information collectors, selecting, 214–220

infrastructure

mission-critical environments, 25

cost effectiveness, 34

high availability, 26

management, 36

monitoring, 27–30

release cycles, 30–34

speed, 34–36

scalability, 5

architecture, 9–10

decreasing, 7–8

flawed designs, 6–7

need for, 6

real-world design of, 8–9

installation

Spread, 231

Squid, 98

Wackamole, 91–94

integrated caches, 109–112, 122–127

integrity, caches, 130

internal release cycles, 31–32

Internet, load balancing, 72–73

invalidation, caches, 134

IP (Internet Protocol)

addresses

Anycast, 102–103

managing Wackamole, 92

load balanced protocols, 72–73

services, 63–65

IPVS (IP virtual servers), 67

ISPs (Internet service providers), 83

K–L

keys, primary, 158

large systems, dot.com bust effect on, 13–14

latency, static content improvements, 99–101

layered caches, 107–108

layers, application-layer load balancers, 67–71

least connections load balancing, 62

load balancing, 61–63

application-layer, 67–71

definition of, 71–72

high availability, 41–43

IP services, 63–65

IPVS, 67

services, 72–73

session stickiness, 73–74

web switches, 65–66

loading MySQL modules, 220

logging

applying, 197

architecture, 176–177

binlog, 147

configuring, 177

mod_log_spread, 179

spreadlogd, 178–179

DML, 152

monitors, 177

optimizing, 175–176

overview of, 169–171

passive log aggregation for metrics, 187–188

passive sniffing log aggregation, 173–174

periodic batch aggregation, 171–172

RRDtool, 188–189

collecting metrics, 191–193

configuring, 189–190

generating graphs, 194–197

real-time analysis, 181–183

real-time monitoring, 183–186

real-time unicast aggregation, 172–173

servers, 177

look-aside caches, 109–112, 122–127

M

Mail Transport Agents. See MTAs

maintenance

high availability, 40–41

monitoring, 30

management

logging, 171–172

mission-critical environments, 25, 36

cost effectiveness, 34

high availability, 26

monitoring, 27–30

release cycles, 30–34

speed, 34–36

Management Information Bases. See MIBs

manual magic, 142

master-master replication, 144

master-slave replication, 144–147

memcached servers, 125

memory resources, Apache, 79

methods

getcounts, 222

getuserinfo, 222

query, 221

read_response, 222

metrics

passive log aggregation for, 187–188

RRDtool, 188–189

collecting, 191–193

configuring, 189–190

generating graphs, 194–197

MIBs (Management Information Bases), 28

mission-critical environments, 25

availability, 55–60

cost effectiveness, 34

high availability, 26

management, 36

monitoring, 27–30

release cycles, 30–34

speed, 34–36

modules, loading MySQL, 220

mod_log_spread, 179

mod_perl application, 202

monitoring, 27–30, 183–186

casual, 177

logging, 177

Moore’s Law, 13

MTAs (Mail Transport Agents), 28

multimaster replication, 140–141

2PC, 141–142

EVS, 142–143

multinode clusters, 171–172

MySQL, 147

scope, 206–207

selecting, 207–208

technical setup, 200–206

testing, 223–226

troubleshooting, 208–223

N

N-1 fault tolerant, 38

name-based virtual hosting, 64

NAT (network address translation), 65

network file system (NFS), 85

networks, partitions, 142

news sites

serving, 117–125, 127–130

two-tier execution, 130–134

NFS (network file system), 85

nodes, 84–85

AFS protocol, 85

CODA protocol, 85

differential synchronization, 85

NFS (network file system), 85

revision control exports, 86

thttpd Web server, 88

notifications, monitoring, 29

O

online_init function, 218

online_shutdown function, 218

operating systems (OSs), 90

operational failover, 138

optimization

logging, 175–176

passive log aggregation for metrics, 187–188

real-time analysis, 181–183

real-time monitoring, 183–186

queries, 138

Oracle, changeset replication, 147

ordering, 228

OSs (operating systems), 90

outages, 26. See also availability

P

P2P (peer-to-peer) systems, 39

partitions, networks, 142

passive log aggregation for metrics, 187–188

passive sniffing log aggregation, 173–174

peak rates, cost effectiveness, 81–82

peer-based high availability, 49–55

peer-to-peer (P2P) systems, 39

performance

dynamic content distribution, 121

logging

real-time analysis, 181–183

real-time monitoring, 183–186

passive log aggregation for metrics, 187–188

peak rates, 81–82

queries, 138

speed, 34–36

periodic batch aggregation, 171–172

perl, mod_perl application, 202

PHP extensions, creating, 203–206

PKI (public key infrastructure), 69

planning capacity, 72

platforms, selecting, 208–209

point-to-point communication, 227

predictive load balancing, 62

preferences, tracking, 127–130

primary keys, 158

priority placement, 84–85

AFS protocol, 85

CODA protocol, 85

differential synchronization, 85

NFS, 85

revision control exports, 86

thttpd Web server, 88

production

internal release cycles, 32

working, 12–13

protocols

AFS, 85

ARP, 63

CARP, 49

CODA, 85

HSRP, 45

HTTPS, 64

IP load balanced, 72–73

SNMP, 28

VRRP, 45

proxy caches, 88, 108. See also caches

public key infrastructure (PKI), 69

publishing logs

configuring, 177

mod_log_spread, 179

spreadlogd, 178–179

Q–R

query method, 221

queries

caches, 113

optimizing, 138

quorums, 143

race conditions, DML, 154

random load balancing, 62

rapid development, 15–16

RDBMS (relational database management system), 113

reactions, monitoring, 29

read_response method, 222

real-time

analysis, 181–183

monitoring, 183–186

unicast aggregation, 172–173

real-world design of scalability, 8–9

recovery, disaster, 22–23

recursive name service resolution, 101

reference primary keys, 158

relational database management system. See RDBMS

release cycles

mission-critical applications, 30–34

reliability, 138

replay replication, 152–162

replication, 40, 126

changeset, 145–147

cross-vendor, 151–154, 156–166

databases, 139

master-master, 144

master-slave, 144–147

multimaster, 140–143

replay, 152, 154, 156–162

same-vendor, 166

snapshot, 163–166

requests, 61–63

IP services, 63–65

web switches, 65–66

requirements, ACID, 140

resiliency, data, 137

resolution, DNS Round-Trip Times, 101

resources

load balancing, 62

memory, 79

site processes, 78–80

reverse-proxy support, 88

revision control exports, 86

roles, logging, 176–177

Round Robin Database tool. See RRDtool

round robin load balancing, 62

routers, high availability, 55–60

RRDtool, 188–189

collecting metrics, 191–193

configuring, 189–190

rsync tool, 85

S

same-vendor replication, implementing, 166

scalability, 5

architecture, 9–10

decreasing, 7–8

distributed databases, 148–151

implementing cross-vendor replication, 151–166

implementing same-vendor replication, 166

flawed designs, 6–7

memcached servers, 125

need for, 6

real-world design of, 8–9

speed, 34–36

web switches, 65–66

scheduling, monitoring, 29

scope, defining, 206–207

secure hypertext transport protocol (HTTPS), 64

secure socket layer (SSL), 51

security

caches, 130

firewalls, 90

selecting

changeset replication, 147

information collectors, 214–220

MySQL, 207–208

platforms, 208–209

service providers, 210–213

tools, 200

servers

Foundry ServerIron, 45–49

IPVS, 67

logging, 176–177

memcached, 125

monitoring, 27–28

Web

choosing, 88–89

processes, context switching, 77

setting up, 98

selecting, 210–213

services

dependencies, 30

IP, 63–65

monitoring, 27–30

serving news sites, 117–134

sessions

SSL caches, 116

stickiness, 73–74

tracking, 127–130

simple distributed information caches, 115

simple network management protocol. See SNMP

single point of failure, Foundry ServerIron, 47

site surveys, 45

site processes

analyzing, 76–77

context switching, 77–78

resources, 78–80

SiteUserID index, 215

skiplists, 216

sl_find_compare_neighbors function, 216

snapshot replication, 163–166

SNMP (simple network management protocol), 28–29

software, logging, 176–177

spacachepurge, 244

speed, architecture, 34–36

spoofing, ARP, 96

spoon-feeding clients, 108

spurgecached, 242

Spread, 229–230

applying, 241–245

configuring, 91, 231–241

group communication, 227–228

installing, 231

Wackamole, 91

spreadlogd, 178–179

spuser real-time observation sessions, 181

Squid Web server, 89, 98

SSL (secure socket layer), 51, 116

stability, 14–15

disaster recovery, 22–23

rapid development, 15–16

unit testing, 16–17

version control, 18–21

staging internal release cycles, 31–32

staleness, 62

static content

Anycast, 102–103

availability, 90

clustering, 82

caching, 83

upper bound, 83

distribution, 83–87

DNS Round-Trip Times, 101

improvements, 99–101

OSs (operating systems), 90

overview, 75

peak rates, 81–82

site processes

analyzing, 76–77

context switching, 77–78

resources, 78–80

Wackamole, 91

ARP spoofing, 96

benefits, 91

compiling, 91

high availability, 94–98

installing, 91, 93–94

Web servers

choosing, 88–89

setting up, 98

static URLs, creating, 82

statistics, wwwstat program, 186

stickiness, sessions, 73–74

storage, binlog, 147

subscribers, logging, 178–179

switches

high availability, 55–60

load balancing, 65–66

web, 104

synchronization, 41

differential, 85

rsync tool, 85

T

tables

CREATE TABLE DDL, 151

replay replication, 152–162

snapshot replication, 163–166

technical setup, MySQL, 200–206

testing

clients, 223

MySQL, 223–226

unit, 16–17

three-phase commit (3PC), 40

thttpd Web server, 88

timeout-based caches, 123

tools

RRDtool, 188–189

collecting metrics, 191–193

configuring, 189–190

generating graphs, 194–197

rsync, 85

selecting, 200

Spread, 229–230

applying, 241–245

configuring, 231–241

installing, 231

total ordering, 228

tracking user data, 127–130

traditional high availability, 43–44

Foundry ServerIron, 45–49

peer-based, 49–51

site surveys, 45

traffic, scalability, 8

transparent caches, 107–108

troubleshooting

high availability, 40–41

logging

overview of, 169–171

passive log aggregation for metrics, 187–188

passive sniffing log aggregation, 173–174

periodic batch aggregation, 171–172

real-time analysis, 181–183

real-time monitoring, 183–186

real-time unicast aggregation, 172–173

MySQL, 208–223

two-phase commit (2PC), 40, 141–142

two-tier execution, dynamic content distribution, 130–134

types

of caches, 107

data, 112–113

distributed, 114–116

integrated/look-aside, 109–112, 122–127

layered/transparent, 107–108

write-thru/write-back, 113–114

of DML, 145–146

U

unification of separate systems, 42. See also load balancing

unit testing, 16–17

unplanned outages, 26. See also high availability

unsolicited ARPing, 97

upper bound, static content clusters, 83

URL-Hitdate index, 215–216

URLs (Uniform Resource Locators)

static, 82

viewing, 217

user data, tracking, 127–130

utilities

RRDtool, 188–189

collecting metrics, 191–193

configuring, 189–190

generating graphs, 194–197

selecting, 200

Spread, 229–230

applying, 241–245

configuring, 231–241

installing, 231

utilization

load balancing, 62

scalability, 8

V

VCSs (version control systems), 18

versions, CVS, 19

viewing URLs, 217

Virtual Router Redundancy Protocol (VRRP), 45

visualization (RRDtool), 188–189

collecting metrics, 191–193

configuring, 189–190

generating graphs, 194–197

VRRP (Virtual Router Redundancy Protocol), 45

W–Z

Wackamole, 91

ARP spoofing, 96

benefits, 91

compiling, 91

high availability, 94–98

installing, 91–94

wackamole.conf file, 92–94

warm-standby, 138

web accelerators, 87

web caches, 86–88

web servers

choosing, 88–89

logging, 177

monitoring, 27–28

processes, 77

setting up, 98

web sites, Squid, 89

web switches, 65–66, 104

weighted random load balancing, 62

whitepaper approach to high availability, 43–44

Foundry ServerIron, 45–49

site surveys, 45

"who’s online" solution, 226

write-thru/write-back caches, 113–114

writing clients, 220–223

wwwstat program, 186
