Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Previous Chapter

Index

Acknowledgments (ack), with holding, 96–98

Active messages, 162

Adaptor memory, 111–113

Addresses, Internet, 235–236

Address lookup, 236

Address Resolution Protocol (ARP), 37

Afterburner approach, 112–113

Aggregation

edge, 359–361

random, 359

threshold, 385–387

Aho-Corasick algorithm, 402–403

Algorithms versus algorithmics, 54–55

American National Standards Institute (ANSI), 123

Anomaly intrusion detection, 399

Apache Web server, 149

API

speeding up select() by changing, 158–159

speeding up select() without changing, 157–158

Appletalk, 37 , 223

Application code, 140

Application device channels (ADCs), 161–162

buffer validation of, 74–76

Architecture

endnode, 32–34

router, 34–38

virtual interface, 162–163

Asynchronous transfer mode (ATM)

flow control, 76–77

video conferencing via, 102–104

Backtracking, 14 , 281

Baker, Fred, 227

Bandwidth, 28

guarantees, 348–354

reducing collection, 389–390

scaling, 5

Banks, 28

Banyan, 444

Barrel shifters, 23

Batch allocator, 200–201

Batching, 58 , 164

Benes networks, 328–333

Berkeley packet filter (BPF), 145 , 186–188

BGP (Border Gateway Protocol), 36 , 373–374

Binary search

of long identifiers, 100–102

pipelining, 230–231

prefix lengths, 259–261

on ranges, 257–258

Binary trees, balanced, 29

Binomial bucketing, 93

Bit-by-bit round-robin, 350–354

Bitmaps, tree, 255–257

Bit slicing, 333–334

Bit vector linear search, 289–292

Bloom filters, 410–413

Bottlenecks, 3

endnode, 4–5

router, 5–7

Boyer-Moore algorithm, 403–405

BPF (Berkeley packet filter), 145 , 186–188

Bridges/bridging, 80–81

defined, 221

Ethernets, 222–224

scaling lookups to higher speeds, 228–231

wire speed forwarding, 224–228

BSD UNIX, 40–41 , 164

callouts and timers, 178–179

Bucket sorting, 92–93

Buddy system, 200

Buffer(s)

aggregates, 126

allocation, 5 , 199–201

dynamic buffer limiting, 203

fast, 115–119

management, 198–203

overflow, 8

sharing, 201–203

stealing, 201–202

validation of application device channels (ADCs), 74–76

Buses, 28 , 33 , 305–307

Butterfly network, 444

Byte swapping and order, 207 , 208

Caches (caching), 32–33 , 63–64 , 131–135 , 242

packet classification and, 276

Callouts, 178–179

Cell, 190–191

CERN Web proxy, 153

Checksums, 5 , 36 , 203

header, 208–209

Internet, 207–209

Cheeson, Greg, 210

Cheetah Web Server, 128

Chips

design, 441–442

scaling and speeds, 31

Chi-square, 15

Circuit switches, 38

Cisco, 240 , 320 , 354 , 382

GSR, 428–429

NetFlow, 388–389

Clark, Dave, 143–144

Class-based queuing (CBQ), 353–354

Classification, See Packet Classification

Classless Internet Domain Routing (CIDR), 235–236

Client

structuring processes per client, 147–148

structuring threads per client, 148–150

Clos, Charles, 324

Clos networks, 324–328 , 442–443

Clusters

copying in, 122–123

VAX, 122–123

CMU Stanford packet filter (CSPF), 145 , 185–186 , 194–195

Code

application, 140

arrangement, 132–133

networking, 143–146

Column address strobe (CAS), 27

Compaction, frame-based, 262–263

Compaq Computer Inc., virtual interface architecture, 162–163

Concurrency, 147 , 150–151

Connection lists, getting rid of TCP open, 93–96

Content-addressable memory (CAM), 50–54 , 230 , 242 , 278

Control overhead, 5 , 226

context-switching, 146–152

fast select, 153–159

interrupts, 163–165

in networking code, 143–146

reasons for, 141–143

system calls, 159–163

Copying, 4–5

adaptor memory, 111–113

Afterburner approach, 112–113

in a cluster, 122–123

loop, 129–130

methods of, 109–111

page remapping, 115–119

reducing, 111–121

remote DMA to avoid, 121–125

semantics, transparent emulation, 119–121

Copy-on-write (COW), 57,, 113–114

transient (TCOW), 119–121

Counting (counters), 381–382

pre-prefix, 394

probabilistic, 387–388

reducing counter height using flow, 387–388

reducing counter height using threshold aggregation, 385–387

reducing counter width using randomized, 384–385

Crossbar switches/scheduler, 6 , 307–311

Crosspoints, 307–308

Cross-producting

equivalenced, 293–296

on demand, 292–293

CSPF (CMU Stanford packet filter), 145 , 185–186 , 194–195

Cyclic redundancy checks (CRCs), 203–207

Data, copying. See Copying data

Databases, incremental reading of large, 98–100

Data cache, 32

Data link layer, 223

Data manipulations, 18

DEC (Digital Equipment Corp.), 122 , 223 , 227 , 228

Decision trees, 296–299

DECNET, 37 , 223

Decoders, 23

Deficit round-robin, 350–354

Degrees of freedom, 52–53 , 64

Delay guarantees, 354–358

Delta network, 328–330 , 443–444

Demand paging, 42

Demultiplexing (demultiplexers), 5 , 19 , 23 , 145

Berkeley packet filter (BPF), 145 , 186–188

challenges of early, 184–185

CMU Stanford packet filter (CSPF), 145 , 185–186 , 194–195

defined, 182

delayered, 182

dynamic packet filter (DPF), 192–195

early, 117 , 182–195

layered, 182

packet classification and, 277

PathFinder, 145 , 189–192

in x-kemel, 81–83

Dense wavelength-division multiplexing (DWDM), 323

Descriptors, 160–161 , 163

Design, implementation principles versus, 65–66

Device driver, 43

DiffServ, 272 , 277 , 348 , 359–361

Dijkstra’s algorithm, 77–80

Directed acyclic graph (DAG), 191

Direct memory access (DMA), 33 , 226

remote, 121–125

versus programmed I/O, 135

Display-get-data, 144

Distributed systems, routers as

asynchronous updates, 371–373

internal flow control, 363–368

internal striping, 368–371

Divide-and-conquer, 288–296

dlmalloc(), 200

Doorbells, 163

Download times, reducing, 66–67

Dynamic buffer limiting, 203

Dynamic packet filter (DPF), 192–195

Dynamic random access memory (DRAM), 26–29 , 32 , 33 , 226 , 441

reducing SRAM width using, backing store, 382–384

Earliest deadline first, 356

Encoders

architecture, 34

design of priority, 22–23

programmable priority, 24–25 , 322

quality of service and priority, 22

Endnodes, 4–5 , 418–419

architecture, 32–34

ESLIP, 321 , 322

Ethernets

description of, 222–224

forwarding packets, 80–81

Event-driven scheduler, 150

Event-driven server, 150–151

Evil packet example, 8

Exact-match lookups, 6 , 28 , 221–232

ExpiryProcessing, 171 , 172

Expression tree model, 185–186

Extended grid of tries (EGT), 288

False negative, 10

Fast retransmit, 343

Fast select. See select()

Fbufs (fast buffers), 115–119

FDDI, 228

Fiber Channel, 20–21 , 122 , 123

File systems

IO-Lite, 126–128

I/O splicing, 128–129

shared memory, 116 , 125–126

Fine-granularity timers, 179–180

Firewalls, 272

First in, first out (FIFO), 339

Fisk, Mike, 8

Fixed-stride tries, 246–247

Flash Web server, 151 , 428

Flip-flops, 25

Flow control, internal, 363–368

Flow counting, reducing counter height using, 387–388

Flow ID lookups, 28–30

Flow switching, 240–241

Forwarding, 17

Forwarding information base (FIB), 35

Fractional cascading, 285

Fragmentation, 37

of link state protocols, 87–89

Frame-based compaction, 262–263

Geometric view, of packet classification, 284–286

Gigaswitch, 228–230

Green, Larry, 210

Grid of tries, 281–284

extended, 288

Hardware

component-level design, 30–31

design tools, 23–25

logic gates, 21–22

memory, 25–30

models, 437–442

parallelism, 230–231

parameters, 31–32

transmission speed, 22–23

Hart, John, 227

Harvest Web server. See Squid Web server

Hashed wheels, 175–176

Hashing, 28 , 75 , 228–230

locality-preserving, 405

Header checksum, 208–209

Header fields, 273

Header prediction, 210–212

Header validation, 36

Head-of-line blocking, 6 , 311–316

Hewlett-Packard, Open View, 20

Hierarchical deficit round-robin, 353

Hierarchical wheels, 176–178

Hints, use of, 62–63

Hole filling, 396

Hunt groups, 310–311

Hypercube, 444

IBM, 223

I-caches, 132–133

Identifiers, binary search of long, 100–102

Implementation principles

caution when using, 68–70

modularity with efficiency principles, 56 , 61–63

routines, principles for speeding up, 56 , 63–65

systems principles, 56–61

versus design, 65–66

Infiniband, 123–124

Instruction cache, 32

Integrated layer processing (ILP), 130

Intel

virtual interface architecture, 162–163

VTune, 20

Internal flow control, 363–368

Internal striping, 368–371

Internet Control Message Protocol (ICMP), 37

Interrupt(s)

handlers, 40 , 145

reducing, 163–165

software, 40 , 144 , 164

Intrusion detection systems (IDSs)

Aho-Corasick algorithm, 402–403

anomaly, 399

Boyer-Moore algorithm, 403–405

logging, 409–413

probabilistic marking, 406–409

searching for multiple strings in packet payloads, 401–405

signature, 399

speeding up, 67–68

string matching, approximate, 405–406

subtasks, 400

worms, detecting, 413–415

IO-Lite, 126–128

I/O splicing, 128–129

IP Lookups, See Prefix-match Lookups

iSCSI, 20–21 , 124–125

iSLIP, 316–323

Jupiter Networks, 324 , 326 , 392–393

Kempf, Mark, 223–224 , 225 , 227

Kernels, 43 , 162

Kingsley, Chris, 199

Labels, passing, 277

Latency, 19

Lauck, Tony, 227

Layer 4 switching. See Packet classification

Layer processing, locality-driven, 133–134

Lazy evaluation, 14 , 57–58 , 208

Lazy receiver processing (LRP), 165

Lea, Doug, 200

Leaf pushing, 252

Least-cost matching rule, 270 , 273–275

Level-compressed tries, 250–251

Linear feedback shift register (LFSR), 206

Linear search, 276

bit vector, 289–292

Link state packet (LSP), 18 , 77 , 87–89

Link state protocols, avoiding fragmentation of, 87–89

Linux allocator, 200

Logging, 409–413

Logic gates, 21–22 , 437–438

Lookups

chip model, 263–264

exact-match, 6 , 28 , 221–232

flow ID, 28–30

prefix-match, 6 , 35 , 233–266

Lulea-compressed tries, 252–255

malloc(), 199

Markers, 366

Masking, 207 , 235

Matchmaking, 148

mbufs, 118 , 199

McQuillan, John, 272

Measuring network traffic

difficulty of, 381–382

Jupiter network example, 392–393

reducing collection bandwidth, 389–390

reducing counter height using flow counting, 387–388

reducing counter height using threshold aggregation, 385–387

reducing counter width using randomized counting, 384–385

reducing processing using NetFlow, 388–389

reducing SRAM width using DRAM backing store, 382–384

Sting, 395–396

traffic matrices, 393–395

trajectory sampling to correlate, 390–391

Memory, 25–30

adaptor, 111–113

allocation in compressed schemes, 261–263

backtracking, 281

content-addressable memory (CAM), 50–54 , 230 , 242 , 278

direct memory access (DMA), 33

dynamic random access memory (DRAM), 26–29 , 32 , 33 , 226 , 441

main, 32

mapped, 33

registered, 163

scaling, 335–336

shared, 116,, 125–126 , 305

static random access memory (SRAM), 26 , 32 , 228 , 382–384 , 441

virtual, 41–43 , 113–114

Memory management unit (MMU), 42

Microsoft Inc., virtual interface architecture, 162–163

Modified deficit round-robin, 354

Modularity with efficiency principles, 56

generality, avoiding, 62

hints, use of, 62–63

over referencing, avoiding, 62

replace inefficient routines, 61–62

Multibit tries, 245–250

Multicast, 321–322

Multichassis routers, 323 , 326–328

Multiplexers, 23

Multi-protocol-label switching (MPLS), 37 , 240 , 241 , 277

Multithreading, 38

Net-dispatch, 144

NetFlow, 388–389

Network address translation (NAT), 236

Network algorithmics

algorithms versus, 54–55

characteristics of, 13–15

defined, 14 , 423–427

future, 429–431

real products and, 427–429

techniques, 7–15

Networking code, avoiding scheduling overhead in, 143–146

Network processors, 36 , 37–38

Node compression, tries and, 83–85

1D torus, 334–335

Operating systems

system calls and simple, 43–44

uninterrupted computation, 39–41

virtual memory, 41–43

OSPF, 18 , 36 , 77

Output queuing, 312–314

Output scheduling, 36

Packet classification, 6 , 36 , 85 , 185

caching, 276

content-addressable memory, 278

cross-producting, 292–293

cross-producting, equivalenced, 293–296

decision trees, 296–299

demultiplexing, 277

divide-and-conquer, 288–296

extended grid of tries, 288

geometric view, 284–286

grid of tries, 281–284

linear search, 276

linear search, bit vector, 289–292

passing labels, 277

reasons for, 271–273

requirements and metrics, 275–276

role of, 270

routers, 270

tuples, 273–275

two dimensional, 278–284 , 287–288

Packet filters

Berkeley (BPF), 145,, 186–188

CMU Stanford (CSPF), 145,, 185–186

dynamic, 192–195

Packets, 17

filtering in routers, 85–87

flow, 340

header validation and checksums, 36

logs, 388

repeaters, 223–224

scheduling, 339–361

Page mode, 27 , 226

Page remapping, 115–119

Pages, 42

Parallelism, hardware, 230–231

Parallel iterative matching (PIM), 314–316

Pareto optimality, 201

PathFinder, 145 , 189–192 , 277

Path MTU, 213–214

Patricia trie, 14 , 245

pbufs, 199

Perfect hashing, 229

Performance, improving, 364–365 , 368–369 , 372–373

Performance measures, 19–20

select() and server performance problem, 153–154

for timers, 171

Perlman, Radia, 227

PerTickBookkeeping, 171 , 172

Piggybacking, 98

Ping, 395

Pipelining, 28–29 , 230–231

Polling, 164

Population scaling, 5

Prefix-match lookups, 6 , 35

binary search, prefix lengths, 259–261

binary search, on ranges, 257–258

flow switching, 240–241

memory allocation, 261–263

model, 236–238

model, lookup-chip, 263–264

multi-protocol label switching, 37 , 240 , 241

nonalgorithmic techniques for, 242

notation, 234–235

threaded indices and tag switching, 14 , 238–240 , 241

tree bitmaps, 255–257

tries, level-compressed, 250–251

tries, Lulea-compressed, 252–255

tries, multibit, 245–250

tries, unibit, 243–245

variable-length, reasons for, 235–236

Pre-prefix counters, 394

Priorities, 320–321 , 347

Probabilistic counting, 387–388

Probabilistic marking, 406–409

Programmable logic arrays (PLAs), 439

Programmable priority encoders, 24–25 , 322

Protocol control block (PCB), 210–212

Protocol Engines, Inc., 210

Protocol processing

buffer management, 198–203

checksums and cyclic redundancy checks, 203–209

generic, 209–213

reassembly, 213–216

Protocols, 17–19 , 36–37

reservation, 347–348

Pushout, 202–203

Quality of service (QOS), 22

reasons for, 340–342

Queuing, 36

class-based, 353–354

multiple outbound, 346–347

output, 312–314

scalable fair, 358–361

Random early detection (RED), 342–345

Randomization

avoiding, 316–323

memory scaling using, 335–336

Rational, Quantify, 20

Reading large databases, incremental, 98–100

Rearrangeably nonblocking, 327

Reassembly, 19 , 213–216

Receiver livelock, 40–41

avoiding, 164–165

Recursive flow classification (RFC), 293–296

Redirects, 37

Reentrant, 148

Registered memory, 163

Registers, 25 , 440

Reliability, 365–368 , 369–371 , 373

Remote direct memory access (RDMA), 121–125

Repeaters

filtering, 224

packet, 223–224

Resemblance, 405–406

Reservation protocols, 347–348

Resource Reservation Protocol (RSVP), 347–348

Resources, identifying, 92–93

RIP, 36

Round-robin

deficit (bit-by-bit), 350–354

slice-by-slice, 348–350

Routers, 5–7 , 419–420 See also Distributed systems, routers as

architecture, 34–38

fragmentation, redirects and ARPs, 37–38

history of, 305–307

lookup, 35–36

multichassis, 323 , 326–328

packet classification, 270

packet filtering in, 85–87

pin-count for buffers, 30–31

processing, 36–37

queuing, 36

switching, 36

telephone switches versus, 304–305

Routines, principles for speeding up, 56 , 63–65

Routing, 17

computation using Dijkstra’s algorithm, 77–80

Row address strobe (RAS), 27

Sampled charging, 389–390

Savage, Stefan, 395

Scalable fair queuing, 358–361

Scale

bandwidth, 5

endnode, 4

population, 5

router, 5

Scaling

chip, 31

to faster switches, 333–336

to larger switches, 323–333

memory, 31 , 335–336

via hashing, 228–230

Schedule clients, 19

Scheduling

crossbar, 24–25 , 307–311

event-driven, 150

output, 36

packets, 339–361

SCSI (small computer system interface), 20–21 , 123

Security issues. See Intrusion detection systems (IDSs)

Security forensics problem, 54–55

select()

analysis of, 155–157

server performance problem, 153–154

speeding up by changing API, 158–159

speeding up without changing API, 157–158

use and implementation of, 154–155

Server, event-driven, 150–151

Service differentiation, 6 , 270

Set-pruning tries, 278–281

Shared memory, 116 , 125–126 , 305

Shelley, Bob, 227

Short links, 334–335

Signature intrusion detection, 399

Simcoe, Bob, 228

Simple Network Management Protocol (SNMP), 37 , 381

SNA, 37 , 223

Snapshot, 367

SNORT, 399–402

SNS, 223

Socket queue, 41

Software interrupt, 40 , 144 , 164

Spanning tree algorithm, 227

SPECweb, 128

Spinney, Barry, 228–229 , 230

Squid Web server, 150 , 153

StartTimer, 171 , 172

State machine implementation, 30–31

Static random access memory (SRAM), 26 , 32 , 228 , 441

reducing, using DRAM backing store, 382–384

Sting, 395–396

StopTimer, 171 , 172

Storage area networks (SANs), 20–21 , 123

String matching, approximate, 405–106

Strings in packet payloads, searching for, 401–105

Structure, 4

Substitution error, 405

Switching (switches), 36

Benes networks, 328–333

bit slicing, 333–334

Clos networks, 324–328

costs, 324

crossbar scheduler, 307–311

flow, 240–241

head-of-line blocking, 311–316

iSLIP, 316–323

memory scaling, 335–336

multi-protocol-label, 37 , 240 , 241

optical, 38

output queuing, 312–314

parallel iterative matching (PIM), 314–316

router, 305–307

router versus telephone, 304–305

scaling, 323–336

shared-memory, 305

short links, 334–335

theory, 442–443

threaded indices and tag, 14 , 238–240 , 241

Switch pointers, 282–283

System calls, 43–14

avoiding, 159–163

Systems principles

leverage off system components, 59–60

performance improved by hardware, 60–61

relaxing requirements, 58–59

time and space computation, 57–58

waste, avoiding, 56–57

Tag switching, 14 , 238–240 , 241

Take-a-ticket scheme, 307–311

Task-based structuring, 151–152

tcpdump, 20

TCP/IP (Transmission Control Protocol/Intemet Protocol), 17–19 , 21

congestion control, 342

header prediction, 210–212

open connection lists, getting rid of, 93–96

transport and routing, 433–437

Teardrop attack, 409

Telephone switches

Clos networks, 325–326

router versus, 304–305

Thomas, Bob, 228

Threads, 40

indices and tag switching, 238–240 , 241

per client, 148–150

Threshold aggregation, 385–387

Throughput, 19

memory, 28

Timers, 5 , 19

BSD UNIX implementation, 178–179

delays, 439

fine granularity, 179–180

hashed wheels, 175–176

hierarchical wheels, 176–178

reasons for, 169–171

routines and performance of, 171

simple, 172–173

wheels, 173–174

Timing Wheels, See Timers, wheels

Token bucket shaping and policing, 345–346

Tomography, 394

Traceback

logging, 409–413

probabilistic marking, 406–409

Traceroute, 395

Trading memory for processing, 14

Traffic matrices, 393–395

Traffic patterns, monitoring, 90–92 See also Measuring network traffic

Trajectory sampling, 390–391

Transistors, 437–438

Translation look-aside buffer (TLB), 42 , 115

Transmission speed, 22–23

Transport-arm-to-send, 144

Transport-get-port, 144

Tree bitmaps, 255–257

Tries, 402

defined, 190

extended grid of, 288

fixed-stride, 246–247

grid of, 281–284

level-compressed, 250–251

Lulea-compressed, 252–255

multibit, 245–250

node compression and, 83–85

Patricia, 14 , 245

set-pruning, 278–281

variable-stride, 247–250

unibit, 243–245

UDP (User Datagram Protocol), 17 , 212–213

ufalloc(), 153–154

Unibit tries, 243–245

UNIX mbufs, 118 , 199 See also BSD UNIX

Upcalls, 143–145

Updates, asynchronous, 371–373

User-level implementation, 144–146

User processes, 40

Variable-stride tries, 247–250

VAX cluster, 122–123

Video conferencing, asynchronous transfer mode and, 102–104

Virtual circuits (VCs), 76–77 , 102–104

Virtual clock, 355–356

Virtual interface architecture (VIA), 162–163

Virtual memory, 41–43 , 113–114

Virtual output queues (VOQs), 314–315 , 322

WAN (wide area network), 153

Waters, Greg, 231

Web servers

context-switching control overhead, 146–152

event-driven scheduler, 150

event-driven server, 150–151

process per client, 147–148

task-based structuring, 151–152

thread per client, 148–150

Wheels. See Timers

Wire speed forwarding, 9 , 224–228

Worms, detecting, 413–415

Xerox, 223

x-kemel, demultiplexing in, 81–83

XTP protocol, 210

Zeus Web server, 150

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Bibliography

Create new playlist

Sign In

Sign Up

Index

Table of Contents for
Bibliography