A
- A* algorithm, 35, 381, 388, 396–399
- abscissa (x coordinate), 39
- abstract machines, 36–37
- Adamaszek, Michal, 32
- Adamaszek, Anna, 32
- admissible heuristic, 396
- Advanced Driver-Assist Systems (ADAS), 15
- Advanced Encryption Standard (AES), 284–287
- agent, defined, 18
- AGraph, 149, 150
- AI (artificial intelligence) 222, 383–384
- algorithmic problems, ones yet to solve, 411–415
- algorithms. See also specific algorithms
- as all about getting acceptable answer, 19–20
- analysis of, 36
- categorization of, 119
- comparison of as using measurement, 24
- complex ones as generally less favorable than simple ones, 36
- costs of, 24, 32–35
- defined, 11
- described, 10–15, 20–21
- design of, 23–40
- discovering correct ones to use, 173–174
- as distinct from equations, 411
- elements of, 15
- evaluation of, 35–40
- finding them everywhere, 14–15
- genetic algorithms, 368, 384
- getting them into business, 231–232
- greedy algorithms, 24, 31, 32, 174, 291–306
- heuristic algorithms, 34–35, 382
- historical creation of, 10, 27–28, 403
- importance of precision in using, 157
- iterative form of, 13
- languages of, 21
- modern-day creation of, 10
- nature-inspired algorithms, 354
- path finding algorithms, 388–399
- randomized algorithms, 331–348
- recursive algorithms, 13, 40, 310
- same algorithm as having difference uses, 21
- simplex algorithm, 368, 372–373
- that are changing the world, 403–410
- uses of, 12–14
- Al-Khawarizmi, 403
- allowable errors, 240
- Amazon, link analysis on, 407
- American Standard Code for Information Interchange (ASCII) encoding standard, 269
- “The Anatomy of a Large Scale Hypertextual Web Search Engine” (Brin and Page), 210
- Apache Lucene, 253
- Apache Software Foundation, 253
- Application Programming Interface (API), 150
- applications, determining whether one will end, 412–413
- Aristotle, 27
- The Art of Computer Programming (Knuth), 13, 35–36
- Artificial Intelligence: A Modern Approach (Russell and Norvig), 309
- artificial intelligence (AI), 222, 383–384
- artificial neurons, 17
- astronomy, as field where big data has a basis in, 228–229
- asynchronous outputs, as aspect that can dampen parallelism, 252
- Atlantic City randomized solutions, 333
- authorities, 211
- automatic responses/automation dealing with, 409
B
- Babbage, Charles, 35
- backlinks, 212
- base64 encoding, 285, 287
- Bayes’ Theorem, 20–21
- A Beautiful Mind (film), 28
- Bélády, László, 300
- Bellman, Richard Ernest, 308, 309
- Bellman-Ford algorithm, 180, 187–190, 191
- benchmarks, 37, 119
- Berkeley Open Infrastructure for Network Computing, 18
- best-first search (BFS) algorithm, 381, 388, 392–395
- Bezzel, Max, 355
- BFS (breadth-first search), 34, 164–168
- bidirectional search, as example of brute-force search algorithm, 34
- big data, 225–248
- Big O notation, 39–40
- binary heap, 127, 128, 129, 131–132, 185
- binary max heap, 128
- binary min heap, 128
- binary search, 76, 78, 100
- binary search tree (BST), 127, 128, 129–130
- bit streams, defined, 270
- Bloom, Burton H., 240
- Bloom filters, 135, 240–245, 264
- Boolean circuits, solving satisfiability of, 359–365
- Boolean combinational circuit, 360
- boredom, 219–220
- Borůvka, Otakar, 173, 170
- Borůvka’s minimum spanning tree (MST), 173
- bounded knapsack problem, 317
- bounds, 371
- branch nodes (of tree), 110, 277
- breadth-first search (BFS), 34, 164–168
- Brin, Sergey, 210–212, 213
- Bron, Coenraad, 201
- BST (binary search tree) 127, 128, 129–130
- brute-force solution, 29, 34, 121
- Burrows-Wheeler Transform, 275
C
- cache, defined, 299, 308
- caching, defined, 311
- Caesar, Julius, 283, 406
- Cartesian space, 371, 373
- census, 235, 236
- centrality, 151, 154
- chaining, 135
- ChaosKey, 405
- character substitution, 283–285
- Cheat Sheet, 4
- circuit satisfiability problem, 360
- clairvoyant replacement algorithm, 300
- class, building basic class, 82–90
- clauses, 360
- clique percolation method, 201–202
- cliques, 199–201
- cloaking, 211
- cluster, 253, 254
- cluster computing, 18
- clustering, 153
- code, 56, 92–93. See also encoding
- code cells, 52–53
- Colab (Google)
- defining, 42–46
- executing code in, 56
- getting familiar with features of, 44–46
- getting help for, 57
- performing common tasks in, 51–55
- understanding what it does, 42–43
- using hardware acceleration, 55–56
- working with, 41–57
- working with notebooks in, 47–51
- Colab Pro (settings tab in Colab), 45
- collisions, 133, 134–135, 248
- “Combinatorics of the Change-Making Problem” (Adamszek and Adamaszek), 32
- comparison, as factor in timing benchmarks, 119
- Comparison of Probabilistic Test for Primality (Finn), 334
- compression, 270–273
- compression techniques, 267, 270
- computer cache, 299–300
- computers
- Cray super-computers, 251
- every task performed on as involving algorithms, 15
- failures of, 255
- getting the most out of modern CPUs and GPUs, 16
- hypercomputer, 413
- understanding point of view of, 22
- use of to solve problems, 15–19
- variation in, 15–16
- Connection Suggestion algorithm, 197
- constraint programming, 368
- constraints, 368–369, 371, 379
- consumption (of data), 234
- costs
- of algorithms, 24
- computing costs and following heuristics, 32–35
- counterexamples, finding of in problem solving, 26–27
- Count-Min Sketch (algorithm), 135, 248
- CPLEX, 375
- CPUs, 16, 227
- “Cramming More Components Onto Integrated Circuits” (Moore), 226
- crawling, 209
- Cray, Seymour, 251
- Cray super-computers, 251
- create, read, update, and delete (CRUD), 117
- Cryptographically-Secure Pseudorandom Number Generator (CSPRNG), 405
- cryptography, 14, 268, 282–287, 406–407
- customer satisfaction, addressing of, 302
- Cutting, Doug, 253
- cycles (spider traps), 219
D
- DAGs (Directed Acyclic Graphs), 168, 169
- damping factor, 219
- Dantzig, George Bernard, 368
- data, 67–70
- arranging and searching of, 117–137
- arranging cached computer data, 299–300
- arranging of as making the difference, 22
- big data, 225–248
- changing domain of, 407
- compressing and concealing of, 267–287
- considering need for remediation of, 102–105
- consumption of, 234
- dealing with data duplication, 102–103
- dealing with missing values, 103–104
- determining need for structure of, 100–105
- distinguishing permutations in, 68–69
- facing repetitions in, 70
- finding it everywhere, 228–231
- finding of using dictionaries, 108–109
- how to find the right data, 209–210
- keeping it secret, 406
- leveraging available data, 18–19
- making it smaller, 268–282
- managing immense amounts of, 250–259
- manipulation of, 59
- matching of from various sources, 101–102
- performing compression of, 406
- raw data, 99
- representing relations of in graph, 112–115
- reserving the right data, 235–236
- searching Internet for, 208
- shuffling combinations of, 69–70
- size and complexity of data sources as greatly affecting solution resolution, 20
- sketching an answer from stream data, 240–248
- sorting of using merge sort and quick sort, 118–126
- spotting patterns in, 408–409
- stacking and piling of in order, 105–109
- storage of, 105, 234
- streaming flows of, 233–239
- structure of to obtain solution, 21–22
- structuring of, 99–115
- summarization of, 234
- transforming power into, 226–233
- understanding data content, 100–101
- understanding encoding of, 268–269
- value of is in how it is used, 232
- ways of dealing with large amounts of, 233–234
- working with trees, 109–112
- data domain, changing of, 407
- data patterns, spotting of, 408–409
- Database Management System (DBMS), 406
- dead ends (rank sinks), 219
- deadlines, meeting of, 302–303
- Dean, Jeffery, 253
- decorators, 315
- decryption, 284, 286
- degenerate matrix, 67
- degrees of separation, counting of, 202–204
- depth-first search (DFS), 34, 165–168
- deque, use of, 165
- determinant, calculation of, 90, 91–95
- Dewey Decimal System, 29
- DFS (Distributed File System), 18, 250
- dictionaries, 108–109, 137
- Dijkstra, Edsger W., 180
- Dijkstra’s algorithm, 180, 184–187, 191, 293, 386, 397
- dimension, 63
- Directed Acyclic Graphs (DAGs), 168, 169
- directed graph, 143, 182, 204
- directional graph, 146
- discovered vertex, 162
- distributed computing, 18, 250
- Distributed File System (DFS), 253, 254
- distributions, understanding of, 335–339
- divide-and-conquer approach, 28–30, 75–78, 82, 119, 122, 234
- DjVu, 270
- DNA, 237, 274, 304–305
- Dorigo, Marco, 384
- dot product, 63, 65, 88–90
- dynamic programming
- approximately string search, 326–329
- bottom-up solutions in, 310
- casting recursion dynamically, 311–314
- as compared to linear programming (LP), 369
- defined, 308
- discovering the best dynamic recipes, 316–329
- explaining of, 308–316
- knapsack problem, 312, 317–321
- traveling salesman problem (TSP). See traveling salesman problem (TSP)
E
- edit distance, 326, 414–415
- Editor (settings tab in Colab), 44
- effective, as requirement for process to represent an algorithm, 11
- Elastic MapReduce (EMR), 256
- elements, finding number of distinct elements, 246–247
- element-wise product approach, 87–88
- encapsulation, as OOP technique, 81
- encoding, 268–269, 273–275, 276–277, 278
- encryption, use of encoding for, 268
- Enron corpus, 147
- ensemble, 25, 275
- equation, defined, 11
- Euclid, 10, 12
- Euclidean distance, 387, 388, 389
- exchanges, as factor in timing benchmarks, 119
- exhaustive approach, 29
- exponential time, 297, 298
F
- Facebook, 196, 233, 250, 407
- factorial, 72
- Fano, Robert M., 276, 303
- Fast Fourier Transform (FFT), 407
- Fermi, Enrico, 339
- Fibonacci heap, 179
- Fibonacci numbers, 40, 311–312, 314
- files
- comparison of in Colab, 46
- distributing of, 253–255
- finite, as requirement for process to represent an algorithm, 11
- Finn, J., 334
- Firefox, 43
- Fischer, Michael J., 326
- Flajolet, Philippe, 246
- Flajolet-Martin (LogLog algorithm), 246
- flattening, 90, 95–96
- Floyd, Robert W., 190
- Floyd-Warhsall algorithm, 180, 190–194
- folding, as technique to avoid collisions, 134
- formulas, 11, 12–13
- Fourier Transform, 407, 414
- fractional knapsack problem, 317
- fraud detection, use of PageRank in, 221
- friendship graphs, 196
- Fruchterman-Reingold force-directed algorithm, 198
- function decorators, 307–308
- Functional Programming For Dummies (Mueller), 256
- functional programming languages, 256
- functions
- Big O notation, 39–40
- creating and using one-way functions, 413
- form of, 370
- lambda functions, 256
- in linear programming (LP), 370
- objective function, 354–355, 376, 379
- working with, 38–40
G
- G = (V, E), 142–143
- game theory, 415
- Gartner, 231
- Gauss, Carl Friedrich, 10
- Gauss’s complex multiplication algorithm, 413
- GCD (Greatest Common Divisor), 11, 12
- Gecko, Gordon (fictional character in Wall Street), 294
- general-purpose processors, 16
- “Generating Random Numbers Is a Lot Harder Than You Think,” 405
- genetic algorithms, 368, 384
- Genome Project, compression of data in, 274
- genomics, 229–230
- Ghemawat, Sanjay, 253
- GitHub, 42, 48–49, 50–51
- Global Position System (GPS), 161
- Gödel, Kurt, 411
- good enough solutions, 24
- Google, change in algorithm of, 13
- Google Colaboratory (Colab), 41–57
- Google DeepMind, 297
- Google Drive, 48
- Google File System (GFS), 253
- Google PageRank algorithm, 207, 209, 210–222, 232, 250
- GPUs (Graphics Processing Unit), 16, 17, 55–56
- gradient descent, 357
- graph analysis, as use for algorithms, 14
- graph functionality, 151
- graphs
- adding of to matrix, 157–158
- building of, 114–115
- computing centrality, 154–157
- considering essence of, 142–145
- counting edges and vertexes in, 152–153
- creating of, 163–164
- defined, 141
- defining how to draw, 148–151
- directed graph, 143–144, 182, 204
- directional graph, 146
- discovering graph secrets, 195–205
- distinguishing key attributes of, 149–150
- drawing of, 150–151
- envisioning social networks as, 196–202
- as extension of tree, 113
- finding them everywhere, 145–146
- forms of, 143
- friendship graphs, 196
- GPS navigation devices as operating, 386
- with loops, 168
- measuring graph functionality, 151–157
- mixed graph, 144
- navigating of, 202–205
- putting of in numeric format, 157–159
- representing relations in, 112–115
- representing robot’s territory as, 386
- showing social side of, 146–147
- sorting graph elements, 168–170
- traversing of, 202–205
- traversing of efficiently, 162–168
- understanding basics of, 141–159
- understanding subgraphs, 147–148
- undirected graph, 143
- unweighted graph, 171
- using list to hold a graph, 159
- vertex-labeled graph, 145
- walking a graph randomly, 204–205
- web as example of graph structure, 207
- weighted graph, 144–145, 171, 172, 173, 182
- working with unweighted versus weighted graphs, 171
- Zachary’s Karate Club sample graph, 197–198, 200
- Greatest Common Divisor (GCD), 11, 12
- greed, learning that it can be good, 30–32
- greedy, deciding when it is better to be, 292–299
- greedy algorithms
- introduction, 291–292
- addressing customer satisfaction, 302
- competing for resources, 301–303
- deciding when it is better to be greedy, 292–299
- defined, 24
- finding out how greedy can be useful, 299–306
- how they work, 294
- keeping them under control, 294–296
- meeting deadlines, 302–303
- as providing a particular solution, but not an optimal solution, 32
- as taking shortest edges among those available between all vertexes, 174
- understanding why greedy is good, 293–294
- greedy approach/method, use of, 24, 30–32, 174
- greedy best-first search, 35
- greedy exchange, 296
- greedy reasoning, 31
- greedy strategy, use of heaps in achieving, 276
H
- Hadoop, 253, 256
- halting problem, 412
- hash functions, 132, 135–137, 235, 240, 241, 264
- hash tables, 108, 132, 133, 137, 240, 241
- hashes, 241, 410
- hashing, 132–137, 235
- heaps, defined, 276
- hedge mazes, 386–387
- heuristic algorithms, 34–35, 382
- heuristic approach, 32–35
- heuristic routing, 384
- heuristics
- admissible heuristic, 396
- considering of, 381–399
- creating mazes, 388–391
- defined, 351, 381
- differentiating of, 382–384
- explaining path finding algorithms, 388–399
- goals of, 383
- going from genetic to AI, 383–384
- going heuristically around by A*, 396–399
- looking for quick best-first route, 392–395
- origin of term, 382
- routing robots using, 384–388
- scouting in unknown territories, 385–387
- using distance measures as, 387–388
- hill-climbing optimization, 354–357
- HITS (Hypertext-Induced Topic Search), 211
- Hoare, Tony, 124
- housekeeping, 252
- hubs, 211
- Huffman, David A., 276
- Huffman compression, 274, 276–277, 278
- Huffman encoding/coding, 31, 267, 303–306
- Hyper Search, 211
- hypercomputer, 413
- hyperlinks, 212
- HyperLogLog (algorithm), 135, 246
- Hypertext-Induced Topic Search (HITS), 211
I
- IBM, 231
- icons, explained, 3–4
- identity matrix, 67
- In the Eye of the Hurricane (Bellman), 309
- inbound links, 212
- index, defined, 117
- inferential statistics, 235
- “An Information Flow Model for Conflict and Fission in Small Groups” (Zachary), 198
- inheritance, as OOP technique, 81
- injecting randomness, 332
- insertion sort, switching to, 120–121
- integration, process of, 226
- Intel, on chip miniaturization, 227
- Intel i9 processor, 16
- Internet
- growth in usage of, 230
- searching of for data, 208
- Internet of Things (IoT), 230–231
- inverters, 360
- invisible text, 211
- issues, as distinguished from solutions, 19–21
K
- Karatusba multiplication, 413
- Kerboschin, Joep, 201
- keyboard shortcuts, 45–46
- keyword stuffing, 211
- Khachiyan, Leonid, 298
- Kleene’s algorithm, 190
- Kleinberg, Jon, 211
- knapsack problem, 310, 317–321
- Knuth, Donald E., 13, 35–36
- Kruskal, Joseph, 173
- Kruskal’s minimum spanning tree (MST), 31, 173–174, 177–180, 293
- kth value algorithm, 342
L
- lambda function, 256
- Laplace, Pierre-Simon, 92
- Large Hadron Collider, 229, 233
- Las Vegas randomized solutions, 333
- laser range-finder, 384
- leaf nodes (of tree), 110, 277
- Lehmer, Derrick Henry, 12
- Lempel, Abraham, 278
- Lempel-Ziv-Welch (LZW) algorithm, 267, 278–282
- Levenshtein, Vladimir, 326
- Levenshtein distance (aka edit distance), 326
- Lévy-Flight Firefly algorithm, 354
- Liber Abaci (Fibonacci), 311
- lidar, 384–385
- linear functions, using of as a tool, 368–374
- linear probing, as method to address collisions, 135
- linear programming (LP), 309, 367–380
- linear time, 292, 297
- link (of tree), 109
- link analysis, 407–408
- LinkedIn, 197
- list comprehension, 68
- lists, 142, 159
- load factor, 133
- local improvement technique, 349
- local search
- implementing Python code, 361–364
- knowing the neighborhood, 351–353
- performing of, 349–366
- presenting local search tricks, 353–359
- realizing that the starting point is important, 365–366
- solving 2-SAT using randomization, 360–361
- solving satisfiability of Boolean circuits, 359–365
- understanding of, 350–353
- Locality-sensitive Hashing (LSH) (algorithm), 136
- logic, putting randomness into yours, 341–348
- logic gates, 359
- loops, graphs with, 168
- lossy compression, 235, 270, 271–272
- Lovelace, Ada, 35
- Loyd, Sam, 415
- LZW (Lempel-Ziv-Welch) algorithm, 267, 278–282
M
- machine learning, 384
- Machine Learning For Dummies, 2nd Edition (Mueller and Massaron), 25
- make-change problem and solution, 292–293
- making toast algorithm, 12, 14
- Manhattan distance, 387, 388, 389, 394, 399
- mapping, inquiring by, 262–265
- MapReduce, 253, 255–265
- “MapReduce: Simplified Data Processing On Large Clusters” (Dean and Ghemawat), 253
- maps, 256–257, 385–386, 388
- Marchiori, Massimo, 211
- Markham, J. David, 28
- Martin, Nigel, 246
- Massaron, Luca, 25, 59
- Material Requirements Planning (MRP) software, 299
- math coprocessors, 16, 17
- matplotlib, 149, 198
- Matrix Reshish, 92
- matrixes
- accessing specific elements of, 85–86
- adding graph to, 157–158
- creating of, 83–84
- creating one as right way to start, 63–64
- defined, 60
- defining advanced operations of, 65
- degenerate matrix, 67
- developing matrix computation class, 79–96
- element-wise product approach, 87–88
- identity matrix, 67
- inversion of, 67
- manipulation of, 90–96
- matrix inversion, 67
- multiplying of, 64–65, 87–90, 412
- performing calculations using, 60–67
- performing scalar and matrix addition, 86–87
- printing of, 84
- singular matrix, 67
- sparse matrix, 213
- substochastic matrix, 218
- as technique for putting graph into numeric format, 142
- transition matrix, 216, 217
- matroids, 295
- “Matroids for Greedy Algorithms” (video), 295
- MAX-3SAT, 383
- mazes, 386–391, 393
- mean time between failures (MTBF), 255
- measurement, role of, 24
- memoization, 308, 311, 313, 314
- merge sort, 122–124
- metaheuristics, 384
- meteorology, 229
- “A Method for the Construction of Minimum-Redundancy Codes” (Huffman), 276
- mid-square, as technique to avoid collisions, 134
- “Millennium Prize Problems” (Clay Mathematics Institute), 298
- minimum spanning tree (MST), 170–180.See also Kruskal’s minimum spanning tree (MST); Prim’s minimum spanning tree (MST)
- Miscellaneous (settings tab in Colab), 45
- Mitchell, Stuart Antony, 375
- mixed graph, 144
- Monte Carlo method, 339–341, 344–347
- Monte Carlo randomized solutions, 333
- Moore, Gordon, 226
- Moore’s Law, 226–228
- MP3, 270
- MPEG, 270, 275
- MrJob (Map Reduce Job), 256
- MRP (Material Requirements Planning) software, 299
- MST (minimum spanning tree). See minimum spanning tree (MST)
- Mueller, John Paul, 25, 59, 256
- multicore processing units, 251
- multithreading, 253
N
- Napoleon For Dummies (Markham), 28
- Nash, John, 28, 411
- Nash Equilibrium, 28
- nature-inspired algorithms, 354
- negative edge, adding of, 182–184
- negative weights, 180–181
- networks
- clustering of in groups, 196–199
- discovering communities in, 199–202
- explaining importance of, 142–148
- as kind of graph that associates names with vertexes, edges, or both, 142
- neural networks, 17
- for running agent, 18
- social networks, 196–202
- NetworkX, 149, 150–151, 154, 157, 163, 164, 201
- Neumann, John von, 411
- neural networks, as benefitting from use of specialized chips, 17
- Newton, Isaac, 10, 27
- node (of tree), 109
- noise, 26
- normal distribution, 336
- Norvig, Peter, 309
- Notebook, performing commons tasks in, 51–55
- notebooks, working with, 47–51
- NP-complete problems, 297–299
- NP-completeness theory, 297
- NP-hard problem, 350, 353
- n-queens, 355–357, 384
- NumPy, 61, 64, 79, 80–81
- Nutch, 253
O
- objective (of problem), 368
- objective function, 354–355, 376, 379
- Object-Oriented Programming (OOP), 81
- objects, learning to count objects in a stream, 247–248
- occupancy grid maps, 385–386, 388
- oceanography, 230
- “On Selecting a Satisfying Truth Assignment” (Papadimitriou), 361
- one-way functions, creating and using of, 413
- operations, distributing of, 253–255, 258
- optimal page replacement policy, 300
- optimal substructure, 190
- optimizer, 369, 379
- ordinate (y coordinate), 39
- organizational chart, as graph, 146
- outbound links, 212
- overhead, as aspect that can dampen parallelism, 252
P
- padding, 285
- Page, Larry, 210–212, 213
- PageRank (algorithm), 207, 209, 210–222, 232, 250
- PageRank checker, 213
- “The PageRank Citation Ranking: Bringing Order to the Web” (Brin and Page), 213
- Pandas, 102, 105, 107, 221, 192, 326
- Pandas DataFrame, 192
- Papadimitrious, Christos H., 360–361
- paradigms, 291
- parallel computing, 251
- parallel paradigm, understanding of, 251–253
- parallelism, 252
- parallelizing operations
- managing immense amounts of data, 250–259
- working out algorithms for MapReduce, 259–265
- parity game, playing of, 415
- partial values, as technique to avoid collisions, 134
- path finding algorithms, 388–399
- path planning, 385
- pathfinding, 385, 389
- pathing, 385
- path-planning algorithm, 392
- pattern analysis, 408–409
- Perlin, Ken, 26
- Perlin noise, 26
- permutation, distinguishing of in data, 68–69
- physics, as field where big data has a basis in, 229
- PID (proportional integral derivative) algorithm, 25, 409
- Pisano, Leonardo (aka Fibonacci), 311
- pixel, described, 270
- Plato, 27
- Pledge algorithm, 389
- plotting (of graphs), 141
- polymorphism, as OOP technique, 81
- polynomial time, 173, 297, 298
- prime numbers, 27
- Prim’s minimum spanning tree (MST), 31, 173, 175–177, 179, 293
- priority queue, 174–175, 185
- probability, 334–335
- problem depth, 33
- problem instance, 33
- problem solution, defined, 23
- problem solving
- avoiding brute-force solutions in, 29
- breaking down of problem, 30
- computing costs and following heuristics in, 32–35
- distinguishing between possible solutions, 78
- divide-and-conquer approach to, 23
- dividing and conquering in, 28–30
- finding solutions and counterexamples, 26–27
- going random and being blessed by luck, 34
- greedy approach/method to, 24, 30–32
- keeping it simple, silly (KISS), 29–30
- modeling real-world problems, 25–26
- quickness of as algorithmic problem, 411
- reaching a good solution, 31–32
- representing problem as a space, 33
- start of, 24–28
- understanding the problem first, 24
- using a heuristic and a cost function, 34–35
- problem space, 33
- problem space graph, 33
- processed vertex, 162
- product recommendation, 221
- Project Gutenberg, 260
- proportional integral derivative (PID) algorithm, 25, 409
- Proxima Centauri, 229
- pseudocode, 21
- Pseudo-Random Number Generators (PRNGs), 405
- pseudorandom numbers, 14, 333
- PuLP, 375, 379
- pure heuristic search, 35
- PyCrypto, 284
- Pyrrhic victory, 30
- Python
- as excelling at plotting, 141
- implementing Python code, 361–364
- implementing script of, 213–216
- as not coming with built-in tree object, 110
- as not ideal language for parallel operations, 259–260
- performing essential data manipulations using, 59–78
- as providing access to number of organizational structures for data, 99
- as providing number of storage methodologies, 105
- use of classes in, 81–82
- use of Python libraries, 79, 80
- Python 3 Notebook, creating, 47
- Python for Data Science For Dummies (Mueller and Massaron), 59
Q
- queues, 107–108, 165, 174–175, 185, 237
- quick select algorithm, 337, 341–344, 345, 346
- quick sort, 124–126, 347–348
R
- Rabin, Michael, 332
- RAM simulations, 38
- Random Access Machine (RAM), 36
- random numbers, shaking things up with, 405
- random sampling, 353
- random walk, 353, 362
- randomization, 332–341, 353, 360–361
- randomized algorithms
- calculating median using quick select, 341–344
- considering why randomization is needed, 333–334
- defining how randomization works, 332–341
- doing simulations using Monte Carlo, 344–347
- ordering faster with quick sort, 347–348
- putting randomness into your logic, 341–348
- simulating use of Monte Carlo method, 339–341
- understanding distributions, 335–339
- understanding how probability works, 334–335
- RandomWalkSAT, 361, 362, 365
- rank sinks (dead ends), 219
- RankBrain, 222
- Raspberry Pi, 18
- raw data, 99
- really large numbers, multiplying of, 413–414
- real-world problems, modeling of, 25–26
- recursion, 71–75, 78, 311–314
- recursive algorithms, 13, 40, 310
- reduce, explaining of, 256–257
- redundancy, considering of, 162
- rehashing, as method to address collisions, 135
- replacement strategy, 300
- reservoir sampling, 236–238
- resources
- competing for, 301–303
- dividing one equally, 414
- RLE (run-length encoding), 275
- robots
- employing algorithms in, 17
- routing of using heuristics, 384–388
- Shakey the robot, 381, 396
- Roomba, 332
- Roy, Bernard, 190
- RSA cryptosystem, 297
- RSA’s MDS algorithm, 136
- run-length encoding (RLE), 275
- Russell, Stuart, 309
S
- sample, defined, 235
- sampling
- as algorithmic tool, 234
- defined, 225, 235, 249
- limitations of, 240
- problems with, 236
- random sampling, 353
- reservoir sampling, 236–238
- simple random sample, 236
- satellites, as field where big data has a basis in, 230
- scalar, 60, 61–62, 86–87
- scheduling, as use for algorithms, 14
- SciPy package, 158
- scraping, 243
- SDC (self-driving cars), 385
- search engine optimization (SEO), 209, 221
- search engines, 208–210, 220–221
- Search for Extraterrestrial Intelligence (SETI), 18, 228
- search routines, 404–405
- search trees, 127–132
- Search-Engine Journal (SEJ), 221
- searches/searching
- considering need to search effectively, 127
- performing local search, 349–366
- performing specialized ones using binary heap, 131–132
- as use for algorithms, 13
- Secure Hash Algorithms (SHA), 136
- Seki, Takakazu, 211–212
- selection sort, use of, 120
- self-driving cards (SDC), 385
- Selfridge-Conway solution, 414
- semantic queries, 222
- Sentinel 1A, 230
- SEO (search engine optimization), 209, 221
- sequences, remembering of with LZW, 278–282
- SETI (Search for Extraterrestrial Intelligence), 18, 228
- Settings dialog box (in Colab), 45
- SHA (Secure Hash Algorithms), 136
- Shakey the robot, 381, 396
- Shannon, Claude, 276, 303
- shortcut, defined, 185
- shortest route, finding of between two points, 180–194
- Silver, Nate, 236
- simple random sample, 236
- simplex algorithm, 368, 372–373
- simplification, learning to simplify when planning, 371–372
- simulated annealing, 354, 357–358
- simulation, in implementing PageRank, 212
- singular matrix, 67
- Site (settings tab in Colab), 44
- sketching, 225, 235, 240–248
- Smith, Adam, 28, 294
- Social Network Analysis (SNA), 196
- social networks, envisioning of as graphs, 196–202
- sociograms, 196
- Solovay, Robert M., 332
- solutions, as distinguished from issues, 19–21
- sonar arrays, 385
- sort routines, use of, 404
- sorting
- employing better sort techniques, 122–126
- relying on hashing, 132–137
- relying on topological sorting, 169–170
- switching to insert sort, 120–121
- understanding why sorting data is important, 118–121
- as use for algorithms, 13
- using a selection sort, 120
- Space/Time Trade-offs in Hash Coding with Allowable Errors (Bloom), 240
- spammers, 210
- spanning tree, 170–171.See also minimum spanning tree (MST)
- Spark, 253
- sparse matrix, 213
- sparse representation, 142, 158–159
- spatial issues, understanding of, 415
- special cells, creating of, 54–55
- special processors/specialized processors, 17, 55–56
- special-purpose chips, 17
- spider traps (cycles), 219
- spiders, defined, 209
- stacks, as LIFO data structures, 105–107
- starting point, realizing that it is important, 365–366
- stochastic averaging, 247
- storage (of data), 105, 233–234
- Strassen, Volker, 332
- strategies, adapting of to the problem, 20
- stream elements, filtering of by heart, 240–243
- streaming, defined, 249
- streaming data, 225, 233–235
- streams, learning to count objects in, 247–248
- “The String-to-string Correction Problem” (Wagner and Fischer), 326
- structure, as essential element in making algorithms work, 100
- subgraphs, 147–148
- suboptimal solution, 383
- substochastic, 218
- summarization (of data), 234
- survivor bias, 25–26
- swarm intelligence, 383–384
- switches, 254
- Symbolab calculator, 92
- symbolic table, 269
T
- Tabu Search, 354, 358–359
- tail call, 74
- tasks, performing them more quickly, 75–78
- teleporting, 219–220
- Teller, Edward, 339
- Tensor Processing Unit (TPU), 55–56
- text cells, creating of, 54
- threads, 253
- 3SUM problems, solving of more efficiently, 411–412
- TIROS 1, 230
- Toom-Cook, 414
- topological maps, 385–386, 388
- topological sorting, 169–170
- transforming, as use for algorithms, 13
- transition matrix, 216, 217
- transposition, of matrix, 66, 90, 91
- traveling salesman problem (TSP), 293, 310, 321–325, 352, 383
- trees
- balanced trees, 112
- building of, 110–112
- heaps, 112
- Huffman tree, 277
- traversing of, 111
- unbalanced trees, 112
- understanding basics of, 109–110
- using search trees, 127–132
- working with, 109–112
- trials, 334
- True Random Number Generator (TRNG), 405
- TrueRNG, 405
- TSP (traveling salesman problem), 293, 310, 321–325, 352, 383
- Turing, Alan, 412
- Turing machines, 298, 412
- Twitter, 146–147, 233
U
- unbounded knapsack problem, 318
- undirected graph, 143
- undiscovered vertex, 162
- Unicode encoding, 269
- Unicode Transformation Format 8 (UTF-8), 269
- uniform distribution, 337
- unique identifiers, creating of, 409–410
- unweighted graph, 171
- updates (to book), 4
V
- variety, as one of four key characteristics of big data, 231, 232
- vectors, performing calculations using, 60–67
- velocity, as one of four key characteristics of big data, 231, 232
- veracity, as one of four key characteristics of big data, 231, 232
- vertex-labeled graph, 145
- volume, as one of four key characteristics of big data, 231
W
- Wagner, Robert A., 326
- Wald, Abraham, 25
- Wall Follower algorithm, 389
- Wall Street (film), 294
- Warshall, Stephen, 190
- Weather Analytics, 229
- web, finding the world in a search engine, 208–210
- web spammers, 209, 210–211
- weighted graph, 144–145, 171, 172, 173, 182
- Welch, Terry, 278
- well-defined, as requirement for process to represent an algorithm, 11
- whitespace, use of in output, 154
- Wilson, Charles Erwin, 309
- windowing, 237
- wizard, 146
- WMA, as lossy compression algorithm, 270
Z
- Zachary, Wayne W., 198
- Zachary’s Karate Club sample graph, 197–198, 200
- ZIP algorithm, 272, 273
- Ziv, Jacob, 278
..................Content has been hidden....................
You can't read the all page of ebook, please click
here login for view all page.