Index

A

anonymous functions, 80

Anscombe, F., 135

apply() method, 132–133

arguments, 30

arrays

broadcasting, 98–99

expanding dimensions, 99–100

changing values in, 91

copies, changing values in, 95

creating, 86–88

one-dimensional, 87

two-dimensional, 88

using reshape method, 88–89

element-by-element operations, 91–92

filtering values, 92–94

indexing, 89–90

matrix operations, 96–97

methods, 95–96

one-dimensional, 87

sequences and, 91

setting type automatically, 97

setting type explicitly, 97–98

slicing, 89–90

two-dimensional, 88

indexing and slicing, 90

views, 94

changing values in, 94

assert statements, 16–17

assignment statements, 17

attributes, 22

axes, 136, 143–144

B

binomial distribution, 105–107

Bokeh, 149–150

Boolean operators, 14, 58–59, 125

DataFrames and, 126–127

bracket syntax, 121–122

break statement, 64

break statements, 19

broadcasting, 98–99

expanding dimensions, 99–100

built-in types, 14

C

cells, 4–5

character classes, 209

character sets, 208–209

classes, 22, 188–189

datetime.date, 207

inheritance, 196–199

variables, 190–191

classifier classes, 166

code blocks, 56, 63–64

collocations() method, 165

columns

creating, 128

updating, 129

comparison operators, 57–58, 93–94

compiling regular expressions, 211–212

compound statements, 55

if, 59–62

structure, 56

concordance() method, 165

constructors

dict(), 38

list(), 29

tuple(), 29

context managers, 205

continue statements, 19

continuous distributions, 108

exponential distribution, 110

normal distribution, 108–110

uniform distribution, 110–111

control statements, 56, 68

copies, changing values in, 95

corpus readers, 160

loading text, 160–161

tokenizers, 161

corpuses, downloading, 166–167

creating

arrays, 86–88

one-dimensional, 87

two-dimensional, 88

using reshape method, 88–89

columns, 128

DataFrames, 114

from a dictionary, 114–115

from a file, 116

from a list of lists, 115–116

datetime object, 206

dictionaries, 38

lists, 29–30

tuples, 29–30

D

DataFrames, 113

accessing data, 120–121

apply() method, 132–133

Boolean operators, 126–127

bracket syntax, 121–122

columns

creating, 128

updating, 129

creating, 114

from a dictionary, 114–115

from a file, 116

from a list of lists, 115–116

data manipulation, 129–131

describe method, 118

exclude argument, 120

include keyword, 119–120

percentiles argument, 118–119

head method, 117

interacting with, 117

interactive display, 133

manipulating, 127–128, 129

masking and filtering, 125–126

methods, 128

optimized access

by index, 124

by label, 123–124

replace method, 131–132

sorting, 204

tail method, 118

datetime object, 207

creating, 206

setting the time zone, 207

translating strings to, 207

decorators, 76–77, 79

syntax, 79–80

del() function, 40

delete statements, 18

describe method, 118

exclude argument, 120

include keyword, 119–120

percentiles argument, 118–119

dict() constructor, 38

dict_key view, 41–42

dictionaries, 37–39

checking for keys, 43

creating, 38

creating DataFrames from, 114–115

get method, 43–44

hash() method, 45

key/value pairs

adding, 39

updating, 39

removing items, 39–40

valid key types, 44–45

dictionary comprehensions, 181

dictionary views, 40–42

dict_key, 41–42

key_item, 42

difference() method, 51

discrete distributions, 105

binomial distribution, 105–107

Poisson distribution, 107–108

disjoint sets, 48

dispersion_plot() method, 165–166

docstrings, 68–69

dot notation, 22

downloading, corpuses, 166–167

E

elif statement, 62

else statement, 61

equality operators, 56–57, 125

estimators, 156

exceptions, 18–19

exponential distribution, 110

expressions, 16

generator, 182–183

extend method, 31

F

figures, 136

fileids() method, 160

files

creating DataFrames from, 116

Google Colab, 9–10

opening, 205

reading and writing, 204–205

filter() function, 179

replacing with a list comprehension, 180

filtering, DataFrames, 125–126

find iterator, 211

findall() method, 165

flattening nested lists, 167

for loops, 63–64

FreqDist class

built-in plot method, 164

methods, 164

frequency distributions, 161–162

filtering stopwords, 163–164

removing punctuation, 162–163

frozensets, 53

f-strings, 34

functional programming, 173, 174–175

changing mutable data, 176–177

dictionary comprehensions, 181

filter() function, 179

generator(s), 182

expressions, 182–183

functions, 183–184

lambda functions, 179

list comprehensions, 179

conditionals and, 181

multiple variables, 181

replacing map() and filter() with, 180

syntax, 179–180

map() function, 177–178

operator module, 179

reduce() function, 178, 179

scope, 173–174

inheriting, 174

outer, 175–176

state, 174

functions, 15, 67

anonymous, 80

control statement, 68

datetime.now(), 206

decorators, 76–77, 79

syntax, 79–80

del(), 40

docstring, 68–69

generator, 183–184

helper, 33–34

lambda, 179

len, 27

max, 28

min, 28

nested, 77

nested wrapping, 78–79

open(), 204–205

as a parameter, 78

parameters, 69–70

default value, 71–72

keyword assignments, 70–71

keyword wildcard, 74–75

mutable defaults, 72–73

positional wildcard, 74

positional-only, 73

re.compile(), 211

re.findall(), 210

re.finditer(), 211

re.match(), 207–208

re.search(), 208

return statements, 75

reversed, 41

scope, 75–76

sorted(), 202–204

wrapping, 77–78

future statements, 20

G

generator(s), 182

expressions, 182–183

functions, 183–184

get method, 43–44

global statements, 20

Google Colab, 5–6

code cells, 9

Code Snippets, 11

existing collections and, 11

files, 9–10

headings, 7–8

LaTeX, 8–9

notebooks, managing, 10

shell commands, 11–12

system aliases, 11–12

text cells, 6–8

groups, 209–210

named, 210

H

hash() method, 45

head method, 117

helper functions, 33–34

high-level programming languages, 15

I

if statements, 59–62

immutable objects, 44–45

import statements, 19–20

index method, 28

indexing, 26

arrays, 89–90

DataFrames and, 124

inequality operators, 56–57

inheritance, 196–199

inheriting scope, 174

installing, NumPy, 86

instances, 188

interacting with DataFrame data, 117

interrogation, 27–28

intersections, 51

ints, 14

numerator attribute, 22

issuperset() method, 50

items() method, 40

J-K

JSON files, opening and reading, 205

Jupyter notebooks, 4–5

Keras, 153

key_item view, 42

keys() method, 40

key/value pairs, 37

adding, 39

updating, 39

L

labels, DataFrames access and, 123–124

lambda functions, 80, 179

LaTeX, 8–9

len function, 27

libraries. See also NumPy; SciPy

machine learning, 153–154

SciPy, 103

third-party, 85

visualization, matplotlib, 135–136

list comprehensions, 179

conditionals and, 181

multiple variables, 181

replacing map() and filter() with, 180

syntax, 179–180

list() constructor, 29

lists, 29

adding and removing items, 30–31

creating, 29–30

creating DataFrames from, 115–116

flattening, 167

nested, 31

sorting, 32, 201–204

unpacking, 31–32

loops

break statement, 64

for, 63–64

while, 62–63

low-level programming languages, 15

M

machine learning, 153. See also Scikit-learn

overfitting, 155

splitting test and training data, 155–156

supervised versus unsupervised learning, 154

transformations, 154–155

magic functions, 12

manipulating DataFrames, 127–128, 129

map() function, 177–178

replacing with a list comprehension, 180

Markdown, 6

math operations, 21–22

math operator methods, 195–196

matplotlib, 135–136

colors, 139

creating multiple axes, 143–144

labeled data, 140–141

line styles, 138

marker types, 137–138

object-oriented style, 143

plotting multiple sets of data, 141–143

styling plots, 137, 139–140

matrix operations, 96–97

max function, 28

methods, 188–190

to_bytes(), 187–188

apply(), 132–133

arrays and, 95–96

collocations(), 165

concordance(), 165

count, 28

DataFrames, 128

describe, 118

exclude argument, 120

include keyword, 119–120

percentiles argument, 118–119

difference(), 51

disjoint(), 48

dispersion_plot(), 165–166

extend, 31

fileids(), 160

findall(), 165

get, 43–44

hash(), 45

head, 117

index, 28

inheritance, 196–199

intersection(), 51

issuperset(), 50

items(), 40

keys(), 40

math operator, 195–196

min(), 125

pop, 30

private, 190

public, 190

replace, 131–132

representation, 192

reverse, 32

rich comparison, 192–195

similar(), 165

sort, 32

sort(), 201–202

special, 191

subset(), 49

symmetric_difference(), 51

tail, 118

union(), 50

values(), 40

min function, 28

min() method, 125

MinMaxScaler transformer, 154–155

multiple statements, 16

mutable objects, 44, 176–177

N

named groups, 210

substitution and, 211

natural language processing, 159

Natural Language Processing with Python, 169

nested functions, 77

nested lists, 31

nested wrapping functions, 78–79

NLTK (Natural Language Toolkit), 159

classifier classes, 166

defining features, 168

downloading corpuses, 166–167

flattening nested lists, 167

labeling data, 167

training and testing, 168–169

corpus readers, 160

loading text, 160–161

tokenizers, 161

fileids() method, 160

FreqDist class

built-in plot method, 164

methods, 164

frequency distributions, 161–162

filtering stopwords, 163–164

removing punctuation, 162–163

sample texts, 159–160

Text class, 165

collocations() method, 165

concordance() method, 165

dispersion_plot() method, 165–166

findall() method, 165

similar() method, 165

NoneType, 15

nonlocal statements, 20

normal distribution, 108–110

notebooks, 4–5

Google Colab, 5–6

Jupyter, 4–5

managing, 10

numerics, 14

NumPy. See also arrays; SciPy

creating arrays, 86–87

installing and importing, 86

polynomials, 100–101

O

object-oriented programming, 187

classes, 188–189

variables, 190–191

inheritance, 196–199

instances, 188

methods, 188–190

math operator, 195–196

representation, 192

rich comparison, 192–195

special, 191

objects, 187–188

private methods, 190

objects, 22, 187–188

datetime, creating, 206

evaluation, 59

immutable, 44–45

mutable, 44, 176–177

range, 34–35

one-dimensional arrays, 87

open() function, 204–205

in operator, 26, 40

or operator, 59

operators, 21–22

Boolean, 58–59

Boolean operators, 126–127

comparison, 57–58, 93–94

equality/inequality, 56–57, 125

in, 40

math, 28–29

or, 59

walrus, 60

overfitting, 155

P

packages, zoneinfo, 207

Pandas DataFrames. See DataFrames

parameters

default value, 71–72

functions as, 78

keyword assignments, 70–71

keyword wildcard, 74–75

mutable defaults, 72–73

positional wildcard, 74

positional-only, 73

parser, 14

pass statements, 18

Plotly, 148–149

Poisson distribution, 107–108

polynomials, 100–101

pop method, 30

print statements, 20–21

private methods, 190

procedural programming, 174

programming languages, high-level versus low-level, 15

proper subsets, 49

public methods, 190

Punkt tokenizer, 161

Python, types, 14–15

PyTorch, 154

Q-R

quotation marks, strings and, 33

raise statements, 18–19

ranges, 34–35

raw strings, 33

reading files, 204–205

re.compile() function, 211

reduce() function, 178, 179

re.findall() function, 210

re.finditer() function, 211

regular expressions, 207–208

compiling, 211–212

groups, 209–210

named groups, 210

substitution, 211

using named groups, 211

re.match() function, 207–208

removing, items from dictionaries, 39–40

replace method, 131–132

representation methods, 192

re.search() function, 208

return statements, 18, 75

reverse method, 32

rich comparison methods, 192–195

running statements, 4

S

Scikit-learn, 154

estimators, 156

MinMaxScaler transformer, 154–155

splitting test and training data, 155–156

training a model, 156

training and testing, 156

tutorials, 157

SciPy, 103

continuous distributions, 108

exponential distribution, 110

normal distribution, 108–110

uniform distribution, 110–111

discrete distributions, 105

binomial distribution, 105–107

Poisson distribution, 107–108

scipy.misc submodule, 104–105

scipy.special submodule, 105

scipy.stats submodule, 105

scope, 20, 75–76, 173–174

inheriting, 174

Seaborn, 144–145

plot types, 148

themes, 145–147

sequences, 14, 25

arrays and, 91

frozensets and, 53

indexing, 26

interrogation, 27–28

intersections, 51

lists, 29

adding and removing items, 30–31

nested, 31

sorting, 32

unpacking, 31–32

math operations, 28–29

slicing, 27

testing membership, 26

tuples, 29

unpacking, 31–32

sets, 46–48

difference between, 51

disjoint, 48

proper subsets, 49

subsets and, 49

supersets and, 50

symmetric difference, 51

union, 50

updating, 51–52

shared operations, 25

similar() method, 165

slicing, 27

arrays, 89–90

DataFrames, 122

sort() method, 201–202

sort method, 32

sorted() function, 202–204

sorting, lists, 32, 201–204

special characters, 33

statements, 15–16

assert, 16–17

assignment, 17

break, 19, 64

code blocks, 56, 63–64

continue, 19, 64–65

delete, 18

elif, 62

else, 61

expression, 16

future, 20

global, 20

if, 59–62

import, 19–20

multiple, 16

nonlocal, 20

pass, 18

print, 20–21

raise, 18–19

return, 18, 75

running, 4

yield, 18

stopwords, 163–164

strings, 14, 32–33

f-, 34

helper functions, 33–34

quotation marks and, 33

raw, 33

special characters, 33

translating to datetime object, 207

submodules

scipy.misc, 104–105

scipy.special, 105

scipy.stats, 105

subset() method, 49

substitution, 211

supersets, 50

symmetric_difference() method, 51

syntax

bracket, 121–122

decorators, 79–80

list comprehensions, 179–180

T

tail method, 118

TensorFlow, 153

text cells, 6–8

Text class, 165

collocations() method, 165

concordance() method, 165

dispersion_plot() method, 165–166

findall() method, 165

similar() method, 165

third-party libraries, 85

time series data, 206

time zone, setting for datetime object, 207

to_bytes() method, 187–188

tokenizers, 161

transformations, 154–155

tuple() constructor, 29

tuples, 29

creating, 29–30

unpacking, 31–32

two-dimensional arrays, 88

indexing and slicing, 90

types, 14–15. See also sequences

U

uniform distribution, 110–111

union() method, 50

updating

columns, 129

sets, 51–52

V

values() method, 40

variables, 190–191

views, 94

changing values in, 94

visualization libraries, 151

Bokeh, 149–150

matplotlib, 135–136

colors, 139

creating multiple axes, 143–144

labeled data, 140–141

line styles, 138

marker types, 137–138

object-oriented style, 143

plotting multiple sets of data, 141–143

styling plots, 137, 139–140

Plotly, 148–149

Seaborn, 144–145

plot types, 148

themes, 145–147

W

walrus operator, 60

while loops, 62–63

wrapping functions, 77–78

writing file data, 204–205

X-Y-Z

yield statements, 18

zoneinfo package, 207
