Index

A

anonymous functions, 80

Anscombe, F., 135

apply() method, 132–133

arguments, 30

arrays

broadcasting, 98–99

expanding dimensions, 99–100

changing values in, 91

copies, changing values in, 95

creating, 86–88

one-dimensional, 87

two-dimensional, 88

using reshape method, 88–89

element-by-element operations, 91–92

filtering values, 92–94

indexing, 89–90

matrix operations, 96–97

methods, 95–96

one-dimensional, 87

sequences and, 91

setting type automatically, 97

setting type explicitly, 97–98

slicing, 89–90

two-dimensional, 88

indexing and slicing, 90

views, 94

changing values in, 94

assert statements, 16–17

assignment statements, 17

attributes, 22

axes, 136, 143–144

B

binomial distribution, 105–107

Bokeh, 149–150

Boolean operators, 14, 58–59, 125

DataFrames and, 126–127

bracket syntax, 121–122

break statement, 64

break statements, 19

broadcasting, 98–99

expanding dimensions, 99–100

built-in types, 14

C

cells, 4–5

character classes, 209

character sets, 208–209

classes, 22, 188–189

datetime.date, 207

inheritance, 196–199

variables, 190–191

classifier classes, 166

code blocks, 56, 63–64

collocations() method, 165

columns

creating, 128

updating, 129

comparison operators, 57–58, 93–94

compiling regular expressions, 211–212

compound statements, 55

if, 59–62

structure, 56

concordance() method, 165

constructors

dict(), 38

list(), 29

tuple(), 29

context managers, 205

continue statements, 19

continuous distributions, 108

exponential distribution, 110

normal distribution, 108–110

uniform distribution, 110–111

control statements, 56, 68

copies, changing values in, 95

corpus readers, 160

loading text, 160–161

tokenizers, 161

corpuses, downloading, 166–167

creating

arrays, 86–88

one-dimensional, 87

two-dimensional, 88

using reshape method, 88–89

columns, 128

DataFrames, 114

from a dictionary, 114–115

from a file, 116

from a list of lists, 115–116

datetime object, 206

dictionaries, 38

lists, 29–30

tuples, 29–30

D

DataFrames, 113

accessing data, 120–121

apply() method, 132–133

Boolean operators, 126–127

bracket syntax, 121–122

columns

creating, 128

updating, 129

creating, 114

from a dictionary, 114–115

from a file, 116

from a list of lists, 115–116

data manipulation, 129–131

describe method, 118

exclude argument, 120

include keyword, 119–120

percentiles argument, 118–119

head method, 117

interacting with, 117

interactive display, 133

manipulating, 127–128, 129

masking and filtering, 125–126

methods, 128

optimized access

by index, 124

by label, 123–124

replace method, 131–132

sorting, 204

tail method, 118

datetime object, 207

creating, 206

setting the time zone, 207

translating strings to, 207

decorators, 76–77, 79

syntax, 79–80

del() function, 40

delete statements, 18

describe method, 118

exclude argument, 120

include keyword, 119–120

percentiles argument, 118–119

dict() constructor, 38

dict_key view, 41–42

dictionaries, 37–39

checking for keys, 43

creating, 38

creating DataFrames from, 114–115

get method, 43–44

hash() method, 45

key/value pairs

adding, 39

updating, 39

removing items, 39–40

valid key types, 44–45

dictionary comprehensions, 181

dictionary views, 40–42

dict_key, 41–42

key_item, 42

difference() method, 51

discrete distributions, 105

binomial distribution, 105–107

Poisson distribution, 107–108

disjoint sets, 48

dispersion_plot() method, 165–166

docstrings, 68–69

dot notation, 22

downloading, corpuses, 166–167

E

elif statement, 62

else statement, 61

equality operators, 56–57, 125

estimators, 156

exceptions, 18–19

exponential distribution, 110

expressions, 16

generator, 182–183

extend method, 31

F

figures, 136

fileids() method, 160

files

creating DataFrames from, 116

Google Colab, 9–10

opening, 205

reading and writing, 204–205

filter() function, 179

replacing with a list comprehension, 180

filtering, DataFrames, 125–126

find iterator, 211

findall() method, 165

flattening nested lists, 167

for loops, 63–64

FreqDist class

built-in plot method, 164

methods, 164

frequency distributions, 161–162

filtering stopwords, 163–164

removing punctuation, 162–163

frozensets, 53

f-strings, 34

functional programming, 173, 174–175

changing mutable data, 176–177

dictionary comprehensions, 181

filter() function, 179

generator(s), 182

expressions, 182–183

functions, 183–184

lambda functions, 179

list comprehensions, 179

conditionals and, 181

multiple variables, 181

replacing map() and filter() with, 180

syntax, 179–180

map() function, 177–178

operator module, 179

reduce() function, 178, 179

scope, 173–174

inheriting, 174

outer, 175–176

state, 174

functions, 15, 67

anonymous, 80

control statement, 68

datetime.now(), 206

decorators, 76–77, 79

syntax, 79–80

del(), 40

docstring, 68–69

generator, 183–184

helper, 33–34

lambda, 179

len, 27

max, 28

min, 28

nested, 77

nested wrapping, 78–79

open(), 204–205

as a parameter, 78

parameters, 69–70

default value, 71–72

keyword assignments, 70–71

keyword wildcard, 74–75

mutable defaults, 72–73

positional wildcard, 74

positional-only, 73

re.compile(), 211

re.findall(), 210

re.finditer(), 211

re.match(), 207–208

re.search(), 208

return statements, 75

reversed, 41

scope, 75–76

sorted(), 202–204

wrapping, 77–78

future statements, 20

G

generator(s), 182

expressions, 182–183

functions, 183–184

get method, 43–44

global statements, 20

Google Colab, 5–6

code cells, 9

Code Snippets, 11

existing collections and, 11

files, 9–10

headings, 7–8

LaTeX, 8–9

notebooks, managing, 10

shell commands, 11–12

system aliases, 11–12

text cells, 6–8

groups, 209–210

named, 210

H

hash() method, 45

head method, 117

helper functions, 33–34

high-level programming languages, 15

I

if statements, 59–62

immutable objects, 44–45

import statements, 19–20

index method, 28

indexing, 26

arrays, 89–90

DataFrames and, 124

inequality operators, 56–57

inheritance, 196–199

inheriting scope, 174

installing, NumPy, 86

instances, 188

interacting with DataFrame data, 117

interrogation, 27–28

intersections, 51

ints, 14

numerator attribute, 22

issuperset() method, 50

items() method, 40

J-K

JSON files, opening and reading, 205

Jupyter notebooks, 4–5

Keras, 153

key_item view, 42

keys() method, 40

key/value pairs, 37

adding, 39

updating, 39

L

labels, DataFrames access and, 123–124

lambda functions, 80, 179

LaTeX, 8–9

len function, 27

libraries. See also NumPy; SciPy

machine learning, 153–154

SciPy, 103

third-party, 85

visualization, matplotlib, 135–136

list comprehensions, 179

conditionals and, 181

multiple variables, 181

replacing map() and filter() with, 180

syntax, 179–180

list() constructor, 29

lists, 29

adding and removing items, 30–31

creating, 29–30

creating DataFrames from, 115–116

flattening, 167

nested, 31

sorting, 32, 201–204

unpacking, 31–32

loops

break statement, 64

for, 63–64

while, 62–63

low-level programming languages, 15

M

machine learning, 153. See also Scikit-learn

overfitting, 155

splitting test and training data, 155–156

supervised versus unsupervised learning, 154

transformations, 154–155

magic functions, 12

manipulating DataFrames, 127–128, 129

map() function, 177–178

replacing with a list comprehension, 180

Markdown, 6

math operations, 21–22

math operator methods, 195–196

matplotlib, 135–136

colors, 139

creating multiple axes, 143–144

labeled data, 140–141

line styles, 138

marker types, 137–138

object-oriented style, 143

plotting multiple sets of data, 141–143

styling plots, 137, 139–140

matrix operations, 96–97

max function, 28

methods, 188–190

to_bytes(), 187–188

apply(), 132–133

arrays and, 95–96

collocations(), 165

concordance(), 165

count, 28

DataFrames, 128

describe, 118

exclude argument, 120

include keyword, 119–120

percentiles argument, 118–119

difference(), 51

disjoint(), 48

dispersion_plot(), 165–166

extend, 31

fileids(), 160

findall(), 165

get, 43–44

hash(), 45

head, 117

index, 28

inheritance, 196–199

intersection(), 51

issuperset(), 50

items(), 40

keys(), 40

math operator, 195–196

min(), 125

pop, 30

private, 190

public, 190

replace, 131–132

representation, 192

reverse, 32

rich comparison, 192–195

similar(), 165

sort, 32

sort(), 201–202

special, 191

subset(), 49

symmetric difference(), 51

tail, 118

union(), 50

values(), 40

min function, 28

min() method, 125

MinMaxScaler transformer, 154–155

multiple statements, 16

mutable objects, 44, 176–177

N

named groups, 210

substitution and, 211

natural language processing, 159

Natural Language Processing with Python, 169

nested functions, 77

nested lists, 31

nested wrapping functions, 78–79

NLTK (Natural Language Toolkit), 159

classifier classes, 166

defining features, 168

downloading corpuses, 166–167

flattening nested lists, 167

labeling data, 167

training and testing, 168–169

corpus readers, 160

loading text, 160–161

tokenizers, 161

fileids() method, 160

FreqDist class

built-in plot method, 164

methods, 164

frequency distributions, 161–162

filtering stopwords, 163–164

removing punctuation, 162–163

sample texts, 159–160

Text class, 165

collocations() method, 165

concordance() method, 165

dispersion_plot() method, 165–166

findall() method, 165

similar() method, 165

NoneType, 15

nonlocal statements, 20

normal distribution, 108–110

notebooks, 4–5

Google Colab, 5–6

Jupyter, 4–5

managing, 10

numerics, 14

NumPy. See also arrays; SciPy

creating arrays, 86–87

installing and importing, 86

polynomials, 100–101

O

object-oriented programming, 187

classes, 188–189

variables, 190–191

inheritance, 196–199

instances, 188

methods, 188–190

math operator, 195–196

representation, 192

rich comparison, 192–195

special, 191

objects, 187–188

private methods, 190

objects, 22, 187–188

datetime, creating, 206

evaluation, 59

immutable, 44–45

mutable, 44, 176–177

range, 34–35

one-dimensional arrays, 87

open() function, 204–205

in operator, 26, 40

or operator, 59

operators, 21–22

Boolean, 58–59

Boolean operators, 126–127

comparison, 57–58, 93–94

equality/inequality, 56–57, 125

in, 40

math, 28–29

or, 59

walrus, 60

overfitting, 155

P

packages, zoneinfo, 207

Pandas DataFrames. See DataFrames

parameters

default value, 71–72

functions as, 78

keyword assignments, 70–71

keyword wildcard, 74–75

mutable defaults, 72–73

positional wildcard, 74

positional-only, 73

parser, 14

pass statements, 18

Plotly, 148–149

Poisson distribution, 107–108

polynomials, 100–101

pop method, 30

print statements, 20–21

private methods, 190

procedural programming, 174

programming languages, high-level versus low-level, 15

proper subsets, 49

public methods, 190

Punkt tokenizer, 161

Python, types, 14–15

PyTorch, 154

Q-R

quotation marks, strings and, 33

raise statements, 18–19

ranges, 34–35

raw strings, 33

reading files, 204–205

re.compile() function, 211

reduce() function, 178, 179

re.findall() function, 210

re.finditer() function, 211

regular expressions, 207–208

compiling, 211–212

groups, 209–210

named groups, 210

substitution, 211

using named groups, 211

re.match() function, 207–208

removing, items from dictionaries, 39–40

replace method, 131–132

representation methods, 192

re.search() function, 208

return statements, 18, 75

reverse method, 32

rich comparison methods, 192–195

running statements, 4

S

Scikit-learn, 154

estimators, 156

MinMaxScaler transformer, 154–155

splitting test and training data, 155–156

training a model, 156

training and testing, 156

tutorials, 157

SciPy, 103

continuous distributions, 108

exponential distribution, 110

normal distribution, 108–110

uniform distribution, 110–111

discrete distributions, 105

binomial distribution, 105–107

Poisson distribution, 107–108

scipy.misc submodule, 104–105

scipy.special submodule, 105

scipy.stats submodule, 105

scope, 20, 75–76, 173–174

inheriting, 174

Seaborn, 144–145

plot types, 148

themes, 145–147

sequences, 14, 25

arrays and, 91

frozensets and, 53

indexing, 26

interrogation, 27–28

intersections, 51

lists, 29

adding and removing items, 30–31

nested, 31

sorting, 32

unpacking, 31–32

math operations, 28–29

slicing, 27

testing membership, 26

tuples, 29

unpacking, 31–32

sets, 46–48

difference between, 51

disjoint, 48

proper subsets, 49

subsets and, 49

supersets and, 50

symmetric difference, 51

union, 50

updating, 51–52

shared operations, 25

similar() method, 165

slicing, 27

arrays, 89–90

DataFrames, 122

sort() method, 201–202

sort method, 32

sorted() function, 202–204

sorting, lists, 32, 201–204

special characters, 33

statements, 15–16

assert, 16–17

assignment, 17

break, 19, 64

code blocks, 56, 63–64

continue, 19, 64–65

delete, 18

elif, 62

else, 61

expression, 16

future, 20

global, 20

if, 59–62

import, 19–20

multiple, 16

nonlocal, 20

pass, 18

print, 20–21

raise, 18–19

return, 18, 75

running, 4

yield, 18

stopwords, 163–164

strings, 14, 32–33

f-, 34

helper functions, 33–34

quotation marks and, 33

raw, 33

special characters, 33

translating to datetime object, 207

submodules

scipy.misc, 104–105

scipy.special, 105

scipy.stats, 105

subset() method, 49

substitution, 211

supersets, 50

symmetric difference() method, 51

syntax

bracket, 121–122

decorators, 79–80

list comprehensions, 179–180

T

tail method, 118

TensorFlow, 153

text cells, 6–8

Text class, 165

collocations() method, 165

concordance() method, 165

dispersion_plot() method, 165–166

findall() method, 165

similar() method, 165

third-party libraries, 85

time series data, 206

time zone, setting for datetime object, 207

to_bytes() method, 187–188

tokenizers, 161

transformations, 154–155

tuple() constructor, 29

tuples, 29

creating, 29–30

unpacking, 31–32

two-dimensional arrays, 88

indexing and slicing, 90

types, 14–15. See also sequences

U

uniform distribution, 110–111

union() method, 50

updating

columns, 129

sets, 51–52

V

values() method, 40

variables, 190191

views, 94

changing values in, 94

visualization libraries, 151

Bokeh, 149–150

matplotlib, 135–136

colors, 139

creating multiple axes, 143–144

labeled data, 140–141

line styles, 138

marker types, 137–138

object-oriented style, 143

plotting multiple sets of data, 141–143

styling plots, 137, 139–140

Plotly, 148–149

Seaborn, 144–145

plot types, 148

themes, 145–147

W

walrus operator, 60

while loops, 62–63

wrapping functions, 77–78

writing file data, 204–205

X-Y-Z

yield statements, 18

zoneinfo package, 207
