A
act parameter 233
action class 98–99
action parameters
about 19, 23
nested 24–26
removing 92–93
retrieving 92–93
specifying 22–27
action sets (CAS) 5–8, 35–37
actions
addtable 65–66, 68–70, 75–76
amntrain 233
checking return status of 32–35
columninfo 54, 94
correlation 96
crosstab 172–176
datapreprocess 176–181
distinct 163–166
droptable 56
dtreescore 225
dtreetrain 203–205, 219, 223
echo 23–24
exceptions 38–39
executing on CAS tables 11–12
fetch 55, 85, 89, 94, 120
foresttrain 228
freq 166–169
gbtreescore 231
genmod 201–202
getsessopt 57–58
glm 188–190
histogram 153
impute 181–184
loadactionset 36
logistic 212–214
resolving parameter problems 39–41
save 120
setsessopt 46–48, 58, 82–83
summary 55–56, 95–96, 149–152
tableinfo 54, 94
topk 169–171
actionset parameter 19, 23
activation functions 231
active caslib 57–58, 82–83
activeonadd parameter 58
add (+) operator 131
addtable action 65–66, 68–70, 75–76
alpha option 215
amntrain action 233
Anaconda Python distribution 1
& (and) operator 130
arch parameter 233
attribute interface, managing parameters using 105–106
attributes
dtypes 110
iloc 125, 127–128
loc 125, 127–128
message 39
reason 32
response 39
results 39
severity 32
shape 111
size 111
status 32
status_code 32
attributes, setting parameters as 90–92
attribute-style syntax 31
Authinfo file 16
automatic type casting 26
B
Binary data type 57
binary interfaces, REST interfaces vs. 236–238
Bokeh 12
bootstrap parameter 228
bracket notation 90–91
bracketed syntax 31
Bucket technique 179
BY statement 217
by-group processing
about 136–138
concatenating by-groups 139
handling multiple sets of by-groups 143–145
selecting result keys by table names 139–142
selecting specific by-groups 142
C
CAS
action results 28–35
action sets 5–8, 35–37
connecting to 15–17
data management in 49–83
data types 56–57
executing actions on tables 11–12
help for 37
installing 3
loading data into 9–10
running actions 8–9, 17–22
session options 46–48
CAS actions. See actions
CAS interface. See binary interfaces
CAS tables
about 50, 109
data wrangling 134–145
displaying data in 55
dropping 56
fetching data with a sort order 120–122
getting started with 51–58
indexing techniques 125–134
introspection 109–112
iterating 122–124
loading data into 53–54
selecting data 125–134
streaming data from databases into 75–77
uploading data files to 58–61
uploading data from Pandas DataFrames to 62–63
uploading data from URLs to 61–62
using objects like DataFrames 109–120
visibility of 57
CASAction objects 85–93
CASDataMsgHandler class 63
caslibinfo action 57
caslibs
about 50
active 57–58, 82–83
creating 82
dropping 83
getting started with 51–58
managing 82–83
visibility of 57
casout parameter 27
CASResults class 29
CASTable data, creating plots from 116–119
CASTable objects
getting started 93–107
manually creating 98
CASTable parameters
materializing 106–107
setting 99–102
CASTables
action interface 98–99
exporting to other formats 119–120
categorical variables
modeling 207–233
summarizing 162–176
cbar option 215
cflev parameter 222
Char data type 56
chart types 118–119
classes
action 98–99
CASDataMsgHandler 63
CASResults 29
Clipboard 63
collections.OrderedDict 29
CSV 63
DBAPI 63
Excel 63
HTML 63
JSON 63
OrderedDict 94
PandasDataFrame 63
SQLQuery 63, 72
SQLTable 63, 72
Text 63
"client-side view" 100
Clipboard class 63
cloglog link function 208
closing connections 14
collections.OrderedDict class 29
colon 24
columninfo action 54, 94
columns
computed 134–136
iterating through 122–124
selecting by label and position 125–126
columns attribute 110
comb parameter 233
communication, secure 248
computed columns, creating 134–136
computedvarsprogram parameter 100
concat function 32
concat_bygroups method 139, 152
concatenating by-groups 139
conn variable 16
connection attribute 39
connections
to CAS 15–17
closing 14
to existing sessions 247–248
conn.fetch(...) 87
continuous variables
modeling 185–205
summarizing 148–162
Continuum Analytics (website) 1
cooksd option 190
correlation action 96
correlations 161–162
count method 116
covratio option 190
crosstab action 172–176
CSS statistic 149
CSV class 63
Cutpts technique 179
CV statistic 116, 149
D
data
displaying in CAS tables 55
dynamic selection of 129–134
fetching with sort orders 120–122
loading 9–10, 53–54
loading into CAS tables 53–54
managing in CAS 49–83
selecting 125–134
selecting by label and position 127–129
streaming from databases into CAS tables 75–77
uploading from Pandas DataFrames to CAS tables 62–63
uploading from URLs to CAS tables 61–62
data files, uploading to CAS tables 58–61
data message handlers
Excel 69–71
HTML 64–69
PandasDataFrame 71–72
using 63–77
using with databases 72–77
writing your own 77–82
data transformers, adding 79–82
data types 56–57
data visualization 12–14
data wrangling
about 134
by-group processing 136–145
creating computed columns 134–136
databases
streaming data into CAS tables from 75–77
using data message handlers with 72–77
DataFrame method 137
DataFrames 30–32, 109–120
datapreprocess action 176–181
Date data type 57
Datetime data type 57
DATETIME format 68
date/time properties 133
DBAPI class 63
decision trees 217–226
DecQuad data type 56
DecSext data type 56
del_param(s) method 104, 152
describe method 97, 112–113
describe_option function 41–45
descriptive statistics 148–153
dffits option 190
dict constructor 24, 27
dictionary conversion 27
difchisq option 215
difdev option 215
dimension reduction 176–184
distinct action 163–166
distribution parameter 231
divide (/) operator 131
dot syntax 31
dot-separated notation 89
Double data type 56
droptable action 56
dtreescore action 225
dtreetrain action 203–205, 219, 223
dtypes attribute 110
dynamic data selection 129–134
E
echo action 23–24
equal-distance binning 153
equality (==) operator 131
error handling 37–38
errorFunc parameter 233
Excel class 63
Excel data message handler 69–71
exceptions 38–39
executing actions on CAS tables 11–12
export methods 119–120
exporting CASTables to other formats 119–120
F
fetch action 55, 85, 89, 94, 120
fetching data with sort orders 120–122
forests 226–233
foresttrain action 228
freq action 166–169
FREQ procedure 173–176
functions
activation 231
cloglog link 208
concat 32
describe_option 41–45
getnext 245
get_option 41–45
gridplot 158
link 198
log link 198
logit link 208
negative cloglog link 208
option_content 41–45
probit link 208, 212–213
reset_option 41–45
set_option 41–45
str2cas_datetime utility 80
str2cas_time utility 80
utility 25
vl utility 25
G
gbtreescore action 231
generalized linear models 197–202
genmod action 201–202
get_dtype_counts method 111
get_group method 142
getnext function 245
get_option function 41–45
get_param(s) method 88, 92–93
getrow method 77–78, 81–82
getsessopt action 57–58
GitHub 2
glm action 188–190
gradient boosting 226–233
greater than (>) operator 131
greater than or equal to (>=) operator 131
gridplot function 158
groupby method 142
groupby parameter 105, 138
H
h option 190, 214
has_param(s) method 93
head method 111–112, 121–122
help
for CAS 37
displaying 19
histogram action 153
histograms 153–159
HTML class 63
HTML data message handler 64–69
I
iloc attribute 125, 127–128
importing SWAT 5–8
impute action 181–184
includeBias parameter 233
indexing techniques 125–134
inequality (!=) operator 131
info method 97, 111
__init__ constructor 77, 81–82
installing
CAS 3
Python 1
SAS SWAT 2–3
Int32 data type 56
Int64 data type 56
integers 26
invoke method 243–247
IPython 15–17, 20, 85–87
iterating, through columns and rows 122–124
iterrows method 124
ix accessor 128–129
J
JSON class 63
Jupyter notebook 2
K
**kwargs argument 22
L
label
selecting columns by 125–126
selecting data by 127–129
lasso parameter 231
lcl option 190, 214
lclm option 190, 214
leafsize parameter 222
less than (<) operator 131
less than or equal to (<=) operator 131
likedist option 190
linear regressions 186–197, 197–205
link function 198
loadactionset action 36
loading data 9–10
loadtable action 59, 67
loc attribute 125, 127–128
log link function 198
logistic action 212–214
logistic regression 207–217
logit link function 208
Lua programming language 24
M
m parameter 228, 231
Matplotlib 12–13, 117
Max statistic 116, 149, 183
maxbranch parameter 222
maxlevel parameter 222
MEAN technique 116, 183
MEDIAN technique 183
message attribute 39
method interface, managing parameters using 102–104
methodcontinuous option 183
methodnominal option 183
methods
concat_bygroups 139, 152
count 116
DataFrame 137
del_param(s) 104, 152
describe 97, 112–113
export 119–120
get_dtype_counts 111
get_group 142
get_param(s) 88, 92–93
getrow 77–78, 81–82
groupby 142
has_param(s) 93
head 111–112, 121–122
info 97, 111
invoke 243–247
iterrows 124
plot 12–13, 117
probt 116
set_params 88
sort_values 121
stack 150
string 132–133
tail 111–112, 121–122
time-related 134
upload 53, 58–59, 61–62
MIDRANGE technique 183
Min statistic 149, 183
MODE technique 183
modeling
categorical variables 207–233
continuous variables 185–205
modulo (%) operator 131
multiply (*) operator 131
N
N statistic 149
negative cloglog link function 208
nested parameters 24–26, 89–90
Netrc file 16
neural networks 226–233
NMiss statistic 116, 149
ntree parameter 228, 231
nTries parameter 233
O
oob parameter 229
option_content function 41–45
options
alpha 215
cbar 215
cooksd 190
covratio 190
dffits 190
difchisq 215
difdev 215
h 190, 214
lcl 190, 214
lclm 190, 214
likedist 190
methodcontinuous 183
methodnominal 183
pred 190, 214
press 190
reschi 215
resdev 215
resid 190
reslik 215
resraw 214
reswork 215
rstudent 190
stats 115–116
stdi 190
stdp 190
stdr 190
stdreschi 215
stdxbeta 214
student 190
ucl 190, 214
uclm 190, 214
xbeta 214
| (or) operator 130
OrderedDict class 94
ordinary linear regressions, extensions of 197–205
out variable 28–29
P
Pandas DataFrame 9, 12, 62–63
PandasDataFrame class 63
PandasDataFrame data message handler 71–72
parameters
act 233
action 19, 22–27, 92–93
actionset 19, 23
activeonadd 58
arch 233
bootstrap 228
casout 27
CASTable 99–102, 106–107
cflev 222
comb 233
computedvarsprogram 100
distribution 231
errorFunc 233
groupby 105, 138
includeBias 233
lasso 231
leafsize 222
m 228, 231
managing using attribute interface 105–106
managing using method interface 102–104
maxbranch 222
maxlevel 222
nested 24–26, 89–90
ntree 228, 231
nTries 233
oob 229
pop 91
prune=True|False 222
responsefunc= 239–243
resultfunc= 239–243
ridge 231
scalar 27
seed 228, 231
setting as attributes 90–92
sortby 91, 120
subsamplerate 231
targetAct 233
targetComb 233
varimp=True|False 222
verbose 19, 23
vote='majority' 228
vote='prob' 228
percentiles 159–161
periods (.) 90
pip tool 2
platform 2
plot method 12–13, 117
plots, creating from CASTable data 116–119
pop parameter 91
position
selecting columns by 125–126
selecting data by 127–129
power (**) operator 25–26, 66, 131
pred option 190, 214
press option 190
probit link function 208, 212–213
probt method 116
ProbT statistic 149
prune=True|False parameter 222
Python 1
Q
Quantile technique 179
R
RANDOM technique 183
reason attribute 32
regression trees 202–205
regressions
linear 186–197
linear, extensions of ordinary 197–205
logistic 207–217
reschi option 215
resdev option 215
reset_option function 41–45
resid option 190
reslik option 215
resolving CAS action parameter problems 39–41
response attribute 39
responsefunc= parameter 239–243
resraw option 214
REST interface 63, 236–238
result keys, selecting by table name 139–142
result processing workflows 238–247
resultfunc= parameter 239–243
results attribute 39
reswork option 215
return status, checking of CAS actions 32–35
ridge parameter 231
rows, iterating through 122–124
rstudent option 190
running CAS actions 8–9, 17–22
S
SAS SWAT, installing 2–3
save action 120
scalar parameter 27
secure communication 248
seed parameter 228, 231
SELECT statement 74
set_option function 41–45
set_params method 88
setsessopt action 46–48, 58, 82–83
severity attribute 32
shape attribute 111
size attribute 111
sort orders, fetching data with 120–122
sortby parameter 91, 120
sort_values method 121
Spyder IDE 2
SQLite database 75
SQLQuery class 63, 72
SQLTable class 63, 72
stack method 150
statistics, descriptive 148–153
stats option 115–116
status attribute 32
status_code attribute 32
Std statistic 116, 149
StdErr statistic 116, 149
stdi option 190
stdp option 190
stdr option 190
stdreschi option 215
stdxbeta option 214
str2cas_datetime utility function 80
str2cas_time utility function 80
streaming data from databases into CAS tables 75–77
string methods 132–133
strings 26
student option 190
subsamplerate parameter 231
subtract (-) operator 131
Sum statistic 116, 149
summarizing
categorical variables 162–176
continuous variables 148–162
summary action 55–56, 95–96, 149–152
SWAT
importing 5–8
options 41–45
swat.options object 44–45
syntax 24
T
table name, selecting result keys by 139–142
tableinfo action 54, 94
tables. See CAS tables
tail method 111–112, 121–122
targetAct parameter 233
targetComb parameter 233
test connection 3
Text class 63
Time data type 57
time-related methods 134
topk action 169–171
transforming variables 176–184
TValue statistic 116, 149
U
ucl option 190, 214
uclm option 190, 214
upload method 53, 58–59, 61–62
URLs, uploading data to CAS tables from 61–62
USS statistic 116, 149
utility function 25
V
VALUE technique 183
Var statistic 116, 149
Varbinary data type 57
Varchar data type 57
variable binning 177–181
variable imputation 181–184
variables
about 79
categorical 162–176, 207–233
continuous 148–162, 185–205
transforming 176–184
varimp=True|False parameter 222
verbose parameter 19, 23
versions, Python 1
visibility, of caslib and CAS tables 57
vl utility function 25
vote='majority' parameter 228
vote='prob' parameter 228
W
WHERE clause 74
X
xbeta option 214