3D in data visualizations, 220-221
56th Grammy Awards, hypothesis validation example (data analysis), 104, 112-113
2008 presidential debates, 122-124
2011 Academy Awards, 14
2012 presidential debates, 14
2012 presidential election, 86
2014 Grammy Awards and Twitter, xxix
2014 IBM Insight conference, 172
Academy Awards (2011), 14
Activity Scorecard KPI in PSD, 177-180
Adams, Ansel, 103
Adams, Douglas, 31
ad hoc analysis, 87
external social media (domain of analysis), 90
internal social media (domain of analysis), 95
Adventure of the Six Napoleons, The, xx
AdWeek, JetBlue and customer positive/negative experiences, 38
affinity analysis (SMA), 165-167
affinity matrixes, 244
Africa, growth of social media, xxviii
age of author and data analysis, 34, 41-42
All Things Analytics website, 170-172
Al Qaeda, 5
Altimeter Group, 169
Always On Engagement Center (IBM), 125
analysis, depth of (taxonomy of data analysis), 84-85
analysis, domain of (taxonomy of data analysis), 84, 169
external social media, 88
ad hoc analysis, 90
ad hoc analysis, 95
analysis, duration of (taxonomy of data analysis), 90-91, 96
Analytics (Enterprise Graphs), 174
Analytics Services (Enterprise Graphs), 174
analyzing consumer reactions, 204-209
analyzing data, xxx
ad hoc analysis, 87
external social media (domain of analysis), 90
internal social media (domain of analysis), 95
audience comments, filtering
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
conclusions, 247
data analysis (first pass), 235-241
data analysis (second pass), 243-244
data identification, 228, 231-235
interpreting information, 244-247
chaff, separating wheat from, 18
data collection
calculating web page visits, 20-21
data interpretation, xxv
data modeling, xxv
data visualization, xxv
Evolving Topics algorithm, 163-164
external social media (domain of analysis), 90-93
internal social media (domain of analysis), 95-97
relationship matrixes, 92
suggested action phase, 161-163
support via analytics software, 163-167
descriptive analytics, 54
defining, 53
predictive analytics versus, 48-49
Simple Social Metrics, 53
hypotheses, validating, 103
Cannes Lions 2013 example, 104, 110-112
Grammy Awards example, 104, 112-113
youth unemployment example, 104-110
IBMAmplify case study, 227-228
conclusions, 247
data analysis (first pass), 235-241
data analysis (second pass), 243-244
data identification, 228, 231-235
interpreting information, 244-247
iterative methods and, 117-119
marketing and, xxvi
near real-time analysis, 86
near real-time analytics, 121-123
predictive analytics
defining, 49
descriptive analytics versus, 48-49
real-time analytics, 121
2008 presidential debates, 122-124
as early warning system, 139
IBM Always On Engagement Center, 125
near real-time analytics versus, 123
real-time views, xxv
relationship matrixes, xxv
stream computing, 126
components of streams, 128-130
filters, 127
IBM InfoSphere Streams, 128
real-time data analytics, 128
REST and, 132
Streams Studio IDE, 129
target audience, determining
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
taxonomy of analysis, 83
domain of analysis, 84, 88-90, 94-96, 169
duration of analysis, 90-91, 96
machine capacity, 84-86, 90-91, 94-98
themes, discovering, 103, 113-117
topics, discovering, 103, 113-117
trends, discovering, 103, 113-117
Twitter, xxv
value pyramid, 18
analyzing sentiment, 202
defining, xxx
microblogs, 203
analyzing social media content, process of
consumer reaction study, 204-209
data
finding the right data, 193-194
data model, developing, 192
questions, posing, 190
tools
configuring, 192
customizing/modifying, 201-203
selecting, 204
animal testing, 11
API (Application Programming Interfaces) and Enterprise Graphs, 186
Apple iPad, Twitter data collection/filtering example, 22-29
architects and data model development, 192
Armstrong, Lance, 35, 144, 151-155, 193, 222
Asher, Jay, 48
Asia-Pacific, growth of social media, xxviii
attributes of data, 7
language, 9
ownership of data, 14
region, 9
time, 14
type of content, 10
blogs, 12
discussion forums, 12
instructions, 11
microblogs, 12
news, 11
press releases, 11
wikis, 12
venue, 13
audience
comments, filtering
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
finding, xxvi
target audiences, determining
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
audio as a type of content (data attribute), 10
Australia, growth of social media, xxviii
baby boom (Post-World War II), 34
Baidu Tieba, sifting through big data, 71
BBC, 5
bias
big data, 72
finding, 69
looking for, 69
as natural resource, xxviii
paradox of choice, the, 70
sifting through
nonscoped/scoped datasets, 71
signal-to-noise ratio, 71
entertainment, 68
sharing, 69
social aspect, 68
BladeCenter (IBM), big data analysis example, 72
blogs. See also microblogs
data identification and type of content, 12
ESN, 171
identifying data in, 80
microblogging, xxvi
Yousafzai, Malala, 5
Bluemix, 204
Boardreader data aggregator, 58, 105, 192
Borse, Santosh, 170
Bowers, Jeffery, 122
Brazil, growth of social media, xxviii
Brown, Gordon, 123
Bryant, Randal, 66
Burns, Robert, 117
BusinessWeek, xxi
Calgary Floods project (SMA), 164-167
Cannes Lions 2013, hypothesis validation example (data analysis), 104, 110-112
CapGemini, 247
case study (data analysis), 227-228
conclusions, 247
data analysis
data identification, 228, 231-235
interpreting information, 244-247
“casting a net” (data collection), 19-23
chaff, separating wheat from, 17
charts
scaling issues, 215
China
growth of social media, xxviii
IBM and Chinese factories, 193-194
RenRen, 78
social media outlets, 74
choice, the paradox of, 70
classifying
data and stream computing, 135-136
leads (deep analysis), 160-161
clear communication in social analytics process, 195-198
Clegg, Nick, 123
Coase, Ronald, 193
Coca-Cola, 9
collecting data
Apple and Twitter example, 22-29
regular expressions, 23
Twitter and Apple example, 22-29
web page visits, computing, 20-21
wildcards, 23
color in data visualizations, 221
Comcast
customer satisfaction, xxi, xxii
NHL playoffs, xxii
comments, filtering
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
communication, transparency of (ESN), 171
communities (online), 12
computer architects and data model development, 192
conferences and real-time data analytics, 138-139
connotations (positive/negative), words with, 202
consumer reaction analysis, 204-209
Consumer Reports, social media as sharing, 69
Content Analytics (IBM), 204
content, type of (data attribute), 10
blogs, 12
discussion forums, 12
instructions, 11
microblogs, 12
news, 11
press releases, 11
wikis, 12
context, structuring data via, 63
conversations
social media as, xx, xxi, xxii, xxiii
starting, xxvi
Cowper, William, 61
Crow, Sheryl, 155
customer satisfaction
customizing/modifying tools in the social analytics process, 201-203
data
attributes of, 7
language, 9
ownership of data, 14
region, 9
structure, 8
time, 14
venue, 13
deduplicating, 200
defining, 2
information, defining, 3
interpreting, xxv, xxvi, xxvii
modeling
defining, xxv
model development, 192
motion, data at, 14
noisy data
defining, 3
private data, 15
proprietary data, 15
public data, 15
refining, 192
rest, data at, 14
states of, 14
uniqueness of, 200
unprocessed data, 2
velocity of (taxonomy of data analysis), 84
data in motion, 99
wisdom, defining, 3
data analysis, xxx
ad hoc analysis, 87
external social media (domain of analysis), 90
internal social media (domain of analysis), 95
audience comments, filtering
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
conclusions, 247
data analysis (first pass), 235-241
data analysis (second pass), 243-244
data identification, 228, 231-235
interpreting information, 244-247
chaff, separating wheat from, 18
data collection
calculating web page visits, 20-21
data interpretation, xxv
data modeling, xxv
data visualization, xxv
Evolving Topics algorithm, 163-164
external social media (domain of analysis), 90-93
internal social media (domain of analysis), 95-97
relationship matrixes, 92
suggested action phase, 161-163
support via analytics software, 163-167
descriptive analytics, 54
defining, 53
predictive analytics versus, 48-49
Simple Social Metrics, 53
hypotheses, validating, 103
Cannes Lions 2013 example, 104, 110-112
Grammy Awards example, 104, 112-113
youth unemployment example, 104-110
IBMAmplify case study, 227-228
conclusions, 247
data analysis (first pass), 235-241
data analysis (second pass), 243-244
data identification, 228, 231-235
interpreting information, 244-247
iterative methods and, 117-119
marketing and, xxvi
near real-time analytics, 86, 121-123
predictive analytics
defining, 49
descriptive analytics versus, 48-49
real-time analytics, 121
2008 presidential debates, 122-124
as early warning system, 139
IBM Always On Engagement Center, 125
near real-time analytics versus, 123
real-time views, xxv
relationship matrixes, xxv
stream computing, 126
components of streams, 128-130
filters, 127
IBM InfoSphere Streams, 128
real-time data analytics, 128
REST and, 132
Streams Studio IDE, 129
target audience, determining
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
taxonomy of analysis, 83
domain of analysis, 84, 88-90, 94-96, 169
duration of analysis, 90-91, 96
machine capacity, 84-86, 90-91, 94-98
themes, discovering, 103, 113-117
topics, discovering, 103, 113-117
trends, discovering, 103, 113-117
Twitter, xxv
value pyramid, 18
data collection
Apple and Twitter example, 22-29
regular expressions, 23
Twitter and Apple example, 22-29
web page visits, computing, 20-21
wildcards, 23
data identification
attributes of data, 7
language, 9
ownership of data, 14
region, 9
structure, 8
time, 14
venue, 13
filtered data, defining, 3
hypothesis validation and, 105-108
information, defining, 3
noisy data, defining, 3
social media outlets, 74
blogs, 80
Facebook, 77
information sharing sites, 78-79
professional networking sites, 75-76
RenRen, 78
wikis, 80
unprocessed data, defining, 2
value pyramid, 3
wisdom, defining, 3
Data Services (Enterprise Graphs), 174
datasets (nonscoped/scoped), 71
Data Sources (Enterprise Graphs), 174
data streams (SPL), 129
data visualization, xxv, 211-212
color, 221
effectiveness of, 213
information overload, 219
scaling issues, 215
scatter plots, 218
Dave, Hardik, 170
Davidzenka, Mila, 105
Davis, Colin, 122
debates (presidential)
2014, 12
deconstructing knowledge creation (ESN), 172
deduplication of data, 200
Evolving Topics algorithm, 163-164
external social media (domain of analysis), 90-93
internal social media (domain of analysis), 95-97
leads
relationship matrixes, 92
suggested action phase, 161-163
support via analytics software, 163-167
demographics
Facebook, 77
LinkedIn, 76
RenRen, 78
Twitter, 80
depth of analysis (taxonomy of data analysis), 84-85
descriptive analytics, 54
defining, 53
predictive analytics versus, 48-49
Simple Social Metrics, 53
detectives, social media analysts as, xxiv
directed graphs and stream computing, 130-133
discovery/innovation in ESN, 172
discussion forums
data identification and type of content, 12
ESN, 172
domain of analysis (taxonomy of data analysis), 84, 169
external social media, 88
ad hoc analysis, 90
ad hoc analysis, 95
Doyle, Arthur Conan, xx
duplicated data in social analytics process, 198-200
duration of analysis (taxonomy of data analysis), 90-91, 96
early warning system, real-time data analytics as, 139
Econsultancy, 83
Edwards Air Force Base, 189
egrep (Extended Global Regular Expressions Print), 25-27
eliminating data based on validity, 21-25
eminence/popularity and data analysis, 35, 42-44
Eminence Scorecard KPI in PSD, 177, 181-182
employees
ESN employee-to-employee interactions, 172-173
job roles and data analysis, 35
performance and Enterprise Graphs, 186
privacy, 170
public vs employee comments, 31-32
Enterprise Graphs
Analytics, 174
Analytics Services, 174
API, 186
Data Services, 174
Data Sources, 174
employee performance, 186
Graph Store, 174
PSD, 175
Activity Scorecard KPI, 177-180
assessing business benefits, 183-185
benefits of, 176
Eminence Scorecard KPI, 177, 181-182
Network Scorecard KPI, 177, 183
Reaction Scorecard KPI, 177, 180-181
sales outcomes, 186
Enterprise (Star Trek), 4
entertainment, social media as, 68
ESN (Enterprise Social Networks), 88, 169
blogs in, 171
discovery/innovation, 172
discussion forums, 172
employee-to-employee interactions, 172-173
Enterprise Graphs, components of, 174-175
IBM and, 170
knowledge, 172
PSD, 175
Activity Scorecard KPI, 177-180
assessing business benefits, 183-185
benefits of, 176
Eminence Scorecard KPI, 177, 181-182
future of Enterprise Graphs, 185-186
Network Scorecard KPI, 177, 183
Reaction Scorecard KPI, 177, 180-181
transparency of communication, 171
ESPN, 155
Europe, social media outlets, 75
evolving topics, 163-164, 206-209
expertise/profession and data analysis, 34
expressions (regular), 23
external social media (domain of analysis)
data at rest
SSM, 90
data in motion, 88
ad hoc analysis, 90
deep analysis, 90
SSM, 89
Facebook
big data, sifting through, 71
consumer reaction analysis study, 205
data identification and type of content, 10
demographics, 77
fan pages, 77
groups, 78
identifying data in, 77
online communities, 12
public data, 15
sentiment analysis, 203
social aspect of social media, 68
social media as sharing, 69
timelines, 77
fan pages (Facebook), 77
feedback (objective), 31
feedback loops, 118
filtering
comments
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
data, 192
choosing filter words, 198
defining, 3
filters (stream computing), 127
finding
an audience, xxvi
big data, 69
the right data (social analytics process), 193-194
forums (discussion)
data identification and type of content, 12
ESN, 172
Foundation for Biomedical Research, 12
Friedlein, Ashley, 83
Fuechsel, George, 1
“garbage in, garbage out”, 1
gender and data analysis, 34, 41-42
Generation X, 34
geography, audience comments and data analysis, 33, 39-41
Gessner, Mila, 158
Goethe, Johann Wolfgang von, 157
Grammy Awards
hypothesis validation example (data analysis), 104, 112-113
Twitter and 2014 Grammy Awards, xxix
graphs
directed graphs and stream computing, 130-133
Enterprise Graphs
Analytics, 174
Analytics Services, 174
API, 186
Data Services, 174
Data Sources, 174
employee performance, 186
Graph Store, 174
sales outcomes, 186
groups
Facebook groups, 78
groups (top word), xxv
Harvard Business Review, 212
Hawthorne, Nathaniel, 47
Holmes, Sherlock, xx
Holmes, Sr., Oliver Wendell, 195
House of Cards, xxx
Huffington Post, 13
Hurricane Sandy consumer reaction analysis study, 204-209
Hyde Park, London, 81
hypotheses, validating (data analysis), 103
Cannes Lions 2013 example, 104, 110-112
Grammy Awards example, 104, 112-113
youth unemployment example, 104
data identification/analysis, 105-108
IBM
Always On Engagement Center, 125
Chinese factories and, 193-194
comment filtering example, 35-37
Content Analytics, 204
eminence/popularity and data analysis, 44
ESN and, 170
IBM Academy of Technology, 53
IBM BladeCenter, big data analysis example, 72
IBM Commerce, 227
IBM Connections, 94
IBM DeveloperWorks, 58
IBM InfoSphere Streams, 128
IBM Singapore, 51
ICA, 207
Insight 2014 conference, 90, 172
Project Breadcrumb, 170
Activity Scorecard KPI, 177-180
assessing business benefits, 183-185
benefits of, 176
Eminence Scorecard KPI, 177, 181-182
future of Enterprise Graphs, 185-186
Network Scorecard KPI, 177, 183
Reaction Scorecard KPI, 177, 180-181
SMA, 204
data analytics case study, 235-240
deep analysis and, 158
Social Listening, 158
SPL
data streams, 129
jobs, 129
operators, 129
PE, 129
ports, 129
Twitter and IBM-specific handles, 233
Watson Content Analytics, 207
IBMAmplify data analytics case study, 227-228
conclusions, 247
data analysis
data identification, 228, 231-235
interpreting information, 244-247
ICA (IBM Content Analytics), 207
IDC, ESN, 88
IDE (Integrated Development Environment) and stream computing, 129
identifying data
attributes of data, 7
language, 9
ownership of data, 14
region, 9
structure, 8
time, 14
venue, 13
filtered data, 3
hypothesis validation and, 105-108
information, defining, 3
noisy data, defining, 3
social media outlets, 74
blogs, 80
Facebook, 77
information sharing sites, 78-79
professional networking sites, 75-76
RenRen, 78
unprocessed data, 2
value pyramid, 3
wikis, 80
wisdom, defining, 3
identifying leads (deep analysis), 158-159
immediacy in social media, 47
India
growth of social media, xxviii
social media outlets, 75
information
defining, 3
data visualizations and information overload, 219
information sharing sites, identifying data in, 78-79
innovation/discovery in ESN, 172
Insight 2014 conference (IBM), 172
Instagram as “in the moment” media type, 47
instructions, data identification and type of content, 11
internal social media (domain of analysis), 88
data at rest
SSM, 96
data in motion, 94
ad hoc analysis, 95
deep analysis, 95
SSM, 94
Internet Statistics and Market Research Company eMarketer, xxviii
interpreting data, 244-247, xxv, xxvi, xxvii
“in the moment” media types, 47
investigation, social media as, xxiv
iPad, Twitter data collection/filtering example, 22-29
IT architects and data model development, 193
iterative methods and data analysis, 117-119
Japan, growth of social media, xxviii
JavaScript, JSON and stream computing, 133-136
J.D. Power, North America Airline Satisfaction Study, 39
JetBlue, positive/negative experiences and data analysis, 38
jobs
data analysis and job roles, 35
SPL, 129
.jpg files, wildcards, 23
JSON (JavaScript Object Notation) and stream computing, 133-136
Katz, Randy, 66
keywords
data identification and hypothesis validation (data analysis), 105-108
Kintz, Jarod, 14
knowledge
ESN
deconstructing the creation of, 172
redistribution of, 172
Kohirkar, Avinash, 43
KPI (Key Performance Indicators) in PSD
Activity Scorecard KPI, 177-180
Eminence Scorecard KPI, 177, 181-182
Network Scorecard KPI, 177, 183
Reaction Scorecard KPI, 177, 180-181
Kremer-Davidson, Shiri, 170
language
data attribute, 9
NLP, defining, xxx
Lazowska, Edward, 66
leads (deep analysis)
LinkedIn
data identification and type of content, 10
demographics, 76
identifying data in, 76
online communities, 12
sentiment analysis, 55, 76, 203
user profiles, 76
location, audience comments and data analysis, 33, 39-41
London, England, 81
loops (feedback), 118
Lotus Notes Mail, 172
Lynd, Robert Staughton, 17
machine capacity (taxonomy of data analysis), 84-86, 90-91, 94-98
Maraboli, Steve, 121
marketing and data analysis, xxvi
matrixes (affinity), 244
Memon, Amina, 122
Mexico, growth of social media, xxviii
microblogs, xxvi, 12. See also blogs
consumer reaction analysis study, 205
data identification, 12, 79-80
sentiment analysis, 203
Microsoft, defining big data, 66
Middle East, growth of social media, xxviii
modeling data, defining, xxv
modifying/customizing tools in the social analytics process, 201-203
motion, data at (states of data), 14
Murphy, Capt. Edward A., 189
Murphy’s Law, 189
NASA, defining big data, 65
natural resource, big data as, xxviii
near real-time data analysis, 86, 121-123
Neeleman, David, 38
negative/positive bias and data analysis, 31-32
negative/positive connotations, words that can have, 202
negative/positive experiences, 37-39
Netflix, xxx
nets, casting (data collection), 19-23
network architects and data model development, 192
Network Scorecard KPI in PSD, 177, 183
networking sites (professional), identifying data in, 75-76
news, data identification and type of content, 11
New York Times, 5
NHL playoffs, Comcast customer satisfaction, xxii
NIST (National Institute of Standards and Technology), defining big data, 66
NLP (Natural Language Processing), defining, xxx
noisy data
defining, 3
filtering
nonscoped datasets, sifting through big data, 71
Obama, President Barack, 14, 86
objective feedback, 31
observations in structured data, 64
Occupy Wall Street movement, 33
Olympics (Summer) data visualization scaling example, 215
online communities, 12
operators (SPL), 129
Oracle Corporation, defining big data, 66
overloaded information in data visualizations, 219
ownership of data (data attribute), 14
Pakistan, 5
Pandya, Aroop, 170
Paradox of Choice: Why More Is Less, The, 70
PE (Processing Elements), SPL, 129
Pepsi, 9
performance (employee) and Enterprise Graphs, 186
PETA (People for the Ethical Treatment of Animals), 12
Pew Research Center, social media traffic, 41
photos/pictures as a type of content (data attribute), 10
phrases (top word), xxv
Picasso, Pablo, 211
pictures/photos as a type of content (data attribute), 10
Pinterest, data identification and type of content, 10
Plurad, Jason, 170
popularity/eminence and data analysis, 35, 42-44
ports (SPL), 129
positive/negative bias and data analysis, 31-32
positive/negative connotations, words that can have, 202
positive/negative experiences and data analysis, 37-39
Post-World War II baby boom, 34
predictive analytics
defining, 49
descriptive analytics versus, 48-49
presidential debates
2012, 14
presidential election (2012), 86
Press, Gil, 65
press releases, data identification and type of content, 11
privacy and employees, 170
private data, 15
professional networking sites, identifying data in, 75-76
profession/expertise and data analysis, 34
Project Breadcrumb (IBM), 170
proprietary data, 15
PSD (Personal Social Dashboard), 170, 175
benefits of, 176
business benefits, assessing, 183-185
Enterprise Graphs, the future of, 185-186
KPI
Eminence Scorecard, 177, 181-182
Reaction Scorecard, 177, 180-181
public data, 15
public versus employee comments, 31-32
qualifying leads (deep analysis), 160-161
quantitative forecasting, 51-53
questions, posing (social analytics process), 190
Reaction Scorecard KPI in PSD, 177, 180-181
real-time data analytics, 121
2008 presidential debates, 122-124
as early warning system, 139
IBM Always On Engagement Center, 125
near real-time analytics versus, 123
real-time views, xxv
stream computing, 128
redistribution of knowledge (ESN), 172
refining data (social analytics process), 192
region (data attribute), 9
regular expressions, 23
Reilly, Rick, 155
Reisner, Rebecca, xxi
relationship matrixes (deep analysis), 92, xxv
relevancy of data and the data identification process, 5-7
representing data. See data modeling
rest, data at (states of data), 14
REST (Representational State Transfer), SSM and stream computing, 132
Robbins, Naomi, 215
Robinson, David, 170
roles (job) and data analysis, 35
Rometty, Ginni, xxviii
Royal Holloway University of London, 122
Russia, growth of social media, xxviii
sales outcomes and Enterprise Graphs, 186
Salmon of Doubt, The, 31
Sandy (Hurricane) consumer reaction analysis study, 204-209
SapphireNow, big data analysis example, 74
satisfaction
customer satisfaction
scaling issues with data visualization, 215
scatter plots, 218
Schwartz, Barry, 70
Science Magazine, 11
scoped datasets, sifting through big data, 71
Scott, Chief Engineer Montgomery (Star Trek), 4
selecting tools in the social analytics process, 204
sentiment analysis, 202
defining, xxx
descriptive analytics and, 55-57
LinkedIn and, 76
microblogs, 203
predictive analytics and, 51-53
seven attributes of data, 7
language, 9
ownership of data, 14
region, 9
structure, 8
time, 14
type of content, 10
blogs, 12
discussion forums, 12
instructions, 11
microblogs, 12
news, 11
press releases, 11
wikis, 12
venue, 13
sharing, social media as a way of, 69
Shirk, Adam Hull, 189
sifting through big data
nonscoped/scoped datasets, 71
signal-to-noise ratio, 71
signal-to-noise ratio, sifting through big data, 71
Simple Social Metrics, 53
SMA (Social Media Analytics), 192, 204
Calgary Floods project, 164-167
data analytics case study, 235-240
deep analysis and, 158
evolving topics, 163-164, 206-209
SnapChat, data identification and type of content, 10
social analytics
choosing filter words, 198
configuring tools, 192
consumer reaction study, 204-209
customizing/modifying tools, 201-203
developing a data model, 192
finding the right data, 193-194
posing questions, 190
selecting tools, 204
troubleshooting
consumer reaction analysis, 204-209
customizing/modifying tools, 201-203
filtering data, 198
selecting tools, 204
Social Listening (IBM), 158
social media
as a way of sharing, 69
big data as, 67
entertainment, 68
sharing, 69
social aspect, 68
China, 74
as conversation, xx, xxi, xxii, xxiii
as entertainment, 68
Europe, 75
external social media (domain of analysis), 88
ad hoc analysis, 90
growth of, xxviii
identifying data in, 74
blogs, 80
information sharing sites, 78-79
professional networking sites, 75-76
wikis, 80
India, 75
internal social media (domain of analysis), 88
ad hoc analysis, 95
as investigation, xxiv
social aspect of, 68
social sites, identifying data in, 77-78
Solis, Brian, 169
South Africa, 9
South Korea, growth of social media, xxviii
Speakers’ Corner (Hyde Park, London), 81
specific audiences and data analysis, 35
SPL (Streams Processing Language)
data streams, 129
jobs, 129
operators, 129
PE, 129
ports, 129
Sprout Social, positive/negative experiences and data analysis, 38
SSM (Simple Social Metrics), 85, 124
data at rest
external social media, 90
internal social media, 96
data in motion
external social media, 89
internal social media, 94
word clouds, 136
Stapp, Dr. John Paul, 189
Star Trek, 4
Stikeleather, Jim, 212
stream computing, 126
filters, 127
IBM InfoSphere Streams, 128
real-time data analytics, 128
REST and, 132
SPL
data streams, 129
jobs, 129
operators, 129
PE, 129
ports, 129
word clouds, 136
Streams Studio IDE, 129
structured data, 8
attributes in, 64
context’s role in, 63
observations in, 64
unstructured data versus, 63-64
suggested action phase (deep analysis), 161-163
Summer Olympics data visualization scaling example, 215
Super Bowl and Twitter, 8
system architects and data model development, 192
Taliban, 5
target audience, determining
eminence/popularity, 35, 42-44
objective feedback, 31
profession/expertise, 34
public versus employee comments, 31-32
roles (job), 35
specific audiences, 35
taxonomy of data analysis, 83
internal social media, 88, 94-96
duration of analysis, 90-91, 96
machine capacity, 84-86, 90-91, 94-98
velocity of data, 84
data in motion, 99
TED Talks, 170
tennis, 35
Te’o, Manti, 154
text as a type of content (data attribute), 10
themes, discovering (data analysis), 103, 113-117
Thirteen Reasons Why, 48
“Three Elements of Successful Data Visualizations, The”, 212
time (data attribute), 14
Time Magazine, 13
timelines (Facebook), 77
timing
“in the moment” media types, 47
Tolkien, J. R. R., 141
topics
discovering (data analysis), 103, 113-117
top word groups/phrases, xxv
transparency of communication (ESN), 171
trends
discovering (data analysis), 103, 113-117
topics in social media, 47
troubleshooting
data visualizations
color, 221
information overload, 219
social analytics process
consumer reaction analysis, 204-209
customizing/modifying tools, 201-203
filtering data, 198
selecting tools, 204
Tumblr, sifting through big data, 71
Twitter
animal testing debate, 12
Apple example and data collection, 22-29
as “in the moment” media type, 47
Citibank and, 197
consumer reaction analysis study, 205
customer satisfaction, xxi, xxii
data analysis, xxv
demographics, 80
eminence/popularity and data analysis, 44
Grammy Awards (2014), xxix
IBM-specific handles, 233
identifying data in, 80
positive/negative experiences and data analysis, 38
public data, 15
sentiment analysis, 203
social media as sharing, 69
SSM and, 85
Super Bowl, 8
type of content (data attribute), 10
blogs, 12
discussion forums, 12
instructions, 11
microblogs, 12
news, 11
press releases, 11
wikis, 12
Tzu, Sun, xxvi
unemployment (youth), hypothesis validation example (data analysis), 104
data identification/analysis, 105-108
unfiltered (noisy) data
defining, 3
processing
uniqueness of data, 200
United Kingdom, 9
United Nations, 7
United States
growth of social media, xxviii
presidential debates
2012, 14
presidential election (2012), 86
unprocessed data, defining, 2
unstructured data, 8
defining, 64
user profiles (LinkedIn), 76
US Open (Tennis), 35
validating
a hypothesis (data analysis), 103
Cannes Lions 2013 example, 104, 110-112
Grammy Awards example, 104, 112-113
youth unemployment example, 104-110
valuing data
big data, defining, 66
variety and defining big data, 66
velocity
big data, defining, 66
of data (taxonomy of data analysis), 84
data in motion, 99
external social media (domain of analysis), 88
ad hoc analysis, 90
deep analysis, 90
SSM, 89
internal social media (domain of analysis)
ad hoc analysis, 95
SSM, 94
venue (data attribute), 13
veracity and defining big data, 66, 69
video as a type of content (data attribute), 10
viewing data in real time (data analysis), xxv
Vine as “in the moment” media type, 47
visualizing data, xxv, 211-212
color, 221
effectiveness of, 213
information overload, 219
scaling issues, 215
scatter plots, 218
volume and defining big data, 66
Watson Content Analytics (IBM), 207
Watson, Dr., xx
web page visits, computing, 20-21
Western Governors University, 12
wheat, separating from chaff, 17
Whiting, Anita, 68
Why Greatness Cannot Be Planned, 211
wikis
data identification and type of content, 12
identifying data in, 80
wildcards, 23
Williams, David, 68
Winfrey, Oprah, 35, 144, 193, 222
wisdom, defining, 3
Wonder Bread, 232
Wong, Kyle, 11
Wong, Shara LY, 51
word clouds, xxv, xxx, 224-225
defining, 153
stream computing and, 136
word groups/phrases, xxv
World War II baby boom, 34
“worm” graph (2008 presidential debates), 122
youth unemployment, hypothesis validation example (data analysis), 104
data identification/analysis, 105-108
YouTube
data identification and type of content, 10
JetBlue and customer positive/negative experiences, 39