Index

Access control

content-based, 478

manager, 399

model, 282–283

and policy ontology modeling, 326

system, 408

Access control policies, 15, 16

attribute-based access control, 19

authorization-based access control policies, 16–18

role-based access control, 18–19

usage control, 19

Access Token List (AT-list), 283

Access Tokens (AT), 282

assignment, 283–284

Access Token Tuples (ATT), 282

Accuracy-weighted classifier ensembles (AWEs), 343

Actual data, 357

ADABOOST.PL algorithms, 193

ADCi, see Aggregated dissimilarity count

“Added error”, 117

Administration policies, 20; see also Access control policies

Advanced CPT system, 382–383, 384

Advanced Encryption Standard (AES), 331, 333, 336

Adversaries, 403

adversarial data miner, 400

Aerosol optical depth (AOD), 435

AES, see Advanced Encryption Standard

AFOSR, see Air Force Office of Scientific Research

Aggregated dissimilarity count (ADCi), 153

Aggregate object, 521

Airavat, 440

Aircraft equipment problem, 161

Air Force Office of Scientific Research (AFOSR), 308

Air quality data, 435

AIS, see Assured information sharing

Alchemy 1.0, 443–444

Alchemy 2.0, 443, 444

ALKH12b approach, 193

All technique, 122, 123

Amazon, 332

Amazon S3, 331

integrating blackbook with Amazon S3, 331–335

server, 336

Amazon Web Services (AWS), 83

DynamoDB Accelerator, 83

Android applications, 418

Android ViewServer, 423–424

Android WindowManager, 424

Annotations, 393–396

ANNs, see Artificial neural networks

Anomalies, 189

anomalous data generation process, 228

Anomaly detection, 47, 190, 223–224, 227, 414; see also Graph-based anomaly detection (GBAD)

over IoT network traffic, 411

in social network and author attribution, 252–253

AOD, see Aerosol optical depth

Apache

Accumulo, 440

Apache-distributed file system, 309–310

Apache Pig, 80

Cassandra, 57, 82

CouchDB, 56, 82

Flink, 80

Hadoop, 56, 79

HBase, 56, 82

HDFS, 51

Hive, 56, 79, 81

Kafka, 80

Mahout, 83

Sentry, 440–441

Spark, 80, 409, 439, 447, 458

Storm, 80, 437, 439

APIs, see Application program interfaces

Apple’s official scrutinization process, 418

Application program interfaces (APIs), 81, 324, 415, 447

Applications, 302–303

Android, 418

cyber security, 46

data mining, 74–75

framework relationship, 502

independent integrity constraints, 515

security, 427, 460

of SNOD, 300

App Store, 421

AprioriTid, 35

Arbitrary model, 221

Architectural issues, 509–510

Area under ROC curve (AUC), 163

of ReaSC, 164–165

ARM, see Association rule mining

ARMA process, 409

Artificial drift, 228

Artificial neural networks (ANNs), 27, 28–31

ASRS, see Aviation Safety Reporting Systems

Association rule mining (ARM), 27, 35–37

Association rules, 35

Assured information sharing (AIS), 307

algorithm, 35

prototypes, 324

AT, see Access Tokens

AT-list, see Access Token List

Attack; see also Cyber security (CyS)

BDMA for preventing cyber attacks, 480

collusion, 252

computer, 47

covert channel, 420

on critical infrastructures, 45–46

data, 403

host-based, 47

network-based, 47

types, 142

AT&T Network (ATT), 403

Attribute-based access control, 15, 19

AUC, see Area under ROC curve

Audio signal, 409

Auditing, 21, 46, 517

Authentication, 20–21

Authorization-based access control policies, 16

conflict resolution, 17

consistency and completeness of rules, 18

negative authorization, 17

positive authorization, 17

propagation of authorization rules, 17

special rules, 17–18

strong and weak authorization, 17

Authorization rules propagation, 17

Automatic image annotation, 39–40

Automatic vehicle location techniques, 405

Autonomy, 520

Availability, 47, 56, 79, 83, 193, 379, 380

Aviation Safety Reporting Systems (ASRS), 100, 161–162, 173

AWEs, see Accuracy-weighted classifier ensembles

AWS, see Amazon Web Services

Background

check score computation, 295

generator module, 366–367

Backup and recovery, 517

Baseline

approach, 107–110, 142–143

frameworks, 279

methods, 122, 162, 350–351

Basic graph patterns (BGPs), 64, 312

Basic Security Mode auditing program (BSM auditing program), 207

Batch learning

algorithms, 94

techniques, 106

BDMA, see Big data management and analytics

BDSP, see Big data security and privacy

Behavior(al)

analysis, 414

detection mechanisms, 414

detector, 414

signatures, 414

Behavioral feature extraction and analysis, 415

classification model development, 417

evolving data stream classification, 416–417

graph-based behavior analysis, 415–416

sequence-based behavior analysis, 416

BestL technique, 122, 123

Bestplan problem, 275

BGPs, see Basic graph patterns

Big data, 1, 173, 331, 394–396, 470

access control and privacy policy challenges in, 474

big dataset for insider threat detection, 244–245

business intelligence meets, 473

CPT within context of big data and social networks, 388–390

DBD dataset, 246–248

extensions for big data-based social media applications, 326–327

formal methods for preserving privacy while loading, 473

integrity for, 396

and IoT, 403–411

issues, 184

management, 192, 367–368

OD dataset, 245–246

problem, 494

results for big data set relating to insider threat detection, 245

securing big data in cloud, 473

stream mining as big data mining problem, 253

techniques for scalability, 192–193

technologies, 15

Big data analytics

applications, 302–303

baseline methods, 350–351

big data and cloud for malware detection, 340

binary n-grams, 351

datasets, 349–350

design and implementation of system, 344

EMPC, 352

empirical error reduction and time complexity, 345

ensemble construction and updating, 344

error reduction analysis, 344–345

experiments, 349

Hadoop/MapReduce framework, 345–347

for insider threat detection, 454

malicious code detection, 347–349

malware detection, 340–342

and management, 428, 461

modules of InXite, 291–302

premise, 290–291

privacy aware, 473

related work, 303–304, 342–344

security, and privacy, 5

for security applications, 472

and security layer, 9

for social media applications, 290

techniques, 181

Big data management

and cloud for assured information sharing, 308

commercial developments, 326

design of CAISS, 309–311

design of CAISS++, 312–321

design philosophy, 308–309

experiments, 336

extensions for big data-based social media applications, 326–327

formal policy analysis, 321

implementation approach, 321

integrating blackbook with Amazon S3, 331–335

overall related research, 324–326

related research, 322–324

related work, 321

system design, 309

Big data management and analytics (BDMA), 1, 9, 13, 79, 261, 339, 373, 377, 453, 469, 470–471, 483, 485

Apache Cassandra, 82

Apache CouchDB, 82

Apache HBase, 82

Apache Hive, 81

Apache Mahout, 83

cloud platforms, 83–84

curriculum development, 455–457

for cyber security, 480–481

directions for, 490–491

educational program and experimental infrastructure, 454

education and infrastructure program, 455

experimental BDMA systems, 4

experimental program, 457–459

Google BigQuery, 81

Google BigTable, 82

infrastructure tools to host BDMA systems, 79–80

layered framework, 486

MongoDB, 82

NoSQL database, 81

Oracle NoSQL database, 82–83

steps in, 4–5

supporting technologies for, 2–3

systems and tools, 80

technologies, 173

Weka, 83

Big data security and privacy (BDSP), 1, 9, 13, 261, 339, 373, 377, 453, 469, 485

big data analytics for security applications, 472

capstone course on, 461

community building, 472

directions for, 490–491

educational program and experimental infrastructure, 454

experimental BDSP systems, 4

issues in, 469

layered framework, 486

philosophy for, 475

research issues, 470

security and privacy, 471–472

steps in, 4–5

supporting technologies for, 2–3

Big data systems, 379

aspects of integrity, 392–393

big data, 394–396

cloud services, 394–396

data provenance, 391, 393–394

data quality, 393–394

inferencing, 393–394

integrity for big data, 396

integrity management, 391, 394–396

need for integrity, 391–392

BigOWLIM, 267, 279

BigQuery, 81

BigSecret system, 440

Big streaming analytics framework, 409

BigTable, 193

Binary classification, 31

Binary code analysis, 427, 455, 461

Binary signatures, 414

BioMANTA project, 193

Biometrics techniques, 20

BitMat, 267

Bit string, 476

Blackbook integration with Amazon S3, 331–335

BLS, see U.S. Bureau of Labor and Statistics

BOAT, see Bootstrapped optimistic decision tree

Bootstrapped optimistic decision tree (BOAT), 106

Botmaster, 122, 350

Botnet, 122

dataset, 100, 350

Bots, 350

Breaking ties by summary statistics, 277–278

BSM auditing program, see Basic Security Mode auditing program

Buffer (buf), 134

CAISS, see Cloud-based information sharing system

CAISS++, see Ideal cloud-based assured information sharing system

Calgary dataset, 227

CapEx, see Capital expenditure

Capital expenditure (CapEx), 332

Capstone BDMA course, 456

Capstone course

on BDSP, 461

on secure mobile computing, 428

Cassandra, Apache, 57, 82, 436, 438

CBIR, see Content-based image retrieval

CCA, see Computer Corporation of America

Centralized architecture, 509

Centralized CAISS++, 313–314

Centroid (µ), 133

Chief Information Officer (CIO), 307

Chi square statistic, 72

Chronic obstructive pulmonary disease (COPD), 435

Chunking, 439

Chunks, 197, 203, 227

“chunk-based” approaches, 417

CIE, see Confidentiality inference engine

CIO, see Chief Information Officer

Classes, 59

Classification, 27, 134

analysis and discussion, 137

deviation between approximate and exact q-NSC computation, 138–140

high-level algorithm, 133–134

justification of novel class detection algorithm, 137–138

model, 129, 418

with novel class detection, 133, 134–137

problem, 27

techniques, 28, 37

time and space complexity, 140–141

Classifier-based data mining technique, 171

Classify(M,xj,buf) algorithm, 133–134

Class/subclass hierarchy, 521

Client–server approach, 518

Client–server architectures, 493, 496

Cloud-based information sharing system (CAISS), 309–311, 373, 489

Cloud, 409

cloud-based system, 261, 289

cloud-centric policy manager, 308

cloud-design of InXite to handle big data, 301–302

cloud-enabled NoSQL systems, 56

data systems, 379

deployment models, 53

development and security, 428

provider, 55

storage and data management, 54–55

Cloudant, 84

Cloud computing, 51, 173, 237, 263, 307, 331, 332

cloud storage and data management, 54–55

components, 52

framework, 173

frameworks based on semantic web technologies, 63–65

for malware detection, 341

model, 51–52

preliminaries, 52–53

secure, 454–455, 461

technologies, 52

tools, 56–57

virtualization, 53–54

Cloud platforms, 83

Amazon Web Services’ DynamoDB, 83

Google’s cloud-based big data solutions, 84

IBM’s cloud-based big data solutions, 84

Microsoft Azure’s Cosmos DB, 83–84

Cloud query processing system for big data management

approach, 264

architecture, 267–269

cloud computing, 263

contributions, 265

evaluation, 280–281

experimental setup, 264, 279–280

MapReduce framework, 269–278

related work, 265–267

results, 279

security extensions, 281–285

Cloud services, 394–396

for integrity management, 394

models, 54

Clustering, 27, 132

algorithm, 39

cluster-impurity, 109

techniques, 28

CM, see Compression method

CMRJ, see Conflicting MapReduceJoins

CNSIL, see Computer Networks and Security Instructional Lab

Collusion attack, 252

Command sequences (cseq), 244

Communication

data, 410

devices, 405

energy-efficient, 410

small communication frames, 407

wireless communication networks, 404

Community building, 472

Complete elimination, 275

Completely labeled training data, 94

Complexity

analysis, 224

of Bestplan, 276

of inference engine, 365

Compound impurity-measure, 109–110

Compressed/quantized dictionary construction, 251–252

Compression-based techniques, 203

Compression method (CM), 221

Compression/quantization using MR, 243

Computer attacks, 47

Computer Corporation of America (CCA), 495

Computer Networks and Security Instructional Lab (CNSIL), 427

Concept-adapting very fast decision tree learner (CVFDT), 106

Concept-drifting data streams, 127, 171

baseline approach, 142–143

classification with novel class detection, 133–141

datasets, 141–142

datasets and experimental setup, 122

ECSMiner, 127–133

ensemble development, 115

error reduction using MPC training, 116–121

evaluation approach, 143

experiments, 121, 141, 142

MPC, 115–116

performance study, 122–125, 143

results, 143–147

Concept-drifting synthetic dataset (SynD), 161

Concept-evolving synthetic dataset (SynDE), 161, 166

Concept drift, 93–95, 141, 160, 253, 340, 373, 410

issues, 416

in sequence stream, 238

in stream data, 198, 218

SynDE, 161

synthetic data with, 99

in training set, 228–230

Concept evolution, 93, 95–97, 410

synthetic data with, 99

Concept instantiation, 60

Concept satisfiability, 60

Concept subsumption, 60

Concurrency control, 392, 513

Confidentiality, 379

approach to confidentiality management, 384–385

Confidentiality inference engine (CIE), 382–383

Confidentiality, privacy, and trust (CPT), 379, 380, 483, 489

advanced, 382–383, 384

approach to confidentiality management, 384–385

big data systems, 379

within context of big data and social networks, 388–390

framework, 381

integrated system, 387–388, 389

privacy for social media systems, 385–387

process, 382, 383

role of server, 381–382

trust for social networks, 387

trust, privacy, and confidentiality, 379–381, 383–384

Conflicting MapReduceJoins (CMRJ), 271

Conflicts, 284–285

resolution, 17

Consistency and completeness of rules, 18

Constraints, 24

constraint-based approaches, 109

Content-based access control, 478

Content-based image retrieval (CBIR), 38

Content-based score computation, 294

Control processing units, 405

Control systems, 405

Conventional data mining, 476

Conventional relational database management system, 436

COPD, see Chronic obstructive pulmonary disease

CoreNLP, 458

Cost estimation for query processing, 270–274

CouchDB, 56, 438, 490

Covert channel attack in mobile apps, 420

CPS, see Cyber-physical systems

CPT, see Confidentiality, privacy, and trust

Credit card fraud, 45

detection, 46, 95–96

Critical infrastructures, 43

attacks on, 45

cyber-physical, 455, 461

security for, 428

CRM, see Customer relationship management

Cryptographic approaches, 476, 477

Cryptographic commitment, 476

cseq, see Command sequences

Curriculum development, 426, 455–457, 460

capstone course on BDSP, 461

capstone course on secure mobile computing, 428

extensions to existing courses, 426–428, 460–461

integration of study modules with existing courses, 460

“Curse of dimensionality”, 39

Customer relationship management (CRM), 332

Cutset networks, 447

CVFDT, see Concept-adapting very fast decision tree learner

Cyber attacks, BDMA for preventing, 480

Cybercrime datasets, 190

Cyber-defense framework, 403

Cyber-physical systems (CPS), 405, 428

security, 455

Cyber-Provenance Infrastructure for Sensor-based Data-Intensive Research (CY-DIR), 454

Cyber security (CyS), 1–2, 43, 459

applications, 46

BDMA for, 480–481

cyber security threats, 43–46

data mining for, 43

data mining services for cyber security, 47

Cyber signals, 409

Cyber terrorism, 43–45

CY-DIR, see Cyber-Provenance Infrastructure for Sensor-based Data-Intensive Research

CyS, see Cyber security

DaaS, see Data as a Service

DAC, see Discretionary access control

DAG, see Directed acyclic graph

Data; see also Information; Security

accuracy, 393

acquisition, 419, 479

and applications security, 427, 460

authenticity, 393

classification methods, 447

collection, 363

completeness, 393

confidentiality, 478

controller, 359, 397

currency, 393, 395

gathering, 419

generation and storage, 267–268

lifecycle framework, 479

ownership, 479

points, 217

provenance, 393–394

publication, 479

quality, 393–394

quality policy, 395

recovery, 393

reduction techniques, 471

reverse engineering of Smartphone applications, 419

sanitization approaches, 476

science, 1, 453

services, 52

sharing, 408, 479

sources, 436

storage, 73–74

virtualization, 54

warehousing, 523

Data analytics, 436; see also Big data analytics

system, 408

techniques, 471

Data as a Service (DaaS), 53

Database

administration, 511–512

design process, 511

functions, 23–24

integrity, 515–516

virtualization, 54

Database administrator (DBA), 20, 511

Database management systems (DBMS), 507, 522–523

architectural issues, 509–510

autonomy, 520

centralized architecture, 509

database administration, 511–512

database design, 510–511

distributed databases, 517–518

entity-relationship data models, 507, 508–509

extensible, 511

functional architecture, 510

functions, 512–517

heterogeneous and federated data management, 518–520

object data model, 520–522

relational data models, 507, 508

three-schema architecture, 510

Database system, 21

developments in, 494–497

technology, 470

Data management systems, 407, 471, 493

building information systems from framework, 500–502

comprehensive view, 496

developments in database systems, 494–497

framework, 498–500

relationship between texts, 502–504

status, vision, and issues, 497

3D view, 499

Data mining, 3, 27, 43, 409

algorithms, 471

answering queries using Hadoop MapReduce, 74

applications, 44, 46, 74–75

ARM, 35–37

artificial neural networks, 28–31

challenges, related work, and approach, 68–69

for cyber security, 43

cyber security threats, 43–46

data mining-based malware detectors, 342

data mining applications, 74–75

data mining services for cyber security, 47

data storage, 73–74

feature extraction and compact representation, 70–72

image mining, 38–40

and insider threat detection, 68

for insider threat detection, 69

Markov model, 32–35

multiclass problem, 37–38

outcomes, 27

RDF repository architecture, 72–73

solution architecture, 69–70

support vector machines, 31–32

tasks, 27–28

techniques, 4, 27–28, 67

tools, 47–48

Data-oblivious learning mechanisms, 465

Data-obliviousness, 464

Data privacy, 16, 24–25

multiobjective optimization framework for, 476–477

Data security, 16

policy enforcement and related issues, 21–24

security impact on database functions, 25

security policies, 16–21

Dataset(s), 160–162, 279, 349–350

nonsequence data, 207–209

sequence data, 227–228

Data-sharing policies, 408

Data stream classification, 93, 149, 172

approach to data stream classification, 105–106

baseline approach, 107–109

challenges, 93–94

comparison with baseline methods, 163–165

concept drift, 94–95

concept evolution, 95–97

dataset, 160–162

directions in, 171–175

ensemble classification, 107–108, 156–160

experiments, 99–100, 160, 162–163

extensions, 172–175

infinite length, 94–95

with limited labeled data, 109–110

limited labeled data, 98–99

malware detection, 340–341

MPC ensemble approach, 171–172

network intrusion detection using, 106

novel class detection, 108

and novel class detection in data streams, 172

novelty detection, 108

outlier detection, 108–109

problems and proposed solutions, 94

ReaSC, 149–151

running times, scalability, and memory requirement, 165–166

with scarcely labeled data, 172

sensitivity to parameters, 166–168

single-model classification, 106–107

task, 127

training with limited labeled data, 152–156

Data streams, 3, 93, 127, 173, 410, 446, 457, 463

classification and novel class detection in, 172

classifiers, 417

constructing LZW Dictionary by selecting patterns, 221–222

DBA, see Database administrator

DBD, see Duplicate big data

DBMS, see Database management systems

DCS, see Distributed control systems

DDBMS, see Distributed database management system

DDTS, see Distributed Database Testbed System

Decentralized CAISS++, 314–315, 316

Decision trees, 27, 130–131

Deductive database systems, see Next-generation database systems

Deep learning, 494

Demand management, 404

Demographics-based score computation, 294

Department of Defense (DoD), 307

Description length (DL), 204, 415, 416

Description logics (DL), 59–60, 310, 358

Descriptive tasks, 27

Detecting anomalies, 27

DetectNovelClass, 135, 136

DGSOT, 47

Dictionary construction and compression using single MR, 243–244

Digital Equipment Corporation, 495

Digital forensics, 427

BDMA for, 480

DIM, see Distributed Integrity Manager

Directed acyclic graph (DAG), 362

Discretionary access control (DAC), 323–324

Discretionary security, 23–24

policies, 15

Dissimilarity count, 153

Distance-based techniques, 109

Distributed control systems (DCS), 405

Distributed database management system (DDBMS), 517–518

Distributed database systems, 264

Distributed Database Testbed System (DDTS), 495

Distributed feature extraction and selection, 348–349

Distributed Integrity Manager (DIM), 517

Distributed Metadata Manager (DMM), 517

Distributed processing of SPARQL, 319–320

Distributed processor (DP), 517, 518

Distributed Query Processor (DQP), 517, 518

Distributed reasoners (DRs), 312–313

Distributed reasoning, 325–326

Distributed Security Manager (DSP), 517

Distributed system, 264

Distributed Transaction Manager (DTM), 517

Diverse computing systems, 403

DL, see Description length; Description logics

DLL, see Dynamic-Link Library

DMM, see Distributed Metadata Manager

DoD, see Department of Defense

Domains, 362–363

DP, see Distributed processor

DQP, see Distributed Query Processor

DroidDream, 413

DRs, see Distributed reasoners

DSP, see Distributed Security Manager

DTM, see Distributed Transaction Manager

Duplicate big data (DBD), 245

dataset, 246–248

Dynamic-Link Library (DLL), 342

Dynamic analysis, 421

Dynamic chunk size, 173

Dynamic feature vector, 173

Dynamo, 193

E-count(v), 275

E-M technique, see Expectation-maximization technique

Early elimination heuristic, 277

EC, see Explicit content

ECSMiner, see Enhanced Classifier for Data Streams with novel class Miner

Efficiency, 391

Electronic patient record (EPR), 355–356

ElephantSQL, 84

Embedded systems, 405

Emergency room (ER), 435

EMPC, see Extended, multipartition, multichunk

Empirical error reduction and time complexity, 345

Encapsulation, 521

Enclave Page Cache (EPC), 462

Encoded sensing (ES), 410

Energy-efficient communication, 410

Enhanced Classifier for Data Streams with novel class Miner (ECSMiner), 95, 96, 100–101, 108, 109, 127, 129, 141, 142, 172

base learners, 131–132

creating decision boundary during training, 132–133

high level algorithm, 128–129

nearest neighborhood rule, 129–130

novel class and properties, 130–131

Enhanced policy engine, 310

Enhanced SPARQL query processor, 310

Ensemble

approach, 94, 209

construction and updating, 344

learning, 197–199, 218

refinement, 150–151, 156–160

size, 173

for supervised learning, 200–201

techniques, 417

training process, 160

for unsupervised learning, 199–200

update, 151, 160, 222–223

Ensemble-based insider threat detection, 197

ensemble for supervised learning, 200–201

ensemble for unsupervised learning, 199–200

ensemble learning, 197–199

Ensemble-based learning, 183

algorithms, 203

approach, 190

Ensemble-based stream mining, 76

Ensemble-based techniques, 177, 207

Ensemble-based USSL, 220

Ensemble classification, 107–108, 156

classification overview, 156

ensemble refinement, 156–160

ensemble update, 160

time complexity, 160

Entity

entity-relationship data models, 507, 508–509

extraction, 292–293

Entropy, 132

EPC, see Enclave Page Cache

EPR, see Electronic patient record

ER, see Emergency room

Erlang, 82

Error rates (ERR), 143, 145, 146

Error reduction

analysis, 344–345

using MPC training, 116

time complexity of MPC, 121

ES, see Encoded sensing

ETL, see Extract-transfer-load

Evaluation approach, 143

Evolved class, 156–159

Expectation-maximization technique (E-M technique), 109, 131, 150

optimizing objective function with, 154–155

Experimental activities, 419

covert channel attack in mobile apps, 420

large scale, automated detection of SSL/TLS, 421

location spoofing detection in mobile apps, 420

Experimental program, 457, 461

association between big data management and case studies, 457

coding for political event data, 458

geospatial data processing on GDELT, 458

laboratory setup, 461–462

programming projects to support lab, 462–465

timely health indicator, 459

Experimental system, 425–426

layer, 9

Expert systems support, 300–301

Explicit content (EC), 70

Explicit type information of object, split using, 269

Extended, multipartition, multichunk (EMPC), 344, 352, 489

Extended relational database systems, 521

eXtensible Access Control Markup Language (XACML), 307, 440

eXtensible Markup Language (XML), 15, 57, 58, 485

layer, 58

schemas, 61

security, 62

Extensions for big data-based social media applications, 326–327

Extensions to existing courses, 426, 460–461

big data analytics and management, 428

Critical Infrastructure Security, 428

data and applications security, 427

developing and securing cloud, 428

digital forensics, 427

integration of study modules with existing courses, 426

language-based security, 428

network security, 427

systems security and binary code analysis, 427

External attacks, 43–44

External threat detection, 189, 190

Extract-transfer-load (ETL), 56

Fading factor, 199

False detection, 197

False negatives (FN), 190, 197, 212, 230, 251

False positive rates (FPR), 183, 230

False positives (FP), 186, 190, 197, 212, 230, 251

Farthest-first traversal heuristic, 155

Fast classification model, 174

Fault

detection, 95

fault-tolerant computing, 24

tolerance, 393, 516

FDP, see Federated data processor

Feature extraction, 341, 347

Feature selection, 341, 347

Feature weighting, 175

Federated data management, 518–520

Federated data processor (FDP), 519

Field actuation mechanisms, 404

File organization, 73, 268

predicate object split, 74

predicate split, 73–74

Filtered outlier (F outliers), 97, 134–135

Firewalls, 407

First-order logic formulas and inference, 443

First-order Markov model, 34

Five Vs, see Volume, velocity, variety, veracity, and value

FN, see False negatives

Forecasting, 409

Forest cover dataset, 100

from UCI repository, 142

Formal policy analysis, 321, 324

Forming associations, 27

Foursquare, 289

F outliers, see Filtered outlier

FP, see False positives

FPR, see False positive rates

F pseudopoints, 135–136

Framework design, 437

mixed continuous and discrete domains, 444–446

offline scalable statistical analytics, 442–444

privacy and security aware data management for scientific data, 440–442

real-time stream analytics, 446–448

storing and retrieving multiple types of scientific data, 437–440

Framework integration, 320

Frequency, 221

Frequent itemset graph, 36, 37

“Friends-smokers” social network domain, 443, 444

Functional architecture, 510

Functional database systems, 522–523

Functionality, 415

Future system, 439–442, 444, 446

online structure learning methods for stream classification, 447–448

semisupervised classification/prediction, 446–447

Gaussian distribution, 141, 163, 204

GBAD, see Graph-based anomaly detection

GDELT, see Global Database of Event, Language, and Tone

Generating and populating knowledge base, 366

Generic problems, 456

Genetic algorithms, 109

Geospatial data processing on GDELT, 458

GFS, see Google File System

Gibbs sampling, 444

Gini index, 132

Global big data security and privacy controller, 400–401

Global data-mining models, 408

Global Database of Event, Language, and Tone (GDELT), 458

geospatial data processing on, 458

Google, 266

BigQuery, 79, 81

BigTable, 82

Calendar, 405

cloud-based big data solutions, 84

Compute Engine, 409

Google+, 289

Monkey tool, 423

Google File System (GFS), 82, 193, 438

GPS-equipped vehicles techniques, 405

Graph

analysis, 70

graph-based behavior analysis, 415–416

mining techniques, 69

rewriting, 361

transformation, 361

Graph-based anomaly detection (GBAD), 183–184, 190, 197, 203–204, 251; see also Anomaly detection

GBAD-MDL, 204

GBAD-MPS, 205

GBAD-P, 204–205

models, 488

Graphical models and rewriting, 361

Graphical user interface (GUI), 421

GREE88 dataset, 227

Ground truth, 198, 199, 220

Guest machine, 54

Guests, 54

GUI, see Graphical user interface

Hadoop, 193, 265, 463, 488

cluster, 244

distributed system setup, 351

storage architecture, 312, 318, 325

Hadoop distributed file system (HDFS), 51, 70, 79, 173, 174, 184, 237, 265, 312, 322

Hadoop/MapReduce, 438

framework, 181, 345–347

platform, 237–238, 490

technologies, 373

HAN, see Home area network

HAQU13a approach, 193

HAQU13b approach, 193

Hard subspace clustering, 71

Hardware, 279, 339

hardware-assisted security, 406

hardware-level security, 406

services, 52

virtualization, 54

Hardware security modules (HSMs), 406

HBase, 56, 436, 438, 490

HDFS, see Hadoop distributed file system

HDP, see Heterogeneous data processor

Healthcare, 1

architecture of methodologies, 437

for big data analytics and security, 433

framework design, 437–448

methodologies, 436–437

motivation, 433–436

Health Insurance Portability and Accountability Act (HIPAA), 356

Heart rate monitor, 407

Heterogeneity, 410

issue, 69

Heterogeneous components, 403

Heterogeneous data(base)

interoperability, 501

management, 518–520

systems, 496

types, 517

Heterogeneous data processor (HDP), 518–519

Heterogeneous IoT environment, 409

Heuristic model, 273–274

Hewlett Packard Company, 495

Open Cirrus Testbed, 51

Hexastore, 64, 267

Hijacked kernel function pointers, 455

HIPAA, see Health Insurance Portability and Accountability Act

Hive, 56, 79, 81, 438

Hive-based assured cloud query processing, 322

HiveQL, 81

HMLNs, see Hybrid MLNs

Home area network (HAN), 405

Homomorphic encryption schemes, 463

Host-based attacks, 47

Host BDMA systems, infrastructure tools to, 79–80

Host machine, 54

HSMs, see Hardware security modules

HTML, see Hypertext Markup Language

Hybrid CAISS++, 315–318

Hybrid cloud, 53

Hybrid high-order Markov chain models, 189

Hybrid layout, 319

Hybrid MLNs (HMLNs), 444

Hyperplane technique, 161

Hypertext Markup Language (HTML), 263

Hypervisor, see Virtual machine monitor

IaaS, see Infrastructure as a Service

IARPA, see Intelligence Advanced Research Project Activity

IBM

cloud-based big data solutions, 84

System R, 494

IBM, see International Business Machine Corporation

ICD, see International Classification of Diseases

ICDE, see International Conference on Data Engineering

ICE, see Immigration and Customs Enforcement

Ideal cloud-based assured information sharing system (CAISS++), 309, 312, 489

centralized, 313–314

decentralized, 314–315, 316

framework integration, 320

hybrid, 315–318

hybrid layout, 319

limitations, 312

naming conventions, 318

policy specification and enforcement, 320–321

Ideal model, 271–273

Identity

management, 51

theft, 45

IDS, see Intrusion detection systems

IG, see Information gain

Image mining, 38

automatic image annotation, 39–40

feature selection, 39

goal, 39

image classification, 40

IME, see Input method editor

IME/Update app, 425

Immigration and Customs Enforcement (ICE), 424

Implicit type information of object, split using, 269

Impurity measurement, 153

IMS, see Information management system

In-line reference monitor (IRM), 76

INAN12, 477

Incident management, 404

Incremental learning, 106, 183, 190, 191, 218, 219

Incremental probabilistic action modeling (IPAM), 191

Index, 208

Inference, 355

tools, 360, 400

web, 365

Inference control, 367–368

approach, 361–362

domains and provenance, 362–363

inference controller with two users, 363–364

through query modification, 361

SPARQL query modification, 364–365

Inference controller, 355, 360, 365, 400

approach, 365

architecture for, 356–360

background generator module, 366–367

generating and populating knowledge base, 366

implementation of medical domain, 365–366

with two users, 363–364

Inference engine, 359, 399

complexity, 365

Inferencing, 60–61, 393–394

Infinite length, 93–95, 340, 410

Infinite sequences, 217

Information; see also Data

integration, 292, 293

sharing manager, 399

systems from data management systems framework, 500–502

Information engine, 291

entity extraction, 292–293

information integration, 293

Information gain (IG), 71

Information management system (IMS), 81

Information Resource Dictionary System (IRDS), 495

Information technology (IT), 339, 405

Informix Corporation, 495

Infrastructure as a Service (IaaS), 53, 332

Infrastructure development, 421, 455

curriculum development, 426–428

virtual laboratory development, 421–426

INGRES, 15, 16, 494, 495

project at University of California at Berkeley, 23

Input events generation, 424

Input files selection, 270

Input method editor (IME), 424

Insider threat detection, 51, 67–68, 189–191, 209, 251; see also Malware detection; Security policies

additional experiments, 252

anomaly detection in social network and author attribution, 252–253

big data analytics for, 454

big data issues, 184

challenges, related work, and approach, 68–69

collusion attack, 252

comprehensive framework, 75–76

contributions, 185–186

data mining, 68, 69, 74–75

data storage, 73–74

feature extraction and compact representation, 70–72

GBAD, 183–184

incorporate user feedback, 252

RDF repository architecture, 72–73

for sequence data, 217–224

sequence stream data, 184

solution architecture, 69–70

stream data analytics applications for, 3–4

stream mining as big data mining problem, 253

as stream mining problem, 183, 184

SVMs, 251

Insider threats, 43–44, 67, 197, 203

analysis, 46

Instrumental behavior analysis, 415

Integrated system, 387–388, 389

Integration framework, 310–311

Integrity, 380, 391–392

aspects, 392–393

for big data, 396

constraints, 24, 393, 395

of data, 380

management, 394–396

Intellidimension RDF Gateway, 385

Intelligence Advanced Research Projects Activity (IARPA), 331

Intelligent fuzzer for automatic Android GUI application testing, 423

Intelligent transportation systems, 404

Intel SGX, 463, 465

Intel SGX-enabled machine, 461

SDK and SGX driver, 462

Interface manager, 358

International Business Machines Corporation (IBM), 494

International Classification of Diseases (ICD), 439

International Conference on Data Engineering (ICDE), 472

Internet of Things (IoT), 2, 377, 403–404, 433, 485

data protection, 407–408

layered framework for securing, 406–407

scalable analytics for IoT security applications, 408–411

use cases, 404–406

Interoperability, 57, 391

of heterogeneous database systems, 518

Interuser parallelization, 244

Intrusion, 46, 47

detection, 189, 407

Intrusion detection systems (IDS), 27, 414

InXite, 290, 291

application of SNOD, 300

cloud-based system, 289

cloud-design of InXite to handle big data, 301–302

expert systems support, 300–301

implementation, 302

information engine, 291–293

InXite-Law, 302

InXite-Marketing, 302

InXite-Security, 302

plug-and-play approach, 291

threat detection and prediction, 298–300

InXite POI

analysis, 293–298

profile generation and analysis, 293–294

threat analysis, 294–296

IoT, see Internet of Things

IPAM, see Incremental probabilistic action modeling

IRDS, see Information Resource Dictionary System

IRM, see In-line reference monitor

IT, see Information technology

Iterative conditional mode algorithm (ICM algorithm), 155

Jena (Java application programming package), 266, 385

Job JB, 271

JobTracker, 79

Joining variable, 275

Kafka, 448

KDD Cup 1999 intrusion detection dataset (KDD99), 100, 141–142, 160–161

KEND98 dataset, 207

Keynote presentations, 473

access control and privacy policy challenges in big data, 474

additional presentations, 474

authenticity of digital images in social media, 473

big data analytics, 473

business intelligence meets big data, 473

final thoughts, 474

formal methods for preserving privacy while loading big data, 473

privacy in world of mobile devices, 474

securing big data in cloud, 473

timely health indicators using remote sensing and innovation for validity of environment, 474

toward privacy aware big data analytics, 473

K-means clustering, 28

K-means clustering with cluster-impurity minimization (MCI-K means), 152–154

K models, 209

k-nearest neighbor algorithm (KNN algorithm), 40, 149, 342

classification model, 131

k-NN-based approach, 108

KNN algorithm, see k-nearest neighbor algorithm

Knowledge base, 282

Knowledge representation (KR), 59

Labeled data, 149, 211

K-means clustering with cluster-impurity minimization, 152–154

optimizing objective function with E-M, 154–155

problem description, 152

storing classification model, 155–156

training with limited, 152

unsupervised K-means clustering, 152

Labeled points, 155

Laboratory setup, 461–462

Language-based security, 428

Large scale, automated detection of SSL/TLS, 421

Last technique, 122, 123

Layered framework for secure IoT, 406–407

Layered security framework, 403

LBAC, see Location based access control

Learning classes

supervised learning, 203

unsupervised learning, 203–205

Learning models, 183

Lehigh University Benchmark (LUBM), 314

Lempel–Ziv–Welch algorithm (LZW algorithm), 220, 224, 237

constructing LZW Dictionary by selecting patterns, 221–222

dictionary construction using MR, 241–242

scalable LZW and QD construction using MR job, 238–244

Leveraging randomized response-based differential-privacy technique, 408

LIBSVM, 209

Lifted learning and approximations of pseudolikelihood, 445

Lightweight IP-based network stacks, 407

Lincoln Laboratory Intrusion Detection dataset, 207, 210–211

“Lineage”, 394

Link analysis, 28

LinkedIn, 289

L-model, 158

Location based access control (LBAC), 359, 398

Location spoofing detection in mobile apps, 420

Logic database systems, see Next-generation database systems

LOGITBOOST.PL algorithms, 193

Loop detectors, 404

Lossy compression process, 221

6LoWPAN, 407

LUBM, see Lehigh University Benchmark

LZW algorithm, see Lempel–Ziv–Welch algorithm

Machine learning, 409

algorithms, 83

techniques, 410, 417

Mahout, 193

Major mechanical problem, 98

Malicious applications, 418

Malicious code detection, 347

distributed feature extraction and selection, 348–349

nondistributed feature extraction and selection, 347–348

Malicious insiders, 3

Malicious intrusions, 45

Malware, 339, 347

behavior modeling, 415

dataset, 350

Malware detection, 46, 95, 340–342, 414–419; see also Insider threat detection

application to Smartphones, 418–419

behavioral feature extraction and analysis, 415–417

challenges, 414–415

cloud computing for, 341

contributions, 341–342

as data stream classification problem, 340–341

experimental activities, 419–421

infrastructure development, 421–426

reverse engineering methods, 417

risk-based framework, 417–418

in Smartphones, big data analytics for, 413, 414

Mandatory security policies, 15

Manual labeling of data, 149

Map input phase (MI phase), 272

Map keys (MKey), 346

Map output phase (MO phase), 272

Mappings, 509

MapReduce framework (MR framework), 51, 56, 70, 79, 184, 193, 237, 265–266, 269, 348, 428, 438, 456

breaking ties by summary statistics, 277–278

compression/quantization, 243

cost estimation for query processing, 270–274

input files selection, 270

join execution, 278–279

LZW dictionary construction, 241–242

paradigm, 458

processes, 265

query plan generation, 274–277

scalable LZW and QD construction, 238–244

technology, 193

MapReduceJoin (MRJ), 271

Map values (MVal), 346

Markov logic, 442

Markov logic networks (MLNs), 443

Markov model, 27, 32–35

Markov network, 443

Masquerade detection, 189, 190, 191

Massive data problem, 493–494

Maximum likelihood tree, 447

MaxWalksat, 444

MCI-K means, see K-means clustering with cluster-impurity minimization

MDL approach, see Minimum description length approach

Mean distance (μd), 133

Medical domain implementation, 365–366

Mermaid, 495

Metadata, 391

controller, 398

management, 514–515

Meteorological data, 446

Mica2 nodes running TinyDB applications, 410

Microcluster, 99, 132, 149

Microlevel location mining, 296

Microsoft Azure’s Cosmos DB, 83–84

Minimum cost plan generation problem, 275

Minimum description length approach (MDL approach), 69, 190, 204

Minimum support (minsup), 35

Minor mechanical problem, 98

Minor weather problem, 98

minsup, see Minimum support

MI phase, see Map input phase

Misapprehension, 197

Misuse detection, 47, 414

Mixed continuous and discrete domains, 444

approximate compilation for online inference knowledge, 445–446

lifted learning and approximations of pseudolikelihood, 445

MKey, see Map keys

MLNs, see Markov logic networks

Mobile devices, privacy in world of, 474

Mobile interfaces, 428

Mobile OS, 420

Mobile sensors, 405

Model update, 416

Modern transportation algorithms, 404

MongoDB, 56, 79, 82, 438, 458

MO phase, see Map output phase

Motivation, 433

air quality data, 435

need for case study, 435–436

problem, 433–435

system architecture, 434

MPC, see Multipartition and multichunk; Multiple partition and multiple chunk

MQTT, 410

MR framework, see MapReduce framework

MRJ, see MapReduceJoin

1MRJ approach, see Single map reduce job approach

2MRJ, see Two MapReduce jobs

Multichunk ensemble approach, 343

Multiclass novelty detection technique, 108

Multiclass problem, 37–38

Multidisciplinary approaches, 477–480

Multidisciplinary University Research Initiative (MURI), 324

Multilabel classification problem, 173

Multilabel instances, 173

Multimedia database systems, 522

Multimedia data management for collaboration, 500

Multiobjective optimization framework for data privacy, 476–477

Multipartition and multichunk (MPC), 94

Multiple partition and multiple chunk (MPC), 91, 115, 122, 123, 125, 171, 177

ensemble approach, 100, 107, 116, 171–172, 487

ensemble built on, 115

ensemble updating algorithm, 115–116

error reduction using MPC training, 116–121

Multiple shards in cluster, 83

Multiple video signals, 409

Multisource derivation, 442

Multistep Markovian model, 189

MURI, see Multidisciplinary University Research Initiative

“Muslim-brotherhood”, 290

Mutual information, 447

MVal, see Map values

MyHealtheVet Decision Support Tool, 434–435

Naïve Bayes (NB), 342

classification, 299–300

classifier, 47, 230

NB-INC, 230–232

Naming conventions, 318

National Institute of Standards and Technology (NIST), 52

National Science Foundation (NSF), 4, 469

SATC funded project CNS-1228198, 440

SATC funded project CNS-1237235, 440

National Security Agency (NSA), 290, 307

Natural language processing (NLP), 295, 455–456

NB, see Naïve Bayes

NCMRJ, see Nonconflicting MapReduceJoins

Nearest neighbor classification (NN classification), 150

Nearest neighborhood rule, 129–130

Negative authorization, 17

Network

intrusion detection, 95

network-based attacks, 47

security, 406, 407, 427

types, 403

Networking and Information Technology Research and Development (NITRD), 469

Next-generation database systems, 495–496, 522

Neyman–Pearson theory, 409

n-gram, 191, 347

NIST, see National Institute of Standards and Technology

NITRD, see Networking and Information Technology Research and Development

NLP, see Natural language processing

NN classification, see Nearest neighbor classification

Noise, 189

Non-SQL (NoSQL), 81

databases, 81, 368

system, 428, 437, 456

Nonconflicting MapReduceJoins (NCMRJ), 271

Nondistributed feature extraction and selection, 347–348

Nonrelational high performance database, 81

Nonsequence data, 207; see also Sequence data

dataset, 207–209

experimental setup, 209

results, 210

stream data, 251

supervised learning, 209–210, 210–212

unsupervised learning, 210, 212–214

Normative patterns, 220

Normative substructures, 197, 204

NoSQL, see Non-SQL

Novel class and properties, 130–131

Novel class detection, 3, 27, 51, 96, 134–137

analysis and discussion, 137

classification with, 133, 134

in data streams, 172

deviation between approximate and exact q-NSC computation, 138–140

high-level algorithm, 133–134

justification of algorithm, 137–138

time and space complexity, 140–141

Novel access control models, 407

Novelty detection, 108

Novice programmer, 183

NSA, see National Security Agency

NSF, see National Science Foundation

N-Triples, 72

Number of hops concept, 35

Object data model, 520–522

class/subclass hierarchy, 521

object-relational data model, 521–522

objects and classes, 520

Objects and classes, 520

OCSVM, see One-class support vector machine

OD, see Original data

Offline scalable statistical analytics, 442

current systems and limitations, 443–444

future system, 444

problem and challenges, 442–443

OLAP models, see On-line analytical processing models

OLINDDA model, 142, 147

On-line analytical processing models (OLAP models), 523

On Demand Stream approach (OnDS approach), 162–163

One-class classifiers, 108

One-class support vector machine (OCSVM), 183, 191, 197, 200, 203, 207, 209

algorithm, 190

OCSVM models, 488

One-pass learning paradigm, 94, 416

One time password (OTP), 331

One-VS-all approach, 38

One-VS-one approach, 37–38

Onion routing techniques, 407

Online inference knowledge, approximate compilation for, 445–446

Online reputation-based score computation, 295

Online structure learning methods for stream classification, 447–448

Ontologies, 487

security and, 63

Open provenance model (OPM), 361

Operating systems (OS), 53, 403, 419

level virtualization, 54

Operational expenditure (OpEx), 332

OpEx, see Operational expenditure

OPM, see Open provenance model

Optimizing objective function with E-M, 154–155

Oracle Corporation, 495

Oracle NoSQL database, 82–83

Original data (OD), 245

dataset, 245–246

OS, see Operating systems

OTP, see One time password

Outlier detection, 108–109

OWL, see Web Ontology Language

PaaS, see Platform as a Service

PAD algorithm, see Probabilistic anomaly detection algorithm

PANG04 techniques, 129

Parallel boosting algorithms, 193

Parallel database systems, 522

Parameter

reduction, 174

sensitivity, 146

Partial elimination, 275

Partially labeled data, 94

Particulate matter (PM), 433

Partitioner, 237

PARV12a approach, 192

PCA, see Principal component analysis

PCS systems, see Process control systems

PDP, see Policy Decision Point

Pedigree, 394

Peer effect, 303

Peer-to-peer (P2P), 100, 122, 350

Pellet, 400–401

PEP, see Policy Enforcement Point

Perceptron, 28–29

Person of interest (POI), 293

analysis, 293

InXite POI profile generation and analysis, 293–294

InXite POI threat analysis, 294–296

InXite psychosocial analysis, 296

sentiment mining, 297–298

PEs, see Portable Executables

PET, see Privacy-enhancing symposium

PETRARCH, 458

Physical system stream data, 409

PIE, see Privacy inference engine

Pig Latin, 80

Pig query language, 438

Platform as a Service (PaaS), 53, 332

Platform for Privacy Preferences (P3P), 380

PLCs, see Programmable logic controllers

Plug-and-play approach, 291

PM, see Particulate matter

PM2.5 observations, 435, 446

POI, see Person of interest

Point sensors, 404–405

Policy Decision Point (PDP), 334

Policy enforcement and related issues, 21

discretionary security and database functions, 23–24

policy specification, 23

query modification, 23

SQL extensions for security, 22–23

Policy Enforcement Point (PEP), 334

Policy engine, 312, 426

Policy manager, 357–358, 360, 398–399

Policy specification and enforcement, 320–321

Political event data, coding for, 458

Portable Executables (PEs), 350

POS, see Predicate Object Split

Positive authorization, 17

P2P, see Peer-to-peer

P3P, see Platform for Privacy Preferences

Predicate Object Split (POS), 267

Predicate object split, 74

Predicate split (PS), 73–74, 267, 269

Prediction, 409

Predictive tasks, 27

Preliminaries in cloud computing, 52

cloud deployment models, 53

service models, 53

Preprocessing, 409

Principal component analysis (PCA), 40

Privacy

policy, 380

privacy-enhancing techniques, 475–476

“privacy-sensitive” tuples, 441

for social media systems, 385–387

Privacy-enhancing symposium (PET), 475

Privacy-preserving

biometric authentication, 476

collaborative data mining, 476

data correlation techniques, 478–479

data management, 407

data matching, 476

record matching problem, 477

Privacy and security aware data management, 440

current systems and limitations, 440–441

future system, 441–442

problem and challenges, 440

Privacy inference engine (PIE), 382–383

Private cloud, 53

PRM, see Processor reserved memory

Probabilistic anomaly detection algorithm (PAD algorithm), 189

Probabilistic theorem proving (PTP), 445

Probability of state, 443

Process control systems (PCS systems), 405

Processor reserved memory (PRM), 462

Program analysis, 421

Programmable logic controllers (PLCs), 405

Programming projects to support lab, 462

proposed architecture, 464

secure data storage and retrieval in cloud, 462

secure encrypted stream data processing, 463–465

systematic performance study of TEE, 462–463

Propositional algorithms, 444

Proprietary protocols, 425

Provenance, 355, 357, 362–363

data, 356–357

integration, 64–65

Provenance controller, 359–360, 398

PS, see Predicate split

Pseudocode

for entity extraction, 293

for information integration, 293

Pseudolikelihood, lifted learning and approximations of, 445

“Pseudopoint”, 132

Psychological score computation, 294

Psychosocial analysis, InXite, 296

PTP, see Probabilistic theorem proving

QD, see Quantized dictionary

QEs, see Query engines

q-nearest neighborhood rule (q-NH rule), 130, 138

q-neighborhood silhouette coefficient (q-NSC), 135–140

q-NH rule, see q-nearest neighborhood rule

q-NSC, see q-neighborhood silhouette coefficient

QS, see Quantified self

Quantified self (QS), 1

movement, 453

Quantized dictionary (QD), 184, 218, 221–224, 237

scalable LZW and QD construction using MR job, 238–244

Query engines (QEs), 307

Query execution and optimization, 323

Query manager, 399

Query modification, 23

algorithm, 24

Query operation, 512

Query optimization, 23–24, 512

Query plan generation, 274–277

Query processing, 437, 512–513

module, 359

system, 264

Query processor, 513

Query transformation, 512

RabbitMQ, 448

Radial-based function (RBF), 209

Radius (R), 133

RAMP, see Reduce and map provenance

Raspberry Pi, 409

Raw outlier, 97

RBAC, see Role-based access control

RBF, see Radial-based function

RDD, see Resilient distributed dataset

RDF-S, see RDF schema

RDF, see Resource description framework

RDFKB, see RDF Knowledge Base

RDF Knowledge Base (RDFKB), 267

RDFQL, see RDF Query Language

RDF Query Language (RDFQL), 385

RDF schema (RDF-S), 59

Real dataset-ASRS, 161

Real dataset-KDD, 161–162

Realistic Data Stream Classifier (ReaSC), 149–151

Real-time

analytics, 436–437

classification, 174

database systems, 522

processing, 393, 516

threat, 43

traveler information systems, 404

Real-time stream analytics, 446

current systems and limitations, 446

future system, 446–448

problem and challenges, 446

Real-world problems, 494

ReaSC, see Realistic Data Stream Classifier

ReaSC, 98, 101, 109, 110, 163, 168, 172

Receiver operating characteristic curves (ROC curves), 163

Recovery, 513

Recursive mining, 190, 191

Redaction manager, 399

Reduce and map provenance (RAMP), 64

Reduce input phase (RI phase), 272

Reduce output phase (RO phase), 272

Refine-Ensemble, 156–157

Relational databases, 264, 508

systems, 496

Relational data models, 507, 508

Relational learning, 456

Relaxed Bestplan problem, 276–277

Research and infrastructure activities in BDMA and BDSP, 454

big data analytics for insider threat detection, 454

binary code analysis, 455

CPS security, 455

infrastructure development, 455

secure cloud computing, 454–455

secure data provenance, 454

TEE, 455

Research challenges, 477–480

Resilient distributed dataset (RDD), 80

Resource description framework (RDF), 3, 15, 57, 58, 263, 290, 308, 364, 373, 438, 487, 488

data manager, 308

Gateway, 385

graphs, 69

integration, 63–64

policy engine, 323–324

processing engines, 326

RDF-3X, 267

RDF-based policy engine, 325, 367

repository architecture, 72–73

security, 62–63

Reverse engineering methods, 417

REWARDS technique, 417, 419

RI phase, see Reduce input phase

Risk-based framework, 417–418

Risk analyzer, 399

Risk models, 479

Robotium (ROBO), 423

ROC curves, see Receiver operating characteristic curves

Role-based access control (RBAC), 15, 18–19, 331, 359, 398, 442

Role hierarchy, 19

RO phase, see Reduce output phase

Routing protocols, 407

Rule-combining algorithms, 335

SaaS, see Software as a Service

SAMOA, 253, 447

Sanitization

task output derivation, 441

tasks, 441

techniques, 477

Satellite AOD data, 446

SCADA systems, see Supervisory control and data acquisition systems

Scalability, 69, 184, 186, 391, 410

big dataset for insider threat detection, 244–245

big data techniques for, 192–193

experimental setup and results, 244

Hadoop cluster, 244

Hadoop MapReduce platform, 237–238

issues, 447

results for big data set relating to insider threat detection, 245–248

scalable analytics for IoT security applications, 408–411

scalable LZW and QD construction using MR job, 238–244

test, 147

Scalable, high-performance, robust and distributed (SHARD), 266, 325

Scalable LZW and QD construction using MR job, 238–244

1MRJ approach, 241–244

2MRJ approach, 238–241

Schema, 509

SciDB, 438–439

multidimensional array data model, 436

Scientific data

privacy and security aware data management, 440–442

storing and retrieving multiple types, 437–440

SDB, see SPARQL database

SDC, see System Development Corporation

SDN, see Software-defined networking

Search space size, 276

Second-order Markov model, 34

Secret sharing-based techniques, 408

Secure big data management and analytics, unified framework for, 392

design of framework, 397–400

global big data security and privacy controller, 400–401

integrity management and data provenance for big data systems, 391–396

Secure cloud computing, 454–455, 461

Secure cyber-physical systems, 461

Secure data

integration framework, 339

provenance, 454

storage and retrieval in cloud, 322, 324–325, 462

Secure encrypted stream data processing, 463–465

SecureMR, 440

Secure multiparty computation (SMC), 476

Secure SPARQL query processing on cloud, 322–323

Security, 516

and IoT, 403–411

labels, 441

and ontologies, 63

query and rules processing, 63

RDF, 62–63

semantic web and, 61

XML, 62

Security and privacy for big data, 459

approach, 459–460

curriculum development, 460–461

experimental program, 461–465

Security applications

data mining for cyber security, 43–47

data mining tools, 47–48

Security extensions, 281

access control model, 282–283

access token assignment, 283–284

conflicts, 284–285

Security policies, 15, 16; see also Insider threat detection

access control policies, 16–19

administration policies, 20

auditing, 21

authentication, 20–21

discretionary security policies, 16

identification, 20–21

views for security, 21

SELinux, 440

Semantic gap, 38

Semantic web-based inference controller for provenance big data

architecture for inference controller, 356–360

big data management and inference control, 367–368

implementing inference controller, 365–367

inference control through query modification, 361–365

Semantic web, 51, 57

cloud computing frameworks based on technologies, 63–65

DL, 59–60

graphical models and rewriting, 361

inferencing, 60–61

OWL, 59

preliminaries in, 52

RDF, 58

and security, 61–63

semantic web-based models, 360–361

semantic web-based security policy engines, 326

SPARQL, 58–59

SWRL, 61

technologies, 52, 263, 360, 396

technology stack for, 57

XML, 58

Semantic Web Rules Language (SWRL), 58, 61, 309, 358–359, 387

Semisupervised classification/prediction, 446–447

Semisupervised clustering

stream classification algorithm, 172

techniques, 109, 131, 149

Sensing infrastructure, 404

Sensor network, 408–409

Sensor signal, 409

Sentiment mining, 297–298

Sequence-based behavior analysis, 416

Sequence data, 217; see also Nonsequence data

anomaly detection, 223–224

choice of ensemble size, 233–235

classification, 217–220

complexity analysis, 224

concept drift in training set, 228–230

dataset, 227–228

experiments and results for, 227

insider threat detection for, 217

NB-INC vs. USSL-GG for various drift values, 231–232

results, 230

stream data, 184, 251

TN, 230–231

USSL, 220–223

Serializability, 513

Server role, 381–382

Service models, 53

SETM algorithm, 35

SGX hardware, 463

SHARD, see Scalable, high-performance, robust and distributed

Signature(s), 47

behavior, 189, 191

database, 342

detection, 339

signature-based malware detectors, 342

Silver Lining, 440

Simple Protocol and RDF Query Language (SPARQL), 58–59, 69, 263, 269, 488

query modification, 364–365

query processor, 312, 325

Single-chunk approach, 171

Single-partition, single-chunk approach (SPC approach), 115, 340, 344

ensemble approach, 116

Single map reduce job approach (1MRJ approach), 238, 241–244

Single model approach, 94

classification, 106–107

incremental approaches, 417

Single pass algorithm, 220

Single source derivation, 441

Singular value decomposition (SVD), 40

Small communication frames, 407

Smart grid, 405–407

Smart home, 405

Smart meters, 408

Smartphones application, 418

classification model, 418

data gathering, 419

data reverse engineering, 419

malware detection, 419

SMC, see Secure multiparty computation

SMM, see System management mode

SNOD, see Stream-based novel class detection

Social factor-based technique, 297

Social graph-based score computation, 295

Social media

authenticity of digital images in, 473

privacy for, 385–387

sites, 291

systems, 27, 379

Social network, 388–389

community, 263

trust for, 387

Soft subspace clustering, 71

Software, 280

Software as a Service (SaaS), 53, 307, 332

Software-defined networking (SDN), 407

SOWT, see Special operations weather specialists

Space complexity, 140–141

Space sensors, 404–405

Spark, 422, 458

emerge, 490

running, 409

SPARQL, see Simple Protocol and RDF Query Language

SPARQL database (SDB), 321

SpatialHadoop, 458

Spatiotemporal Database Systems, 522

SPC approach, see Single-partition, single-chunk approach

Special operations weather specialists (SOWT), 459

Split using explicit type information of object, 269

Spout, 447–448

SQL, see Structured Query Language

SSL/TLS, large scale, automated detection, 421

SSO, see System security officer

Stand-alone systems, 497

Stanford framework, 458

State-of-the-art stream classification techniques, 127, 149, 171

Static analysis, 421

Static GBAD approaches, 190

Static learning, 190

Statistical models, 410

Status, 497

Sticky policies, 478

Storage management, 514

Storage services, 52

Storage virtualization, 54

Storing and retrieving multiple types of scientific data, 437

current systems and limitations, 438–439

future system, 439–440

problem and challenges, 437–438

Storm (data system), 442

Stream, 197

analytics, 171

classification techniques, 150

sequence data, see Infinite sequences

Stream-based novel class detection (SNOD), 289

application, 300

SNOD++, 300

Stream data, 192, 253, 410

classification, see Data stream classification

mining, 181

Stream data analytics, 3, 257

applications for insider threat detection, 3–4

for insider threat applications layer, 6–7

for insider threat detection, 4

layer, 6

Stream mining, 190–192, 457

big data issues, 184

as big data mining problem, 253

contributions, 185–186

GBAD, 183–184

insider threat detection as stream mining problem, 183, 184

sequence stream data, 184

techniques, 207

Strong authorization, 17

Structured Query Language (SQL), 15, 55, 69, 485, 495, 512

extensions for security, 22–23

Subspace clustering, 71–72

Supervised approach, 197

Supervised ensemble classification updating, 200

Supervised learning, 68, 190, 203, 209–212; see also Unsupervised learning

algorithm, 183, 184

approaches, 183, 189, 251

ensemble for, 200–201

Supervised methods, 191

Supervised microclustering technique, 110

Supervised model, 191

Supervised testing algorithm, 200

Supervised/unsupervised learning, 456

Supervisory control and data acquisition systems (SCADA systems), 405

Supporting technologies, 2–3; see also Big data management and analytics (BDMA); Big data security and privacy (BDSP)

layer, 6–7, 499, 500

Support vector machines (SVMs), 27, 31–32, 47, 68, 183, 185, 207–209, 251, 342

Support vectors, 32

SVD, see Singular value decomposition

SVMs, see Support vector machines

SWRL, see Semantic Web Rules Language

Sybase Inc., 495

Symposium on Access Control Models and Technologies, 18

SynC, see Synthetic Data with only Concept Drift

SynCN, see Synthetic Data with Concept Drift and Novel Class

SynD, see Concept-drifting synthetic dataset

SynDE, see Concept-evolving synthetic dataset

Synthetic datasets, 99, 160, 349–350

Synthetic data with concept drift and concept evolution, 99

Synthetic Data with Concept Drift and Novel Class (SynCN), 141

Synthetic Data with only Concept Drift (SynC), 141

Synthetic data with only concept drift, 99

Systematic performance study of TEE, 462–463

System Development Corporation (SDC), 495

System management mode (SMM), 462

System(s)

call, 207, 208

security, 427, 461

services, 52

System R, 15, 16

System security officer (SSO), 20, 511

TABARI software, 458

Tag, 442

TaintDroid, 425

TEE, see Trusted execution environments

Temporary buffer, 129

Text(s)

classification approaches, 189

relationship between, 502–504

Third-party IME, 424

Threat

assessment, 295

data, 403

Three-schema architecture, 510

TIE, see Trust inference engine

Time based access control (TRBAC), 359

Time complexity, 121, 140–141, 160

Timely health indicators, 459, 474

Time role-based access control (TRBAC), 398

TM, see Translation model

TMP36 sensors, 409

TNs, see True negatives

Token, 207

subgraph, 208

Tor (TOR), 407, 408

Toy problems, 494

TPJ, see Triple Pattern Join

TPR, see True positive rate

TPs, see Triple patterns; True positives

Trace Files, 227

Traditional data stream classification techniques, 127, 416

Traditional machine-learning tools, 409

Traditional static supervised method, 183

Traffic flow control, 404

Transactional approach, mitigating data leakage in mobile apps using, 424–425

Transaction management, 513–514

Translation model (TM), 40

Traveler information, 404

TRBAC, see Time based access control; Time role-based access control

Triple Pattern Join (TPJ), 271

Triple patterns (TPs), 264, 271

Triples, 72

True negatives (TNs), 197, 230

True positive rate (TPR), 186, 230

True positives (TPs), 197, 230

“Truncated” UNIX shell commands, 189, 191

Trust, 379, 380

probabilities, 387

for social networks, 387

Trusted execution environments (TEE), 454, 455, 459

systematic performance study, 462–463

Trust inference engine (TIE), 382–383

Trust, privacy, and confidentiality, 379

current successes and potential failures, 380–381

inference engines, 383–384

motivation for framework, 381

TrustZone security, 406

Twitter, 289

Two-class SVM, 209, 211

Two MapReduce jobs (2MRJ), 238

approach, 238–241

Two-phase commit, 513

Type sink, 417

UAV could, 409

UCON, see Usage control

UI, see User interface

Unbounded data stream, 221

Unified framework

design of framework, 397–400

global big data security and privacy controller, 400–401

integrity management and data provenance for big data systems, 391–396

learning framework, 409

for secure big data management and analytics, 392

Uniform resource identifiers (URIs), 58, 74, 269, 318, 331

UNIX shell commands, 189

Unsupervised ensemble classification and updating, 198

Unsupervised K-means, 131–132

clustering, 152

Unsupervised learning, 191, 203, 210, 212–214, 415; see also Supervised learning

algorithm, 183, 184

ensemble for, 199–200

GBAD-MDL, 204

GBAD-MPS, 205

GBAD-P, 204–205

GBAD, 203–204

Unsupervised method, 183

Unsupervised stream-based sequence learning (USSL), 184, 185, 218, 219, 220, 230

constructing LZW Dictionary, 221–222

data chunk, 220–221

USSL-GG algorithms, 230–235

URIs, see Uniform resource identifiers

Usage control (UCON), 19

U.S. Bureau of Labor Statistics (BLS), 1

Use cases, 404–406

User demographics-based, 297

User feedback, 252

User interface (UI), 423–424

manager, 357, 398

User-level applications, 189

U.S. Homeland Security, 67

USSL, see Unsupervised stream-based sequence learning

VA, see Veterans Administration

Vector representation of content (VRC), 70–71

Vertically partitioned layout, 318–319

Very Fast Decision Trees (VFDTs), 106, 340

Veterans Administration (VA), 433, 434

decision support tools, 436

Personal Health Record system, 434

VFDTs, see Very Fast Decision Trees

Victim selection, 220

Video signal, 409

View management, 517

ViewServer, 424

Vigiles, 441

Virtualization, 53–54

Virtual laboratory development, 421

architectural diagram for virtual lab and integration, 422

experimental system, 425–426

input events generation, 424

intelligent fuzzer for automatic Android GUI application testing, 423

interface, 423–424

laboratory setup, 421–422

mitigating data leakage in mobile apps, 424–425

policy engine, 426

problem statement, 423

programming projects to support virtual lab, 423

technical challenges, 425

Virtual machine manager (VMM), 462

Virtual machines (VM), 244

image, 55

monitor, 54

Vision, 497

VM, see Virtual machines

VMM, see Virtual machine manager

VMware, 54

Volume, velocity, variety, veracity, and value (Five Vs), 1

Voting, 409

VRC, see Vector representation of content

WA, see Weighted average

Wang, 122, 123, 124, 125

W3C, see World Wide Web Consortium

WCE, see Weighted classifier ensemble

WCOP, see Web rules, credentials, ontologies, and policies

Weak authorization, 17

Web-based interface, 421

Web Ontology Language (OWL), 58, 59, 263, 309, 355, 364, 487

OWL 2 specification, 400

Web rules, credentials, ontologies, and policies (WCOP), 388

Weighted average (WA), 199

Weighted classifier ensemble (WCE), 142

Weight learning, 443

Weka (machine learning open source package), 83, 122

Whitepages, 366

WHO, see World Health Organization

Wireless communication networks, 404

Wireless sensor networks (WSN), 410

Workgroups, 474

Workshop discussions, 474

BDMA for cyber security, 480–481

examples of privacy-enhancing techniques, 475–476

multiobjective optimization framework for data privacy, 476–477

philosophy for BDSP, 475

research challenges and multidisciplinary approaches, 477–480

workgroups, 474

Workshop presentations

keynote presentations, 473–474

summary, 472–474

World Health Organization (WHO), 433

World Wide Web, 20, 24, 53, 57, 365, 462

World Wide Web Consortium (W3C), 57, 380

Wrapper-based simultaneous feature weighting, 39

WSN, see Wireless sensor networks

XACML, see eXtensible Access Control Markup Language

XEN, 54

XML, see eXtensible Markup Language

XQuery, 23

Yahoo!, 266

Yellowpages, 366

Zero-knowledge proof of knowledge protocols (ZKPK protocols), 476
