A

Access control

content-based, 478

manager, 399

model, 282–283

and policy ontology modeling, 326

system, 408

Access control policies, 15, 16

attribute-based access control, 19

authorization-based access control policies, 16–18

role-based access control, 18–19

usage control, 19

Access Token List (AT-list), 283

Access Tokens (AT), 282

assignment, 283–284

Access Token Tuples (ATT), 282

Accuracy-weighted classifier ensembles (AWEs), 343

Actual data, 357

ADABOOST.PL algorithms, 193

ADCi, see Aggregated dissimilarity count

“Added error”, 117

Administration policies, 20; see also Access control policies

Advanced CPT system, 382–383, 384

Advanced Encryption Standard (AES), 331, 333, 336

Adversaries, 403

adversarial data miner, 400

Aerosol optical depth (AOD), 435

AES, see Advanced Encryption Standard

AFOSR, see Air Force Office of Scientific Research

Aggregated dissimilarity count (ADCi), 153

Aggregate object, 521

Airavat, 440

Aircraft equipment problem, 161

Air Force Office of Scientific Research (AFOSR), 308

Air quality data, 435

AIS, see Assured information sharing

Alchemy 1.0, 443–444

Alchemy 2.0, 443, 444

ALKH12b approach, 193

All technique, 122, 123

Amazon, 332

Amazon S3, 331

integrating blackbook with Amazon S3, 331–335

server, 336

Amazon Web Services (AWS), 83

DynamoDB Accelerator, 83

Android applications, 418

Android ViewServer, 423–424

Android WindowManager, 424

Annotations, 393–396

ANNs, see Artificial neural networks

Anomalies, 189

anomalous data generation process, 228

Anomaly detection, 47, 190, 223–224, 227, 414; see also Graph-based anomaly detection (GBAD)

over IoT network traffic, 411

in social network and author attribution, 252–253

AOD, see Aerosol optical depth

Apache

Accumulo, 440

Apache-distributed file system, 309–310

Apache Pig, 80

Cassandra, 57, 82

CouchDB, 56, 82

Flink, 80

Hadoop, 56, 79

HBase, 56, 82

HDFS, 51

Hive, 56, 79, 81

Kafka, 80

Mahout, 83

Sentry, 440–441

Spark, 80, 409, 439, 447, 458

Storm, 80, 437, 439

APIs, see Application program interfaces

Apple’s official scrutinization process, 418

Application program interfaces (APIs), 81, 324, 415–447

Applications, 302–303

Android, 418

Cyber security, 46

data mining, 74–75

framework relationship, 502

independent integrity constraints, 515

security, 427, 460

of SNOD, 300

App Store, 421

AprioriTid, 35

Arbitrary model, 221

Architectural issues, 509–510

Area under ROC curve (AUC), 163

of ReaSC, 164–165

ARM, see Association rule mining

ARMA process, 409

Artificial drift, 228

Artificial neural networks (ANNs), 27, 28–31

ASRS, see Aviation Safety Reporting Systems

Association rule mining (ARM), 27, 35–37

Association rules, 35

Assured information sharing (AIS), 307

algorithm, 35

prototypes, 324

AT, see Access Tokens

AT-list, see Access Token List

Attack; see also Cyber security (CyS)

BDMA for preventing cyber attacks, 480

collusion, 252

computer, 47

covert channel, 420

on critical infrastructures, 45–46

data, 403

host-based, 47

network-based, 47

types, 142

AT&T Network (ATT), 403

Attribute-based access control, 15, 19

AUC, see Area under ROC curve

Audio signal, 409

Auditing, 21, 46, 517

Authentication, 20–21

Authorization-based access control policies, 16

conflict resolution, 17

consistency and completeness of rules, 18

negative authorization, 17

positive authorization, 17

propagation of authorization rules, 17

special rules, 17–18

strong and weak authorization, 17

Authorization rules propagation, 17

Automatic image annotation, 39–40

Automatic vehicle location techniques, 405

Autonomy, 520

Availability, 47, 56, 79, 83, 193, 379, 380

Aviation Safety Reporting Systems (ASRS), 100, 161–162, 173

AWEs, see Accuracy-weighted classifier ensembles

AWS, see Amazon Web Services

B

Background

check score computation, 295

generator module, 366–367

Backup and recovery, 517

Baseline

approach, 107–110, 142–143

frameworks, 279

methods, 122, 162, 350–351

Basic graph patterns (BGPs), 64, 312

Basic Security Mode auditing program (BSM auditing program), 207

Batch learning

algorithms, 94

techniques, 106

BDMA, see Big data management and analytics

BDSP, see Big data security and privacy

Behavior(al)

analysis, 414

detection mechanisms, 414

detector, 414

signatures, 414

Behavioral feature extraction and analysis, 415

classification model development, 417

evolving data stream classification, 416–417

graph-based behavior analysis, 415–416

sequence-based behavior analysis, 416

BestL technique, 122, 123

Bestplan problem, 275

BGPs, see Basic graph patterns

Big data, 1, 173, 331, 394–396, 470

access control and privacy policy challenges in, 474

big dataset for insider threat detection, 244–245

business intelligence meets, 473

CPT within context of big data and social networks, 388–390

DBD dataset, 246–248

extensions for big data-based social media applications, 326–327

formal methods for preserving privacy while loading, 473

integrity for, 396

and IoT, 403–411

issues, 184

management, 192, 367–368

OD dataset, 245–246

problem, 494

results for big data set relating to insider threat detection, 245

securing big data in cloud, 473

stream mining as big data mining problem, 253

techniques for scalability, 192–193

technologies, 15

Big data analytics

applications, 302–303

baseline methods, 350–351

big data and cloud for malware detection, 340

binary n-grams, 351

datasets, 349–350

design and implementation of system, 344

EMPC, 352

empirical error reduction and time complexity, 345

ensemble construction and updating, 344

error reduction analysis, 344–345

experiments, 349

Hadoop/MapReduce framework, 345–347

for insider threat detection, 454

malicious code detection, 347–349

malware detection, 340–342

and management, 428, 461

modules of InXite, 291–302

premise, 290–291

privacy aware, 473

related work, 303–304, 342–344

security, and privacy, 5

for security applications, 472

and security layer, 9

for social media applications, 290

techniques, 181

Big data management

and cloud for assured information sharing, 308

commercial developments, 326

design of CAISS, 309–311

design of CAISS++, 312–321

design philosophy, 308–309

experiments, 336

extensions for big data-based social media applications, 326–327

formal policy analysis, 321

implementation approach, 321

integrating blackbook with Amazon S3, 331–335

overall related research, 324–326

related research, 322–324

related work, 321

system design, 309

Big data management and analytics (BDMA), 1, 9, 13, 79, 261, 339, 373, 377, 453, 469, 470–471, 483, 485

Apache Cassandra, 82

Apache CouchDB, 82

Apache HBase, 82

Apache Hive, 81

Apache Mahout, 83

cloud platforms, 83–84

curriculum development, 455–457

for cyber security, 480–481

directions for, 490–491

educational program and experimental infrastructure, 454

education and infrastructure program, 455

experimental BDMA systems, 4

experimental program, 457–459

Google BigQuery, 81

Google BigTable, 82

infrastructure tools to host BDMA systems, 79–80

layered framework, 486

MongoDB, 82

NoSQL database, 81

Oracle NoSQL database, 82–83

steps in, 4–5

supporting technologies for, 2–3

systems and tools, 80

technologies, 173

Weka, 83

Big data security and privacy (BDSP), 1, 9, 13, 261, 339, 373, 377, 453, 469, 485

big data analytics for security applications, 472

capstone course on, 461

community building, 472

directions for, 490–491

educational program and experimental infrastructure, 454

experimental BDSP systems, 4

issues in, 469

layered framework, 486

philosophy for, 475

research issues, 470

security and privacy, 471–472

steps in, 4–5

supporting technologies for, 2–3

Big data systems, 379

aspects of integrity, 392–393

big data, 394–396

cloud services, 394–396

data provenance, 391, 393–394

data quality, 393–394

inferencing, 393–394

integrity for big data, 396

integrity management, 391, 394–396

need for integrity, 391–392

BigOWLIM, 267, 279

BigQuery, 81

BigSecret system, 440

Big streaming analytics framework, 409

BigTable, 193

Binary classification, 31

Binary code analysis, 427, 455, 461

Binary signatures, 414

BioMANTA project, 193

Biometrics techniques, 20

BitMat, 267

Bit string, 476

Blackbook integration with Amazon S3, 331–335

BLS, see U.S. Bureau of Labor and Statistics

BOAT, see Bootstrapped optimistic decision tree

Bootstrapped optimistic decision tree (BOAT), 106

Botmaster, 122, 350

Botnet, 122

dataset, 100, 350

Bots, 350

Breaking ties by summary statistics, 277–278

BSM auditing program, see Basic Security Mode auditing program

Buffer (buf), 134

C

CAISS, see Cloud-based information sharing system

CAISS++, see Ideal cloud-based assured information sharing system

Calgary dataset, 227

CapEx, see Capital expenditure

Capital expenditure (CapEx), 332

Capstone BDMA course, 456

Capstone course

on BDSP, 461

on secure mobile computing, 428

Cassandra, Apache, 57, 82, 436, 438

CBIR, see Content-based image retrieval

CCA, see Computer Corporation of America

Centralized architecture, 509

Centralized CAISS++, 313–314

Centroid (µ), 133

Chief Information Officer (CIO), 307

Chi square statistic, 72

Chronic obstructive pulmonary disease (COPD), 435

Chunking, 439

Chunks, 197, 203, 227

“chunk-based” approaches, 417

CIE, see Confidentiality inference engine

CIO, see Chief Information Officer

Classes, 59

Classification, 27, 134

analysis and discussion, 137

deviation between approximate and exacting q-NSC computation, 138–140

high-level algorithm, 133–134

justification of novel class detection algorithm, 137–138

model, 129, 418

with novel class detection, 133, 134–137

problem, 27

techniques, 28, 37

time and space complexity, 140–141

Classifier-based data mining technique, 171

Classify(M,xj,buf) algorithm, 133–134

Class/subclass hierarchy, 521

Client–server approach, 518

Client–server architectures, 493, 496

Cloud-based information sharing system (CAISS), 309–311, 373, 489

Cloud, 409

cloud-based system, 261, 289

cloud-centric policy manager, 308

cloud-design of InXite to handle big data, 301–302

cloud-enabled NoSQL systems, 56

data systems, 379

deployment models, 53

development and security, 428

provider, 55

storage and data management, 54–55

Cloudant, 84

Cloud computing, 51, 173, 237, 263, 307, 331, 332

cloud storage and data management, 54–55

components, 52

framework, 173

frameworks based on semantic web technologies, 63–65

for malware detection, 341

model, 51–52

preliminaries, 52–53

secure, 454–455, 461

technologies, 52

tools, 56–57

virtualization, 53–54

Cloud platforms, 83

Amazon Web Services’ DynamoDB, 83

Google’s cloud-based big data solutions, 84

IBM’s cloud-based big data solutions, 84

Microsoft Azure’s Cosmos DB, 83–84

Cloud query processing system for big data management

approach, 264

architecture, 267–269

cloud computing, 263

contributions, 265

evaluation, 280–281

experimental setup, 264, 279–280

MapReduce framework, 269–278

related work, 265–267

results, 279

security extensions, 281–285

Cloud services, 394–396

for integrity management, 394

models, 54

Clustering, 27, 132

algorithm, 39

cluster-impurity, 109

techniques, 28

CM, see Compression method

CMRJ, see Conflicting MapReduceJoins

CNSIL, see Computer Networks and Security Instructional Lab

Collusion attack, 252

Command sequences (cseq), 244

Communication

data, 410

devices, 405

energy-efficient, 410

small communication frames, 407

wireless communication networks, 404

Community building, 472

Complete elimination, 275

Completely labeled training data, 94

Complexity

analysis, 224

of Bestplan, 276

of inference engine, 365

Compound impurity-measure, 109–110

Compressed/quantized dictionary construction, 251–252

Compression-based techniques, 203

Compression method (CM), 221

Compression/quantization using MR, 243

Computer attacks, 47

Computer Corporation of America (CCA), 495

Computer Networks and Security Instructional Lab (CNSIL), 427

Concept-adapting very fast decision tree learner (CVFDT), 106

Concept-drifting data streams, 127, 171

baseline approach, 142–143

classification with novel class detection, 133–141

datasets, 141–142

datasets and experimental setup, 122

ECSMiner, 127–133

ensemble development, 115

error reduction using MPC training, 116–121

evaluation approach, 143

experiments, 121, 141, 142

MPC, 115–116

performance study, 122–125, 143

results, 143–147

Concept-drifting synthetic dataset (SynD), 161

Concept-evolving synthetic dataset (SynDE), 161, 166

Concept drift, 93–95, 141, 160, 253, 340, 373, 410

issues, 416

in sequence stream, 238

in stream data, 198, 218

SynDE, 161

synthetic data with, 99

in training set, 228–230

Concept evolution, 93, 95–97, 410

synthetic data with, 99

Concept Instantiation, 60

Concept satisfiability, 60

Concept subsumption, 60

Concurrency control, 392, 513

Confidentiality, 379

approach to confidentiality management, 384–385

Confidentiality inference engine (CIE), 382–383

Confidentiality, privacy, and trust (CPT), 379, 380, 483–489

advanced, 382–383, 384

approach to confidentiality management, 384–385

big data systems, 379

within context of big data and social networks, 388–390

framework, 381

integrated system, 387–388, 389

privacy for social media systems, 385–387

process, 382, 383

role of server, 381–382

trust for social networks, 387

trust, privacy, and confidentiality, 379–381, 383–384

Conflicting MapReduceJoins (CMRJ), 271

Conflicts, 284–285

resolution, 17

Consistency and completeness of rules, 18

Constraints, 24

constraint-based approaches, 109

Content-based access control, 478

Content-based image retrieval (CBIR), 38

Content-based score computation, 294

Control processing units, 405

Control systems, 405

Conventional data mining, 476

Conventional relational database management system, 436

COPD, see Chronic obstructive pulmonary disease

CoreNLP, 458

Cost estimation for query processing, 270–274

CouchDB, 56, 438, 490

Covert channel attack in mobile apps, 420

CPS, see Cyber-physical systems

CPT, see Confidentiality, privacy, and trust

Credit card fraud, 45

detection, 46, 95–96

Critical infrastructures, 43

attacks on, 45

cyber-physical, 455, 461

security for, 428

CRM, see Customer relationship management

Cryptographic approaches, 476, 477

Cryptographic commitment, 476

cseq, see Command sequences

Curriculum development, 426, 455–457, 460

capstone course on BDSP, 461

capstone course on secure mobile computing, 428

extensions to existing courses, 426–428, 460–461

integration of study modules with existing courses, 460

“Curse of dimensionality”, 39

Customer relationship management (CRM), 332

Cutset networks, 447

CVFDT, see Concept-adapting very fast decision tree learner

Cyber attacks, BDMA for preventing, 480

Cybercrime datasets, 190

Cyber-defense framework, 403

Cyber-physical systems (CPS), 405, 428

security, 455

Cyber-Provenance Infrastructure for Sensor-based Data-Intensive Research (CY-DIR), 454

Cyber security (CyS), 12, 43, 459

applications, 46

BDMA for, 480–481

cyber security threats, 43–46

data mining for, 43

data mining services for cyber security, 47

Cyber signals, 409

Cyber terrorism, 43–45

CY-DIR, see Cyber-Provenance Infrastructure for Sensor-based Data-Intensive Research

CyS, see Cyber security

D

DaaS, see Data as a Service

DAC, see Discretionary access control

DAG, see Directed acyclic graph

Data; see also Information; Security

accuracy, 393

acquisition, 419, 479

and applications security, 427, 460

authenticity, 393

classification methods, 447

collection, 363

completeness, 393

confidentiality, 478

controller, 359, 397

currency, 393, 395

gathering, 419

generation and storage, 267–268

lifecycle framework, 479

ownership, 479

points, 217

provenance, 393–394

publication, 479

quality, 393–394

quality policy, 395

recovery, 393

reduction techniques, 471

reverse engineering of Smartphone applications, 419

sanitization approaches, 476

science, 1, 453

services, 52

sharing, 408, 479

sources, 436

storage, 73–74

virtualization, 54

warehousing, 523

Data analytics, 436; see also Big data analytics

system, 408

techniques, 471

Data as a Service (DaaS), 53

Database

administration, 511–512

design process, 511

functions, 23–24

integrity, 515–516

virtualization, 54

Database administrator (DBA), 20, 511

Database management systems (DBMS), 507, 522–523

architectural issues, 509–510

autonomy, 520

centralized architecture, 509

database administration, 511–512

database design, 510–511

distributed databases, 517–518

entity-relationship data models, 507, 508–509

extensible, 511

functional architecture, 510

functions, 512–517

heterogeneous and federated data management, 518–520

object data model, 520–522

relational data models, 507, 508

three-schema architecture, 510

Database system, 21

developments in, 494–497

technology, 470

Data management systems, 407, 471, 493

building information systems from framework, 500–502

comprehensive view, 496

developments in database systems, 494–497

framework, 498–500

relationship between texts, 502–504

status, vision, and issues, 497

3D view, 499

Data mining, 3, 27, 43, 409

algorithms, 471

answering queries using Hadoop MapReduce, 74

applications, 44, 46, 74–75

ARM, 35–37

artificial neural networks, 28–31

challenges, related work, and approach, 68–69

for cyber security, 43

cyber security threats, 43–46

data mining-based malware detectors, 342

data mining applications, 74–75

data mining services for cyber security, 47

data storage, 73–74

feature extraction and compact representation, 70–72

image mining, 38–40

and insider threat detection, 68

for insider threat detection, 69

Markov model, 32–35

multiclass problem, 37–38

outcomes, 27

RDF repository architecture, 72–73

solution architecture, 69–70

support vector machines, 31–32

tasks, 27–28

techniques, 27–28

techniques, 4, 67

tools, 47–48

Data-oblivious learning mechanisms, 465

Data-obliviousness, 464

Data privacy, 16, 24–25

multiobjective optimization framework for, 476–477

Data security, 16

policy enforcement and related issues, 21–24

security impact on database functions, 25

security policies, 16–21

Dataset(s), 160–162, 279, 349–350

nonsequence data, 207–209

sequence data, 227–228

Data-sharing policies, 408

Data stream classification, 93, 149, 172

approach to data stream classification, 105–106

baseline approach, 107–109

challenges, 93–94

comparison with baseline methods, 163–165

concept drift, 94–95

concept evolution, 95–97

dataset, 160–162

directions in, 171–175

ensemble classification, 156–160

ensemble classification, 107–108

experiments, 99–100, 160, 162–163

extensions, 172–175

infinite length, 94–95

with limited labeled data, 109–110

limited labeled data, 98–99

malware detection, 340–341

MPC ensemble approach, 171–172

network intrusion detection using, 106

novel class detection, 108

and novel class detection in data streams, 172

novelty detection, 108

outlier detection, 108–109

problems and proposed solutions, 94

ReaSC, 149–151

running times, scalability, and memory requirement, 165–166

with scarcely labeled data, 172

sensitivity to parameters, 166–168

single-model classification, 106–107

task, 127

training with limited labeled data, 152–156

Data streams, 3, 93, 127, 173, 410, 446, 457, 463

classification and novel class detection in, 172

classifiers, 417

constructing LZW Dictionary by selecting patterns, 221–222

DBA, see Database administrator

DBD, see Duplicate big data

DBMS, see Database management systems

DCS, see Distributed control systems

DDBMS, see Distributed database management system

DDTS, see Distributed Database Testbed System

Decentralized CAISS++, 314–315, 316

Decision trees, 27, 130–131

Deductive database systems, see Next-generation database systems

Deep learning, 494

Demand management, 404

Demographics-based score computation, 294

Department of Defense (DoD), 307

Description length (DL), 204, 415, 416

Description logics (DL), 59–60, 310, 358

Descriptive tasks, 27

Detecting anomalies, 27

DetectNovelClass, 135, 136

DGSOT, 47

Dictionary construction and compression using single MR, 243–244

Digital Equipment Corporation, 495

Digital forensics, 427

BDMA for, 480

DIM, see Distributed Integrity Manager

Directed acyclic graph (DAG), 362

Discretionary access control (DAC), 323–324

Discretionary security, 23–24

policies, 15

Dissimilarity count, 153

Distance-based techniques, 109

Distributed control systems (DCS), 405

Distributed database management system (DDBMS), 517–518

Distributed database systems, 264

Distributed Database Testbed System (DDTS), 495

Distributed feature extraction and selection, 348–349

Distributed Integrity Manager (DIM), 517

Distributed Metadata Manager (DMM), 517

Distributed processing of SPARQL, 319–320

Distributed processor (DP), 517, 518

Distributed Query Processor (DQP), 517, 518

Distributed reasoners (DRs), 312–313

Distributed reasoning, 325–326

Distributed Security Manager (DSP), 517

Distributed system, 264

Distributed Transaction Manager (DTM), 517

Diverse computing systems, 403

DL, see Description length; Description logics

DLL, see Dynamic-Link Library

DMM, see Distributed Metadata Manager

DoD, see Department of Defense

Domains, 362363

DP, see Distributed processor

DQP, see Distributed Query Processor

DroidDream, 413

DRs, see Distributed reasoners

DSP, see Distributed Security Manager

DTM, see Distributed Transaction Manager

Duplicate big data (DBD), 245

dataset, 246248

Dynamic-Link Library (DLL), 342

Dynamic analysis, 421

Dynamic chunk size, 173

Dynamic feature vector, 173

Dynamo, 193

E

E-count(v), 275

E-M technique, see Expectation-maximization technique

Early elimination heuristic, 277

EC, see Explicit content

ECSMiner, see Enhanced Classifier for Data Streams with novel class Miner

Efficiency, 391

Electronic patient record (EPR), 355–356

ElephantSQL, 84

Embedded systems, 405

Emergency room (ER), 435

EMPC, see Extended, multipartition, multichunk

Empirical error reduction and time complexity, 345

Encapsulation, 521

Enclave Page Cache (EPC), 462

Encoded sensing (ES), 410

Energy-efficient communication, 410

Enhanced Classifier for Data Streams with novel class Miner (ECSMiner), 95, 96, 100–101, 108, 109, 127, 129, 141, 142, 172

base learners, 131–132

creating decision boundary during training, 132–133

high level algorithm, 128–129

nearest neighborhood rule, 129–130

novel class and properties, 130–131

Enhanced policy engine, 310

Enhanced SPARQL query processor, 310

Ensemble

approach, 94, 209

construction and updating, 344

learning, 197–199, 218

refinement, 150–151, 156–160

size, 173

for supervised learning, 200–201

techniques, 417

training process, 160

for unsupervised learning, 199–200

update, 151, 160, 222–223

Ensemble-based insider threat detection, 197

ensemble for supervised learning, 200–201

ensemble for unsupervised learning, 199–200

ensemble learning, 197–199

Ensemble-based learning, 183

algorithms, 203

approach, 190

Ensemble-based stream mining, 76

Ensemble-based techniques, 177, 207

Ensemble-based USSL, 220

Ensemble classification, 107–108, 156

classification overview, 156

ensemble refinement, 156–160

ensemble update, 160

time complexity, 160

Entity

entity-relationship data models, 507, 508–509

extraction, 292–293

Entropy, 132

EPC, see Enclave Page Cache

EPR, see Electronic patient record

ER, see Emergency room

Erlang, 82

Error rates (ERR), 143, 145, 146

Error reduction

analysis, 344–345

using MPC training, 116

time complexity of MPC, 121

ES, see Encoded sensing

ETL, see Extract-transfer-load

Evaluation approach, 143

Evolved class, 156–159

Expectation-maximization technique (E-M technique), 109, 131, 150

optimizing objective function with, 154–155

Experimental activities, 419

covert channel attack in mobile apps, 420

large-scale, automated detection of SSL/TLS, 421

location spoofing detection in mobile apps, 420

Experimental program, 457, 461

association between big data management and case studies, 457

coding for political event data, 458

geospatial data processing on GDELT, 458

laboratory setup, 461–462

programming projects to support lab, 462–465

timely health indicator, 459

Experimental system, 425–426

layer, 9

Expert systems support, 300–301

Explicit content (EC), 70

Explicit type information of object, split using, 269

Extended, multipartition, multichunk (EMPC), 344, 352, 489

Extended relational database systems, 521

eXtensible Access Control Markup Language (XACML), 307, 440

eXtensible Markup Language (XML), 15, 57, 58, 485

layer, 58

schemas, 61

security, 62

Extensions for big data-based social media applications, 326–327

Extensions to existing courses, 426, 460–461

big data analytics and management, 428

Critical Infrastructure Security, 428

data and applications security, 427

developing and securing cloud, 428

digital forensics, 427

integration of study modules with existing courses, 426

language-based security, 428

network security, 427

systems security and binary code analysis, 427

External attacks, 43–44

External threat detection, 189, 190

Extract-transfer-load (ETL), 56

F

Fading factor, 199

False detection, 197

False negatives (FN), 190, 197, 212, 230, 251

False positive rates (FPR), 183, 230

False positives (FP), 186, 190, 197, 212, 230, 251

Farthest-first traversal heuristic, 155

Fast classification model, 174

Fault

detection, 95

fault-tolerant computing, 24

tolerance, 393, 516

FDP, see Federated data processor

Feature extraction, 341, 347

Feature selection, 341, 347

Feature weighting, 175

Federated data management, 518–520

Federated data processor (FDP), 519

Field actuation mechanisms, 404

File organization, 73, 268

predicate object split, 74

predicate split, 73–74

Filtered outlier (F outliers), 97, 134–135

Firewalls, 407

First-order logic formulas and inference, 443

First-order Markov model, 34

Five Vs, see Volume, velocity, variety, veracity, and value

FN, see False negatives

Forecasting, 409

Forest cover dataset, 100

from UCI repository, 142

Formal policy analysis, 321, 324

Forming associations, 27

Foursquare, 289

F outliers, see Filtered outlier

FP, see False positives

FPR, see False positive rates

F pseudopoints, 135–136

Framework design, 437

mixed continuous and discrete domains, 444–446

offline scalable statistical analytics, 442–444

privacy and security aware data management for scientific data, 440–442

real-time stream analytics, 446–448

storing and retrieving multiple types of scientific data, 437–440

Framework integration, 320

Frequency, 221

Frequent itemset graph, 36, 37

“Friends-smokers” social network domain, 443, 444

Functional architecture, 510

Functional database systems, 522–523

Functionality, 415

Future system, 439–442, 444, 446

online structure learning methods for stream classification, 447–448

semisupervised classification/prediction, 446–447

G

Gaussian distribution, 141, 163, 204

GBAD, see Graph-based anomaly detection

GDELT, see Global Database of Event, Language, and Tone

Generating and populating knowledge base, 366

Generic problems, 456

Genetic algorithms, 109

Geospatial data processing on GDELT, 458

GFS, see Google File System

Gibbs sampling, 444

Gini index, 132

Global big data security and privacy controller, 400–401

Global data-mining models, 408

Global Database of Event, Language, and Tone (GDELT), 458

geospatial data processing on, 458

Google, 266

BigQuery, 79, 81

BigTable, 82

Calendar, 405

cloud-based big data solutions, 84

Compute Engine, 409

Google+, 289

Monkey tool, 423

Google File System (GFS), 82, 193, 438

GPS-equipped vehicles techniques, 405

Graph

analysis, 70

graph-based behavior analysis, 415–416

mining techniques, 69

rewriting, 361

transformation, 361

Graph-based anomaly detection (GBAD), 183–184, 190, 197, 203–204, 251; see also Anomaly detection

GBAD-MDL, 204

GBAD-MPS, 205

GBAD-P, 204–205

models, 488

Graphical models and rewriting, 361

Graphical user interface (GUI), 421

GREE88 dataset, 227

Ground truth, 198, 199, 220

Guest machine, 54

Guests, 54

GUI, see Graphical user interface

H

Hadoop, 193, 265, 463, 488

cluster, 244

distributed system setup, 351

storage architecture, 312, 318, 325

Hadoop distributed file system (HDFS), 51, 70, 79, 173, 174, 184, 237, 265, 312, 322

Hadoop/MapReduce, 438

framework, 181, 345–347

platform, 237–238, 490

technologies, 373

HAN, see Home area network

HAQU13a approach, 193

HAQU13b approach, 193

Hard subspace clustering, 71

Hardware, 279, 339

hardware-assisted security, 406

hardware-level security, 406

services, 52

virtualization, 54

Hardware security modules (HSMs), 406

HBase, 56, 436, 438, 490

HDFS, see Hadoop distributed file system

HDP, see Heterogeneous data processor

Healthcare, 1

architecture of methodologies, 437

for big data analytics and security, 433

framework design, 437–448

methodologies, 436–437

motivation, 433–436

Health Insurance Portability and Accountability Act (HIPAA), 356

Heart rate monitor, 407

Heterogeneity, 410

issue, 69

Heterogeneous components, 403

Heterogeneous data(base)

interoperability, 501

management, 518–520

systems, 496

types, 517

Heterogeneous data processor (HDP), 518–519

Heterogeneous IoT environment, 409

Heuristic model, 273–274

Hewlett Packard Company, 495

Open Cirrus Testbed, 51

Hexastore, 64, 267

Hijacked kernel function pointers, 455

HIPAA, see Health Insurance Portability and Accountability Act

Hive, 56, 79, 81, 438

Hive-based assured cloud query processing, 322

HiveQL, 81

HMLNs, see Hybrid MLNs

Home area network (HAN), 405

Homomorphic encryption schemes, 463

Host-based attacks, 47

Host BDMA systems, infrastructure tools to, 79–80

Host machine, 54

HSMs, see Hardware security modules

HTML, see Hypertext Markup Language

Hybrid CAISS++, 315–318

Hybrid cloud, 53

Hybrid high-order Markov chain models, 189

Hybrid layout, 319

Hybrid MLNs (HMLNs), 444

Hyperplane technique, 161

Hypertext Markup Language (HTML), 263

Hypervisor, see Virtual machine monitor

I

IaaS, see Infrastructure as a Service

IARPA, see Intelligence Advanced Research Project Activity

IBM

cloud-based big data solutions, 84

System R, 494

IBM, see International Business Machine Corporation

ICD, see International Classification of Diseases

ICDE, see International Conference on Data Engineering

ICE, see Immigration and Customs Enforcement

Ideal cloud-based assured information sharing system (CAISS++), 309, 312, 489

centralized, 313–314

decentralized, 314–315, 316

framework integration, 320

hybrid, 315–318

hybrid layout, 319

limitations, 312

naming conventions, 318

policy specification and enforcement, 320–321

Ideal model, 271–273

Identity

management, 51

theft, 45

IDS, see Intrusion detection systems

IG, see Information gain

Image mining, 38

automatic image annotation, 39–40

feature selection, 39

goal, 39

image classification, 40

IME, see Input method editor

IME/Update app, 425

Immigration and Customs Enforcement (ICE), 424

Implicit type information of object, split using, 269

Impurity measurement, 153

IMS, see Information management system

In-line reference monitor (IRM), 76

INAN12, 477

Incident management, 404

Incremental learning, 106, 183, 190, 191, 218, 219

Incremental probabilistic action modeling (IPAM), 191

Index, 208

Inference, 355

tools, 360, 400

web, 365

Inference control, 367368

approach, 361–362

domains and provenance, 362–363

inference controller with two users, 363–364

through query modification, 361

SPARQL query modification, 364–365

Inference controller, 355, 360, 365, 400

approach, 365

architecture for, 356–360

background generator module, 366–367

generating and populating knowledge base, 366

implementation of medical domain, 365–366

with two users, 363–364

Inference engine, 359, 399

complexity, 365

Inferencing, 60–61, 393–394

Infinite length, 93–95, 340, 410

Infinite sequences, 217

Information; see also Data

integration, 292, 293

sharing manager, 399

systems from data management systems framework, 500–502

Information engine, 291

entity extraction, 292–293

information integration, 293

Information gain (IG), 71

Information management system (IMS), 81

Information Resource Dictionary System (IRDS), 495

Information technology (IT), 339, 405

Informix Corporation, 495

Infrastructure as a Service (IaaS), 53, 332

Infrastructure development, 421, 455

curriculum development, 426–428

virtual laboratory development, 421–426

INGRES, 15, 16, 494, 495

project at University of California at Berkeley, 23

Input events generation, 424

Input files selection, 270

Input method editor (IME), 424

Insider threat detection, 51, 67–68, 189–191, 209, 251; see also Malware detection; Security policies

additional experiments, 252

anomaly detection in social network and author attribution, 252–253

big data analytics for, 454

big data issues, 184

challenges, related work, and approach, 68–69

collusion attack, 252

comprehensive framework, 75–76

contributions, 185–186

data mining, 68, 69, 74–75

data storage, 73–74

feature extraction and compact representation, 70–72

GBAD, 183–184

incorporate user feedback, 252

RDF repository architecture, 72–73

for sequence data, 217–224

sequence stream data, 184

solution architecture, 69–70

stream data analytics applications for, 3–4

stream mining as big data mining problem, 253

as stream mining problem, 183, 184

SVMs, 251

Insider threats, 43–44, 67, 197, 203

analysis, 46

Instrumental behavior analysis, 415

Integrated system, 387–388, 389

Integration framework, 310–311

Integrity, 380, 391–392

aspects, 392–393

for big data, 396

constraints, 24, 393, 395

of data, 380

management, 394–396

Intellidimension RDF Gateway, 385

Intelligence Advanced Research Project Activity (IARPA), 331

Intelligent fuzzer for automatic Android GUI application testing, 423

Intelligent transportation systems, 404

Intel SGX, 463, 465

Intel SGX-enabled machine, 461

SDK and SGX driver, 462

Interface manager, 358

International Business Machine Corporation (IBM), 494

International Classification of Diseases (ICD), 439

International Conference on Data Engineering (ICDE), 472

Internet of Things (IoT), 2, 377, 403–404, 433, 485

data protection, 407–408

layered framework for securing, 406–407

scalable analytics for IoT security applications, 408–411

use cases, 404–406

Interoperability, 57, 391

of heterogeneous database systems, 518

Interuser parallelization, 244

Intrusion, 46, 47

detection, 189, 407

Intrusion detection systems (IDS), 27, 414

InXite, 290, 291

application of SNOD, 300

cloud-based system, 289

cloud-design of InXite to handle big data, 301–302

expert systems support, 300–301

implementation, 302

information engine, 291–293

InXite-Law, 302

InXite-Marketing, 302

InXite-Security, 302

plug-and-play approach, 291

threat detection and prediction, 298–300

InXite POI

analysis, 293–298

profile generation and analysis, 293–294

threat analysis, 294–296

IoT, see Internet of Things

IPAM, see Incremental probabilistic action modeling

IRDS, see Information Resource Dictionary System

IRM, see In-line reference monitor

IT, see Information technology

Iterative conditional mode algorithm (ICM algorithm), 155

J

Jena (Java application programming package), 266, 385

Job JB, 271

JobTracker, 79

Joining variable, 275

K

Kafka, 448

KDD cup 1999 intrusion detection dataset (KDD99), 100, 141–142, 160–161

KEND98 dataset, 207

Keynote presentations, 473

access control and privacy policy challenges in big data, 474

additional presentations, 474

authenticity of digital images in social media, 473

big data analytics, 473

business intelligence meets big data, 473

final thoughts, 474

formal methods for preserving privacy while loading big data, 473

privacy in world of mobile devices, 474

securing big data in cloud, 473

timely health indicators using remote sensing and innovation for validity of environment, 474

toward privacy aware big data analytics, 473

K-means clustering, 28

K-means clustering with cluster-impurity minimization (MCI-K means), 152–154

K models, 209

k-nearest neighbor algorithm (KNN algorithm), 40, 149, 342

classification model, 131

k-NN-based approach, 108

KNN algorithm, see k-nearest neighbor algorithm

Knowledge base, 282

Knowledge representation (KR), 59

L

Labeled data, 149, 211

K-means clustering with cluster-impurity minimization, 152–154

optimizing objective function with E-M, 154–155

problem description, 152

storing classification model, 155–156

training with limited, 152

unsupervised K-means clustering, 152

Labeled points, 155

Laboratory setup, 461–462

Language-based security, 428

Large scale, automated detection of SSL/TLS, 421

Last technique, 122, 123

Layered framework for secure IoT, 406–407

Layered security framework, 403

LBAC, see Location based access control

Learning classes

supervised learning, 203

unsupervised learning, 203–205

Learning models, 183

Lehigh University Benchmark (LUBM), 314

Lempel–Ziv–Welch algorithm (LZW algorithm), 220, 224, 237

constructing LZW Dictionary by selecting patterns, 221–222

dictionary construction using MR, 241–242

scalable LZW and QD construction using MR job, 238–244

Leveraging randomized response-based differential-privacy technique, 408

LIBSVM, 209

Lifted learning and approximations of pseudolikelihood, 445

Lightweight IP-based network stacks, 407

Lincoln Laboratory Intrusion Detection dataset, 207, 210–211

“Lineage”, 394

Link analysis, 28

LinkedIn, 289

L-model, 158

Location based access control (LBAC), 359, 398

Location spoofing detecting in mobile apps, 420

Logic database systems, see Next-generation database systems

LOGITBOOST.PL algorithms, 193

Loop detectors, 404

Lossy compression process, 221

6LoWPAN, 407

LUBM, see Lehigh University Benchmark

LZW algorithm, see Lempel–Ziv–Welch algorithm

M

Machine learning, 409

algorithms, 83

techniques, 410, 417

Mahout, 193

Major mechanical problem, 98

Malicious applications, 418

Malicious code detection, 347

distributed feature extraction and selection, 348–349

nondistributed feature extraction and selection, 347–348

Malicious insiders, 3

Malicious intrusions, 45

Malware, 339, 347

behavior modeling, 415

dataset, 350

Malware detection, 46, 95, 340–342, 414–419; see also Insider threat detection

application to Smartphones, 418–419

behavioral feature extraction and analysis, 415–417

challenges, 414–415

cloud computing for, 341

contributions, 341–342

as data stream classification problem, 340–341

experimental activities, 419–421

infrastructure development, 421–426

reverse engineering methods, 417

risk-based framework, 417–418

in Smartphones, big data analytics for, 413, 414

Mandatory security policies, 15

Manual labeling of data, 149

Map input phase (MI phase), 272

Map keys (MKey), 346

Map output phase (MO phase), 272

Mappings, 509

MapReduce framework (MR framework), 51, 56, 70, 79, 184, 193, 237, 265–266, 269, 348, 428, 438, 456

breaking ties by summary statistics, 277–278

compression/quantization, 243

cost estimation for query processing, 270–274

input files selection, 270

join execution, 278–279

LZW dictionary construction, 241–242

paradigm, 458

processes, 265

query plan generation, 274–277

scalable LZW and QD construction, 238–244

technology, 193

MapReduceJoin (MRJ), 271

Map values (MVal), 346

Markov logic, 442

Markov logic networks (MLNs), 443

Markov model, 27, 32–35

Markov network, 443

Masquerade detection, 189, 190, 191

Massive data problem, 493–494

Maximum likelihood tree, 447

MaxWalksat, 444

MCI-K means, see K-means clustering with cluster-impurity minimization

MDL approach, see Minimum description length approach

Mean distance (μd), 133

Medical domain implementation, 365–366

Mermaid, 495

Metadata, 391

controller, 398

management, 514–515

Meteorological data, 446

Mica2 nodes running TinyDB applications, 410

Microcluster, 99, 132, 149

Microlevel location mining, 296

Microsoft Azure’s Cosmos DB, 83–84

Minimum cost plan generation problem, 275

Minimum description length approach (MDL approach), 69, 190, 204

Minimum support (minsup), 35

Minor mechanical problem, 98

Minor weather problem, 98

minsup., see Minimum support

MI phase, see Map input phase

Misapprehension, 197

Misuse detection, 47, 414

Mixed continuous and discrete domains, 444

approximate compilation for online inference knowledge, 445–446

lifted learning and approximations of pseudolikelihood, 445

MKey, see Map keys

MLNs, see Markov logic networks

Mobile devices, privacy in world of, 474

Mobile interfaces, 428

Mobile OS, 420

Mobile sensors, 405

Model update, 416

Modern transportation algorithms, 404

MongoDB, 56, 79, 82, 438, 458

MO phase, see Map output phase

Motivation, 433

air quality data, 435

need for case study, 435–436

problem, 433–435

system architecture, 434

MPC, see Multipartition and multichunk; Multiple partition and multiple chunk

MQTT, 410

MR framework, see MapReduce framework

MRJ, see MapReduceJoin

1MRJ approach, see Single map reduce job approach

2MRJ, see Two MapReduce jobs

Multichunk ensemble approach, 343

Multiclass novelty detection technique, 108

Multiclass problem, 37–38

Multidisciplinary approaches, 477–480

Multidisciplinary University Research Initiative (MURI), 324

Multilabel classification problem, 173

Multilabel instances, 173

Multimedia database systems, 522

Multimedia data management for collaboration, 500

Multiobjective optimization framework for data privacy, 476–477

Multipartition and multichunk (MPC), 94

Multiple partition and multiple chunk (MPC), 91, 115, 122, 123, 125, 171, 177

ensemble approach, 100, 107, 116, 171–172, 487

ensemble built on, 115

ensemble updating algorithm, 115–116

error reduction using MPC training, 116–121

Multiple shards in cluster, 83

Multiple video signals, 409

Multisource derivation, 442

Multistep Markovian model, 189

MURI, see Multidisciplinary University Research Initiative

“Muslim-brotherhood”, 290

Mutual information, 447

MVal, see Map values

MyHealtheVet Decision Support Tool, 434–435

N

Naïve Bayes (NB), 342

classification, 299–300

classifier, 47, 230

NB-INC, 230–232

Naming conventions, 318

National Institute of Standards and Technology (NIST), 52

National Science Foundation (NSF), 4, 469

SATC funded project CNS-1228198, 440

SATC funded project CNS-1237235, 440

National Security Agency (NSA), 290, 307

Natural language processing (NLP), 295, 455–456

NB, see Naïve Bayes

NCMRJ, see Nonconflicting MapReduceJoins

Nearest neighbor classification (NN classification), 150

Nearest neighborhood rule, 129–130

Negative authorization, 17

Network

intrusion detection, 95

network-based attacks, 47

security, 406, 407, 427

types, 403

Networking and Information Technology Research and Development (NITRD), 469

Next-generation database systems, 495–496, 522

Neyman–Pearson theory, 409

n-gram, 191, 347

NIST, see National Institute of Standards and Technology

NITRD, see Networking and Information Technology Research and Development

NLP, see Natural language processing

NN classification, see Nearest neighbor classification

Noise, 189

Non-SQL (NoSQL), 81

databases, 81, 368

system, 428, 437, 456

Nonconflicting MapReduceJoins (NCMRJ), 271

Nondistributed feature extraction and selection, 347–348

Nonrelational high performance database, 81

Nonsequence data, 207; see also Sequence data

dataset, 207–209

experimental setup, 209

results, 210

stream data, 251

supervised learning, 209–210, 210–212

unsupervised learning, 210, 212–214

Normative patterns, 220

Normative substructures, 197, 204

NoSQL, see Non-SQL

Novel class and properties, 130–131

Novel class detection, 3, 27, 51, 96, 134–137

analysis and discussion, 137

classification with, 133, 134

in data streams, 172

deviation between approximate and exact q-NSC computation, 138–140

high-level algorithm, 133–134

justification of algorithm, 137–138

time and space complexity, 140–141

Novel success control models, 407

Novelty detection, 108

Novice programmer, 183

NSA, see National Security Agency

NSF, see National Science Foundation

N-Triples, 72

Number of hops concept, 35

O

Object data model, 520–522

class/subclass hierarchy, 521

object-relational data model, 521–522

objects and classes, 520

Objects and classes, 520

OCSVM, see One-class support vector machine

OD, see Original data

Offline scalable statistical analytics, 442

current systems and limitations, 443–444

future system, 444

problem and challenges, 442–443

OLAP models, see On-line analytical processing models

OLINDDA model, 142, 147

On-line analytical processing models (OLAP models), 523

On Demand Stream approach (OnDS approach), 162–163

One-class classifiers, 108

One-class support vector machine (OCSVM), 183, 191, 197, 200, 203, 207, 209

algorithm, 190

OCSVM models, 488

One-pass learning paradigm, 94, 416

One time password (OTP), 331

One-VS-all approach, 38

One-VS-one approach, 37–38

Onion routing techniques, 407

Online inference knowledge, approximate compilation for, 445–446

Online reputation-based score computation, 295

Online structure learning methods for stream classification, 447–448

Ontologies, 487

security and, 63

Open provenance model (OPM), 361

Operating systems (OS), 53, 403, 419

level virtualization, 54

Operational expenditure (OpEx), 332

OpEx, see Operational expenditure

OPM, see Open provenance model

Optimizing objective function with E-M, 154–155

Oracle Corporation, 495

Oracle NoSQL database, 82–83

Original data (OD), 245

dataset, 245–246

OS, see Operating systems

OTP, see One time password

Outlier detection, 108–109

OWL, see Web Ontology Language

P

PaaS, see Platform as a Service

PAD algorithm, see Probabilistic anomaly detection algorithm

PANG04 techniques, 129

Parallel boosting algorithms, 193

Parallel database systems, 522

Parameter

reduction, 174

sensitivity, 146

Partial elimination, 275

Partially labeled data, 94

Particulate matter (PM), 433

Partitioner, 237

PARV12a approach, 192

PCA, see Principal component analysis

PCS systems, see Process control systems

PDP, see Policy Decision Point

Pedigree, 394

Peer effect, 303

Peer-to-peer (P2P), 100, 122, 350

Pellet, 400–401

PEP, see Policy Enforcement Point

Perceptron, 28–29

Person of interest (POI), 293

analysis, 293

InXite POI profile generation and analysis, 293–294

InXite POI threat analysis, 294–296

InXite psychosocial analysis, 296

sentiment mining, 297–298

PEs, see Portable Executables

PET, see Privacy-enhancing symposium

PETRARCH, 458

Physical system stream data, 409

PIE, see Privacy inference engine

Pig Latin, 80

Pig query language, 438

Platform as a Service (PaaS), 53, 332

Platform for Privacy Preferences (P3P), 380

PLCs, see Programmable logic controllers

Plug-and-play approach, 291

PM, see Particulate matter

PM2.5 observations, 435, 446

POI, see Person of interest

Point sensors, 404–405

Policy Decision Point (PDP), 334

Policy enforcement and related issues, 21

discretionary security and database functions, 23–24

policy specification, 23

query modification, 23

SQL extensions for security, 22–23

Policy Enforcement Point (PEP), 334

Policy engine, 312, 426

Policy manager, 357–358, 360, 398–399

Policy specification and enforcement, 320–321

Political event data, coding for, 458

Portable Executables (PEs), 350

POS, see Predicate Object Split

Positive authorization, 17

P2P, see Peer-to-peer

P3P, see Platform for Privacy Preferences

Predicate Object Split (POS), 267

Predicate object split, 74

Predicate split (PS), 73–74, 267, 269

Prediction, 409

Predictive tasks, 27

Preliminaries in cloud computing, 52

cloud deployment models, 53

service models, 53

Preprocessing, 409

Principal component analysis (PCA), 40

Privacy

policy, 380

privacy-enhancing techniques, 475–476

“privacy-sensitive” tuples, 441

for social media systems, 385–387

Privacy-enhancing symposium (PET), 475

Privacy-preserving

biometric authentication, 476

collaborative data mining, 476

data correlation techniques, 478–479

data management, 407

data matching, 476

record matching problem, 477

Privacy and security aware data management, 440

current systems and limitations, 440–441

future system, 441–442

problem and challenges, 440

Privacy inference engine (PIE), 382–383

Private cloud, 53

PRM, see Processor reserved memory

Probabilistic anomaly detection algorithm (PAD algorithm), 189

Probabilistic theorem proving (PTP), 445

Probability of state, 443

Process control systems (PCS systems), 405

Processor reserved memory (PRM), 462

Program analysis, 421

Programmable logic controllers (PLCs), 405

Programming projects to supporting lab, 462

proposed architecture, 464

secure data storage and retrieval in cloud, 462

secure encrypted stream data processing, 463–465

systematic performance study of TEE, 462–463

Propositional algorithms, 444

Proprietary protocols, 425

Provenance, 355, 357, 362–363

data, 356–357

integration, 64–65

Provenance controller, 359–360, 398

PS, see Predicate split

Pseudocode

for entity extraction, 293

for information integration, 293

Pseudolikelihood, lifted learning and approximations of, 445

“Pseudopoint”, 132

Psychological score computation, 294

Psychosocial analysis, InXite, 296

PTP, see Probabilistic theorem proving

Q

QD, see Quantized dictionary

QEs, see Query engines

q-nearest neighborhood rule (q-NH rule), 130, 138

q-neighborhood silhouette coefficient (q-NSC), 135–140

q-NH rule, see q-nearest neighborhood rule

q-NSC, see q-neighborhood silhouette coefficient

QS, see Quantified self

Quantified self (QS), 1

movement, 453

Quantized dictionary (QD), 184, 218, 221–224, 237

scalable LZW and QD construction using MR job, 238–244

Query engines (QEs), 307

Query execution and optimization, 323

Query manager, 399

Query modification, 23

algorithm, 24

Query operation, 512

Query optimization, 23–24, 512

Query plan generation, 274–277

Query processing, 437, 512–513

module, 359

system, 264

Query processor, 513

Query transformation, 512

R

RabbitMQ, 448

Radial-based function (RBF), 209

Radius (R), 133

RAMP, see Reduce and map provenance

Raspberry Pi, 409

Raw outlier, 97

RBAC, see Role-based access control

RBF, see Radial-based function

RDD, see Resilient distributed dataset

RDF-S, see RDF schema

RDF, see Resource description framework

RDFKB, see RDF Knowledge Base

RDF Knowledge Base (RDFKB), 267

RDFQL, see RDF Query Language

RDF Query Language (RDFQL), 385

RDF schema (RDF-S), 59

Real dataset-ASRS, 161

Real dataset-KDD, 161–162

Realistic Data Stream Classifier (ReaSC), 149–151

Real-time

analytics, 436–437

classification, 174

database systems, 522

processing, 393, 516

threat, 43

traveler information systems, 404

Real-time stream analytics, 446

current systems and limitations, 446

future system, 446–448

problem and challenges, 446

Real-world problems, 494

ReaSC, see Realistic Data Stream Classifier

ReaSC, 98, 101, 109, 110, 163, 168, 172

Receiver operating characteristic curves (ROC curves), 163

Recovery, 513

Recursive mining, 190, 191

Redaction manager, 399

Reduce and map provenance (RAMP), 64

Reduce input phase (RI phase), 272

Reduce output phase (RO phase), 272

Refine-Ensemble, 156–157

Relational databases, 264, 508

systems, 496

Relational data models, 507, 508

Relational learning, 456

Relaxed Bestplan problem, 276–277

Research and infrastructure activities in BDMA and BDSP, 454

big data analytics for insider threat detection, 454

binary code analysis, 455

CPS security, 455

infrastructure development, 455

secure cloud computing, 454–455

secure data provenance, 454

TEE, 455

Research challenges, 477–480

Resilient distributed dataset (RDD), 80

Resource description framework (RDF), 3, 15, 57, 58, 263, 290, 308, 364, 373, 438, 487, 488

data manager, 308

Gateway, 385

graphs, 69

integration, 63–64

policy engine, 323–324

processing engines, 326

RDF-3X, 267

RDF-based policy engine, 325, 367

repository architecture, 72–73

security, 62–63

Reverse engineering methods, 417

REWARDS technique, 417, 419

RI phase, see Reduce input phase

Risk-based framework, 417–418

Risk analyzer, 399

Risk models, 479

Robotium (ROBO), 423

ROC curves, see Receiver operating characteristic curves

Role-based access control (RBAC), 15, 18–19, 331, 359, 398, 442

Role hierarchy, 19

RO phase, see Reduce output phase

Routing protocols, 407

Rule-combining algorithms, 335

S

SaaS, see Software as a Service

SAMOA, 253, 447

Sanitization

task output derivation, 441

tasks, 441

techniques, 477

Satellite AOD data, 446

SCADA systems, see Supervisory control and data acquisition systems

Scalability, 69, 184, 186, 391, 410

big dataset for insider threat detection, 244–245

big data techniques for, 192–193

experimental setup and results, 244

Hadoop cluster, 244

Hadoop MapReduce platform, 237–238

issues, 447

results for big data set relating to insider threat detection, 245–248

scalable analytics for IoT security applications, 408–411

scalable LZW and QD construction using MR job, 238–244

test, 147

Scalable, high-performance, robust and distributed (SHARD), 266, 325

Scalable LZW and QD construction using MR job, 238–244

1MRJ approach, 241–244

2MRJ approach, 238–241

Schema, 509

SciDB, 438–439

multidimensional array data model, 436

Scientific data

privacy and security aware data management, 440–442

storing and retrieving multiple types, 437–440

SDB, see SPARQL database

SDC, see System Development Corporation

SDN, see Software-defined networking

Search space size, 276

Second-order Markov model, 34

Secret sharing-based techniques, 408

Secure big data management and analytics, unified framework for, 392

design of framework, 397–400

global big data security and privacy controller, 400–401

integrity management and data provenance for big data systems, 391–396

Secure cloud computing, 454–455, 461

Secure cyber-physical systems, 461

Secure data

integration framework, 339

provenance, 454

storage and retrieval in cloud, 322, 324–325, 462

Secure encrypted stream data processing, 463–465

SecureMR, 440

Secure multiparty computation (SMC), 476

Secure SPARQL query processing on cloud, 322–323

Security, 516

and IoT, 403–411

labels, 441

and ontologies, 63

query and rules processing, 63

RDF, 62–63

semantic web and, 61

XML, 62

Security and privacy for big data, 459

approach, 459–460

curriculum development, 460–461

experimental program, 461–465

Security applications

data mining for cyber security, 43–47

data mining tools, 47–48

Security extensions, 281

access control model, 282–283

access token assignment, 283–284

conflicts, 284–285

Security policies, 15, 16; see also Insider threat detection

access control policies, 16–19

administration policies, 20

auditing, 21

authentication, 20–21

discretionary security policies, 16

identification, 20–21

views for security, 21

SELinux, 440

Semantic gap, 38

Semantic web-based inference controller for provenance big data

architecture for inference controller, 356–360

big data management and inference control, 367–368

implementing inference controller, 365–367

inference control through query modification, 361–365

Semantic web, 51, 57

cloud computing frameworks based on technologies, 63–65

DL, 59–60

graphical models and rewriting, 361

inferencing, 60–61

OWL, 59

preliminaries in, 52

RDF, 58

and security, 61–63

semantic web-based models, 360–361

semantic web-based security policy engines, 326

SPARQL, 58–59

SWRL, 61

technologies, 52, 263, 360, 396

technology stack for, 57

XML, 58

Semantic Web Rules Language (SWRL), 58, 61, 309, 358–359, 387

Semisupervised classification/prediction, 446–447

Semisupervised clustering

stream classification algorithm, 172

techniques, 109, 131, 149

Sensing infrastructure, 404

Sensor network, 408–409

Sensor signal, 409

Sentiment mining, 297–298

Sequence-based behavior analysis, 416

Sequence data, 217; see also Nonsequence data

anomaly detection, 223–224

choice of ensemble size, 233–235

classification, 217–220

complexity analysis, 224

concept drift in training set, 228–230

dataset, 227–228

experiments and results for, 227

insider threat detection for, 217

NB-INC vs. USSL-GG for various drift values, 231–232

results, 230

stream data, 184, 251

TN, 230–231

USSL, 220–223

Serializability, 513

Server role, 381–382

Service models, 53

SETM algorithm, 35

SGX hardware, 463

SHARD, see Scalable, high-performance, robust and distributed

Signature(s), 47

behavior, 189, 191

database, 342

detection, 339

signature-based malware detectors, 342

Silver Lining, 440

Simple Protocol and RDF Query Language (SPARQL), 58–59, 69, 263, 269, 488

query modification, 364–365

query processor, 312, 325

Single-chunk approach, 171

Single-partition, single-chunk approach (SPC approach), 115, 340, 344

ensemble approach, 116

Single map reduce job approach (1MRJ approach), 238, 241–244

Single model approach, 94

classification, 106–107

incremental approaches, 417

Single pass algorithm, 220

Single source derivation, 441

Singular value decomposition (SVD), 40

Small communication frames, 407

Smart grid, 405–407

Smart home, 405

Smart meters, 408

Smartphones application, 418

classification model, 418

data gathering, 419

data reverse engineering, 419

malware detection, 419

SMC, see Secure multiparty computation

SMM, see System management mode

SNOD, see Stream-based novel class detection

Social factor-based technique, 297

Social graph-based score computation, 295

Social media

authenticity of digital images in, 473

privacy for, 385–387

sites, 291

systems, 27, 379

Social network, 388–389

community, 263

trust for, 387

Soft subspace clustering, 71

Software, 280

Software as a Service (SaaS), 53, 307, 332

Software-defined networking (SDN), 407

SOWT, see Special operations weather specialists

Space complexity, 140–141

Space sensors, 404–405

Spark, 422, 458

emerge, 490

running, 409

SPARQL, see Simple Protocol and RDF Query Language

SPARQL database (SDB), 321

SpatialHadoop, 458

Spatiotemporal Database Systems, 522

SPC approach, see Single-partition, single-chunk approach

Special operations weather specialists (SOWT), 459

Split using explicit type information of object, 269

Spout, 447–448

SQL, see Structured Query Language

SSL/TLS, large scale, automated detection, 421

SSO, see System security officer

Stand-alone systems, 497

Stanford framework, 458

State-of-the-art stream classification techniques, 127, 149–171

Static analysis, 421

Static GBAD approaches, 190

Static learning, 190

Statistical models, 410

Status, 497

Sticky policies, 478

Storage management, 514

Storage services, 52

Storage virtualization, 54

Storing and retrieving multiple types of scientific data, 437

current systems and limitations, 438–439

future system, 439–440

problem and challenges, 437–438

Storm (data system), 442

Stream, 197

analytics, 171

classification techniques, 150

sequence data, see Infinite sequences

Stream-based novel class detection (SNOD), 289

application, 300

SNOD++, 300

Stream data, 192, 253, 410

classification, see Data stream classification

mining, 181

Stream data analytics, 3, 257

applications for insider threat detection, 3–4

for insider threat applications layer, 6–7

for insider threat detection, 4

layer, 6

Stream mining, 190–192, 457

big data issues, 184

as big data mining problem, 253

contributions, 185–186

GBAD, 183–184

insider threat detection as stream mining problem, 183, 184

sequence stream data, 184

techniques, 207

Strong authorization, 17

Structured Query Language (SQL), 15, 55, 69, 485, 495, 512

extensions for security, 22–23

Subspace clustering, 71–72

Supervised approach, 197

Supervised ensemble classification updating, 200

Supervised learning, 68, 190, 203, 209–212; see also Unsupervised learning

algorithm, 183, 184

approaches, 183, 189, 251

ensemble for, 200–201

Supervised methods, 191

Supervised microclustering technique, 110

Supervised model, 191

Supervised testing algorithm, 200

Supervised/unsupervised learning, 456

Supervisory control and data acquisition systems (SCADA systems), 405

Supporting technologies, 2–3; see also Big data management and analytics (BDMA); Big data security and privacy (BDSP)

layer, 6–7, 499, 500

Support vector machines (SVMs), 27, 31–32, 47, 68, 183, 185, 207–209, 251, 342

Support vectors, 32

SVD, see Singular value decomposition

SVMs, see Support vector machines

SWRL, see Semantic Web Rules Language

Sybase Inc., 495

Symposium on Access Control Models and Technologies, 18

SynC, see Synthetic Data with only Concept Drift

SynCN, see Synthetic Data with Concept Drift and Novel Class

SynD, see Concept-drifting synthetic dataset

SynDE, see Concept-evolving synthetic dataset

Synthetic datasets, 99, 160, 349–350

Synthetic data with concept drift and concept evolution, 99

Synthetic Data with Concept Drift and Novel Class (SynCN), 141

Synthetic Data with only Concept Drift (SynC), 141

Synthetic data with only concept drift, 99

Systematic performance study of TEE, 462–463

System Development Corporation (SDC), 495

System management mode (SMM), 462

System(s)

call, 207, 208

security, 427, 461

services, 52

System R, 15, 16

System security officer (SSO), 20, 511

T

TABARI software, 458

Tag, 442

TaintDroid, 425

TEE, see Trusted execution environments

Temporary buffer, 129

Text(s)

classification approaches, 189

relationship between, 502–504

Third-party IME, 424

Threat

assessment, 295

data, 403

Three-schema architecture, 510

TIE, see Trust inference engine

Time based access control (TRBAC), 359

Time complexity, 121, 140–141, 160

Timely health indicators, 459, 474

Time role-based access control (TRBAC), 398

TM, see Translation model

TMP36 sensors, 409

TNs, see True negatives

Token, 207

subgraph, 208

Tor (TOR), 407, 408

Toy problems, 494

TPJ, see Triple Pattern Join

TPR, see True positive rate

TPs, see Triple patterns; True positives

Trace Files, 227

Traditional data stream classification techniques, 127, 416

Traditional machine-learning tools, 409

Traditional static supervised method, 183

Traffic flow control, 404

Transactional approach, mitigating data leakage in mobile apps using, 424–425

Transaction management, 513–514

Translation model (TM), 40

Traveler information, 404

TRBAC, see Time based access control; Time role-based access control

Triple Pattern Join (TPJ), 271

Triple patterns (TPs), 264, 271

Triples, 72

True negatives (TNs), 197, 230

True positive rate (TPR), 186, 230

True positives (TPs), 197, 230

“Truncated” UNIX shell commands, 189, 191

Trust, 379, 380

probabilities, 387

for social networks, 387

Trusted execution environments (TEE), 454, 455, 459

systematic performance study, 462–463

Trust inference engine (TIE), 382–383

Trust, privacy, and confidentiality, 379

current successes and potential failures, 380–381

inference engines, 383–384

motivation for framework, 381

TrustZone security, 406

Twitter, 289

Two-class SVM, 209, 211

Two MapReduce jobs (2MRJ), 238

approach, 238241

Two-phase commit, 513

Type sink, 417

U

UAV could, 409

UCON, see Usage control

UI, see User interface

Unbounded data stream, 221

Unified framework

design of framework, 397–400

global big data security and privacy controller, 400–401

integrity management and data provenance for big data systems, 391–396

learning framework, 409

for secure big data management and analytics, 392

Uniform resource identifiers (URIs), 58, 74, 269, 318, 331

UNIX shell commands, 189

Unsupervised ensemble classification and updating, 198

Unsupervised K-means, 131132

clustering, 152

Unsupervised learning, 191, 203, 210, 212214, 415; see also Supervised learning

algorithm, 183, 184

ensemble for, 199200

GBAD-MDL, 204

GBAD-MPS, 205

GBAD-P, 204–205

GBAD, 203–204

Unsupervised method, 183

Unsupervised stream-based sequence learning (USSL), 184, 185, 218, 219, 220, 230

constructing LZW Dictionary, 221–222

data chunk, 220–221

USSL-GG algorithms, 230–235

URIs, see Uniform resource identifiers

Usage control (UCON), 19

U.S. Bureau of Labor and Statistics (BLS), 1

Use cases, 404–406

User demographics-based, 297

User feedback, 252

User interface (UI), 423–424

manager, 357, 398

User-level applications, 189

U.S. Homeland Security, 67

USSL, see Unsupervised stream-based sequence learning

V

VA, see Veterans Administration

Vector representation of content (VRC), 70–71

Vertically partitioned layout, 318–319

Very Fast Decision Trees (VFDTs), 106, 340

Veterans Administration (VA), 433, 434

decision support tools, 436

Personal Health Record system, 434

VFDTs, see Very Fast Decision Trees

Victim selection, 220

Video signal, 409

View management, 517

ViewServer, 424

Vigiles, 441

Virtualization, 53–54

Virtual laboratory development, 421

architectural diagram for virtual lab and integration, 422

experimental system, 425–426

input events generation, 424

intelligent fuzzer for automatic Android GUI application testing, 423

interface, 423–424

laboratory setup, 421–422

mitigating data leakage in mobile apps, 424–425

policy engine, 426

problem statement, 423

programming projects to support virtual lab, 423

technical challenges, 425

Virtual machine manager (VMM), 462

Virtual machines (VM), 244

image, 55

monitor, 54

Vision, 497

VM, see Virtual machines

VMM, see Virtual machine manager

VMware, 54

Volume, velocity, variety, veracity, and value (Five Vs), 1

Voting, 409

VRC, see Vector representation of content

W

WA, see Weighted average

Wang, 122, 123, 124, 125

W3C, see World Wide Web Consortium

WCE, see Weighted classifier ensemble

WCOP, see Web rules, credentials, ontologies, and policies

Weak authorization, 17

Web-based interface, 421

Web Ontology Language (OWL), 58, 59, 263, 309, 355, 364, 487

OWL 2 specification, 400

Web rules, credentials, ontologies, and policies (WCOP), 388

Weighted average (WA), 199

Weighted classifier ensemble (WCE), 142

Weight learning, 443

Weka (machine learning open source package), 83, 122

Whitepages, 366

WHO, see World Health Organization

Wireless communication networks, 404

Wireless sensor networks (WSN), 410

Workgroups, 474

Workshop discussions, 474

BDMA for cyber security, 480–481

examples of privacy-enhancing techniques, 475–476

multiobjective optimization framework for data privacy, 476–477

philosophy for BDSP, 475

research challenges and multidisciplinary approaches, 477–480

workgroups, 474

Workshop presentations

keynote presentations, 473–474

summary, 472–474

World Health Organization (WHO), 433

World Wide Web, 20, 24, 53, 57, 365, 462

World Wide Web Consortium (W3C), 57, 380

Wrapper-based simultaneous feature weighing, 39

WSN, see Wireless sensor networks

X

XACML, see eXtensible Access Control Markup Language

XEN, 54

XML, see eXtensible Markup Language

XQuery, 23

Y

Yahoo!, 266

Yellowpages, 366

Z

Zero-knowledge proof of knowledge protocols (ZKPK protocols), 476
