List of figures and tables

Figures

1.1. Technology stack of the current version of LSP running internally at Lundbeck 16

1.2. LSP curvefit, showing plate list, plate detail as well as object detail and curve 20

1.3. LSP MedChem Designer, showing on the fly calculated properties and Google visualisations 25

1.4. LSP4Externals front page with access to the different functionalities published to the external collaborators 26

1.5. LSP SAR grid with single row details form 28

1.6. IMI OpenPhacts GUI based on the LSP4All frame 31

2.1. Integration of online OpenTox descriptor calculation services in the Bioclipse QSAR environment 40

2.2. The Bioclipse Graphical User Interface for uploading data to OpenTox 41

2.3. OpenTox web page showing uploaded data 42

2.4. CPDB Signature Alert for Carcinogenicity for TCMDC-135308 48

2.5. Identification of the structural alert in the ToxTree Benigni/Bossa model for carcinogenicity and mutagenicity, available via OpenTox 49

2.6. Crystal structure of human TGF-β1 with the inhibitor quinazoline 3d bound (PDB-entry 3HMM) 50

2.7. Replacing the dimethylamino group of TCMDC-135308 with a methoxy group resolves the CPDB signature alert as well as the ToxTree Benigni/Bossa Structure Alerts for carcinogenicity and mutagenicity as provided by OpenTox 51

2.8. Annotated kinase inhibitors of the TCAMS, imported into Bioclipse as SDF together with data on the association with human adverse events 52

2.9. Applying toxicity models to sets of compounds from within the Bioclipse Molecule TableEditor 53

2.10. Adding Decision Support columns to the molecule table 54

2.11. Opening a single compound from a table in the Decision Support perspective 55

2.12. The highlighted compound – TCMDC-135174 (row 27) – is an interesting candidate as it is highly active against both strains of P. falciparum while being inactive against human HepG2 cells 56

2.13. Molecule Table view shows TCMDC-134695 in row 19 57

2.14. The compound TCMDC-133807 is predicted to be strongly associated with human adverse events, and yields signature alerts with Bioclipse’s CPDB and Ames models 57

3.1. A ‘prospected’ article from RSC 65

3.2. The header of the chemical record for domoic acid in ChemSpider 69

3.3. Example of figure in article defining compounds 73

3.4. A review page of digested information 75

3.5. Examples of ChemDraw molecules which are not converted correctly to MOL files by OpenBabel 77

3.6. The Learn Chemistry Wiki 79

4.1. Isotope pattern for cystine 97

4.2. Ion chromatogram produced in R (xcms) 100

4.3. A mass spectrum produced from R (xcms) 101

4.4. 3D image of an LC-MS scan using the plot surf command from the RGL R-package LC-MS spectrum using mzMine 102

4.5. A total ion chromatogram (TIC) plot from mzMine 103

4.6. Configuring peak detection 103

4.7. Deconvoluted peak list 105

4.8. 3D view of an LC-MS scan 106

4.9. Example of a Batch mode workflow 106

4.10. Configuring mzMine for metabolomics processing 108

4.11. mzMine results 110

4.12. Data analysis in mzMine 111

4.13. A metabolomics componentisation workflow in KNIME 113

4.14. Workflow to normalise to internal standard or total signal 114

4.15. PCA analysis using KNIME and R 117

4.16. The plots from the R PCA nodes 122

5.1. The many faces of ImageJ 134

5.2. ImageJ can be customised by defining the contents of the various menus 135

5.3. Smartroot displays a graphical user interface that only Javascript can deliver within ImageJ 139

5.4. A KNIME workflow that integrates ImageJ functions in nodes as well as custom macros 140

5.5. Example of a QR code that can be read by a plug-in based on ZXing 141

5.6. An example of a GUI that can be generated within the ImageJ macro language to capture user input 145

5.7. Imaging of seeds using a flat bed scanner 146

5.8. Plant phenotyping to non-subjectively quantify the areas of different colour classifications 147

6.1. Simple KNIME workflow building a decision tree for predicting molecular activity 152

6.2. Hiliting a frequent fragment also hilites the molecules in which the fragment occurs 154

6.3. Feature elimination is available as a loop inside a meta-node 155

6.4. Outline of a workflow for comparing two SD files 159

6.5. Reading and combining two SD files 159

6.6. Preparation of the molecules 160

6.7. Filtering duplicates 161

6.8. Writing out the results 161

6.9. Amide enumeration workflow 162

6.10. Contents of the meta-node 163

6.11. KNIME Enterprise Server web portal 165

6.12. Outline of a workflow for image processing 166

6.13. Black-and-white images in a KNIME data table 166

6.14. Image after binary thresholding has been applied 167

6.15. Meta-node that computes various features on the cell images 167

6.16. A workflow for large-scale analysis of sequencing data 168

6.17. Identification of regions of interest 170

7.1. Overview of the ISA-Tab format 178

7.2. An overview of the depth and breadth of the PredTox experimental design 182

7.3. The ontology widget illustrates here how CHEBI and other ontologies can be browsed and searched for term selection 184

8.1. Diverse types of genomic data 191

8.2. Flow-chart describing the various functionalities of the GenomicTools suite 196

8.3. Example entry from the user’s manual for the ‘shuffle’ operation of the genomic_regions tool 199

8.4. Example entry (partial) from the C++ API documentation produced using Doxygen and available online with the source code distribution 203

8.5. Example of TSS read profile for genes of high and low expression 209

8.6. Example of TSS read heatmap for select genes 210

8.7. Example of window-based read densities in wiggle format 212

8.8. Example of window-based peaks in bed format 213

8.9. Time evaluation of the overlap operation between a set of sequenced reads of variable size (1 through 64 million reads in logarithmic scale) and a reference set comprising annotated exons and repeat elements (~ 6.4 million entries) 216

8.10. Memory evaluation of the overlap operation presented in Figure 8.9 217

9.1. Applications of ‘omics data throughout the drug discovery and development process 223

9.2. User communities for ‘omics data 225

9.3. Atlas query results screen 227

9.4. Atlas architecture 228

9.5. Atlas administration interface 230

9.6. Federated query model for Atlas installations 236

10.1. Overview of the IT system showing the Beowulf compute cluster comprising a master server that passes out jobs to the four nodes 242

10.2. The current IT system following a modular approach with dedicated servers 244

10.3. NAS box implementation showing the primary NAS at site 1 mirrored to the secondary NAS at a different site 247

10.4. A screenshot of our ChIP-on-chip microarray data viewing application 256

11.1. Changes in bases of sequence stored in GenBank and the cost of sequencing over the last decade 264

11.2. Our filesharing setup 268

11.3. Connectivity between web browsers, web service genome browsers and web services hosting genomic data 274

12.1. A sample DesignSet 286

12.2. The progress chart for the DDD1 project 288

12.3. Using the smiles tag within our internal wiki 290

12.4. Adoption of Design Tracker by users and projects 294

13.1. The FLOSS assessment framework 302

13.2. A screenshot of Pfollow showing the ‘tweets’ within the public timeline 306

13.3. A screenshot of tags.pfizer.com 310

13.4. A screenshot showing Pfizerpedia’s home page 316

13.5. A screenshot showing an example profile page for the Therapeutic Area Scientific Information Services (TA SIS) group 316

13.6. A screenshot of the tags.pfizer.com social bookmarking service page from the R&D Application catalogue 317

14.1. Schematic overview of the system from accessing document sources 329

14.2. Node/edge networks for disease-mechanism linkage 335

14.3. KOL Miner in action 337

14.4. Further KOL views 338

14.5. Drug repurposing matrix 339

14.6. Early snapshot of our drug-repositioning system 340

14.7. Article-level information 341

14.8. An example visual biological process map describing how our drugs work at the level of the cell and tissue 343

14.9. A screenshot of the Atlas Of Science system 345

14.10. Typical representation of three layout approaches that are built into Prefuse 346

15.1. A static PDF table 356

15.2. In (a), Utopia Documents shows meta-data relating to the article currently being read; in (b), details of a specific term are displayed, harvested in real time from online data sources 358

15.3. A text-mining algorithm has identified chemical entities in the article being read, details of which are displayed in the sidebar and top ‘flow browser’ 359

15.4. Comments added to an article can be shared with other users, without the need to share a specific copy of the PDF 361

15.5. Utopia Library provides a mechanism for managing collections of articles 362

16.1. An example of semantic annotations 370

16.2. Semantic Form in action 371

16.3. Page template corresponding to the form in Figure 16.2 372

16.4. The KnowIt landing page 374

16.5. The layout of KnowIt pages is focused on content 375

16.6. Advanced functions are moved to the bottom of pages to avoid clutter 375

16.7. Semantic MediaWiki and Linked Data Triple Store working in parallel 384

16.8. Wiki-based contextual menus 386

17.1. Data loading architecture 397

17.2. Properties of PDE5 stored semantically in the wiki 399

17.3. A protein page in Targetpedia 400

17.4. The competitor intelligence section 402

17.5. A protein family view 403

17.6. Social networking around targets and projects in Targetpedia 405

17.7. Dividing sepsis into physiological subcomponents 411

17.8. The Semantic Form for creating a new assertion 414

17.9. (a) An assertion page as seen after editing. (b) A semantic tag and automatic identification of related assertions 415

17.10. The sepsis project page 416

18.1. Chem2Bio2RDF organization, showing data sets and the links between them 428

18.2. Tools and algorithms that employ Chem2Bio2RDF 430

19.1. The TripleMap architecture 445

19.2. Entities and their associations comprise the GEM data network 447

19.3. TripleMap web application with knowledge maps 449

20.1. Architecture for analytical processing 454

20.2. HL7 V2.x message sample 455

20.3. HL7 V3.x CDA sample 457

20.4. Mirth Connect showing the channels from the data sources to the databases 460

20.5. Screenshot of Mirth loading template 461

20.6. The resulting Mirth message tree 462

20.7. CDA data model 467

20.8. MapReduce 469

20.9. Riak RESTful API query 473

20.10. Complete analytic architecture 478

21.1. Assess the open source software package 486

21.2. Assess the open source community 489

21.3. Risk management process 492

21.4. Typical validation activities 494

21.5. Software development, change control and testing 498

21.6. Development environments and release cycles 502

22.1. Deploying open source software and data inside the data centres of corporations 516

22.2. Vision for a new cloud-based shared architecture 517

Tables

2.1. Bioclipse–OpenTox functionality from the Graphical User Interface is also available from the scripting environment 37

2.2. Description of the local endpoints provided by the default Bioclipse Decision Support extension 44

2.3. Various data types are used by the various predictive models described in Table 2.2 to provide detailed information about what aspects of the molecules contributed to the decision on the toxicity 45

2.4. Structures created from SMILES representations with the Bioclipse New from SMILES wizard for various structures discussed in the use cases 46

8.1. Summary of operations of the genomic_regions tool 198

8.2. Summary of usage and operations of the genomic_overlaps tool 200

8.3. Summary of usage and operations of the genomic_scans tool 201

8.4. Supported statistics for the permutation tests 201

13.1. Comparison of the differences between Web 2.0 and Enterprise 2.0 environmental drivers 300

13.2. Classifying some of the most common uses of MediaWiki within the research organisation 314

17.1. Protein information sources for Targetpedia 396

17.2. The composition of an assertion 412

18.1. Data sets included in Chem2Bio2RDF ordered by number of RDF triples 429

18.2. Open source software used in Chem2Bio2RDF 430

20.1. Database comparisons 470

20.2. Comparison of open source BI frameworks 476

21.1. GAMP® 5 software categories 486
