Chapter 22

Metadata

David H. Christiansen

22.1 Introduction

Metadata (also written “meta data” or “meta-data”) are commonly defined as “data about data.” Clinical trial metadata describe data originating from or related to clinical trials, including data sets and the statistical analyses performed on those data sets. Clinical trial metadata may be in the form of a separate written document, may be linked electronically to a document or data set, or may be integrated into a data set as part of the definition of the data fields or variables. Metadata may be accessed by statisticians performing analyses on the data and by other scientists reviewing or using the data and results. Increasingly, metadata included in computerized data sets (machine-readable metadata) can also be used by statistical software and other computer applications to present or process the data appropriately, based on the metadata description. For machine-readable metadata to be used by a computer application, standards for format and content must exist for both the metadata and the application that reads them. Metadata are an important component of the documentation required for regulatory submissions and should provide a clear and concise description of the data collected and the analyses performed.

22.2 History/Background

The term “metadata” was coined in 1969 by Jack E. Myers. Although “Metadata” was trademarked in 1986 by The Metadata Company [1], the generic “metadata” is commonly used in many disciplines, including computer science, database administration, geographic science, and clinical data management.

22.2.1 A Metadata Example

The importance of metadata can be illustrated by the following example adapted from actual events. Consider the two data sets shown in Table 1. Each data set contains the same variables but has different data values. With only cryptic variable names and no additional metadata, it would be almost impossible to determine what the data values represent without additional documentation. If we add context by stating that these data sets represent rocket burn instructions for placing a satellite in orbit and that the last three variables represent distance, speed, and force, then a knowledgeable rocket scientist may be able to infer what the data points could represent.

Table 1: Example Data Sets with No Metadata

In fact, that inference, based on inadequate metadata, led to a very costly mistake. The Mars Climate Orbiter was launched in December 1998 and was scheduled to go into orbit around Mars 9 months later. The $150 million satellite failed to reach orbit because the force calculations were “low by a factor of 4.45 (1 pound force = 4.45 Newtons), because the impulse bit data contained in the AMD file was delivered in lb-sec instead of the specified and expected units of Newton-sec” [2]. In other words, a data file in English units (pounds) was input into a program that expected metric units (Newtons). Table 2 illustrates the same two data sets with additional metadata, including meaningful variable names, labels, and units.

Table 2: Mars Climate Orbiter Burn Instructions with Metadata

Note that the units used in each data set are now clearly identified in the column-header metadata. Although it cannot be said with certainty that these metadata would have prevented the error, they would have increased the probability that the error would have been caught by project personnel. The use of machine-readable or “parsable” metadata would also allow computer programs to check for compatible units.
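
As an illustration only (the data set names, variable names, and units below are hypothetical and are not taken from the mishap report), the following SAS sketch shows how a simple metadata table pairing each variable with its units could let a program detect such a mismatch automatically.

    /* Hypothetical variable-level metadata for the two interfaces: the   */
    /* units written by the producing program and the units expected by   */
    /* the consuming program. All names here are illustrative.            */
    data producer_meta;
       length variable $8 units $12;
       input variable $ units $;
       datalines;
    DISTANCE km
    SPEED    km_per_sec
    FORCE    lbf_sec
    ;
    run;

    data consumer_meta;
       length variable $8 units $12;
       input variable $ units $;
       datalines;
    DISTANCE km
    SPEED    km_per_sec
    FORCE    N_sec
    ;
    run;

    /* Merge the two metadata tables and keep any variable whose units     */
    /* disagree; a nonempty listing signals an incompatibility to resolve. */
    proc sort data=producer_meta; by variable; run;
    proc sort data=consumer_meta; by variable; run;

    data unit_mismatch;
       merge producer_meta (rename=(units=producer_units))
             consumer_meta (rename=(units=consumer_units));
       by variable;
       if producer_units ne consumer_units;
    run;

    proc print data=unit_mismatch noobs;
       title "Variables with incompatible units";
    run;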

22.2.2 Geospatial Data

Geographic science is an example of a discipline with well-defined metadata. In 1990, the Office of Management and Budget established the Federal Geographic Data Committee as an interagency committee to promote the coordinated development, use, sharing, and dissemination of geospatial data on a national basis [3]. In addition to detailed metadata and tools, the group also provides informational and educational materials that may be useful for motivating the development and use of metadata. For example, the U.S. Geological Survey, in a document titled “Metadata in Plain Language” [4], poses the following questions about geospatial data:

1. What does the data set describe?
2. Who produced the data set?
3. Why was the data set created?
4. How was the data set created?
5. How reliable are the data; what problems remain in the data set?
6. How can someone get a copy of the data set?
7. Who wrote the metadata?

Although these questions may have a different context and different emphasis in clinical trials, they can provide background information for our discussion of clinical trial metadata.

22.2.3 Research Data and Statistical Software

Research data management refers to the design, collection, editing, processing, analysis, and reporting of data that result from a research project such as a clinical trial. The characteristics and requirements of research data management activities differ from those of a commercial data management system. Because clinical trials are experimental by nature, the resulting data sets are unique, contain many different variables, and require a high degree of study-specific documentation, including metadata. Conversely, commercial data systems, such as payroll or credit card billing, are characterized by very stable systems that perform well-defined functions on a relatively small number of standard variables. These differences make it difficult to use commercial data systems, such as databases and reporting systems, for research data.

These unique requirements drove the development of statistical software specifically for research data over the last 30 years. The most commonly used systems all have some type of metadata at both the data set and variable levels. These metadata allow the researchers conducting the studies to describe the structure and content of the data sets clearly. The ability to describe the data clearly and unambiguously is important in the analysis and reporting of the study results, both within an organization and for regulatory review and approval of new treatments.

22.2.4 Electronic Regulatory Submission

The International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) [5] has developed a Common Technical Document (CTD) that “addresses the organization of the information to be presented in registration applications for new pharmaceuticals (including biotechnology-derived products) in a format that will be acceptable in all three ICH regions (Japan, Europe and United States)” [6]. The CTD outline is, in fact, metadata for the regulatory submission. It provides information about the submission in a structured format that facilitates both the creation and review of the submission.

Because the requirements for submission of clinical data vary from country to country, the CTD does not specifically address electronic submission of clinical data. The U.S. Food and Drug Administration (FDA), however, defined metadata requirements for clinical trial data sets submitted as part of its treatment approval processes in 1999 [7]. The Clinical Data Interchange Standards Consortium (CDISC) enhanced the FDA metadata by adding metadata attributes at the data set and variable levels [8]. Since that time, the FDA and CDISC have collaborated on more standards, resulting in the FDA referencing the CDISC standard for clinical domain data as an acceptable format [9,10]. The FDA and CDISC currently are working on a similar standard for analysis data sets [11,12].

22.3 Data Set Metadata

In the context of clinical trials and regulatory submissions, the most useful metadata refer to the data sets that contain the trial results. The data sets can be generally classified as those data collected during the execution of the trial (tabulation data or observed data) or data derived from the observed data for the purpose of display or analysis (analysis data). Data set metadata can be classified at various levels as described below. The metadata attributes described are discussed by one or more of the following standards or organizations: CDISC Tabulation [10], CDISC Analysis [11], FDA [13], and SAS [14].

22.3.1 Data Set-Level Metadata

A data set is a computer file structured in a predictable format that can be read and processed by a computer application or program, typically a statistical system such as SAS (SAS Institute, Cary, NC), S-Plus (Insightful Corporation, Seattle, WA), or SPSS (SPSS Inc., Chicago, IL). In the discussion here, SAS terminology will be used, but similar concepts and attributes exist in all statistical languages and systems. Data set-level metadata describe the content, structure, and use of the data set by describing its physical and logical attributes. Some of these attributes relate to the data set itself, whereas others depend on the context in which the data set is used. The description and use of these attributes as they relate to regulatory submissions are shown below, followed by a brief illustrative sketch.

Data Set Name Unique data set name for this file. FDA and CDISC have naming conventions for some clinical domain and analysis data sets. (FDA, CDISC, and SAS)

Description (Label) A more detailed description of the content of the data set. (FDA, CDISC, and SAS)

Location The relative physical file location in an electronic submission. (FDA and CDISC)

Structure The shape of the data set or the level of detail represented by each row or record. Structure can range from very horizontal (one record per subject) to very vertical (one record per subject per visit per measurement). It is recommended that structure be defined as “one record per…” rather than with ambiguous terms such as normalized or denormalized, horizontal or vertical, tall or short, skinny or fat, and so forth. (CDISC)

Purpose Definition of the type of data set as tabulation or analysis. (CDISC)

Key Fields Variables used to uniquely identify and index records or observations in the data set. (CDISC)

Merge Fields A subset of key fields that may be used to merge or join SAS data sets. (CDISC analysis)

Analysis Data Set Documentation Written documentation that includes descriptions of the source data sets, processing steps, and scientific decisions pertaining to creation of the data set. Analysis data set creation programs may also be included. (CDISC Analysis)

Rows The number of data records or observations in the data set. (SAS)

Columns The number of variables or fields in the data set. For most analysis data sets, each column represents the measured value of some characteristic of the subject, such as sex, age, or weight at a visit. (SAS)
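
As a minimal sketch (the data set and variable names are hypothetical and are not mandated by FDA or CDISC), the SAS step below assigns two of these attributes directly, a data set label and key fields, and PROC CONTENTS then reports the resulting data set-level metadata, including the number of rows and columns.

    /* Hypothetical subject-level data set carrying data set-level       */
    /* metadata: a name (ADSL), a descriptive label, and a key field     */
    /* recorded through the SORTEDBY= data set option.                   */
    data adsl (label="Subject-Level Analysis Data Set" sortedby=usubjid);
       length usubjid $10 sex $1;
       input usubjid $ sex $ age;
       datalines;
    SUBJ-0001 F 34
    SUBJ-0002 M 41
    ;
    run;

    /* PROC CONTENTS reports the data set-level metadata: name, label,   */
    /* number of observations (rows), number of variables (columns),     */
    /* and sort (key) information.                                        */
    proc contents data=adsl;
    run;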

22.3.2 Variable-Level Metadata

Each column or variable in a data set has certain attributes that describe the content and use of the variable. These variable attributes are usually consistent for all observations within a data set. That is, each column has the same attributes for all rows of the data set. This rectangular data structure is required for statistical analysis by the vast majority of statistical software and is the natural structure for CDISC Analysis data sets. Many CDISC Tabulation data sets, however, have a structure of one record per measurement. Because different measurements have different attributes, metadata at the variable level are not adequate. See the section on value-level metadata below for a discussion of this issue. A brief SAS sketch illustrating several variable-level attributes appears at the end of this subsection.

Variable Name Unique name for the variable. A variable name should be consistent across data sets and studies within a regulatory submission. The data values of a variable should also be consistent in definition and units across all data sets within a submission. For example, if AGE is recorded in years in one study and months in another, then either AGE must be converted to, say, years in the second study or a different variable name, say AGEMON, must be used for age in months. (FDA, CDISC, and SAS)

Variable Label A more detailed description of the variable. This description may be used by some software and review tools. (FDA, CDISC, and SAS)

Type Description of how the data values for this variable are stored. Current conventions specify only a character string (CHAR) or a numeric value (NUM). These two types are consistent with SAS and other software, but additional types such as floating point, integer, binary, and date/time are used by some systems. (FDA, CDISC, and SAS)

Length The number of bytes allocated to store the data value. For CHAR variables, this number is the length of the character string. SAS and other software define this attribute as the number of bytes allocated to store the numeric value in some internally defined form, typically floating point hexadecimal. This length is not the number of digits used to display a number. See Format below. (SAS)

Format Description of how a variable value is displayed. Formats can be general, defining the width of a display field and the number of decimal digits displayed for a number. They can also provide code list values (1 = “Female”; 2 = “Male”) or display numerically coded values such as SAS dates in a human-readable form. (FDA, CDISC, and SAS)

CDISC Tabulation metadata defines this code list attribute as “Controlled Terms.” These terms are standard values for specific variables in certain clinical domains.

Role Description of how a variable may be used in a particular data set. A variable may be assigned multiple roles. The role attribute is used by CDISC in two distinct ways. First, CDISC Tabulation data sets have a set of specific roles designed to describe a variable in the context of the proposed FDA JANUS database [15].

  • Identifier variables, usually keys, identify the study, subject, domain, or sequence number of the observation.
  • Topic variables specify the focus of the observation, typically the name of a measurement or lab test in a one record per measurement structured data set.
  • Timing variables specify some chronological aspect of a record such as visit number or start date.
  • Qualifier variables define a value with text, units, or data quality. Qualifiers are often used to store variables not otherwise allowed in tabulation data sets.

Role attributes in CDISC Analysis data sets have a different emphasis from those in the tabulation models. Analysis roles focus on providing information useful to the statistical review rather than specifications for the still-to-be-developed JANUS database. Because the primary goal of analysis roles is clear communication of the statistical analysis performed by the sponsor, the values of the role attribute are open-ended and can be extended as needed to meet this goal. The following roles have been identified by FDA and CDISC through the Analysis Dataset Model group [16]. It should be noted that these definitions are still under development and may be subject to change.

  • Selection variables are frequently used to subset, sort, or group data for reporting, displaying, or analysis. Common selection variables include treatment group, age, sex, and race. Specific study designs, drug indications, and end points may have specific selection variables as well. For example, a hypertension trial may identify baseline blood pressure measurements as selection variables of interest. Flag variables identifying analysis populations such as “per protocol” or “intent to treat” are also commonly identified as selection variables.
  • Analysis variables relating to major study objectives or end points may be identified to assist reviewers. This identification may be especially useful in complex studies in which it may not be clear from the variable name which variable is the primary end point.
  • Support variables are identified as useful for background or reference. For example, the study center identifier may be used to group subjects by center, but the study name or investigator name would provide supporting information.
  • Statistical roles such as covariate, censor, end point, and so forth may also be useful for specific analyses and study designs. These roles may be added as needed to improve clear communication between the study authors and the reviewers.

Origin Describes the point of origin of a variable. CDISC Tabulation data sets allow “CRF,” “derived,” and “sponsor-defined” values of origin. This attribute usually refers to the first occurrence of a value in a clinical study and does not change if a variable is added to another file such as an analysis data set.

Source In a CDISC Analysis data set, source provides information about how a variable was created and defines its immediate predecessor. For example, an analysis data set may be created by merging two or more source data sets. The variable SEX from the demographics domain data set DM would have a source of “DM.SEX.” This convention of specifying the immediate predecessor defines an audit trail back to the original value, no matter how many generations of data sets are created. Derived variables in analysis data sets may include a code fragment to define the variable or may hyperlink to more extensive documentation or programs.
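
A hedged sketch of several of these attributes in SAS follows (all variable names, labels, formats, and the derivation are illustrative assumptions, not values prescribed by CDISC or FDA): PROC FORMAT supplies a code list, the ATTRIB statement assigns name, type, length, label, and display format, and a short code fragment derives BMI from height and weight, the kind of fragment that could accompany the variable’s metadata entry to document its source.

    /* Code list (controlled terms) for SEX, attached as a display format */
    proc format;
       value $sexfmt "F" = "Female"
                     "M" = "Male";
    run;

    data advs;
       /* Variable-level metadata: name, type/length, label, and format */
       attrib usubjid  length=$10 label="Unique Subject Identifier"
              sex      length=$1  label="Sex"            format=$sexfmt.
              heightcm length=8   label="Height (cm)"    format=5.1
              weightkg length=8   label="Weight (kg)"    format=6.1
              bmi      length=8   label="Body Mass Index (kg/m**2)" format=6.1;
       input usubjid $ sex $ heightcm weightkg;

       /* Derived variable: code fragment that would accompany the      */
       /* metadata entry for BMI (immediate predecessors: heightcm and  */
       /* weightkg in this same data set).                              */
       if heightcm > 0 then bmi = weightkg / ((heightcm/100)**2);
       datalines;
    SUBJ-0001 F 162 55.3
    SUBJ-0002 M 178 81.0
    ;
    run;

    /* Variable-level metadata report: name, type, length, format, label */
    proc contents data=advs;
    run;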

22.3.3 Value-Level Metadata

Many CDISC Tabulation (SDTM) data sets have a structure of one record per subject per time point per measurement. With this structure, measurements with different attributes, such as name, label, type, and format, appear on different records of the same data set. To communicate the content and use of such a data set clearly, each measurement name (test code) must have its own metadata. For example, the CDISC Tabulation model defines a Vital Signs data set with measurements for height, weight, and frame size. Table 3 illustrates selected portions of such a data set and its metadata. Some CDISC attributes have been changed or omitted to simplify the example.

Table 3: Vital Signs Tabulation Data Set

Note that the data values of HEIGHT, WEIGHT, and FRMSIZE are stored in the same column, and no useful metadata identifies format, variable type, role, or origin. This data set is difficult to understand and cannot be used with statistical software without additional programming. This file structure requires additional metadata for each value of the vital signs test as shown in Table 4.

Table 4: Vital Signs Tabulation Value-Level Metadata

The value-level metadata define the attributes needed to transpose the tabulation data set into a CDISC Analysis data set, typically with a structure of one record per subject. This data set, shown in Table 5, contains the data points from the tabulation data set, but in a different structure. Values of SEX and the derived variable body mass index (BMI) have been added to illustrate that an analysis data set can include data from several sources.

Table 5: Vital Signs Analysis Data Set

Note that now each measurement is represented by a variable, and each variable can have its own metadata attributes. This data set structure is preferred by most statisticians and can be used directly by most statistical packages.
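
A minimal SAS sketch of that restructuring follows (variable names loosely follow SDTM conventions but are illustrative only): PROC TRANSPOSE converts the one-record-per-measurement vital signs data to one record per subject, after which each new column can carry its own variable-level metadata.

    /* Tabulation-style structure: one record per subject per test code */
    data vs;
       length usubjid $10 vstestcd $8;
       input usubjid $ vstestcd $ vsstresn;
       datalines;
    SUBJ-0001 HEIGHT 162
    SUBJ-0001 WEIGHT 55.3
    SUBJ-0002 HEIGHT 178
    SUBJ-0002 WEIGHT 81.0
    ;
    run;

    proc sort data=vs; by usubjid; run;

    /* Analysis-style structure: one record per subject, one column per test */
    proc transpose data=vs out=advs_vs(drop=_name_);
       by usubjid;
       id vstestcd;          /* each test code becomes a variable name */
       var vsstresn;
    run;

    /* Each new column (HEIGHT, WEIGHT) can now carry its own metadata */
    proc datasets lib=work nolist;
       modify advs_vs;
          label height = "Height (cm)"
                weight = "Weight (kg)";
    quit;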

22.3.4 Item-Level Metadata

Item-level metadata refers to the attributes of an individual data value or cell in a rectangular data matrix. That is, it refers to a value for the crossing of a specific variable (column) and observation (row). Item-level metadata are typically used to describe the quality of that particular measurement, for example, a partial date in which the day portion is missing and is imputed or a lipid measurement in which a frozen sample was accidentally allowed to thaw. In each of these cases, a value exists, but it may be considered to be of lesser quality than a regular measurement.

Because of the complex and expensive nature of clinical trials, it is not always practical to discard such measurements, nor is it scientifically valid to treat them as complete. The identification of these data items is especially important for accurate statistical analysis in regulatory submissions. In a discussion of quality assurance for submission of analysis data sets, one FDA statistical director stated that it was desirable to provide a “clear description of what was done to each data element: edits, imputations, partial missing….” [17].

Historically, the concept of status attributes for data items was used in the 1970s by the Lipid Research Clinics Program to identify missing values and record the editing status of each variable [18]. Clinical research data management systems may also have similar features, but this information is not easily exported to statistical reporting and analysis programs. Currently, the most common method of identifying item-level quality involves the addition of a separate variable that contains the status value. These “status flags” are cumbersome to maintain and can increase the size of data sets considerably. The development of statistical data sets using the Extensible Markup Language (XML) [19] has the potential to provide the more flexible structure required for integrating item-level metadata into clinical data sets, but the tools required do not exist at this time.
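
A hedged sketch of the status-flag approach follows (the flag name and its values are illustrative assumptions, not CDISC terminology): each quality-relevant value carries a companion variable recording whether it was observed as collected or partially imputed. The sketch also makes the drawback plain: every flagged value requires an additional column.

    /* Item-level status captured in a companion flag variable:        */
    /* BRTHDT is the analysis value, BRTHDTFL records its quality.     */
    data adsl_dob;
       length usubjid $10 brthdtfl $12;
       input usubjid $ brthdt :yymmdd10. brthdtfl $;
       format brthdt yymmdd10.;
       label brthdt   = "Date of Birth"
             brthdtfl = "Date of Birth Status Flag";
       datalines;
    SUBJ-0001 1961-04-15 OBSERVED
    SUBJ-0002 1958-06-01 DAYIMPUTED
    ;
    run;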

An audit file system for regulatory statistical reviewers was proposed by an FDA statistical director as a “file describing the changes or edits made during the data management or cleaning of the data”; that is, a file that provides a link or audit trail between the original and submitted data values by providing metadata related to the edits, including the following attributes [20]:

  • Patient, observation, visit, variable, and other identifiers
  • Original and submitted values
  • Qualifiers describing the change such as who, when, and why
  • Edit codes describing the action taken such as empty (not recorded), completed, replaced, confirmed, suspicious but not collectable.

It is interesting to note that many of the edit codes proposed by the FDA director are similar to those of the 1970s system described above. It should also be noted that current data management systems do have audit trails, but they typically cannot extract and submit the data in a form that is usable by a regulatory reviewer.
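
A minimal sketch of what such an audit data set might contain follows (the layout, variable names, values, and edit codes are assumptions based on the attributes listed above, not an FDA specification).

    /* One record per edited data item: identifiers, original and        */
    /* submitted values, who/when/why qualifiers, and an edit code.      */
    data audit;
       length usubjid $10 visit $8 variable $8
              origval $12 subval $12 editedby $8 reason $40 editcode $10;
       format editdt yymmdd10.;

       usubjid = "SUBJ-0001"; visit = "WEEK2"; variable = "SYSBP";
       origval = "1530"; subval = "153";
       editedby = "DMGR01"; editdt = "02MAR2006"d;
       reason = "transcription error confirmed with site";
       editcode = "REPLACED";
       output;

       usubjid = "SUBJ-0002"; visit = "WEEK4"; variable = "DIABP";
       origval = " "; subval = "78";
       editedby = "DMGR02"; editdt = "09MAR2006"d;
       reason = "value not recorded; completed after query";
       editcode = "COMPLETED";
       output;
    run;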

22.4 Analysis Results Metadata

Analysis results metadata define the attributes of a statistical analysis performed on clinical trial data. Analyses may be tables, listings, or figures included in a study report or regulatory submission. Analyses may also be statistical statements in a report, for example, “The sample size required to show a 20% improvement in the primary end point is 200 subjects per treatment arm” or “The active treatment demonstrated a 23% reduction in mortality (p = 0.023) as compared to placebo.”

Analysis results metadata are designed to provide the reader or reviewer with sufficient information to evaluate the analysis performed. Inclusion of such metadata in FDA regulatory submissions was proposed in 2004 [21] and is included in the CDISC Analysis Data Model V 2.0 [22]. By providing this information in a standard format in a predictable location, reviewers can link from a statistical result to metadata that describe the analysis, the reason for performing the analysis, and the data sets and programs used to generate the analysis. Note that analysis results metadata are not part of an analysis data set but that one attribute of analysis results metadata describes the analysis data sets used in the analysis. The following attributes are defined; a brief sketch of one possible representation follows the list.

  • Analysis Name − A unique identifier for this analysis. Tables, figures, and listings may incorporate the name and number (Figure 4 or Table 2.3). Conventions for this name may be sponsor-specific to conform to Standard Operating Procedures (SOPs).
  • Description − Additional text describing the analysis. This field could be used to search for a particular analysis or result.
  • Reason − Planned analyses should be linked to the Statistical Analysis Plan (SAP). Other reasons would include data driven, exploratory, requested by FDA, and so forth.
  • Data set(s) − Names of data sets used in the analysis. If data sets are part of a submission, then a link to the data set location should be provided.
  • Documentation − A description of the statistical methodology, software used for computation, unexpected results, or any other information to provide the reviewer with a clear description of the analysis performed. Links to the SAP, external references, or analysis programs may also be included.
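
As one possible representation (a sketch only; neither FDA nor CDISC mandates this form, and all names, values, and the program path are illustrative), the attributes above could themselves be captured in a small machine-readable data set, one record per analysis, so that reviewers and tools can locate an analysis and its supporting material.

    /* Hypothetical analysis results metadata, one record per analysis. */
    data anres_meta;
       length analysis $12 descript $60 reason $40 datasets $20 document $80;
       analysis = "TABLE14-2-1";
       descript = "Change from baseline in systolic blood pressure";
       reason   = "Planned; SAP Section 9.1";
       datasets = "ADVS, ADSL";
       document = "ANCOVA with baseline as covariate; see program t14_2_1.sas";
       output;
    run;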

22.5 Regulatory Submission Metadata

22.5.1 ICH Electronic Common Technical Document

In addition to the Common Technical Document described earlier, the ICH has also developed a specification for an electronic Common Technical Document (eCTD), thus defining machine-readable metadata for regulatory submissions. This eCTD is defined to serve as an interface for industry-to-agency transfer of regulatory information while at the same time taking into consideration the facilitation of the creation, review, life cycle management, and archiving of the electronic submission [23]. The eCTD uses XML [24] to define the overall structure of the document. The purpose of this XML backbone is twofold: “[1] to manage meta-data for the entire submission and each document within the submission and [2] to constitute a comprehensive table of contents and provide corresponding navigation aids” [25].

Metadata at the submission level include information about the submitting and receiving organizations, the manufacturer, the publisher, the ID and kind of submission, and related data items. Examples of document-level metadata are versioning information, language, descriptive information such as document names, and checksums used to ensure accuracy.

22.5.2 FDA Guidance on eCTD Submissions

The FDA has developed a guidance for electronic submission based on the ICH eCTD backbone. As discussed above, the ICH does not define detailed specifications for submission of clinical data within the eCTD, but it does provide a “place-holder” for such data in guideline E3, “Structure and Content of Clinical Study Reports,” as Appendix 16.4, under the archaic term “INDIVIDUAL PATIENT DATA LISTINGS (US ARCHIVAL LISTINGS)” [26]. The FDA eCTD specifies that submitted data sets should be organized as follows [27]:

Individual Patient Data Listings (CRTs)

  • Data tabulations
– Data tabulations data sets
– Data definitions
– Annotated case report form
  • Data listing
– Data listing data sets
– Data definitions
– Annotated case report form
  • Analysis data sets
– Analysis data sets
– Analysis programs
– Data definitions
– Annotated case report form
  • Subject profiles
  • IND safety reports

The FDA Study Data Specification document [28] defines tabulation and analysis data sets and refers to the CDISC Data Definition Specification (Define.XML) for machine-readable data set metadata in XML [29]. This machine-readable metadata and the ICH eCTD are key elements in providing clear communication of the content and structure of clinical data sets and regulatory submissions. These metadata standards allow the regulatory agencies, software developers, and drug developers to create and use standard tools for creating, displaying, and reviewing electronic submissions and clinical data sets.

The FDA has developed a viewer to use the ICH eCTD backbone to catalog and view the components of a submission, thus providing FDA reviewers with a powerful tool to view, manage, and review submissions. Software developers have used the Define.XML standard metadata to develop tools for compiling and viewing patient profiles and viewing tabulation data sets. SAS Institute has developed software to generate XML-based data sets with Define.XML metadata [30] and viewing tools for review and analysis. These first steps demonstrate the power of having clearly defined metadata for clinical research. The adoption and additional specification of these metadata standards will provide the basis for the development of a new generation of tools for review and analysis. Future developments may include protocol authoring tools, Statistical Analysis Plan templates, eCRF and CRF automated database design, automated analysis and reporting, and submission assembly. This tool development by government, drug developers, and software providers will contribute to drug development and approval by enhancing the clear communication of the content and structure of clinical trial data and documents.
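
As a small, hedged example of this kind of tooling (the sketch uses the generic SAS XML LIBNAME engine documented in [30] to write a data set as XML; producing a complete Define.XML document requires additional, purpose-built tools and is not shown, and the file path and data set name are illustrative), a data set can be exported to an XML file as follows.

    /* Write a SAS data set to an XML file using the generic XML engine. */
    /* Assumes an analysis data set WORK.ADVS already exists.            */
    libname xmlout xml "C:\submission\advs.xml";

    data xmlout.advs;      /* creates advs.xml at the location above */
       set work.advs;
    run;

    libname xmlout clear;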

References

[1] U.S. Trademark Registration No. 1,409,260.

[2] Mars Climate Orbiter Mishap Investigation Board Phase I Report, 1999:13. Available: ftp.hq.nasa.gov/pub/pao/reports/1999/MCO_report.pdf.

[3] The Federal Geographic Data Committee. Available: www.fgdc.gov/.

[4] U.S. Geological Survey, Metadata in Plain Language. Available: geology.usgs.gov/tools/metadata/tools/doc/etc/.

[5] International Conference on Harmonisation. Available: www.ich.org.

[6] International Conference on Harmonisation, Organization of the Common Technical Document for the Registration of Pharmaceuticals for Human Uses M4, 2004. Available: www.ich.org/LOB/media/MEDIA554.pdf.

[7] Providing Regulatory Submissions in Electronic Format—NDAs, FDA Guidance, 1999.

[8] D. H. Christiansen and W. Kubick, CDISC Submission Metadata Model. 2001. Available: www.cdisc.org/standards/Submission Metadata ModelV2.pdf.

[9] FDA Study Data Specification, 2006. Available: www.fda.gov/cder/regulatory/ersr/Studydata-v1.3.pdf.

[10] CDISC Study Data Tabulation Model Version 1.1, 2005. Available: www.cdisc.org/models/sds/v3.1/index.html.

[11] CDISC Analysis Data Model: Version 2.0, 2006. Available: www.cdisc.org/pdf/ADaMdocument_v2.0_2_Final_2006-08-24.pdf.

[12] FDA Future Guidance List, p. 2. Available: www.fda.gov/cder/guidance/CY06.pdf.

[13] Providing Regulatory Submissions in Electronic Format—Human Pharmaceutical Product Applications and Related Submissions Using the eCTD Specifications, FDA Guidance, 2006. Available: www.fda.gov/cder/guidance/7087rev.pdf.

[14] SAS® 9.1.3 Language Reference: Concepts. Cary, NC: SAS Institute Inc., 2005, pp. 475–476.

[15] JANUS Project Description, NCI Clinical Research Information Exchange, 2006. Available: crix.nci.nih.gov/projects/janus/.

[16] D. H. Christiansen and S. E. Wilson, Submission of analysis data sets and documentation: scientific and regulatory perspectives. PharmaSUG 2004 Conference Proc., San Diego, CA, 2004: Paper FC04, p. 3.

[17] S. E. Wilson, Submission of analysis datasets and documentation: regulatory perspectives. PharmaSUG 2004 Conference Proc., San Diego, CA, 2004.

[18] W. C. Smith, Correction of data errors in a large collaborative health study. Joint Statistical Meetings presentation, Atlanta, GA, 1975.

[19] SAS® 9.1.3 XML LIBNAME Engine: User’s Guide. Cary, NC: SAS Institute Inc., 2004.

[20] S. E. Wilson, Clinical data quality: a regulator’s perspective. DIA 38th Annual Meeting presentation, Chicago, IL, 2002: 20–21. Available: www.fda.gov/cder/present/DIA62002/default.htm.

[21] D. H. Christiansen and S. E. Wilson, Submission of analysis datasets and documentation: scientific and regulatory perspectives. PharmaSUG 2004 Conference Proc., San Diego, CA, 2004: Paper FC04, p. 5.

[22] CDISC Analysis Data Model: Version 2.0, 2006, p. 22. Available: www.cdisc.org/pdf/ADaMdocument_v2.0_2_Final_2006-08-24.pdf.

[23] ICH M2 EWG Electronic Common Technical Document Specification V 3.2, 2004, p. 1. Available: http://estri.ich.org/eCTD/eCTD_Specification_v3-2.pdf.

[24] World Wide Web Consortium (W3C) Extensible Markup Language (XML). Available: http://www.w3.org/XML/.

[25] ICH M2 EWG Electronic Common Technical Document Specification V 3.2. 2004: Appendix 1, p. 1-1. Available: http://estri.ich.org/eCTD/eCTD-Specification_v3_2.pdf.

[26] ICH Structure and Content of Clinical Study Reports E3, p. 29. Available: http://www.ich.org/LOB/media/MEDIA479.pdf.

[27] Providing Regulatory Submissions in Electronic Format—Human Pharmaceutical Product Applications and Related Submissions Using the eCTD Specifications, FDA Guidance, 2006. Available: http://www.fda.gov/cder/guidance/7087rev.pdf.

[28] FDA Study Specifications. Available: http://www.fda.gov/cder/regulatory/ersr/Studydata-v1.3.pdf.

[29] CDISC Case Report Tabulation Data Definition Specification (define.xml). Available: http://www.cdisc.org/models/def/v1.0/CRT_DDSpecification1_0_0.pdf.

[30] SAS® 9.1.3 XML LIBNAME Engine: User’s Guide. Cary NC: SAS Institute Inc., 2004, pp. 27, 39–43.
