Unstructured data

Any documents, including clinical documentation, personal emails, progress notes, and business reports, that are composed primarily of unstructured text with little or no predefined structure or metadata describing the content of the document can be classified as unstructured data. Unstructured data is different from structured data in the sense that its structure is unpredictable.

Examples of unstructured data include documents, emails, blogs, reports, minutes, notes, digital images, videos, and satellite imagery. It also incorporates some data produced by machines or sensors. In fact, unstructured data accounts for the majority of the data that’s on any company’s premises, as well as external to any company in online private and public sources, such as LinkedIn, Twitter, Snapchat, Instagram, Dribble, YouTube, and Facebook.

Sometimes, metadata tags are attached to give information and context about the content of the data. The data with meta information is referred to as semi-structured. The line between unstructured and semi-structured data is not absolute, though some data-management consultants dispute that all data, even the unstructured kind, has some degree of structure:

Figure 3.4: Different types of unstructured data sources
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.171.58