Variety refers to the different formats in which data arrives. Relational databases, Excel files, and even simple text files are all examples of different data formats. A system should be capable of handling new varieties of data as and when they arrive, so extensibility is a key property of a data-intensive system: supporting a new format should not require reworking the handling of existing ones. Data variety can be broadly classified into three major categories:
- Structured: Data that has a well-defined schema associated with it, for example, relational data and XML-formatted data.
- Semi-structured: Data whose structure can be anticipated but that does not always conform to a fixed schema. Examples include JSON-formatted data and columnar data.
- Unstructured: Data with no predefined schema, typically stored as binary large objects (BLOBs), for example, video and audio.
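
One way to picture the extensibility described above is a small registry of format readers, with one reader per variety of data. The following is a minimal sketch, not a prescribed design: the names `READERS`, `register`, and `ingest` are hypothetical, and CSV is used here only as a convenient stand-in for structured data.

```python
import csv
import io
import json
from typing import Any, Callable, Dict

# Hypothetical registry: maps a format name to a function that parses bytes.
READERS: Dict[str, Callable[[bytes], Any]] = {}


def register(fmt: str):
    """Decorator that adds a reader for the given format name."""
    def wrap(func: Callable[[bytes], Any]) -> Callable[[bytes], Any]:
        READERS[fmt] = func
        return func
    return wrap


@register("csv")
def read_csv(payload: bytes):
    # Structured (for this sketch): every row follows the header's columns.
    return list(csv.DictReader(io.StringIO(payload.decode("utf-8"))))


@register("json")
def read_json(payload: bytes):
    # Semi-structured: the keys present may differ from record to record.
    return json.loads(payload.decode("utf-8"))


@register("blob")
def read_blob(payload: bytes):
    # Unstructured: keep the raw bytes and record only their size.
    return {"size_bytes": len(payload), "data": payload}


def ingest(fmt: str, payload: bytes):
    """Dispatch the payload to whichever reader is registered for fmt."""
    try:
        reader = READERS[fmt]
    except KeyError:
        raise ValueError(f"no reader registered for format {fmt!r}") from None
    return reader(payload)
```

Under these assumptions, supporting a new variety of data means registering one more reader function; `ingest` and the existing readers stay unchanged.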