To learn from data, we must be able to understand and manage data in all forms. Data originates from many different sources, and consequently, datasets may differ widely in structure or have little or no structure at all. In this section, we present a high-level classification of datasets with commonly occurring examples.
Based on their structure, or the lack thereof, datasets may be classified as containing the following:
Three genomic sequences are considered to show that the subsequences CGGGT and TTGAAAGTGGTG are repeated in all three.
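The idea of a subsequence recurring across several genomic sequences can be illustrated with a minimal Python sketch. The three sequences below are made up for illustration only and are not the ones referred to in the text. The helper names `find_motif` and `shared_motifs` are likewise hypothetical. Only the two motifs, CGGGT and TTGAAAGTGGTG, come from the text.

```python
def find_motif(sequence, motif):
    """Return the 0-based start positions of every occurrence of motif,
    including overlapping ones."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


def shared_motifs(sequences, motifs):
    """Return the motifs that occur at least once in every sequence."""
    return [m for m in motifs if all(m in seq for seq in sequences.values())]


# The two motifs named in the text.
MOTIFS = ["CGGGT", "TTGAAAGTGGTG"]

# Hypothetical sequences, invented for this sketch.
SEQUENCES = {
    "seq1": "AACGGGTACTTGAAAGTGGTGCA",
    "seq2": "TTGAAAGTGGTGGGCGGGTTA",
    "seq3": "GCGGGTCCGGGTATTGAAAGTGGTG",
}

if __name__ == "__main__":
    for name, seq in SEQUENCES.items():
        for motif in MOTIFS:
            print(name, motif, find_motif(seq, motif))
    print("Motifs present in all sequences:", shared_motifs(SEQUENCES, MOTIFS))
```

Plain string search is enough here because the motifs are matched exactly. Approximate matching, which real genomic analysis often requires, would call for a different technique.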