Summary

We stored annual sunspots cycles data in different relational and NoSQL databases.

The term relational here does not just pertain to relationships between tables; firstly, it has to do with the relationship between columns inside a table; secondly, it relates to connections between tables.

The sqlite3 module in the standard Python distribution can be used to work with a SQLite database. We can give pandas a SQLite database connection or a SQLAlchemy connection.

SQLAlchemy is renowned for its ORM, based on a design pattern, where Python classes are mapped to database tables. The ORM pattern is a general architectural pattern applicable to other object-oriented programming languages. SQLAlchemy abstracts away the technical details of working with databases including writing SQL.

MongoDB is a document-based store, which can hold a huge amount of data.

In the in-memory mode, Redis is extremely fast, with writing and reading being almost equally fast. Redis is a key-value store that functions similarly to a Python dictionary.

Apache Cassandra mixes features of key-value and traditional relational databases. It is column oriented and its columns are organized into families, which are the equivalent of tables in relational databases. Rows in Apache Cassandra are not tied to a particular set of columns.

The next chapter, Chapter 9, Analyzing Textual Data and Social Media, describes analysis techniques for plain text data. Plain text data is found in many organizations and on the Internet. Generally, plain text data is very unstructured and requires a different approach than data that has been tabulated and cleaned. For the analysis, we will use NLTK—an open source Python package. NLTK is very comprehensive and comes with its own datasets.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.138.105.215