The numpy library¹ gives Python the ability to work with matrices and arrays.
1. https://numpy.org/doc/stable/
import numpy as np
Pandas started off as an extension to numpy.ndarray that provided more features suitable for data analysis. Since then, Pandas has evolved to the point that it shouldn’t be thought of as a collection of numpy arrays, because the two libraries now behave differently.
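One way to see the difference is in how arithmetic works. The following is a minimal sketch with made-up values (the names left and right are illustrative only): numpy adds arrays purely by position, while Pandas first matches values by index label.

```python
import numpy as np
import pandas as pd

# Illustrative values only; not taken from the data sets used elsewhere.
left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([10, 20, 30], index=['c', 'b', 'a'])

# The underlying numpy arrays are added element by element, by position.
by_position = left.values + right.values

# The Series are added after matching index labels:
# a: 1 + 30, b: 2 + 20, c: 3 + 10
by_label = left + right

print(by_position)
print(by_label)
```

The two results contain different numbers even though the inputs hold the same values, which is why a Series is more than a labeled wrapper around an array.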
import pandas as pd
df = pd.read_csv('data/concat_1.csv')
print(df)
    A   B   C   D
0  a0  b0  c0  d0
1  a1  b1  c1  d1
2  a2  b2  c2  d2
3  a3  b3  c3  d3
If you do need to get the numpy.ndarray values from a Series or DataFrame, you can use the values attribute.
a = df['A']
print(a)
0    a0
1    a1
2    a2
3    a3
Name: A, dtype: object
print(type(a))
<class 'pandas.core.series.Series'>
print(a.values)
['a0' 'a1' 'a2' 'a3']
print(type(a.values))
<class 'numpy.ndarray'>
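The same attribute works on a whole DataFrame. The sketch below builds a small frame inline so that it runs without data/concat_1.csv; the two columns are assumed to mirror the output shown earlier. It also calls to_numpy(), which the Pandas documentation recommends over values in recent versions.

```python
import numpy as np
import pandas as pd

# Inline stand-in for part of the example data (assumed to mirror concat_1.csv).
df = pd.DataFrame({
    'A': ['a0', 'a1', 'a2', 'a3'],
    'B': ['b0', 'b1', 'b2', 'b3'],
})

# A DataFrame converts to a two-dimensional ndarray: one row per
# DataFrame row, one column per DataFrame column.
arr = df.values
print(type(arr))
print(arr.shape)

# to_numpy() returns the same data as the values attribute.
arr2 = df.to_numpy()
print(np.array_equal(arr, arr2))
```

Note that the index and column labels are not carried over; only the cell values end up in the array.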
This is particularly helpful after cleaning data in Pandas. You can then use your newly cleaned data in other Python libraries that do not fully support the Series and DataFrame objects. The Software Carpentry Python Inflammation lesson² uses numpy and can be another good reference to learn about the library and Python as a whole.
2. https://swcarpentry.github.io/python-novice-inflammation/
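As a rough sketch of that clean-then-export workflow, using hypothetical readings invented for illustration: the missing value is dropped in Pandas, and the remaining column is handed to numpy as a plain array.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with one missing value.
raw = pd.DataFrame({'reading': [1.0, 2.0, None, 4.0]})

# Clean in Pandas: drop the row with the missing reading.
clean = raw.dropna(subset=['reading'])

# Extract the cleaned column as a plain numpy.ndarray.
readings = clean['reading'].values

print(type(readings))
print(np.mean(readings))
```

Any library that accepts a numpy.ndarray can take readings from this point on, without needing to know about Pandas.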