Label, integer, and mixed indexing

In addition to the standard indexing operator [] and attribute operator, there are operators provided in pandas to make the job of indexing easier and more convenient. By label indexing, we generally mean indexing by a header name, which tends to be a string value in most cases. These operators are as follows:

  • The .loc operator: It allows label-oriented indexing
  • The .iloc operator: It allows integer-based indexing
  • The .ix operator: It allows mixed label and integer-based indexing

We will now turn our attention to these operators.

Label-oriented indexing

The .loc operator supports pure label-based indexing. It accepts the following as valid inputs:

  • A single label such as ['March'], [88] or ['Dubai']. Note that in the case where the label is an integer, it doesn't refer to the integer position of the index, but to the integer itself as a label.
  • List or array of labels, for example, ['Dubai','UK Brent'].
  • A slice object with labels, for example, 'May':'Aug'.
  • A Boolean array.

For our illustrative dataset, we use the average snowy weather temperature data for New York city from the following:

Create DataFrame

In [723]: NYC_SnowAvgsData={'Months' :
['January','February','March', 
'April', 'November', 'December'],
'Avg SnowDays' : [4.0,2.7,1.7,0.2,0.2,2.3],
'Avg Precip. (cm)' : [17.8,22.4,9.1,1.5,0.8,12.2],
'Avg Low Temp. (F)' : [27,29,35,45,42,32] }
In [724]: NYC_SnowAvgsData
Out[724]:{'Avg Low Temp. (F)': [27, 29, 35, 45, 42, 32],
          'Avg Precip. (cm)': [17.8, 22.4, 9.1, 1.5, 0.8, 12.2],
          'Avg SnowDays': [4.0, 2.7, 1.7, 0.2, 0.2, 2.3],
          'Months': ['January', 'February', 'March', 'April',   
                     'November', 'December']}

In [726]:NYC_SnowAvgs=pd.DataFrame(NYC_SnowAvgsData,      
                      index=NYC_SnowAvgsData['Months'], 
                      columns=['Avg SnowDays','Avg Precip. (cm)',   
                               'Avg Low Temp. (F)'])
         NYC_SnowAvgs

Out[726]:        Avg SnowDays   Avg Precip. (cm) Avg Low Temp. (F)
        January  4.0            17.8             27
        February 2.7            22.4             29
        March    1.7            9.1              35
        April    0.2            1.5              45
        November 0.2            0.8              42
        December 2.3            12.2             32

Using a single label:

In [728]: NYC_SnowAvgs.loc['January']
Out[728]: Avg SnowDays          4.0
          Avg Precip. (cm)     17.8
          Avg Low Temp. (F)    27.0
          Name: January, dtype: float64

Using a list of labels:

In [730]: NYC_SnowAvgs.loc[['January','April']]
Out[730]:        Avg SnowDays   Avg Precip. (cm) Avg Low Temp. (F)
        January  4.0            17.8             27
        April    0.2            1.5              45

Using a label range:

In [731]: NYC_SnowAvgs.loc['January':'March']
Out[731]:        Avg SnowDays   Avg Precip. (cm) Avg Low Temp. (F)
        January  4.0            17.8             27
        February 2.7            22.4             29
        March    1.7            9.1              35

Note that while using the .loc, .iloc, and .ix operators on a DataFrame, the row index must always be specified first. This is the opposite of the [] operator, where only columns can be selected directly. Hence, we get an error if we do the following:

In [771]: NYC_SnowAvgs.loc['Avg SnowDays']

KeyError: 'Avg SnowDays'

The correct way to do this is to specifically select all rows by using the colon (:) operator as follows:

In [772]: NYC_SnowAvgs.loc[:,'Avg SnowDays']
Out[772]: January     4.0
          February    2.7
          March       1.7
          April       0.2
          November    0.2
          December    2.3
          Name: Avg SnowDays, dtype: float64

Here, we see how to select a specific coordinate value, namely the average number of snow days in March:

In [732]: NYC_SnowAvgs.loc['March','Avg SnowDays']
Out[732]: 1.7

This alternative style is also supported:

In [733]: NYC_SnowAvgs.loc['March']['Avg SnowDays']
Out[733]: 1.7

The following is the equivalent of the preceding case using the square bracket operator []:

In [750]: NYC_SnowAvgs['Avg SnowDays']['March']
Out[750]: 1.7

Note again, however, that specifying the row index value first as is done with the .loc operator will result in Keyerror. This is a consequence of the fact discussed previously, that the [] operator cannot be used to select rows directly. The columns must be selected first to obtain a Series, which can then be selected by rows. Thus, you will get KeyError: u'no item named March' if you use either of the following:

In [757]: NYC_SnowAvgs['March']['Avg SnowDays']

Or

In [758]: NYC_SnowAvgs['March']

We can use the .loc operator to select the rows instead:

In [759]: NYC_SnowAvgs.loc['March']
Out[759]: Avg SnowDays          1.7
          Avg Precip. (cm)      9.1
          Avg Low Temp. (F)    35.0
          Name: March, dtype: float64

Selection using a Boolean array

Now, we will show how to select which months have less than one snow day on average by using a Boolean array:

In [763]: NYC_SnowAvgs.loc[NYC_SnowAvgs['Avg SnowDays']<1,:]
Out[763]:         Avg SnowDays  Avg Precip. (cm) Avg Low Temp. (F)
        April     0.2           1.5              45
        November  0.2           0.8              42

Or, in the case of the spot crude prices mentioned earlier, select the column corresponding to the brand of crude that was priced above 110 a barrel for row 2013-Q1:

In [768]: SpotCrudePrices_2013.loc[:,SpotCrudePrices_2013.loc['2013-Q1']>110]
Out[768]:        UK_Brent
         2013-Q1 112.9
         2013-Q2 103.0
         2013-Q3 110.1
         2013-Q4 109.4

Note that the preceding arguments involve the Boolean operators < and > that actually evaluate the Boolean arrays, for example:

In [769]: SpotCrudePrices_2013.loc['2013-Q1']>110
Out[769]: Dubai                    False
          UK_Brent                 True
          West_Texas_Intermediate  False
          Name: 2013-Q1, dtype: bool

Integer-oriented indexing

The .iloc operator supports integer-based positional indexing. It accepts the following as inputs:

  • A single integer, for example, 7
  • A list or array of integers, for example, [2,3]
  • A slice object with integers, for example, 1:4

Let us create the following:

In [777]: import scipy.constants as phys
          import math
In [782]: sci_values=pd.DataFrame([[math.pi, math.sin(math.pi), 
                                    math.cos(math.pi)],
                                   [math.e,math.log(math.e), 
                                    phys.golden],
                                   [phys.c,phys.g,phys.e],
                                   [phys.m_e,phys.m_p,phys.m_n]],
                                   index=list(range(0,20,5)))

Out[782]:             0               1               2
                0     3.141593e+00    1.224647e-16   -1.000000e+00
                5     2.718282e+00    1.000000e+00    1.618034e+00
                10    2.997925e+08    9.806650e+00    1.602177e-19
                15    9.109383e-31    1.672622e-27    1.674927e-27

We can select the non-physical constants in the first two rows by using integer slicing:

In [789]: sci_values.iloc[:2]
Out[789]:               0       1        2
        0        3.141593  1.224647e-16 -1.000000
        5        2.718282  1.000000e+00  1.618034

Alternatively, we can use the speed of light and the acceleration of gravity in the third row:

In [795]: sci_values.iloc[2,0:2]
Out[795]: 0    2.997925e+08
          1    9.806650e+00
          dtype: float64

Note that the arguments to .iloc are strictly positional and have nothing to do with the index values. Hence, consider a case where we mistakenly think that we can obtain the third row by using the following:

In [796]: sci_values.iloc[10]
          ------------------------------------------------------
          IndexError                                Traceback (most recent call last)
          ...
          IndexError: index 10 is out of bounds for axis 0 with size 4

Here, we get IndexError in the preceding result; so, now, we should use the label-indexing operator .loc instead, as follows:

In [797]: sci_values.loc[10]
Out[797]: 0    2.997925e+08
          1    9.806650e+00
          2    1.602177e-19
          Name: 10, dtype: float64

To slice out a specific row, we can use the following:

In [802]: sci_values.iloc[2:3,:]
Out[802]:     0          1       2
         10   299792458  9.80665 1.602177e-19

To obtain a cross-section using an integer position, use the following:

In [803]: sci_values.iloc[3]
Out[803]: 0    9.109383e-31
          1    1.672622e-27
          2    1.674927e-27
          Name: 15, dtype: float64

If we attempt to slice past the end of the array, we obtain IndexError as follows:

In [805]: sci_values.iloc[6,:]
          --------------------------------------------------------
          IndexError                                Traceback (most recent call last)
          IndexError: index 6 is out of bounds for axis 0 with size 4

The .iat and .at operators

The .iat and .at operators can be used for a quick selection of scalar values. This is best illustrated as follows:

In [806]: sci_values.iloc[3,0]
Out[806]: 9.1093829099999999e-31
In [807]: sci_values.iat[3,0]
Out[807]: 9.1093829099999999e-31

In [808]: %timeit sci_values.iloc[3,0]
          10000 loops, best of 3: 122 μs per loop
In [809]: %timeit sci_values.iat[3,0]
          10000 loops, best of 3: 28.4 μs per loop

Thus, we can see that .iat is much faster than the .iloc/.ix operators. The same applies to .at versus .loc.

Mixed indexing with the .ix operator

The .ix operator behaves like a mixture of the .loc and .iloc operators, with the .loc behavior taking precedence. It takes the following as possible inputs:

  • A single label or integer
  • A list of integers or labels
  • An integer slice or label slice
  • A Boolean array

Let us re-create the following DataFrame by saving the stock index closing price data to a file (stock_index_closing.csv) and reading it in:

TradingDate,Nasdaq,S&P 500,Russell 2000
2014/01/30,4123.13,1794.19,1139.36
2014/01/31,4103.88,1782.59,1130.88
2014/02/03,3996.96,1741.89,1094.58
2014/02/04,4031.52,1755.2,1102.84
2014/02/05,4011.55,1751.64,1093.59
2014/02/06,4057.12,1773.43,1103.93

The source for this data is http://www.economagic.com/sp.htm#Daily. Here's how we read the CSV data into a DataFrame:

In [939]: stockIndexDataDF=pd.read_csv('./stock_index_data.csv')
In [940]: stockIndexDataDF
Out[940]:        TradingDate  Nasdaq   S&P 500  Russell 2000
        0        2014/01/30   4123.13  1794.19  1139.36
        1        2014/01/31   4103.88  1782.59  1130.88
        2        2014/02/03   3996.96  1741.89  1094.58
        3        2014/02/04   4031.52  1755.20  1102.84
        4        2014/02/05   4011.55  1751.64  1093.59
        5        2014/02/06   4057.12  1773.43  1103.93

What we see from the preceding example is that the DataFrame created has an integer-based row index. We promptly set the index to be the trading date to index it based on the trading date so that we can use the .ix operator:

In [941]: stockIndexDF=stockIndexDataDF.set_index('TradingDate')
In [942]:stockIndexDF
Out[942]:               Nasdaq   S&P 500  Russell 2000
        TradingDate                     
        2014/01/30      4123.13  1794.19  1139.36
        2014/01/31      4103.88  1782.59  1130.88
        2014/02/03      3996.96  1741.89  1094.58
        2014/02/04      4031.52  1755.20  1102.84
        2014/02/05      4011.55  1751.64  1093.59
        2014/02/06      4057.12  1773.43  1103.93

We now show examples of using the .ix operator:

Using a single label:

In [927]: stockIndexDF.ix['2014/01/30']
Out[927]: Nasdaq          4123.13
          S&P 500         1794.19
          Russell 2000    1139.36
          Name: 2014/01/30, dtype: float64

Using a list of labels:

In [928]: stockIndexDF.ix[['2014/01/30']]
Out[928]:           Nasdaq   S&P 500  Russell 2000
        2014/01/30  4123.13  1794.19  1139.36

In [930]: stockIndexDF.ix[['2014/01/30','2014/01/31']]
Out[930]:           Nasdaq  S&P 500  Russell 2000
        2014/01/30  4123.13    1794.19  1139.36
        2014/01/31  4103.88    1782.59  1130.88

Note the difference in the output between using a single label versus using a list containing just a single label. The former results in a series and the latter, a DataFrame:

In [943]: type(stockIndexDF.ix['2014/01/30'])
Out[943]: pandas.core.series.Series

In [944]: type(stockIndexDF.ix[['2014/01/30']])
Out[944]: pandas.core.frame.DataFrame

For the former, the indexer is a scalar; for the latter, the indexer is a list. A list indexer is used to select multiple columns. A multi-column slice of a DataFrame can only result in another DataFrame since it is 2D; hence, what is returned in the latter case is a DataFrame.

Using a label-based slice:

In [932]: tradingDates=stockIndexDataDF.TradingDate
In [934]: stockIndexDF.ix[tradingDates[:3]]
Out[934]:               Nasdaq    S&P 500  Russell 2000
        2014/01/30       4123.13  1794.19  1139.36
        2014/01/31       4103.88  1782.59  1130.88
        2014/02/03       3996.96  1741.89  1094.58

Using a single integer:

In [936]: stockIndexDF.ix[0]
Out[936]: Nasdaq          4123.13
          S&P 500         1794.19
          Russell 2000    1139.36
          Name: 2014/01/30, dtype: float64

Using a list of integers:

In [938]: stockIndexDF.ix[[0,2]]
Out[938]:               Nasdaq   S&P 500  Russell 2000
        TradingDate                     
        2014/01/30      4123.13  1794.19  1139.36
        2014/02/03      3996.96  1741.89  1094.58

Using an integer slice:

In [947]: stockIndexDF.ix[1:3]
Out[947]:               Nasdaq   S&P 500  Russell 2000
        TradingDate                     
        2014/01/31      4103.88  1782.59  1130.88
        2014/02/03      3996.96  1741.89  1094.58

Using a Boolean array:

In [949]: stockIndexDF.ix[stockIndexDF['Russell 2000']>1100]
Out[949]:               Nasdaq  S&P 500  Russell 2000
        TradingDate                     
        2014/01/30      4123.13 1794.19  1139.36
        2014/01/31      4103.88 1782.59  1130.88
        2014/02/04      4031.52 1755.20  1102.84
        2014/02/06      4057.12 1773.43  1103.93

As in the case of .loc, the row index must be specified first for the .ix operator.

MultiIndexing

We now turn to the topic of MultiIndexing. Multi-level or hierarchical indexing is useful because it enables the pandas user to select and massage data in multiple dimensions by using data structures such as Series and DataFrame. In order to start, let us save the following data to a file: stock_index_prices.csv and read it in:

TradingDate,PriceType,Nasdaq,S&P 500,Russell 2000
2014/02/21,open,4282.17,1841.07,1166.25
2014/02/21,close,4263.41,1836.25,1164.63
2014/02/21,high,4284.85,1846.13,1168.43
2014/02/24,open,4273.32,1836.78,1166.74
2014/02/24,close,4292.97,1847.61,1174.55
2014/02/24,high,4311.13,1858.71,1180.29
2014/02/25,open,4298.48,1847.66,1176
2014/02/25,close,4287.59,1845.12,1173.95
2014/02/25,high,4307.51,1852.91,1179.43
2014/02/26,open,4300.45,1845.79,1176.11
2014/02/26,close,4292.06,1845.16,1181.72
2014/02/26,high,4316.82,1852.65,1188.06
2014/02/27,open,4291.47,1844.9,1179.28
2014/02/27,close,4318.93,1854.29,1187.94
2014/02/27,high,4322.46,1854.53,1187.94
2014/02/28,open,4323.52,1855.12,1189.19
2014/02/28,close,4308.12,1859.45,1183.03
2014/02/28,high,4342.59,1867.92,1193.5

In [950]:sharesIndexDataDF=pd.read_csv('./stock_index_prices.csv')
In [951]: sharesIndexDataDF
Out[951]:
  TradingDate  PriceType  Nasdaq     S&P 500  Russell 2000
0   2014/02/21   open     4282.17  1841.07  1166.25
1   2014/02/21   close     4263.41  1836.25  1164.63
2   2014/02/21   high     4284.85  1846.13  1168.43
3   2014/02/24   open     4273.32  1836.78  1166.74
4   2014/02/24   close     4292.97  1847.61  1174.55
5   2014/02/24   high      4311.13  1858.71  1180.29
6   2014/02/25   open     4298.48  1847.66  1176.00
7   2014/02/25   close     4287.59  1845.12  1173.95
8   2014/02/25   high     4307.51  1852.91  1179.43
9   2014/02/26   open     4300.45  1845.79  1176.11
10   2014/02/26   close     4292.06  1845.16  1181.72
11   2014/02/26   high     4316.82  1852.65  1188.06
12   2014/02/27   open     4291.47  1844.90  1179.28
13   2014/02/27   close     4318.93  1854.29  1187.94
14   2014/02/27   high     4322.46  1854.53  1187.94
15   2014/02/28   open     4323.52  1855.12 1189.19
16   2014/02/28   close     4308.12  1859.45 1183.03
17   2014/02/28   high     4342.59  1867.92 1193.50

Here, we create a MultiIndex from the trading date and priceType columns:

In [958]: sharesIndexDF=sharesIndexDataDF.set_index(['TradingDate','PriceType'])
In [959]: mIndex=sharesIndexDF.index; mIndex
Out[959]: MultiIndex
    [(u'2014/02/21', u'open'), (u'2014/02/21', u'close'), (u'2014/02/21', u'high'), (u'2014/02/24', u'open'), (u'2014/02/24', u'close'), (u'2014/02/24', u'high'), (u'2014/02/25', u'open'), (u'2014/02/25', u'close'), (u'2014/02/25', u'high'), (u'2014/02/26', u'open'), (u'2014/02/26', u'close'), (u'2014/02/26', u'high'), (u'2014/02/27', u'open'), (u'2014/02/27', u'close'), (u'2014/02/27', u'high'), (u'2014/02/28', u'open'), (u'2014/02/28', u'close'), (u'2014/02/28', u'high')]

In [960]: sharesIndexDF
Out[960]:              Nasdaq  S&P 500   Russell 2000
TradingDate PriceType
2014/02/21  open    4282.17  1841.07  1166.25
            close   4263.41  1836.25  1164.63
            high    4284.85  1846.13  1168.43
2014/02/24  open    4273.32  1836.78  1166.74
            close   4292.97  1847.61  1174.55
            high    4311.13  1858.71  1180.29
2014/02/25  open    4298.48  1847.66  1176.00
            close   4287.59  1845.12  1173.95
            high    4307.51  1852.91  1179.43
2014/02/26  open    4300.45  1845.79  1176.11
            close   4292.06  1845.16  1181.72
            high    4316.82  1852.65  1188.06
2014/02/27  open    4291.47  1844.90  1179.28
            close   4318.93  1854.29  1187.94
            high    4322.46  1854.53  1187.94
2014/02/28  open    4323.52  1855.12  1189.19
            close   4308.12  1859.45  1183.03
            high    4342.59  1867.92  1193.50

Upon inspection, we see that the MultiIndex consists of a list of tuples. Applying the get_level_values function with the appropriate argument produces a list of the labels for each level of the index:

In [962]: mIndex.get_level_values(0)
Out[962]: Index([u'2014/02/21', u'2014/02/21', u'2014/02/21', u'2014/02/24', u'2014/02/24', u'2014/02/24', u'2014/02/25', u'2014/02/25', u'2014/02/25', u'2014/02/26', u'2014/02/26', u'2014/02/26', u'2014/02/27', u'2014/02/27', u'2014/02/27', u'2014/02/28', u'2014/02/28', u'2014/02/28'], dtype=object)

In [963]: mIndex.get_level_values(1)
Out[963]: Index([u'open', u'close', u'high', u'open', u'close', u'high', u'open', u'close', u'high', u'open', u'close', u'high', u'open', u'close', u'high', u'open', u'close', u'high'], dtype=object)

However, IndexError will be thrown if the value passed to get_level_values() is invalid or out of range:

In [88]: mIndex.get_level_values(2)
         ---------------------------------------------------------
IndexError                      Traceback (most recent call last)
...

You can achieve hierarchical indexing with a MultiIndexed DataFrame:

In [971]: sharesIndexDF.ix['2014/02/21']
Out[971]:       Nasdaq   S&P 500	Russell 2000
  PriceType
  open       4282.17  1841.07  1166.25
  close       4263.41  1836.25  1164.63
  high       4284.85  1846.13  1168.43

In [976]: sharesIndexDF.ix['2014/02/21','open']
Out[976]: Nasdaq          4282.17
    S&P 500         1841.07
    Russell 2000    1166.25
    Name: (2014/02/21, open), dtype: float64 

We can slice using a MultiIndex:

In [980]: sharesIndexDF.ix['2014/02/21':'2014/02/24']
Out[980]:      Nasdaq   S&P 500   Russell 2000
  TradingDate  PriceType
  2014/02/21   open  4282.17   1841.07   1166.25
         close  4263.41   1836.25   1164.63
         high  4284.85   1846.13   1168.43
  2014/02/24   open  4273.32   1836.78   1166.74
         close  4292.97   1847.61   1174.55
         high  4311.13   1858.71   1180.29

We can try slicing at a lower level:

In [272]:
sharesIndexDF.ix[('2014/02/21','open'):('2014/02/24','open')]
------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-272-65bb3364d980> in <module>()
----> 1 sharesIndexDF.ix[('2014/02/21','open'):('2014/02/24','open')]
...
KeyError: 'Key length (2) was greater than MultiIndex lexsort depth (1)'

However, this results in KeyError with a rather strange error message. The key lesson to be learned here is that the current incarnation of MultiIndex requires the labels to be sorted for the lower-level slicing routines to work correctly.

In order to do this, you can utilize the sortlevel() method, which sorts the labels of an axis within a MultiIndex. To be on the safe side, sort first before slicing with a MultiIndex. Thus, we can do the following:

In [984]: sharesIndexDF.sortlevel(0).ix[('2014/02/21','open'):('2014/02/24','open')]
Out[984]:          Nasdaq    S&P 500  Russell 2000
  TradingDate  PriceType
  2014/02/21   open      4282.17   1841.07   1166.25
  2014/02/24   close     4292.97   1847.61   1174.55
               high      4311.13   1858.71   1180.29
               open      4273.32   1836.78   1166.74

We can also pass a list of tuples:

In [985]: sharesIndexDF.ix[[('2014/02/21','close'),('2014/02/24','open')]]
Out[985]:      Nasdaq  S&P 500  Russell 2000
  TradingDate  PriceType
  2014/02/21   close  4263.41  1836.25  1164.63
  2014/02/24   open  4273.32  1836.78  1166.74
  2 rows × 3 columns

Note

Note that by specifying a list of tuples, instead of a range as in the previous example, we display only the values of the open PriceType rather than all three for the TradingDate 2014/02/24.

Swapping and reordering levels

The swaplevel function enables levels within the MultiIndex to be swapped:

In [281]: swappedDF=sharesIndexDF[:7].swaplevel(0, 1, axis=0)
            swappedDF
Out[281]:        Nasdaq    S&P 500  Russell 2000
  PriceType  TradingDate
  open	     2014/02/21   4282.17  1841.07  1166.25
  close    2014/02/21    4263.41  1836.25  1164.63
  high     2014/02/21    4284.85  1846.13  1168.43
  open     2014/02/24    4273.32  1836.78  1166.74
  close    2014/02/24    4292.97  1847.61  1174.55
  high     2014/02/24    4311.13  1858.71  1180.29
  open	    2014/02/25    4298.48  1847.66  1176.00
  7 rows × 3 columns

The reorder_levels function is more general, allowing you to specify the order of the levels:

In [285]: reorderedDF=sharesIndexDF[:7].reorder_levels(['PriceType',
                                                      'TradingDate'],
                                                       axis=0)
       reorderedDF
Out[285]:        Nasdaq    S&P 500  Russell 2000
  PriceType  TradingDate
  open     2014/02/21   4282.17  1841.07  1166.25
  close    2014/02/21   4263.41  1836.25  1164.63
  high     2014/02/21   4284.85  1846.13  1168.43
  open     2014/02/24   4273.32  1836.78  1166.74
  close    2014/02/24   4292.97  1847.61  1174.55
  high     2014/02/24   4311.13  1858.71  1180.29
  open     2014/02/25   4298.48  1847.66  1176.00
  7 rows × 3 columns

Cross sections

The xs method provides a shortcut means of selecting data based on a particular index level value:

In [287]: sharesIndexDF.xs('open',level='PriceType')
Out[287]:
      Nasdaq    S&P 500  Russell 2000
  TradingDate
  2014/02/21   4282.17  1841.07  1166.25
  2014/02/24   4273.32  1836.78  1166.74
  2014/02/25   4298.48  1847.66  1176.00
  2014/02/26   4300.45  1845.79  1176.11
  2014/02/27   4291.47  1844.90  1179.28
  2014/02/28   4323.52  1855.12  1189.19
  6 rows × 3 columns

The more long-winded alternative to the preceding command would be to use swaplevel to switch between the TradingDate and PriceType levels and then, perform the selection as follows:

In [305]: sharesIndexDF.swaplevel(0, 1, axis=0).ix['open']
Out[305]:     Nasdaq   S&P 500  Russell 2000
  TradingDate
  2014/02/21  4282.17  1841.07  1166.25
  2014/02/24  4273.32  1836.78  1166.74
  2014/02/25  4298.48  1847.66  1176.00
  2014/02/26  4300.45  1845.79  1176.11
  2014/02/27  4291.47  1844.90  1179.28
  2014/02/28  4323.52  1855.12  1189.19
  6 rows × 3 columns

Using .xs achieves the same effect as obtaining a cross-section in the previous section on integer-oriented indexing.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.129.210.91