pandas RESHAPING AND PIVOT TABLES

2015-07-30 12:15
In [1]: df
Out[1]:
         date variable     value
0  2000-01-03        A  0.469112
1  2000-01-04        A -0.282863
2  2000-01-05        A -1.509059
3  2000-01-03        B -1.135632
4  2000-01-04        B  1.212112
5  2000-01-05        B -0.173215
6  2000-01-03        C  0.119209
7  2000-01-04        C -1.044236
8  2000-01-05        C -0.861849
9  2000-01-03        D -2.104569
10 2000-01-04        D -0.494929
11 2000-01-05        D  1.071804
In [3]: df.pivot(index='date', columns='variable', values='value')
Out[3]:
variable           A         B         C         D
date
2000-01-03  0.469112 -1.135632  0.119209 -2.104569
2000-01-04 -0.282863  1.212112 -1.044236 -0.494929
2000-01-05 -1.509059 -0.173215 -0.861849  1.071804
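
The session above starts from an existing df, so here is a minimal self-contained sketch of the same pivot. The names long_df and wide are illustrative and not from the original session; the values are copied from the printed output.

```python
import pandas as pd

# Long ("stacked") format: one row per (date, variable) pair.
long_df = pd.DataFrame({
    "date": pd.to_datetime(["2000-01-03", "2000-01-04", "2000-01-05"] * 4),
    "variable": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3,
    "value": [0.469112, -0.282863, -1.509059,
              -1.135632, 1.212112, -0.173215,
              0.119209, -1.044236, -0.861849,
              -2.104569, -0.494929, 1.071804],
})

# Wide format: one row per date, one column per variable.
wide = long_df.pivot(index="date", columns="variable", values="value")
print(wide)
```

Note that pivot does no aggregation: each (date, variable) pair must appear at most once in the input, otherwise pandas raises an error.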


Reshaping by stacking and unstacking

In [12]: df2
Out[12]:
                     A         B
first second
bar   one     0.721555 -0.706771
      two    -1.039575  0.271860
baz   one    -0.424972  0.567020
      two     0.276232 -1.087401
In [13]: stacked = df2.stack()
In [14]: stacked
Out[14]:
first  second
bar    one     A    0.721555
               B   -0.706771
       two     A   -1.039575
               B    0.271860
baz    one     A   -0.424972
               B    0.567020
       two     A    0.276232
               B   -1.087401
dtype: float64
In [15]: stacked.unstack()
Out[15]:
                     A         B
first second
bar   one     0.721555 -0.706771
      two    -1.039575  0.271860
baz   one    -0.424972  0.567020
      two     0.276232 -1.087401
In [16]: stacked.unstack(1)
Out[16]:
second        one       two
first
bar   A  0.721555 -1.039575
      B -0.706771  0.271860
baz   A -0.424972  0.276232
      B  0.567020 -1.087401
In [17]: stacked.unstack(0)
Out[17]:
first          bar       baz
second
one    A  0.721555 -0.424972
       B -0.706771  0.567020
two    A -1.039575  0.276232
       B  0.271860 -1.087401
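
The stack/unstack round trip shown above can be reproduced with the following sketch. It rebuilds df2 from the printed values; by_second and by_first are illustrative names, and the level names are used instead of the positions 1 and 0.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["bar", "baz"], ["one", "two"]], names=["first", "second"]
)
df2 = pd.DataFrame(
    [[0.721555, -0.706771],
     [-1.039575, 0.271860],
     [-0.424972, 0.567020],
     [0.276232, -1.087401]],
    index=index,
    columns=["A", "B"],
)

# stack() moves the column labels into a new innermost index level,
# so a frame with a single column level becomes a Series.
stacked = df2.stack()

# unstack() with no argument pivots the innermost index level back to columns.
round_trip = stacked.unstack()

# A level can also be chosen by name (equivalent to positions 1 and 0 here).
by_second = stacked.unstack("second")
by_first = stacked.unstack("first")
```

Selecting levels by name tends to be more robust than by position when the index is later reordered.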


# assumes: from pandas import MultiIndex, DataFrame; from numpy.random import randn
In [23]: columns = MultiIndex.from_tuples([
   ....:     ('A', 'cat', 'long'), ('B', 'cat', 'long'),
   ....:     ('A', 'dog', 'short'), ('B', 'dog', 'short')
   ....: ],
   ....:     names=['exp', 'animal', 'hair_length']
   ....: )
In [24]: df = DataFrame(randn(4, 4), columns=columns)
In [25]: df
Out[25]:
exp                 A         B         A         B
animal            cat       cat       dog       dog
hair_length      long      long     short     short
0            1.075770 -0.109050  1.643563 -1.469388
1            0.357021 -0.674600 -1.776904 -0.968914
2           -1.294524  0.413738  0.276662 -0.472035
3           -0.013960 -0.362543 -0.006154 -0.923061
In [26]: df.stack(level=['animal', 'hair_length'])
Out[26]:
exp                          A         B
  animal hair_length
0 cat    long         1.075770 -0.109050
  dog    short        1.643563 -1.469388
1 cat    long         0.357021 -0.674600
  dog    short       -1.776904 -0.968914
2 cat    long        -1.294524  0.413738
  dog    short        0.276662 -0.472035
3 cat    long        -0.013960 -0.362543
  dog    short       -0.006154 -0.923061
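
A runnable sketch of stacking several column levels at once, under the assumption that a seeded NumPy generator stands in for the unseeded randn call above (so the numbers differ from the printed output). The name stacked is illustrative.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("A", "cat", "long"), ("B", "cat", "long"),
     ("A", "dog", "short"), ("B", "dog", "short")],
    names=["exp", "animal", "hair_length"],
)

# Seeded random data so the example is reproducible (an assumption, not the original data).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 4)), columns=columns)

# Passing a list of level names moves both levels into the row index;
# only the "exp" level remains on the columns.
stacked = df.stack(level=["animal", "hair_length"])
print(stacked)
```

Each original row contributes two rows here, one per (animal, hair_length) combination that actually occurs in the columns.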


In [32]: df2
Out[32]:
exp                  A         B                   A
animal             cat       dog       cat       dog
first second
bar   one     0.895717  0.805244 -1.206412  2.565646
      two     1.431256  1.340309 -1.170299 -0.226169
baz   one     0.410835  0.813850  0.132003 -0.827317
foo   one    -1.413681  1.607920  1.024180  0.569605
      two     0.875906 -2.211372  0.974466 -2.006747
qux   two    -1.226825  0.769804 -1.281247 -0.727707
In [33]: df2.stack('exp')
Out[33]:
animal                 cat       dog
first second exp
bar   one    A    0.895717  2.565646
             B   -1.206412  0.805244
      two    A    1.431256 -0.226169
             B   -1.170299  1.340309
baz   one    A    0.410835 -0.827317
             B    0.132003  0.813850
foo   one    A   -1.413681  0.569605
             B    1.024180  1.607920
      two    A    0.875906 -2.006747
             B    0.974466 -2.211372
qux   two    A   -1.226825 -0.727707
             B   -1.281247  0.769804
In [34]: df2.stack('animal')
Out[34]:
exp                         A         B
first second animal
bar   one    cat     0.895717 -1.206412
             dog     2.565646  0.805244
      two    cat     1.431256 -1.170299
             dog    -0.226169  1.340309
baz   one    cat     0.410835  0.132003
             dog    -0.827317  0.813850
foo   one    cat    -1.413681  1.024180
             dog     0.569605  1.607920
      two    cat     0.875906  0.974466
             dog    -2.006747 -2.211372
qux   two    cat    -1.226825 -1.281247
             dog    -0.727707  0.769804


Reshaping by Melt

In [38]: cheese
Out[38]:
  first  height last  weight
0  John     5.5  Doe     130
1  Mary     6.0   Bo     150
In [39]: melt(cheese, id_vars=['first', 'last'])
Out[39]:
  first last variable  value
0  John  Doe   height    5.5
1  Mary   Bo   height    6.0
2  John  Doe   weight  130.0
3  Mary   Bo   weight  150.0
In [40]: melt(cheese, id_vars=['first', 'last'], var_name='quantity')
Out[40]:
  first last quantity  value
0  John  Doe   height    5.5
1  Mary   Bo   height    6.0
2  John  Doe   weight  130.0
3  Mary   Bo   weight  150.0
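
For completeness, a self-contained sketch of the melt call, rebuilding cheese from the printed table. The result name long_form is illustrative.

```python
import pandas as pd

cheese = pd.DataFrame({
    "first": ["John", "Mary"],
    "height": [5.5, 6.0],
    "last": ["Doe", "Bo"],
    "weight": [130, 150],
})

# id_vars stay as identifier columns; every other column is unpivoted
# into (quantity, value) pairs, one row per original cell.
long_form = pd.melt(cheese, id_vars=["first", "last"], var_name="quantity")
print(long_form)
```

Because height holds floats and weight holds integers, the combined value column is upcast to float, which is why the weights print as 130.0 and 150.0.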


Combining with stats and GroupBy

In [45]: df
Out[45]:
exp                  A         B                   A
animal             cat       dog       cat       dog
first second
bar   one     0.895717  0.805244 -1.206412  2.565646
      two     1.431256  1.340309 -1.170299 -0.226169
baz   one     0.410835  0.813850  0.132003 -0.827317
      two    -0.076467 -1.187678  1.130127 -1.436737
foo   one    -1.413681  1.607920  1.024180  0.569605
      two     0.875906 -2.211372  0.974466 -2.006747
qux   one    -0.410001 -0.078638  0.545952 -1.219217
      two    -1.226825  0.769804 -1.281247 -0.727707
In [46]: df.stack().mean(1).unstack()
Out[46]:
animal             cat       dog
first second
bar   one    -0.155347  1.685445
      two     0.130479  0.557070
baz   one     0.271419 -0.006733
      two     0.526830 -1.312207
foo   one    -0.194750  1.088763
      two     0.925186 -2.109060
qux   one     0.067976 -0.648927
      two    -1.254036  0.021048
# same result, another way
In [47]: df.groupby(level=1, axis=1).mean()
Out[47]:
animal             cat       dog
first second
bar   one    -0.155347  1.685445
      two     0.130479  0.557070
baz   one     0.271419 -0.006733
      two     0.526830 -1.312207
foo   one    -0.194750  1.088763
      two     0.925186 -2.109060
qux   one     0.067976 -0.648927
      two    -1.254036  0.021048
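
The equivalence of the two routes can be checked with the sketch below, which rebuilds df from the printed values. Recent pandas releases deprecate the axis=1 argument of groupby, so the second route here is written as a transpose, group, transpose; that substitution is an assumption of this sketch, not what the session above ran. The names via_stack and via_groupby are illustrative.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("A", "cat"), ("B", "dog"), ("B", "cat"), ("A", "dog")],
    names=["exp", "animal"],
)
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]],
    names=["first", "second"],
)
df = pd.DataFrame(
    [[0.895717, 0.805244, -1.206412, 2.565646],
     [1.431256, 1.340309, -1.170299, -0.226169],
     [0.410835, 0.813850, 0.132003, -0.827317],
     [-0.076467, -1.187678, 1.130127, -1.436737],
     [-1.413681, 1.607920, 1.024180, 0.569605],
     [0.875906, -2.211372, 0.974466, -2.006747],
     [-0.410001, -0.078638, 0.545952, -1.219217],
     [-1.226825, 0.769804, -1.281247, -0.727707]],
    index=index,
    columns=columns,
)

# Route 1: move "animal" into the index, average across "exp", move it back.
via_stack = df.stack("animal").mean(axis=1).unstack("animal")

# Route 2: group the columns by "animal" (through a transpose) and average.
via_groupby = df.T.groupby(level="animal").mean().T

print(via_stack)
print(via_groupby)
```

Both routes average the A and B experiments for each animal, so they agree as long as the frame has no missing values.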