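To apply your own functions, or functions from another library, to pandas objects, there are three key methods, discussed below. Which one to use depends on whether the function expects to operate on the whole DataFrame, on rows or columns, or on individual elements.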
- Table-wise function application: pipe()
- Row- or column-wise function application: apply()
- Element-wise function application: applymap()
def adder(ele1, ele2):
    # Return the sum of the two arguments; if ele1 is a DataFrame, the addition is element-wise
    return ele1 + ele2
Now apply the custom function to a DataFrame. `df.pipe(adder, 2)` calls `adder(df, 2)`, which adds 2 to every element.
df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3'])
df.pipe(adder,2)
The complete program:
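import pandas as pd
import numpy as np

def adder(ele1, ele2):
    return ele1 + ele2

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# pipe() returns a new DataFrame and leaves df unchanged; the result is not captured here
df.pipe(adder, 2)
# Prints the original, unmodified df
print(df)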
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | -1.402521 | -0.152892 | 1.829331 |
| 1 | -0.100956 | -0.780239 | -0.923624 |
| 2 | -1.020940 | 0.439790 | 1.697911 |
| 3 | 0.410887 | 0.315629 | -0.137177 |
| 4 | 1.031910 | -0.760400 | -0.023329 |

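
Because `pipe()` returns a new object rather than modifying `df`, the result has to be captured to be seen. The following is a minimal sketch of that, using a small fixed DataFrame instead of random data so the values are easy to check. The sample values and the second helper `scale` are assumptions made up for illustration; they are not part of the original program.

```python
import pandas as pd

def adder(ele1, ele2):
    """Add ele2 to ele1 (element-wise when ele1 is a DataFrame)."""
    return ele1 + ele2

def scale(frame, factor=1):
    """Hypothetical second step: multiply every value by `factor`."""
    return frame * factor

# Small fixed sample (illustrative values only)
df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [3.0, 4.0]})

# df.pipe(adder, 2) is equivalent to adder(df, 2); keep the return value
result = df.pipe(adder, 2)

# pipe() calls can be chained and read left to right
chained = df.pipe(adder, 2).pipe(scale, factor=10)

print(result)
print(chained)
```

Chaining is the main reason to prefer `pipe()` over calling the function directly: `df.pipe(f).pipe(g)` reads in execution order, whereas `g(f(df))` reads inside out.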
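df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# apply() runs np.mean on each column by default (axis=0) and returns a new Series; df is unchanged
df.apply(np.mean)
# Prints the original df, not the column means
print(df)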
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | -0.119737 | 0.709267 | -0.204319 |
| 1 | 0.528979 | -1.812767 | 0.223392 |
| 2 | -0.209128 | -0.152610 | -0.491563 |
| 3 | 0.139953 | -2.067512 | -0.894442 |
| 4 | 0.147579 | -1.299550 | -0.744132 |

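
To see what `apply(np.mean)` actually computes, the return value needs to be kept. A minimal sketch, assuming a small fixed DataFrame (the values are illustrative, not from the original run):

```python
import numpy as np
import pandas as pd

# Small fixed sample (illustrative values only)
df = pd.DataFrame({"col1": [1.0, 2.0, 3.0], "col2": [10.0, 20.0, 30.0]})

# With the default axis=0, the function receives each column as a Series
col_means = df.apply(np.mean)

# The result is a Series indexed by column name
print(col_means)
```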
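Passing the axis parameter lets the operation run along rows: with `axis=1`, the function is applied to each row instead of each column.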
df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3'])
df.apply(np.mean,axis=1)
df
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 0.335770 | -0.065489 | 1.815575 |
| 1 | 1.108302 | 0.099355 | -0.470350 |
| 2 | -1.213831 | -1.135103 | 0.299360 |
| 3 | -0.033615 | 1.172726 | -0.115489 |
| 4 | 0.751986 | 1.557329 | -0.937019 |

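
The row-wise case can be sketched the same way. This assumes a small fixed DataFrame with illustrative values, and captures the result so the per-row means are visible:

```python
import numpy as np
import pandas as pd

# Small fixed sample (illustrative values only)
df = pd.DataFrame({"col1": [1.0, 2.0, 3.0], "col2": [10.0, 20.0, 30.0]})

# With axis=1, the function receives each row as a Series
row_means = df.apply(np.mean, axis=1)

# The result is a Series indexed by the original row labels
print(row_means)
```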
df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3'])
df.apply(lambda x: x.max() - x.min())
df
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 0.666385 | 0.174257 | -1.439446 |
| 1 | 0.371453 | 0.604340 | 0.598378 |
| 2 | -0.213248 | 1.682614 | -2.041215 |
| 3 | 0.195058 | -1.406248 | -2.266669 |
| 4 | 0.431626 | -2.121158 | 0.694977 |

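
The lambda passed to `apply()` above computes the range (maximum minus minimum) of each column. A minimal sketch with fixed, illustrative values shows the Series it returns:

```python
import pandas as pd

# Small fixed sample (illustrative values only)
df = pd.DataFrame({"col1": [1.0, 4.0, 2.0], "col2": [10.0, 30.0, 50.0]})

# Each column arrives in the lambda as a Series named x
col_range = df.apply(lambda x: x.max() - x.min())

print(col_range)
```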
df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3'])
# Series.map applies the custom function to each element of col1 and returns a new Series
df['col1'].map(lambda x:x*100)
df
| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 0.277096 | -0.274364 | -0.189870 |
| 1 | 0.888756 | 0.546460 | -0.293550 |
| 2 | 0.944017 | -0.687024 | 0.357593 |
| 3 | -0.014107 | -0.841510 | 1.226355 |
| 4 | -0.341670 | 1.174623 | 0.322051 |

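
`Series.map` also returns a new Series, so `df` above still shows the unscaled values. A minimal sketch, assuming fixed illustrative values, of capturing the result and optionally writing it back to the column:

```python
import pandas as pd

# Small fixed sample (illustrative values only)
df = pd.DataFrame({"col1": [1.0, 2.0, 3.0], "col2": [10.0, 20.0, 30.0]})

# map() calls the lambda once per element of the column
scaled = df["col1"].map(lambda x: x * 100)

# Assign the result back on a copy to keep the original df intact
updated = df.copy()
updated["col1"] = scaled

print(updated)
```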
# My custom function
df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3'])
df.applymap(lambda x:x*100)
df
Note: on pandas 2.1 and later, this call emits `FutureWarning: DataFrame.applymap has been deprecated. Use DataFrame.map instead.`

| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | -1.012291 | -0.223252 | -0.216416 |
| 1 | -0.395696 | -0.167642 | 1.310985 |
| 2 | 0.956191 | -0.730842 | -0.453613 |
| 3 | 0.059378 | 0.287856 | -0.144362 |
| 4 | -1.038070 | 0.022807 | 1.670587 |
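
Since `applymap` is deprecated in newer pandas releases, one possible way to write the element-wise call so that it runs on both old and new versions is sketched below. The `hasattr` check and the sample values are assumptions made for illustration; if only pandas 2.1 or later needs to be supported, calling `df.map(...)` directly is simpler.

```python
import pandas as pd

# Small fixed sample (illustrative values only)
df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [3.0, 4.0]})

# DataFrame.map exists from pandas 2.1; older versions only have applymap
elementwise = df.map if hasattr(df, "map") else df.applymap

# The lambda is called once per element; a new DataFrame is returned
scaled = elementwise(lambda x: x * 100)

print(scaled)
```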