Python Pandas - Function Application
To apply your own or another library’s functions to Pandas objects, you should be aware of three important methods, discussed below. The appropriate method depends on whether your function expects to operate on an entire DataFrame, row- or column-wise, or element-wise.
- Table wise Function Application: pipe()
- Row or Column Wise Function Application: apply()
- Element wise Function Application: applymap()
Table-wise Function Application:
Custom operations can be performed by passing a function, along with any additional arguments it needs, to pipe(). The function receives the whole DataFrame as its first argument, so the operation is performed on the entire DataFrame.
For example, suppose we want to add the value 2 to every element in the DataFrame.
adder function:
The adder function takes two numeric values as parameters and returns their sum.
def adder(ele1, ele2):
    return ele1 + ele2
We will now use the custom function to perform the operation on the DataFrame.
df = pd.DataFrame(np.random.randn(5,3),columns=['col1','col2','col3'])
df.pipe(adder,2)
Let’s see the full program:
import pandas as pd
import numpy as np

def adder(ele1, ele2):
    return ele1 + ele2

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# pipe() calls adder(df, 2) and returns a new DataFrame; df itself is unchanged
print(df.pipe(adder, 2))
Its output is as follows (the values differ on each run because the data is random):
col1 col2 col3
0 2.176704 2.219691 1.509360
1 2.222378 2.422167 3.953921
2 2.241096 1.135424 2.696432
3 2.355763 0.376672 1.182570
4 2.308743 2.714767 2.130288
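Because pipe() simply calls the function with the DataFrame as its first argument, df.pipe(adder, 2) gives the same result as adder(df, 2), and several pipe() calls can be chained. The following is a minimal sketch of that idea on a small fixed DataFrame. The scale helper and the sample values are assumptions made for illustration and are not part of the example above.

```python
import pandas as pd

def adder(ele1, ele2):
    return ele1 + ele2

def scale(frame, factor=1):
    # Hypothetical helper, introduced only for this sketch
    return frame * factor

# Small fixed DataFrame (assumed sample data) so the result is reproducible
df = pd.DataFrame({'col1': [1, 2, 3], 'col2': [4, 5, 6]})

piped = df.pipe(adder, 2)      # same as adder(df, 2)
direct = adder(df, 2)

# Extra positional or keyword arguments are forwarded to the function
chained = df.pipe(adder, 2).pipe(scale, factor=10)

print(piped.equals(direct))
print(chained)
```

Chaining with pipe() keeps the steps in reading order, whereas nested calls such as scale(adder(df, 2), factor=10) have to be read from the inside out.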
Row or Column Wise Function Application:
Arbitrary functions can be applied along the axes of a DataFrame using the apply() method, which, like the descriptive statistics methods, takes an optional axis argument. By default, the operation is performed column-wise, passing each column to the function as a Series.
Example 1:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Default axis=0: np.mean is applied to each column
print(df.apply(np.mean))
Its output is as follows:
col1 -0.288022
col2 1.044839
col3 -0.187009
dtype: float64
By passing the axis parameter (axis=1), operations can be performed row-wise.
Example 2:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# axis=1: np.mean is applied to each row
print(df.apply(np.mean, axis=1))
Its output is a Series holding one mean per row, indexed 0 to 4. The exact values differ on each run because the data is random.
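To make the effect of the axis argument easier to verify, here is a minimal sketch on a small fixed DataFrame. The sample values are assumptions chosen for illustration, not data from the examples in this section.

```python
import pandas as pd
import numpy as np

# Small fixed DataFrame (assumed sample data) so the means can be checked by hand
df = pd.DataFrame({'col1': [1.0, 2.0, 3.0], 'col2': [10.0, 20.0, 30.0]})

# Default (axis=0): one result per column, indexed by column name
col_means = df.apply(np.mean)

# axis=1: one result per row, indexed by the row labels
row_means = df.apply(np.mean, axis=1)

print(col_means)
print(row_means)
```

The column-wise result has one entry per column (col1, col2), while the row-wise result has one entry per row (0, 1, 2).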
Example 3:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Range (max minus min) of each column
print(df.apply(lambda x: x.max() - x.min()))
Its output is a Series indexed by col1, col2 and col3 that holds the range of each column. Every value is non-negative, and the exact numbers differ on each run because the data is random.
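The function passed to apply() may return a single value per column, as the lambda above does, or a Series, in which case the results are combined into a DataFrame. The sketch below shows both on a small fixed DataFrame. The sample values and the 'min'/'max' labels are assumptions made for illustration.

```python
import pandas as pd

# Small fixed DataFrame (assumed sample data)
df = pd.DataFrame({'col1': [1, 5, 3], 'col2': [10, 2, 6]})

# Scalar per column: the result is a Series indexed by column name
value_range = df.apply(lambda x: x.max() - x.min())

# Series per column: the result is a DataFrame whose rows are the Series labels
summary = df.apply(lambda x: pd.Series({'min': x.min(), 'max': x.max()}))

print(value_range)
print(summary)
```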
Element Wise Function Application:
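Not all functions can be vectorized (that is, accept a NumPy array and return another array or a value). For such functions, the methods applymap() on DataFrame and, analogously, map() on Series accept any Python function that takes a single value and returns a single value. Note that pandas 2.1 deprecated applymap() in favor of DataFrame.map(), which applies a function element-wise in the same way.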
Example 1:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# My custom function: multiply every element of col1 by 100
print(df['col1'].map(lambda x: x * 100))
Its output is a Series of five values, each element of col1 multiplied by 100. map() returns a new Series and leaves df unchanged. The exact values differ on each run because the data is random.
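Since map() returns a new Series rather than modifying the column in place, the result has to be assigned if it is to be kept. A minimal sketch on fixed data follows. The sample values and the col1_scaled column name are assumptions made for illustration.

```python
import pandas as pd

# Small fixed DataFrame (assumed sample data)
df = pd.DataFrame({'col1': [0.5, 1.5, 2.5], 'col2': [1.0, 2.0, 3.0]})

# map() calls the function once per element and returns a new Series
scaled = df['col1'].map(lambda x: x * 100)

# The original column is untouched until the result is assigned
df['col1_scaled'] = scaled

print(df)
```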
Example 2:
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(5, 3), columns=['col1', 'col2', 'col3'])
# Multiply every element of the DataFrame by 100
print(df.applymap(lambda x: x * 100))
Its output is a DataFrame with the same five rows and three columns as df, in which every element has been multiplied by 100. The exact values differ on each run because the data is random.
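Element-wise application is most useful for functions that cannot operate on a whole array at once, such as string formatting. The sketch below applies such a function to every element and, for contrast, shows that simple arithmetic is better written as a vectorized expression. The label function and sample values are assumptions made for illustration. The hasattr check is one possible way to support both older pandas versions (applymap) and pandas 2.1 or later (DataFrame.map).

```python
import pandas as pd

# Small fixed DataFrame (assumed sample data)
df = pd.DataFrame({'col1': [1.234, 2.0], 'col2': [3.456, 4.0]})

def label(value):
    # Hypothetical per-element function: format a number with one decimal place
    return f"{value:.1f}"

# Use DataFrame.map where it exists (pandas 2.1+), otherwise fall back to applymap
elementwise = df.map if hasattr(pd.DataFrame, 'map') else df.applymap
labels = elementwise(label)

# Arithmetic is already vectorized, so no element-wise call is needed
scaled = df * 100

print(labels)
print(scaled)
```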