# Part 4: UNIVERSAL FUNCTIONS IN PANDAS

## 1. UNIVERSAL FUNCTIONS: INDEX PRESERVATION

NumPy ufuncs (universal functions) work on Pandas `Series` and `DataFrame` objects.

First, let’s create a Pandas `Series` of random integers

``````import numpy as np
import pandas as pd

# creating random state
rand = np.random.RandomState(42)

# Creating Pandas Series of random integers
ser1 = pd.Series(rand.randint(10, size=4))
print(ser1)
``````
``````0    6
1    3
2    7
3    4
dtype: int64
``````

Second, create a Pandas `DataFrame` of random integers

``````# Creating Pandas DataFrame
df1 = pd.DataFrame(rand.randint(10,size=(3,4)),
columns=['a','b','c','d'])

print(df1)
``````
``````   a  b  c  d
0  6  9  2  6
1  7  4  3  7
2  7  2  5  4
``````

Now, if we apply any NumPy ufunc to these objects (`Series` or `DataFrame`), the result is another Pandas object with the indices preserved

``````# Take the exponential of every element in the Series ser1
np.exp(ser1)
``````
``````0     403.428793
1      20.085537
2    1096.633158
3      54.598150
dtype: float64
``````
``````# Do arithmetic on each element of the DataFrame df1
print(np.multiply(df1,10))
``````
``````    a   b   c   d
0  60  90  20  60
1  70  40  30  70
2  70  20  50  40
``````
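
The same preservation holds when the index is not the default 0, 1, 2, ... range. As a minimal sketch (the Series `areas` and its labels are made up for illustration and are not part of the example above), `np.sqrt` returns a `Series` that carries the original string labels:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with string labels instead of the default integer index
areas = pd.Series([1.0, 4.0, 9.0], index=['x', 'y', 'z'])

# The ufunc is applied element-wise; the labels carry over to the result
roots = np.sqrt(areas)

print(type(roots).__name__)   # Series
print(roots.index.tolist())   # ['x', 'y', 'z']
print(roots.tolist())         # [1.0, 2.0, 3.0]
```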

## 2. UNIVERSAL FUNCTIONS: INDEX ALIGNMENT

### 2.1. Index Alignment in Series

When we `add` two `Series` whose indices are not identical, Pandas aligns the values by index label, and the index of the result is the union of the two indices

``````# First, define two Series whose indices are not identical
A = pd.Series([1,2,3], index=[0,1,2]) #index[0,1,2]
B = pd.Series([10,20,30], index=[1,2,3]) #index[1,2,3]

# Second, perform addition of these two series
print(A); print(B)
print(A.add(B))
``````
``````0    1
1    2
2    3
dtype: int64
1    10
2    20
3    30
dtype: int64
0     NaN
1    12.0
2    23.0
3     NaN
dtype: float64
``````

As the example above shows, the index of the sum is the union of both indices. Labels that exist in only one `Series` (0 and 3 here) get `NaN`, which is also why the result is `float64`.
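
The `+` operator should give the same aligned result as the `.add()` method. A short check, re-creating `A` and `B` so the snippet runs on its own (the names `via_operator` and `via_method` are only for illustration):

```python
import pandas as pd

A = pd.Series([1, 2, 3], index=[0, 1, 2])
B = pd.Series([10, 20, 30], index=[1, 2, 3])

via_operator = A + B    # operator form
via_method = A.add(B)   # method form

# Series.equals treats NaN values in the same position as equal
print(via_operator.equals(via_method))   # True
print(via_operator.index.tolist())       # [0, 1, 2, 3]
```

The method form is mainly useful when extra arguments are needed.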

#### add() method with fill_value

• When Pandas doesn’t find a value under the same index label in both objects, it returns `NaN`
• For example, Series `A` has a value at index 0, but Series `B` has no index 0
• To handle this `NaN`, we can pass the kwarg `fill_value` to the Pandas `.add()` method; the missing entry is replaced by `fill_value` before the addition
``````A.add(B, fill_value=0)
``````
``````0     1.0
1    12.0
2    23.0
3    30.0
dtype: float64
``````
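
`fill_value` is not specific to `.add()`. The other flexible arithmetic methods such as `.sub()`, `.mul()` and `.div()` accept it too. A small sketch with `.mul()`, where `1` is chosen as the fill because it leaves the other operand unchanged (that choice is an assumption for this illustration, not a rule):

```python
import pandas as pd

A = pd.Series([1, 2, 3], index=[0, 1, 2])
B = pd.Series([10, 20, 30], index=[1, 2, 3])

# A label missing on one side is treated as 1, so the value from
# the other side passes through the multiplication unchanged
product = A.mul(B, fill_value=1)
print(product.tolist())   # [1.0, 20.0, 60.0, 30.0]
```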

### 2.2. Index Alignment in DataFrame

When we `add` two `DataFrame` objects whose indices or columns are not identical, Pandas aligns the values on both the row index and the column labels, and the result covers the union of each

``````# First, define two DataFrames with non-identical indices and columns
C = pd.DataFrame(rand.randint(10, size=(2,2)),
columns=['a','b'])

D = pd.DataFrame(rand.randint(10, size=(3,3)),
columns=['a','b','c'])

print(C); print(D)
``````
``````   a  b
0  1  7
1  5  1
   a  b  c
0  4  0  9
1  5  8  0
2  9  2  6
``````
``````# Secondly, we add these two dataframes and see how results are handled
print(C.add(D))
``````
``````      a    b   c
0   5.0  7.0 NaN
1  10.0  9.0 NaN
2   NaN  NaN NaN
``````

#### add() method with fill_value

• When Pandas doesn’t find a value under the same index and column in both objects, it returns `NaN`
• For example, DataFrame `D` has a value at index 0, column ‘c’, but DataFrame `C` has no column ‘c’
• We can pass the keyword argument `fill_value` to the Pandas `.add()` method to handle the `NaN`
``````print(C.add(D, fill_value=0))
``````
``````      a    b    c
0   5.0  7.0  9.0
1  10.0  9.0  0.0
2   9.0  2.0  6.0
``````
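
One detail worth checking: `fill_value` only helps when a cell is missing in one of the two frames. If a cell is missing in both, there is nothing to add to, and the result stays `NaN`. A minimal sketch with two made-up frames `X` and `Y` (not part of the example above) that overlap only at row label 1:

```python
import numpy as np
import pandas as pd

# Hypothetical frames: X has column 'a' on rows 0-1, Y has column 'b' on rows 1-2
X = pd.DataFrame({'a': [1, 2]}, index=[0, 1])
Y = pd.DataFrame({'b': [10, 20]}, index=[1, 2])

total = X.add(Y, fill_value=0)
print(total)

# Cells (0, 'b') and (2, 'a') exist in neither frame, so they remain NaN
print(int(total.isna().sum().sum()))   # 2
```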

## 3. UNIVERSAL FUNCTIONS: OTHER OPERATIONS

### 3.1. Understanding ‘axis’ keyword argument

#### One way to look at `axis` kwarg:

Remember that when we mention `axis=0` or `axis='index'`, the operation runs down the rows and is therefore performed column-wise; when we mention `axis=1` or `axis='columns'`, the operation runs across the columns and is performed row-wise.

#### Another way to look at `axis` kwarg:

• `axis=0` or `axis='index'` means to perform the operation on all the rows in each column
• `axis=1` or `axis='columns'` means to perform the operation on all the columns in each row
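
A quick way to see both readings is a reduction such as `.sum()`. The frame below is a made-up example, and `sum` is used only as a convenient illustration of the `axis` kwarg:

```python
import pandas as pd

# Hypothetical frame with 2 rows and 3 columns
df = pd.DataFrame({'a': [1, 2], 'b': [10, 20], 'c': [100, 200]})

# axis=0 / 'index': combine all the rows in each column -> one value per column
col_sums = df.sum(axis='index')
print(col_sums.to_dict())   # {'a': 3, 'b': 30, 'c': 300}

# axis=1 / 'columns': combine all the columns in each row -> one value per row
row_sums = df.sum(axis='columns')
print(row_sums.to_dict())   # {0: 111, 1: 222}
```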

### 3.2. Operations on Self

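Let’s subtract the first row of `df1` from every row of `df1`. Here the default value of the kwarg `axis` applies, which is `1` or `'columns'`: the labels of the subtracted `Series` are matched against the column labels, so the subtraction is applied row by row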

``````print(df1)
print(df1.subtract(df1.iloc[0]))
``````
``````   a  b  c  d
0  6  9  2  6
1  7  4  3  7
2  7  2  5  4
   a  b  c  d
0  0  0  0  0
1  1 -5  1  1
2  1 -7  3 -2
``````

However, if we would like to apply this arithmetic operation index-wise, so that one column is subtracted from every column, we can use `axis=0` or `axis='index'`

``````print(df1.subtract(df1['a'], axis=0))
``````
``````   a  b  c  d
0  0  3 -4  0
1  0 -3 -4  0
2  0 -5 -2 -3
``````
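
These row-wise operations also align on labels. As a sketch under made-up data (the frame `df` and the name `half_row` are hypothetical), subtracting a row that carries only some of the column labels leaves `NaN` in the columns it does not cover:

```python
import pandas as pd

# Hypothetical frame, defined here so the snippet runs on its own
df = pd.DataFrame([[6, 9, 2, 6], [7, 4, 3, 7]], columns=['a', 'b', 'c', 'd'])

# Every second entry of the first row: only labels 'a' and 'c'
half_row = df.iloc[0, ::2]

# Columns 'b' and 'd' have no matching label in half_row, so they become NaN
diff = df - half_row
print(diff)
```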

### 3.3. Operation between Series and DataFrame

Operations between a `DataFrame` and `Series` object are similar to operations between a two-dimensional and one-dimensional NumPy array

``````# Series
ser11 = pd.Series(rand.randint(12, size=3))
ser11
``````
``````0     2
1     9
2    11
dtype: int64
``````
``````# DataFrame
df11 = pd.DataFrame(rand.randint(10,size=(3,4)),
columns=['a','b','c','d'] )
print(df11)
``````
``````   a  b  c  d
0  7  5  7  8
1  3  0  0  9
2  3  6  1  2
``````

Let’s add the `Series` to the `DataFrame` with the kwarg `axis=0` or `axis='index'`, which matches on the row index. Both `ser11` and `df11` have an identical index

``````print(df11.add(ser11, axis=0))
``````
``````    a   b   c   d
0   9   7   9  10
1  12   9   9  18
2  14  17  12  13
``````
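
To make the comparison with NumPy arrays concrete, here is a rough sketch that repeats the addition on plain arrays. The values are copied by hand from `df11` and `ser11` above, and the names `values` and `vec` are only for illustration:

```python
import numpy as np
import pandas as pd

values = np.array([[7, 5, 7, 8],
                   [3, 0, 0, 9],
                   [3, 6, 1, 2]])
vec = np.array([2, 9, 11])

# NumPy: a length-3 vector does not broadcast against shape (3, 4) as is;
# reshaping it to a column of shape (3, 1) adds one value to each row
np_result = values + vec[:, np.newaxis]

# Pandas: axis=0 expresses the same intent by matching on the row index
frame = pd.DataFrame(values, columns=['a', 'b', 'c', 'd'])
pd_result = frame.add(pd.Series(vec), axis=0)

print(np.array_equal(np_result, pd_result.to_numpy()))   # True
```

The difference is that NumPy matches by position and shape, while Pandas matches by label, so the Pandas version keeps working even if the rows are stored in a different order.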