What are lambda functions in Python and why you should start using them right now
Many beginner Data Scientists have heard of lambda functions but may not be sure what they are and how to use them. This article will explain:
What are lambda functions?
How do lambda functions differ from normal functions?
Why are lambda functions useful?
Finally, it will give some examples of real usage of lambda functions in Python and pandas.
Let’s get started.
What are lambda functions and how do they differ from normal functions?
I am assuming here that you are familiar with the normal Python function definition. It always starts with the keyword def, followed by the function name, its arguments in parentheses, and finally a colon. Then, on the following line or lines, we have a function body that performs the desired operation and usually finishes with a return statement. Let’s have a look at an example of a function that performs a very simple operation: it adds one to a number.
def add_one_to_number(number):
    return number + 1
The above function could actually be rewritten using lambda notation:
lambda x: x + 1
Notice that a lambda function starts with the lambda keyword, followed by the argument name (usually a single letter) and a colon. The colon is then followed by a single expression, and the value of that expression is what the function returns. There is no return statement. And that’s all.
You can see that this definition is much simpler than the normal python function definition you are used to. It is simple, concise, and can be written in a single line of code. There are some important things to remember about lambda functions before you move on:
Lambda functions are sometimes called anonymous functions. This is because they do not have a name.
The body of a lambda function is limited to a single expression, so you will not be able to write statements or long, multi-line function definitions.
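The points above can be seen in a short sketch. The names add_one and add are illustrative and not part of the original example, and binding a lambda to a name is done here only to make the comparison with def easy to see.

```python
# A regular named function, as defined earlier in the article.
def add_one_to_number(number):
    return number + 1

# The equivalent lambda. Binding it to a name is for demonstration only;
# lambdas are normally passed straight to another function.
add_one = lambda x: x + 1

# A lambda can take more than one argument, but its body is still
# a single expression whose value is returned automatically.
add = lambda a, b: a + b

print(add_one_to_number(5))  # 6
print(add_one(5))            # 6
print(add(2, 3))             # 5

# The function object itself carries no meaningful name.
print(add_one.__name__)      # <lambda>
```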
Why are lambda functions useful?
Once you understand that lambda functions are like normal functions, only without names and written as a single expression, it is time to explain why they are useful. They become useful when you want to use functions that take another function as an argument. Examples of such functions in Python are filter(), map() and reduce() (in Python 3, reduce() lives in the functools module).
The reason why lambda functions become so useful is the fact that it is usually more convenient to use simple and concise lambda notation rather than defining a new function in a traditional way, especially if the function is designed to do only one single operation rather than serve as a repeatable component.
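To make the idea of "a function that takes another function as an argument" concrete, here is a minimal sketch. The helper apply_twice is hypothetical and exists only for this illustration; it is not a built-in.

```python
def apply_twice(func, value):
    """Call func on value, then call func again on the result."""
    return func(func(value))

# Traditional approach: define a named helper first, then pass it in.
def add_one_to_number(number):
    return number + 1

result_named = apply_twice(add_one_to_number, 5)

# Lambda approach: the one-off function is written right where it is used.
result_lambda = apply_twice(lambda x: x + 1, 5)

print(result_named, result_lambda)  # 7 7
```

Both calls do the same work; the lambda version simply avoids a separate definition for a function that is used only once.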
A practical example of a lambda function in Python
Let’s see how we can use the Python filter() function with lambda notation. The filter() function takes a function as its first argument (this will be our lambda function) and, as its second argument, the list (or any other iterable) that we want to filter. Let’s see an example:
my_list = [1, 2, 3, 4, 5]
list(filter(lambda x: x > 2, my_list))
[3, 4, 5]
In the example above, we used the filter() function with an anonymous function defined as lambda x: x > 2 and applied it to my_list. As a result, we filtered the initial list to keep only the elements larger than 2 and got [3, 4, 5]. Note that we had to convert the result of filter() to a list; otherwise the result would be a filter object and not a list.
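The remark about the filter object can be checked with a short sketch. The variable names filtered and as_list are illustrative only.

```python
my_list = [1, 2, 3, 4, 5]

# filter() returns a lazy iterator, not a list.
filtered = filter(lambda x: x > 2, my_list)
print(type(filtered).__name__)  # filter

# Converting to a list consumes the iterator and collects the elements.
as_list = list(filtered)
print(as_list)  # [3, 4, 5]

# The iterator is now exhausted, so a second conversion gives an empty list.
print(list(filtered))  # []
```

Because the filter object can be consumed only once, it is usually convenient to convert it to a list straight away, as in the example above.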
We could use lambda notation in a similar way with map() and reduce(), but we are not going to cover that here. The filter() example should allow you to figure out how to use lambdas with those functions by yourself. Instead, we will move on to some examples with pandas functions.
A practical example of a lambda function with pandas apply()
As a Data Scientist, you will work a lot with the pandas library, and this is a place where you will use lambda notation often. Anonymous functions are mostly used with apply(), applymap() and map(). Note that in pandas 2.1 and later, DataFrame.applymap() is deprecated in favour of DataFrame.map(). If you are not sure what these functions are, you can check out my article that explains their usage:
Pandas data manipulation functions: apply(), map() and applymap()
If you already know how these functions work, you can jump straight into the example. Let’s start by loading the Iris data set.
from sklearn import datasets
import pandas as pd

iris_data = datasets.load_iris()
df_iris = pd.DataFrame(iris_data.data, columns=iris_data.feature_names)
df_iris['target'] = pd.Series(iris_data.target)
df_iris.head()
Let’s now add a sepal length description column based on the existing sepal length column. For this, we will use apply() on the sepal length column.
df_iris['sepal_length_description'] = df_iris['sepal length (cm)'].apply(lambda x: 'sepal length is: ' + str(x))
df_iris.head()
As you can see, we have created a simple function ‘on the go’ that concatenates the string ‘sepal length is: ’ with each numeric value from the ‘sepal length (cm)’ column converted to a string.
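To see that the lambda passed to apply() behaves like an ordinary function called once per value, here is a minimal sketch. The Series values are made up to stand in for the Iris column, and describe_sepal_length is a hypothetical helper name.

```python
import pandas as pd

# Made-up values standing in for the 'sepal length (cm)' column.
sepal_length = pd.Series([5.1, 4.9, 4.7])

# Named-function version of the lambda used above.
def describe_sepal_length(value):
    return 'sepal length is: ' + str(value)

# apply() calls the given function once for every value in the Series.
with_lambda = sepal_length.apply(lambda x: 'sepal length is: ' + str(x))
with_def = sepal_length.apply(describe_sepal_length)

print(with_lambda.tolist())
print(with_lambda.equals(with_def))  # True
```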
As you can imagine, apply() with lambda notation is quite powerful and lets you manipulate existing columns and create new ones concisely. But what happens if you want to use a lambda function with more than one data frame column?
How to use a lambda function with apply() and access different columns
As mentioned above, you can use lambda functions and apply() to combine information from different columns. A good way to illustrate this is to use apply() on the Iris data set. We will combine information from two columns to create a new column called ‘sepal_description’:
df_iris['sepal_description'] = df_iris.apply(
    lambda x: 'sepal length/width ratio is: '
    + str(round(x['sepal length (cm)'] / x['sepal width (cm)'], 2)),
    axis=1,
)
df_iris.head()
As you can see, we have called apply() on the whole data frame here. Because we passed axis=1, the function is applied row by row, so the variable ‘x’ refers to a single row, and we can access its values by column name just as we would with a data frame. Therefore we have used x[‘sepal length (cm)’] and x[‘sepal width (cm)’] to compute the sepal ratio, then rounded the result, converted it to a string, and finally concatenated it to the description string.
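What the lambda actually receives when apply() is called with axis=1 can be checked with a small sketch. The two-row frame below uses made-up values in place of the Iris data, and the names df, row_types and ratios are illustrative.

```python
import pandas as pd

# A tiny hand-made frame standing in for the Iris data (values are made up).
df = pd.DataFrame({
    "sepal length (cm)": [5.0, 6.0],
    "sepal width (cm)": [2.5, 3.0],
})

# With axis=1 the lambda is called once per row, and each row is a Series.
row_types = df.apply(lambda row: type(row).__name__, axis=1)
print(row_types.tolist())  # ['Series', 'Series']

# Values are then read from the row by column label.
ratios = df.apply(
    lambda row: round(row["sepal length (cm)"] / row["sepal width (cm)"], 2),
    axis=1,
)
print(ratios.tolist())  # [2.0, 2.0]
```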
This example shows quite well how useful lambda functions can be for manipulating data. We have used only two columns here, but you could access any number of columns if your operations required it.
Summary
In this quick article, we have explained what lambda functions are and how to use them in Python and pandas. We have covered a filter() example and then demonstrated how to use lambdas with apply() on both Series and DataFrame objects.
I hope you have found the examples useful and you will start using lambdas in your own code.