Pandas is a popular Python library for handling and analyzing data. It provides tools for working with many types of data, including numerical data. When working with a large dataset, it is often useful to identify which columns of a dataframe are numeric.

In this article, we will see how to find the numeric columns of a pandas dataframe.

First, let’s create a sample dataframe:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [6, 7, 8, 9, 10],
                   'C': ['a', 'b', 'c', 'd', 'e'],
                   'D': [0.1, 0.2, 0.3, 0.4, 0.5]})

The sample dataframe df will look like this:

   A   B  C    D
0  1   6  a  0.1
1  2   7  b  0.2
2  3   8  c  0.3
3  4   9  d  0.4
4  5  10  e  0.5
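Before selecting anything, it can help to check which dtype pandas assigned to each column. The snippet below is a minimal sketch that rebuilds the sample dataframe and uses `pandas.api.types.is_numeric_dtype` to flag each column; the `flags` name is only illustrative.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [6, 7, 8, 9, 10],
                   'C': ['a', 'b', 'c', 'd', 'e'],
                   'D': [0.1, 0.2, 0.3, 0.4, 0.5]})

# Map each column name to True if its dtype is numeric, False otherwise
flags = {name: is_numeric_dtype(dtype) for name, dtype in df.dtypes.items()}

for name, is_num in flags.items():
    print(name, df[name].dtype, is_num)
```

The exact dtype names printed can vary with the pandas version and platform, but columns A, B, and D should be flagged as numeric and C should not.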

To find the numeric columns, we can use the select_dtypes method and pass in the argument include='number':

numeric_cols = df.select_dtypes(include='number').columns

In this example, select_dtypes keeps only the columns whose dtype is numeric (such as int and float). Taking .columns then gives the names of those columns as a pandas Index rather than a plain Python list:

Index(['A', 'B', 'D'], dtype='object')
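If a plain Python list is more convenient, or only some kinds of numbers are wanted, select_dtypes can be combined with `.tolist()`, the `exclude` argument, or NumPy type classes such as `np.floating`. The following is a small sketch on the same sample data; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'B': [6, 7, 8, 9, 10],
                   'C': ['a', 'b', 'c', 'd', 'e'],
                   'D': [0.1, 0.2, 0.3, 0.4, 0.5]})

# Names of all numeric columns as a plain list
numeric_names = df.select_dtypes(include='number').columns.tolist()

# Names of the columns that are NOT numeric
non_numeric_names = df.select_dtypes(exclude='number').columns.tolist()

# Names of floating-point columns only
float_names = df.select_dtypes(include=np.floating).columns.tolist()

print(numeric_names)      # ['A', 'B', 'D']
print(non_numeric_names)  # ['C']
print(float_names)        # ['D']
```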

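You can also use the .apply method together with the .str.isnumeric method to find columns whose values consist only of digits.

numeric_cols = df.columns[df.apply(lambda x: x.astype(str).str.isnumeric().all())]

In this example, the .apply method runs the lambda once per column. Each column is first converted to strings with .astype(str), because the .str accessor raises an AttributeError on non-string columns. The .str.isnumeric method then returns a Boolean value for each element, and .all() reduces these to a single Boolean per column. The result is used as a mask on the .columns property to get the matching column names.

Note that .str.isnumeric is True only for strings made up entirely of numeric characters, so values containing a decimal point or a minus sign do not qualify. Column D (floats) is therefore excluded, and the resulting numeric_cols is an Index holding only the integer columns: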

Index(['A', 'B'], dtype='object')
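When the goal is to detect columns whose values can be parsed as numbers, including floats, negatives, and numbers stored as strings, one alternative is `pd.to_numeric` with `errors='coerce'`. This is a sketch rather than a definitive recipe: the helper `looks_numeric` and the extra column E are hypothetical additions for illustration, and the check assumes the data has no missing values, since a NaN in an otherwise numeric column would make it fail.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5],
                   'C': ['a', 'b', 'c', 'd', 'e'],
                   'D': [0.1, 0.2, 0.3, 0.4, 0.5],
                   # Hypothetical column: numbers stored as strings
                   'E': ['-1', '2.5', '3', '4', '5']})

def looks_numeric(col):
    """Return True if every value in the column can be parsed as a number."""
    # Unparseable values become NaN; assumes no NaN in the original data
    return pd.to_numeric(col, errors='coerce').notna().all()

numeric_like = df.columns[df.apply(looks_numeric)]
print(list(numeric_like))  # ['A', 'D', 'E']
```

Unlike select_dtypes, this inspects the values rather than the dtype, so it is slower on large dataframes but also catches numeric data that was loaded as text.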
