How to count the number of missing values in each row in Pandas dataframe?

Question

How can I get the number of missing values in each row of a Pandas DataFrame? I would like to split the DataFrame into separate DataFrames, each holding the rows that have the same number of missing values.

Any suggestions?

score 50 · Answer 1 · edited Jun 14 '20 at 13:15

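When using pandas, try to avoid row-by-row operations, including apply, map, applymap etc. They run at Python speed and are slow compared with the built-in vectorized methods.

A DataFrame object has two axes: “axis 0” and “axis 1”. “axis 0” represents rows and “axis 1” represents columns. Reducing along axis 0 collapses the rows and gives one value per column; reducing along axis 1 collapses the columns and gives one value per row.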

If you want to count the missing values in each column, try:

df.isnull().sum() (axis=0 is the default) or, explicitly, df.isnull().sum(axis=0)

On the other hand, you can count in each row (which is your question) by:

df.isnull().sum(axis=1)

It's roughly 10 times faster than Jan van der Vegt's solution (by the way, his solution counts valid values rather than missing values):

In [18]: %timeit -n 1000 df.apply(lambda x: x.count(), axis=1)
1000 loops, best of 3: 3.31 ms per loop

In [19]: %timeit -n 1000 df.isnull().sum(axis=1)
1000 loops, best of 3: 329 µs per loop
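To tie this back to the second half of the question, here is a minimal sketch of how the per-row count could be used to split a DataFrame into pieces whose rows share the same number of missing values. The sample data and the names missing_per_row and groups are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not from the original post)
df = pd.DataFrame({
    "A": [1.0, 2.0, np.nan, 4.0],
    "B": [1.0, np.nan, np.nan, 5.0],
    "C": [3.0, np.nan, np.nan, np.nan],
})

# Missing values per column (axis=0) and per row (axis=1)
missing_per_column = df.isnull().sum(axis=0)
missing_per_row = df.isnull().sum(axis=1)

# Group rows by their missing-value count; each group is its own DataFrame
groups = {n_missing: part for n_missing, part in df.groupby(missing_per_row)}

for n_missing, part in groups.items():
    print(f"{n_missing} missing value(s):")
    print(part)
```

The dictionary keys are the missing-value counts, so groups[2] would hold every row with exactly two missing values.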

Jan van der Vegt · Accepted Answer · 2016-07-07T11:18:34.297

26

You can apply a count over the rows like this (note that count() returns the number of non-missing values):

test_df.apply(lambda x: x.count(), axis=1)

test_df:

    A   B   C
0:  1   1   3
1:  2   nan nan
2:  nan nan nan

output:

0:  3
1:  1
2:  0

You can add the result as a column like this:

test_df['full_count'] = test_df.apply(lambda x: x.count(), axis=1)

Result:

    A   B   C   full_count
0:  1   1   3   3
1:  2   nan nan 1
2:  nan nan nan 0
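As a hedged aside, the same column can probably be produced without apply, since DataFrame.count accepts an axis argument. The sketch below rebuilds the test_df shown above (the construction code and the names via_apply and via_count are assumptions) and checks that both approaches agree.

```python
import numpy as np
import pandas as pd

# Same values as the test_df shown above
test_df = pd.DataFrame({
    "A": [1, 2, np.nan],
    "B": [1, np.nan, np.nan],
    "C": [3, np.nan, np.nan],
})

# Row-wise count of non-missing values, with and without apply
via_apply = test_df.apply(lambda x: x.count(), axis=1)
via_count = test_df.count(axis=1)

# Both approaches should give the same Series
print(via_apply.equals(via_count))

# Store the vectorized result as a new column
test_df["full_count"] = via_count
```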

edited Jul 07 '16 at 11:18

answered Jul 07 '16 at 11:13

Jan van der Vegt

score 6 · Answer 3 · answered Feb 07 '17 at 12:37

The simplest way:

df.isnull().sum(axis=1)

Yuan JI

score 4 · Answer 4 · answered Dec 27 '17 at 02:45

Or, you could simply make use of the info method for dataframe objects:

df.info()

which prints the number of non-null values in each column.

Chris Ivan

score 4 · Answer 5 · answered Feb 14 '19 at 12:14

Null values in each column:

df.isnull().sum(axis=0)

Blank values (empty strings) in each column:

c = (df == '').sum(axis=0)

Null values in each row:

df.isnull().sum(axis=1)

Blank values (empty strings) in each row:

c = (df == '').sum(axis=1)
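If both nulls and empty strings should count as "missing", the two boolean masks can be combined before summing. This is only a sketch: the sample data and the names null_per_row, blank_per_row and empty_per_row are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data mixing nulls and empty strings (an assumption)
df = pd.DataFrame({
    "name": ["Ann", "", None],
    "city": ["", "", "Oslo"],
    "age": [30, np.nan, 41],
})

# Nulls per row and empty strings per row, counted separately
null_per_row = df.isnull().sum(axis=1)
blank_per_row = (df == "").sum(axis=1)

# A cell counts as empty if it is either null or an empty string
empty_per_row = (df.isnull() | (df == "")).sum(axis=1)

print(empty_per_row)
```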

Rakesh Chaudhari

score 1 · Answer 6 · answered Jul 08 '16 at 10:18

>>> import numpy as np
>>> import pandas as pd

>>> df = pd.DataFrame([[1, 2, np.nan],
...                    [np.nan, 3, 4],
...                    [1, 2,      3]])

>>> df
    0  1   2
0   1  2 NaN
1 NaN  3   4
2   1  2   3

>>> # count() returns the number of non-missing values in each row
>>> df.count(axis=1)
0    2
1    2
2    3
dtype: int64
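Since count gives the non-missing values, the missing count per row can be recovered by subtracting from the number of columns. A minimal sketch using the same data as above; the names non_missing and missing are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, np.nan],
                   [np.nan, 3, 4],
                   [1, 2, 3]])

# Non-missing values per row
non_missing = df.count(axis=1)

# Missing values per row = number of columns minus non-missing values
missing = df.shape[1] - non_missing

# This should agree with counting the nulls directly
print((missing == df.isnull().sum(axis=1)).all())
```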

K3---rnc

score 0 · Answer 7 · answered Dec 30 '19 at 16:58

This snippet returns, as an integer, the number of columns that contain at least one missing value:

(df.isnull().sum() > 0).astype(np.int64).sum()
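A shorter spelling that should give the same number uses any(), and passing axis=1 gives the analogous count of rows. The data and variable names below are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): columns A and C contain a NaN
df = pd.DataFrame({
    "A": [1, np.nan, 3],
    "B": [1, 2, 3],
    "C": [3, np.nan, 3],
})

# The snippet from the answer above
n_cols_original = (df.isnull().sum() > 0).astype(np.int64).sum()

# Equivalent: does any cell in the column contain a null?
n_cols_with_missing = df.isnull().any().sum()

# Row-wise analogue: rows containing at least one null
n_rows_with_missing = df.isnull().any(axis=1).sum()

print(n_cols_original, n_cols_with_missing, n_rows_with_missing)
```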

neil armstrong

score -1 · Answer 8 · answered Jan 16 '17 at 11:26

If you want the count of non-missing values in each column (drop the np.logical_not call to count the missing values instead):

np.logical_not(df.isnull()).sum()

Itachi
