Contents
- How to check if a single value is NaN in Python. Some approaches use libraries (pandas, math, and numpy) and some use no libraries at all.
- Method 1: Using Pandas Library
- Method 2: Using Numpy Library
- Method 3: Using math library
- Method 4: Comparing with itself
- Method 5: Checking the range
How to check if a single value is NaN in Python. Some approaches use libraries (pandas, math, and numpy) and some use no libraries at all.
NaN stands for Not a Number and is one of the common ways to represent a missing value in data. It is a special floating-point value; in particular, it cannot be converted to an integer.
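As a quick illustration of the point above, the sketch below shows that NaN is an ordinary float object and that converting it to an integer fails. The variable names are illustrative only.

```python
# NaN is created like any other float value.
x = float("nan")
is_float = isinstance(x, float)

# Converting NaN to an integer raises ValueError.
try:
    int(x)
    int_conversion_failed = False
except ValueError:
    int_conversion_failed = True

print(is_float, int_conversion_failed)  # True True
```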
NaN values are one of the major problems in data analysis. It is essential to deal with NaN in order to get the desired results.
Finding and handling NaN within an array, series, or dataframe is easy. However, identifying a standalone NaN value is tricky. In this article I explain five methods for detecting NaN in Python. The first three methods use built-in functions from libraries. The last two rely on properties of NaN itself.
Method 1: Using Pandas Library
The isna() function in the pandas library checks whether a value is null or NaN. It returns True if the value is NaN or null.
import pandas as pd

x = float("nan")
print(f"It's pd.isna : {pd.isna(x)}")
Output
It's pd.isna : True
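To make the "null/NaN" behaviour concrete, here is a minimal sketch that applies pd.isna() to a handful of scalars. The sample list is an assumption chosen for illustration; it is not from the original article.

```python
import pandas as pd

# Illustrative scalars: two kinds of missing markers plus NaN, then ordinary values.
candidates = [float("nan"), None, pd.NaT, 1.5, "text"]

# pd.isna() accepts any scalar and does not raise on non-numeric input.
results = [bool(pd.isna(value)) for value in candidates]
print(results)  # [True, True, True, False, False]
```

Because it also treats None and pd.NaT as missing, pd.isna() is a convenient choice when the type of the value is not known in advance.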
Method 2: Using Numpy Library
The isnan() function in the numpy library checks whether a numeric value is NaN. It is similar to isna() in pandas.
import numpy as np

x = float("nan")
print(f"It's np.isnan : {np.isnan(x)}")
Output
It's np.isnan : True
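One practical difference from pd.isna() is that np.isnan() works element-wise on arrays but raises TypeError for non-numeric input such as strings. The sketch below shows both behaviours; safe_isnan is a hypothetical helper name, not part of numpy.

```python
import numpy as np

# Element-wise check on an array.
values = np.array([1.0, np.nan, 3.0])
mask = np.isnan(values)
print(mask.tolist())  # [False, True, False]


def safe_isnan(value):
    """Hypothetical helper: treat inputs that np.isnan rejects as 'not NaN'."""
    try:
        return bool(np.isnan(value))
    except TypeError:
        return False


print(safe_isnan(float("nan")))  # True
print(safe_isnan("text"))        # False
```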
Method 3: Using math library
The math library provides built-in mathematical functions. It applies to real numbers; the cmath library can be used when dealing with complex numbers.
The math library has a built-in function, isnan(), to check for NaN values.
import math

x = float("nan")
print(f"It's math.isnan : {math.isnan(x)}")
Output
It's math.isnan : True
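Since the text mentions cmath for complex numbers, here is a small sketch that puts the two standard-library functions side by side. The variable names are illustrative.

```python
import cmath
import math

real_value = float("nan")
complex_value = complex(1.0, float("nan"))  # NaN in the imaginary part

real_is_nan = math.isnan(real_value)
# cmath.isnan() is True if either the real or the imaginary part is NaN.
complex_is_nan = cmath.isnan(complex_value)

print(real_is_nan, complex_is_nan)  # True True
```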
Method 4: Comparing with itself
When I started my career at a big IT company, I had to undergo training for the first month. When introducing the concept of NaN values, the trainer said that they are like aliens we know nothing about. These aliens are constantly shapeshifting, so a NaN value never compares equal to anything, not even to itself.
A common way to check for NaN is therefore to test whether the variable is equal to itself. If it is not, it must be NaN.
def isNaN(num):
    return num != num

x = float("nan")
print(isNaN(x))
Output
True
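The self-comparison trick can be tried on several scalars at once. This is a minimal sketch with an illustrative sample list; it is meant for single values, not for arrays, where != returns an array instead of a bool.

```python
def is_nan(value):
    # Among ordinary Python scalars, only NaN compares unequal to itself.
    return value != value


samples = [float("nan"), 0.0, float("inf"), "text", None]
flags = [is_nan(sample) for sample in samples]
print(flags)  # [True, False, False, False, False]
```

A side benefit of this approach is that it does not raise on strings or None, unlike math.isnan().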
Method 5: Checking the range
Another property of NaN that can be used to detect it is the range. Every floating-point value other than NaN falls within the range from minus infinity to infinity, endpoints included:
-infinity <= any number <= infinity
NaN, however, does not fall within this range, because every ordering comparison involving NaN is False. Hence a value can be identified as NaN if it does not fall within the range from minus infinity to infinity.
This can be implemented as below:
def isNaN(num):
    # Inclusive bounds, so that infinite values are not reported as NaN
    if float('-inf') <= float(num) <= float('inf'):
        return False
    else:
        return True

x = float("nan")
print(isNaN(x))
Output
True
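One edge case worth checking with the range approach is infinity itself. The sketch below uses inclusive comparisons, which is an assumption about the intended behaviour: with strict comparisons, positive and negative infinity would also be reported as NaN. The function name is illustrative.

```python
def is_nan_by_range(value):
    number = float(value)
    # Inclusive bounds: +inf and -inf count as inside the range, NaN does not.
    return not (float("-inf") <= number <= float("inf"))


nan_result = is_nan_by_range(float("nan"))
inf_result = is_nan_by_range(float("inf"))
int_result = is_nan_by_range(42)
print(nan_result, inf_result, int_result)  # True False False
```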
I hope you found this article helpful. I am sure there are many other techniques for checking NaN values based on other logic. Please share any other methods you have come across for checking NaN/null values.
Cheers!
You have a couple of options.
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(10, 6))
# Make a few areas have NaN values
df.iloc[1:3, 1] = np.nan
df.iloc[5, 3] = np.nan
df.iloc[7:9, 5] = np.nan
Now the data frame looks something like this:
          0         1         2         3         4         5
0  0.520113  0.884000  1.260966 -0.236597  0.312972 -0.196281
1 -0.837552       NaN  0.143017  0.862355  0.346550  0.842952
2 -0.452595       NaN -0.420790  0.456215  1.203459  0.527425
3  0.317503 -0.917042  1.780938 -1.584102  0.432745  0.389797
4 -0.722852  1.704820 -0.113821 -1.466458  0.083002  0.011722
5 -0.622851 -0.251935 -1.498837       NaN  1.098323  0.273814
6  0.329585  0.075312 -0.690209 -3.807924  0.489317 -0.841368
7 -1.123433 -1.187496  1.868894 -2.046456 -0.949718       NaN
8  1.133880 -0.110447  0.050385 -1.158387  0.188222       NaN
9 -0.513741  1.196259  0.704537  0.982395 -0.585040 -1.693810
- Option 1: df.isnull().any().any() - This returns a Boolean value.
You know that isnull() returns a data frame like this:
       0      1      2      3      4      5
0  False  False  False  False  False  False
1  False   True  False  False  False  False
2  False   True  False  False  False  False
3  False  False  False  False  False  False
4  False  False  False  False  False  False
5  False  False  False   True  False  False
6  False  False  False  False  False  False
7  False  False  False  False  False   True
8  False  False  False  False  False   True
9  False  False  False  False  False  False
If you make it df.isnull().any(), you can find just the columns that have NaN values:
0    False
1     True
2    False
3     True
4    False
5     True
dtype: bool
One more .any() will tell you whether any of the above are True:
> df.isnull().any().any()
True
- Option 2: df.isnull().sum().sum() - This returns an integer, the total number of NaN values.
This works the same way as .any().any() does, by first giving the sum of the number of NaN values in each column, then the sum of those values:
df.isnull().sum()
0    0
1    2
2    0
3    1
4    0
5    2
dtype: int64
Finally, to get the total number of NaN values in the DataFrame:
df.isnull().sum().sum()
5
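The two options can be checked on a small, deterministic data frame, which is easier to verify by eye than random data. The frame df_small and its column names are assumptions made for this sketch. The last line goes one step further and selects the rows that contain at least one NaN.

```python
import numpy as np
import pandas as pd

# Small illustrative frame with three NaN values in known positions.
df_small = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

# Option 1: is there any NaN at all?
has_nan = bool(df_small.isnull().any().any())

# Option 2: how many NaN values are there in total?
total_nan = int(df_small.isnull().sum().sum())

# Rows that contain at least one NaN (any() across columns, axis=1).
rows_with_nan = df_small[df_small.isnull().any(axis=1)]

print(has_nan, total_nan, list(rows_with_nan.index))  # True 3 [0, 1]
```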