Before proceeding with this post, it is important to understand the difference between NaN and None. One is a float type, the other is an object type. Pandas is better suited to working with scalar types, as many methods on these types can be vectorised. Pandas does try to handle None and NaN consistently, but NumPy cannot.
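The difference between the two can be seen directly in a short session. This is only an illustrative sketch; the variable names (`arr_none`, `arr_nan`, `s`) are made up for the example.

```python
import numpy as np
import pandas as pd

# NaN is an ordinary float; None is Python's null object.
print(type(np.nan))
print(type(None))

# NumPy keeps None as a Python object, so the array falls back to object dtype.
arr_none = np.array([1.0, None])
arr_nan = np.array([1.0, np.nan])
print(arr_none.dtype, arr_nan.dtype)

# pandas converts None to NaN in numeric data, so both count as missing.
s = pd.Series([1.0, None])
print(s.dtype, s.isna().tolist())
```

With an object array, NumPy can no longer use its fast vectorised float routines, which is the practical reason for preferring NaN.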
My suggestion (and Andy's) is to stick with NaN. But to answer your question...

pandas >= 0.18: Use the na_values=['-'] argument with read_csv

If you loaded this data from CSV/Excel, I have good news for you. You can quash this at the root during data loading instead of having to write a fix with code as a subsequent step. Most of the pd.read_* functions (read_csv and read_excel, for example) accept an na_values argument.

Now, to convert the '-' characters to NaN, pass na_values=['-'] when reading the file.
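A minimal sketch of loading with `na_values`, assuming a one-column CSV. The column name `value` and the inline `csv_text` are hypothetical stand-ins for the real file.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for the real file on disk.
csv_text = "value\n-\n3\n2\n5\n1\n-5\n-1\n-\n9\n"

# na_values tells the parser to treat a bare '-' as missing while loading.
df = pd.read_csv(io.StringIO(csv_text), na_values=["-"])

print(df["value"].isna().sum())  # how many '-' cells were parsed as NaN
print(df["value"].dtype)         # a float column, because NaN is a float
```

Negative numbers such as `-5` are untouched, since only cells that consist of `-` alone match the `na_values` entry.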
And similar for other functions/file formats.

P.S.: On v0.24+, you can preserve integer type even if your column has NaNs (yes, talk about having the cake and eating it too). You can specify a nullable integer dtype for the column when loading.
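As a sketch of what that could look like, the example below requests the nullable `Int64` dtype at load time. The column name `value`, the sample data, and the choice of `Int64` (rather than another nullable width) are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content; '-' marks a missing entry.
csv_text = "value\n-\n3\n2\n5\n1\n-5\n-1\n-\n9\n"

# Note the capital "I": "Int64" is the nullable integer dtype, not numpy's int64.
df = pd.read_csv(
    io.StringIO(csv_text),
    na_values=["-"],
    dtype={"value": "Int64"},
)

print(df["value"].dtype)         # Int64
print(df["value"].isna().sum())  # missing entries are kept as <NA>
```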
The dtype is not a conventional int type... but rather, a Nullable Integer Type. There are other options.

Handling Numeric Data: pd.to_numeric with errors='coerce'

If you're dealing with numeric data, a faster solution is to use pd.to_numeric with the errors='coerce' argument, which coerces invalid values to NaN.
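A minimal sketch of the `pd.to_numeric` approach on a column that is already loaded. The Series `s` is hypothetical sample data modelled on the question.

```python
import pandas as pd

# Hypothetical column of strings where '-' marks a missing entry.
s = pd.Series(["-", "3", "2", "5", "1", "-5", "-1", "-", "9"])

# errors="coerce" turns anything that cannot be parsed as a number into NaN.
numeric = pd.to_numeric(s, errors="coerce")

print(numeric.dtype)         # float64
print(numeric.isna().sum())  # the two '-' entries
```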
To retain (nullable) integer dtype, cast the coerced result to a nullable integer type afterwards.
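One way to do this, sketched under the assumption that every non-missing value is a whole number, is to chain `astype("Int64")` after the coercion. The `Int64` width is an assumption; other nullable integer widths exist.

```python
import pandas as pd

# Hypothetical column of strings where '-' marks a missing entry.
s = pd.Series(["-", "3", "2", "5", "1", "-5", "-1", "-", "9"])

# Coerce to float with NaN first, then cast to the nullable integer dtype.
nullable = pd.to_numeric(s, errors="coerce").astype("Int64")

print(nullable.dtype)         # Int64
print(nullable.isna().sum())  # missing entries survive as <NA>
```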
To coerce multiple columns, apply pd.to_numeric across them with DataFrame.apply
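A sketch of the multi-column case, assuming a frame with two dirty numeric columns and one text column that should be left alone. All column names here are hypothetical.

```python
import pandas as pd

# Hypothetical frame: "a" and "b" hold numbers as strings, "label" is real text.
df = pd.DataFrame({
    "a": ["-", "3", "2"],
    "b": ["5", "-", "-1"],
    "label": ["x", "y", "z"],
})

cols = ["a", "b"]  # only the columns that should become numeric

# apply() calls pd.to_numeric once per column and forwards errors="coerce".
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

print(df)
```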
...and assign the result back after. More information can be found in this answer.

I'm learning pandas and I came across this problem.

[code]
df.replace('-', None)
[/code]

and this returns a pretty strange result

[code]
    0
0   -   // this isn't replaced
1   3
2   2
3   5
4   1
5  -5
6  -1
7  -1   // this is changed to `-1`...
8   9
[/code]

what exactly am I doing wrong?
Apr 3, 2018 in Python by
4 answers to this question.

Actually in later versions of pandas this will give a TypeError:

df.replace('-', None)
TypeError: If "to_replace" and "value" are both None then regex must be a mapping

You can do it by passing either a list or a dictionary:

In [11]: df.replace(['-'], [None])  # or .replace('-', {0: None})
Out[11]:
      0
0  None
1     3
2     2
3     5
4     1
5    -5
6    -1
7  None
8     9

But I recommend using NaNs rather than None:

In [12]: df.replace('-', np.nan)
Out[12]:
     0
0  NaN
1    3
2    2
3    5
4    1
5   -5
6   -1
7  NaN
8    9
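For reference, a self-contained sketch of the NaN variant. The frame below is reconstructed from the output shown in the question, so its exact contents are an assumption.

```python
import numpy as np
import pandas as pd

# Frame reconstructed from the question: a single column named 0 with two '-' cells.
df = pd.DataFrame({0: ["-", 3, 2, 5, 1, -5, -1, "-", 9]})

# Replace the placeholder with NaN; replace() returns a new frame.
cleaned = df.replace("-", np.nan)

print(cleaned[0].isna().sum())  # the two former '-' cells
```

The resulting column dtype can differ between pandas versions, so if a numeric dtype is needed it is safer to convert explicitly afterwards.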
answered Aug 13, 2018 by bug_seeker

I found replace with a dict to be the simplest and most elegant solution:

df.replace({'-': None})

You can also have more replacements:

df.replace({'-': None, 'None': None})

And even for larger replacements, it is always obvious and clear what is replaced by what, which is way harder for long lists, in my opinion.

answered Oct 12, 2018 by findingbugs
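A small sketch of the dict form with two replacements. The sample frame is hypothetical and uses the literal string 'None' as a second placeholder.

```python
import pandas as pd

# Hypothetical column with two different placeholder strings.
df = pd.DataFrame({0: ["-", "3", "None", "5"]})

# Each key is a value to look for; each dict value is what replaces it.
result = df.replace({"-": None, "None": None})

print(result[0].isna().tolist())
```

Whether the replaced cells show up as `None` or `NaN` can depend on the pandas version and column dtype, but `isna()` treats both as missing.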
How do I replace Na in Python?
Replace NaN Values with Zeros in Pandas DataFrame:
(1) For a single column using Pandas: df['DataFrame Column'] = df['DataFrame Column'].fillna(0)
(2) For a single column using NumPy: df['DataFrame Column'] = df['DataFrame Column'].replace(np.nan, 0)
(3) For an entire DataFrame using Pandas: df.fillna(0)

How do you fill a NA value in Python?
The fillna() method replaces the NULL values with a specified value. It returns a new DataFrame object unless the inplace parameter is set to True, in which case it does the replacing in the original DataFrame instead.
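The `fillna` variants above can be sketched as follows. The frame and its column names `a` and `b` are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in each column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

# Single column: fill and assign the result back.
df["a"] = df["a"].fillna(0)

# Entire frame: fillna returns a new DataFrame and leaves df unchanged.
filled = df.fillna(0)

print(df)
print(filled)
```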
How do I change NaN to NA in Python?
Methods to replace NaN values with zeros in Pandas DataFrame:
fillna(): The fillna() function is used to fill NA/NaN values using the specified method.
replace(): The DataFrame.replace() function in Pandas can be defined as a simple method used to replace a string, regex, list, dictionary etc. in a DataFrame.

How do I fill a string with Na?
Pandas: How to Replace NaN Values with String.
Method 1: Replace NaN Values with String in Entire DataFrame: df.fillna('', inplace=True)
Method 2: Replace NaN Values with String in Specific Columns: df[['col1', 'col2']] = df[['col1', 'col2']].fillna('')
Method 3: Replace NaN Values with String in One Column: df.col1 = df...
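A short sketch of the specific-columns variant, assuming two text columns named `col1` and `col2` plus a numeric column `n` that should keep its NaN. The data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: two text columns with gaps and one numeric column.
df = pd.DataFrame({
    "col1": ["x", None, "z"],
    "col2": [None, "b", "c"],
    "n": [1.0, np.nan, 3.0],
})

# Fill only the text columns with an empty string; "n" is left untouched.
df[["col1", "col2"]] = df[["col1", "col2"]].fillna("")

print(df)
```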