How do i edit a column in a dataframe in python?


Examples of how to edit a pandas dataframe column values where a condition is verified in python:

Summary

  • 1 -- Create a simple dataframe with pandas
  • 2 -- Select a column
  • 3 -- Select only elements of the column where a condition is verified
  • 4 -- Select only elements of the column where multiple conditions are verified
  • 5 -- References

1 -- Create a simple dataframe with pandas

Lets' start by creating a simple dataframe with 5 columns and 20 rows:

>>> import pandas as pd
>>> import numpy as np
>>> data = np.arange(1,101)
>>> data = data.reshape(20,5)
>>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e'])

returns

>>> df
     a   b   c   d    e
0    1   2   3   4    5
1    6   7   8   9   10
2   11  12  13  14   15
3   16  17  18  19   20
4   21  22  23  24   25
5   26  27  28  29   30
6   31  32  33  34   35
7   36  37  38  39   40
8   41  42  43  44   45
9   46  47  48  49   50
10  51  52  53  54   55
11  56  57  58  59   60
12  61  62  63  64   65
13  66  67  68  69   70
14  71  72  73  74   75
15  76  77  78  79   80
16  81  82  83  84   85
17  86  87  88  89   90
18  91  92  93  94   95
19  96  97  98  99  100

2 -- Select a column

To select a given column, a solution is to do:

>>> df['b']
0      2
1      7
2     12
3     17
4     22
5     27
6     32
7     37
8     42
9     47
10    52
11    57
12    62
13    67
14    72
15    77
16    82
17    87
18    92
19    97
Name: b, dtype: int64

or

>>> df.loc[:,'b']
0      2
1      7
2     12
3     17
4     22
5     27
6     32
7     37
8     42
9     47
10    52
11    57
12    62
13    67
14    72
15    77
16    82
17    87
18    92
19    97
Name: b, dtype: int64

Note: if the column does not have a name, there is the function iloc():

 >>> df.iloc[:,1]
0      2
1      7
2     12
3     17
4     22
5     27
6     32
7     37
8     42
9     47
10    52
11    57
12    62
13    67
14    72
15    77
16    82
17    87
18    92
19    97

To edit (for example here by multiplying by 10) all elements of the column b, a solution. is to do:

>>> df['b'] = df['b'] * 10
>>> df
     a    b   c   d    e
0    1   20   3   4    5
1    6   70   8   9   10
2   11  120  13  14   15
3   16  170  18  19   20
4   21  220  23  24   25
5   26  270  28  29   30
6   31  320  33  34   35
7   36  370  38  39   40
8   41  420  43  44   45
9   46  470  48  49   50
10  51  520  53  54   55
11  56  570  58  59   60
12  61  620  63  64   65
13  66  670  68  69   70
14  71  720  73  74   75
15  76  770  78  79   80
16  81  820  83  84   85
17  86  870  88  89   90
18  91  920  93  94   95
19  96  970  98  99  100

3 -- Select only elements of the column where a condition is verified

For example to edit only the values that are greater than 500:

>>> df['b'][ df['b'] > 500 ] = df['b'][ df['b'] > 500 ] / 10
>>> df
     a    b   c   d    e
0    1   20   3   4    5
1    6   70   8   9   10
2   11  120  13  14   15
3   16  170  18  19   20
4   21  220  23  24   25
5   26  270  28  29   30
6   31  320  33  34   35
7   36  370  38  39   40
8   41  420  43  44   45
9   46  470  48  49   50
10  51   52  53  54   55
11  56   57  58  59   60
12  61   62  63  64   65
13  66   67  68  69   70
14  71   72  73  74   75
15  76   77  78  79   80
16  81   82  83  84   85
17  86   87  88  89   90
18  91   92  93  94   95
19  96   97  98  99  100

Note: if the name of the column is not specify all columns will be modified, illustration:

>>> df[ df['b'] > 500 ] = df[ df['b'] > 500 ] / 10
>>> df
       a      b     c     d     e
0    1.0   20.0   3.0   4.0   5.0
1    6.0   70.0   8.0   9.0  10.0
2   11.0  120.0  13.0  14.0  15.0
3   16.0  170.0  18.0  19.0  20.0
4   21.0  220.0  23.0  24.0  25.0
5   26.0  270.0  28.0  29.0  30.0
6   31.0  320.0  33.0  34.0  35.0
7   36.0  370.0  38.0  39.0  40.0
8   41.0  420.0  43.0  44.0  45.0
9   46.0  470.0  48.0  49.0  50.0
10   5.1   52.0   5.3   5.4   5.5
11   5.6   57.0   5.8   5.9   6.0
12   6.1   62.0   6.3   6.4   6.5
13   6.6   67.0   6.8   6.9   7.0
14   7.1   72.0   7.3   7.4   7.5
15   7.6   77.0   7.8   7.9   8.0
16   8.1   82.0   8.3   8.4   8.5
17   8.6   87.0   8.8   8.9   9.0
18   9.1   92.0   9.3   9.4   9.5
19   9.6   97.0   9.8   9.9  10.0

4 -- Select only elements of the column where multiple conditions are verified

Example lets select the elements greater than 12 and lower than 40 in the column b:

>>> df['b'][ (df['b'] > 12) & (df['b'] < 40) ]
3    17
4    22
5    27
6    32
7    37

Edit those elements by adding 5

>>> df['b'][ (df['b'] > 12) & (df['b'] < 40) ] = df['b'][ (df['b'] > 12) & (df['b'] < 40) ] + 5
>>> df
     a   b   c   d    e
0    1   2   3   4    5
1    6   7   8   9   10
2   11  12  13  14   15
3   16  22  18  19   20
4   21  27  23  24   25
5   26  32  28  29   30
6   31  37  33  34   35
7   36  42  38  39   40
8   41  42  43  44   45
9   46  47  48  49   50
10  51  52  53  54   55
11  56  57  58  59   60
12  61  62  63  64   65
13  66  67  68  69   70
14  71  72  73  74   75
15  76  77  78  79   80
16  81  82  83  84   85
17  86  87  88  89   90
18  91  92  93  94   95
19  96  97  98  99  100

5 -- References

  • Selecting Subsets of Data in Pandas: Part 1
  • Select Rows & Columns by Name or Index in DataFrame using loc & iloc Python Pandas
  • pandas.DataFrame.loc
  • pandas.DataFrame.iloc

How do you modify a column in a data frame?

Use pandas..
print(df).
def add_one(x):.
return x + 1..
df[1] = df[1]. apply(add_one).
print(df).

How do I change a whole column in pandas?

In order to replace a value in Pandas DataFrame, use the replace() method with the column the from and to values.

How do you change a value in a column in Python?

To replace values in the column, call DataFrame. replace(to_replace, inplace=True) with to_replace set as a dictionary mapping old values to new values.

Can you edit a DataFrame in Python?

Deleting rows and columns (drop) The drop function returns a new DataFrame, with the columns removed. To actually edit the original DataFrame, the “inplace” parameter can be set to True, and there is no returned value. Rows can also be removed using the “drop” function, by specifying axis=0.