You can directly access the year and month attributes, or request a datetime.datetime:
    In [15]: t = pandas.Timestamp.now()

    In [16]: t
    Out[16]: Timestamp('2014-08-05 14:49:39.643701', tz=None)

    In [17]: t.to_pydatetime()  # datetime method is deprecated
    Out[17]: datetime.datetime(2014, 8, 5, 14, 49, 39, 643701)

    In [18]: t.day
    Out[18]: 5

    In [19]: t.month
    Out[19]: 8

    In [20]: t.year
    Out[20]: 2014

One way to combine year and month is to make an integer encoding them, such as 201408 for August 2014. Along a whole column, you could do this as:
    df['YearMonth'] = df['ArrivalDate'].map(lambda x: 100*x.year + x.month)

or many variants thereof.
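One such variant, sketched here under the assumption that ArrivalDate is already a datetime64 column, uses the `.dt` accessor instead of a per-row lambda. The sample dates are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the text above.
df = pd.DataFrame({'ArrivalDate': pd.to_datetime(['2014-08-05', '2015-01-17'])})

# Vectorized version of 100*year + month, computed on the whole column at once.
df['YearMonth'] = df['ArrivalDate'].dt.year * 100 + df['ArrivalDate'].dt.month
```

This gives the same integer encoding as the `map` version.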
I'm not a big fan of doing this, though, since it makes date alignment and arithmetic painful later and especially painful for others who come upon your code or data without this same convention. A better way is to choose a day-of-month convention, such as final non-US-holiday weekday, or first day, etc., and leave the data in a date/time format with the chosen date convention.
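As one possible illustration of the first-day convention mentioned above, the sketch below snaps each date to the first day of its month while keeping a datetime dtype. The DataFrame, its sample dates, and the MonthStart column name are made up for the example; only the ArrivalDate column name comes from the earlier code.

```python
import pandas as pd

# Hypothetical sample data; 'ArrivalDate' mirrors the column name used earlier.
df = pd.DataFrame({'ArrivalDate': pd.to_datetime(['2014-08-05', '2014-12-31'])})

# Convert to a monthly period, then back to a timestamp at the period start.
df['MonthStart'] = df['ArrivalDate'].dt.to_period('M').dt.to_timestamp()
```

Because the result is still a datetime column, date alignment and arithmetic keep working as usual.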
The calendar module is useful for obtaining the number value of certain days such as the final weekday. Then you could do something like:
    import calendar
    import datetime

    df['AdjustedDateToEndOfMonth'] = df['ArrivalDate'].map(
        lambda x: datetime.datetime(
            x.year,
            x.month,
            # last week of the month, Monday to Friday; unused slots are 0
            max(calendar.monthcalendar(x.year, x.month)[-1][:5])
        )
    )

If you are only looking to solve the simpler problem of formatting the datetime column as a string, you can use the strftime method of the datetime.datetime class, like this:
    In [5]: df
    Out[5]:
                date_time
    0 2014-10-17 22:00:03

    In [6]: df.date_time
    Out[6]:
    0   2014-10-17 22:00:03
    Name: date_time, dtype: datetime64[ns]

    In [7]: df.date_time.map(lambda x: x.strftime('%Y-%m-%d'))
    Out[7]:
    0    2014-10-17
    Name: date_time, dtype: object

In this short guide, I'll show you how to extract the month and year from a DateTime column in a Pandas DataFrame. You will also see how to convert string data to DateTime. At the end you will get:
01/08/2021 -> 2021-08
DD/MM/YYYY -> YYYY-MM
or any other date format. We will also cover MM/YYYY.
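As a quick preview of the DD/MM/YYYY -> YYYY-MM conversion, here is a minimal sketch. The sample strings are made up, and the explicit `format` argument is an assumption about how the input is laid out (day first).

```python
import pandas as pd

# Hypothetical day-first date strings (DD/MM/YYYY).
raw = pd.Series(['01/08/2021', '15/09/2021'])

# Parse with an explicit format so 01/08 is read as 1 August, not 8 January.
parsed = pd.to_datetime(raw, format='%d/%m/%Y')

# Reduce to year-month and render as text.
year_month = parsed.dt.to_period('M').astype(str)
```

Without the explicit format, an ambiguous string such as 01/08/2021 may be parsed month-first.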
To start, here is the syntax that you can apply in order to extract the year and month together:

    .dt.to_period('M')

In the next section, I'll review the steps to apply the above syntax in practice.
Step 1: Create a DataFrame with Datetime values
Let's create a DataFrame which has a single column, StartDate:

    import pandas as pd

    dates = ['2021-08-01', '2021-08-02', '2021-08-03']
    df = pd.DataFrame({'StartDate': dates})

result:
StartDate |
2021-08-01 |
2021-08-02 |
2021-08-03 |
In order to convert the strings to a Datetime column we are going to use:

    df['StartDate'] = pd.to_datetime(df['StartDate'])

Step 2: Extract Year and Month with .dt.to_period('M') - format YYYY-MM
To extract only the year and the month from a full date (2021-08-01 -> 2021-08), we need just this line:

    df['StartDate'].dt.to_period('M')

result:

    0    2021-08
    1    2021-08
    2    2021-08

Step 3: Extract Year and Month in other formats - MM/YYYY
What if you would like to get the month first and then the year? In this case we will use .dt.strftime to produce a column with the format MM/YYYY, or any other format.

    df['StartDate'].dt.strftime('%m/%Y')

result:

    0    08/2021
    1    08/2021
    2    08/2021

Note: keep in mind that this solution might be really slow on a huge DataFrame.
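Putting Steps 1 to 3 together, a self-contained sketch might look like the following. The variable names `year_month_period` and `month_year_text` are only illustrative. It also shows that the two approaches return different kinds of data: a period column versus plain strings.

```python
import pandas as pd

# Step 1: build the DataFrame and convert the strings to datetimes.
df = pd.DataFrame({'StartDate': ['2021-08-01', '2021-08-02', '2021-08-03']})
df['StartDate'] = pd.to_datetime(df['StartDate'])

# Step 2: year and month as a monthly period (dtype period[M]).
year_month_period = df['StartDate'].dt.to_period('M')

# Step 3: month and year as formatted text (plain strings).
month_year_text = df['StartDate'].dt.strftime('%m/%Y')
```

The period column still supports date-aware operations such as sorting and grouping by month, while the strftime result is just text.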
Step 4: Extracting Year and Month separately and combine them
A slightly faster solution than Step 3, which also keeps the month and year as separate columns, is to:
- extract the year and month into separate columns
- combine both columns into a single one
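The extraction step can be sketched as follows, assuming the column names yyyy and mm that the combining expression in this step relies on. The DataFrame setup repeats Step 1 so the snippet runs on its own.

```python
import pandas as pd

# Same data as in Step 1, already converted to datetimes.
df = pd.DataFrame({'StartDate': pd.to_datetime(['2021-08-01', '2021-08-02', '2021-08-03'])})

# Extract the year and the month into their own integer columns.
df['yyyy'] = df['StartDate'].dt.year
df['mm'] = df['StartDate'].dt.month
```

After this, the DataFrame has the three columns StartDate, yyyy and mm.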
StartDate | yyyy | mm |
2021-08-01 | 2021 | 8 |
2021-08-02 | 2021 | 8 |
2021-08-03 | 2021 | 8 |
and then:

    df['yyyy'].astype(str) + '-' + df['mm'].astype(str)

Note: If you don't need the extra columns you can just do:

    df['StartDate'].dt.year.astype(str) + "-" + df['StartDate'].dt.month.astype(str)

Notebook with all examples: Extract Month and Year from DateTime column
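One caveat with the string concatenation in Step 4: the month is not zero-padded, so August comes out as 2021-8 rather than 2021-08. If the YYYY-MM form is needed, one option is to pad the month with `.str.zfill(2)`, as in this sketch (the variable names `unpadded` and `padded` are only illustrative):

```python
import pandas as pd

# Same data as in Step 1, already converted to datetimes.
df = pd.DataFrame({'StartDate': pd.to_datetime(['2021-08-01', '2021-08-02', '2021-08-03'])})

year = df['StartDate'].dt.year.astype(str)
month = df['StartDate'].dt.month.astype(str)

# Plain concatenation leaves single-digit months unpadded.
unpadded = year + '-' + month

# Left-pad the month to two characters to get YYYY-MM.
padded = year + '-' + month.str.zfill(2)
```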