Python-docx replace text in header

I have created a Word document and added a footer to it using python-docx. Now I want to replace text in that footer.

For example, the footer contains the paragraph "Page 1 to 10". I want to find the word "to" and replace it with "of", so that the result is "Page 1 of 10".

My code:

from docx import Document
document = Document()
document.add_heading('Document Title', 0)
section = document.sections[0]
footer = section.footer
footer.add_paragraph("Page 1 to 10")
document.save('mydoc.docx')

asked Feb 18, 2019 at 11:37

Ekanshu

Try this solution. Each section has its own footer, so you have to iterate over all the sections in the document:

from docx import Document

document = Document('mydoc.docx')
for section in document.sections:
    footer = section.footer
    # paragraphs[0] is the empty default paragraph; the added text is in paragraphs[1]
    footer.paragraphs[1].text = footer.paragraphs[1].text.replace("to", "of")

document.save('mydoc.docx')

This code edits the second element of the footer's paragraphs list because your code added another paragraph to the footer. According to the documentation, the footer already contains an empty paragraph by default.

answered Feb 18, 2019 at 12:40

mbbce

I know this is an old question, but the accepted answer changes the page number in the footer, and fixing that requires editing the XML. I have found that if you edit the text through runs instead, the header/footer keeps its original formatting. See https://python-docx.readthedocs.io/en/latest/api/text.html#docx.text.run.Run

from docx import Document

document = Document('mydoc.docx')
for section in document.sections:
    footer = section.footer
    # Edit the first run of the paragraph so that its formatting is kept
    run = footer.paragraphs[1].runs[0]
    run.text = run.text.replace("to", "of")

document.save('mydoc.docx')

answered Feb 4 at 17:57

Word supports page headers and page footers. A page header is text that appears in the top margin area of each page, separated from the main body of text, and usually conveying context information, such as the document title, author, creation date, or the page number. The page headers in a document are the same from page to page, with only small differences in content, such as a changing section title or page number. A page header is also known as a running head.

A page footer is analogous in every way to a page header except that it appears at the bottom of a page. It should not be confused with a footnote, which is not uniform between pages. For brevity’s sake, the term header is often used here to refer to what may be either a header or footer object, trusting the reader to understand its applicability to both object types.

Adding “zoned” header content

A header with multiple “zones” is often accomplished using carefully placed tab stops.

The required tab-stops for a center and right-aligned “zone” are part of the Header and Footer styles in Word. If you’re using a custom template rather than the python-docx default, it probably makes sense to define that style in your template.

Inserted tab characters ("\t") are used to separate left, center, and right-aligned header content:

>>> paragraph = header.paragraphs[0]
>>> paragraph.text = "Left Text\tCenter Text\tRight Text"
>>> paragraph.style = document.styles["Header"]

The Header style is automatically applied to a new header, so the third line just above (applying the Header style) is unnecessary in this case, but included here to illustrate the general case.

Understanding headers in a multi-section document

The “just start editing” approach works fine for the simple case, but to make sense of header behaviors in a multi-section document, a few simple concepts will be helpful. Here they are in a nutshell:

  1. Each section can have its own header definition (but doesn’t have to).
  2. A section that lacks a header definition inherits the header of the section before it. The _Header.is_linked_to_previous property simply reflects the presence of a header definition, False when a definition is present and True when not.
  3. Lacking a header definition is the default state. A new document has no defined header and neither does a newly-inserted section. .is_linked_to_previous reports True in both those cases.
  4. The content of a _Header object is its own content if it has a header definition. If not, its content is that of the first prior section that does have a header definition. If no sections have a header definition, a new one is added on the first section and all other sections inherit that one. This adding of a header definition happens the first time header content is accessed, perhaps by referencing header.paragraphs.