
Guide: loop through XML in Python

My data set looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<depts xmlns="//SOMELINK" xmlns:xsd="//www.w3.org/2001/XMLSchema" xmlns:xsi="//www.w3.org/2001/XMLSchema-instance" date="2021-01-15">
  <dept dept_id="00001" col_two="00001value" col_three="00001false" name="some_name">
    <owners>
      <currentowner col_four="00001value" col_five="00001value" col_six="00001false" name="some_name">
        <addr col_seven="00001value" col_eight="00001value" col_nine="00001false"/>
      </currentowner>
      <currentowner col_four="00001bvalue" col_five="00001bvalue" col_six="00001bfalse" name="some_name">
        <addr col_seven="00001bvalue" col_eight="00001bvalue" col_nine="00001bfalse"/>
      </currentowner>
    </owners>
  </dept>
  <dept dept_id="00002" col_two="00002value" col_three="00002value" name="some_name">
    <owners>
      <currentowner col_four="00002value" col_five="00002value" col_six="00002false" name="some_name">
        <addr col_seven="00002value" col_eight="00002value" col_nine="00002false"/>
      </currentowner>
    </owners>
  </dept>
</depts>

Currently I have two loops: one iterates over the dept elements, the other over the currentowner elements.

import pandas
import xml.etree.ElementTree as element_tree
from xml.etree.ElementTree import parse

tree = element_tree.parse('<HERE_GOES_XML>')
root = tree.getroot()
name_space = {'ns0': '//SOMELINK'}

# root
date_from = root.attrib['date']
print(date_from)

# child
for pharma in root.findall('.//ns0:dept', name_space):
    for key, value in pharma.items():
        print(key + ': ' + value)

# grandchild, this must be merged to above so entire script will iterate
# through entire dept node to move to the next
for owner in root.findall('.//ns0:dept/ns0:owners/ns0:currentowner', name_space):
    owner_dict = {}
    for key, value in owner.items():
        print(key + ': ' + value)

The current output is:

2021-01-15
dept_id: 00001
col_two: 00001value
col_three: 00001false
dept_id: 00002
col_two: 00002value
col_three: 00002value
col_four: 00001value
col_five: 00001value
col_six: 00001false
col_four: 00002value
col_five: 00002value
col_six: 00002false

I am aiming for a nested traversal that first walks through an entire dept element together with its children and only then moves on to the next one. The expected output is shown below; it will later be converted into a pandas DataFrame (I will try to work on that next).
Some attribute names are shared between the child and grandchild elements, so either a prefix will be needed or the iteration will have to be limited to specific ones.

dept.dept_id: 00001
dept.col_two: 00001value
dept.col_three: 00001false
dept.name: some_name
currentowner.col_four: 00001value
currentowner.col_five: 00001value
currentowner.col_six: 00001false
currentowner.name: some_name
currentowner.col_four: 00001bvalue
currentowner.col_five: 00001bvalue
currentowner.col_six: 00001bfalse
currentowner.name: some_name
addr.col_seven: 00001value
addr.col_eight: 00001value
addr.col_nine: 00001false
dept.dept_id: 00002
dept.col_two: 00002value
dept.col_three: 00002value
dept.name: some_name
currentowner.col_four: 00002value
currentowner.col_five: 00002value
currentowner.col_six: 00002false
currentowner.name: some_name
addr.col_seven: 00002value
addr.col_eight: 00002value
addr.col_nine: 00002false

[Update] - I came across the following, which should do the trick:

dept_list = []
for item in root.iterfind('.//ns0:dept', name_space):
    # print(item.attrib)
    dept_list.append(item.attrib)
# print(dept_list)

owner_list = []
for item in root.iterfind('.//ns0:dept/ns0:owners/ns0:currentowner', name_space):
    # print(item.attrib)
    owner_list.append(item.attrib)
# print(owner_list)

# Note: zip() pairs entries by position, not by parent element, so a dept
# with several currentowner entries will not line up with its own owners.
zipped = zip(dept_list, owner_list)
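One possible way to get the nested view is to loop over each dept and, inside that loop, search only within the current dept for its currentowner elements. The sketch below is not the asker's code: it assumes a shortened copy of the sample document held in a string, and the helper names prefixed and flatten are made up for illustration. It builds one dictionary per currentowner, with every key prefixed by its tag name so that shared attribute names such as name do not collide.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document (an assumption for this sketch).
XML = """<depts xmlns="//SOMELINK" date="2021-01-15">
  <dept dept_id="00001" name="some_name">
    <owners>
      <currentowner col_four="00001value" name="some_name">
        <addr col_seven="00001value"/>
      </currentowner>
      <currentowner col_four="00001bvalue" name="some_name">
        <addr col_seven="00001bvalue"/>
      </currentowner>
    </owners>
  </dept>
  <dept dept_id="00002" name="some_name">
    <owners>
      <currentowner col_four="00002value" name="some_name">
        <addr col_seven="00002value"/>
      </currentowner>
    </owners>
  </dept>
</depts>"""

NS = {"ns0": "//SOMELINK"}


def prefixed(tag, attrib):
    """Return a copy of attrib whose keys are prefixed with 'tag.'."""
    return {f"{tag}.{key}": value for key, value in attrib.items()}


def flatten(root):
    """Build one row per currentowner, repeating its parent dept's columns."""
    rows = []
    for dept in root.findall("ns0:dept", NS):
        dept_cols = prefixed("dept", dept.attrib)
        # Search relative to the current dept, not the document root.
        for owner in dept.findall("ns0:owners/ns0:currentowner", NS):
            row = dict(dept_cols)
            row.update(prefixed("currentowner", owner.attrib))
            addr = owner.find("ns0:addr", NS)
            if addr is not None:
                row.update(prefixed("addr", addr.attrib))
            rows.append(row)
    return rows


root = ET.fromstring(XML)
rows = flatten(root)
for row in rows:
    print(row)
```

Because each row is a plain dictionary, the resulting list could be handed to pandas.DataFrame(rows) to get one DataFrame row per currentowner.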

What is XML?

XML stands for eXtensible Markup Language. It was designed to store and transport small to medium amounts of data and is widely used for sharing structured information.

Python lets you parse and modify XML documents. To parse an XML document with a DOM parser such as minidom, the whole document has to be loaded into memory. In this tutorial, we will see how to use the xml.dom.minidom module in Python to load and parse XML files.

How to parse XML using minidom

We have created a sample XML file that we are going to parse.

Step 1) Create a sample XML file

Inside the file, we can see the first name, last name, home, and areas of expertise (SQL, Python, Testing and Business).
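The sample file itself is not shown, so the following is only a guess at its shape. The employee root and the expertise elements with a name attribute match the code later in this tutorial; the fname, lname and home element names, their placeholder values, and the write_sample helper are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical sample file. Only <employee>, <expertise> and the "name"
# attribute are taken from the tutorial; the other names are assumptions.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<employee>
  <fname>first-name</fname>
  <lname>last-name</lname>
  <home>home-town</home>
  <expertise name="SQL"/>
  <expertise name="Python"/>
  <expertise name="Testing"/>
  <expertise name="Business"/>
</employee>
"""


def write_sample(path):
    """Write the sample document to path and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE_XML)
    return path


# Write into a fresh temporary directory so nothing existing is overwritten.
sample_path = write_sample(os.path.join(tempfile.mkdtemp(), "Myxml.xml"))
print(sample_path)
```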

Step 2) Use the parse function to load and parse the XML file

Once we have parsed the document, we will print out the "node name" of the root of the document and the "firstChild tagName". tagName and nodeName are standard properties of XML DOM nodes.

Import the xml.dom.minidom module and declare the file that has to be parsed (myxml.xml)
This file carries some basic information about an employee such as first name, last name, home, expertise, etc.
We use the parse function of xml.dom.minidom to load and parse the XML file
The variable doc receives the result of the parse function
We want to print the node name and the first child's tag name from the file, so we pass them to print
Run the code: it prints the node name (#document) and the first child's tag name (employee) from the XML file

Note:

nodeName and tagName are standard properties of the XML DOM.
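The steps above can be sketched as follows. To keep the example self-contained, it uses xml.dom.minidom.parseString on a small inline document instead of parse on a file; the inline document is an assumption standing in for the sample file.

```python
import xml.dom.minidom

# Inline stand-in for the sample file (an assumption for this sketch).
SAMPLE_XML = '<employee><expertise name="SQL"/></employee>'

doc = xml.dom.minidom.parseString(SAMPLE_XML)

# The Document node itself always reports the node name "#document".
print(doc.nodeName)

# With no leading comment or doctype, the first child is the root element.
print(doc.firstChild.tagName)

# documentElement is the root element even if comments precede it.
print(doc.documentElement.tagName)
```

Using doc.documentElement rather than doc.firstChild is a slightly more defensive choice, since a comment or doctype before the root element would otherwise become the first child.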

Step 3) Get a list of XML tags from the document and print them

Next, we can also get the list of XML tags from the XML document and print it out. Here we printed the set of skills: SQL, Python, Testing and Business.

Declare the variable expertise, into which we will extract all the expertise entries the employee has
Use the standard DOM function called getElementsByTagName
This gets all the elements named expertise (the skill tags)
Declare a loop over each of the skill tags
Run the code: it gives a list of four skills
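A minimal sketch of this step, again assuming an inline document in place of the sample file:

```python
import xml.dom.minidom

# Inline stand-in for the sample file (an assumption for this sketch).
SAMPLE_XML = """<employee>
  <expertise name="SQL"/>
  <expertise name="Python"/>
  <expertise name="Testing"/>
  <expertise name="Business"/>
</employee>"""

doc = xml.dom.minidom.parseString(SAMPLE_XML)

# getElementsByTagName returns every matching element in document order.
expertise = doc.getElementsByTagName("expertise")
print("%d expertise:" % expertise.length)

# Collect the "name" attribute of each skill tag.
skills = [skill.getAttribute("name") for skill in expertise]
for name in skills:
    print(name)
```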

We can create a new element by using the createElement function and then append this new tag to the existing XML tags. We added a new tag, "BigData", to our XML file.

You have to write code to add the new entry (BigData) to the existing XML tag
Then you have to print the XML tags, including the newly added one

To create a new XML element and add it to the document, we use doc.createElement
This code creates a new expertise tag for our new entry
Append this tag to the document's first child (employee)
Run the code: the new tag "BigData" will appear along with the other expertise entries
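The steps above can be sketched like this, assuming an inline document with two existing skills. The sketch appends to doc.documentElement, which is the same node as doc.firstChild for this input.

```python
import xml.dom.minidom

# Inline stand-in for the sample file (an assumption for this sketch).
SAMPLE_XML = """<employee>
  <expertise name="SQL"/>
  <expertise name="Python"/>
</employee>"""

doc = xml.dom.minidom.parseString(SAMPLE_XML)

# Create a new <expertise> element and give it a name attribute.
new_expertise = doc.createElement("expertise")
new_expertise.setAttribute("name", "BigData")

# Append it as the last child of the root <employee> element.
doc.documentElement.appendChild(new_expertise)

# Query again to confirm the new tag is part of the document.
names = [node.getAttribute("name")
         for node in doc.getElementsByTagName("expertise")]
print(names)
```

The change only affects the in-memory document; writing it back to disk would be a separate step.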

XML parser example

Python 2 example

import xml.dom.minidom

def main():
    # use the parse() function to load and parse an XML file
    doc = xml.dom.minidom.parse("Myxml.xml")

    # print out the document node and the name of the first child tag
    print doc.nodeName
    print doc.firstChild.tagName

    # get a list of XML tags from the document and print each one
    expertise = doc.getElementsByTagName("expertise")
    print "%d expertise:" % expertise.length
    for skill in expertise:
        print skill.getAttribute("name")

    # write a new XML tag and add it into the document
    newexpertise = doc.createElement("expertise")
    newexpertise.setAttribute("name", "BigData")
    doc.firstChild.appendChild(newexpertise)
    print " "

    expertise = doc.getElementsByTagName("expertise")
    print "%d expertise:" % expertise.length
    for skill in expertise:
        print skill.getAttribute("name")

if __name__ == "__main__":
    main()

Python 3 example

import xml.dom.minidom

def main():
    # use the parse() function to load and parse an XML file
    doc = xml.dom.minidom.parse("Myxml.xml")

    # print out the document node and the name of the first child tag
    print(doc.nodeName)
    print(doc.firstChild.tagName)

    # get a list of XML tags from the document and print each one
    expertise = doc.getElementsByTagName("expertise")
    print("%d expertise:" % expertise.length)
    for skill in expertise:
        print(skill.getAttribute("name"))

    # write a new XML tag and add it into the document
    newexpertise = doc.createElement("expertise")
    newexpertise.setAttribute("name", "BigData")
    doc.firstChild.appendChild(newexpertise)
    print(" ")

    expertise = doc.getElementsByTagName("expertise")
    print("%d expertise:" % expertise.length)
    for skill in expertise:
        print(skill.getAttribute("name"))

if __name__ == "__main__":
    main()

How to parse XML using ElementTree

ElementTree is an API for manipulating XML, and an easy way to process XML files.

We are using the following XML document as sample data:

<data>
  <items>
    <item name="expertise1">SQL</item>
    <item name="expertise2">Python</item>
  </items>
</data>

Reading XML using ElementTree:

First we have to import the xml.etree.ElementTree module.

import xml.etree.ElementTree as ET

Now let's fetch the root element:

tree = ET.parse("items.xml")  # hypothetical file name holding the sample XML above
root = tree.getroot()

Following is the complete code for reading the above XML data
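A minimal sketch of such a read, assuming the sample document is held in a string rather than a file. ET.fromstring returns the root element directly, whereas ET.parse(path).getroot() would be the equivalent for a file on disk.

```python
import xml.etree.ElementTree as ET

# The sample document from above, held in a string for this sketch.
SAMPLE = """<data>
  <items>
    <item name="expertise1">SQL</item>
    <item name="expertise2">Python</item>
  </items>
</data>"""

root = ET.fromstring(SAMPLE)

# iter("item") walks the whole tree and yields every <item> element.
items = [(item.get("name"), item.text) for item in root.iter("item")]

print("Expertise data:")
for name, text in items:
    print(name, text)
```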

output:

Summary:

Python lets you parse an entire XML document at once rather than one line at a time. To parse an XML document with minidom, the whole document has to be in memory.

To parse an XML document
Import xml.dom.minidom
Use the parse function to parse the document: doc = xml.dom.minidom.parse(file name)
Get the list of XML tags from the document using doc.getElementsByTagName(name of the XML tag)
To create and add a new element in an XML document
Use the createElement function
