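Python example: converting XML to TXT
Training deep learning models often means organizing large amounts of annotated data and unifying annotations that come in different formats. Reading TXT data is usually the most convenient option, but in practice annotations often arrive as XML. This example shows how to: 1. read XML annotation data; 2. write it to a TXT file.
The XML annotation data looks like this: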
<annotation verified="no">
<folder>suE</folder>
<filename>Drivingrecord_001</filename>
<path>C:\Desktop\Drivingrecord_001.jpg</path>
<source>
<database>Unknown</database>
</source>
<size>
<width>1920</width>
<height>1080</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>苏E*****-蓝-1-白,灰-大众-上海大众-桑塔纳-尚纳</name>
<flag>polygon</flag>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<leftTopx>170</leftTopx>
<leftTopy>704</leftTopy>
<rightTopx>167</rightTopx>
<rightTopy>729</rightTopy>
<rightBottomx>242</rightBottomx>
<rightBottomy>735</rightBottomy>
<leftBottomx>243</leftBottomx>
<leftBottomy>710</leftBottomy>
</bndbox>
</object>
<object>
<name>苏E*****-蓝-1-黄-雷克萨斯-雷克萨斯(进口)-雷克萨斯RX</name>
<flag>polygon</flag>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<leftTopx>733</leftTopx>
<leftTopy>721</leftTopy>
<rightTopx>733</rightTopx>
<rightTopy>759</rightTopy>
<rightBottomx>881</rightBottomx>
<rightBottomy>760</rightBottomy>
<leftBottomx>882</leftBottomx>
<leftBottomy>722</leftBottomy>
</bndbox>
</object>
<object>
<name>苏*****-蓝-1-黑-宝马-宝马(进口)-宝马7系</name>
<flag>polygon</flag>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<leftTopx>1274</leftTopx>
<leftTopy>657</leftTopy>
<rightTopx>1274</rightTopx>
<rightTopy>671</rightTopy>
<rightBottomx>1325</rightBottomx>
<rightBottomy>670</rightBottomy>
<leftBottomx>1326</leftBottomx>
<leftBottomy>656</leftBottomy>
</bndbox>
</object>
<object>
<name>苏*****-蓝-1-灰-标致-东风标致-标致307</name>
<flag>polygon</flag>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<leftTopx>1609</leftTopx>
<leftTopy>658</leftTopy>
<rightTopx>1611</rightTopx>
<rightTopy>671</rightTopy>
<rightBottomx>1659</rightBottomx>
<rightBottomy>669</rightBottomy>
<leftBottomx>1657</leftBottomx>
<leftBottomy>656</leftBottomy>
</bndbox>
</object>
</annotation>
Here we only need the image name (filename) and the coordinates of each object (its four corner points), written on one line:
Drivingrecord_001.jpg 170 704 167 729 242 735 243 710 733 721 733 759 881 760 882 722 1274 657 1274 671 1325 670 1326 656 1609 658 1611 671 1659 669 1657 656
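For reference, a line in this format can be split back into an image name and one four-point polygon per object. The following is a minimal sketch, not part of the original script: the function name parse_label_line is hypothetical, and it assumes that file names contain no spaces and that every object contributes exactly eight integers.

```python
def parse_label_line(line):
    """Split one label line into (image_name, list of 4-point polygons)."""
    parts = line.split()
    image_name = parts[0]
    values = [int(v) for v in parts[1:]]
    if len(values) % 8 != 0:
        raise ValueError("expected 8 coordinates per object")
    polygons = []
    for i in range(0, len(values), 8):
        quad = values[i:i + 8]
        # Pair up (x, y) in the order leftTop, rightTop, rightBottom, leftBottom.
        polygons.append(list(zip(quad[0::2], quad[1::2])))
    return image_name, polygons


line = "Drivingrecord_001.jpg 170 704 167 729 242 735 243 710 733 721 733 759 881 760 882 722"
name, polygons = parse_label_line(line)
print(name, len(polygons), polygons[0])
```

Because line.split() discards all whitespace, trailing spaces or tabs at the end of a line do not affect the result.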
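This is done with the xml.dom.* modules. The Document Object Model (DOM) reads the whole XML file at once and stores all of its data in a tree structure, after which the DOM functions can be used to read the target data. Here, xml.dom.minidom is used to parse the XML file, and the target data is written to a TXT file.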
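As a small illustration of the DOM calls involved, the sketch below parses a shortened annotation from a string rather than from a file. The SAMPLE text is a trimmed-down assumption made for this illustration only; the calls themselves (parseString, documentElement, getElementsByTagName, childNodes, data) are standard xml.dom.minidom API.

```python
import xml.dom.minidom

# A shortened annotation used only for illustration.
SAMPLE = """<annotation>
<filename>Drivingrecord_001</filename>
<object><bndbox><leftTopx>170</leftTopx><leftTopy>704</leftTopy></bndbox></object>
<object><bndbox><leftTopx>733</leftTopx><leftTopy>721</leftTopy></bndbox></object>
</annotation>"""

dom = xml.dom.minidom.parseString(SAMPLE)  # builds the whole tree in memory
root = dom.documentElement                 # the <annotation> element

# getElementsByTagName searches all descendants and returns a list-like NodeList.
filename_node = root.getElementsByTagName("filename")[0]
# The text between the tags is a child Text node; .data holds its string.
filename = filename_node.childNodes[0].data

# Element text is always returned as a string, not as a number.
xs = [obj.getElementsByTagName("leftTopx")[0].childNodes[0].data
      for obj in root.getElementsByTagName("object")]
print(root.tagName, filename, xs)
```

Note that the coordinates come back as strings, which is sufficient when they are only written out to a text file.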
# -*- coding: utf-8 -*-
"""
Created on Fri Mar 2 15:36:44 2018
@author: gg
"""
import xml.dom.minidom
import os

save_dir = r'D:\plate_train'
if not os.path.exists(save_dir):
    os.mkdir(save_dir)
# The output file name is missing in the source; the empty string is kept as-is.
f = open(os.path.join(save_dir, ''), 'w')
# The XML path is truncated in the source; it is kept as-is.
DOMTree = xml.dom.minidom.parse(r'D:\plate_train\label\l')
annotation = DOMTree.documentElement
filename = annotation.getElementsByTagName("filename")[0]
imgname = filename.childNodes[0].data + '.jpg'
print(imgname)
objects = annotation.getElementsByTagName("object")
loc = [imgname]  # output format: file name followed by the coordinates
for obj in objects:
    bbox = obj.getElementsByTagName("bndbox")[0]
    leftTopx = bbox.getElementsByTagName("leftTopx")[0]
    lefttopx = leftTopx.childNodes[0].data
    print(lefttopx)
    leftTopy = bbox.getElementsByTagName("leftTopy")[0]
    lefttopy = leftTopy.childNodes[0].data
    print(lefttopy)
    rightTopx = bbox.getElementsByTagName("rightTopx")[0]
    righttopx = rightTopx.childNodes[0].data
    print(righttopx)
    rightTopy = bbox.getElementsByTagName("rightTopy")[0]
    righttopy = rightTopy.childNodes[0].data
    print(righttopy)
    rightBottomx = bbox.getElementsByTagName("rightBottomx")[0]
    rightbottomx = rightBottomx.childNodes[0].data
    print(rightbottomx)
    rightBottomy = bbox.getElementsByTagName("rightBottomy")[0]
    rightbottomy = rightBottomy.childNodes[0].data
    print(rightbottomy)
    leftBottomx = bbox.getElementsByTagName("leftBottomx")[0]
    leftbottomx = leftBottomx.childNodes[0].data
    print(leftbottomx)
    leftBottomy = bbox.getElementsByTagName("leftBottomy")[0]
    leftbottomy = leftBottomy.childNodes[0].data
    print(leftbottomy)
    # Append this object's four corner points to the output record.
    loc = loc + [lefttopx, lefttopy, righttopx, righttopy, rightbottomx, rightbottomy, leftbottomx, leftbottomy]
# Write the file name and all coordinates on a single line.
for i in range(len(loc)):
    f.write(str(loc[i]) + ' ')
f.write('\t\n')
f.close()
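The eight near-identical lookups can also be driven by a list of tag names. The following is a minimal sketch of that idea, not the original author's code: the names COORD_TAGS, text_of and annotation_to_line are hypothetical, and the XML is passed in as a string so that the example runs without any files on disk.

```python
import xml.dom.minidom

# Coordinate tags in the order they should appear in the output line.
COORD_TAGS = ("leftTopx", "leftTopy", "rightTopx", "rightTopy",
              "rightBottomx", "rightBottomy", "leftBottomx", "leftBottomy")


def text_of(parent, tag):
    """Return the text of the first <tag> element found under parent."""
    return parent.getElementsByTagName(tag)[0].childNodes[0].data


def annotation_to_line(xml_text):
    """Build 'image.jpg x1 y1 ... x4 y4 ...' from one annotation document."""
    root = xml.dom.minidom.parseString(xml_text).documentElement
    fields = [text_of(root, "filename") + ".jpg"]
    for obj in root.getElementsByTagName("object"):
        bbox = obj.getElementsByTagName("bndbox")[0]
        fields.extend(text_of(bbox, tag) for tag in COORD_TAGS)
    return " ".join(fields)


sample = """<annotation><filename>Drivingrecord_001</filename>
<object><bndbox>
<leftTopx>170</leftTopx><leftTopy>704</leftTopy>
<rightTopx>167</rightTopx><rightTopy>729</rightTopy>
<rightBottomx>242</rightBottomx><rightBottomy>735</rightBottomy>
<leftBottomx>243</leftBottomx><leftBottomy>710</leftBottomy>
</bndbox></object></annotation>"""
print(annotation_to_line(sample))
```

Unlike the script above, this version joins the fields with single spaces and adds no trailing space or tab, so the caller would append the line ending when writing to a file.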
That is all for this example of converting XML to TXT with Python. I hope it serves as a useful reference.