dnf搬砖代码Python_让搬砖变得轻松——python操作excel⽤
的脚本
⾸先引⼊⼀个概念,python这个语⾔⾥⾯有很多很成熟的“包”,也就是各种操作的moudle,对于某⼀个特定的project⾥⾯的code,我们可以调⽤各种各样的包来帮助我们完成某些操作,这⾥不由得感叹⼀句,那些开发出这些包的⼈真的是蛮厉害的
在调⽤这些interpreter之前,我们要先加载这些包,这个操作可以在Pycharm⾥⾯进⾏,preference->Project: Projectname->Project interpreter->"+" sign
加载包们的窗⼝
读了⼀些各位⼤⽜的博客之后发现,好像python的话常⽤来操作excel的有这个openpyxl,xlwt,xlrd等等,⼤概看了看⼤佬们的推荐,感觉openpyxl的功能⽐较全⾯,⽽且只有⼀个包就够了
各个操作excel的python包的功能对⽐
好像这个的缺点是处理数据需要的时间相对来说⽐较长,不过我的数据处理量也没有那么⼤,于是就选择了openpyxl这个包来折腾
⾸先明确⼀下这个脚本的⽬的,就是读取从server上下载的.dat⽂件,这个⽂件⾥⾯的格式⼤概是这样的
.dat ⽂件的样⼦(可能已经有⼈发现了是bader的输出⽂件)
对于这样⼀个⽂件,我们把它转换成Excel的⽂件会更容易进⾏直观的编辑,我们想要的结果⼤概是这样的
希望他变成.xlsx的⽂件的样⼦
⽽我们有⼀⼤堆这样的⽂件,分别处于以规律的数字命名的directory⾥(其实也就是提交job时候的⽂件夹),于是我们就需要⼀个script来分别打开这样的.dat⽂件,再把他们写⼊到⼀个workbook⾥⾯,每⼀个.dat⽂件独占⼀个worksheet,worksheet以读取该.dat⽂件的directory的名字命名,这就是我们这个script的意义
步骤如下:
1. 读取.dat⽂件
使⽤python⾥⾯的open打开⽂件,读取每⼀⾏的数据
def ReadTxt(file):
ls = list()
with open(file, 'r', encoding='utf-8-sig') as f:
再将读取的数据分割成columns,因为数据⼀共7个cols,所以这⾥分割成了7个,这个写得有点愚蠢,⼤⼤们不要喷我哈,以后再尝试改成更精简的⽅法
for line adlines():
# To check if the given data can be split
try:
lists = line.split(None, 7)
# Split the list 'line' into 7 different parts, according to if it is an alpha
co1 = lists[0]
co2 = lists[1]
co3 = lists[2]
co4 = lists[3]
co5 = lists[4]
co6 = lists[5]
co7 = lists[6]
# split the given data into cols
ls.append((float(co1), float(co2), float(co3), float(co4), float(co5), float(co6), float(co7)))
# Put remainders in the list
except:
print('Wrong format!')
这样我们就得到了⼀个存着被分割好的数据的list(这个list感觉就像c⾥⾯的⼀个2d array,不知道我理解得有没有问题),然后我们把这个list写进.xlsx⽂件就可以了
2. 写⼊.xlsx⽂件
a. ⾸先⽤pip3安装openpyxl
openpyxl这个Moudle似乎⽐较特殊,要先安装jdcal这个moudle,否则是⽆法安装成功的?这⾥有⼤⼤懂这个的话希望可以帮忙解释⼀下
pip3 install jdcal
提⽰成功之后再如法炮制安装openpyxl
pip3 install openpyxl
b. 安装完成之后可以开始写调⽤这个包的code了
这⾥我们⽤到的openpyxl⾥⾯的东西主要是这个load_workbook
from openpyxl import Workbook, load_workbook
之后可以写出
wb = load_workbook(path)
这样就可以打开/创建⼀个处于给定的path的xlsx⽂件了,这个⽂件是⼀个workbook,⾥⾯最开始会⾃动创建⼀个worksheet,名叫Sheet,这个操作就和直接新建⼀个空⽩的xlsx是⼀样的
在创建了这个⽂件之后,我们可以对这个⽂件进⾏写⼊和读出等操作,⾸先是创建⼀个以读取.dat⽂件的directory的名字命名的worksheet
sheet = wb.create_sheet(sheet_name)
然后将之前的list写⼊,保存⽂件就可以了,记得写完之后要保存,不然就像我们平时写东西但是退出的时候选择’不保存‘⼀样,⽩⼲⼀场了
index = len(value)
for i in range(index):
sheet.append(value[i])
# write the list into worksheet rank by rank
# Save the workbook in previous path
wb.save(path)
3. 整理⼀下并且加⼊⼀些prompts
加⼊⼀些读取格式错误的prompts,让这些prompts能够print到log⽂件⾥,以后出了问题查起来也⽅便,当然最好是不出问题最后的代码就是这个样⼦了
#!/usr/bin/env python
# a script to read text and transfer to xlsx file
#-*- coding:utf-8 -*-
from openpyxl import Workbook, load_workbook
# Read from Text (.dat)
def ReadTxt(file):
ls = list()
with open(file, 'r', encoding='utf-8-sig') as f:
num = 1
# A statistic num of ranks
for line adlines():
# To check if the given data can be split
try:
lists = line.split(None, 7)
# Split the list 'line' into 7 different parts, according to if it is an alpha
co1 = lists[0]
co2 = lists[1]
co3 = lists[2]
co4 = lists[3]
co5 = lists[4]
co6 = lists[5]
co7 = lists[6]
# split the given data into cols
if co1 != "#":
# Chew up the first line title
ls.append((float(co1), float(co2), float(co3), float(co4), float(co5), float(co6), float(co7)))
# Put remainders in the list
num = num +1
except:
print('Wrong format in line ' + str(num) + '!')
num = num +1
# return as a list
return ls
# Write in xlsx
def Write_Excel(path, sheet_name, value):
index = len(value)
# To detect how many ranks in the list
wb = load_workbook(path)
# Open a workbook (No matter already exist or not) in a specific path, which we can specify when we call this func sheet = wb.create_sheet(sheet_name)
# Create a new worksheet in this workbook, named as given name
#lumn_dimensions['B'].width = 115
# Set cell format, width,
for i in range(index):
sheet.append(value[i])
# write the list into worksheet rank by rank
python怎么读取dat文件
# Save the workbook in previous path
wb.save(path)
print("Current txt " + sheet_name + " has been wrote, Tadaaaaaa!")
# Remove empty sheet
def Remove_empty(path):
wb = load_workbook(path)
ws = wb['Sheet']
# Remove empty sheet
wb.save(path)
print('Empty sheet has been removed successfully')
# Main func
if __name__=='__main__':
book_name_xlsx = r'/path/sum.xlsx'
# .xlsx file path where we want to generate this file
wb = Workbook()
wb.save(book_name_xlsx)
# Create and save file as given name and path
# Create a worksheet named as given word
for name in range(12,21):
# Set the range as 12 to 20, which is the range of interested interlayer distance
sheet_name_xlsx = str(name)
# Use target directory name as sheet name
art = ReadTxt(r'/path/'+sheet_name_xlsx+'/ACF.dat')
# Call previous ReadText func, the path is the parent folder of tasks
# Insert the title
art.insert(0, ('#', 'X', 'Y', 'Z', 'CHARGE', 'MIN DIST', 'ATOMIC VOL'))
Write_Excel(book_name_xlsx, sheet_name_xlsx, art)
# Call previous write func
Remove_empty(book_name_xlsx)
# Remove empty sheet
这个就是最后的半成品了,可以再加⼀些其他的功能,整体的框架⼤概就是这样,引⼊了⼀个openpyxl的包,实现了对.xlsx⽂件的创建以及写⼊,希望能帮到苦于机械式重复操作的你
range () 函数的使⽤是这样的:
range(start, stop, [step]),分别是起始、终⽌和步长,实际范围是从start 到 stop-1

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系QQ:729038198,我们将在24小时内删除。