Getting started with OpenKE, Tsinghua's open-source knowledge graph toolkit (PyTorch branch on GitHub): bugs and fixes -- 688IT编程网

Environment: Linux.

Machine setup: Python 3.5, with a virtual environment.

First, install the OpenKE package:

git clone -b OpenKE-PyTorch https://github.com/thunlp/OpenKE

remote: Enumerating objects: 1033, done.
error: RPC failed; curl 56 GnuTLS recv error (-54): Error in the pull function.
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed

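The clone failed with the errors above. After some searching, the cause turned out to be an overflow of git's HTTP buffer.

Fix:

git config --global http.postBuffer 524288000

With the larger buffer, the clone went through:

remote: Enumerating objects: 1033, done.
remote: Total 1033 (delta 0), reused 0 (delta 0), pack-reused 1033
Receiving objects: 100% (1033/1033), 276.84 MiB | 84.00 KiB/s, done.
Resolving deltas: 100% (501/501), done.

No errors this time.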
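For reference, the value 524288000 passed to http.postBuffer is simply 500 MiB written out in bytes. The sketch below computes that figure and assembles the same git config call as an argument list. The helper names post_buffer_bytes and post_buffer_command are made up for illustration and are not part of git or OpenKE.

```python
def post_buffer_bytes(mebibytes):
    """Convert a size in MiB to the byte count that git expects."""
    return mebibytes * 1024 * 1024


def post_buffer_command(mebibytes):
    """Build the `git config` call used in the text as an argument list."""
    return ["git", "config", "--global", "http.postBuffer",
            str(post_buffer_bytes(mebibytes))]


if __name__ == "__main__":
    print(post_buffer_bytes(500))             # 524288000
    print(" ".join(post_buffer_command(500)))
```

The list form can be handed to subprocess.run if the setting should be applied from a script rather than typed by hand.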

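cd OpenKE  # enter the OpenKE folder
bash make.sh

Step one, installing the OpenKE module, is done.

Training a knowledge graph model: type the training code into a Python session started from the Linux shell

import config
import models
import json
import numpy as np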

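Error 1: caused by the virtual environment

My PyTorch is installed inside a virtual environment, which caused some problems during training.

Running import config directly failed with:

"ModuleNotFoundError: No module named 'torch'"

So I moved the whole OpenKE folder into the virtual environment's install directory (a plain copy and paste is enough). After activating the virtual environment, import config works directly.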
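Before moving folders around, it can help to confirm which interpreter is actually running and whether it can see torch at all. The following is a minimal diagnostic sketch using only the standard library. The function name environment_report is hypothetical, and the virtual-environment check assumes either the built-in venv module or an older virtualenv release that sets sys.real_prefix.

```python
import importlib.util
import sys


def environment_report(module_name="torch"):
    """Collect a few facts about the running interpreter and one module."""
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    in_virtualenv = hasattr(sys, "real_prefix") or sys.prefix != base_prefix
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,      # path of the interpreter in use
        "in_virtualenv": in_virtualenv,    # True when run from a virtual environment
        "module_found": spec is not None,  # True when module_name is importable
    }


if __name__ == "__main__":
    for key, value in sorted(environment_report().items()):
        print("{}: {}".format(key, value))
```

If module_found is False while in_virtualenv is also False, the session was most likely started with the system Python instead of the environment where PyTorch is installed.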

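Error 2: import config picks up a different config module, not OpenKE's

Calling con.set_in_path("./benchmarks/FB15K/") raised an error saying there is no set_in_path.

My Python 3.5 environment already ships a module named config, so by default the import resolves to that module rather than to the config in OpenKE.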
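A quick way to confirm this kind of name clash is to ask Python where it would load config from, without importing it. This is a small sketch using the standard library; module_origin is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # A path outside the OpenKE folder means another module is shadowing OpenKE's config.
    print(module_origin("config"))
```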

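Solution 1: add the OpenKE folder to the module search path

Before import config, add:

import sys
# Insert at the front so OpenKE's config is found before any other module named config.
sys.path.insert(0, "./OpenKE")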
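The position of the OpenKE folder in sys.path matters because Python searches the list in order and stops at the first match. The sketch below demonstrates this with two throwaway folders that each contain a module of the same name. The module name demo_cfg and the helper names are stand-ins invented for the demonstration, not anything from OpenKE.

```python
import importlib
import os
import shutil
import sys
import tempfile


def make_module_dir(label, module_name="demo_cfg"):
    """Create a temp folder holding module_name.py that records its label."""
    folder = tempfile.mkdtemp()
    with open(os.path.join(folder, module_name + ".py"), "w") as handle:
        handle.write("NAME = {!r}\n".format(label))
    return folder


def import_fresh(module_name):
    """Import a module while ignoring any previously cached copy."""
    sys.modules.pop(module_name, None)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


def demo():
    """Return the module picked after appending, then after inserting at the front."""
    first = make_module_dir("first")
    second = make_module_dir("second")
    saved_path = list(sys.path)
    try:
        sys.path.append(first)
        sys.path.append(second)     # appended later, so `first` is still found first
        appended = import_fresh("demo_cfg").NAME
        sys.path.insert(0, second)  # now searched before every other entry
        inserted = import_fresh("demo_cfg").NAME
    finally:
        sys.path[:] = saved_path
        sys.modules.pop("demo_cfg", None)
        shutil.rmtree(first)
        shutil.rmtree(second)
    return appended, inserted


if __name__ == "__main__":
    print(demo())
```

Appending only helps when no earlier entry already provides a module with the same name; inserting at index 0 gives the new folder priority.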

Solution 2: inside the virtual environment, change into the OpenKE folder

Run

cd OpenKE

This resolves both of the errors above.

Then type python to enter the Python prompt.

Error 3: many parameter names in the code on the GitHub page have been changed slightly in the config source

import config
import models
import json
import numpy as np
con = config.Config()
con.set_in_path('./benchmarks/FB15K/')
con.set_work_threads(4)
con.set_train_times(500)
con.set_nbatches(100)
con.set_alpha(0.001)
con.set_margin(1)
con.set_bern(0)
con.set_export_files("./res/model.vec.tf", 0)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Config' object has no attribute 'set_export_files'
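When a setter from the README is missing like this, one way to see which setters the installed Config class actually offers is to list its set_ methods by introspection. The sketch below uses a stand-in class, since OpenKE itself is not imported here. StandInConfig and list_setters are hypothetical names, and in a real session the object passed in would be config.Config().

```python
def list_setters(obj):
    """Return the sorted names of callable attributes that start with 'set_'."""
    return sorted(
        name for name in dir(obj)
        if name.startswith("set_") and callable(getattr(obj, name))
    )


class StandInConfig(object):
    """Hypothetical stand-in for the real config.Config class."""

    def set_in_path(self, path):
        self.in_path = path

    def set_checkpoint_dir(self, path):
        self.checkpoint_dir = path


if __name__ == "__main__":
    print(list_setters(StandInConfig()))
```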

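Cause: several setter names used in the code on the GitHub page were changed slightly in the config source. The adjusted training code, which can be run in Jupyter Notebook, Spyder, or PyCharm, is: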

import config
from models import *
import json
## My server has no CUDA, so the os lines are commented out; uncomment them on a machine with CUDA installed.
# import os
# os.environ['CUDA_VISIBLE_DEVICES']='5'
con = config.Config()
# No CUDA on this server, so the GPU is disabled; set to True if CUDA is available.
con.set_use_gpu(False)
con.set_in_path("./benchmarks/FB15K/")
con.set_work_threads(8)
con.set_train_times(1000)
con.set_nbatches(100)
con.set_alpha(0.001)
con.set_bern(0)
con.set_dimension(100)
con.set_margin(1.0)
con.set_ent_neg_rate(1)
con.set_rel_neg_rate(0)
con.set_opt_method("SGD")
con.set_save_steps(100)
con.set_valid_steps(100)
con.set_early_stopping_patience(10)
con.set_checkpoint_dir("./checkpoint")
con.set_result_dir("./result")
con.set_test_link(True)
con.set_test_triple(True)
con.init()
con.set_train_model(TransE)
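To make the margin and negative-sampling settings above more concrete, here is a minimal sketch of the textbook TransE score and margin ranking loss in plain Python. It assumes an L1 distance and single triples; OpenKE's own implementation may use a different norm and works on batches, so this is an illustration of the idea rather than the library's code.

```python
def transe_score(head, relation, tail):
    """L1 distance between head + relation and tail (lower means more plausible)."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))


def margin_loss(positive_score, negative_score, margin=1.0):
    """Penalize a true triple that is not better than a corrupted one by `margin`."""
    return max(0.0, margin + positive_score - negative_score)


if __name__ == "__main__":
    head, relation = [0.0, 0.0], [1.0, 0.0]
    true_tail, corrupted_tail = [1.0, 0.0], [0.0, 1.0]
    positive = transe_score(head, relation, true_tail)
    negative = transe_score(head, relation, corrupted_tail)
    print(positive, negative, margin_loss(positive, negative))
```

With set_ent_neg_rate(1), each true triple is paired with one corrupted triple of this kind, and the margin corresponds to the value passed to set_margin.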

The test code is:

import config
from models import *
import json
## My server has no CUDA, so the os lines are commented out; uncomment them on a machine with CUDA installed.
# import os
# os.environ['CUDA_VISIBLE_DEVICES']='6'
con = config.Config()
con.set_use_gpu(False)
# Input training files from the benchmarks/FB15K/ folder.
con.set_in_path("./benchmarks/FB15K/")
# Test files are read from the same folder.
con.set_result_dir("./result/")
con.set_test_link(True)
con.set_test_triple(True)
con.init()
con.set_test_model(TransE)

Error 4: deprecation warnings from PyTorch 1.0

/home/chenmengyuan/py35/OpenKE/models/TransE.py:19: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
  nn.init.xavier_uniform(self.ent_embeddings.weight.data)
/home/chenmengyuan/py35/OpenKE/models/TransE.py:20: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
  nn.init.xavier_uniform(self.rel_embeddings.weight.data)

Solution: go to lines 19 and 20 of TransE.py and append an underscore to nn.init.xavier_uniform, so that each call reads nn.init.xavier_uniform_.
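If the same deprecated call appears in several model files, the rename can be applied as a text substitution instead of by hand. The sketch below rewrites a source string and leaves calls that already have the trailing underscore untouched. The function name modernize_xavier_calls is hypothetical, and the sample line is illustrative.

```python
import re

# Matches the old function name only when it is directly followed by "(",
# so calls that already end in "_" are not changed again.
_OLD_CALL = re.compile(r"\bxavier_uniform\(")


def modernize_xavier_calls(source):
    """Replace deprecated xavier_uniform( calls with xavier_uniform_( calls."""
    return _OLD_CALL.sub("xavier_uniform_(", source)


if __name__ == "__main__":
    line = "nn.init.xavier_uniform(weight)"
    print(modernize_xavier_calls(line))
```

Running the function twice on the same text gives the same result as running it once, so it is safe to apply to a file that has been partly fixed already.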

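Results:

Time cost:

Training one TransE model takes roughly an hour, and testing takes under half an hour. It is slow, but the long wait is a good opportunity to read the source code.

A small suggestion:

I hope a mini FB15K dataset appears someday, something that runs in about 10 minutes. The point would not be the results but checking that all the code runs end to end. I fell into exactly this trap: training finished, then the test step raised an error, and nothing from the training run had been saved, so after fixing the error I had to train all over again...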
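A cheap guard against losing a long run this way is a smoke run: push the whole train-then-test pipeline through once with a tiny budget before starting the real job. The sketch below shows the pattern with placeholder callables. run_with_smoke_test, fake_train and fake_test are hypothetical names, and in practice train and test would wrap the OpenKE calls.

```python
def run_with_smoke_test(train, test, full_epochs, smoke_epochs=1):
    """Run train and test once with a tiny budget, then again at full length."""
    train(smoke_epochs)  # cheap pass: a crash here costs minutes, not hours
    test()
    train(full_epochs)   # the real run, reached only if the cheap pass survived
    return test()


if __name__ == "__main__":
    calls = []

    def fake_train(epochs):
        calls.append(("train", epochs))

    def fake_test():
        calls.append(("test",))
        return "done"

    print(run_with_smoke_test(fake_train, fake_test, full_epochs=1000))
    print(calls)
```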

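Of course, you can set the number of epochs to 1 so that a run finishes very quickly, which is enough to check that all the code works. I did not think of testing the code this way when I ran it, and it cost me.

Training results:

The trained model does not feel very good.

Training log:

Epoch 989 | loss: 1045.272685
Epoch 990 | loss: 980.651575
Epoch 991 | loss: 1002.198379
Epoch 992 | loss: 980.045353
Epoch 993 | loss: 959.496778
Epoch 994 | loss: 985.212145
Epoch 995 | loss: 1008.031349
Epoch 996 | loss: 1005.073647
Epoch 997 | loss: 990.986805
Epoch 998 | loss: 945.992249
Epoch 999 | loss: 981.005116

TransE model result: 68%

The test results are fairly good: the best hits@10 reached 80%.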
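For readers unfamiliar with the metric, hits@10 is the fraction of test triples whose correct entity is ranked in the top 10 candidates. The following is a minimal sketch of that definition, not OpenKE's evaluation code. The function name hits_at_k and the sample ranks are made up for illustration.

```python
def hits_at_k(ranks, k=10):
    """Fraction of ranks (1 = best) that fall within the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for rank in ranks if rank <= k) / float(len(ranks))


if __name__ == "__main__":
    sample_ranks = [1, 3, 12, 50, 10]  # illustrative ranks of the correct entity
    print(hits_at_k(sample_ranks, k=10))
```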

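That wraps up part one: the technical details and the bug summary.

Part two will cover the code details, the underlying principles, and how to interpret the results.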
