Reading the PyTorch Source from Scratch: Operators
These are notes I put together while reading the PyTorch source code, for future reference. The main questions here are how PyTorch's operators are organized, and what you would need to do to add a new operator.
__init__.py and setup.py
A good starting point is the torch module's __init__.py together with the setup.py used for installation. Skim both files once to get a general picture, then search __init__.py for __all__. By observing how operators get appended to __all__, we can figure out where all the operators we use actually come from.
The key snippet found by searching for __all__ is:
__all__ += [name for name in dir(_C)
            if name[0] != '_' and
            not name.endswith('Base')]
What this snippet does is selectively add things defined in torch._C to __all__, so to find out where the operators come from we still need to look at torch._C. As the name suggests, torch._C is the part of PyTorch written in C/C++. That raises the question of how this part gets built, which we can dig out of setup.py. The relevant code is:
main_sources = ["torch/csrc/stub.cpp"]
extensions = []
packages = find_packages(exclude=('tools', 'tools.*'))
C = Extension("torch._C",
              libraries=main_libraries,
              sources=main_sources,
              language='c++',
              extra_compile_args=main_compile_args + extra_compile_args,
              include_dirs=[],
              library_dirs=library_dirs,
              extra_link_args=extra_link_args + main_link_args + [make_relative_rpath('lib')],
              )
extensions.append(C)
From this we can conclude that torch._C is compiled from the pile of files under torch/csrc/. That directory contains a lot of files, so where to start is a question. Since torch._C is a Python module, somewhere the module must be created through Python's C binding, and that makes a good entry point. To find where the module is created, it is time to bring out grep. Inside torch/csrc/, run:
grep 'torch._C' -r .
This lists every related line of code. The most likely candidate, found in Module.cpp, is:
#if PY_MAJOR_VERSION == 2
  ASSERT_TRUE(module = Py_InitModule("torch._C", methods.data()));
#else
  static struct PyModuleDef torchmodule = {
     PyModuleDef_HEAD_INIT,
     "torch._C",
     nullptr,
     -1,
     methods.data()
  };
  ASSERT_TRUE(module = PyModule_Create(&torchmodule));
#endif
#endif
This snippet is clearly initializing the module, and the file name Module.cpp fits as well, so the next step starts from here. __init__.py and setup.py can now exit the stage.
One more note: before wielding grep, be sure to compile PyTorch once, because a lot of PyTorch code is generated from other files at build time, and it is better to search with the generated files included.
Module.cpp and autograd
Skimming this file from top to bottom, we can conclude that module initialization happens in initModule. This function initializes a pile of things; the question is which lines are responsible for initializing the operators. Notice that initModule contains many blocks like:
#ifdef USE_CUDA
torch::cuda::initModule(module);
#endif
ASSERT_TRUE(THPDoubleStorage_init(module));
ASSERT_TRUE(THPFloatStorage_init(module));
ASSERT_TRUE(THPHalfStorage_init(module));
ASSERT_TRUE(THPLongStorage_init(module));
ASSERT_TRUE(THPIntStorage_init(module));
ASSERT_TRUE(THPShortStorage_init(module));
ASSERT_TRUE(THPCharStorage_init(module));
ASSERT_TRUE(THPByteStorage_init(module));
ASSERT_TRUE(THPBoolStorage_init(module));
Blocks like these can be skipped outright: the operators exist whether or not these features are enabled, so we can just pretend none of them are enabled. A pile of other things can be ignored just from their names. After filtering, the only plausible-looking remainder is:
PyObject* initModule() {
  HANDLE_TH_ERRORS
  at::init_num_threads();
  C10_LOG_API_USAGE_ONCE("torch.python.import");
#define ASSERT_TRUE(cmd) if (!(cmd)) return nullptr
  THPUtils_addPyMethodDefs(methods, TorchMethods);
  THPUtils_addPyMethodDefs(methods, DataLoaderMethods);
  THPUtils_addPyMethodDefs(methods, torch::autograd::python_functions());
  THPUtils_addPyMethodDefs(methods, torch::multiprocessing::python_functions());
#ifdef USE_CUDA
  THPUtils_addPyMethodDefs(methods, THCPModule_methods());
#endif
#ifdef USE_CUDNN
  THPUtils_addPyMethodDefs(methods, THCUDNN_methods());
#endif
#ifdef USE_DISTRIBUTED
  THPUtils_addPyMethodDefs(methods, THDPModule_methods());
#ifdef USE_C10D
  THPUtils_addPyMethodDefs(methods, torch::distributed::c10d::python_functions());
#endif
#endif
So let's start with TorchMethods. It is defined right in Module.cpp, and one look shows it has nothing to do with operators. For torch::autograd::python_functions(), grep reveals its definition in torch/csrc/autograd/init.cpp, which is not what we want either. Moving on to THPVariable_initModule, grep shows that this function is defined in torch/csrc/autograd/python_variable.cpp. Looking through its definition:
bool THPVariable_initModule(PyObject *module)
{
  static std::vector<PyMethodDef> methods;
  THPUtils_addPyMethodDefs(methods, torch::autograd::variable_methods);
  THPUtils_addPyMethodDefs(methods, extra_methods);
  THPVariableType.tp_methods = methods.data();
  if (PyType_Ready(&THPVariableType) < 0)
    return false;
  Py_INCREF(&THPVariableType);
  PyModule_AddObject(module, "_TensorBase",   (PyObject *)&THPVariableType);
  torch::autograd::initTorchFunctions(module);
  torch::autograd::initTensorImplConversion(module);
  return true;
}
In the entire function definition, the variable_methods line is the one that looks most like it defines the operators, so keep digging, again with grep:
grep variable_methods -r .
This shows the variable is defined in autograd/generated/python_variable_methods.cpp under torch/csrc/. Opening it, we find the following:
PyMethodDef variable_methods[] = {
{"__add__", (PyCFunction)THPVariable_add, METH_VARARGS | METH_KEYWORDS, NULL},
{"__radd__", (PyCFunction)THPVariable_add, METH_VARARGS | METH_KEYWORDS, NULL},
{"__iadd__", (PyCFunction)THPVariable_add_, METH_VARARGS | METH_KEYWORDS, NULL},
{"__rmul__", (PyCFunction)THPVariable_mul, METH_VARARGS | METH_KEYWORDS, NULL},
{"__mul__", (PyCFunction)THPVariable_mul, METH_VARARGS | METH_KEYWORDS, NULL},
{"__imul__", (PyCFunction)THPVariable_mul_, METH_VARARGS | METH_KEYWORDS, NULL},
{"__sub__", (PyCFunction)THPVariable_sub, METH_VARARGS | METH_KEYWORDS, NULL},
{"__isub__", (PyCFunction)THPVariable_sub_, METH_VARARGS | METH_KEYWORDS, NULL},
{"__div__", (PyCFunction)THPVariable_div, METH_VARARGS | METH_KEYWORDS, NULL},
{"__truediv__", (PyCFunction)THPVariable_div, METH_VARARGS | METH_KEYWORDS, NULL},
{"__idiv__", (PyCFunction)THPVariable_div_, METH_VARARGS | METH_KEYWORDS, NULL},
{"__mod__", (PyCFunction)THPVariable_remainder, METH_VARARGS | METH_KEYWORDS, NULL},
{"__bool__", (PyCFunction)THPVariable_bool_scalar, METH_NOARGS, NULL},
{"__float__", (PyCFunction)THPVariable_float_scalar, METH_NOARGS, NULL},
{"__int__", (PyCFunction)THPVariable_integral_scalar, METH_NOARGS, NULL},
{"__long__", (PyCFunction)THPVariable_integral_scalar, METH_NOARGS, NULL},
{"__index__", (PyCFunction)THPVariable_index_scalar, METH_NOARGS, NULL},
{"__nonzero__", (PyCFunction)THPVariable_bool_scalar, METH_NOARGS, NULL},
{"__invert__", (PyCFunction)THPVariable_invert, METH_NOARGS, NULL},
{"__matmul__", (PyCFunction)THPVariable_matmul, METH_VARARGS | METH_KEYWORDS, NULL},
{"_is_view", (PyCFunction)THPVariable__is_view, METH_NOARGS, NULL},
{"apply_", (PyCFunction)THPVariable_apply_, METH_O, NULL},
{"byte", (PyCFunction)THPVariable_byte, METH_NOARGS, NULL},
{"char", (PyCFunction)THPVariable_char, METH_NOARGS, NULL},
{"contiguous", (PyCFunction)THPVariable_contiguous, METH_VARARGS | METH_KEYWORDS, NULL},
{"copy_", (PyCFunction)THPVariable_copy_, METH_VARARGS | METH_KEYWORDS, NULL},
{"cpu", (PyCFunction)THPVariable_cpu, METH_NOARGS, NULL},
{"cuda", (PyCFunction)THPVariable_cuda, METH_VARARGS | METH_KEYWORDS, NULL},
{"dim", (PyCFunction)THPVariable_dim, METH_NOARGS, NULL},
{"double", (PyCFunction)THPVariable_double, METH_NOARGS, NULL},
{"element_size", (PyCFunction)THPVariable_element_size, METH_NOARGS, NULL},
{"float", (PyCFunction)THPVariable_float, METH_NOARGS, NULL},
{"get_device", (PyCFunction)THPVariable_get_device, METH_NOARGS, NULL},
{"bool", (PyCFunction)THPVariable_bool, METH_NOARGS, NULL},
{"half", (PyCFunction)THPVariable_half, METH_NOARGS, NULL},
{"int", (PyCFunction)THPVariable_int, METH_NOARGS, NULL},
{"is_contiguous", (PyCFunction)THPVariable_is_contiguous, METH_VARARGS | METH_KEYWORDS, NULL},
{"item", (PyCFunction)THPVariable_item, METH_NOARGS, NULL},
{"long", (PyCFunction)THPVariable_long, METH_NOARGS, NULL},
{"map_", (PyCFunction)THPVariable_map_, METH_VARARGS | METH_KEYWORDS, NULL},
{"map2_", (PyCFunction)THPVariable_map2_, METH_VARARGS | METH_KEYWORDS, NULL},
{"ndimension", (PyCFunction)THPVariable_dim, METH_NOARGS, NULL},
{"nelement", (PyCFunction)THPVariable_numel, METH_NOARGS, NULL},
{"new", (PyCFunction)THPVariable_new, METH_VARARGS | METH_KEYWORDS, NULL},
{"new_empty", (PyCFunction)THPVariable_new_empty, METH_VARARGS | METH_KEYWORDS, NULL},
{"new_full", (PyCFunction)THPVariable_new_full, METH_VARARGS | METH_KEYWORDS, NULL},
{"new_ones", (PyCFunction)THPVariable_new_ones, METH_VARARGS | METH_KEYWORDS, NULL},
{"new_tensor", (PyCFunction)THPVariable_new_tensor, METH_VARARGS | METH_KEYWORDS, NULL},
{"new_zeros", (PyCFunction)THPVariable_new_zeros, METH_VARARGS | METH_KEYWORDS, NULL},
{"numpy", (PyCFunction)THPVariable_numpy, METH_NOARGS, NULL},
{"record_stream", (PyCFunction)THPVariable_record_stream, METH_O, NULL},
{"requires_grad_", (PyCFunction)THPVariable_requires_grad_, METH_VARARGS | METH_KEYWORDS, NULL},
{"short", (PyCFunction)THPVariable_short, METH_NOARGS, NULL},
{"size", (PyCFunction)THPVariable_size, METH_VARARGS | METH_KEYWORDS, NULL},
{"storage", (PyCFunction)THPVariable_storage, METH_NOARGS, NULL},
{"storage_offset", (PyCFunction)THPVariable_storage_offset, METH_NOARGS, NULL},
{"storage_type", (PyCFunction)THPVariable_storage_type, METH_NOARGS, NULL},
{"stride", (PyCFunction)THPVariable_stride, METH_VARARGS | METH_KEYWORDS, NULL},
{"to", (PyCFunction)THPVariable_to, METH_VARARGS | METH_KEYWORDS, NULL},
{"tolist", (PyCFunction)THPVariable_tolist, METH_NOARGS, NULL},
{"type", (PyCFunction)THPVariable_type, METH_VARARGS | METH_KEYWORDS, NULL},
${py_method_defs}
{NULL}
};
This is one long table listing all the operators. Each operator corresponds to a function starting with THPVariable_, defined in the same file, and the header of this file states that it is generated from the template tools/autograd/templates/python_variable_methods.cpp.
Skimming all the THPVariable_ functions, you will find they are all much alike; essentially the core of each is just the dispatch-and-wrap content shown below:
static PyObject * THPVariable_integral_scalar(PyObject* self, PyObject* args) {
  HANDLE_TH_ERRORS
  jit::tracer::warn("Converting a tensor to a Python integer", jit::tracer::WARN_PYTHON_DATAFLOW);
  auto& self_ = reinterpret_cast<THPVariable*>(self)->cdata;
  if (isFloatingType(self_.scalar_type())) {
    // we can't dispatch to item<int64_t> here because we want to avoid ATen overflow checks;
    // the python integral type (long in python2) can't overflow.
    return THPUtils_packDoubleAsInt(dispatch_to_CDouble(self_));
  } else {
    return wrap(dispatch_to_CLong(self_));
  }
  END_HANDLE_TH_ERRORS
}
Here dispatch_xxxxx should be the core implementation of the operator xxxxx. Keep digging with grep; picking any one operator to search for will do, for example:
grep dispatch_to_CDouble -r .
The search shows that these dispatch_ functions are defined in python_variable_methods.cpp in the same directory. Opening that file and skimming the dispatch functions, they are all much alike; here is one of them:
static double dispatch_to_CDouble(const Tensor & self) {
  AutoNoGIL no_gil;
  OptionalDeviceGuard device_guard(device_of(self));
  if (self.numel() != 1) {
    throw ValueError("only one element tensors can be converted to Python scalars");
  }
  return self.item<double>();
}
The code shows that these operators are in fact member functions of the Tensor class, so the next thing to dig into is Tensor itself. Beyond that, another important thing is to understand how the code generation works, so we can see how the code generator finds the operator definitions and produces these functions.
Where the Tensor class comes from can be found at the top of python_variable_methods_dispatch.h:
namespace torch { namespace autograd {
using at::Tensor;
using at::Scalar;
using at::TensorList;
using at::IntArrayRef;
using at::Generator;
using at::SparseTensorRef;
using at::Storage;
${py_method_dispatch}
}} // namespace torch::autograd
So Tensor is defined in ATen, which means autograd is also about to exit the stage, and it is ATen's turn.
ATen
Learning ATen is actually quite easy: rummage around the aten directory, peek into every folder, and read all the README.md files. It turns out that how ATen's operators are defined is already explained in great detail in aten/src/ATen/README.md.
Pulling together the information from the various README.md files, the summary is: PyTorch's operators are all defined in ATen. Part of the implementations in ATen are inherited from the old Lua Torch; that code lives under the aten/src/TH* directories. It is legacy code, carried over and used as-is, and not the operator implementation style PyTorch ultimately wants. The eventual "good" style lives under aten/src/ATen/native/, and many operators have already been reimplemented there. The list of the old operators is defined in aten/src/ATen/Declarations.cwrap, while the list of the new operators is defined in aten/src/ATen/native/native_functions.yaml. This article only explores the new implementation style.
Having got this far, to keep digging into the new operator implementations we need to find out how native_functions.yaml gets read. In the PyTorch root directory, grep again for the keyword native_functions.yaml. Among the results, one line in aten/src/ATen/gen.py looks very much like what we want:
native_files = filter_by_extension(options.files, 'native_functions.yaml')
The next step is to open gen.py itself and see how it consumes these files.