
Torch Study Notes

2016-08-14 15:15

Torch Notes (2): Quick Start

The only data structure in Torch is the Tensor. It is simple yet powerful, well suited to matrix-style numerical computation, and by far the most important class in Torch. A Tensor is essentially a multidimensional matrix and supports all the usual matrix operations. One point deserves special emphasis: arrays in Lua (actually tables) are indexed from 1, so Tensor subscripts also start at 1.

From a programmer's point of view, first of all, Tensors are typed. The Tensor family has the members ByteTensor, CharTensor, ShortTensor, IntTensor, LongTensor, FloatTensor, and DoubleTensor, and the names speak for themselves. The default is DoubleTensor, presumably for computational convenience.
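A quick way to check, and change, the default type in the REPL is `torch.setdefaulttensortype`; a minimal sketch:

```lua
-- the default Tensor type is DoubleTensor
a = torch.Tensor(2)
print(a:type())            -- torch.DoubleTensor

-- switch the default to FloatTensor (common in deep-learning code to save memory)
torch.setdefaulttensortype('torch.FloatTensor')
b = torch.Tensor(2)
print(b:type())            -- torch.FloatTensor

-- restore the default
torch.setdefaulttensortype('torch.DoubleTensor')
```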

Next, the constructors: how do you create a Tensor? The most common forms are shown below. On a machine with Torch installed, type the "th" command to enter the Torch REPL.

th> a = torch.Tensor(2,4);print(a)
0  0  0  0
0  0  0  0
[torch.DoubleTensor of size 2x4]


th> b = torch.Tensor(2,4,2)
th> print(b)
(1,.,.) =
6.9414e-310  6.9414e-310
5.0872e-317  2.3253e+251
5.0450e+223  1.6304e-322
6.9414e-310  5.0873e-317

(2,.,.) =
1.0277e-321  2.3715e-322
5.0873e-317  5.9416e-313
5.0873e-317   8.5010e-96
6.9677e+252  1.6304e-322
[torch.DoubleTensor of size 2x4x2]

th> c = torch.IntTensor(2,3);print(c) -- you can also specify the element type
1.0302e+07  0.0000e+00  1.7000e+09
1.1000e+02  0.0000e+00  0.0000e+00
[torch.IntTensor of size 2x3]


torch.Tensor(sz1 [,sz2 [,sz3 [,sz4]]])


The constructor above creates an N-dimensional Tensor of size sz1 x sz2 x sz3 x sz4 x .... For example, the Tensor b above is a 2 x 4 x 2 three-dimensional Tensor. This only allocates the Tensor, though; it is not initialized, so you need to assign values to it.

th> torch.Tensor({{1,2,3,4}, {5,6,7,8}})
1  2  3  4
5  6  7  8
[torch.DoubleTensor of dimension 2x4]


The example above initializes the Tensor from a Lua table, i.e. the form

torch.Tensor(table)


You can also initialize from another Tensor object:

torch.Tensor(tensor)
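Note that torch.Tensor(tensor) does not copy the data: the new Tensor references the same storage as its argument. A small sketch:

```lua
x = torch.Tensor(2,2):fill(1)
y = torch.Tensor(x)   -- y views the same storage as x
y:fill(7)
print(x)              -- x is now filled with 7 as well; use x:clone() for a real copy
```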


Conversion between the Tensor types:

th> a = torch.IntTensor({3,4,5,3,3});b = torch.DoubleTensor({3.3,4.9,3.2});c = b:typeAs(a);print(c);print(b)
3
4
3
[torch.IntTensor of size 3]

3.3000
4.9000
3.2000
[torch.DoubleTensor of size 3]


Converting the element type of a Tensor. Note that these return a Tensor, not a plain number:

[Tensor] byte(), char(), short(), int(), long(), float(), double()
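For instance, each of these returns a new Tensor of the corresponding type, leaving the original unchanged; a minimal sketch:

```lua
a = torch.Tensor({1.7, 2.2, 3.9})   -- DoubleTensor
b = a:int()                         -- IntTensor {1, 2, 3}: the conversion truncates
c = a:float()                       -- FloatTensor, values preserved
print(a:type(), b:type(), c:type())
```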


Torch also has a Storage class, which corresponds to a one-dimensional C array. It is generally used for file operations; it saves space, but its set of functions is not very rich. Here too there are the types ByteStorage, CharStorage, ShortStorage, IntStorage, LongStorage, FloatStorage, and DoubleStorage, used much like Tensors. Since a Storage amounts to a one-dimensional array, you can create a multidimensional array like this (again uninitialized):

th> x = torch.Tensor( torch.LongStorage({2,3,4}));print(x)
(1,.,.) =
1.1132e+171  6.1707e-114  8.8211e+199  1.0167e-152
5.7781e-114  7.3587e+223   2.9095e-14   6.9117e-72
8.8211e+199  1.0167e-152   3.9232e-85   6.9183e-72

(2,.,.) =
1.3923e-259  2.2831e-109  1.6779e+243  7.3651e+228
2.2082e-259  1.1132e+171  6.1707e-114  2.3253e+251
5.0450e+223  2.8811e+159   1.1995e-22  2.1723e-153
[torch.DoubleTensor of size 2x3x4]
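A Storage can also be used directly as a 1-D array; a small sketch:

```lua
s = torch.IntStorage(4)       -- uninitialized, like the Tensors above
s:fill(0)
s[1] = 10                     -- 1-indexed, like everything else in Lua
print(s:size())               -- 4
-- a Tensor is just a view over a Storage:
t = torch.IntTensor(torch.IntStorage({1,2,3,4}))
```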


Because operating on a Tensor works like operating on a C++ reference, modifying its values in place, you sometimes need to make a backup copy first. Note what ":" means in Lua: it invokes a method on an object. Use "." only when calling functions from a package (such as nn, torch, or optim).

th> a = torch.randn(3);b = a:clone();print(a,b)
-0.7112
0.1953
-2.0389
[torch.DoubleTensor of size 3]

-0.7112
0.1953
-2.0389
[torch.DoubleTensor of size 3]


dim() returns the number of dimensions of a Tensor and size() the extent of each dimension; the Tensor a below has size(1) = 2 in the first dimension and size(2) = 4 in the second. nElement() returns the total number of elements.

th> a = torch.Tensor(2,4):zero();print(a);print(a:dim())
0  0  0  0
0  0  0  0
[torch.DoubleTensor of size 2x4]

2

th> print(a:size())
2
4
[torch.LongStorage of size 2]

th> print(a:nElement())
8


Subscript access:

x = torch.Tensor(3,3)
i = 0; x:apply(function() i = i + 1; return i end)
> x
1  2  3
4  5  6
7  8  9
[torch.DoubleTensor of dimension 3x3]

> x[2] -- returns row 2
4
5
6
[torch.DoubleTensor of dimension 3]

> x[2][3] -- returns row 2, column 3
6

> x[{2,3}] -- another way to return row 2, column 3
6

> x[torch.LongStorage{2,3}] -- yet another way to return row 2, column 3
6


Element-wise copying. The two Tensors only need the same total number of elements; their shapes may differ:

x = torch.Tensor(4):fill(1)
y = torch.Tensor(2,2):copy(x)
> x
1
1
1
1
[torch.DoubleTensor of dimension 4]

> y
1  1
1  1
[torch.DoubleTensor of dimension 2x2]


resize(sz1 [,sz2 [,sz3 [,sz4]]]) is very useful when you need to grow a Tensor dynamically; note that any newly added elements are uninitialized:

th> x = torch.randn(2,4):zero();print(x);print(x:resize(3,4))
0  0  0  0
0  0  0  0
[torch.DoubleTensor of size 2x4]

0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00
0.0000e+00   0.0000e+00   0.0000e+00   0.0000e+00
8.8211e+199  7.4861e-114   2.9085e-33  1.0251e+170
[torch.DoubleTensor of size 3x4]


The functions introduced below, narrow(dim, index, size), sub(dim1s, dim1e ... [, dim4s [, dim4e]]), and select(dim, index), return an object holding part of a Tensor's data. The returned object is really a reference to that data, so any operation on it directly affects the original Tensor.

In narrow, dim is the dimension to slice, index is the starting position, and the ending position is index + size - 1.

th> x = torch.Tensor(5, 6):zero();y = x:narrow(1, 2, 3);y:fill(1);print(y);print(x)
1  1  1  1  1  1
1  1  1  1  1  1
1  1  1  1  1  1
[torch.DoubleTensor of size 3x6]

0  0  0  0  0  0
1  1  1  1  1  1
1  1  1  1  1  1
1  1  1  1  1  1
0  0  0  0  0  0
[torch.DoubleTensor of size 5x6]

th> x = torch.Tensor(5, 6):zero();y = x:narrow(2, 2, 3);y:fill(1);print(y);print(x)
1  1  1
1  1  1
1  1  1
1  1  1
1  1  1
[torch.DoubleTensor of size 5x3]

0  1  1  1  0  0
0  1  1  1  0  0
0  1  1  1  0  0
0  1  1  1  0  0
0  1  1  1  0  0
[torch.DoubleTensor of size 5x6]


In sub, dim1s is the starting position in the first dimension and dim1e the ending position, and so on. Both may be negative: -1 means the last position and -2 the second to last.

th> x = torch.Tensor(5, 6):zero();z = x:sub(2,4,3,4):fill(2);print(z);print(x)
2  2
2  2
2  2
[torch.DoubleTensor of size 3x2]

0  0  0  0  0  0
0  0  2  2  0  0
0  0  2  2  0  0
0  0  2  2  0  0
0  0  0  0  0  0
[torch.DoubleTensor of size 5x6]


select(dim, index) returns a Tensor with one dimension fewer than the original; dim again indicates which dimension:

th> x = torch.Tensor(5,6):zero();y = x:select(1, 2):fill(2);print(y);print(x)
2
2
2
2
2
2
[torch.DoubleTensor of size 6]

0  0  0  0  0  0
2  2  2  2  2  2
0  0  0  0  0  0
0  0  0  0  0  0
0  0  0  0  0  0
[torch.DoubleTensor of size 5x6]

th> x = torch.Tensor(2,3,4):zero();y = x:select(1, 2):fill(2);print(y);print(x)
2  2  2  2
2  2  2  2
2  2  2  2
[torch.DoubleTensor of size 3x4]

(1,.,.) =
0  0  0  0
0  0  0  0
0  0  0  0

(2,.,.) =
2  2  2  2
2  2  2  2
2  2  2  2
[torch.DoubleTensor of size 2x3x4]


There is also the subscript operator, [{ dim1,dim2,... }] or [{ {dim1s,dim1e}, {dim2s,dim2e} }], which likewise returns a Tensor; this form is more concise and more commonly used.

th> x = torch.Tensor(3, 4):zero();x[{ 1,3 }] = 1;print(x) -- equivalent to x[1][3] = 1
0  0  1  0
0  0  0  0
0  0  0  0
[torch.DoubleTensor of size 3x4]

th> x = torch.Tensor(3, 4):zero();x[{ 2,{2,4} }] = 1;print(x) -- 2 selects index = 2 in dim 1; {2,4} selects indices 2 through 4 in dim 2
0  0  0  0
0  1  1  1
0  0  0  0
[torch.DoubleTensor of size 3x4]

th> x = torch.Tensor(3, 4):zero();x[{ {},2 }] = torch.range(1,3);print(x)  -- {} selects every index in dim 1; 2 selects index = 2 in dim 2; torch.range(1,3) generates a 1-D Tensor
0  1  0  0
0  2  0  0
0  3  0  0
[torch.DoubleTensor of size 3x4]

th> x = torch.Tensor(3, 4):fill(5);x[{ {},2 }] = torch.range(1,3);x[torch.lt(x,3)] = -2;print(x) -- torch.lt(x,3) gives a ByteTensor mask; elements where the mask is 1 are set to -2
5 -2  5  5
5 -2  5  5
5  3  5  5
[torch.DoubleTensor of size 3x4]


index(dim, index) is different: it returns a genuinely new Tensor, independent of the original. The index argument is special here: it must be a LongTensor.

th> x = torch.randn(3,5);print(x);y = x:index(2,torch.range(2,4):typeAs(torch.LongTensor()));y:mul(100);print(y);print(x)
0.6146 -0.3204 -1.2182  1.5573 -0.7232
-1.1692 -0.0071  3.1590  0.6008  0.4566
0.1957 -0.4057  2.0835 -0.3365 -1.3541
[torch.DoubleTensor of size 3x5]

-32.0396 -121.8182  155.7330
-0.7097  315.9025   60.0773
-40.5727  208.3459  -33.6517
[torch.DoubleTensor of size 3x3]

0.6146 -0.3204 -1.2182  1.5573 -0.7232
-1.1692 -0.0071  3.1590  0.6008  0.4566
0.1957 -0.4057  2.0835 -0.3365 -1.3541
[torch.DoubleTensor of size 3x5]

th> x = torch.randn(3,5);print(x);y = x:index(2,torch.LongTensor({1,4}));y:mul(100);print(y);print(x) -- select index = 1 and index = 4 along dim 2, i.e. a new Tensor formed from columns 1 and 4 of the original
-0.4880 -1.6397 -0.3257 -0.5051 -0.1214
-0.4002  1.3845  0.4411  0.1753  2.0174
0.5882  0.9351 -0.7685  0.6377 -1.7308
[torch.DoubleTensor of size 3x5]

-48.8024 -50.5097
-40.0198  17.5341
58.8207  63.7730
[torch.DoubleTensor of size 3x2]

-0.4880 -1.6397 -0.3257 -0.5051 -0.1214
-0.4002  1.3845  0.4411  0.1753  2.0174
0.5882  0.9351 -0.7685  0.6377 -1.7308
[torch.DoubleTensor of size 3x5]


indexCopy(dim, index, tensor) copies tensor in; index is again a LongTensor. Similar functions are indexAdd(dim, index, tensor) and indexFill(dim, index, val).

th> x = torch.randn(3,5);print(x);y = torch.Tensor(2,5);y:select(1,1):fill(-1);y:select(1,2):fill(-2);print(y);x:indexCopy(1,torch.LongTensor{3,1},y);print(x)
0.8086  1.7714 -1.6337  0.2549  0.2131
1.4018 -0.9938  0.3035  1.6247 -0.1368
0.3516 -1.3728 -0.5203  0.2754 -1.6965
[torch.DoubleTensor of size 3x5]

-1 -1 -1 -1 -1
-2 -2 -2 -2 -2
[torch.DoubleTensor of size 2x5]

-2.0000 -2.0000 -2.0000 -2.0000 -2.0000
1.4018 -0.9938  0.3035  1.6247 -0.1368
-1.0000 -1.0000 -1.0000 -1.0000 -1.0000
[torch.DoubleTensor of size 3x5]
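The sibling functions work the same way; a quick sketch of indexFill and indexAdd:

```lua
x = torch.zeros(3,5)
x:indexFill(2, torch.LongTensor{1,3}, -10)           -- fill columns 1 and 3 with -10
x:indexAdd(1, torch.LongTensor{2}, torch.ones(1,5))  -- add a row of ones to row 2
print(x)
```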


One thing worth emphasizing: a two-dimensional Tensor whose second dimension is 1 is still two-dimensional.

-- this is a one-dimensional Tensor
th> a = torch.Tensor({1,2,3,4,5});print("dim=" .. a:dim());print(a);print(a:size())
dim=1
1
2
3
4
5
[torch.DoubleTensor of size 5]
5
[torch.LongStorage of size 1]
-- this one is two-dimensional: now a 5x1 Tensor
th> x = torch.Tensor({1,2,3,4,5});y = x:reshape(x:size(1),1);print(y);print(y:dim())
1
2
3
4
5
[torch.DoubleTensor of size 5x1]
2
-- you can also write it like this
th> y = torch.Tensor({{1},{2},{3},{4},{5}});print(y);print(y:dim())
1
2
3
4
5
[torch.DoubleTensor of size 5x1]
-- this one becomes a 1x5 two-dimensional Tensor
th> y = torch.Tensor({{1,2,3,4,5}});print(y);print(y:dim())
1  2  3  4  5
[torch.DoubleTensor of size 1x5]
2
-- or like this; these are all two-dimensional Tensors. size() gives the extent of each dimension; since there are two dimensions, it holds two values
th> x = torch.Tensor({1,2,3,4,5});y = x:reshape(1,x:size(1));print(y:dim());print(y);print(y:size())
2
1  2  3  4  5
[torch.DoubleTensor of size 1x5]
1
5
[torch.LongStorage of size 2]


gather(dim, index) extracts data from the specified positions. index must have the same number of dimensions and the same shape as the output Tensor (if the output is m x n, index must also be m x n), and its type must be LongTensor.

-- dim = 1
result[i][j][k]... = src[index[i][j][k]...][j][k]...
-- dim = 2
result[i][j][k]... = src[i][index[i][j][k]...][k]...
-- etc.
-- src is the original Tensor
-- to take all diagonal elements as an n x 1, two-dimensional output (Tensor:dim() = 2), index must be an n x 1 two-dimensional LongTensor

th> x = torch.rand(5, 5);print(x);y = x:gather(1, torch.LongTensor{{1, 2, 3, 4, 5}, {2, 3, 4, 5, 1}});print(y)
0.2188  0.3625  0.7812  0.2781  0.9327
0.5342  0.3879  0.7225  0.6031  0.7325
0.1464  0.4534  0.5134  0.9993  0.6617
0.0594  0.6398  0.1741  0.7357  0.6613
0.2926  0.7286  0.7255  0.7108  0.1820
[torch.DoubleTensor of size 5x5]

0.2188  0.3879  0.5134  0.7357  0.1820
0.5342  0.4534  0.1741  0.7108  0.9327
[torch.DoubleTensor of size 2x5]
-- to extract the diagonal elements (1,1),(2,2),(3,3),(4,4),(5,5) as a 1x5 (two-dimensional) Tensor: index = torch.LongTensor({{1, 2, 3, 4, 5}});  y = x:gather(1, index), where index is a 1x5 two-dimensional Tensor
-- to extract the off-diagonal (1,2),(2,3),(3,4),(4,5),(5,1) as a 1x5 Tensor: index = torch.LongTensor({{5, 1, 2, 3, 4}});  y = x:gather(1, index), or index = torch.LongTensor({{2, 3, 4, 5, 1}});  y = x:gather(2, index)
-- an example:
th> x = torch.rand(5, 5);print(x);y = x:gather(2, torch.LongTensor({ {1,2},{2,3},{3,4},{4,5},{5,1} }) );print(y)
0.8563  0.2664  0.6895  0.8124  0.0788
0.0503  0.6646  0.7659  0.4013  0.0670
0.4760  0.0517  0.9621  0.7437  0.1162
0.4069  0.9932  0.6118  0.6200  0.3585
0.9795  0.9601  0.9098  0.4714  0.5577
[torch.DoubleTensor of size 5x5]

0.8563  0.2664
0.6646  0.7659
0.9621  0.7437
0.6200  0.3585
0.5577  0.9795
[torch.DoubleTensor of size 5x2]


scatter(dim, index, src|val) does the opposite: it writes another Tensor src, or a scalar value val, into self. The parameters have the same meaning as in gather.

x = torch.rand(2, 5)
> x
0.3227  0.4294  0.8476  0.9414  0.1159
0.7338  0.5185  0.2947  0.0578  0.1273
[torch.DoubleTensor of size 2x5]

y = torch.zeros(3, 5):scatter(1, torch.LongTensor{{1, 2, 3, 1, 1}, {3, 1, 1, 2, 3}}, x)
> y
0.3227  0.5185  0.2947  0.9414  0.1159
0.0000  0.4294  0.0000  0.0578  0.0000
0.7338  0.0000  0.8476  0.0000  0.1273
[torch.DoubleTensor of size 3x5]

z = torch.zeros(2, 4):scatter(2, torch.LongTensor{{3}, {4}}, 1.23)
> z
0.0000  0.0000  1.2300  0.0000
0.0000  0.0000  0.0000  1.2300
[torch.DoubleTensor of size 2x4]


nonzero(tensor) returns an n x 2 LongTensor containing the subscripts of all non-zero elements of the original Tensor (n x 2 here because the input is two-dimensional; in general it is n x dim):

th> x = torch.rand(4, 4):mul(3):floor():int();y = torch.nonzero(x);print(x);print(y)
2  1  0  2
2  1  2  1
1  0  1  0
2  0  2  2
[torch.IntTensor of size 4x4]

1  1
1  2
1  4
2  1
2  2
2  3
2  4
3  1
3  3
4  1
4  3
4  4
[torch.LongTensor of size 12x2]


Transposition: transpose(dim1, dim2), or simply t(); the latter works only on two-dimensional Tensors.
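A quick example of t() on a 2-D Tensor; note that, like narrow, the result is a view of the same data:

```lua
x = torch.range(1,6):resize(2,3)
y = x:t()            -- 3x2 view; x:t() is shorthand for x:transpose(1,2)
print(y:size())
y[1][1] = 100
print(x[1][1])       -- 100: x and y share storage
```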

To transpose a multidimensional tensor, use permute(dim1, dim2, ..., dimn):

x = torch.Tensor(3,4,2,5)
> x:size()
3
4
2
5
[torch.LongStorage of size 4]

y = x:permute(2,3,1,4) -- equivalent to y = x:transpose(1,3):transpose(1,2)
> y:size()
4
2
3
5
[torch.LongStorage of size 4]


There is also apply(function), which applies a function to every element of a tensor:

i = 0
z = torch.Tensor(3,3)
z:apply(function(x)
i = i + 1
return i
end)
> z
1  2  3
4  5  6
7  8  9
[torch.DoubleTensor of dimension 3x3]


map(tensor, function(xs, xt)) processes the elements of two tensors pairwise:

x = torch.Tensor(3,3)
y = torch.Tensor(9)
i = 0
x:apply(function() i = i + 1; return i end) -- fill-up x
i = 0
y:apply(function() i = i + 1; return i end) -- fill-up y
> x
1  2  3
4  5  6
7  8  9
[torch.DoubleTensor of dimension 3x3]
> y
1
2
3
4
5
6
7
8
9
[torch.DoubleTensor of dimension 9]
th> z =x:map(y, function(xx, yy) return xx*xx + yy end);print(z)
2   6  12
20  30  42
56  72  90
[torch.DoubleTensor of size 3x3]


split([result,] tensor, size, [dim]) slices a tensor into chunks; size is the chunk size and dim is the dimension along which to split:

th> x = torch.randn(3,10,15)
th> x:split(2,1)
{
1 : DoubleTensor - size: 2x10x15
2 : DoubleTensor - size: 1x10x15
}
[0.0003s]
th> x:split(4,2)
{
1 : DoubleTensor - size: 3x4x15
2 : DoubleTensor - size: 3x4x15
3 : DoubleTensor - size: 3x2x15
}
[0.0004s]
th> x:split(5,3)
{
1 : DoubleTensor - size: 3x10x5
2 : DoubleTensor - size: 3x10x5
3 : DoubleTensor - size: 3x10x5
}


Then there are the Math functions on tensors, such as sum, add, sub, mul, div, mode, max, min, std, mean, pow, rand, randn, log, range, exp, abs, floor, sqrt, and many more; see the Math section of the official documentation.

Because these functions are so common they have been placed in the torch package, so you can write either mean = torch.mean(x) or mean = x:mean().

th> x = torch.range(1,15,3);print(x);print(x:mean());print(torch.mean(x))
1
4
7
10
13
[torch.DoubleTensor of size 5]

7
7


Another very common one is torch.sort([resval, resind,] x [,dim] [,flag]). It sorts ascending by default (pass true as the flag for descending) and returns two tensors: the sorted tensor, and a tensor giving each sorted element's subscript in the original.

th> x = torch.Tensor({8.3,3.4,5.7,1.7,9.3});y,index = x:sort(true);print(y,index)
-- equivalently: y,index = torch.sort(x,true);print(y,index)
9.3000
8.3000
5.7000
3.4000
1.7000
[torch.DoubleTensor of size 5]

5
1
3
2
4
[torch.LongTensor of size 5]


Next post: implementing linear regression in Torch