Theano 2.1.10 - Basics: Loops


Source: http://deeplearning.net/software/theano/tutorial/loop.html

loop

1. Scan

  • A general form of recurrence, which can be used for looping.
  • Reduction and map (looping over the leading dimension) are special cases of scan.
  • You scan a function along some input sequence, producing an output at each time step.
  • The function can see the previous K time steps of its result.
  • sum() can be computed by scanning the function z + x(i) over a list, with initial state z = 0 (a minimal sketch follows this list).
  • Often a for loop can be expressed as a scan() op, and scan is the closest Theano comes to a loop construct.
  • Advantages of using scan over a for loop:
    • The number of iterations becomes part of the symbolic graph.
    • Minimizes GPU transfers (if a GPU is involved).
    • Computes gradients through sequential steps.
    • Slightly faster than running a compiled Theano for loop in Python.
    • Can lower the overall memory usage by detecting how much memory is actually needed.

The complete documentation can be found in the corresponding library reference: .
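As a warm-up before the worked examples, here is a minimal sketch of the sum() idea from the list above (my own illustration, with the hypothetical name compute_sum; it is not part of the original tutorial): scan accumulates z + x(i) over a vector, starting from z = 0, and the last accumulator value is the sum.

import theano
import theano.tensor as T
import numpy as np

x = T.vector("x")
# outputs_info supplies the initial accumulator z = 0; at each step the inner
# function receives the current element x_i and the previous accumulated value z
results, updates = theano.scan(lambda x_i, z: z + x_i,
                               sequences=x,
                               outputs_info=np.asarray(0., dtype=theano.config.floatX))
compute_sum = theano.function(inputs=[x], outputs=results[-1])

print compute_sum(np.arange(5, dtype=theano.config.floatX))
# 10.0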

1.1 Scan example: computing tanh(x(t).dot(W) + b) elementwise

import theano
import theano.tensor as T
import numpy as np

# defining the tensor variables
X = T.matrix("X")
W = T.matrix("W")
b_sym = T.vector("b_sym")

results, updates = theano.scan(lambda v: T.tanh(T.dot(v, W) + b_sym), sequences=X)
compute_elementwise = theano.function(inputs=[X, W, b_sym], outputs=[results])

# test values
x = np.eye(2, dtype=theano.config.floatX)
w = np.ones((2, 2), dtype=theano.config.floatX)
b = np.ones((2), dtype=theano.config.floatX)
b[1] = 2

print compute_elementwise(x, w, b)[0]

# comparison with numpy
print np.tanh(x.dot(w) + b)
1.2 Scan example: computing the sequence x(t) = tanh(x(t - 1).dot(W) + y(t).dot(U) + p(T - t).dot(V))

import theano
import theano.tensor as T
import numpy as np

# define tensor variables
X = T.vector("X")
W = T.matrix("W")
b_sym = T.vector("b_sym")
U = T.matrix("U")
Y = T.matrix("Y")
V = T.matrix("V")
P = T.matrix("P")

results, updates = theano.scan(lambda y, p, x_tm1: T.tanh(T.dot(x_tm1, W) + T.dot(y, U) + T.dot(p, V)),
                               sequences=[Y, P[::-1]], outputs_info=[X])
compute_seq = theano.function(inputs=[X, W, Y, U, P, V], outputs=[results])

# test values
x = np.zeros((2), dtype=theano.config.floatX)
x[1] = 1
w = np.ones((2, 2), dtype=theano.config.floatX)
y = np.ones((5, 2), dtype=theano.config.floatX)
y[0, :] = -3
u = np.ones((2, 2), dtype=theano.config.floatX)
p = np.ones((5, 2), dtype=theano.config.floatX)
p[0, :] = 3
v = np.ones((2, 2), dtype=theano.config.floatX)

print compute_seq(x, w, y, u, p, v)[0]

# comparison with numpy
x_res = np.zeros((5, 2), dtype=theano.config.floatX)
x_res[0] = np.tanh(x.dot(w) + y[0].dot(u) + p[4].dot(v))
for i in range(1, 5):
    x_res[i] = np.tanh(x_res[i - 1].dot(w) + y[i].dot(u) + p[4-i].dot(v))
print x_res
1.3 Scan example: computing the norm of each row of X (i.e., along one dimension)

import theano
import theano.tensor as T
import numpy as np

# define tensor variable
X = T.matrix("X")
results, updates = theano.scan(lambda x_i: T.sqrt((x_i ** 2).sum()), sequences=[X])
compute_norm_lines = theano.function(inputs=[X], outputs=[results])

# test value
x = np.diag(np.arange(1, 6, dtype=theano.config.floatX), 1)
print compute_norm_lines(x)[0]

# comparison with numpy
print np.sqrt((x ** 2).sum(1))
1.4 Scan example: computing the norm of each column of X

import theano
import theano.tensor as T
import numpy as np

# define tensor variable
X = T.matrix("X")
results, updates = theano.scan(lambda x_i: T.sqrt((x_i ** 2).sum()), sequences=[X.T])
compute_norm_cols = theano.function(inputs=[X], outputs=[results])

# test value
x = np.diag(np.arange(1, 6, dtype=theano.config.floatX), 1)
print compute_norm_cols(x)[0]

# comparison with numpy
print np.sqrt((x ** 2).sum(0))
1.5 Scan example: computing the trace of X

import theano
import theano.tensor as T
import numpy as np

floatX = "float32"

# define tensor variable
X = T.matrix("X")
results, updates = theano.scan(lambda i, j, t_f: T.cast(X[i, j] + t_f, floatX),
                               sequences=[T.arange(X.shape[0]), T.arange(X.shape[1])],
                               outputs_info=np.asarray(0., dtype=floatX))
result = results[-1]
compute_trace = theano.function(inputs=[X], outputs=[result])

# test value
x = np.eye(5, dtype=theano.config.floatX)
x[0] = np.arange(5, dtype=theano.config.floatX)
print compute_trace(x)[0]

# comparison with numpy
print np.diagonal(x).sum()

1.6 Scan example: computing the sequence x(t) = x(t - 2).dot(U) + x(t - 1).dot(V) + tanh(x(t - 1).dot(W) + b)

import theano
import theano.tensor as T
import numpy as np

# define tensor variables
X = T.matrix("X")
W = T.matrix("W")
b_sym = T.vector("b_sym")
U = T.matrix("U")
V = T.matrix("V")
n_sym = T.iscalar("n_sym")

results, updates = theano.scan(lambda x_tm2, x_tm1: T.dot(x_tm2, U) + T.dot(x_tm1, V) + T.tanh(T.dot(x_tm1, W) + b_sym),
                               n_steps=n_sym, outputs_info=[dict(initial=X, taps=[-2, -1])])
compute_seq2 = theano.function(inputs=[X, U, V, W, b_sym, n_sym], outputs=[results])

# test values
x = np.zeros((2, 2), dtype=theano.config.floatX)  # the initial value must be able to return x[-2]
x[1, 1] = 1
w = 0.5 * np.ones((2, 2), dtype=theano.config.floatX)
u = 0.5 * (np.ones((2, 2), dtype=theano.config.floatX) - np.eye(2, dtype=theano.config.floatX))
v = 0.5 * np.ones((2, 2), dtype=theano.config.floatX)
n = 10
b = np.ones((2), dtype=theano.config.floatX)

print compute_seq2(x, u, v, w, b, n)

# comparison with numpy
x_res = np.zeros((10, 2))
x_res[0] = x[0].dot(u) + x[1].dot(v) + np.tanh(x[1].dot(w) + b)
x_res[1] = x[1].dot(u) + x_res[0].dot(v) + np.tanh(x_res[0].dot(w) + b)
x_res[2] = x_res[0].dot(u) + x_res[1].dot(v) + np.tanh(x_res[1].dot(w) + b)
for i in range(2, 10):
    x_res[i] = (x_res[i - 2].dot(u) + x_res[i - 1].dot(v) +
                np.tanh(x_res[i - 1].dot(w) + b))
print x_res
1.7 Scan example: computing the Jacobian of y = tanh(v.dot(A)) with respect to x

import theano
import theano.tensor as T
import numpy as np

# define tensor variables
v = T.vector()
A = T.matrix()
y = T.tanh(T.dot(v, A))
results, updates = theano.scan(lambda i: T.grad(y[i], v), sequences=[T.arange(y.shape[0])])
compute_jac_t = theano.function([A, v], [results], allow_input_downcast=True)  # shape (d_out, d_in)

# test values
x = np.eye(5, dtype=theano.config.floatX)[0]
w = np.eye(5, 3, dtype=theano.config.floatX)
w[2] = np.ones((3), dtype=theano.config.floatX)
print compute_jac_t(w, x)[0]

# compare with numpy
print ((1 - np.tanh(x.dot(w)) ** 2) * w).T

    Note that we need to iterate over the indices of y rather than over the elements of y themselves. The reason is that scan creates a placeholder variable for its inner function, and this placeholder does not have the same dependencies as the variable it replaces.

1.8 Scan example: accumulating the loop count inside scan

import theano
import theano.tensor as T
import numpy as np

# define shared variables
k = theano.shared(0)
n_sym = T.iscalar("n_sym")

results, updates = theano.scan(lambda: {k: (k + 1)}, n_steps=n_sym)
accumulator = theano.function([n_sym], [], updates=updates, allow_input_downcast=True)

k.get_value()
accumulator(5)
k.get_value()
1.9 Scan example: computing tanh(v.dot(W) + b) * d, where d is binomial

import theano
import theano.tensor as T
import numpy as np

# define tensor variables
X = T.matrix("X")
W = T.matrix("W")
b_sym = T.vector("b_sym")

# define shared random stream
trng = T.shared_randomstreams.RandomStreams(1234)
d = trng.binomial(size=W[1].shape)

results, updates = theano.scan(lambda v: T.tanh(T.dot(v, W) + b_sym) * d, sequences=X)
compute_with_bnoise = theano.function(inputs=[X, W, b_sym], outputs=[results],
                                      updates=updates, allow_input_downcast=True)

x = np.eye(10, 2, dtype=theano.config.floatX)
w = np.ones((2, 2), dtype=theano.config.floatX)
b = np.ones((2), dtype=theano.config.floatX)

print compute_with_bnoise(x, w, b)

    Note that if you want to use a random variable d that is not updated through the scan loop, you should pass it to scan as a non_sequences argument.
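A minimal sketch of that variant (my own illustration, with the hypothetical name compute_with_fixed_noise; it is not part of the original tutorial): d is sampled once outside the loop and passed through non_sequences, so the same noise mask is applied at every time step. The inner function receives it as an extra argument after the sequence element.

import theano
import theano.tensor as T
import numpy as np

X = T.matrix("X")
W = T.matrix("W")
b_sym = T.vector("b_sym")

trng = T.shared_randomstreams.RandomStreams(1234)
d = trng.binomial(size=W[1].shape)

# d is passed via non_sequences, so it is not resampled at each step of the loop
results, updates = theano.scan(lambda v, noise: T.tanh(T.dot(v, W) + b_sym) * noise,
                               sequences=X, non_sequences=d)
compute_with_fixed_noise = theano.function(inputs=[X, W, b_sym], outputs=[results],
                                           updates=updates, allow_input_downcast=True)

x = np.eye(10, 2, dtype=theano.config.floatX)
w = np.ones((2, 2), dtype=theano.config.floatX)
b = np.ones((2), dtype=theano.config.floatX)
print compute_with_fixed_noise(x, w, b)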

1.10 Scan example: computing pow(A, k)

import theano
import theano.tensor as T

theano.config.warn.subtensor_merge_bug = False

k = T.iscalar("k")
A = T.vector("A")

def inner_fct(prior_result, B):
    return prior_result * B

# Symbolic description of the result
result, updates = theano.scan(fn=inner_fct,
                              outputs_info=T.ones_like(A),
                              non_sequences=A, n_steps=k)

# Scan has provided us with A ** 1 through A ** k.  Keep only the last
# value. Scan notices this and does not waste memory saving them.
final_result = result[-1]

power = theano.function(inputs=[A, k], outputs=final_result,
                        updates=updates)

print power(range(10), 2)
# [  0.   1.   4.   9.  16.  25.  36.  49.  64.  81.]
1.11 Scan example: computing a polynomial

import numpy
import theano
import theano.tensor as T

theano.config.warn.subtensor_merge_bug = False

coefficients = theano.tensor.vector("coefficients")
x = T.scalar("x")
max_coefficients_supported = 10000

# Generate the components of the polynomial
full_range = theano.tensor.arange(max_coefficients_supported)
components, updates = theano.scan(fn=lambda coeff, power, free_var:
                                  coeff * (free_var ** power),
                                  outputs_info=None,
                                  sequences=[coefficients, full_range],
                                  non_sequences=x)
polynomial = components.sum()
calculate_polynomial = theano.function(inputs=[coefficients, x],
                                       outputs=polynomial)

test_coeff = numpy.asarray([1, 0, 2], dtype=numpy.float32)
print calculate_polynomial(test_coeff, 3)
# 19.0

2. Exercise

    Run both examples above.

    Modify and execute the polynomial example so that the reduction is performed by scan.

Solution ( )

#!/usr/bin/env python
# Theano tutorial
# Solution to Exercise in section 'Loop'

from __future__ import print_function
import numpy
import theano
import theano.tensor as tt

# 1. First example

theano.config.warn.subtensor_merge_bug = False

k = tt.iscalar("k")
A = tt.vector("A")

def inner_fct(prior_result, A):
    return prior_result * A

# Symbolic description of the result
result, updates = theano.scan(fn=inner_fct,
                              outputs_info=tt.ones_like(A),
                              non_sequences=A, n_steps=k)

# Scan has provided us with A ** 1 through A ** k.  Keep only the last
# value. Scan notices this and does not waste memory saving them.
final_result = result[-1]

power = theano.function(inputs=[A, k], outputs=final_result,
                        updates=updates)

print(power(range(10), 2))
# [  0.   1.   4.   9.  16.  25.  36.  49.  64.  81.]

# 2. Second example

coefficients = tt.vector("coefficients")
x = tt.scalar("x")
max_coefficients_supported = 10000

# Generate the components of the polynomial
full_range = tt.arange(max_coefficients_supported)
components, updates = theano.scan(fn=lambda coeff, power, free_var:
                                  coeff * (free_var ** power),
                                  sequences=[coefficients, full_range],
                                  outputs_info=None,
                                  non_sequences=x)
polynomial = components.sum()
calculate_polynomial1 = theano.function(inputs=[coefficients, x],
                                        outputs=polynomial)

test_coeff = numpy.asarray([1, 0, 2], dtype=numpy.float32)
print(calculate_polynomial1(test_coeff, 3))
# 19.0

# 3. Reduction performed inside scan

theano.config.warn.subtensor_merge_bug = False

coefficients = tt.vector("coefficients")
x = tt.scalar("x")
max_coefficients_supported = 10000

# Generate the components of the polynomial
full_range = tt.arange(max_coefficients_supported)
outputs_info = tt.as_tensor_variable(numpy.asarray(0, 'float64'))

components, updates = theano.scan(fn=lambda coeff, power, prior_value, free_var:
                                  prior_value + (coeff * (free_var ** power)),
                                  sequences=[coefficients, full_range],
                                  outputs_info=outputs_info,
                                  non_sequences=x)
polynomial = components[-1]
calculate_polynomial = theano.function(inputs=[coefficients, x],
                                       outputs=polynomial, updates=updates)

test_coeff = numpy.asarray([1, 0, 2], dtype=numpy.float32)
print(calculate_polynomial(test_coeff, 3))
# 19.0
References:

[1] Official documentation: http://deeplearning.net/software/theano/tutorial/loop.html
