How to run an autoencoder on your own images (MATLAB)


Source: WebSpider crawl | Time: 2016-12-05 13:01 | Tags: keras autoencoder

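An AutoEncoder-Based Training Algorithm for RBF Neural Networks -- China Science and Technology Information (《中国科技信息》), 2015, No. 09
[Abstract]: Whether parameters such as the centers and widths of an RBF neural network are chosen appropriately directly affects the network's learning performance. Determining these parameters by supervised learning is the most general approach, and research shows that parameter initialization is the key issue for this class of methods. To address this, a new training algorithm is proposed that uses an AutoEncoder to initialize the parameters of the RBF neural network. Simulation experiments show that, compared with the traditional RBF neural network training algorithm, the new algorithm achieves higher training accuracy and stronger generalization.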
[CLC number]: TP183
[Body snapshot]: ... whose main role is to produce the final response to the activity of the input layer. The function expression of the RBF neural network is: ... (the rest of the snapshot repeats the abstract and is cut off at "AutoEncode")
Data Dimensionality Reduction and Reconstruction Based on Autoencoder Networks
Today's post covers the accepted papers and talks at one of the NIPS 2015 workshops, Reasoning Attention Memory (RAM). The workshop had a few highlights. First, it invited researchers from outside DL, for example from cognitive science, who brought a biological perspective on memory. Second, it included a lot of pioneering work, such as the release of a dataset or an outlook on the future of a field. Overall, the models and theory at this workshop are not complicated and the papers are relatively easy to read, so I will mix in many of my own opinions.

Today's post covers:

"How to learn an algorithm", Juergen Schmidhuber, IDSIA.
"From Attention to Memory and towards Longer-Term Dependencies", Yoshua Bengio, University of Montreal.
"Generating Images from Captions with Attention", Elman Mansimov, Emilio Parisotto, Jimmy Lei Ba, Ruslan Salakhutdinov (University of Toronto).
"Smooth Operators: the Rise of Differentiable Attention in Deep Learning", Alex Graves, Google DeepMind.
"Sleep, learning and memory: optimal inference in the prefrontal cortex", Adrien Peyrache, New York University.
"Dynamic Memory Networks for Natural Language Processing", Ankit Kumar, Ozan Irsoy, Peter Ondruska, et al. (MetaMind).
"Chess Q&A: Question Answering on Chess Games", Volkan Cirik, Louis-Philippe Morency, Eduard Hovy (CMU).
"Towards Neural Network-based Reasoning", Baolin Peng, Zhengdong Lu, Hang Li, et al. (Noah's Ark Lab).
"Neural Machine Translation: Progress Report and Beyond", Kyunghyun Cho, New York University.
"A Roadmap towards Machine Intelligence", Tomas Mikolov, Facebook AI Research.

How to learn an algorithm

The first talk has no slides and no paper. However, Prof. Juergen Schmidhuber gave a talk with the same title this October, and 爱可可 once shared the video on Baidu Cloud. If you would rather not watch the video, you can read the 2014 survey "Deep Learning in Neural Networks: An Overview". The video mostly covers the past, present, and future of DL and does not go into any specific model or algorithm. It is rather like a TED-style overview of how the technology has developed; have a look if you are interested. Here are someone else's notes as well:

Juergen gave a brief history of RNNs. RNNs are general purpose computers, and learning a program boils down to learning a weight matrix. An LSTM is an instance of an RNN, good for learning long-term relationships. Juergen was giving a short history of LSTMs, starting with supervised learning, but also RL (e.g., Bakker et al., IROS 2003). LSTMs are now used for all kinds of things (speech recognition, translation, ...).
Then he made this incredibly funny comment that Google already existed 13 years ago, but it was not yet aware that it was becoming a big LSTM.

Juergen's first deep network back in 1991 was a stack of RNNs (Hierarchical Temporal Memory). Deep learning was started by Ivakhnenko in 1965, backpropagation by Linnainmaa in 1970, and Werbos was the first one to apply this idea to NNs back in 1982.

Unsupervised learning is basically nothing but compression.

In the context of life-long meta learning, Juergen mentioned the Goedel machine. The Goedel machine (2003) is a theoretically optimal self-improver, which kind of ticks all kinds of boxes for life-long learning. It provides a solution to the towers-of-Hanoi problem: Learn context-free and context-sensitive languages, and finally learn to solve this problem.

In 2013, he used RNNs for learning from raw videos ("RL in partially observable worlds") using compressed network search.

The talk was more an interesting history lesson than a discussion of current research activities, but nevertheless very interesting.

Future directions: Learn subgoals in a ... learns faster.

From Attention to Memory and towards Longer-Term Dependencies

I could not find matching slides or a paper for this one either (there are too many related papers), so here are someone else's notes together with my own understanding.

One of the reasons the attention mechanisms help is connected to the problem of long-term dependencies (that is, when learning a composition of many nonlinear functions by measuring a loss which depends on all of the nonlinearities, the derivatives can become very small or large depending on the eigenvalues of the Jacobians). If you want a recurrent network to store information reliably, you need some kind of attractors in the dynamics (Jacobians with eigenvalues less than 1). The problem with that is that if they are contractive, it also means that you will have gradient vanishing. So, the condition that requires that RNNs are learnable seems to imply that you must be forgetting things. One of the paths to improving this was the LSTM, which introduces loops in the state-to-state transitions where the derivatives are slightly less than one. An alternative early approach is skip connections or a hierarchy of timescales.

Note from 小S: skip connections are also called shortcut connections. Highway Networks, which I have recommended many times, are built from shortcut connections + gates. The recently red-hot MSRA work that took 1st place at ILSVRC, the Deep Residual Network, can be regarded as a special case of the Highway Network, so it also makes full use of shortcut connections. An example of a hierarchy of timescales was also shared a few days ago: Facebook's Laplacian pyramid GAN work.

Consider a memory whose content can be read and written to. For many of the locations in the memory, very little will happen at each time step, because a softmax is used which usually only selects a few elements. This is similar to how in an LSTM the memory is copied across time. For this to work, however, the memory needs to be large enough so that it doesn't have to read and write from the same locations very often. The idea of a "copy" (preserve information) can be generalized a little bit to an operation where the eigenvalues of the Jacobian are 1.
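As a rough illustration of the eigenvalue argument in these notes, the sketch below (Python with NumPy, not taken from the talk) uses a linear recurrence h_t = W h_{t-1}, an assumed toy setup in which every per-step Jacobian equals W. The helper name jacobian_product_norm and the choice of a scaled orthogonal W are illustrative assumptions; they simply force every eigenvalue to have a chosen magnitude.

```python
import numpy as np

def jacobian_product_norm(eig_magnitude, steps, dim=8, seed=0):
    """Spectral norm of d h_T / d h_0 for the linear recurrence h_t = W h_{t-1}.

    W is a random orthogonal matrix scaled by `eig_magnitude`, so every
    eigenvalue of the per-step Jacobian has exactly that magnitude
    (an assumption made to keep the toy example easy to reason about).
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # orthogonal matrix
    w = eig_magnitude * q
    jac = np.eye(dim)
    for _ in range(steps):
        jac = w @ jac  # chain rule: multiply in one more per-step Jacobian
    return np.linalg.norm(jac, 2)  # largest singular value

for magnitude in (0.9, 1.0, 1.1):
    print(magnitude, jacobian_product_norm(magnitude, steps=50))
```

In this toy setting the norm after 50 steps equals the eigenvalue magnitude raised to the 50th power: it shrinks towards zero for 0.9 (vanishing gradients), blows up for 1.1, and is preserved only at exactly 1, which is the "copy" case the notes describe.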
This all suggests that networks with large memories are valuable, but these make the networks more expensive.

Generating Images from Captions with Attention

This paper combines and extends two generative NN models: one is the differentiable attention mechanism of DRAW, the other is a post-processing deterministic Laplacian pyramid adversarial network (GAN). On top of that, another novelty of this work is that it reverses the image -> caption (text) generation process and uses the caption to generate the image. Expanding on these three points:

1) The improvement over DRAW is that generation becomes conditional, i.e. the model becomes a conditional generative model; see p(x|y, Z_{1:T}) at the top right of Figure 2. This conditional change helps performance a lot.

2) The GAN is left almost unchanged. It is simply combined with DRAW, replacing the inference part, and is used to sharpen the generated images (scenes) so that they look "clearer". Note that this inference part is really just the reconstruction loss. My feeling is that this is needed because the generative model itself is often not powerful enough, or the training set is not sufficient to support it (on the power of generative models, see the discussion of Smooth Operators below). Judging from the generated samples so far, though, this step cannot change the "semantics" of a picture; it feels like plain "sharpening"...

3) Going from caption (text) -> generated image should be the harder direction. In this process the attention mechanism acts as a hint: when attention lands on certain strongly semantic words in the caption, a very clear image object appears in the picture; but if attention "fails", that object does not appear in the image at all. In the experimental analysis the authors also compare various substitutions of image elements to show the strengths of this conditional alignment attention model. For example, by simply swapping words that are attended to correctly, the model can generate counter-to-everyday-life image combinations that never appeared in the training set.

Of course, as noted above, caption -> image generation is still quite hard. For instance, although Deep Learning "claims" it can already tell a cat from a dog, in a multi-modal setting like this, generating the right dog or cat from a caption still looks very difficult. 小S believes that besides the complexity of images themselves, the complexity of existing multi-modal frameworks is also a factor: even though one side is differentiable and the other deterministic, the whole thing is still very complex... Overall, though, the paper is clearly written; even someone who has never read about attention models or image caption generation can use it as an entry point to the framework.

Smooth Operators: the Rise of Differentiable Attention in Deep Learning

This talk is by Alex Graves of the Google DeepMind team; there are no slides with this title online. But Graves is one of the authors of DRAW, and one major contribution of DRAW is that it is the first work to successfully learn a fully differentiable attention policy for sequence generation. So I will first cover DRAW, then related work on differentiable attention, and finally collect the notes I have seen about this talk.

DRAW first appeared at ICML 2015; the arXiv paper is "DRAW: A Recurrent Neural Network For Image Generation". Open source! There is code! The paper opens with three short lines explaining the motivation:

A person asked to draw, paint or otherwise recreate a visual scene will naturally do so in a sequential, iterative fashion, reassessing their handiwork after each modification. Rough outlines are gradually replaced by precise forms, lines are sharpened, darkened or erased, shapes are altered, and the final picture emerges.

In other words, we humans do not paint a picture in one go! We keep revising and refining the details pass after pass. So why do we ask a NN to generate an entire image in a single step? That! Is! Not! Scientific! (Oops, the tone is drifting; let me be more rigorous.) So the authors set out to imitate this pass-by-pass process. Each time we draw only a tiny bit and focus on one part. People will then say that this is nothing new, isn't that what the attention mechanism means? But what does an attention model do? It makes its decision during decoding, at the final stage, rather than during the repeated drawing:

The main challenge faced by sequential attention models is learning where to look, which can be addressed with reinforcement learning techniques such as policy gradients (Mnih et al., 2014). The attention model in DRAW, however, is fully differentiable, making it possible to train with standard backpropagation. In this sense it resembles the selective read and write operations developed for the Neural Turing Machine.

So they propose DRAW, the Deep Recurrent Attentive Writer model. "Recurrent" means refining again and again; "attentive" means focusing on only one part each time. The paper is richly illustrated (with experiments on several image datasets such as MNIST) and even comes with videos (links in the paper). The videos show dynamically how DRAW "writes" each digit.

Below is my understanding of differentiable attention. DRAW is mainly based on the Variational AutoEncoder (VAE) framework, doing inference for a deep generative model. Theories and models for inference in deep generative models keep evolving. The VAE framework is currently developing fastest (I will summarize the details another time). Deep generative models are used not only in image-text models but also in speech-text models; a deep generative model for speech-text even appeared as early as 1998. At that time, however, problems with inference, computation, data noise and so on limited progress in this area. Today computation is no longer a problem and inference is flourishing, hence the revival of deep generative models.

My feeling is that work in this area falls into two categories: one keeps improving and optimizing the inference algorithm, the other combines generative and discriminative models so that they complement each other. The first category needs no elaboration; the second actually involves two kinds of attention mechanism. The first kind is the soft attention we know best, as in "Neural Machine Translation by Jointly Learning to Align and Translate". Soft attention is differentiable, so gradients can be taken directly in back propagation, but its drawback is that it has to examine every input location step by step (all words for text, all pixels for an image). That is computationally very expensive. In contrast, the second kind, hard attention, is stochastic and sampling-based, so it can attend to only one region of the input and needs much less computation. Its drawback is that how to do the sampling, and what form the sampling takes, needs careful thought. Clearly the two kinds are complementary. Moreover, experience from the 1998 work shows that a generative task done alone sometimes performs poorly but can be improved with the help of a discriminative task. So some current work aims at combining soft + hard, for example the Wake-Sleep work. For more on variational autoencoders for deep generative models, see the last part of Prof. Yoshua Bengio's NIPS 2015 Tutorial.

Finally, here is an excerpt from some notes, without much new content:

Explicit attention can be helpful because it limits the data presented to the network in some way - it can be more computationally efficient, more scalable (same amount of input even when the input is larger), it doesn't have to learn to ignore things, and it can also be used for sequential processing of static data (you can turn anything into a sequence - which is full circle from when people used to turn sequential data into fixed-length vectors and statistics). Using "hard" attention usually requires using some kind of reinforcement learning technique, which is required in some settings. However, in other settings it is possible to use a "soft" attention which is differentiable so that we can use end-to-end training. One possibility is to have the attention mechanism output the parameters of a probability distribution to decide which locations in the sequence to attend to (from Graves' 2013 paper). This allows for visualization of the alignment of the produced output to the input data. Alternatively, you can select by content, e.g. output a "key vector" which is compared to all data using some similarity function which is then used to compute a probability distribution over the input which decides what to attend to. For example, the input can be embedded in some space, and then the attention mechanism decides which embeddings are attended to.
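The select-by-content scheme in the notes (a key vector compared against every input, turned into a probability distribution) can be sketched as follows. This is a minimal Python/NumPy illustration, not code from the talk: the function name content_attention and the use of cosine similarity followed by a softmax are assumptions, since the notes only say "some similarity function".

```python
import numpy as np

def content_attention(key, memory):
    """Soft, content-based attention over the rows of `memory`.

    key:    (d,) "key vector" emitted by the network
    memory: (n, d) embedded inputs, one row per location
    Returns (weights, read): a probability distribution over the n
    locations and the correspondingly weighted average of the rows.
    """
    # Cosine similarity is an assumed choice of similarity function.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    sims = memory @ key / norms
    sims = sims - sims.max()               # subtract max for numerical stability
    weights = np.exp(sims) / np.exp(sims).sum()  # softmax over locations
    read = weights @ memory                # differentiable "soft" read
    return weights, read

memory = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
key = np.array([1.0, 0.1])
weights, read = content_attention(key, memory)
print(weights, read)
```

Every location receives a nonzero weight, which is what keeps the operation differentiable and also why the cost grows with the number of input locations; a hard-attention variant would instead sample a single index from `weights`.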
Content-based attention can even be effective when a location-based attention would seem more natural, such as with "very sequential" data.

It's also useful to selectively attend to the network's internal state or memory, e.g. selectively writing rather than writing the data all at once and then deciding what to read (note from 小S: for example NTM, DRAW and Impatient Writer). This may make it easier to build complex structures. In the Neural Turing Machine, this is done by first emitting an erase vector and then emitting an add vector. This can be effective for copying tasks, where the number of copies is variable, where it is important to keep track of how many copies have been produced. In a different example, the neural programmer interpreter avoids the fact that the network must learn from scratch how and what to attend to by being told exactly what to attend to at each point in time. This allows training to happen faster by explicitly giving the model the procedure it should use, rather than just lots of training data.

In DRAW, a grid of Gaussian filters is used both to read and write from the "canvas" image, which produces a 2-dimensional memory. The center of the Gaussians, their variance, and the "strength" of the focus are controlled over time. This allows generating images to happen iteratively, either iteratively sharpening an image or starting from a blurry image and making it sharper. The resulting system is compositional - once it learns to generate one thing, it can generate an arbitrary number of them. Related are spatial transformer networks, which (note from 小S: covered in my DL Symposium post) parameterize the sampling grid as a spline mapping with affine transformations which allows for a nonlinear warping which can un-distort input images. This works in more than two dimensions, too.
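The erase-then-add write mentioned in the notes for the Neural Turing Machine can be sketched in a few lines. This is a minimal Python/NumPy illustration under stated assumptions: the function name ntm_write and the toy values are hypothetical, and the attention weights are simply given rather than produced by a controller network.

```python
import numpy as np

def ntm_write(memory, weights, erase, add):
    """One erase-then-add write in the style of the Neural Turing Machine.

    memory:  (n, d) memory matrix
    weights: (n,) attention weights over locations, each in [0, 1]
    erase:   (d,) erase vector with entries in [0, 1]
    add:     (d,) add vector
    """
    # Erase: each slot is faded in proportion to its attention weight.
    erased = memory * (1.0 - np.outer(weights, erase))
    # Add: new content is written with the same attention weights.
    return erased + np.outer(weights, add)

memory = np.ones((3, 2))
weights = np.array([1.0, 0.0, 0.0])   # focus entirely on location 0
new_memory = ntm_write(memory, weights,
                       erase=np.ones(2), add=np.array([5.0, -1.0]))
print(new_memory)
```

With all the weight on location 0 and a full erase vector, that row is overwritten by the add vector while the other rows are untouched; softer weights would blend old and new content, and both steps stay differentiable.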
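Sleep, Learning and Memory: Optimal Inference in the Prefrontal Cortex

This work mainly presents progress on the cognitive side. Using rats as subjects, it observes activations in the rat prefrontal cortex and uses them to test a hypothesis that is very similar to a claim made in today's memory networks: our spontaneous behavior is based on the prior distribution, evoked behavior is based on the posterior distribution, and the two distributions grow closer and closer over the course of the rat's activity (Wake). Through experiments and various statistical measures the authors verify this hypothesis, and infer from it that activations in the rat prefrontal cortex really do perform inference-by-sampling. I think this inference is the most important part. It should be of some help in designing learning policies for future memory networks.

Dynamic Memory Networks for Natural Language Processing

This work is the "Ask me anything" paper, which was posted on arXiv quite early; papers at ACL 2015 were already citing it. It comes from Richard Socher's MetaMind team. The main idea is to use a dynamic memory network (DMN) framework for QA (or even Understanding Natural Language). The framework consists of several modules and can be trained end-to-end. The core module is the Episodic Memory module, which performs iterative semantic + reasoning processing. DMN first takes the raw input (question), then generates a question representation and sends it to the semantic memory module; the semantic module then passes the question representation + an explicit knowledge base (only envisioned) on to the core Episodic Memory module. The Episodic Memory module first retrieves the facts and concepts involved in the question, then reasons step by step to obtain an answer representation. Since several facts and questions may be involved, an attention mechanism is also used here. Finally, the Answer Module uses the answer representation it receives to generate an actual answer.

Chess Q&A: Question Answering on Chess Games

Reading this together with "Ask me anything" above gives a combo bonus. The paper comes from CMU and also contributes a dataset. The dataset is about reasoning on chess, but the sub-question types it contains are very similar to FB20: questions about numbers, about locations, about sequences of moves, and so on. One way the chess dataset may be better than FB20 is that it involves less common knowledge (limited knowledge requirement). I mentioned before that the ICLR 2016 submission "Reasoning in Vector Space: An Exploratory Study of Question Answering" needs to predefine some common knowledge, for example that north is the opposite of south. In chess, that kind of knowledge is relatively scarce. So this dataset might well become popular too.

Towards Neural Network-based Reasoning

This one is from the Noah's Ark team and also uses the FB20 dataset for reasoning. Like the ICLR 2016 submission "Reasoning in Vector Space: An Exploratory Study of Question Answering" mentioned above, they also consider path finding and positional reasoning the hardest tasks in FB20. The Neural Reasoner they propose reaches nearly 98% accuracy on these two tasks (far above the roughly 60% of earlier models). Although that is not as high as the 100% of the ICLR 2016 submission, it is still impressive. On the model side, their main improvement is a reasoning layer, which at a general level lets the facts and the question interact and then updates the representations (but this was not used in the experiments). Stopping there, the experimental results would not be very good. A more important improvement in this work is that, alongside the reasoning task objective, they add a reconstruction loss objective: the semantic representation we learn should not only reason well, but should also (at least) be able to reconstruct the original question. I personally agree strongly with this.

This is an intuitive argument, but we can find some matching theory. The objective function of the fast-developing Variational AutoEncoder (VAE) also has a reconstruction error term, log p(x|z). The other two terms in the VAE can be merged and viewed as a regularization term dictated by the variational bound. Seen this way, the VAE can also be viewed as jointly optimizing bound + reconstruction, and optimizing the bound is a transformed version of the task objective. So in my view it is no isolated case that Neural Reasoner's performance improves markedly once the reconstruction error is added. Finally, the roles of the various errors/losses in neural networks are discussed nicely in the ICLR 2016 paper "Stacked What-Where Auto-Encoders" (SWWAE). Prof. Yoshua Bengio also recommended the SWWAE work in his NIPS 2015 Tutorial.

Neural Machine Translation: Progress Report and Beyond

A presentation by Kyunghyun Cho, newly appointed Assistant Professor at NYU. The slides were posted online before (ACL workshop); you can search for the title. It mainly focuses on the various neural MT works (encoder-decoder) from 2013 to 2015. For example, people first used an RNN encoder-decoder for MT and found that it worked well most of the time. With long sentences, however, it broke down. The main reason is that the RNN's capacity for representing long-range information (at least when based only on a vanilla RNN) is not good enough. Hence the famous attention-based MT work "Neural Machine Translation by Jointly Learning to Align and Translate". With the attention mechanism, MT performance improved a lot. But another problem remained unsolved: the computation cost was still high and the speed too slow. That led to the 2015 work "On Using Very Large Target Vocabulary for Neural Machine Translation", which proposes approximating the large-vocabulary MT result with a subset of the vocabulary. Beyond that, further improving MT can only start from the training corpus. Sentence-pair corpora for training MT models are always relatively small, while monolingual corpora are comparatively huge. A natural idea is: how can the knowledge in a monolingual corpus be used in MT, so that language-model learning on the monolingual corpus helps the MT language model? This is the work "On Using Monolingual Corpora in Neural Machine Translation". At this point, neural MT can already rival the older phrase-based MT. What's next? Go and read the slides yourselves :)

A Roadmap towards Machine Intelligence

An arXiv preprint from a while ago by Mikolov, the "founder" of word embeddings. The verdict on Weibo at the time was that it has no formulas from start to finish and can be read as science fiction... As an SF fan, I read it... and read... and read... and was quite disappointed. Fiction or not aside, the content really did not feel new. If I had to rate it, it is probably a compilation of various Artificial Intelligence course lectures? And of the Introduction to Artificial Intelligence kind at that. Since it left me cold, I do not know how to recommend it... If you disagree, you are welcome to [write a comment] at the bottom right of this post so that others can see it too and we can exchange ideas.

That wraps up the NIPS 2015 RAM workshop. Meanwhile the NIPS 2015 conference has also come to a close, and a new journey begins... Papers from the main conference will be shared next time. But wait! The CV event ICCV 2015 has started. So, [big preview]: on Wednesday 大S will share the highlights of ICCV 2015, and NIPS is postponed. Thanks for all the discussion and tips! ~\(≧▽≦)/~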
