author	Bryan Bishop <kanzure@gmail.com>	2016-07-27 20:29:06 -0500
committer	Bryan Bishop <kanzure@gmail.com>	2016-07-27 20:29:06 -0500
commit	5c6951872205031b40e53565368158e45c6d5a0f (patch)
tree	5a5154db20aae4ea4e6f74e8898bbc70ea8a109e
parent	ca9cc72aefffbd6af78f34554e567068d4a4003b (diff)
download	diyhpluswiki-5c6951872205031b40e53565368158e45c6d5a0f.tar.gz
	diyhpluswiki-5c6951872205031b40e53565368158e45c6d5a0f.zip
some things from agi-16
-rw-r--r--	transcripts/agi-16/deep-learning-for-agi-survey-of-recent-developments.mdwn	| 64
-rw-r--r--	transcripts/agi-16/deep-neural-networks-cant-make-agi.mdwn	| 33
2 files changed, 97 insertions, 0 deletions
diff --git a/transcripts/agi-16/deep-learning-for-agi-survey-of-recent-developments.mdwn b/transcripts/agi-16/deep-learning-for-agi-survey-of-recent-developments.mdwn
new file mode 100644
index 0000000..fb29fd6
--- /dev/null
+++ b/transcripts/agi-16/deep-learning-for-agi-survey-of-recent-developments.mdwn
@@ -0,0 +1,64 @@
+Deep learning for AGI: Survey of recent developments
+
+Cosmo Harrigan
+
+<http://agi-conf.org/2016/>
+
+<https://www.youtube.com/watch?v=dLdFLVDlJes&list=PLZlLHCryX93L40ZLylvbO8zqfbz3ZNwmm&index=6>
+
+There we go. The floor is yours.
+
+My name is Cosmo Harrigan. I'm going to present a brief overview of work that I have found interesting in the past 2 years that extends on Leslie's comments, on deep learning and how it might be applied to the field of AGI.
+
+Here's a brief outline. First I'll describe how we might view AGI in the context of deep learning. I'll present another definition for deep learning. We'll review several fields. We'll go into intrinsic motivation, program learning, memory, learning to act, issues with one-shot learning, transfer learning, attention, and then some architectures and potential avenues for the future.
+
+Because this is a brief talk, I will only be able to go into these models in brief detail. These are all described in different papers. Why do I think deep learning is relevant for AGI? I think it's relevant for two reasons. First, the methods for deep learning are expanding in scope, beginning to include ways to address memory, program learning, unsupervised learning, action learning, etc. Second, deep learning is being used in conjunction with other systems, to produce hybrid systems.
+
+So what about universal intelligence, like from the tutorials yesterday? In the 2011 Veness et al. paper, planning is expectimax search into the future, and learning is the use of a Bayesian mixture of Turing machines to predict future observations and rewards, in the AIXI framework. We can view practical methods for AGI as approximations of these ideals. Deep learning is one example.
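+
+To make this concrete, here is a rough sketch (my notation, not from the slides) of the ideal being approximated: the learner's predictor is a Bayesian mixture over a class of environment models, and planning is expectimax search over an m-step horizon under that mixture.
+
+```latex
+% Bayesian mixture over environment models (the "mixture of Turing machines")
+\rho(x_{1:n}) = \sum_{\nu \in \mathcal{M}} w_\nu \, \nu(x_{1:n})
+
+% Expectimax planning: choose the action maximizing expected future reward under \rho
+a_t^{*} = \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_{t+m}} \sum_{o_{t+m} r_{t+m}}
+          \Big[ \sum_{k=t}^{t+m} r_k \Big] \, \rho(o_{t:t+m} r_{t:t+m} \mid a_{1:t+m})
+```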
+
+I like this alternative definition of deep learning, where he defines the program implemented by the network, along with several pieces of terminology, like a partially causal sequence, the concept of weight sharing, and credit assignment paths, and uses these to define what we mean by a deep neural network. He defines the network's program as its weights, and a partially causal sequence of events defined by the network. The network encodes a topology. There's the crucial concept of weight sharing in many types of deep neural networks. We also have the concept of credit assignment paths through the network, and potentially causal connections. I apologize for going through this rapidly. These concepts are put together to define the concept of depth; the result is a definition of deep learning which is extremely general and can encompass many architectures.
+
+Now I will review some work from this field. We'll begin with intrinsic motivation. This is a method to allow embodied agents to satisfy internal drives for discovery, which could be defined in many ways, rather than aiming to solve a higher-level goal function. This is useful in environments and settings where you have a long delay before rewards, or sparse rewards. One approach uses a hierarchical value function at different temporal scales, together with an intrinsic motivation function, in a deep learning setting. They are able to learn intrinsic goals, also called options as defined by Richard Sutton. This is an example of their architecture in this diagram.
+
+Here's an interesting application. In the original deep Q-network paper, this was presented as one of the examples that is difficult for the algorithm to achieve. This is an instance of a delayed reward, where you need a complicated sequence of actions. This hierarchical agent, you can see, has achieved very good results in comparison to the original results.
+
+Moving on to other recent work, they have outlined possible functions to use as intrinsic motivation functions. Of these five possibilities, they focus on the concept of empowerment. We could formulate intrinsic drives in many ways, but they all involve an unsupervised learning method that allows an agent to arrive at some measurement of value, to achieve its goals. In this paper, they used a mutual information method, called empowerment, which allowed the agent to achieve this type of reasoning.
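+
+Roughly (my notation, not the speaker's), empowerment measures how much control an agent has over its future: it is the channel capacity from the agent's actions to the resulting successor state, maximized over the agent's choice of action distribution.
+
+```latex
+% Empowerment at state s: maximal mutual information between actions and outcomes
+\mathcal{E}(s) = \max_{\omega(a \mid s)} I(a ; s' \mid s)
+```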
+
+Another theme I found interesting is generative models based on variational auto-encoders, which are latent-variable probabilistic models used in unsupervised learning settings. On the left, we have examples provided to the system, and on the right we see that they learned to hallucinate additional classes, based on their model, in the style of the example on the left.
+
+Additional work done on this is the generative adversarial network. In this framework, they train two models: one is a generative model, which captures the data distribution. The second is a discriminative model, which attempts to determine whether the sample it is given came from the training data or from the generative model. This corresponds to a minimax game. These types of generative models were applied recently to produce learned models of chairs, where they could see latent features and generate new images. They were able to vary features like color to generate different chairs that were not present in the training data. There has been additional work in this regard as well, where they combine Laplacian pyramids with the generative adversarial model, conditioned on class.
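+
+For reference, the standard two-player objective (in the usual notation, not taken from the slides): the discriminator D is trained to score real data highly and generated samples lowly, while the generator G is trained to fool it.
+
+```latex
+\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)]
+              + \mathbb{E}_{z \sim p_z}[\log (1 - D(G(z)))]
+```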
+
+So what about memory and program learning? There have been some recent works in this regard. There was a feedback recurrent memory Q-network, where there is context-dependent retrieval of memories. They were able to use an attention model to determine which memories to focus on, for computing a value function. The model is illustrated on this slide. A minimal sketch of what attention-based retrieval means is given below.
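+
+This sketch is my simplification, not the paper's exact model: the agent scores each stored memory against the current context vector and reads out a weighted combination of the memories.
+
+```python
+import numpy as np
+
+def softmax(x):
+    e = np.exp(x - x.max())
+    return e / e.sum()
+
+def attend(query, memories):
+    """query: (d,) context vector; memories: (n, d) stored memory vectors.
+    Returns a soft, context-dependent read-out of the memory bank."""
+    scores = memories @ query      # similarity of each memory to the context
+    weights = softmax(scores)      # attention distribution over memories
+    return weights @ memories      # retrieved memory (convex combination)
+
+# Example: retrieve from 5 random 8-dimensional memories given a context vector.
+rng = np.random.default_rng(0)
+M = rng.normal(size=(5, 8))
+h = rng.normal(size=8)
+o = attend(h, M)
+```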
+
+Related to memory, we have the concept of program learning, which is a key aspect for AGI. There was the neural Turing machine paper, in which a couple of neural networks .... the diagram of the architecture ... now we will describe some of its properties and applications. They have applied it to sorting, associative recall, and other simple tasks. They have constructed an architecture that extends standard recurrent neural networks to allow them to utilize a memory. This is analogous to a working memory system.
+
+Additional recent work that has extended this is the neural GPU. This, in contrast with the neural Turing machine, is a highly parallel architecture, designed to make it easier to train and more efficient. They have applied this to several tasks.
+
+Neural programmer-interpreter. This was a recurrent and compositional network that learned to represent and execute programs, in Python. It's interesting to look at their results. They compared it to a traditional sequence-to-sequence network. First they looked at sample complexity. After 8 training examples it achieves perfect accuracy, whereas the sequence-to-sequence one took much longer. We can also look at the generalization ability, where we see that it is able to generalize much better.
+
+Moving on to the next topic, neuroevolution and learning to act. Neural networks can be used for feature learning and reinforcement learning. We can represent the controller as a neural network, and optimize it using genetic algorithms.
+
+In a sequence of three papers, they learned to drive from screen pixels, in a simulator. They used a genetic algorithm to train a convolutional neural network and another network, and they did not use backpropagation in this case.
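+
+A hedged sketch of the general recipe (the controller size and the fitness function here are placeholders, not the papers' setup): the controller's weights are the genome, and a simple genetic algorithm selects and mutates them, with no backpropagation at all.
+
+```python
+import numpy as np
+
+rng = np.random.default_rng(0)
+IN, HID, OUT = 4, 8, 2
+N_PARAMS = IN * HID + HID * OUT
+
+def controller(params, obs):
+    """Small feedforward policy: flat parameter vector -> action in [-1, 1]."""
+    W1 = params[:IN * HID].reshape(IN, HID)
+    W2 = params[IN * HID:].reshape(HID, OUT)
+    return np.tanh(np.tanh(obs @ W1) @ W2)
+
+def fitness(params):
+    # Placeholder for "drive in the simulator and measure progress";
+    # here we just reward matching a fixed target mapping.
+    obs = rng.normal(size=(32, IN))
+    target = np.tanh(obs[:, :OUT])
+    return -np.mean((controller(params, obs) - target) ** 2)
+
+pop = rng.normal(size=(64, N_PARAMS))          # initial random population
+for gen in range(100):
+    scores = np.array([fitness(p) for p in pop])
+    elite = pop[np.argsort(scores)[-16:]]                 # keep the best 16
+    children = elite[rng.integers(0, 16, size=48)]        # clone parents
+    children = children + 0.1 * rng.normal(size=children.shape)  # mutate
+    pop = np.vstack([elite, children])
+```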
+
+Another recent case used backpropagation rather than neuroevolution to train this kind of network, for taking captions and targets. These are both examples of deep reinforcement learning.
+
+The next field I would like to briefly summarize is that of data efficiency and one-shot learning, which was mentioned in the previous presentation. If we look at part 1, we see that we're given one example, highlighted in red, and the problem is to look at the remaining examples and see which ones belong to the same class. Humans can do this easily given one example; the problem is to build systems that also have that ability. If we look at the second example, the problem is being given one example and being able to generate new examples. This is known as one-shot learning.
+
+Original deep reinforcement learning methods were highly data-inefficient. There have been several recent attempts to overcome these issues. One approach is called episodic control. It re-enacts successful policies, stored in memory, by following highly rewarding sequences that have achieved good results in the past. This is an instance of a fast learning system, based on non-parametric memorization of certain experiences. The context in which this was presented is that humans and animals are able to learn rapidly from single examples using multiple memory and decision systems. Sometimes it is appropriate to use model-based planning, which takes more resources, but sometimes you need to make fast decisions, and then you need less resource-intensive methods. Model-free episodic control is an instance-based method that is used as a rough approximation when fewer resources are available.
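+
+A minimal sketch of the tabular core of that idea (my simplification; state representations and the nearest-neighbour estimate for unseen states are omitted): for each state-action pair, keep the best Monte Carlo return ever obtained, and act greedily with respect to that table.
+
+```python
+Q_EC = {}  # (state, action) -> best return seen so far
+
+def update_episode(trajectory, gamma=0.99):
+    """trajectory: list of (state, action, reward) from one finished episode."""
+    G = 0.0
+    for state, action, reward in reversed(trajectory):
+        G = reward + gamma * G                 # Monte Carlo return from this step
+        key = (state, action)
+        Q_EC[key] = max(Q_EC.get(key, G), G)   # remember the highest return
+
+def act(state, actions):
+    """Act greedily, re-enacting the most rewarding remembered behaviour."""
+    return max(actions, key=lambda a: Q_EC.get((state, a), 0.0))
+
+# Example usage with toy states and actions:
+update_episode([("s0", "left", 0.0), ("s1", "right", 1.0)])
+best = act("s0", ["left", "right"])
+```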
+
+Other work in this vein of data efficiency and one-shot learning was presented in "One-shot learning with memory-augmented neural networks". They were able to rapidly learn from new data, and utilize it for further instances. More in the deep reinforcement learning setting, there is the concept of experience replay, which stores prior experiences and replays them internally to continue to train the model. Traditional experience replay does not have a way to prioritize certain memories. In 2015, Schaul et al. introduced prioritized experience replay, where they have a method of sampling based on the predicted importance of the memories.
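+
+In the usual formulation (standard notation, not from the slides), each stored transition i gets a priority, typically from the magnitude of its TD error, and replay samples are drawn in proportion to it.
+
+```latex
+% Priority from the TD error, and the resulting sampling probability
+p_i = |\delta_i| + \epsilon, \qquad
+P(i) = \frac{p_i^{\alpha}}{\sum_k p_k^{\alpha}}
+```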
+
+Another interesting line of work was multi-task learning. In this case, they were able to train a single policy network, using the guidance of teachers, to act in a set of distinct tasks. There was another example called "deep skill networks", using a hierarchical reinforcement learning network for lifelong learning. This is an illustration of their architecture.
+
+Universal value function approximators. This expands the concept of a value function in reinforcement learning: rather than being a function of just the state, it is generalized to a state-goal pair, so that it can generalize to new goals as the agent acquires new experiences and requirements.
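+
+Schematically (standard notation, not from the talk), the value function takes the goal as an extra argument, and can be factored into state and goal embeddings so that it generalizes across both.
+
+```latex
+V(s) \;\rightarrow\; V(s, g) \;\approx\; \phi(s)^{\top} \psi(g)
+```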
+
+I want to present some recent architectures: highway networks, deep recurrent networks, ladder networks for unsupervised learning, the dueling network for learning an advantage function in conjunction with the value function from raw pixels; also automated theorem proving applications. There is also a recurrent neural network controller paired with a recurrent neural network model, which utilize each other to learn to predict and control in an environment.
+
+I would like to point out that many of these models take inspiration from neuroscience. There was a recent paper which argued that the brain uses structured architectures, with dedicated systems for attention, short-term memory, and long-term memory, and that these architectures are heterogeneous. And there is a series of interacting cost functions, which make learning data-efficient and targeted to the needs of the organism.
+
+In conclusion, I think we will see more hybrid models. We may see work aiming to address planning and reasoning, building cognitive architectures using components from deep learning. Future generations of neural networks will look very different from modern networks. We might see more structure, and inductive biases will be built into the networks, or learned from previous experience on previous tasks, leading to more human-like behavior.
+
+By expanding the scope of deep learning and combining it into hybrid systems, I think it is relevant to the topic of artificial general intelligence.
+
+Model-based reinforcement learning and planners.
+
diff --git a/transcripts/agi-16/deep-neural-networks-cant-make-agi.mdwn b/transcripts/agi-16/deep-neural-networks-cant-make-agi.mdwn
new file mode 100644
index 0000000..577b2e3
--- /dev/null
+++ b/transcripts/agi-16/deep-neural-networks-cant-make-agi.mdwn
@@ -0,0 +1,33 @@
+Deep neural networks can't make AGI
+
+Brandon Rohrer
+
+<https://www.youtube.com/watch?v=KK6uHsIm8rA&list=PLZlLHCryX93L40ZLylvbO8zqfbz3ZNwmm&index=7>
+
+The intention was to have an inflammatory conversation about whether deep neural networks are as awesome as the media implies, and whether they can solve everything. If you happen to agree that they're not enough, then you should think about what to fill the gap with. I'm not going to take that opportunity right now, but I will elucidate what I see the gap as being.
+
+We have what deep learning can do. I would sum it up with a broad brush by saying that it finds patterns, specifically in terms of weighted combinations of inputs. If you look at most of the talks from this morning and yesterday, a lot of the conversation started at the symbolic level. Going from pixels, touch sensors, and microphones up to symbols is a huge gap that has to be crossed. Can deep learning do that?
+
+Deep learning can do some generalization, especially with convolutional neural networks, and with some add-ons and some other layers you can do classification (assigning categories), regression (assigning a numerical value), and clustering (determining what's similar), and you can also choose your actions.
+
+Deep learning is, in fact, cool. You can take raw images, expose the network to them, and then learn features that, when reconstituted, are quite interesting. You can expose it to cars and learn features that look like cars. This idea of going from raw pixels to symbols is quite plausible here.
+
+You can do this with imagery, with raw exposure to data. You can take raw audio soundtracks, break them up into frequencies, run deep learning, learn features, and then figure out the artist from this. Modest Mouse and the president of the U.S. are similar. Beyonce and Taylor Swift aren't too far away from each other. And so on.
+
+You can also, as mentioned, use deep Q-networks to learn to play Atari very well. You can also do crazy things like have a robot watch YouTube videos, learn types of objects, and, with some other mechanisms, apply those to learning how to cook.
+
+So we have one edge of our gap. The other edge of our gap is: what's necessary for AGI? For the purposes of this conversation, there's a lot of debate there, yes, but I would like to focus on things that are externally observable. I don't want to touch internal representations or internal processes or affects. What can the thing do? I would like to use human performance as a threshold. This is the human-level AI concept. Can deep learning do everything that a human can do?
+
+What can humans do that deep neural networks cannot do? I want to throw out some concrete observations. In the category of action, there's switching context and making plans. Deep neural networks don't do that well. Generalization and adaptation, in perception, are severely lacking.
+
+What I mean by switching context is that existing approaches map one set of inputs to one action. Humans don't appear to rely on that. In the example of a self-driving car trained in the US, if you take it to the UK, it would probably take more than a 5-minute lesson to get it to work.
+
+And planning. If you look at how the algorithm did on the Atari games, there was a pattern: in games where the action was determined by the current state of the screen, it blew humans out of the water. But when you had to do a series of actions, it was awful at that. There are ways around this, but I'm not aware of general ways around this using only deep neural networks. This has still not happened well.
+
+There have been some successes in Go and chess. But those parts, the planning, were not done with deep neural networks. They were done using tree search.
+
+Generalization is one of my favorites. If you look at the evidence for deep neural networks, they were designed to identify cats. The generalization they do is that they are good at taking an array of pixels, finding a pattern, and finding it no matter where it is in the image. As long as your data has 2D or 3D structure, you can have translational invariance and translational generalization. What they do not do is find things that look different in their raw pixel format but are similar in some other way. And they don't work well if you have data that doesn't have 2D structure. There's a very narrow, carefully defined set of problems where this works. If you were to have an implementation like an advanced robot with some cameras and range sensors and microphones, you can't take all of that and put it into a nice 2D array that is meaningful, meaning that two inputs that are next to each other are similar and two distant signals are not related. So the generalization that deep neural networks do right now is very limited.
+
+Finally, adaptation. As was mentioned earlier this morning, and I was nodding vigorously as Gary Marcus was talking, it would be good to have not only benchmarks where you have to learn to do a task, but benchmarks where you have to learn to do a set of tasks, and better yet where you do a set of tasks and are then tested on a completely novel task, or where the world changes in some fundamental way during the course of the task. How well does it learn to play chess after the same number of games? If a computer and a human each first play 10 games of chess before playing each other, being able to quantify that will be interesting.
+
+What do we need to fill this gap with? Thanks for your attention.