I have a Doctorate in Biotechnology and I'm a machine learning expert based in Hong Kong.
Recently, DeepMind beat the current world number one in the game of Go.
The self-learning AI algorithm may have won at the ancient two-player board game, but don't ask it to win at Pac-Man. It can't. This is known as narrow or weak AI - AI built specifically to solve a single, narrow task.
The term strong AI was popularised by the Chinese Room thought experiment, in which a person inside a room, following a software program's instructions, replies to questions posed by a Chinese speaker outside the room and convinces that speaker they are conversing with a live Chinese speaker. Strong AI aims to genuinely replicate human reasoning; weak AI, on the other hand, merely simulates it.
It helps to understand the problem by visualising the elements of Artificial General Intelligence (AGI): the eyes (computer vision), the ears and mouth (text and natural language processing), and touch (haptic robots). The missing piece of the puzzle is the command centre, the brain that joins them all - AGI.
The goal of AGI is to create this brain: a system resembling human thinking that can infer across a wide spectrum of situations, so that an AGI system displays human-like common sense as opposed to possessing a narrow skill.
The holy grail of AGI is the unification of computer vision, text and natural language processing, and robotics, commanded by a single brain that possesses human-like skill over a comprehensive range of tasks. We call AGI "AI 2.0".
Imagine a software program that successfully learns to win the game Pong and then moves on to learn how to win at Minecraft, using what it learnt from Pong to spend less time learning to win at Minecraft - two very different games. That's called transfer learning: the ability of an AI to learn from diverse tasks and apply its accumulated education to a novel task. The advantage is that this method drastically reduces the time needed to learn the solution to a new task from scratch. DeepMind has developed PathNet - a network of neural networks trained using both stochastic gradient descent and a genetic selection method.
Using the Pong and Minecraft games to demonstrate successful transfer learning: fixing the parameters along a path learned while winning at Pong, and re-evolving a new population of paths for Minecraft, allows winning at Minecraft to be learned faster than it could be learned from scratch.
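The transfer idea above can be sketched in a few lines: learn some parameters on one task, freeze them, and train only a small new "head" for the second task. The toy NumPy example below uses linear layers and synthetic data as stand-ins for the two games - the dimensions, data, and fitting procedure are illustrative assumptions, not details from the PathNet paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

def train(x, y, w, lr=0.02, steps=2000):
    # Plain full-batch gradient descent on mean squared error.
    for _ in range(steps):
        w -= lr * x.T @ (x @ w - y) / len(x)
    return w

# "Task A" (stand-in for Pong): learn shared feature weights from scratch.
true_shared = rng.normal(size=(8, 4))
x_a = rng.normal(size=(200, 8))
w_shared = train(x_a, x_a @ true_shared, np.zeros((8, 4)))

# "Task B" (stand-in for Minecraft): reuse the frozen Task-A weights and
# train only a small new head on top of the frozen features.
true_head = rng.normal(size=(4, 2))
x_b = rng.normal(size=(200, 8))
y_b = (x_b @ true_shared) @ true_head
feats_b = x_b @ w_shared            # frozen: w_shared is never updated here
w_head = train(feats_b, y_b, np.zeros((4, 2)))

loss_b = mse(feats_b @ w_head, y_b)
```

Because Task B only fits the small head, it converges with far fewer trainable parameters than learning everything from scratch - the same intuition, in miniature, as fixing a learned path and evolving new ones.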
Abstract: For artificial general intelligence (AGI) it would be efficient if multiple users trained the same giant neural network, permitting parameter reuse, without catastrophic forgetting. PathNet is a first step in this direction. It is a neural network algorithm that uses agents embedded in the neural network whose task is to discover which parts of the network to re-use for new tasks. Agents are pathways (views) through the network which determine the subset of parameters that are used and updated by the forwards and backwards passes of the backpropagation algorithm. During learning, a tournament selection genetic algorithm is used to select pathways through the neural network for replication and mutation. Pathway fitness is the performance of that pathway measured according to a cost function. We demonstrate successful transfer learning; fixing the parameters along a path learned on task A and re-evolving a new population of paths for task B, allows task B to be learned faster than it could be learned from scratch or after fine-tuning. Paths evolved on task B re-use parts of the optimal path evolved on task A. Positive transfer was demonstrated for binary MNIST, CIFAR, and SVHN supervised learning classification tasks, and a set of Atari and Labyrinth reinforcement learning tasks, suggesting PathNets have general applicability for neural network training. Finally, PathNet also significantly improves the robustness to hyperparameter choices of a parallel asynchronous reinforcement learning algorithm (A3C).
Pub.: 30 Jan '17, Pinned: 17 Jun '17
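The tournament-selection step described in the PathNet abstract can be sketched in miniature: a pathway is a small subset of modules per layer, two pathways compete, and the loser is overwritten by a mutated copy of the winner. The fitness function below is a toy stand-in (real PathNet fitness is task performance after training the path's parameters), and all sizes and rates here are invented for the demo.

```python
import random

random.seed(0)

LAYERS, MODULES, ACTIVE = 3, 10, 3  # layers, modules per layer, modules each path uses

def random_path():
    # A pathway selects a small subset of modules in each layer.
    return [random.sample(range(MODULES), ACTIVE) for _ in range(LAYERS)]

def fitness(path):
    # Toy stand-in: pretend low-index modules perform best, so fitness rises
    # as a path concentrates on them. Real PathNet measures task reward.
    return -sum(sum(layer) for layer in path)

def mutate(path, rate=0.2):
    # Each module in the path is re-drawn with a small probability
    # (duplicates within a layer are allowed in this toy version).
    return [[random.randrange(MODULES) if random.random() < rate else m
             for m in layer] for layer in path]

population = [random_path() for _ in range(20)]
for _ in range(300):
    i, j = random.sample(range(len(population)), 2)  # binary tournament
    if fitness(population[i]) < fitness(population[j]):
        i, j = j, i                                  # i now indexes the winner
    population[j] = mutate(population[i])            # loser <- mutated winner copy

best = max(population, key=fitness)
```

After a few hundred tournaments the population's pathways drift toward the high-fitness modules, which is the "replication and mutation" loop the abstract describes.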
Abstract: With the popularization of the Internet, permeation of sensor networks, emergence of big data, increase in size of the information community, and interlinking and fusion of data and information throughout human society, physical space, and cyberspace, the information environment related to the current development of artificial intelligence (AI) has profoundly changed. AI faces important adjustments, and scientific foundations are confronted with new breakthroughs, as AI enters a new stage: AI 2.0. This paper briefly reviews the 60-year developmental history of AI, analyzes the external environment promoting the formation of AI 2.0 along with changes in goals, and describes both the beginning of the technology and the core idea behind AI 2.0 development. Furthermore, based on combined social demands and the information environment that exists in relation to Chinese development, suggestions on the development of AI 2.0 are given.
Pub.: 28 Feb '17, Pinned: 16 Jun '17
Abstract: In the past, several models of consciousness have become popular and have led to the development of models for machine consciousness with varying degrees of success and challenges for simulation and implementations. Moreover, affective computing attributes that involve emotions, behavior and personality have not been the focus of models of consciousness as they lacked motivation for deployment in software applications and robots. The affective attributes are important factors for the future of machine consciousness with the rise of technologies that can assist humans. Personality and affection hence can give an additional flavor for the computational model of consciousness in humanoid robotics. Recent advances in areas of machine learning with a focus on deep learning can further help in developing aspects of machine consciousness in areas that can better replicate human sensory perceptions such as speech recognition and vision. With such advancements, one encounters further challenges in developing models that can synchronize different aspects of affective computing. In this paper, we review some existing models of consciousness and present an affective computational model that would enable the human touch and feel for robotic systems.
Pub.: 02 Jan '17, Pinned: 17 Jun '17
Abstract: In this paper, we argue that the future of Artificial Intelligence research resides in two keywords: integration and embodiment. We support this claim by analyzing the recent advances of the field. Regarding integration, we note that the most impactful recent contributions have been made possible through the integration of recent Machine Learning methods (based in particular on Deep Learning and Recurrent Neural Networks) with more traditional ones (e.g. Monte-Carlo tree search, goal babbling exploration or addressable memory systems). Regarding embodiment, we note that the traditional benchmark tasks (e.g. visual classification or board games) are becoming obsolete as state-of-the-art learning algorithms approach or even surpass human performance in most of them, having recently encouraged the development of first-person 3D game platforms embedding realistic physics. Building upon this analysis, we first propose an embodied cognitive architecture integrating heterogeneous sub-fields of Artificial Intelligence into a unified framework. We demonstrate the utility of our approach by showing how major contributions of the field can be expressed within the proposed framework. We then claim that benchmarking environments need to reproduce ecologically-valid conditions for bootstrapping the acquisition of increasingly complex cognitive skills through the concept of a cognitive arms race between embodied agents.
Pub.: 05 Apr '17, Pinned: 16 Jun '17
Abstract: The rapid advancement of machine learning techniques has re-energized research into general artificial intelligence. While the idea of domain-agnostic meta-learning is appealing, this emerging field must come to terms with its relationship to human cognition and the statistics and structure of the tasks humans perform. The position of this article is that only by aligning our agents' abilities and environments with those of humans do we stand a chance at developing general artificial intelligence (GAI). A broad reading of the famous 'No Free Lunch' theorem is that there is no universally optimal inductive bias or, equivalently, bias-free learning is impossible. This follows from the fact that there are an infinite number of ways to extrapolate data, any of which might be the one used by the data generating environment; an inductive bias prefers some of these extrapolations to others, which lowers performance in environments using these adversarial extrapolations. We may posit that the optimal GAI is the one that maximally exploits the statistics of its environment to create its inductive bias; accepting the fact that this agent is guaranteed to be extremely sub-optimal for some alternative environments. This trade-off appears benign when thinking about the environment as being the physical universe, as performance on any fictive universe is obviously irrelevant. But, we should expect a sharper inductive bias if we further constrain our environment. Indeed, we implicitly do so by defining GAI in terms of accomplishing tasks that humans consider useful. One common version of this is the need for 'common-sense reasoning', which implicitly appeals to the statistics of the physical universe as perceived by humans.
Pub.: 13 Jan '17, Pinned: 17 Jun '17
Abstract: There is a growing focus on how to design safe artificial intelligent (AI) agents. As systems become more complex, poorly specified goals or control mechanisms may cause AI agents to engage in unwanted and harmful outcomes. Thus it is necessary to design AI agents that follow initial programming intentions as the program grows in complexity. How to specify these initial intentions has also been an obstacle to designing safe AI agents. Finally, there is a need for the AI agent to have redundant safety mechanisms to ensure that any programming errors do not cascade into major problems. Humans are autonomous intelligent agents that have avoided these problems and the present manuscript argues that by understanding human self-regulation and goal setting, we may be better able to design safe AI agents. Some general principles of human self-regulation are outlined and specific guidance for AI design is given.
Pub.: 05 Jan '17, Pinned: 17 Jun '17
Abstract: DeepMind Lab is a first-person 3D game platform designed for research and development of general artificial intelligence and machine learning systems. DeepMind Lab can be used to study how autonomous artificial agents may learn complex tasks in large, partially observed, and visually diverse worlds. DeepMind Lab has a simple and flexible API enabling creative task-designs and novel AI-designs to be explored and quickly iterated upon. It is powered by a fast and widely recognised game engine, and tailored for effective use by the research community.
Pub.: 12 Dec '16, Pinned: 17 Jun '17
Abstract: Metrics on the space of sets of trajectories are important for scientists in the field of computer vision, machine learning, robotics and general artificial intelligence. Yet existing notions of closeness are either mathematically inconsistent or of limited practical use. In this paper we outline the limitations in the existing mathematically-consistent metrics, which are based on Schuhmacher et al. 2008, and the inconsistencies in the heuristic notions of closeness used in practice, whose main ideas are common to the CLEAR MOT measures widely used in computer vision. In two steps we then propose a new intuitive metric between sets of trajectories and address these problems. First we explain a natural solution that leads to a metric that is hard to compute. Then we modify this formulation to obtain a metric that is easy to compute and keeps all the good properties of the previous metric. In particular, our notion of closeness is the first that has the following three properties: it can be quickly computed, it incorporates confusion of trajectories' identity in an optimal way and it is a metric in the mathematical sense.
Pub.: 12 Jan '16, Pinned: 16 Jun '17
Abstract: A reinforcement learning agent that autonomously explores its environment can utilize a curiosity drive to enable continual learning of skills, in the absence of any external rewards. We formulate curiosity-driven exploration, and eventual skill acquisition, as a selective sampling problem. Each environment setting provides the agent with a stream of instances. An instance is a sensory observation that, when queried, causes an outcome that the agent is trying to predict. After an instance is observed, a query condition, derived herein, tells whether its outcome is statistically known or unknown to the agent, based on the confidence interval of an online linear classifier. Upon encountering the first unknown instance, the agent "queries" the environment to observe the outcome, which is expected to improve its confidence in the corresponding predictor. If the environment is in a setting where all instances are known, the agent generates a plan of actions to reach a new setting, where an unknown instance is likely to be encountered. The desired setting is a self-generated goal, and the plan of action, essentially a program to solve a problem, is a skill. The success of the plan depends on the quality of the agent's predictors, which are improved as mentioned above. For validation, this method is applied to both a simulated and real Katana robot arm in its "blocks-world" environment. Results show that the proposed method generates sample-efficient curious exploration behavior, which exhibits developmental stages, continual learning, and skill acquisition, in an intrinsically-motivated playful agent.
Pub.: 11 Dec '13, Pinned: 17 Jun '17
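The abstract above derives its query condition from the confidence interval of an online linear classifier. As a rough illustration only, here is a simplified regression analogue: the agent queries an instance when the confidence width of its online least-squares predictor at that point exceeds a threshold (a LinUCB-style rule, not the authors' exact derivation). The environment model `true_w` and the threshold `tau` are invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3
A = np.eye(d)                  # regularised scatter matrix of queried inputs
b = np.zeros(d)                # accumulated x * outcome for queried inputs
tau = 0.3                      # query when the confidence width exceeds this

true_w = np.array([1.0, -2.0, 0.5])   # hidden outcome model (toy environment)
queries = 0

for _ in range(500):
    x = rng.normal(size=d)                        # a new sensory instance
    width = float(np.sqrt(x @ np.linalg.solve(A, x)))
    if width > tau:              # outcome statistically "unknown": query it
        y = float(true_w @ x)    # observe the outcome from the environment
        A += np.outer(x, x)      # update the predictor's confidence model
        b += y * x
        queries += 1

w = np.linalg.solve(A, b)        # final online least-squares predictor
```

Queries are dense at first and taper off as the predictor's confidence grows, mirroring the curiosity-driven behaviour the paper describes: the agent spends its effort only where outcomes are still unknown.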