Adaptive Template Model of Intelligence

Click to see this project on GitHub

1. Introduction

1.1) Problem Overview

Traditionally, template-matching algorithms have been used for tasks such as digital image processing and visual pattern recognition. These algorithms typically use sets of small, two-dimensional filters that are moved across a grayscale image in order to detect instances of low-level visual patterns.

Pattern recognition via mainstream template-matching techniques is restricted in that it is only useful for problems in vector space. Even problems of moderate complexity tend to involve patterns that are more conceptually abstract and less dependent on space-time. This means the template-matching techniques that are useful at low levels cannot scale with the complexity of higher-level pattern recognition.

1.2) System Objectives

In the following framework, I propose a feasible approach to extending template-matching methods to topological structures, like graphs and networks. The two main objectives are: first, to uncover a mechanism for complex pattern recognition that uses template-matching methods, as this may substantially reduce the computational resources required for high-level pattern recognition; and second, to provide a universal method of analogical processing that can be applied across all types of information.


2. Graphs

2.1) Graph Components

A graph is a collection of nodes connected by links, each component having a set of properties that can take one of a finite number of values at any given time. Every node in a graph has the same set of properties as every other node, and the same holds for links. Each graph is therefore equipped with two property sets: one for nodes and one for links. While every component of a given type shares an identical set of properties with the rest, the values of those properties can vary from node to node or link to link.
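To make the component structure concrete, here is a minimal sketch (not taken from the project code) of a graph whose nodes and links each carry a fixed property set; the property names `color` and `weight` are illustrative assumptions.

```python
# Minimal sketch of a graph with fixed property sets for nodes and links.
# The property names ("color", "weight") are illustrative, not from the project.

NODE_PROPERTIES = {"color": {"red", "green", "blue"}}
LINK_PROPERTIES = {"weight": {1, 2, 3}}

class Graph:
    def __init__(self):
        self.nodes = {}   # node_id -> {property: value}
        self.links = {}   # (node_a, node_b) -> {property: value}

    def add_node(self, node_id, **values):
        assert set(values) == set(NODE_PROPERTIES), "every node shares the same property set"
        self.nodes[node_id] = values

    def add_link(self, a, b, **values):
        assert set(values) == set(LINK_PROPERTIES), "every link shares the same property set"
        self.links[(a, b)] = values

    def neighbors(self, node_id):
        # Links are treated as undirected for simplicity.
        return {b for (a, b) in self.links if a == node_id} | \
               {a for (a, b) in self.links if b == node_id}
```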

2.2) Component Properties

Any two graphs that have identical property sets are said to be compatible. That is, they are similar enough to compare with one another. A comparison between graphs is a measurement of similarity between them. Similarity refers to the number of attributes shared by two graphs, relative to the number of attributes held by only one graph or the other.

Attributes refer not only to the properties of a graph's nodes and links, but also to the topology of the graph. This means that the connectivity of nodes and links can be regarded as a property itself, not specific to any one component but rather an attribute of the graph as a unified object.
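Read this way, the similarity measurement is a Jaccard-style ratio. The sketch below assumes each graph has already been reduced to a flat set of attribute descriptors (property values plus topological features such as node degrees); that flattening step is my own simplification.

```python
# Jaccard-style similarity between two attribute sets: shared attributes
# divided by all distinct attributes. How attributes are extracted from a
# graph is left abstract here.

def similarity(attrs_a, attrs_b):
    shared = attrs_a & attrs_b
    total = attrs_a | attrs_b
    return len(shared) / len(total) if total else 1.0

# Example: two of the four distinct attributes are shared.
a = {"node0:color=red", "node1:color=blue", "degree(node0)=2"}
b = {"node0:color=red", "node1:color=green", "degree(node0)=2"}
print(similarity(a, b))  # 0.5
```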


3. Templates

3.1) Template Graphs

A template graph is a graph that compares itself to another graph, called the input graph, by calculating a measurement called the similarity value. The input and template graphs are each equipped with an origin node, which is simply a node that acts as a universal reference point for the rest of the graph. The origin nodes are aligned and a comparison is made between them, resulting in said similarity value. This value measures the correspondence between a given input graph and the template graph, and thus presents information about a graph through a comparative process, rather than simply conveying information about the graph in isolation.
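As a rough illustration, the sketch below aligns the two origin nodes and scores the match using only the origin's properties and its neighbor count, reusing the `Graph` sketch from earlier; a full comparison would cover every shared attribute.

```python
# Sketch of a template comparison: align the two origin nodes and score how
# well the input's local structure matches the template. Only the origin's
# properties and its number of neighbors are compared, a deliberate
# simplification of the general attribute comparison.

def compare_at_origin(template, t_origin, graph, g_origin):
    matches, checks = 0, 0

    # property match at the aligned origins
    checks += 1
    if template.nodes[t_origin] == graph.nodes[g_origin]:
        matches += 1

    # topological match: same neighbor count
    checks += 1
    if len(template.neighbors(t_origin)) == len(graph.neighbors(g_origin)):
        matches += 1

    return matches / checks  # similarity value in [0, 1]
```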

3.2) Sliding Templates

A sliding template is an agent that traverses a larger graph and compares its assigned template graph to the local subgraph around each node along the walk. The output of a sliding template is a graph in which each node is assigned the similarity value computed against the local subgraph around the corresponding input node.

A sliding template can be located at any point along its walk using its origin node. It traverses a graph by moving its origin from the current, or center, node on the input graph to align with one of that node's connected neighbors. This means that at each step along its walk, a sliding template selects a single node from the local subgraph, which then becomes the new center node (and the previous center node becomes part of the new local subgraph).
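The walk itself might be sketched as follows, building on `compare_at_origin` above; neighbor selection is random here, standing in for the agent policies introduced later.

```python
# Sketch of a sliding-template walk: at each step the template is compared to
# the local subgraph around the current center node, the result is recorded,
# and the origin moves to one of the center's neighbors.

import random

def slide(template, t_origin, graph, start, steps):
    output = {}                      # node_id -> similarity value
    center = start
    for _ in range(steps):
        output[center] = compare_at_origin(template, t_origin, graph, center)
        neighbors = list(graph.neighbors(center))
        if not neighbors:
            break
        center = random.choice(neighbors)  # previous center joins the new local subgraph
    return output
```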


4. Comparisons

4.1) Comparison Operators

To better explain the abstract notion of taking two graphs and making some comparison between them, we'll go through some simple examples (it might also be comforting to know that we're barely stepping outside the realm of basic set theory). The process of "comparison" as we've discussed requires little more than the elementary binary operations you are probably familiar with. The following shows three examples of how binary operations are used to compare graphs whose nodes exhibit a color property.
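The original post presents these examples as figures that are not reproduced here; as a stand-in, the sketch below shows the kind of element-wise binary comparison they depict, assuming the pairing between corresponding nodes is already given.

```python
# Sketch of a binary comparison over node colors: each pair of corresponding
# nodes is tested for equality, and the per-node results are combined into a
# single score. The node pairing is assumed to be given (e.g. by aligning
# origins and walking outward).

def compare_colors(colors_a, colors_b):
    results = [a == b for a, b in zip(colors_a, colors_b)]  # element-wise equality
    return sum(results) / len(results)                      # fraction of matches

print(compare_colors(["red", "blue", "green"], ["red", "blue", "blue"]))  # ~0.67
```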

4.2) Applied Comparisons

Now, we’ll take this idea a step further and use a sliding template on a large graph to show how topological properties are detected during a traversal. We will use a simple graph template that consists of an origin node along with three neighbors.

The left image shows the input graph, and the right a color-coded version of the output graph. Each node is shaded according to the similarity calculated at that point on the input graph. In the upper right-hand corner is a key that relates each color to the associated level of similarity. As you can see, nodes with exactly three neighbors received strong similarity values, while those with more or fewer than three received weaker values.
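The images themselves are not reproduced here, but the calculation they depict can be sketched as follows; the exact scoring rule (penalizing the difference from three neighbors) is my own illustrative choice rather than the post's.

```python
# Sketch of the calculation shown in the figures: a template with an origin and
# three neighbors is slid over the input graph, and each node is scored by how
# closely its neighborhood matches the template's.

def degree_similarity(graph, target_degree=3):
    output = {}
    for node in graph.nodes:
        degree = len(graph.neighbors(node))
        output[node] = 1.0 / (1.0 + abs(degree - target_degree))
    return output  # nodes with exactly three neighbors score 1.0
```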


5. Agents

5.1) Agent Fitness

Agents are used to control the movement of sliding templates. During a walk, an agent observes the local environment, or subgraph, as well as a performance rating, or similarity measurement, and selects an action, or neighbor, which becomes its position at the next step. When the total value of the similarity measurements along a walk is high, the agent is said to have performed well. When the total is low, it is said to have performed poorly.

Each agent has a finite memory space that it uses to store and retrieve information about the environment throughout a walk. Agents "earn" memory space by performing well, which means they can keep track of more information to inform decisions about their movement during a walk. Agents that perform consistently well are therefore rewarded with computational resources that help sustain those successful behaviors, while lower-ranking agents are deprived of those resources, which decreases their chance of survival.
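A minimal sketch of this bookkeeping, assuming a single reward threshold and unit-sized memory rewards; the constants are illustrative, not taken from the project.

```python
# Sketch of the fitness/memory bookkeeping: each step's similarity measurement
# is added to the agent's score, and memory space is granted or withdrawn
# depending on whether the measurement clears a threshold.

class Agent:
    def __init__(self, template, t_origin, memory=10):
        self.template = template
        self.t_origin = t_origin
        self.memory = memory      # computational resources earned by performing well
        self.fitness = 0.0

    def observe(self, similarity_value, reward_threshold=0.5):
        self.fitness += similarity_value
        # good matches earn memory, poor matches cost it
        self.memory += 1 if similarity_value >= reward_threshold else -1

    def choose_neighbor(self, graph, center):
        # placeholder policy; the elementary agent's greedy rule appears later
        return next(iter(graph.neighbors(center)), None)
```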

5.2) Agent Evolution

Once an agent's resources fall below a certain amount, it "goes extinct" and is removed from the population. On the other hand, if its resources exceed a certain amount, the agent "reproduces" and creates a duplicate agent, in which a small chance of mutation can lead to differences in the properties of its associated graph template.

Since all computational systems have constraints on resources, an evolving population operates like a Darwinian zero-sum game, where the winners are the agents that can locate the greatest number of matching subgraphs during a walk.
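The extinction and reproduction rules might be sketched like this, with the resource floor, ceiling, mutation rate, and the `mutate_template` helper all treated as hypothetical placeholders.

```python
# Sketch of the extinction/reproduction rule: agents below a resource floor are
# removed, agents above a ceiling spawn a mutated copy. Thresholds and the
# mutation operator are illustrative assumptions.

import copy
import random

def mutate_template(template):
    # hypothetical mutation: flip one node's color to a random allowed value
    if template.nodes:
        node = random.choice(list(template.nodes))
        template.nodes[node]["color"] = random.choice(list(NODE_PROPERTIES["color"]))

def evolve(population, floor=0, ceiling=20):
    survivors = []
    for agent in population:
        if agent.memory <= floor:
            continue                      # extinction: drop the agent
        survivors.append(agent)
        if agent.memory >= ceiling:
            child = copy.deepcopy(agent)
            child.memory = agent.memory // 2
            agent.memory -= child.memory  # parent and child split the resources
            if random.random() < 0.1:     # small chance of mutation
                mutate_template(child.template)
            survivors.append(child)
    return survivors
```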


6. Adaptations

6.1) Reproductive Adaptations

An elementary agent, the simplest type of agent, selects actions by measuring the similarity value of each neighbor and simply moving toward the highest. No additional memory space is required; however, an elementary agent will still die if its memory space falls below a certain point.

The run-time of an elementary agent is brief unless it accumulates enough space to duplicate itself. This triggers the evolutionary process, which can potentially lead to brand new agents being formed via the mutations that occur in each new generation.
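A sketch of the elementary agent's greedy policy, reusing `compare_at_origin` from above.

```python
# Sketch of the elementary agent's rule: measure the similarity value at each
# neighbor and move to whichever scores highest. No memory of past steps is kept.

def elementary_step(agent, graph, center):
    neighbors = list(graph.neighbors(center))
    if not neighbors:
        return center
    return max(
        neighbors,
        key=lambda n: compare_at_origin(agent.template, agent.t_origin, graph, n),
    )
```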

6.2) Generative Adaptations

When an elementary agent dies, its graph is replaced with the local subgraph around its last position. Agents rapidly pop in and out of existence, trying out new graphs until one takes hold. Such graphs are called stable because they can sustain themselves for a relatively long period of time.
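A sketch of that replacement step, where `extract_subgraph` is a hypothetical helper that copies the last center node and its immediate neighborhood into a fresh template.

```python
# Sketch of the generative step: when an elementary agent dies, a replacement is
# seeded with the local subgraph around the agent's last position as its template.

def extract_subgraph(graph, center):
    # hypothetical helper: copy the center node and its immediate neighbors
    sub = Graph()
    sub.add_node(center, **graph.nodes[center])
    for n in graph.neighbors(center):
        sub.add_node(n, **graph.nodes[n])
        link = graph.links.get((center, n)) or graph.links.get((n, center))
        sub.add_link(center, n, **link)
    return sub

def respawn(dead_agent, graph, last_position):
    # the dead agent's graph is replaced by the local subgraph at its last position
    return Agent(extract_subgraph(graph, last_position), t_origin=last_position)
```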


3 thoughts on "Adaptive Template Model of Intelligence"

  1. I love the intersection between this and “Linguistic Development and Constructed meaning.” I’ve been thinking along the same lines for some time now. Using a big enough body of literature, this could be semi-supervised to unsupervised. Almost an ML take on Spinoza’s method, in a way. Best of luck; if I think of anything, I’ll let you know.

    I keep returning to Being and Nothingness with their synthesis of becoming, and the idea of growing as a possible starting place. How to build from these starting blocks to more complex ideas using the techniques of ML is where I get lost in the weeds, so to speak. Best of luck, friend.


  2. By the way, disclaimer: I am a complete hack.

    I get that you are proposing using template graphs and evolutionary algorithms. If the graphs are for idea representation, how do you propose to “get” the graphs?

    Hierarchical attention (https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf) seems appropriate to natural text interpretation. Sentences are a form of avoiding splitting thoughts, or at least of splitting them coherently, so it’s a form of grammatical supervision, and it’s free.

    You might be able to create a neural network that creates graphs based on language parsing, but one thing I keep returning to when thinking of any of this is that labelling is a truly substantial pinch-point to even being able to assess the viability of any method. With the variety and richness of language and of human thought, creating enough high quality labels to move to semi-supervised learning might be the biggest hurdle.

    Once again, I am a complete hack; if you are a postdoc or whatever, please go easy on me!


    1. Disclaimer: It’s extremely late and I didn’t plan on writing this much, so bear with me.

      Hey, thanks a lot for the comment. You bring up some good points. I want to clarify that my use of the word ‘analogy’ refers to a more general notion of mapping from a source domain to a target domain, which reveals some partial correspondence between the two. Analogies differ from normal patterns in that you’re selecting only the elements of the source domain that you wish to discover in the target. It’s essentially a form of abstraction or generalization.

      So I’m not necessarily discussing the use of natural language processing, although that is definitely one application. Instead I’m really focused on the analogical processing of any kind of information. For example, you could think of circles and spheres as a type of analogy that maps between 2D and 3D spatial domains. The map would include geometrical data like having an infinite number of corners and being the same distance from the origin at any given point on the surface.

      Another thing that isn’t explicitly stated in my post is that the use of agents as described is mostly metaphorical, in that it’s a way for me to discuss the system at a conceptual level without acknowledging any constraints that a real implementation might have.

      Also, the phrase ‘template-matching’ refers to a simplification of the analogy-mapping process. Realistically there would be a lot more going on than just measuring the similarities between different parts of a graph.

      Finally, the graph itself is a simplification of a neural network, as you accurately pointed out. I’ve already begun coding this part of the system as an unsupervised neural network that uses competitive inhibition between neurons in a layer to organize itself, such that each node responds to a specific input pattern with minimal overlap. The output of each layer can be either the set of outputs from each neuron, or it can use a winner-takes-all approach and just output the most-responsive neuron for a given input pattern. Learning is just about there but not quite: sometimes a node will respond to multiple patterns, and other times it won’t respond at all. As far as I can tell the individual adaptations of each neuron are sound enough, but the competitive inhibition algorithm causes them to either get stuck and not respond to anything, or blow up and respond to everything. So that’s where I’m at right now.
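      For readers following along, a toy sketch of a competitive, winner-takes-all layer in this spirit might look like the following; it is not the project's actual implementation, and the learning rate and update rule are generic competitive-learning choices.

```python
# Toy sketch of a competitive layer with winner-takes-all output, in the spirit
# of the unsupervised scheme described above (not the project's implementation).
# Each unit's weights move toward inputs it wins, so units gradually specialize
# on distinct input patterns.

import numpy as np

class CompetitiveLayer:
    def __init__(self, n_units, n_inputs, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.random((n_units, n_inputs))
        self.lr = lr

    def respond(self, x):
        return self.weights @ x          # one response per unit

    def learn(self, x):
        winner = int(np.argmax(self.respond(x)))
        # only the winning unit adapts, pulling its weights toward the input
        self.weights[winner] += self.lr * (x - self.weights[winner])
        return winner                    # winner-takes-all output
```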

      Beyond that, I plan to build a component on top of each layer that deals with lateral connections based on temporal processing, similar to what’s used by Hierarchical Temporal Memory to make predictions of incoming spatial patterns. The input to the network may also require some sort of spatial relevance, where the nodes have a local ‘view’ of inputs instead of taking the entire input set, and the view of each node overlaps in some capacity with its neighbors. Spatial data would then be preserved in the output of a layer, enabling the next layer to also process spatial information.

      Anyway, the spatiotemporal differences of a pattern could be abstracted into a model, or small network, of nodes that represent lower-level ‘events’ or patterns of activity, and links that indicate the differences in space and/or time that separate them from one another. The result is a ‘graph template’ (to answer your question about where the graphs come from), which is used in analogy-mapping as a ‘pattern blueprint’ with which to make comparisons and detect when a given pattern has a set of qualities that ‘match’ its own.

      So that’s about all I have to say about it at this point. You can find the project here:
      https://github.com/CarsonScott/Analogy-Detection-Agents

      You can also send an email if you have any other ideas or questions about the system that you’d like to discuss. Thanks Ryan!

