Form a prediction based on the input in the context of previous inputs
The final step for our region is to make a prediction of what is likely to happen next. The prediction is based on the representation formed in step 2), which includes context from all previous inputs. When a region makes a prediction, it activates (into the predictive state) all the cells that will likely become active due to future feed-forward input. Because representations in a region are sparse, multiple predictions can be made at the same time. For example, if 2% of the columns are active due to an input, you could expect that ten different predictions could be made, resulting in 20% of the columns having a predicted cell. Or twenty different predictions could be made, resulting in 40% of the columns having a predicted cell. In that second case, if each column had four cells, with one active at a time, then 10% of the cells would be in the predictive state.
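The percentages above can be checked with a few lines of arithmetic. This is only a sketch: the function name and parameters are hypothetical, and it assumes the simultaneous predictions do not share columns, so the result is an upper bound on the fraction of columns with a predicted cell.

```python
def predicted_fractions(active_column_fraction, num_predictions, cells_per_column):
    """Estimate how much of a region is in the predictive state.

    Assumes (for illustration only) that the predictions use disjoint
    columns and that each predicted column holds exactly one predicted cell.
    """
    # Each prediction marks as many columns as one sparse input would activate.
    column_fraction = min(1.0, active_column_fraction * num_predictions)
    # Only one cell per predicted column is predictive.
    cell_fraction = column_fraction / cells_per_column
    return column_fraction, cell_fraction


# The two cases from the text: 2% sparsity with ten and twenty predictions.
for n in (10, 20):
    columns, cells = predicted_fractions(0.02, n, cells_per_column=4)
    print(f"{n} predictions: {columns:.0%} of columns, {cells:.0%} of cells")
```

If predictions overlap in the columns they use, the real fractions would be lower than these estimates.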
A future chapter on sparse distributed representations will show that even though different predictions are merged together, a region can know with high certainty whether a particular input was predicted or not.
How does a region make a prediction? When input patterns change over time, different sets of columns and cells become active in sequence. When a cell becomes active, it forms connections to a subset of the cells nearby that were active immediately prior. These connections can be formed quickly or slowly depending on the learning rate required by the application. Later, all a cell needs to do is to look at these connections for coincident activity. If the connections become active, the cell can expect that it might become active shortly and enters a predictive state. Thus the feed-forward activation of a set of cells will lead to the predictive activation of other sets of cells that typically follow. Think of this as the moment when you recognize a song and start predicting the next notes.
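The mechanism just described can be sketched in a few lines. This is a deliberately simplified illustration, not the region's actual learning algorithm: each cell keeps a single set of lateral connections, and the names `learn`, `predictive_cells`, `sample_size`, and `threshold` are assumptions introduced here.

```python
import random


def learn(connections, previously_active, now_active, sample_size, rng):
    """Connect each newly active cell to a subset of the cells active just before."""
    for cell in now_active:
        candidates = sorted(previously_active - {cell})
        chosen = rng.sample(candidates, min(sample_size, len(candidates)))
        connections[cell].update(chosen)


def predictive_cells(connections, active, threshold):
    """Return cells that see enough coincident activity on their connections."""
    return {
        cell
        for cell, sources in connections.items()
        if len(sources & active) >= threshold
    }


rng = random.Random(0)
connections = {cell: set() for cell in range(9)}  # nine cells, no connections yet

pattern_a, pattern_b = {0, 1, 2}, {3, 4, 5}
learn(connections, pattern_a, pattern_b, sample_size=3, rng=rng)  # B followed A

# Seeing A again now puts the cells of B into the predictive state.
print(predictive_cells(connections, pattern_a, threshold=2))
```

A larger `sample_size` or a lower `threshold` makes predictions more tolerant of noise, at the cost of more false matches. Both values here are arbitrary choices for the example.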
Figure 2.3: At any point in time, some cells in an HTM region will be active due to feed-forward input (shown in light gray). Other cells that receive lateral input from active cells will be in a predictive state (shown in dark gray).
In summary, when a new input arrives, it leads to a sparse set of active columns. One or more of the cells in each column become active, and these in turn cause other cells to enter a predictive state through learned connections between cells in the region. The cells activated by connections within the region constitute a prediction of what is likely to happen next. When the next feed-forward input arrives, it selects another sparse set of active columns. If a newly active column is unexpected, meaning it was not predicted by any cells, it will activate all the cells in the column. If a newly active column has one or more predicted cells, only those cells will become active. The output of a region is the activity of all cells in the region, including the cells active because of feed-forward input and the cells active in the predictive state.
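The rule for choosing which cells become active in a column can be written compactly. The sketch below is illustrative only: cells are represented as hypothetical `(column, index)` pairs, and `activate_cells` is a name chosen for this example.

```python
def activate_cells(active_columns, predicted, cells_per_column):
    """Pick the active cells for each column selected by feed-forward input.

    `predicted` is the set of (column, index) cells that were in the
    predictive state before this input arrived.
    """
    active = set()
    for column in active_columns:
        all_cells = {(column, i) for i in range(cells_per_column)}
        expected = all_cells & predicted
        # A predicted column activates only its predicted cells;
        # an unexpected column activates every cell it contains.
        active |= expected if expected else all_cells
    return active


predicted = {(0, 2), (3, 1)}          # cell 2 of column 0 and cell 1 of column 3
active = activate_cells({0, 1}, predicted, cells_per_column=4)
region_output = active | predicted    # output includes active and predictive cells
print(sorted(active))
```

Column 0 was predicted, so only its predicted cell becomes active. Column 1 was not, so all four of its cells do.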
As mentioned earlier, predictions are not just for the next time step. Predictions in an HTM region can be for several time steps into the future. Using a melody as an example, an HTM region would not just predict the next note, but might predict the next four notes. This leads to a desirable property: the output of a region (the union of all the active and predicted cells in a region) changes more slowly than the input. Imagine the region is predicting the next four notes in a melody. We will represent the melody by the letter sequence A,B,C,D,E,F,G. After hearing the first two notes, the region recognizes the sequence and starts predicting. It predicts C,D,E,F. The “B” cells are already active, so cells for B,C,D,E,F are all in one of the two active states. Now the region hears the next note, “C”. The set of active and predictive cells now represents “C,D,E,F,G”. Note that the input pattern changed completely going from “B” to “C”, but only 20% of the cells changed.
Because the output of an HTM region is a vector representing the activity of all the region’s cells, the output in this example is five times more stable than the input. In a hierarchical arrangement of regions, temporal stability increases as you ascend the hierarchy.
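The melody example can be reproduced with a short sketch. It makes two assumptions that are not in the text: each note stands in for the whole group of cells that represents it, and "change" is measured as the share of the previous set that is absent from the new one. The function names are hypothetical.

```python
MELODY = ["A", "B", "C", "D", "E", "F", "G"]


def region_output(position, lookahead=4):
    """Notes represented in the output: the current note plus the predicted ones."""
    return set(MELODY[position:position + 1 + lookahead])


def fraction_changed(previous, current):
    """Share of the previous set that is no longer present in the current set."""
    return len(previous - current) / len(previous)


after_b = region_output(MELODY.index("B"))   # B plus predicted C, D, E, F
after_c = region_output(MELODY.index("C"))   # C plus predicted D, E, F, G

input_change = fraction_changed({"B"}, {"C"})
output_change = fraction_changed(after_b, after_c)
print(f"input changed {input_change:.0%}, output changed {output_change:.0%}")
```

Under this measure the input turns over completely while one fifth of the output does, which is the sense in which the output is five times more stable. Other measures of change would give different ratios.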
We use the term “temporal pooler” to describe the two steps of adding context to the representation and predicting. By creating slowly changing outputs for sequences of patterns, we are in essence “pooling” together different patterns that follow each other in time.
Now we will go into another level of detail. We start with concepts that are shared by the spatial pooler and temporal pooler. Then we discuss concepts and details unique to the spatial pooler followed by concepts and details unique to the temporal pooler.