Static
kStatic
kThe cursor position in the grid.
The length of the readings in the grid.
The readings in the grid.
The separator for readings.
Returns all candidate values at the location. If spans are not empty and loc is at the end of the spans, (loc - 1) is used, so that the caller does not have to care about this boundary condition.
The location to get candidates from.
A list of candidates.
Clears the grid.
Delete the reading after the cursor, like Del. Cursor is unmoved.
Always returns false.
Delete the reading before the cursor, like Backspace. Cursor will decrement by one.
Always returns false.
Inserts a reading at the current cursor position.
The reading to insert.
True if the reading was inserted, false otherwise.
Find all nodes that overlap with the location. The return value is a list of nodes along with their starting location in the grid.
The location to find overlapping nodes.
A list of nodes that overlap with the location.
Adds weight to the node with the unigram that has the designated candidate value and applies the desired override type, essentially resulting in user override. An overridden node would influence the grid walk to favor walking through it.
The location of the candidate.
The candidate to override.
The type of override.
True if the override was successful, false otherwise.
Same as the method above, but since the string candidate value is used, if there are multiple nodes (of different spanning length) that have the same unigram value, it's not guaranteed which node will be selected.
The location of the candidate.
The candidate to override.
The type of override.
True if the override was successful, false otherwise.
Find the weightiest path in the grid graph. The path represents the most likely hidden chain of events from the observations. We use the DAG-SHORTEST-PATHS algorithm in Cormen et al. 2001 to compute such path. Instead of computing the path with the shortest distance, though, we compute the path with the longest distance (so the weightiest), since with log probability a larger value means a larger probability. The algorithm runs in O(|V| + |E|) time for G = (V, E) where G is a DAG. This means the walk is fairly economical even when the grid is large.
The result of the walk.
A grid for deriving the most likely hidden values from a series of observations. For our purpose, the observations are Bopomofo readings, and the hidden values are the actual Mandarin words. This can also be used for segmentation: in that case, the observations are Mandarin words, and the hidden values are the most likely groupings.
While we use the terminology from hidden Markov model (HMM), the actual implementation is a much simpler Bayesian inference, since the underlying language model consists of only unigrams. Once we have put all plausible unigrams as nodes on the grid, a simple DAG shortest-path walk will give us the maximum likelihood estimation (MLE) for the hidden values.