First, we develop a novel feature design that borrows from commonly used techniques for feature extraction in speech recognition and music processing. These techniques are geared towards the human ear, which is limited to approximately 20 kHz and whose sensitivity is logarithmic in frequency; for printers, our experiments show that most of the interesting features occur above 20 kHz, and a logarithmic scale cannot be assumed.
Our feature design reflects these observations by employing a sub-band decomposition that places emphasis on the high frequencies and by spreading the filter frequencies linearly over the frequency range. We further add suitable smoothing to make the recognition robust against measurement variations and environmental noise.
Second, we deal with the decay time and the induced blurring by resorting to a word-based approach instead of decoding individual letters. A word-based approach requires additional upfront effort, such as an extended training phase as the dictionary grows larger, and it does not permit us to increase recognition rates by using, e.g., spell-checking. Recognition of words based on training the sound of individual letters (or pairs/triples of letters), however, is infeasible because the sound emitted by printers blurs so strongly over adjacent letters.
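To make the feature design described above more concrete, the following is a minimal sketch of a sub-band decomposition with linearly spaced filters followed by smoothing over time. It is not the implementation used in this work: the function names, the triangular filter shape, the 96 kHz sampling rate, the frame sizes, the number of bands and the moving-average smoother are all assumptions chosen for illustration.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def linear_filterbank(n_fft, sample_rate, n_bands, f_min, f_max):
    """Triangular filters whose centre frequencies are spaced linearly in Hz
    (in contrast to the mel or logarithmic spacing used for speech)."""
    step = (f_max - f_min) / (n_bands + 1)
    bank = []
    for b in range(n_bands):
        lo, hi = f_min + b * step, f_min + (b + 2) * step
        weights = []
        for k in range(n_fft // 2 + 1):
            f = k * sample_rate / n_fft
            weights.append(max(0.0, min((f - lo) / step, (hi - f) / step)))
        bank.append(weights)
    return bank


def subband_features(signal, sample_rate, n_fft=256, hop=128, n_bands=8,
                     f_min=20_000.0, f_max=48_000.0, smooth=3):
    """Return one smoothed vector of band energies per frame.

    All default values are illustrative assumptions, not values from the text.
    """
    bank = linear_filterbank(n_fft, sample_rate, n_bands, f_min, f_max)
    starts = range(0, len(signal) - n_fft + 1, hop)
    spectra = [power_spectrum(signal[s:s + n_fft]) for s in starts]
    energies = [[sum(w * p for w, p in zip(weights, spec)) for weights in bank]
                for spec in spectra]
    # Moving average over neighbouring frames, separately for each band.
    half = smooth // 2
    smoothed = []
    for t in range(len(energies)):
        lo, hi = max(0, t - half), min(len(energies), t + half + 1)
        smoothed.append([sum(energies[u][b] for u in range(lo, hi)) / (hi - lo)
                         for b in range(n_bands)])
    return smoothed
```

The naive DFT keeps the sketch dependency-free; for real recordings an FFT routine would be the natural replacement. The only point the sketch is meant to carry is that the filter edges are equidistant in Hz and start at 20 kHz, so the bands cover the range where the interesting printer features were observed.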
Third, we employ speech recognition techniques to increase the recognition rate: we use Hidden Markov Models (HMMs) that rely on the statistical frequency of sequences of words in text in order to rule out incorrect word combinations. Because of the strong blurring, however, the model is only effective with at least 3-grams over the words of the dictionary, which causes existing implementations for this task to fail because of memory exhaustion. To tame memory consumption, we implemented a delayed computation of the transition matrix that underlies HMMs, and in each step of the search procedure, we adaptively removed the words with only weakly matching features from the search space.
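The two memory-saving ideas above, computing transition entries only on demand and pruning weakly matching words at every step, can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the function `decode`, the add-one smoothing, the fixed `keep` cut-off (standing in for the adaptive criterion, which is not specified here) and the toy counts in the usage example are all assumptions.

```python
import math
from functools import lru_cache


def decode(emissions, trigram_counts, bigram_counts, vocab_size, keep=3):
    """Viterbi-style search over word trigrams.

    emissions: one dict per word position, mapping each candidate word to the
               log-score of how well its trained features match the recording.
    Returns the highest-scoring word sequence.
    """

    @lru_cache(maxsize=None)
    def log_trans(u, v, w):
        # Delayed computation: an entry of the (conceptually |V|^3) transition
        # matrix is only evaluated, with add-one smoothing, when the search
        # actually asks for it.
        num = trigram_counts.get((u, v, w), 0) + 1
        den = bigram_counts.get((u, v), 0) + vocab_size
        return math.log(num / den)

    start = "<s>"
    # Each HMM state is the pair (previous word, current word).
    beam = {(start, start): (0.0, [])}
    for scores in emissions:
        # Pruning: drop words whose features match only weakly.
        candidates = sorted(scores, key=scores.get, reverse=True)[:keep]
        nxt = {}
        for (u, v), (score, path) in beam.items():
            for w in candidates:
                s = score + log_trans(u, v, w) + scores[w]
                if (v, w) not in nxt or s > nxt[(v, w)][0]:
                    nxt[(v, w)] = (s, path + [w])
        beam = nxt
    return max(beam.values(), key=lambda entry: entry[0])[1]


# Toy usage with made-up counts and scores: the last word sounds slightly more
# like "paints", but the trigram statistics favour "prints".
trigrams = {("<s>", "<s>", "the"): 5, ("<s>", "the", "printer"): 5,
            ("the", "printer", "prints"): 5}
bigrams = {("<s>", "<s>"): 5, ("<s>", "the"): 5, ("the", "printer"): 5}
observed = [{"the": -0.1, "paints": -3.0},
            {"printer": -0.2, "prints": -1.5},
            {"paints": -0.6, "prints": -0.9}]
result = decode(observed, trigrams, bigrams, vocab_size=4, keep=2)
```

Because only the `keep` best-matching words survive at each position, the number of states per step is bounded by `keep` squared instead of the squared dictionary size, and the cache holds only the transitions that were actually visited.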
We built a prototypical implementation that can bootstrap the recognition routine from a database of featured words that have been trained using supervised learning. Afterwards, the prototype automatically recognizes text with recognition rates of up to 72 %.