Abstract
Automatic handwriting recognition has interested researchers for decades. Although the technique is already used in various everyday applications, a number of problems still have to be solved. At present, one of the biggest problems is low user acceptance: the systems are not accurate enough, and the mistakes that recognizers make are often hard for humans to understand, which can frustrate users.
This thesis is about the use of the Dynamic Time Warping (DTW) algorithm for handwriting recognition. We claim that the results of a DTW-based recognizer are more "intuitive" to humans than the results of other recognizers. Because humans can understand the errors the system makes, this will probably improve the user acceptance of handwriting recognition systems. Furthermore, once users know what they must do for the system to understand them, the recognizer is expected to achieve better recognition performance. The way the system works also opens up new applications for automatic handwriting recognizers. A more human-like system could, for example, be of use in forensic document analysis or in teaching handwriting to young children.
Because the algorithm compares characters differently from other classifiers, we think it can be a useful addition to a Multiple Classifier System (MCS): the system has a different "view" of the data and is therefore orthogonal to the other classifiers, so it can give valuable advice about a classification in an MCS.
To test the performance of DTW, our claim about its similarity to human handwriting recognition, and its suitability for a Multiple Classifier System, we implemented the algorithm in a prototype-based classifier. Two different sets of prototypes were created especially for this classifier.
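As an illustration of what a prototype-based DTW classifier does, the following is a minimal sketch and not the implementation described in this thesis. It assumes a character is a sequence of (x, y) pen coordinates, uses the textbook DTW recurrence with Euclidean point distance, and assigns the label of the nearest prototype. The names `dtw_distance` and `classify` and the toy prototypes are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two non-empty sequences of (x, y) points.

    Assumption: plain Euclidean point distance and no extra matching
    constraints; the classifier in the thesis may differ.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify(sample, prototypes):
    """Return (label, distance) of the prototype closest to the sample.

    prototypes is a list of (label, points) pairs (hypothetical format).
    """
    best_label, best_dist = None, float("inf")
    for label, points in prototypes:
        d = dtw_distance(sample, points)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


if __name__ == "__main__":
    # Two toy prototypes: a vertical and a horizontal stroke.
    prototypes = [
        ("l", [(0, 0), (0, 1), (0, 2)]),
        ("-", [(0, 0), (1, 0), (2, 0)]),
    ]
    sample = [(0, 0), (0.1, 1), (0, 2)]
    print(classify(sample, prototypes))
```

Because DTW lets one point match several points of the other sequence, two samples of the same shape written at different speeds can still align at low cost.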
We conducted an experiment to test the recognition performance of our classifier with each prototype set. The results show that our classifier correctly classifies between 88.20 and 95.26 percent of the offered samples with one prototype set and between 90.32 and 97.16 percent with the other. A second experiment was conducted to check the "intuitiveness" claim: 25 subjects judged the results of our classifier and the results of another classifier, and the results show that our classifier performed significantly better than the other system.
Finally, the classifier was tested in a real MCS, which was used to clean up a large database of handwritten characters (the UNIPEN devset) that is known to contain a relatively large number of errors. Using the MCS, we were able to clean up a very large part of the database automatically: only a minimum of human intervention was needed. The concept of rejection, which allows the classifier to reject a classification if it is not certain enough about it, was implemented for this application.
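One common way to realize rejection in a distance-based classifier is sketched below. This is an assumption for illustration, not the rejection rule used in this thesis: the classification is rejected when the best match is too far away, or when the nearest prototype of a different class is almost as close. The function name and both thresholds (`max_distance`, `min_margin`) are hypothetical.

```python
def classify_with_rejection(scored, max_distance, min_margin):
    """Return the best label, or None if the classification is rejected.

    scored: list of (label, distance) pairs, one per prototype.
    max_distance, min_margin: hypothetical thresholds that would have
    to be tuned on held-out data.
    """
    if not scored:
        return None
    ranked = sorted(scored, key=lambda pair: pair[1])
    best_label, best_dist = ranked[0]

    # Reject when even the best prototype is a poor match.
    if best_dist > max_distance:
        return None

    # Reject when the closest prototype of another class is nearly as good.
    for label, dist in ranked[1:]:
        if label != best_label:
            if dist - best_dist < min_margin:
                return None
            break

    return best_label


if __name__ == "__main__":
    confident = [("a", 1.0), ("a", 1.1), ("o", 3.0)]
    ambiguous = [("a", 1.0), ("o", 1.2)]
    print(classify_with_rejection(confident, max_distance=2.0, min_margin=0.5))
    print(classify_with_rejection(ambiguous, max_distance=2.0, min_margin=0.5))
```

In a data-cleaning setting, rejected samples would be the ones passed on to a human for inspection.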
The performance of our DTW classifier in the experiments and in the tested application is very promising. In the future, the classifier could be applied to alphabets other than Latin or to the recognition of words. It could also be used for applications other than classification or data cleaning.