fmeyenhofer/pattern-recognition-2016

 
 

pattern-recognition-2016

Configuration

External libraries

See here for Windows binaries.

Info: Since the multilayer_perceptron classes from scikit-learn are not yet included in the latest release (0.17.1), they have been copied into the folder mlp. As soon as they are released, the code in that folder can be removed.

Classifier Results

MLP

Figures: MLP training neurons; MLP training algorithms

SVM

Scores for different kernels and confusion matrices

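| Kernel | Training score | Training cross validation | Test score | Test cross validation |
|--------|----------------|---------------------------|------------|-----------------------|
| Linear | 1              | 0.910                     | 0.908      | 0.913                 |
| Poly1  | 1              | 0.910                     | 0.908      | 0.958                 |
| Poly4  | 1              | 0.955                     | 0.966      | 0.946                 |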

Figures: linear kernel confusion matrix; poly 3 kernel confusion matrix; poly 4 kernel confusion matrix

Key Word Search

The main project of the course is to implement key word search in historical documents. Most of the methodology is inspired by the work of Rath and Manmatha [1].

Pre-processing

Before extracting features, each word is pre-processed:

  • Remove clutter (small objects)
  • Find a word mask
  • Normalize the pixel intensities
  • Position all the words in a frame with uniform height (centering and scaling)

The main assumption of this procedure is that the central part of the handwriting (i.e. the body of small letters like a, e, i, ...) produces the predominant peak in the vertical projection of the pixels.
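The projection assumption can be illustrated with a minimal sketch. It is not the repository's code: it assumes a binarized word image given as a list of rows of 0/1 values, reads "vertical projection" as the number of foreground pixels per row, and uses a hypothetical `fraction` parameter to decide how far the central band extends around the peak.

```python
def row_profile(image):
    """Count the foreground pixels in each row (image: list of rows of 0/1)."""
    return [sum(row) for row in image]


def central_band(image, fraction=0.5):
    """Return (top, bottom) row indices of the band around the profile peak.

    The band grows outwards from the peak row for as long as the neighbouring
    row holds at least `fraction` of the peak value. `fraction` is an
    illustrative parameter, not a value taken from the project.
    """
    profile = row_profile(image)
    peak = max(range(len(profile)), key=profile.__getitem__)
    threshold = fraction * profile[peak]

    top = peak
    while top > 0 and profile[top - 1] >= threshold:
        top -= 1

    bottom = peak
    while bottom < len(profile) - 1 and profile[bottom + 1] >= threshold:
        bottom += 1

    return top, bottom


# Toy word: thin ascender (rows 0-1), dense body (rows 2-4), thin descender (row 5).
word = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0],
]
band = central_band(word)
```

Once such a band is known, a word could be shifted and scaled so that the band lands at the same position and height in every frame.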

Feature computation

Features are computed with a sliding window approach. The local descriptor of each window includes:

  • Black-white transitions
  • Foreground fractions
  • Relative positions (top, bottom, centroid, center of mass)
  • Gray-scale moments
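As a rough illustration of such a descriptor, the sketch below slides a one-pixel-wide window over a binarized image and computes a subset of the listed features per column. The window width, the normalization by image height, the value used for empty columns, and the function names are assumptions; gray-scale moments are left out because the sketch works on 0/1 pixels.

```python
def column_features(column):
    """Descriptor of one window position (column: list of 0/1, top to bottom).

    Returns [transitions, foreground fraction, top, bottom, centroid], with the
    three positions normalized by the column height.
    """
    height = len(column)
    ink_rows = [r for r, value in enumerate(column) if value]

    # Number of changes between background and foreground along the column.
    transitions = sum(1 for a, b in zip(column, column[1:]) if a != b)
    fraction = len(ink_rows) / height

    if ink_rows:
        top = ink_rows[0] / height
        bottom = ink_rows[-1] / height
        centroid = sum(ink_rows) / len(ink_rows) / height
    else:
        # Empty column: positions are undefined, 0.0 is an arbitrary placeholder.
        top = bottom = centroid = 0.0

    return [transitions, fraction, top, bottom, centroid]


def word_features(image):
    """Slide the window left to right and return one descriptor per column."""
    return [column_features(list(column)) for column in zip(*image)]


features = word_features([[0, 1],
                          [1, 1]])
```

The result is a sequence of feature vectors whose length equals the image width, so words of different widths yield sequences of different lengths.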

Distance computation

Once the features are computed for each word, dynamic time warping (DTW) is used to compute a distance between the feature sequences of a given pair of words, analogous to a string edit distance.
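A minimal sketch of the classic DTW recurrence is shown here for orientation. It assumes a Euclidean local cost between feature vectors and applies neither a warping band nor path-length normalization; the project's actual implementation may differ on all of these points.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")

    # cost[i][j] is the cheapest alignment of the first i and first j vectors.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_a advances, seq_b repeats a vector
                cost[i][j - 1],      # seq_b advances, seq_a repeats a vector
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# The second sequence stretches the middle element, so the cost stays zero.
stretched = dtw_distance([[0], [1], [2]], [[0], [1], [1], [2]])
```

Because one vector may be matched to several vectors of the other sequence, words written with different widths can still be compared directly.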

Word classification

Words are classified with a k-nearest neighbors (KNN) algorithm.
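The following sketch shows a majority-vote KNN classifier that accepts an arbitrary distance function. The function name, the value of `k`, and the toy scalar data are illustrative assumptions; in the setting described here, `distance` would be a DTW distance between feature sequences and the labels would be word transcriptions.

```python
from collections import Counter


def knn_classify(query, training, distance, k=3):
    """Return the majority label among the k training samples closest to query.

    training: list of (sample, label) pairs.
    distance: callable taking two samples and returning a number.
    """
    neighbours = sorted(training, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    # On a tie, the label that was seen first (the nearer neighbour) wins.
    return votes.most_common(1)[0][0]


# Toy example with scalar samples and absolute difference as the distance.
training = [(1.0, "a"), (1.2, "a"), (5.0, "b"), (5.1, "b"), (0.9, "a")]


def absolute_difference(x, y):
    return abs(x - y)


label = knn_classify(1.1, training, absolute_difference, k=3)
```

A brute-force search like this evaluates the distance to every training sample for each query, so an expensive distance function dominates the run time.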

Performance

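| Dataset    | Overall accuracy | Accuracy with training samples | CPU time  |
|------------|------------------|--------------------------------|-----------|
| Training   | 0.48             | 0.48                           | 7.25 min  |
| Validation | 0.36             | 0.57                           | 4.15 min  |
| Everything | 0.44             | 0.50                           | 11.40 min |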

Figure: accuracy vs. training samples

1: Tony M. Rath and R. Manmatha. 2006. Word spotting for historical documents. IJDAR 9, 2–4 (August 2006), 139–152. DOI: http://dx.doi.org/10.1007/s10032-006-0027-8
