pattern-recognition-2016

Configuration

Python Version: 3.5.1
Use the config.ini (see ConfigParser)
For a good Python style guide see Google Python Style Guide
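
As a minimal sketch of how settings can be read from an INI file: in Python 3 the standard-library module is named `configparser` and provides the `ConfigParser` class. The section and key names below (`paths`, `data_dir`, `knn`, `k`) are hypothetical and are not taken from this project's config.ini.

```python
import configparser

# Hypothetical INI content; the real config.ini may use other sections and keys.
EXAMPLE_INI = """
[paths]
data_dir = data/

[knn]
k = 5
"""


def load_config(text):
    """Parse INI-formatted text and return the ConfigParser instance."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


config = load_config(EXAMPLE_INI)
data_dir = config.get("paths", "data_dir")  # values are strings by default
k = config.getint("knn", "k")               # getint converts to int
print(data_dir, k)
```

To read from disk instead, `parser.read("config.ini")` can replace the `read_string` call.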

External libraries

cython: C-Extensions for Python
numpy: Optimized and powerful N-dimensional array object
scipy: Fundamental library for scientific computing
scikit-image: Collection of algorithms for image processing
sklearn: Machine Learning in Python
sklearn.svm: Support vector machines (SVMs)
sklearn.neural_network.multilayer_perceptron: Generic multi-layer perceptron (GitHub pull request recently merged)
svg.path: SVG path objects and parser
bob.learn.mlp: Bob's Multi-layer Perceptron (MLP).
dtwextension: Dynamic time warping (C implementation)

See here for Windows binaries.

Info: Since the multilayer_perceptron classes from scikit-learn are not yet included in the latest release (0.17.1), they have been copied into the folder mlp. As soon as they are released, the code in that folder can be removed.

Classifier Results

MLP

SVM

Scores for different kernels and confusion matrices

Kernel	Training score	Training cross validation	Test score	Test cross validation
Linear	1	0.910	0.908	0.913
Poly1	1	0.910	0.908	0.958
Poly4	1	0.955	0.966	0.946

Key Word Search

The main project of the course is to implement a solution for key word search in historical documents. Most of the methodology is inspired by the work of Rath et al. ¹

Pre-processing

Before extracting features, each word is pre-processed:

Remove clutter (small objects)
Find a word mask
Normalize the pixel intensities
Position all the words in a frame with uniform height (centering and scaling)

The main assumption during this procedure is that the central part of the handwriting (i.e. small letters like a, e, i, ...) forms the predominant peak in the vertical projection of the pixels.
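
The peak assumption can be illustrated with a small sketch. It is written in plain Python for readability and is not the project's implementation. The function names and the `rel_threshold` parameter are hypothetical; the input is assumed to be a binary image given as a list of rows (1 = ink, 0 = background).

```python
def row_profile(binary):
    """Count foreground pixels in every row (projection onto the vertical axis)."""
    return [sum(row) for row in binary]


def core_band(binary, rel_threshold=0.5):
    """Return (top, bottom) row indices of the band around the highest peak.

    rel_threshold is a hypothetical tuning parameter: neighbouring rows stay
    in the band while they hold at least this fraction of the peak's count.
    """
    profile = row_profile(binary)
    peak = max(range(len(profile)), key=profile.__getitem__)
    limit = rel_threshold * profile[peak]
    top = peak
    while top > 0 and profile[top - 1] >= limit:
        top -= 1
    bottom = peak
    while bottom < len(profile) - 1 and profile[bottom + 1] >= limit:
        bottom += 1
    return top, bottom


word = [
    [0, 0, 1, 0, 0, 0],  # ascender
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],  # central part of the handwriting
    [1, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1, 0],  # descender
]
top, bottom = core_band(word)
print(top, bottom)
```

The band found this way could then serve as the reference for centering and scaling each word to a uniform height.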

Feature computation

Features are extracted with a sliding window approach. The local descriptor of each window includes:

Black-white transitions
Foreground fractions
Relative positions (top, bottom, centroid, center of mass)
Gray-scale moments
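
A rough sketch of such a descriptor follows, assuming a window that is one pixel column wide and a binary input image (list of rows, 1 = ink). The window width, the feature order, and the convention for empty columns are assumptions made for illustration. Gray-scale moments are left out because they need the intensity image, and on a binary image the centroid and the center of mass coincide, so only one value is computed.

```python
def column_features(column):
    """Describe one pixel column (0 = background, 1 = foreground)."""
    height = len(column)
    ink_rows = [r for r, value in enumerate(column) if value]
    # Number of black-white changes when walking down the column.
    transitions = sum(1 for a, b in zip(column, column[1:]) if a != b)
    fraction = len(ink_rows) / height
    if ink_rows:
        top = ink_rows[0] / height        # relative position of first ink pixel
        bottom = ink_rows[-1] / height    # relative position of last ink pixel
        centroid = sum(ink_rows) / len(ink_rows) / height
    else:
        # Empty column: hypothetical convention, not taken from the project.
        top = bottom = centroid = 0.5
    return [transitions, fraction, top, bottom, centroid]


def sliding_window_features(binary):
    """Slide a one-pixel-wide window from left to right over the word image."""
    columns = zip(*binary)  # transpose rows into columns
    return [column_features(list(col)) for col in columns]


word = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
]
features = sliding_window_features(word)
for vector in features:
    print(vector)
```

Each word is thereby turned into a sequence of feature vectors, one per window position, whose length depends on the width of the word.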

Distance computation

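Once the features are computed for each word, dynamic time warping (DTW) is used to compute the distance between the feature sequences of a given pair of words. Like a string edit distance, DTW tolerates local stretching and compression, so words of different widths can be compared.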
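
For illustration, the textbook DTW recurrence can be sketched in a few lines of plain Python. This is not the dtwextension C implementation used by the project. The Euclidean local cost and the absence of a warping-window constraint or path-length normalization are assumptions.

```python
import math


def euclidean(a, b):
    """Local cost between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the cheapest monotone alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j]: best alignment cost of the first i and first j vectors.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_b[j - 1] is matched again
                cost[i][j - 1],      # seq_a[i - 1] is matched again
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


a = [[0.0], [1.0], [2.0]]
b = [[0.0], [1.0], [1.0], [2.0]]  # same shape as a, stretched in the middle
c = [[0.0], [3.0], [2.0]]
print(dtw_distance(a, b), dtw_distance(a, c))
```

The stretched sequence `b` is at distance zero from `a`, which is the property that makes DTW suitable for comparing words written at different widths.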

Word classification

For word classification, a k-nearest neighbors (KNN) algorithm is used.
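
A minimal sketch of KNN with a pluggable distance function is shown below. The value of `k`, the majority vote, and the scalar toy samples are assumptions for illustration; in a word-spotting setting the samples would be feature sequences and the distance would be a DTW distance.

```python
from collections import Counter


def knn_classify(query, training, distance, k=3):
    """Return the majority label among the k training samples nearest to query.

    training is a list of (sample, label) pairs; distance is any callable
    taking two samples and returning a number.
    """
    neighbours = sorted(training, key=lambda item: distance(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def abs_distance(a, b):
    """Toy distance on scalars, standing in for a sequence distance."""
    return abs(a - b)


training = [(1.0, "and"), (1.2, "and"), (5.0, "the"), (5.5, "the"), (0.9, "and")]
prediction = knn_classify(1.1, training, abs_distance, k=3)
print(prediction)
```

With a costly distance function, the distances to all training samples dominate the running time, since every query is compared against the whole training set.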

Performance

Dataset	Overall accuracy	Accuracy with training samples	CPU time
Training	0.48	0.48	7.25 min
Validation	0.36	0.57	4.15 min
Everything	0.44	0.50	11.40 min

1: Tony M. Rath and R. Manmatha. 2006. Word spotting for historical documents. IJDAR 9, 2–4 (August 2006), 139–152. DOI: http://dx.doi.org/10.1007/s10032-006-0027-8
