Simple Method for High-Performance Digit Recognition Based on Sparse Coding

 

In [1], we propose a feature-extraction method for digit recognition that is inspired by vision research: a sparse-coding strategy combined with a local maximum operation. Despite its simplicity, the method yields state-of-the-art classification results on the highly competitive MNIST digit-recognition benchmark. We first employ the unsupervised Sparsenet algorithm [2],[3] to learn a basis for representing patches of handwritten digit images, and use this basis to extract local coefficients. In a second step, we apply a local maximum operation to these coefficients in order to implement local shift invariance. Finally, we train a support vector machine on the resulting feature vectors. We compare the classification performance obtained with sparse coding, Gabor wavelets, and principal component analysis, and conclude that learning a sparse representation of local image patches, combined with a local maximum operation for feature extraction, can significantly improve recognition performance.
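
The local maximum operation can be illustrated with a small, self-contained Python sketch. It is not the code from the package: the function name local_max, the window radius, and the use of absolute coefficient values are assumptions made for illustration, and the coefficient maps are toy data.

```python
def local_max(coeff_map, radius=1):
    """Replace each entry by the maximum over its local neighbourhood.

    coeff_map is a list of rows holding the coefficients of ONE basis
    function at every patch position. Windows are clipped at the border.
    Taking absolute values first is an assumption of this sketch.
    """
    rows, cols = len(coeff_map), len(coeff_map[0])
    pooled = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                abs(coeff_map[rr][cc])
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(max(window))
        pooled.append(out_row)
    return pooled


# Toy data: the same response, once at column 1 and once shifted to column 2.
original = [[0.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0]]
shifted = [[0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0]]

# After pooling, position (1, 1) carries the same value in both maps.
print(local_max(original)[1][1], local_max(shifted)[1][1])
```

Because each output value summarizes a small neighbourhood, shifting the input by one position leaves the feature at that location unchanged, which is the kind of local shift invariance described above.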

 

Here, we provide an Octave/MATLAB package containing an implementation of the method:


(1) Download the package.

(2) Extract the tar archive: tar -xzf SPNREC.tar.gz

The archive contains precompiled MATLAB binaries for the Linux x86-32 and x86-64 architectures, as well as Octave binaries for Linux x86-64. For other platforms or operating systems, you have to compile the sources:

(3) Add the -msse2 compiler option to the CFLAGS variable of the MATLAB MEX compiler (mexopts.sh).

(4) Enter the SPNREC directory

(5) Compile the C sources by typing make. Note that the MATLAB MEX compiler has to be in the path.

Note 1: To use the .m scripts in the package, you additionally need our SoftDoubleMinOver package.

Note 2: If you don't want to use SSE2, you can disable it by removing the __USE_SSE__ compiler option in the makefile.

Note 3: The package can also be compiled for Octave. To obtain Octave-compatible MEX files, modify the CC variable in the makefile (replace mex by mkoctfile; see the makefile for more information).

Note 4: On x86-64 architectures, use the __USE_SSE64__ compiler option (see the makefile).
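
Since the build needs make plus either the MATLAB MEX compiler (mex) or Octave's mkoctfile in the path, a quick check before compiling can save some confusion. The following Python sketch is not part of the package; the function name find_build_tools is hypothetical, and it only reports whether the tools are found, not whether they are configured correctly.

```python
import shutil


def find_build_tools(tools=("make", "mex", "mkoctfile")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in tools}


if __name__ == "__main__":
    for name, path in find_build_tools().items():
        print(f"{name:10s} {path or 'NOT FOUND on PATH'}")
```

Only one of mex and mkoctfile is needed, depending on whether you build for MATLAB or for Octave.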

Please contact labusch@inb.uni-luebeck.de if you have further questions.

 

The digit_recognition.m script in the package reproduces the digit-recognition experiment based on a sparse code. We also provide a MATLAB data file containing all variables used and all results obtained when running this script (fe_mode=0):

example.mat (Warning: this file is about 1.6 GB, and your browser might try to display it without telling you. To download the file to your hard disk, right-click the link.)

Among other variables, this file contains:
  • C : The basis that was learned by the Sparsenet algorithm from the MNIST data
  • F_train : sparse coding features of the MNIST training set
  • c_train : class information of the training set
  • F_test : sparse coding features of the MNIST test set
  • c_test : class information of the test set
  • bmerr : mean validation error
  • error_rate : final error rate on the MNIST test set
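
For readers who want to inspect the data file outside MATLAB or Octave, a minimal Python sketch follows. It rests on several assumptions: that SciPy is installed, that example.mat is stored in a MAT format older than v7.3 (scipy.io.loadmat cannot read v7.3 files, which need an HDF5 reader instead), and that an error rate is meant as the fraction of misclassified samples. The helper names error_rate and load_example are hypothetical and not part of the package.

```python
def error_rate(predicted, truth):
    """Fraction of samples whose predicted class differs from the true class."""
    if len(predicted) != len(truth):
        raise ValueError("label sequences differ in length")
    wrong = sum(1 for p, t in zip(predicted, truth) if p != t)
    return wrong / len(truth)


def load_example(path="example.mat"):
    """Load a few small variables from the data file (needs SciPy)."""
    # Imported lazily so that error_rate stays usable without SciPy.
    from scipy.io import loadmat

    # variable_names restricts loading to the listed variables, which
    # avoids reading the large feature matrices F_train and F_test.
    return loadmat(path, variable_names=["C", "c_test", "error_rate"])


# Toy labels, not MNIST results: one of four predictions is wrong.
print(error_rate([1, 2, 3, 4], [1, 2, 3, 5]))
```

With the file downloaded, load_example() returns a dictionary, so load_example()["C"].shape shows the dimensions of the learned basis.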

 

  1. Kai Labusch, Erhardt Barth, and Thomas Martinetz. Simple Method for High-Performance Digit Recognition Based on Sparse Coding. IEEE Transactions on Neural Networks, 19(11):1985-1989, 2008.
  2. Bruno A. Olshausen and David J. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:3311-3325, 1997.
  3. Bruno A. Olshausen and David J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607-609, 1996.