Halcon-OCR create_ocr_class_mlp

2016-01-01 11:25
create_ocr_class_mlp (Operator)
Name
create_ocr_class_mlp — Create an OCR classifier using a multilayer perceptron.
Signature
create_ocr_class_mlp( : : WidthCharacter, HeightCharacter, Interpolation, Features, Characters, NumHidden, Preprocessing, NumComponents, RandSeed : OCRHandle)
Description
create_ocr_class_mlp creates an OCR classifier that uses a multilayer perceptron (MLP). The handle of the OCR classifier is returned in OCRHandle.
For a description of how an MLP works, see create_class_mlp. create_ocr_class_mlp creates an MLP with OutputFunction = 'softmax'.
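As a rough illustration of what OutputFunction = 'softmax' implies, the following Python sketch maps raw output activations to values that are positive and sum to 1, so that they can be read as class probabilities. It is not HALCON code: the function name and the sample activations are made up for illustration only.

```python
import math

def softmax(activations):
    """Map raw output-layer activations to values in (0, 1) that sum to 1."""
    # Subtracting the maximum improves numerical stability and does not
    # change the result.
    peak = max(activations)
    exps = [math.exp(a - peak) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical activations for a three-character set such as ['0', '1', '2'].
probs = softmax([2.0, 1.0, 0.1])
best = probs.index(max(probs))  # index of the most probable class
print(best, probs)
```

The largest activation always yields the largest probability, so the predicted class is unchanged by the softmax; only the scale of the outputs is normalized.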
The length of the feature vector of the MLP (NumInput in create_class_mlp) is determined from the features that are used for the OCR, which are passed in Features. The features are described below. The number of units in the hidden layer is determined by NumHidden. The number of output variables of the MLP (NumOutput in create_class_mlp) is determined from the names of the characters to be used in the OCR, which are passed in Characters.
As described with create_class_mlp, the parameters Preprocessing and NumComponents can be used to specify a preprocessing of the data (i.e., the feature vectors). The OCR already approximately normalizes the features. Hence, Preprocessing can typically be set to 'none'. The parameter RandSeed has the same meaning as in create_class_mlp.
The features to be used for the classification are determined by Features. Features can contain a tuple of several feature names. Each of these feature names results in one or more features to be calculated for the classifier. Some of the feature names compute gray value features (e.g., 'pixel_invar').
Because a classifier requires a constant number of features (input variables), a character to be classified is transformed to a standard size, which is determined by WidthCharacter and HeightCharacter. The interpolation to be used for the transformation is determined by Interpolation. It has the same meaning as in affine_trans_image. The interpolation should be chosen such that no aliasing effects occur in the transformation. For most applications, Interpolation = 'constant' should be used. Note that the size of the transformed character should not be chosen too large, because the generalization properties of the classifier may deteriorate for large sizes. In particular, with large sizes, small segmentation errors have a large influence on the computed features if gray value features are used. This happens because segmentation errors change the smallest enclosing rectangle of the regions, so that the character is zoomed differently than the characters in the training set. In most applications, sizes between 6x8 and 10x14 should be used.
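The influence of a segmentation error on the zoom can be illustrated with a little arithmetic. The Python sketch below is not HALCON code. It assumes a plain linear zoom of the enclosing rectangle to the standard width, with the left edge fixed and the error added on the right; the function name and the sample numbers are hypothetical. It computes how far, in pixels of the standard-size image, a column of the character moves when the rectangle is segmented slightly too wide.

```python
def displacement(x, true_width, error, target_width):
    """Shift, in target pixels, of source column x when the enclosing
    rectangle is measured as true_width + error instead of true_width."""
    ideal = x * target_width / true_width
    actual = x * target_width / (true_width + error)
    return ideal - actual

# Hypothetical character: 20 pixels wide, segmented 2 pixels too wide.
# Compare a small and a large standard width.
for target in (8, 40):
    shift = displacement(x=20, true_width=20, error=2, target_width=target)
    print(target, round(shift, 2))
```

Under these assumptions the shift grows in proportion to the standard width: for a width of 8 it stays below one pixel, whereas for a width of 40 the same segmentation error moves the gray values by more than three pixels.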
The parameter Features can contain the following feature names for the classification of the characters. By specifying 'default', the features 'ratio' and 'pixel_invar' are selected.
'pixel'
Gray values of the character (WidthCharacter x HeightCharacter features).
'pixel_invar'
Gray values of the character with maximum scaling of the gray values (WidthCharacter x HeightCharacter features).
'pixel_binary'
Region of the character as a binary image zoomed to a size of WidthCharacter x HeightCharacter (WidthCharacter x HeightCharacter features).
'gradient_8dir'
Gradients are computed on the character image. The gradient directions are discretized into 8 directions. The amplitude image is decomposed into 8 channels according to these discretized directions. 25 samples on a 5x5 grid are extracted from each channel. These samples are used as features (200 features).
'projection_horizontal'
Horizontal projection of the gray values (see gray_projections, HeightCharacter features).
'projection_horizontal_invar'
Maximally scaled horizontal projection of the gray values (HeightCharacter features).
'projection_vertical'
Vertical projection of the gray values (see gray_projections, WidthCharacter features).
'projection_vertical_invar'
Maximally scaled vertical projection of the gray values (WidthCharacter features).
'ratio'
Aspect ratio of the character (1 feature).
'anisometry'
Anisometry of the character (see eccentricity, 1 feature).
'width'
Width of the character before scaling the character to the standard size (not scale-invariant, see smallest_rectangle1, 1 feature).
'height'
Height of the character before scaling the character to the standard size (not scale-invariant, see smallest_rectangle1, 1 feature).
'zoom_factor'
Difference in size between the character and the values of WidthCharacter and HeightCharacter (not scale-invariant, 1 feature).
'foreground'
Fraction of pixels in the foreground (1 feature).
'foreground_grid_9'
Fraction of pixels in the foreground in a 3x3 grid within the smallest enclosing rectangle of the character (9 features).
'foreground_grid_16'
Fraction of pixels in the foreground in a 4x4 grid within the smallest enclosing rectangle of the character (16 features).
'compactness'
Compactness of the character (see compactness, 1 feature).
'convexity'
Convexity of the character (see convexity, 1 feature).
'moments_region_2nd_invar'
Normalized 2nd moments of the character (see moments_region_2nd_invar, 3 features).
'moments_region_2nd_rel_invar'
Normalized 2nd relative moments of the character (see moments_region_2nd_rel_invar, 2 features).
'moments_region_3rd_invar'
Normalized 3rd moments of the character (see moments_region_3rd_invar, 4 features).
'moments_central'
Normalized central moments of the character (see moments_region_central, 4 features).
'moments_gray_plane'
Normalized gray value moments and the angle of the gray value plane (see moments_gray_plane, 4 features).
'phi'
Sine and cosine of the orientation (angle) of the character (see elliptic_axis, 2 features).
'num_connect'
Number of connected components (see connect_and_holes, 1 feature).
'num_holes'
Number of holes (see connect_and_holes, 1 feature).
'cooc'
Values of the binary cooccurrence matrix (see gen_cooc_matrix, 8 features).
'num_runs'
Number of runs in the region normalized by the height (1 feature).
'chord_histo'
Frequency of the runs per row (HeightCharacter features).
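The feature counts in the list above add up to the length of the MLP feature vector (NumInput). The following Python sketch tallies that length for a given selection of feature names. It is only a bookkeeping aid, not HALCON code: the table and the function name num_input are hypothetical, the counts are copied from the list above, and 'default' is expanded to 'ratio' and 'pixel_invar' as described there.

```python
# Features whose count does not depend on the standard character size.
FIXED = {
    'gradient_8dir': 200, 'ratio': 1, 'anisometry': 1, 'width': 1,
    'height': 1, 'zoom_factor': 1, 'foreground': 1,
    'foreground_grid_9': 9, 'foreground_grid_16': 16, 'compactness': 1,
    'convexity': 1, 'moments_region_2nd_invar': 3,
    'moments_region_2nd_rel_invar': 2, 'moments_region_3rd_invar': 4,
    'moments_central': 4, 'moments_gray_plane': 4, 'phi': 2,
    'num_connect': 1, 'num_holes': 1, 'cooc': 8, 'num_runs': 1,
}
PER_PIXEL = {'pixel', 'pixel_invar', 'pixel_binary'}
PER_ROW = {'projection_horizontal', 'projection_horizontal_invar',
           'chord_histo'}
PER_COLUMN = {'projection_vertical', 'projection_vertical_invar'}

def num_input(features, width, height):
    """Total number of MLP inputs for the given feature names."""
    names = []
    for name in features:
        # 'default' selects 'ratio' and 'pixel_invar'.
        names.extend(['ratio', 'pixel_invar'] if name == 'default' else [name])
    total = 0
    for name in names:
        if name in PER_PIXEL:
            total += width * height
        elif name in PER_ROW:
            total += height
        elif name in PER_COLUMN:
            total += width
        else:
            total += FIXED[name]  # raises KeyError for unknown names
    return total

print(num_input(['default'], 8, 10))
```

For the default features and the default size of 8 x 10, the tally is 1 + 80 = 81 inputs. The number of outputs (NumOutput) is simply the number of names passed in Characters.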
After the classifier has been created, it is trained using trainf_ocr_class_mlp. After this, the classifier can be saved using write_ocr_class_mlp. Alternatively, the classifier can be used immediately after training to classify characters using do_ocr_single_class_mlp or do_ocr_multi_class_mlp.
HALCON provides a number of pretrained OCR classifiers (see Solution Guide I, chapter 'OCR', section 'Pretrained OCR Fonts'). These pretrained OCR classifiers can be read directly with read_ocr_class_mlp and make it possible to read a wide variety of different fonts without the need to train an OCR classifier. Therefore, it is recommended to check first whether one of the pretrained OCR classifiers can be used successfully. If this is the case, it is not necessary to create and train an OCR classifier.
A comparison of the MLP and the support vector machine (SVM) (see create_ocr_class_svm) typically shows that SVMs are faster at training, especially for huge training sets, and achieve slightly better recognition rates than MLPs. The MLP is faster at classification and should therefore be preferred in time-critical applications. Please note that this guideline assumes optimal tuning of the parameters.
Parallelization

Multithreading type: exclusive (runs in parallel only with independent operators).
Multithreading scope: global (may be called from any thread).
Processed without parallelization.

Parameters
WidthCharacter
(input_control) integer → (integer)
Width of the rectangle to which the gray values of the segmented character are zoomed.
Default value: 8
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Typical range of values: 4 ≤ WidthCharacter ≤ 20
HeightCharacter
(input_control) integer → (integer)
Height of the rectangle to which the gray values of the segmented character are zoomed.
Default value: 10
Suggested values: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16, 20
Typical range of values: 4 ≤ HeightCharacter ≤ 20
Interpolation
(input_control) string → (string)
Interpolation mode for the zooming of the characters.
Default value: 'constant'
List of values: 'nearest_neighbor', 'bilinear', 'constant', 'weighted'
Features
(input_control) string(-array) → (string)
Features to be used for classification.
Default value: 'default'
List of values: 'default', 'pixel', 'pixel_invar', 'pixel_binary', 'gradient_8dir', 'projection_horizontal', 'projection_horizontal_invar', 'projection_vertical', 'projection_vertical_invar', 'ratio', 'anisometry', 'width', 'height', 'zoom_factor', 'foreground', 'foreground_grid_9', 'foreground_grid_16', 'compactness', 'convexity', 'moments_region_2nd_invar', 'moments_region_2nd_rel_invar', 'moments_region_3rd_invar', 'moments_central', 'moments_gray_plane', 'phi', 'num_connect', 'num_holes', 'cooc', 'num_runs', 'chord_histo'
Characters
(input_control) string-array → (string)
All characters of the character set to be read.
Default value: ['0','1','2','3','4','5','6','7','8','9']
NumHidden
(input_control) integer → (integer)
Number of hidden units of the MLP.
Default value: 80
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150
Restriction: NumHidden >= 1
Preprocessing
(input_control) string → (string)
Type of preprocessing used to transform the feature vectors.
Default value: 'none'
List of values: 'none', 'normalization', 'principal_components', 'canonical_variates'
NumComponents
(input_control) integer → (integer)
Preprocessing parameter: Number of transformed features (ignored for Preprocessing = 'none' and Preprocessing = 'normalization').
Default value: 10
Suggested values: 1, 2, 3, 4, 5, 8, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100
Restriction: NumComponents >= 1
RandSeed
(input_control) integer → (integer)
Seed value of the random number generator that is used to initialize the MLP with random values.
Default value: 42
OCRHandle
(output_control) ocr_mlp → (integer)
Handle of the OCR classifier.
Example
(HDevelop)
read_image (Image, 'letters')
* Segment the image.
bin_threshold (Image, Region)
dilation_circle (Region, RegionDilation, 3.5)
connection (RegionDilation, ConnectedRegions)
intersection (ConnectedRegions, Region, RegionIntersection)
sort_region (RegionIntersection, Characters, 'character', 'true', 'row')
* Generate the training file.
count_obj (Characters, Number)
Classes := []
for J := 0 to 25 by 1
    Classes := [Classes,gen_tuple_const(20,chr(ord('a')+J))]
endfor
Classes := [Classes,gen_tuple_const(20,'.')]
write_ocr_trainf (Characters, Image, Classes, 'letters.trf')
* Generate and train the classifier.
read_ocr_trainf_names ('letters.trf', CharacterNames, CharacterCount)
create_ocr_class_mlp (8, 10, 'constant', 'default', CharacterNames, 20, \
                      'none', 81, 42, OCRHandle)
trainf_ocr_class_mlp (OCRHandle, 'letters.trf', 100, 0.01, 0.01, Error, \
                      ErrorLog)
* Re-classify the characters in the image.
do_ocr_multi_class_mlp (Characters, Image, OCRHandle, Class, Confidence)
clear_ocr_class_mlp (OCRHandle)
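The training file needs one class name per segmented region, in the same order as the sorted regions. The Python sketch below mirrors how the example builds the Classes tuple, to make the resulting label list explicit. It is an illustration only, not HALCON code, and it relies on the assumption implied by the example that the image contains 20 samples of each letter 'a' to 'z' followed by 20 samples of '.'.

```python
# One label per region: 20 samples for each of 'a'..'z', then 20 of '.'.
classes = []
for j in range(26):
    classes += [chr(ord('a') + j)] * 20
classes += ['.'] * 20

# The distinct names correspond to the character set of the classifier.
names = sorted(set(classes))
print(len(classes), len(names))
```

The list therefore holds 540 labels with 27 distinct names, so a classifier created from these names has 27 output variables.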

Result
If the parameters are valid, the operator create_ocr_class_mlp returns the value 2 (H_MSG_TRUE). If necessary, an exception is raised.
Possible Successors
trainf_ocr_class_mlp
Alternatives
create_ocr_class_svm, create_ocr_class_box
See also
do_ocr_single_class_mlp, do_ocr_multi_class_mlp, clear_ocr_class_mlp, create_class_mlp, train_class_mlp, classify_class_mlp
Module
OCR/OCV

HALCON Reference Manual 10.0
Copyright © 1996-2010 MVTec Software GmbH