Neural Network Data -- Texture Images


    *********** Texture.nns is a NeuNet Pro Sample File ***********

This file contains special properties that allow NeuNet Pro to recognize it
as authorized NeuNet sample data.  Anyone using the unlicensed version
of NeuNet Pro is welcome to experiment with this sample data.
Please do not modify this file, or it will lose its status as authorized sample data.

For further information about NeuNet Pro and additional sample data,
please visit the NeuNet Pro website at http://www.cormactech.com/neunet

All of this data has been collected from publicly available sources.
CorMac Technologies Inc. does not guarantee the accuracy of the data.
This data is intended solely for experimental purposes.


    ***************  More about the Texture Data *******************


                    #########################
                    # The TEXTURE database  #
                    #########################

1. Sources:

     (*) This database was generated by the Laboratory of Image Processing
         and Pattern Recognition (INPG-LTIRF) in the development of the
         Esprit project ELENA No. 6891 and the Esprit working group ATHOS
         No. 6620.

     (a) Original source:

       P. Brodatz, "Textures: A Photographic Album for Artists and Designers",
       Dover Publications, Inc., New York, 1966.

     (b) Creation: Laboratory of Image Processing and Pattern Recognition

       Institut National Polytechnique de Grenoble INPG
       Laboratoire de Traitement d'Image et de Reconnaissance de Formes LTIRF
       Av. Felix Viallet, 46
       F-38031 Grenoble Cedex
       France

     (c) Contact: Dr. A. Guerin-Dugue, INPG-LTIRF, guerin@tirf.inpg.fr




2. Past Usage:

This database was created for private use at the TIRF laboratory, in order
to study texture discrimination with high order statistics.

    A.Guerin-Dugue, C. Aviles-Cruz,
   "High Order Statistics from Natural Textured Images",
    In ATHOS workshop on System Identification and High Order Statistics, 
    Sophia-Antipolis, France, September 1993.

    Guerin-Dugue, A. and others,  
    Deliverable R3-B4-P - Task B4: Benchmarks, Technical report,
    Elena-NervesII "Enhanced Learning for Evolutive Neural Architecture", 
    ESPRIT-Basic Research Project  Number 6891,
    June 1995




3. Relevant Information:

    The aim is to distinguish between 11 different textures (Grass
lawn, Pressed calf leather, Handmade paper, Raffia looped to a high
pile, Cotton canvas, ...), each pattern (pixel) being characterised
by 40 attributes built by the estimation of fourth order modified
moments in four orientations: 0, 45, 90 and 135 degrees.

    A statistical method based on the extraction of fourth order
moments was developed for the characterization of natural
micro-textures. It is called "fourth order modified moments" (mm4)
[Guerin93] and measures, for each texture, the deviation from a
first-order Gauss-Markov process. The features were estimated in four
directions (0, 45, 90 and 135 degrees) to take into account the
possible orientations of the textures. Only correlations between the
current pixel, the first neighbourhood and the second neighbourhood
are taken into account. This small neighbourhood suits the fine grain
property of the textures.

  The data set contains 11 classes of 500 instances and each class
refers to a type of texture in the Brodatz album.

  The database dimension is 40, plus one for the class label.  The 40
attributes were built by estimating the following ten fourth order
modified moments in each of four orientations (0, 45, 90 and 135
degrees):  mm4(000), mm4(001), mm4(002), mm4(011), mm4(012), mm4(022),
mm4(111), mm4(112), mm4(122) and mm4(222).
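
The attribute layout can be sketched in Python as follows. The column
ordering used here (ten moments for 0 degrees, then ten for 45 degrees,
and so on) is an assumption, not something this file states. It is only
suggested by the summary statistics, where attributes 1, 11, 21 and 31
have identical values, as would be expected if each block of ten began
with an orientation-independent mm4(000). The names below are
illustrative.

```python
# Hypothetical helper: builds readable labels for the 40 attributes.
# ASSUMPTION: columns are grouped by orientation, ten moments per orientation.
ORIENTATIONS = (0, 45, 90, 135)
MOMENTS = ("000", "001", "002", "011", "012",
           "022", "111", "112", "122", "222")


def attribute_names():
    """Return 40 labels such as 'mm4(000)@0deg' in the assumed column order."""
    return [f"mm4({m})@{angle}deg" for angle in ORIENTATIONS for m in MOMENTS]


names = attribute_names()
print(len(names))            # 40
print(names[0], names[10])   # mm4(000)@0deg mm4(000)@45deg
```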

!! In every dataset derived from the texture database (texture.dat,
   texture_CR.dat, texture_PCA.dat, texture_DFA.dat), patterns are sorted
   by class and presented in increasing order of their class label.




4. Class: 

The class label is a code for the following classes:

  Class label   Texture
       2        Grass lawn                     (D09)
       3        Pressed calf leather           (D24)
       4        Handmade paper                 (D57)
       6        Raffia looped to a high pile   (D84)
       7        Cotton canvas                  (D77)
       8        Pigskin                        (D92)
       9        Beach sand                     (D28)
      10        Beach sand                     (D29)
      12        Oriental straw cloth           (D53)
      13        Oriental straw cloth           (D78)
      14        Oriental grass fiber cloth     (D79)
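
A minimal Python sketch of reading such a file and counting the
instances per class is given below. It assumes that each line holds 40
whitespace-separated attribute values followed by the class label, which
matches the description above but has not been checked against the
distributed files. `parse_rows`, `class_counts` and `CLASS_NAMES` are
illustrative names, and the demo lines are synthetic.

```python
from collections import Counter

# Class codes and names copied from the table above.
CLASS_NAMES = {
    2: "Grass lawn (D09)",
    3: "Pressed calf leather (D24)",
    4: "Handmade paper (D57)",
    6: "Raffia looped to a high pile (D84)",
    7: "Cotton canvas (D77)",
    8: "Pigskin (D92)",
    9: "Beach sand (D28)",
    10: "Beach sand (D29)",
    12: "Oriental straw cloth (D53)",
    13: "Oriental straw cloth (D78)",
    14: "Oriental grass fiber cloth (D79)",
}


def parse_rows(lines):
    """Split text lines into (features, label) pairs.

    ASSUMPTION: 40 whitespace-separated values followed by the class label.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 41:
            raise ValueError(f"expected 41 fields, got {len(fields)}")
        features = [float(v) for v in fields[:40]]
        label = int(float(fields[40]))  # accepts '2' as well as '2.0'
        rows.append((features, label))
    return rows


def class_counts(rows):
    """Count how many patterns carry each class label."""
    return Counter(label for _, label in rows)


# Synthetic demo lines; a real run would pass an open file instead,
# e.g. `with open("texture.dat") as fh: rows = parse_rows(fh)`.
demo = [" ".join(["0.5"] * 40 + ["2"]), " ".join(["-0.25"] * 40 + ["14"])]
for label, count in sorted(class_counts(parse_rows(demo)).items()):
    print(label, CLASS_NAMES[label], count)
```

On the full database, each of the 11 counts would be expected to equal
500.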




5. Summary Statistics:

The table below gives, for each attribute of the database, the range
(Min and Max values), the mean value and the standard deviation.

Attribute   Min       Max      Mean  Std. dev.

    1   -1.4495    0.7741   -1.0983    0.2034
    2   -1.2004    0.3297   -0.5867    0.2055
    3   -1.3099    0.3441   -0.5838    0.3135
    4   -1.1104    0.5878   -0.4046    0.2302
    5   -1.0534    0.4387   -0.3307    0.2360
    6   -1.0029    0.4515   -0.2422    0.2225
    7   -1.2076    0.5246   -0.6026    0.2003
    8   -1.0799    0.3980   -0.4322    0.2210
    9   -1.0570    0.4369   -0.3317    0.2361
   10   -1.2580    0.3546   -0.5978    0.3268
   11   -1.4495    0.7741   -1.0983    0.2034
   12   -1.0831    0.3715   -0.5929    0.2056
   13   -1.1194    0.6347   -0.4019    0.3368
   14   -1.0182    0.1573   -0.6270    0.1390
   15   -0.9435    0.1642   -0.4482    0.1952
   16   -0.9944    0.0357   -0.5763    0.1587
   17   -1.1722    0.0201   -0.7331    0.1955
   18   -1.0174    0.1155   -0.4919    0.2335
   19   -1.0044    0.0833   -0.4727    0.2257
   20   -1.1800    0.4392   -0.4831    0.3484
   21   -1.4495    0.7741   -1.0983    0.2034
   22   -1.2275    0.5963   -0.7363    0.2220
   23   -1.3412    0.4464   -0.7771    0.3290
   24   -1.1774    0.6882   -0.5770    0.2646
   25   -1.1369    0.4098   -0.5085    0.2538
   26   -1.1099    0.3725   -0.4038    0.2515
   27   -1.2393    0.6120   -0.7279    0.2278
   28   -1.1540    0.4221   -0.5863    0.2446
   29   -1.1323    0.3916   -0.5090    0.2526
   30   -1.4224    0.4718   -0.7708    0.3264
   31   -1.4495    0.7741   -1.0983    0.2034
   32   -1.1789    0.5647   -0.6463    0.1890
   33   -1.1473    0.6755   -0.4919    0.3304
   34   -1.1228    0.3132   -0.6435    0.1441
   35   -1.0145    0.3396   -0.4918    0.1922
   36   -1.0298    0.1560   -0.5934    0.1704
   37   -1.2534    0.0899   -0.7795    0.1641
   38   -1.0966    0.1944   -0.5541    0.2111
   39   -1.0765    0.2019   -0.5230    0.2015
   40   -1.2155    0.4647   -0.5677    0.3091

   The attribute values lie in the range [-1.45, 0.775].
The database obtained by centering and reducing each attribute of the
Texture database (zero mean, unit standard deviation) is on the ftp
server in the `REAL/texture/texture_CR.dat.Z' file.
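
Centering and reduction by attribute can be sketched as below. This is
an illustration only, not the procedure actually used to build
texture_CR.dat: it assumes the population standard deviation (division
by n), which this file does not specify, and `center_and_reduce` is a
hypothetical name.

```python
import math


def center_and_reduce(rows):
    """Standardize each attribute (column) to zero mean and unit deviation.

    ASSUMPTION: population standard deviation (divide by n, not n - 1).
    A constant column is mapped to 0.0 to avoid dividing by zero.
    """
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
        for j in range(d)
    ]
    return [
        [(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(d)]
        for r in rows
    ]


# Tiny synthetic example with two attributes on very different scales.
print(center_and_reduce([[1.0, 10.0], [3.0, 30.0]]))
# [[-1.0, -1.0], [1.0, 1.0]]
```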


6. Confusion matrix.

The following confusion matrix of the k_NN classifier was obtained with a
Leave_One_Out error counting method on the texture_CR.dat database.
k was set to 1, the value that gave the minimum mean error rate:
1.0 +/- 0.8%. Entries are percentages, and each row sums to 100.

 Class    2      3      4      6      7      8      9      10     12     13     14 
 2      97.0    1.0    0.4    0.0    0.0    0.0    1.6    0.0    0.0    0.0    0.0   
 3       0.2   99.0    0.0    0.0    0.0    0.0    0.4    0.0    0.0    0.0    0.4   
 4       1.0    0.0   98.8    0.0    0.0    0.0    0.2    0.0    0.0    0.0    0.0   
 6       0.0    0.0    0.0   99.4    0.0    0.0    0.0    0.6    0.0    0.0    0.0   
 7       0.0    0.0    0.0    0.0  100.0    0.0    0.0    0.0    0.0    0.0    0.0   
 8       0.0    0.0    0.0    0.0    0.0   98.6    0.0    1.4    0.0    0.0    0.0   
 9       0.4    0.0    0.2    0.0    0.0    0.2   98.8    0.4    0.0    0.0    0.0   
 10      0.0    0.0    0.0    0.0    0.0    1.4    0.0   98.6    0.0    0.0    0.0   
 12      0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0  100.0    0.0    0.0   
 13      0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0    0.0   99.8    0.2   
 14      0.0    0.4    0.0    0.0    0.0    0.4    0.0    0.0    0.2    0.0   99.0   
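
The leave-one-out 1-NN procedure and the mean error rate can be
sketched in Python as follows. The distance metric is an assumption
(squared Euclidean); this file does not name the one used. The function
names are illustrative, and the pure-Python double loop is meant for
small examples rather than the full 5500-pattern database.

```python
def loo_1nn_predictions(features, labels):
    """Leave-one-out 1-NN: label each pattern by its nearest *other* pattern.

    ASSUMPTION: squared Euclidean distance.
    """
    predictions = []
    for i, x in enumerate(features):
        best_dist, best_label = None, None
        for j, y in enumerate(features):
            if i == j:
                continue  # leave the pattern itself out
            dist = sum((a - b) ** 2 for a, b in zip(x, y))
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        predictions.append(best_label)
    return predictions


def error_rate(labels, predictions):
    """Percentage of patterns whose predicted label differs from the true one."""
    wrong = sum(1 for t, p in zip(labels, predictions) if t != p)
    return 100.0 * wrong / len(labels)


# Diagonal of the confusion matrix above (percent correct per class).
DIAGONAL = [97.0, 99.0, 98.8, 99.4, 100.0, 98.6, 98.8, 98.6, 100.0, 99.8, 99.0]
mean_error = 100.0 - sum(DIAGONAL) / len(DIAGONAL)
print(round(mean_error, 1))  # 1.0
```

Because every class has the same number of instances, averaging the
per-class diagonal entries gives the overall rate, which agrees with the
1.0% quoted above.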



7. Result of the Principal Component Analysis:

Principal Component Analysis (PCA) is a classical method in pattern
recognition [Duda73].

PCA linearly reduces the sample dimension to give the best
representation in fewer dimensions while keeping the maximum inertia
(variance). The best axis for representation is, however, not
necessarily the best axis for discrimination. After PCA, features are
selected according to the percentage of initial inertia covered by the
different axes, and the number of features is determined by the
percentage of initial inertia to keep for the classification process.
This selection method has been applied to the texture_CR database.
When quasi-linear correlations exist between some initial features,
PCA removes these redundant dimensions, and this preprocessing is
therefore recommended.  In this case, before PCA, the determinant of the
data covariance matrix is near zero; the database is thus badly
conditioned for all processes that use this information (the quadratic
classifier, for example).

The following file is available for the texture database:
``texture_PCA.dat.Z''. It is the projection of the ``texture_CR'' database
onto its principal components, sorted in decreasing order of the related
inertia percentage. To work on the database projected onto its first x
principal components, keep only the first x attributes of the
texture_PCA.dat database and the class labels (last attribute).
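
Keeping the first x principal components can be sketched as follows,
assuming each row has already been read into a list whose last element
is the class label. `keep_first_components` is a hypothetical helper,
and the demo row is synthetic.

```python
def keep_first_components(rows, x):
    """Keep the first x attributes of each row plus the class label.

    ASSUMPTION: each row is a sequence of attributes followed by the label.
    """
    kept = []
    for row in rows:
        n_attributes = len(row) - 1
        if not 1 <= x <= n_attributes:
            raise ValueError(f"x must be between 1 and {n_attributes}")
        kept.append(list(row[:x]) + [row[-1]])
    return kept


# Synthetic row: 40 attribute values (0..39) followed by class label 7.
row = list(range(40)) + [7]
print(keep_first_components([row], 3))  # [[0, 1, 2, 7]]
```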

The table below gives the inertia percentages associated with the
eigenvalues of the principal component axes, sorted in decreasing
order of inertia percentage. 99.85 percent of the total database
inertia remains if the first 20 principal components are kept.

	No.	Eigenvalue   Inertia (%)    Cumulated (%)

	1	30.267500000 75.6687000000  75.6687000000 
	2	3.6512500000  9.1281300000  84.7969000000 
 	3	2.2937000000  5.7342400000  90.5311000000 
 	4	1.7039700000  4.2599300000  94.7910000000 
 	5	0.6716540000  1.6791300000  96.4702000000 
 	6	0.5015290000  1.2538200000  97.7240000000 
 	7	0.1922830000  0.4807070000  98.2047000000 
 	8	0.1561070000  0.3902670000  98.5950000000 
 	9	0.1099570000  0.2748920000  98.8699000000 
 	10	0.0890891000  0.2227230000  99.0926000000 
 	11	0.0656016000  0.1640040000  99.2566000000 
 	12	0.0489988000  0.1224970000  99.3791000000 
 	13	0.0433819000  0.1084550000  99.4875000000 
 	14	0.0345022000  0.0862554000  99.5738000000 
 	15	0.0299203000  0.0748007000  99.6486000000 
 	16	0.0248857000  0.0622141000  99.7108000000 
 	17	0.0167901000  0.0419752000  99.7528000000 
 	18	0.0161633000  0.0404083000  99.7932000000 
 	19	0.0128898000  0.0322246000  99.8254000000 
 	20	0.0113884000  0.0284710000  99.8539000000 
 	21	0.0078481400  0.0196204000  99.8735000000 
 	22	0.0071527800  0.0178820000  99.8914000000 
 	23	0.0067661400  0.0169153000  99.9083000000 
 	24	0.0053149500  0.0132874000  99.9216000000 
 	25	0.0051102600  0.0127757000  99.9344000000 
 	26	0.0047116600  0.0117792000  99.9461000000 
 	27	0.0036193700  0.0090484300  99.9552000000 
 	28	0.0033222000  0.0083054900  99.9635000000 
 	29	0.0030722400  0.0076806100  99.9712000000 
 	30	0.0026373300  0.0065933300  99.9778000000 
 	31	0.0020996800  0.0052492000  99.9830000000 
 	32	0.0019376500  0.0048441200  99.9879000000 
 	33	0.0015642300  0.0039105700  99.9918000000 
 	34	0.0009679080  0.0024197700  99.9942000000 
 	35	0.0009578000  0.0023945000  99.9966000000 
 	36	0.0007379780  0.0018449400  99.9984000000 
 	37	0.0006280250  0.0015700600  100.000000000
 	38	0.0000000040  0.0000000099  100.000000000 
 	39	0.0000000001  0.0000000003  100.000000000 
	40	0.0000000008  0.0000000019  100.000000000 

This matrix can be found in the texture_EV.dat file.  
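
The relation between the eigenvalues and the inertia percentages in the
table can be sketched as below. The total inertia of 40 is an
assumption consistent with the table: the 40 listed eigenvalues sum to
approximately 40, as expected for 40 attributes of unit variance. Only
the first six eigenvalues are copied here, and the function names are
illustrative.

```python
# First six eigenvalues copied from the table above.
EIGENVALUES = [30.2675, 3.65125, 2.2937, 1.70397, 0.671654, 0.501529]

# ASSUMPTION: total inertia is 40 (one unit per standardized attribute).
TOTAL_INERTIA = 40.0


def inertia_percentages(eigenvalues, total):
    """Inertia percentage carried by each eigenvalue."""
    return [100.0 * ev / total for ev in eigenvalues]


def components_needed(eigenvalues, total, target_percent):
    """Smallest number of leading components reaching the target inertia.

    Returns None if the listed eigenvalues never reach the target.
    """
    cumulative = 0.0
    for count, ev in enumerate(eigenvalues, start=1):
        cumulative += 100.0 * ev / total
        if cumulative >= target_percent:
            return count
    return None


print(components_needed(EIGENVALUES, TOTAL_INERTIA, 90.0))  # 3
```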

Discriminant Factorial Analysis (DFA) can be applied to a learning
database where each learning sample belongs to a particular class
[Duda73].  The number of discriminant features selected by DFA is
determined by the number of classes (c) and the number of input
dimensions (d): it equals the minimum of d and c-1.
In the usual case where d is greater than c, the output dimension is
the number of classes minus one, and the discriminant axes are
selected to maximize the between-class variance and to minimize the
within-class variance.
The discrimination power (ratio of the projected between-class variance
to the projected within-class variance) is not the same for each
discriminant axis: it decreases from one axis to the next. So for a
problem with many classes, this preprocessing will not always be
efficient, as the last output features will be less discriminant. This
analysis uses the inverse of the global covariance matrix, so the
covariance matrix must be well conditioned (for example, a preliminary
PCA can be applied to remove the linearly correlated dimensions).
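
The two quantities discussed above can be sketched in Python. The
variance definitions used here (sums of squared deviations, with the
between-class term weighted by class size) are one common convention
and an assumption; this file does not give the exact formula. The
function names are illustrative, and the demo values are synthetic.

```python
def dfa_output_dim(d, c):
    """Number of discriminant features: the minimum of d and c - 1."""
    return min(d, c - 1)


def discrimination_power(values, labels):
    """Between-class over within-class variance for values on one axis.

    ASSUMPTION: both terms are sums of squared deviations; the between-class
    term is weighted by the number of patterns in each class.
    Raises ZeroDivisionError if every class is perfectly concentrated.
    """
    overall_mean = sum(values) / len(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    between = 0.0
    within = 0.0
    for members in groups.values():
        class_mean = sum(members) / len(members)
        between += len(members) * (class_mean - overall_mean) ** 2
        within += sum((v - class_mean) ** 2 for v in members)
    return between / within


print(dfa_output_dim(18, 11))  # 10
print(discrimination_power([0.0, 2.0, 10.0, 12.0], [1, 1, 2, 2]))  # 25.0
```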

Discriminant Factorial Analysis (DFA) has been applied to the first 18
principal components of the texture_PCA database (that is, keeping
only the first 18 attributes of that database before applying
the DFA preprocessing) in order to build the texture_DFA.dat.Z
database file, which has 10 dimensions (the texture database having 11
classes). For the texture database, experiments showed that
DFA preprocessing is very useful and, most of the time, improved
classifier performance.


[Duda73]
Duda, R.O. and Hart, P.E.,
Pattern Classification and Scene Analysis,
John Wiley & Sons, 1973.