VisiRex 2.0 for Windows
"Frequently Asked Questions"


What is Inductive Rule Extraction?

Inductive Rule Extraction is related to the fields of Machine Learning, Knowledge Discovery, Expert Systems and Artificial Intelligence. Rule Extraction is sometimes called "Decision Tree Classification". The method depends on the concept of "Entropy", a term used by scientists to measure the amount of randomness, disorder, or uncertainty in a population. The use of entropy in the field of Information Theory was introduced by Claude Shannon in 1948.

-Entropy = P * log2(P) + Q * log2(Q)
(where P and Q = 1 - P are the proportions of the two classes)

The following example gives a brief demonstration of how entropy can be used to extract rules from a database.
Imagine a large database of medical patients that contains only one field, whose value is either "Sick" or "Healthy". It is your job to make a diagnosis using any information you can extract from this single-column database.

Now suppose you notice that all patients are shown as being healthy. The proportion of healthy is 100% (P=1), while the proportion sick is 0% (Q=0).
For this example, -Entropy = 1 * log2(1) + 0 * log2(0), so Entropy = 0.0 (taking 0 * log2(0) = 0, by convention),
implying no randomness, so a diagnosis of "healthy" could be made with high confidence.

Likewise, if all patients were shown as sick, the proportion of healthy is 0% (P=0), while the proportion sick is 100% (Q=1).
For this example, -Entropy = 0 * log2(0) + 1 * log2(1), so Entropy = 0.0,
implying no randomness, so a diagnosis of "sick" could be made with high confidence.

The worst case occurs when half the patients are healthy and half are sick. The proportion of healthy is 50% (P=.5), while the proportion sick is 50% (Q=.5).
For this example, -Entropy = .5 * log2(.5) + .5 * log2(.5), so Entropy = 1.0,
implying total randomness. This database contains zero information regarding the diagnosis.

Notice how the value of entropy for a two-class population always lies somewhere between zero and one.
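The three cases above can be checked with a short Python sketch (the `entropy` helper here is illustrative only, not part of VisiRex):

```python
import math

def entropy(p):
    """Binary (Shannon) entropy of a population where a fraction p
    belongs to one class and q = 1 - p to the other."""
    q = 1.0 - p
    total = 0.0
    for x in (p, q):
        if x > 0:                     # by convention, 0 * log2(0) = 0
            total -= x * math.log2(x)
    return total

print(entropy(1.0))   # all healthy -> 0.0
print(entropy(0.0))   # all sick    -> 0.0
print(entropy(0.5))   # half/half   -> 1.0
```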

Now suppose the above database is widened to include fields for [Age] ("Young" or "Old") and [Wealth] ("Rich" or "Poor").
You determine that 80% of the patients are young and 20% are old. Try splitting the data into two portions based on age, and calculate the entropy of diagnosis for each portion.
Say the entropy of the "Young" portion = .65 and the entropy of the "Old" portion = .55.
The combined overall entropy for [Age] is the weighted average: (80% * .65) + (20% * .55) = .63

Next you determine that 60% are poor and 40% are rich. Try splitting the data into two portions based on wealth, and calculate the entropy of diagnosis for each portion.
Say the entropy of the "Poor" portion = .45 and the entropy of the "Rich" portion = .05.
The combined overall entropy for [Wealth] is the weighted average: (60% * .45) + (40% * .05) = .29

The lower (.29) entropy on [Wealth] versus the higher (.63) entropy on [Age] tells you that [Wealth] contains more information about the diagnosis than [Age]. Therefore, you should use [Wealth] as the first branch of your decision tree.
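The weighted-average comparison can be verified with a short sketch (the per-portion entropies and percentages are the hypothetical figures from the example above):

```python
# (weight, entropy) pairs for each portion of the split
age_split    = [(0.80, 0.65), (0.20, 0.55)]   # Young, Old
wealth_split = [(0.60, 0.45), (0.40, 0.05)]   # Poor, Rich

def weighted_entropy(split):
    """Combined overall entropy of a split: size-weighted average
    of the entropy of each portion."""
    return sum(weight * ent for weight, ent in split)

print(round(weighted_entropy(age_split), 2))     # -> 0.63
print(round(weighted_entropy(wealth_split), 2))  # -> 0.29, so split on [Wealth] first
```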

You continue this procedure for further tree building, always choosing the fields with the least entropy as the uppermost branches of the evolving tree.
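As an illustrative sketch only (not the VisiRex implementation), this greedy procedure can be written as a recursive function that repeatedly splits on the field with the lowest weighted entropy. The toy patient records below are made up for the demonstration:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def weighted_split_entropy(rows, field, target):
    """Entropy of the target after splitting rows on field,
    weighted by the size of each portion."""
    n = len(rows)
    groups = {}
    for row in rows:
        groups.setdefault(row[field], []).append(row[target])
    return sum(len(g) / n * entropy(g) for g in groups.values())

def build_tree(rows, fields, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not fields:   # pure node, or no fields left
        return Counter(labels).most_common(1)[0][0]
    best = min(fields, key=lambda f: weighted_split_entropy(rows, f, target))
    remaining = [f for f in fields if f != best]
    tree = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        tree[value] = build_tree(subset, remaining, target)
    return {best: tree}

# Toy data in the spirit of the medical example (values are invented)
patients = [
    {"Age": "Young", "Wealth": "Rich", "Diagnosis": "Healthy"},
    {"Age": "Young", "Wealth": "Poor", "Diagnosis": "Sick"},
    {"Age": "Old",   "Wealth": "Rich", "Diagnosis": "Healthy"},
    {"Age": "Old",   "Wealth": "Poor", "Diagnosis": "Sick"},
]
print(build_tree(patients, ["Age", "Wealth"], "Diagnosis"))
# -> {'Wealth': {'Poor': 'Sick', 'Rich': 'Healthy'}}
```

In this toy data [Wealth] perfectly separates the diagnoses (weighted entropy 0) while [Age] carries no information (weighted entropy 1), so the sketch branches on [Wealth] first, just as in the worked example.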

Choosing between BackProp, SFAM and VisiRex

Many projects can be successfully trained using any of these three methods.
BackProp uses numeric inputs to predict a numeric value (i.e. a number).
SFAM uses numeric inputs to predict a class or category (i.e. a word).
VisiRex uses both numeric and text inputs to discover rules contained within data. The discovered rules are used to predict a class or category (i.e. a word).

It is possible to convert numeric predictions to class predictions by dividing the target numbers into ranges. For example: "HIGH, MEDIUM, LOW".
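A minimal sketch of such a conversion, with hypothetical cutoff values:

```python
def to_class(value, low=33.0, high=66.0):
    """Map a numeric prediction to a class label by dividing the
    target range into bands (cutoffs here are invented for illustration)."""
    if value < low:
        return "LOW"
    if value < high:
        return "MEDIUM"
    return "HIGH"

print([to_class(v) for v in (12.0, 50.0, 90.0)])  # -> ['LOW', 'MEDIUM', 'HIGH']
```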
BackProp may be used for classification, especially if the data is dirty. However, you must show your classes as numeric values. Try to avoid using SFAM if your data contains any blatant contradictions.
BackProp tends to filter out noisy anomalies, while SFAM tends to mimic the noise.
In some cases, the data can be cleaned using BackProp, then the project can be completed using SFAM.
In general, BackProp is a more powerful method. For example, if the prediction depends on some mathematical relationship between a number of input fields, BackProp will model this relationship.
SFAM tends to provide a simple fuzzy lookup table (mapping) based on similarity of coordinates within the multidimensional input space.
VisiRex should be used if you wish to discover the rules behind the data, for example which inputs are most important in combination with other inputs.

CorMac Technologies Inc.
28 North Cumberland Street ~ Thunder Bay ON P7A 4K9 ~ Canada