Neural Network Software - Frequently Asked Questions

Our programs require your data to be in Access 97 format (not Access 2000). The sample data installed with our programs and the sample data downloaded from our web site shows .nns (neunet sample). All these .nns files are really Access97.mdb files.

If you have data in Access 2000 format, you can convert your .mdb file to Access 97 by using Tools> database utilities> convert database> To Access 97. Remember to set a primary key field before performing the conversion. Remember to keep a backup of the original 2000 format before performing the conversion.

It is not necessary to own any version of Access to use our products. The MS-Access 97 engine installs with all our products, so any Access97.mdb file may be used as your data source. Also, our programs support ODBC connection to many databases.

Our File> Import menu will create an Access97.mdb file from any comma-separated text file.

Setting Prediction Decimal Places in SDK

The NeuNet Pro SDK allows users to set the desired number of decimal places in the Java Applet by passing a parameter. Refer to the page source on our Java Applets demonstrations to see how this is done.
The automatic generation of a JavaScript web page does not offer this feature. If you wish to control the decimal places, modify one of the generated functions and add the new function as shown below.
function DisplayPrediction() {
   // this example chops to 2 decimal places
   document.InputForm.Output.value = chopDecimals(GetPrediction(),2)
}

// Chop (truncate, not round) a number to the requested decimal places.
// Assumes plain decimal notation (no exponent such as 1e-7).
function chopDecimals(aNumber, decimals) {
    var str = "" + aNumber;
    var b = str.lastIndexOf(".");

    // add decimal point if needed
    if (b < 0) {
        b = str.length;
        str += ".";
    }

    // pad with extra zeros if needed
    for (var i = 0; i < decimals; i++) str += "0";

    // with zero decimals, drop the decimal point as well
    if (decimals <= 0) return str.substring(0, b);

    // chop to size
    return str.substring(0, b + 1 + decimals);
}
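
If rounding is acceptable instead of chopping, the standard JavaScript method toFixed gives a shorter alternative. This is only a sketch and is not part of the page generated by the SDK; the helper name roundDecimals is our own invention. Note that it rounds, so 3.149 becomes "3.15" where chopDecimals would give "3.14".

```javascript
// Hypothetical alternative to chopDecimals: rounds instead of truncating.
function roundDecimals(aNumber, decimals) {
    // toFixed returns a string with exactly `decimals` digits after the point
    return Number(aNumber).toFixed(decimals);
}
```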

Medical Diagnosis

Adrian Fox has asked about the reliability of medical diagnosis and creation of medical databases.
A neural net does not know whether it is dealing with medical, financial or horse racing data. It is just crunching numbers to make a prediction. I will talk about the reliability of neural nets in general.
BACKPROPAGATION:
I often regard a backprop neural net as just an enhanced version of classical multi-variable regression. For example, multi-variable regression may produce an equation that looks like this:

Probability of heart attack =
B0 + B1*Age + B2*Cholesterol + B3*BloodSugar + B4*PulseRate

Where B0...B4 are fixed numbers.
The big advantage of a backprop neural net is the values B0...B4 are allowed to "float" depending on the current values of the input factors. Please refer below to "Importance of Any One Input" for further discussion of this effect. I believe backprop will always outperform multi-variable regression.
SFAM:
I often regard SFAM neural nets as providing a "fuzzy lookup" into the training database. For example, if the medical tests for a new patient are given to SFAM, it will return the diagnosis of the most similar patient in the training data.
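The "fuzzy lookup" idea can be pictured with a plain nearest-neighbour search. The sketch below is not the SFAM algorithm itself, only an illustration of returning the label of the most similar training row. The function name, the row layout and the two sample patients are all made up for this example, and the inputs are assumed to be already scaled to a common range.

```javascript
// Illustrative nearest-neighbour lookup; NOT the actual SFAM algorithm.
// Each training row is { inputs: [numbers...], label: "diagnosis" }.
function nearestLabel(trainingRows, newInputs) {
    var best = null;
    var bestDist = Infinity;
    for (var i = 0; i < trainingRows.length; i++) {
        var row = trainingRows[i];
        var dist = 0;
        // squared Euclidean distance between the new case and this row
        for (var j = 0; j < newInputs.length; j++) {
            var d = row.inputs[j] - newInputs[j];
            dist += d * d;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = row.label;
        }
    }
    return best;
}

// Made-up patients with inputs already scaled to the range 0..1
var patients = [
    { inputs: [0.2, 0.3], label: "healthy" },
    { inputs: [0.8, 0.9], label: "at risk" }
];
var diagnosis = nearestLabel(patients, [0.75, 0.8]); // closest to the second row
```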
VISIREX:
Visual Rule Extraction will produce the most efficient flow chart to make a prediction for the given input data. The human act of viewing this flow chart may provide researchers with important clues about relationships within the data. For example, perhaps cholesterol is not important unless the patient also suffers low blood sugar. The VisiRex flow chart is able to suggest the optimal sequence for medical tests. Depending on the results of the first test, further testing may be unnecessary. The total amount of medical tests performed (data collection) is minimized.
COMMENT ON RELIABILITY:
The accuracy of prediction depends on how clearly your data behaves in relation to the given inputs.
The EEG project described by Eberhart and Dobbins (ISBN 0-12-228640-5) defines an epileptic spike as an event that at least 4 of 6 neurologists voted to be a spike. Clearly this data contains some borderline cases which will cause difficulty for both artificial intelligence and human intelligence. This database installs as a sample when you install NeuNet Pro. When using backprop on this EEG data, I set 0 as negative and 1 as positive. Most of the data predicted very well as either 0 or 1. However, the borderline cases predicted as numbers like 0.8 or 0.2, thus revealing their uncertainty.
One danger when using neural nets is the accidental discovery of spurious correlations. This danger is reduced if you have plenty of data rows with few columns and the data is well-behaved.
The NeuNet Pro download area offers an assortment of medical databases. Breast Cancer, EEG and Heart databases seem to produce accurate diagnosis. Dermatology, Liver and Diabetes produce poor diagnosis. The NeuNet Pro scatter graph and confusion matrix provide a visual indication as to how well the data is predicting.
HINTS FOR STRUCTURING DATA:

Try to minimize the number of input fields. Include only inputs that are relevant. Avoid using two inputs that are measures of the same factor.
Try to maximize the number of input rows (cases) so you have plenty of data for both training and testing.
Try to avoid mixing data into the same database when the cases are vastly different. For example, if you have medical tests performed on both monkeys and humans, it is best to divide this data into two separate databases, or at least create a field that indicates monkey or human.
Shuffle your data rows into random order, so the training set is a true reflection of the data mix found in the testing set.

Choosing between BackProp, Sfam and VisiRex

Many projects can be successfully trained using any of these three methods.
BackProp uses numeric inputs to predict a numeric value (i.e. a number).
SFAM uses numeric inputs to predict a class or category (i.e. a word).
VisiRex uses both numeric and text inputs to discover rules contained within data. The discovered rules are used to predict a class or category (i.e. a word).

It is possible to convert numeric predictions to class predictions by dividing the target numbers into ranges. For example: "HIGH, MEDIUM, LOW".
BackProp may be used for classification, especially if the data is dirty. However, you must show your classes as numeric values. Try to avoid using SFAM if your data contains any blatant contradictions.
BackProp tends to filter out noisy anomalies, while SFAM tends to mimic the noise.
In some cases, the data can be cleaned using BackProp, then the project can be completed using SFAM.
In general, BackProp is a more powerful method. For example, if the prediction depends on some mathematical relationship between a number of input fields, BackProp will model this relationship.
SFAM tends to provide a simple fuzzy lookup table (mapping) based on similarity of coordinates within the multidimensional input space.
VisiRex should be used if you wish to discover the rules behind the data. For example, which inputs are most important in combination with other inputs.
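
The conversion from numeric predictions to classes mentioned above can be sketched in a few lines. The function name toClass and the boundary values are hypothetical; suitable ranges depend entirely on your own target field.

```javascript
// Convert a numeric prediction to a class label using range boundaries.
// The boundaries (lowMax, mediumMax) are hypothetical and project-specific.
function toClass(value, lowMax, mediumMax) {
    if (value <= lowMax) return "LOW";
    if (value <= mediumMax) return "MEDIUM";
    return "HIGH";
}
```

For example, with boundaries of 33 and 66, a prediction of 50 falls into "MEDIUM".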

Predicting a Time Series

There is a subtle difference between predicting and modeling.
The Stock Market example that installs with NeuNet Pro is an example of modeling. Today's stock market value is estimated using today's values for several indicators. The Neural Net learns from the historic relationship between these values, but only today's values are used to calculate today's estimate.
When predicting a time series, you might decide to include some previous "historic" values to be considered as inputs for today's prediction. Study the AirMiles example to see how the neural net inputs are arranged to accomplish this task. Notice how the same historic input numbers appear multiple times in a diagonal "plaid effect" as you scroll down through the data rows. This method works well for predicting any periodic math function.
HINT: When predicting a time series, arrange your data so the newer data is toward the bottom, so the graph will read from left to right.
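
The arrangement of historic values into input columns can be sketched as a sliding window over the series. This is only an illustration of the idea, not the way the AirMiles sample was actually built; the function name buildLaggedRows and the sample numbers are made up.

```javascript
// Build training rows from a single series (oldest value first).
// Each row holds `lags` consecutive past values as inputs, followed by
// the next value in the series as the prediction target.
function buildLaggedRows(series, lags) {
    var rows = [];
    for (var i = 0; i + lags < series.length; i++) {
        var inputs = series.slice(i, i + lags);
        rows.push({ inputs: inputs, target: series[i + lags] });
    }
    return rows;
}

// Illustrative series only; not taken from the AirMiles sample
var laggedRows = buildLaggedRows([10, 20, 30, 40, 50, 60], 3);
// laggedRows[0] is { inputs: [10, 20, 30], target: 40 }
// laggedRows[1] is { inputs: [20, 30, 40], target: 50 }
// laggedRows[2] is { inputs: [30, 40, 50], target: 60 }
```

Reading down the rows, each value shifts one column to the left, which is the diagonal "plaid effect" described above.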

Predicting on New Data

NeuNet Pro has several ways to predict on new data, once the training is complete:

New data can be appended into your original data file, then the TEST set can be moved to include this new data. Remember the value shown in the prediction field of the test set is not used in making the prediction. It is used only for comparison purposes. If you have no values to enter for the prediction field of the test set, just use any value at all.
New values can be temporarily typed into the yellow "interactive row" to experiment with the effect of various inputs.
The SDK can be used to add a predictions column to an Excel spreadsheet, Access database, VisualBASIC or C program.
The SDK can be used to automatically create an interactive JavaScript or Java web page for live predictions. The web page can be used locally on your hard drive, or posted onto the internet.

Importing Data

NeuNet Pro has an import function that will convert any comma-separated text to an MS-Access .mdb file. The simplest way to get data from Excel into NeuNet Pro is to export from Excel as .CSV and rename the file to .TXT. There are a few points to keep in mind:

The first row should contain field names separated by commas and no quotes
All data should be comma separated but not quote delimited
If there is any non-numeric data below the first row, that whole column will be assumed to be text
Any missing data will stop the import (with a message showing the row number)
NeuNet Pro insists that all source fields are non-null. If ,, is encountered the import is stopped with an error message. Rolf Blum has suggested that null values should be allowed. We will consider this in future versions.
The import takes comma-delimited text and creates an MS Access .mdb, which can then be used by NeuNet Pro. If you have Access, you can import directly. If you create your own .mdb, remember it must contain exactly one primary key field (usually a row counter). Also, there are ways to create a .mdb from Excel.
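
Before importing, a text file can be checked against the points listed above. The sketch below is not NeuNet Pro's importer; it is a hypothetical pre-check (the name checkCsv and the error wording are our own) that applies the same rules: no quotes, no missing values, and a column counts as text as soon as one non-numeric value appears in it.

```javascript
// Sketch of the import rules listed above; not NeuNet Pro's actual importer.
// Returns { ok: true, columnTypes: [...] } or { ok: false, error: "..." }.
// Row numbers count from 1, with the field names as row 1.
function checkCsv(text) {
    // ignore trailing line breaks, then split into lines
    var lines = text.replace(/(\r?\n)+$/, "").split(/\r?\n/);
    if (lines.length < 2) {
        return { ok: false, error: "need field names and at least one data row" };
    }
    if (text.indexOf('"') >= 0) {
        return { ok: false, error: "quotes are not allowed" };
    }

    var names = lines[0].split(",");
    // every column starts as numeric and is demoted to text when needed
    var types = names.map(function () { return "number"; });

    for (var r = 1; r < lines.length; r++) {
        var cells = lines[r].split(",");
        if (cells.length !== names.length) {
            return { ok: false, error: "wrong field count in row " + (r + 1) };
        }
        for (var c = 0; c < cells.length; c++) {
            var cell = cells[c].trim();
            if (cell === "") {
                return { ok: false, error: "missing data in row " + (r + 1) };
            }
            // any non-numeric value makes the whole column text
            if (isNaN(Number(cell))) types[c] = "text";
        }
    }
    return { ok: true, columnTypes: types };
}
```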

Confirmed Bug: Please ensure file names do not contain spaces when using Import.

Shuffling Data Rows

Here is a neat trick to randomly shuffle data rows using MS-Access. Shuffling is important to ensure your training data and testing data both truly reflect your data mix. Assume your table has a primary key field called "Row_ID".

1. Open your table in Access design view and create a new number field called "Order". Set format to Single.
2. Create and run a new update query, setting [Order] to Rnd([Row_ID]).
3. Create and run a new Make-Table query using Show for the * (all) fields. Set order-by and not-show for the [Order] field.
4. Open the new table in design mode and delete the old [Row_ID] field. Save the table.
5. Open the table again in design mode and create a new [Row_ID] field as auto-number and set as primary key.
6. If everything looks ok, you may return to design mode and delete the [Order] field.
The new table will now appear as a random shuffle of the original source table. If you wish to make additional shuffles of this same table, change step 2 to use Rnd([Row_ID]+x), where x is a number you pick for this particular shuffle.
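
If your rows are still in a text file or in a script rather than in Access, the same result can be reached with a standard Fisher-Yates shuffle. This sketch is an alternative to the Access steps above, not a description of anything NeuNet Pro does internally; the function name shuffleRows and its optional random parameter are our own.

```javascript
// Fisher-Yates shuffle: returns a new array with the rows in random order.
// `random` is an optional function returning a number in [0, 1);
// it defaults to Math.random.
function shuffleRows(rows, random) {
    var rand = random || Math.random;
    var result = rows.slice(); // copy, so the original order is preserved
    for (var i = result.length - 1; i > 0; i--) {
        var j = Math.floor(rand() * (i + 1)); // pick from positions 0..i
        var tmp = result[i];
        result[i] = result[j];
        result[j] = tmp;
    }
    return result;
}
```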

Importance of Any One Input

When using a neural net, the importance of any one input can fluctuate depending on the current value of other inputs. For example, imagine a database of 1000 football games that includes inputs wind speed and wind direction. When the wind direction is cross-field, the speed has no importance because the effect is equal on both teams. When the speed is low, the direction has no importance. Any discussion about the importance of wind speed is futile, because it has to be considered along with other inputs. The importance of wind speed can be very high in some cases and very low in other cases.
It is possible to study the NN weights attached to a particular input in order to assess the average importance of this input for this particular mix of data. For example if the 1000 football games included mostly games with cross wind or low speed, the weights would show that direction is not important. However, direction might be highly important in a minority of cases.
Another method is to nudge a particular input value up and down and observe the effect on the predicted value. If a 1% nudge causes a 10% change in prediction, this input is very important, but only for the current value of the other inputs.
The ability to use changing importance is a major advantage of neural nets over classical statistical regression.
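
The nudge method can be sketched as follows. The function sensitivity works with any prediction function; the toyModel used here is a made-up stand-in for a trained network (it is not a NeuNet model), chosen so that wind speed matters along the field and not across it.

```javascript
// Nudge one input by a small fraction and report the relative change
// in the prediction. `predict` is any function taking an array of inputs.
// Assumes the base prediction is not zero.
function sensitivity(predict, inputs, index, fraction) {
    var base = predict(inputs);
    var nudged = inputs.slice();
    nudged[index] = inputs[index] * (1 + fraction);
    var changed = predict(nudged);
    // relative change in prediction per relative change in input
    return ((changed - base) / base) / fraction;
}

// Hypothetical stand-in for a trained network: inputs are
// [windSpeed, windDirectionDegrees], where 90 degrees is cross-field.
function toyModel(inputs) {
    var radians = inputs[1] * Math.PI / 180;
    return 50 + inputs[0] * Math.cos(radians);
}

var alongField = sensitivity(toyModel, [20, 0], 0, 0.01);  // speed matters here
var crossField = sensitivity(toyModel, [20, 90], 0, 0.01); // speed has almost no effect
```

The same input (wind speed) gets a clearly non-zero score in one case and a score of practically zero in the other, which is the changing importance described above.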

Horse Racing, Dog Racing, Sports Handicapping

Many NeuNet users ask about using neural networks to predict horse races, dog races and sporting events. My thoughts and experiences are summarized in the following essay.

Stock Market Data

Gregory Peterson has asked for advice on where to find historical stock market data. Our download page has historical data in econdata.exe. Scroll down to "Economic Data". This file is all ready to use with the free level one NeuNet Pro.
I have used http://chart.yahoo.com/d as a source of historical quotes. Also http://www.stls.frb.org is a good history of USA economic indicators. If anyone has additional links, please post them in the discussion forum.

Prediction of Lotteries

This interesting subject has come up several times over the past years with NeuNet. I must confess that I have a bias against trying to predict random numbers.
I once tried an experiment where I created a very long sequence of random digits (1 to 9). I set up a NeuNet project where I attempted to predict the 20th digit based on the previous 19 digits. I expected this data would be totally unlearnable. When I began the backprop training, I was very amazed to see the prediction error was improving as training progressed. It was developing a system to predict random numbers!
Later when I looked at the results, I had a good laugh. NeuNet had learned to predict "5" for every prediction. The prediction of "5" minimizes the average error for predicting random digits between 1 and 9.
I have encountered this "lottery" effect several times when the target prediction has no relationship to the given inputs. Backprop tends to gravitate toward the median prediction value for the entire training set. SFAM tends to grow a new node for every training pattern, in effect developing a look-up system on the training set.
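The "always predict 5" result is easy to check by hand. The sketch below assumes the digits run from 1 to 9 as in the experiment, and assumes the error is measured as mean squared error; the exact error measure used during training is not stated here, so treat this as an illustration only.

```javascript
// Mean squared error of always predicting `guess` when the true value
// is equally likely to be any digit from 1 to 9.
function meanSquaredError(guess) {
    var total = 0;
    for (var digit = 1; digit <= 9; digit++) {
        total += (digit - guess) * (digit - guess);
    }
    return total / 9;
}

// Find the constant guess with the lowest error
var bestGuess = 1;
for (var g = 2; g <= 9; g++) {
    if (meanSquaredError(g) < meanSquaredError(bestGuess)) bestGuess = g;
}
// bestGuess ends up as 5, the middle of the range
```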
Any series of random numbers has a pattern to it. Intensive analysis will discover this pattern, but the discovery will be of no use for future predictions.
Now let's consider the possibility that lottery numbers are not random. They must be very close to random, or smart people would discover the pattern. Suppose the painting of "8" on a ping pong ball increases its weight so that it has an improved chance of being chosen. In this case, it seems your pick should consist of many 8's.
Suppose you came up with a system that doubled or tripled your chance of winning. Your chance might increase from 1 in a million to 3 in a million. You would need to purchase many million tickets in order to determine whether your results were due to your system or due to plain luck (statistical significance).

Predator / Prey Relationships

Predator/Prey relationships involve feedback from the past states to the current state. The mathematics of the underlying model resembles chaos theory. Depending on the parameter values, chaos equations will result in one of three behaviors: steady-state, cycles, or chaos. Predator/Prey relationships often result in population cycles. These cycles can be modeled by a neural network.
Suppose we wish to model the population of rabbits and we have a database showing the past annual population of foxes and rabbits. Now a high population of foxes in one year will tend to cause a low population of rabbits in the following year or two. Conversely, a low population of rabbits will tend to reduce the population of foxes in following years. A reduction in fox population will tend to increase the rabbit population. An extremely high rabbit population might result in over-browsing of the food supply, or spread of disease, which could cause a collapse in population numbers. Other factors such as precipitation and temperatures will also exert some influence.
The beauty of neural networks is that you are able to model the underlying process without having to derive the mathematics. You simply train the neural network to model a large sample of past history. The training database would look something like this:

YEAR,SNOW0,FOX5,FOX4,FOX3,FOX2,FOX1,FOX0,RAB5,RAB4,RAB3,RAB2,RAB1,RAB0,RAB+1
----------------------------------------------------------------------------
   1,   90,  24,  32,  48,  44,  31,  36, 117, 119, 133, 121, 142, 140, 124
   2,   78,  32,  48,  44,  31,  36,  52, 119, 133, 121, 142, 140, 124, 119
   3,   85,  48,  44,  31,  36,  52,  49, 133, 121, 142, 140, 124, 119, 107
   4,   97,  44,  31,  36,  52,  49,  41, 121, 142, 140, 124, 119, 107, 112
and so on for dozens of years...

where SNOW0 is the snowfall amount last winter
      FOX0 is the fox population this year
      FOX1 is the fox population one year ago
      FOX2 is the fox population two years ago ...
      RAB0 is the rabbit population this year
      RAB1 is the rabbit population one year ago
      RAB2 is the rabbit population two years ago ...
      RAB+1 is the rabbit population observed in the following year

 Notice how the numbers form a diagonal plaiding effect.
 This is typical for time series forecasting.

Now the neural network is trained to predict RAB+1, based on all the other columns. The result is a model able to produce a one-year forecast of rabbit population.

Multiple Predictions

The same database can be used to make multiple predictions. Simply configure a different project for each prediction you wish to make, and point all these projects to the same database.

Validation Sets and Over-Training

Regarding validation sets, the NeuNet Pro SPLIT control allows you to select any range of data to be used as testing, and this range can be moved without any re-training. Thus it is quite easy to select two TEST ranges and move between them, using one of the ranges for testing and the other for final validation.
There are subtle differences in the way a neural network is used in various projects. For example, if you are DATA MINING using anomaly detection for a given data set, you should set both the training set and testing set to cover all of the data. If you are MODEL BUILDING, you will try to determine today's value for the target field based on today's value for a group of other fields. This is the same as FUNCTION FITTING. Here you might select 1/2 or 2/3 of your data for training the model, then use the remaining data for testing. FORECASTING is similar to MODEL BUILDING, except you try to determine a future value based on present and past values.
Often it is useful to SHUFFLE your data to remove any bias or "drift", thus ensuring the training set is a fair representation of the testing set.
It is obvious that your model performance will be overly optimistic if any of the TEST data is included in the training set. It is less obvious that if you are tempted to look at the results on the testing set, then return to the training to improve the performance, you are actually cooking the model. This problem can be solved by setting aside a group of data to be used as a VALIDATION set. This validation set is used as a final test of the model performance. No additional "fine-tuning" of the model should be allowed once the VALIDATION set is tested. Use the SPLIT control to move from the TEST set to the VALIDATION set.
Regarding OVER-TRAINING, this can be a problem especially for noisy data. With crisp data that conforms exactly to some (unknown) mathematical formula, excess training seems only to improve the model's accuracy to more decimal places. However, with noisy (real-world) data, the neural net will learn the underlying process up to some critical point, then further accuracy becomes masked by the noise. The model begins to mimic the noise in the training data, which actually hurts the performance on the testing set. One suggestion to solve this problem is to frequently peek ahead into the performance of the testing set during the training process. You save the model that results in the best performance on the testing set instead of on the training set. In effect your testing set becomes a semi-training set. The final test is run on the validation set, which remains unseen during all training. In practice, we have found that NeuNet Pro will quickly train down to its best performance. Once the training error has levelled out, you find yourself spending hours trying to squeeze out that last little bit of error. At that point it is time to stop, because you are wasting time trying to memorize the noise in the training set.
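The "peek ahead" suggestion amounts to keeping the model from the training pass with the lowest testing-set error. The sketch below assumes you have recorded that error after each pass yourself; the function name bestPass and the error values are made up, and nothing here describes a built-in NeuNet Pro feature.

```javascript
// Given the testing-set error recorded after each training pass,
// return the index of the pass whose model should be kept.
function bestPass(testErrors) {
    var best = 0;
    for (var i = 1; i < testErrors.length; i++) {
        if (testErrors[i] < testErrors[best]) best = i;
    }
    return best;
}

// Made-up error values: testing error falls, then rises as the
// network starts to mimic noise in the training set
var keep = bestPass([0.30, 0.22, 0.18, 0.17, 0.19, 0.23]);
// keep is 3, the pass with the lowest testing error (0.17)
```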
Regarding RETRAINING, the evolved weights for the neural net are linked to each of your chosen input fields, which are normalized to the minimums and maximums found in the entire dataset. If you wish to try different input fields, or if new minimums/maximums appear, the previous weights are useless for the new project. NeuNet Pro will warn you that all previous training will be lost. If you wish to maintain several different configurations, you can create a new project for each configuration. That way you can always get back to some previous configuration.
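A small sketch shows why a new minimum or maximum invalidates the old weights. It assumes simple min/max scaling to the range 0..1; the exact scaling used inside NeuNet Pro is not stated here, and the function name normalize is our own.

```javascript
// Scale a value into the range 0..1 using the column's minimum and maximum.
// The 0..1 target range is an assumption; the source does not state it.
function normalize(value, min, max) {
    if (max === min) return 0; // constant column: avoid dividing by zero
    return (value - min) / (max - min);
}

var before = normalize(50, 0, 100); // 0.5 with the original range
var after = normalize(50, 0, 200);  // 0.25 once a new maximum of 200 appears
```

The same raw value of 50 reaches the network as a different number once the range changes, so weights trained on the old scaling no longer apply.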
The TESTING range can be moved or adjusted at any time without any retraining of the neural network. However, any change to the TRAINING range (including addition of new training data) will necessitate retraining of the whole neural network. We have experimented with keeping the old weights when the size of the training set is changed. The network continues to produce useful predictions, but the whole process becomes somewhat murky. The network will adjust to reflect the character of the new siblings, but the older siblings tend to exert more influence because they have received more training cycles and they probably outnumber the newly added siblings. We decided it was best to simply retrain the entire network, so all data retains an even influence. Most projects train very quickly, so this need for retraining is not a problem.

CorMac Technologies Inc.
34 North Cumberland Street ~ Thunder Bay ON P7A 4L3 ~ Canada