Tips & Tricks
Choosing a Project
Is the Data Predictable?
For a project to be successful, there must be some relationship between the input values
and the field you are trying to predict. The neural network is able to attach low
weights to inputs that are not important. If there is no relationship between your inputs
and the target prediction, BackProp will learn to make all of its predictions near the average
of the target values in the training set. SFAM will simply memorize the training set by growing
a node for almost every record.
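The BackProp behaviour described above can be illustrated with a small sketch. It does not use NeuNet Pro: it assumes a single linear unit trained by gradient descent on squared error as a simplified stand-in for a back-propagation network, and the data, the `train` and `predict` helpers and all parameter values are hypothetical.

```python
import random
import statistics

def predict(weights, bias, row):
    """Output of a single linear unit for one record."""
    return bias + sum(w * x for w, x in zip(weights, row))

def train(inputs, targets, epochs=300, rate=0.1):
    """Batch gradient descent on mean squared error (simplified stand-in)."""
    n_records, n_inputs = len(inputs), len(inputs[0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_inputs, 0.0
        for row, target in zip(inputs, targets):
            error = predict(weights, bias, row) - target
            grad_b += error
            for i, x in enumerate(row):
                grad_w[i] += error * x
        bias -= rate * grad_b / n_records
        weights = [w - rate * g / n_records for w, g in zip(weights, grad_w)]
    return weights, bias

# Hypothetical data: the inputs are generated independently of the targets,
# so there is no relationship for the model to learn.
rng = random.Random(0)
inputs = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(400)]
targets = [rng.gauss(10.0, 2.0) for _ in range(400)]

weights, bias = train(inputs, targets)
predictions = [predict(weights, bias, row) for row in inputs]

print("mean of targets:      ", statistics.mean(targets))
print("mean of predictions:  ", statistics.mean(predictions))
print("spread of targets:    ", statistics.pstdev(targets))
print("spread of predictions:", statistics.pstdev(predictions))
```

In this sketch the predictions should cluster tightly around the average target value, while the targets themselves vary much more widely. A similar pattern in your own project is a hint that the inputs may carry little information about the prediction field.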
How Much Data Do I Need?
- There is no firm answer to this question.
It is best to experiment with project configuration and various splits of training
set and testing set. In some cases, predictions may be learned using a training
set of only a dozen records. In other cases, thousands of training records may be needed.
- It depends on how closely the target prediction is related to your input values,
and the complexity of this relationship.
- Is your data very crisp (clean), or does it contain contradictions and anomalies (dirty)?
It might take several clean records to overcome the effect of one anomaly.
If NeuNet Pro is used for data mining and anomaly detection, it is possible to separate
the clean data from the dirty data.
- It depends on how many input fields you are using. Usually it is best to
include any input field you think might be useful, and then let the neural network
decide exactly how useful. However, if you include a large number of unnecessary inputs,
there is an increased chance the neural network will discover some spurious correlation.
This spurious correlation can be averaged out by increasing the number of training records.
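The last point, that spurious correlations fade as training records are added, can be checked with a short experiment. This is a minimal sketch on purely random data: the helper names, the choice of 50 inputs, and the record counts of 20 and 2000 are assumptions made for illustration.

```python
import random
import statistics

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def strongest_spurious_correlation(n_records, n_inputs, rng):
    """Largest |correlation| between any random input column and a random target."""
    target = [rng.gauss(0, 1) for _ in range(n_records)]
    best = 0.0
    for _ in range(n_inputs):
        column = [rng.gauss(0, 1) for _ in range(n_records)]
        best = max(best, abs(correlation(column, target)))
    return best

rng = random.Random(1)
few = strongest_spurious_correlation(n_records=20, n_inputs=50, rng=rng)
many = strongest_spurious_correlation(n_records=2000, n_inputs=50, rng=rng)

print("strongest chance correlation with 20 records:  ", few)
print("strongest chance correlation with 2000 records:", many)
```

Every input here is unrelated to the target by construction, so any correlation found is spurious. With only a few records, at least one of the many useless inputs tends to look meaningful; with many records, the strongest chance correlation shrinks toward zero.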
Consider Splitting Your Data into Smaller Projects
- Suppose you are trying to predict the selling price of used vehicles,
and your database includes a mixture of trucks, cars and bicycles.
- You could establish three new input fields (Trucks, Cars, Bicycles), coded so that
a truck is "1,0,0", a car is "0,1,0" and a bicycle is "0,0,1".
The entire database could then be trained as one combined project.
- It would be better to split the above database into three independent data tables, one for each vehicle type.
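The two options above can be sketched as follows. The record layout (`type`, `age_years`, `price`) and the helper names are hypothetical, and the code only prepares the data; training would still be done in your neural network tool.

```python
from collections import defaultdict

VEHICLE_TYPES = ["Trucks", "Cars", "Bicycles"]

def one_hot(vehicle_type):
    """Option 1: encode the vehicle type as three 0/1 input fields."""
    return [1 if vehicle_type == name else 0 for name in VEHICLE_TYPES]

def split_by_type(records):
    """Option 2: split one mixed table into one table per vehicle type."""
    tables = defaultdict(list)
    for record in records:
        tables[record["type"]].append(record)
    return dict(tables)

# Hypothetical used-vehicle records.
records = [
    {"type": "Trucks", "age_years": 6, "price": 14500},
    {"type": "Cars", "age_years": 3, "price": 11200},
    {"type": "Bicycles", "age_years": 2, "price": 240},
    {"type": "Cars", "age_years": 9, "price": 3900},
]

# Option 1: one combined project with three extra input fields.
combined = [one_hot(r["type"]) + [r["age_years"], r["price"]] for r in records]
print(combined[0])

# Option 2: three smaller projects, one per table.
tables = split_by_type(records)
for name, rows in tables.items():
    print(name, len(rows), "records")
```

Splitting lets each project learn a price relationship for one kind of vehicle only, instead of asking a single network to cover price ranges as different as trucks and bicycles.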
A Complete Neural Network Development System
CorMac Technologies Inc.
34 North Cumberland Street ~ Thunder Bay ON P7A 4L3 ~ Canada