Over the past couple of days, I have been training cascade classifiers to detect various types of electronic components, such as resistors, capacitors, and transistors. I am using OpenCV and training on a set of positive and negative images. However, the classifier overfit the training set, and as a result it performs pretty badly on my test data. I am working on tweaking the model to get around this, so if all goes well, I should have a pretty good classifier built in the next week. My theory is that the positive images each showed only a single discrete component on a white background, whereas my negative images were pulled from random image hosting sites with a script I wrote and were a lot more chaotic. The classifier may therefore have learned to key on "simple shape on a plain white background" rather than on the components themselves, which would explain the many false positives I am seeing, such as a simple logo on a white t-shirt.
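One way to test the white-background theory would be to paste each component onto a cluttered background before training, so that the plain white surround stops being a reliable cue. The following is only a rough sketch of that idea using NumPy, not the code I actually trained with. The function name `composite_on_background`, the `white_threshold` value of 240, and the assumption that every near-white pixel belongs to the background are all hypothetical choices made for illustration.

```python
import numpy as np


def composite_on_background(component, background, white_threshold=240, rng=None):
    """Paste a white-background component image onto a random spot of a background.

    component, background: uint8 arrays of shape (height, width, 3).
    Pixels whose three channels are all >= white_threshold are assumed to be
    the white backdrop and are left transparent (hypothetical heuristic).
    Returns the composited image and the (x, y, w, h) box of the pasted component.
    """
    rng = np.random.default_rng() if rng is None else rng
    ch, cw = component.shape[:2]
    bh, bw = background.shape[:2]
    if ch > bh or cw > bw:
        raise ValueError("component must fit inside background")

    # Pick a random top-left corner so the component lies fully inside the background.
    y = int(rng.integers(0, bh - ch + 1))
    x = int(rng.integers(0, bw - cw + 1))

    # True wherever at least one channel is darker than the threshold,
    # i.e. wherever the pixel is not near-white.
    mask = (component < white_threshold).any(axis=2)

    out = background.copy()
    region = out[y:y + ch, x:x + cw]  # view into `out`, so writes modify `out`
    region[mask] = component[mask]
    return out, (x, y, cw, ch)
```

The returned box could be written to an annotation file alongside the composited image. This heuristic would also punch holes in components that are themselves near-white, so a proper alpha mask would be safer where one is available. OpenCV's `opencv_createsamples` tool can do a similar kind of background compositing and may be the better option in practice.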
I will try to make some optimizations and see if I can get the code to run any faster, but between these classifiers and some work for a competition on Kaggle, my computer is pretty overloaded.