=What You Could Use This For

Just in case you don't have a clue what machine learning or classification is, here's a quick example scenario and an explanation of the process. The most popular task is spam identification. To do this you'll first need a set of training documents. This would consist of a number of documents which you have labeled as either spam or not. With training sets, bigger is better. You should probably have at least 100 of each type (spam and not spam). Really 1,000 of each type would be better and 10,000 of each would be super sweet.

Once you have the training set the process with this library flows like this:

* Create each as a Document (a class in this library)
* Pass those documents into the FeatureSelector
* Get the best features and pass those into the FeatureExtractor
* Now extract features from each document using the extractor and
* Pass those extracted features to NaiveBayes as part of the training set
* Now you can save the FeatureExtractor and NaiveBayes to a file

That represents the process of selecting features and training the classifier. Once you've done that you can predict if a new previously unseen document is spam or not by just doing the following:

* Load the feature extractor and naive bayes from their files
* Create a new document object from your new unseen document
* Extract the features from that document using the feature extractor and
* Pass those to the classify method of the naive bayes classifier
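The two lists above can be sketched end to end in plain Ruby. This is only an illustration of the flow, not this library's API: the Sketch* classes, their method names, the :spam and :ham labels, and the tiny training set are all hypothetical stand-ins. The selector here simply keeps tokens by document frequency, the classifier is a multinomial naive Bayes with add-one smoothing, and Marshal stands in for whatever save/load mechanism you actually use.

```ruby
require 'set'

# Hypothetical stand-ins for illustration; NOT the library's real classes.
class SketchDocument
  attr_reader :tokens, :label

  def initialize(text, label = nil)
    @tokens = text.downcase.scan(/[a-z']+/) # naive word tokenizer
    @label = label
  end
end

# Keeps tokens that occur in at least `min_docs` training documents.
class SketchFeatureSelector
  def initialize
    @doc_counts = Hash.new(0)
  end

  def add_document(doc)
    doc.tokens.uniq.each { |token| @doc_counts[token] += 1 }
  end

  def best_features(min_docs = 1)
    @doc_counts.select { |_, count| count >= min_docs }.keys
  end
end

# Reduces a document to the tokens that were selected as features.
class SketchFeatureExtractor
  def initialize(features)
    @features = features.to_set
  end

  def extract(doc)
    doc.tokens.select { |token| @features.include?(token) }
  end
end

# Multinomial naive Bayes with add-one (Laplace) smoothing.
class SketchNaiveBayes
  def initialize
    @doc_counts = Hash.new(0) # label => number of training documents
    @feature_counts = {}      # label => { feature => count }
    @vocabulary = Set.new
  end

  def add_document(label, features)
    @doc_counts[label] += 1
    counts = (@feature_counts[label] ||= Hash.new(0))
    features.each do |feature|
      counts[feature] += 1
      @vocabulary << feature
    end
  end

  # Returns the label with the highest log-probability score.
  def classify(features)
    total_docs = @doc_counts.values.sum.to_f
    @doc_counts.keys.max_by do |label|
      counts = @feature_counts[label]
      denominator = counts.values.sum + @vocabulary.size
      log_prior = Math.log(@doc_counts[label] / total_docs)
      features.sum(log_prior) { |f| Math.log((counts[f] + 1.0) / denominator) }
    end
  end
end

# Training: documents -> selector -> extractor -> classifier.
training = [
  SketchDocument.new("win free money now", :spam),
  SketchDocument.new("free prize claim your money", :spam),
  SketchDocument.new("meeting notes for the project", :ham),
  SketchDocument.new("lunch with the project team", :ham)
]

selector = SketchFeatureSelector.new
training.each { |doc| selector.add_document(doc) }
extractor = SketchFeatureExtractor.new(selector.best_features)

classifier = SketchNaiveBayes.new
training.each { |doc| classifier.add_document(doc.label, extractor.extract(doc)) }

# Save and reload (in memory here; File.binwrite / File.binread would persist it).
saved = Marshal.dump([extractor, classifier])
extractor, classifier = Marshal.load(saved)

# Prediction on a previously unseen document.
unseen = SketchDocument.new("claim your free money")
prediction = classifier.classify(extractor.extract(unseen))
puts prediction
```

With only four training documents this is a toy; as noted above, a real training set should have at least 100 documents of each type.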

Install:

  gem install basset -v 1.0.1

Paul Dix, Bryan Helmkamp



Gemfile:

  gem 'basset', '~> 1.0.1'

Versions:

  1. 2.0.1 September 27, 2009 (5 KB)
  2. 1.0.1 January 8, 2008 (14 KB)
  3. 1.0.0 January 8, 2008 (14 KB)