Precision, recall, fmeasure, accuracy calculator

***  UPDATE 21 October 2010 ***

Fixed a bug in the classification.

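I recently wrote a simple C++ program, born of my own machine learning needs, that computes the accuracy, precision, recall and F-measure of a binary classifier from a two-column text file.

The first column is the real class; the second column is the class predicted by whatever predictor you are evaluating.

For the binary classification task, I assume that a value <= 0 is negative and a value > 0 is positive, in both columns.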
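The sign convention above decides which of the four outcomes each input row counts as. Here is a minimal sketch of that mapping; the names `Outcome` and `classify` are hypothetical and do not appear in the program itself:

```cpp
// Hypothetical names for illustration; the original program uses plain counters.
enum class Outcome { TruePositive, FalsePositive, TrueNegative, FalseNegative };

// Map one (real, predicted) pair to its outcome:
// a value > 0 is positive, a value <= 0 is negative, in both columns.
Outcome classify(double real, double predicted) {
    const bool realPositive = real > 0.0;
    const bool predictedPositive = predicted > 0.0;
    if (realPositive)
        return predictedPositive ? Outcome::TruePositive : Outcome::FalseNegative;
    return predictedPositive ? Outcome::FalsePositive : Outcome::TrueNegative;
}
```

For instance, a row "1 -0.3" would count as a false negative, and a row "0 0" as a true negative, because zero is treated as negative.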

You can use the program in a Unix shell pipeline as follows:

cat test.txt | ./fmeasure

or by redirecting its standard input:

./fmeasure < test.txt

It prints the following statistics to stdout:

Total examples = 40320
TP = 10521 FP = 4067
TN = 1163 FN = 24569
Precision = 0.721209
Recall = 0.299829
Accuracy = 0.289782
FMeasure = 0.423568

 

For the meaning of these symbols, see: http://en.wikipedia.org/wiki/Receiver_operating_characteristic

Compile it with:

g++ -O3 fmeasure.cpp -o fmeasure

Source code (fmeasure.cpp):


/* PARAMETERS CALCULATOR
*  Carlo Nicolini – September 2010
*  To use it, pipe a file into it, for example:
*  cat test.txt | ./fmeasure
*  or redirect its standard input:
*  ./fmeasure < test.txt
*  The first column holds the real values, the second the predicted values.
*  A value > 0 is positive, a value <= 0 is negative.
*/

#include <iostream>

using namespace std;

int main() {
    // Don't sync C++ and C I/O
    ios_base::sync_with_stdio(false);

    double label = 0, margin = 0;
    double precision = 0, recall = 0, accuracy = 0, fmeasure = 0;
    double truePositives = 0, trueNegatives = 0, falsePositives = 0, falseNegatives = 0;
    double totalLines = 0;

    // Read one (real, predicted) pair at a time; stop at end of input
    // or at the first token that is not a number.
    while (cin >> label >> margin)
    {
        if (label > 0.0)
        {
            if (margin > 0.0)
                truePositives += 1;
            else
                falseNegatives += 1;
        }
        else
        {
            if (margin > 0.0)
                falsePositives += 1;
            else
                trueNegatives += 1;
        }
        totalLines += 1;
    }

    // Guard every ratio against a zero denominator.
    if (truePositives + falsePositives > 0)
        precision = truePositives / (truePositives + falsePositives);
    if (truePositives + falseNegatives > 0)
        recall = truePositives / (truePositives + falseNegatives);
    // Accuracy is the fraction of correctly classified examples.
    if (totalLines > 0)
        accuracy = (truePositives + trueNegatives) / totalLines;
    if (precision + recall > 0)
        fmeasure = 2 * (precision * recall) / (precision + recall);

    cout << "Total examples = " << totalLines << endl;
    cout << "TP = " << truePositives << " FP = " << falsePositives << endl;
    cout << "TN = " << trueNegatives << " FN = " << falseNegatives << endl;
    cout << "Precision = " << precision << endl;
    cout << "Recall = " << recall << endl;
    cout << "Accuracy = " << accuracy << endl;
    cout << "FMeasure = " << fmeasure << endl;

    return 0;
}
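If the input may contain blank or malformed lines, one possible variant parses it line by line and skips anything that does not start with two numbers. This is only a sketch under that assumption. The `Confusion` struct and the `countFromStream` function are hypothetical names, and the counters are integers here rather than doubles:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper for illustration; not part of the original program.
struct Confusion { long tp = 0, fp = 0, tn = 0, fn = 0, skipped = 0; };

// Read "real predicted" pairs line by line. A line that does not start
// with two numbers is counted as skipped instead of ending the run.
Confusion countFromStream(std::istream& in) {
    Confusion c;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        double real = 0, predicted = 0;
        if (!(fields >> real >> predicted)) {
            ++c.skipped;
            continue;
        }
        if (real > 0.0) {
            if (predicted > 0.0) ++c.tp; else ++c.fn;
        } else {
            if (predicted > 0.0) ++c.fp; else ++c.tn;
        }
    }
    return c;
}
```

Calling `countFromStream(std::cin)` would read from standard input as the program does, and taking a `std::istream&` also lets the same function be exercised on a `std::istringstream` in a test.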

 

Link to the original article: http://carlonicolini.altervista.org/index.php/Informatica-e-Web/Notizie-dal-web/Precision-recall-fmeasure-accuracy-calculator.html