These gnuplot scripts produce false-positive (FP%) and false-negative (FN%) rate graphs, plus ROC curves, from SpamAssassin's rule statistics.

Prepare the data (each line of `tsv` holds the threshold, FP% and FN%, separated by spaces):

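{{{
perl -ne '
  # remember the threshold from each "SUMMARY ... <threshold>:" line
  /SUMMARY.* (\S+):/ and $t = $1;
  # remember the false-positive percentage
  /False positives:.* (\S+)\%/ and $fp = $1;
  # on the false-negative line, emit one row: threshold FP% FN%
  if (/False negatives:.* (\S+)\%/) { print "$t $fp $1\n"; }
' < /home/jm/ftp/spamassassin/rules/STATISTICS-set3.txt | sort -n > tsv
}}}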

Graphing commands:

{{{
set terminal png small interlace size 500,400 \
      xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
      xcccc00 x333333 x999999 x9500d3
set out 'fp_fn_rates_main.png'
set xlabel "SpamAssassin required_score threshold (5 = default)"
set xtics -10, 5, 20
set ytics 0, 10, 100
set mxtics 5
set mytics 1
set grid back xtics ytics
set yrange [0:100]
plot "tsv" using 1:2 t "FPs" with linespoints lw 2 pt 4, \
     "tsv" using 1:3 t "FNs" with linespoints lw 2 pt 4 lc rgb '#0000ff'

set terminal png small interlace size 500,200 \
      xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
      xcccc00 x333333 x999999 x9500d3
set out 'fp_fn_rates_zoomed.png'
set xlabel "SpamAssassin required_score threshold (5 = default)"
set mxtics 0.1
set ytics 0, 0.2, 1
set grid back xtics ytics
set yrange [0:1]
plot "tsv" using 1:2 t "FPs" with linespoints lw 2 pt 4, \
     "tsv" using 1:3 t "FNs" with linespoints lw 2 pt 4 lc rgb '#0000ff'

set terminal png small interlace size 500,400 \
      xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
      xcccc00 x333333 x999999 x9500d3
set out 'roc.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.1, 1
set ytics 0, 0.1, 1
set grid back xtics ytics
set xrange [0:1]
set yrange [0:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4

set out 'roc_zoomed.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.01, 1
set ytics 0, 0.01, 1
set grid back xtics ytics
set xrange [0:0.1]
set yrange [0.9:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4
}}}
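The ROC plots above map each `tsv` row to a point with x = FP%/100 and y = 1 - FN%/100. The following sketch does the same arithmetic outside gnuplot, which can help when checking a single point by hand. It assumes FP% is a percentage of non-spam and FN% a percentage of spam, so that y is the true-positive rate; `roc_points` is a hypothetical helper name.

```shell
# Convert tsv rows (threshold FP% FN%) into ROC coordinates,
# mirroring the gnuplot expression ($2/100):(1-($3/100)).
roc_points() {
  LC_ALL=C awk '{ printf "%s %.4f %.4f\n", $1, $2 / 100, 1 - $3 / 100 }'
}

# Example row: threshold 5.0 with 0.06% false positives and 2.24% false negatives.
printf '5.0 0.06 2.24\n' | roc_points
# prints: 5.0 0.0006 0.9776
```

A point like this falls inside the zoomed ranges (`[0:0.1]` on x, `[0.9:1]` on y).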

To label each point with its threshold, use the labels feature of Gnuplot 4.2:

{{{
set terminal png small interlace size 500,400 \
      xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
      xcccc00 x333333 x999999 x9500d3 font "/home/jm/.fonts/verdana.ttf"
set out 'roc_labels.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.1, 1
set ytics 0, 0.1, 1
set grid back xtics ytics
set xrange [0:1]
set yrange [0:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4, \
  '' using ($2/100):(1-($3/100)):1 with labels tc lt 2 font "medium" offset 0,0 notitle

set terminal png small interlace size 500,400 \
      xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
      xcccc00 x333333 x999999 x9500d3 font "/home/jm/.fonts/verdana.ttf"
set out 'roc_zoomed_labels.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.01, 1
set ytics 0, 0.01, 1
set grid back xtics ytics
set xrange [0:0.1]
set yrange [0.9:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4, \
  '' using ($2/100):(1-($3/100)):1 with labels tc lt 2 font "medium" offset 0,0 notitle
}}}

FpFnGraphs (last edited 2008-03-20 11:09:25 by 84)