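These gnuplot commands plot SpamAssassin's false-positive (FP%) and false-negative (FN%) rates against the required_score threshold, and draw the corresponding ROC curves.
Prepare the data by extracting the threshold, FP% and FN% from each summary block of the STATISTICS file into a file named tsv, one space-separated line per threshold, sorted numerically: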
perl -ne '
/SUMMARY.* (\S+):/ and $t=$1;
/False positives:.* (\S+)\%/ and $fp = $1;
if (/False negatives:.* (\S+)\%/) { print "$t $fp $1\n";}
' < /home/jm/ftp/spamassassin/rules/STATISTICS-set3.txt | sort -n > tsv
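For readers less familiar with Perl one-liners, the extraction above can be sketched in Python. This is only an illustration that reuses the same three regular expressions; the function name extract_rates and the sample lines are hypothetical (the sample is made up to fit the patterns, not copied from a real STATISTICS file), and unlike the Perl version it converts the captured fields to numbers before sorting.

```python
import re

def extract_rates(lines):
    """Return (threshold, fp_percent, fn_percent) tuples sorted by threshold.

    Mirrors the Perl one-liner: remember the most recent threshold and FP%,
    and emit a row each time a "False negatives" line is seen.
    """
    threshold = fp = None
    rows = []
    for line in lines:
        m = re.search(r"SUMMARY.* (\S+):", line)
        if m:
            threshold = m.group(1)
        m = re.search(r"False positives:.* (\S+)%", line)
        if m:
            fp = m.group(1)
        m = re.search(r"False negatives:.* (\S+)%", line)
        if m:
            rows.append((float(threshold), float(fp), float(m.group(1))))
    # Equivalent of piping the output through "sort -n".
    return sorted(rows)

# Hypothetical input, shaped only to match the regular expressions above.
SAMPLE = """\
# SUMMARY for threshold 10.0:
# False positives:        1  0.01%
# False negatives:     2000  7.50%
# SUMMARY for threshold 5.0:
# False positives:        8  0.03%
# False negatives:      688  2.47%
"""

if __name__ == "__main__":
    for t, fp, fn in extract_rates(SAMPLE.splitlines()):
        print(t, fp, fn)
```

Each printed line has the same three columns that the gnuplot commands read from tsv: threshold, FP%, FN%.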
Graphing commands:
set terminal png small interlace size 500,400 \
xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
xcccc00 x333333 x999999 x9500d3
set out 'fp_fn_rates_main.png'
set xlabel "SpamAssassin required_score threshold (5 = default)"
set xtics -10, 5, 20
set ytics 0, 10, 100
set mxtics 5
set mytics 1
set grid back xtics ytics
set yrange [0:100]
plot "tsv" using 1:2 t "FPs" with linespoints lw 2 pt 4, \
"tsv" using 1:3 t "FNs" with linespoints lw 2 pt 4 lc rgb '#0000ff'
set terminal png small interlace size 500,200 \
xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
xcccc00 x333333 x999999 x9500d3
set out 'fp_fn_rates_zoomed.png'
set xlabel "SpamAssassin required_score threshold (5 = default)"
set mxtics 0.1
set ytics 0, 0.2, 1
set grid back xtics ytics
set yrange [0:1]
plot "tsv" using 1:2 t "FPs" with linespoints lw 2 pt 4, \
"tsv" using 1:3 t "FNs" with linespoints lw 2 pt 4 lc rgb '#0000ff'
set terminal png small interlace size 500,400 \
xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
xcccc00 x333333 x999999 x9500d3
set out 'roc.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.1, 1
set ytics 0, 0.1, 1
set grid back xtics ytics
set xrange [0:1]
set yrange [0:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4
set out 'roc_zoomed.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.01, 1
set ytics 0, 0.01, 1
set grid back xtics ytics
set xrange [0:0.1]
set yrange [0.9:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4
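The using expression in the two ROC plots above converts the percentages in tsv into rates: the x value is FP%/100 and the y value is 1 - FN%/100, which the plots label as true positives. A minimal Python sketch of the same arithmetic follows; roc_point and in_zoomed_window are hypothetical helper names, not part of gnuplot or SpamAssassin, and the input numbers are made up for illustration.

```python
def roc_point(fp_percent, fn_percent):
    """Map one tsv row's FP% and FN% to the (x, y) point gnuplot draws.

    Same arithmetic as: using ($2/100):(1-($3/100))
    """
    x = fp_percent / 100.0
    y = 1.0 - fn_percent / 100.0
    return x, y

def in_zoomed_window(x, y):
    """True if the point falls inside the zoomed plot's ranges:
    xrange [0:0.1] and yrange [0.9:1]."""
    return 0.0 <= x <= 0.1 and 0.9 <= y <= 1.0

if __name__ == "__main__":
    # Hypothetical row: FP% = 0.03, FN% = 2.47
    x, y = roc_point(0.03, 2.47)
    print(x, y, in_zoomed_window(x, y))
```

Points outside the window are simply not visible in the zoomed plot, since gnuplot clips to the ranges set by xrange and yrange.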
To use the labels feature with Gnuplot 4.2, which prints each point's threshold (column 1 of tsv) next to it on the ROC curve:
set terminal png small interlace size 500,400 \
xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
xcccc00 x333333 x999999 x9500d3 font "/home/jm/.fonts/verdana.ttf"
set out 'roc_labels.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.1, 1
set ytics 0, 0.1, 1
set grid back xtics ytics
set xrange [0:1]
set yrange [0:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4, \
'' using ($2/100):(1-($3/100)):1 with labels tc lt 2 font "medium" offset 0,0 notitle
set terminal png small interlace size 500,400 \
xffffff x444444 x33cc00 xff9900 x0000cc x99cc00 xff9900 \
xcccc00 x333333 x999999 x9500d3 font "/home/jm/.fonts/verdana.ttf"
set out 'roc_zoomed_labels.png'
set xlabel "False Positives"
set ylabel "True Positives"
set xtics 0, 0.01, 1
set ytics 0, 0.01, 1
set grid back xtics ytics
set xrange [0:0.1]
set yrange [0.9:1]
plot "tsv" using ($2/100):(1-($3/100)) t "ROC" with linespoints lw 2 pt 4, \
'' using ($2/100):(1-($3/100)):1 with labels tc lt 2 font "medium" offset 0,0 notitle
