Using Slogger
These are brief notes on using the Slogger extension to record my web browsing for later searching and recall. Still in progress on this -- JustinMason
The Basics
Install Slogger: http://www.kenschutte.com/firefoxext/
Add button: Add the "Slogger" button to a Firefox toolbar. (View->Toolbars->Customize..., find the button, and drag to where you want it.)
Turn on: Clicking the button should turn it red. (This turns it on, to log every page.)
Choose folder: Open any page in your browser. A message will say you have to set the folder -- do so via the Slogger Options dialog, and click OK. (I used /home/jm/ftp/slog.)
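To confirm that Slogger is actually writing to the folder, it can help to look at the newest files there after browsing a page or two. A minimal sketch, assuming the folder named in the step above; recent_slogs and SLOG_DIR are illustrative names, not part of Slogger:

```shell
#!/bin/sh
# Folder chosen in the Slogger Options dialog (the path used in these notes).
SLOG_DIR=${SLOG_DIR:-/home/jm/ftp/slog}

# Print the five most recently modified entries in the given folder.
recent_slogs() {
    ls -t "$1" | head -n 5
}

# Only list the folder if it exists.
if [ -d "$SLOG_DIR" ]; then
    recent_slogs "$SLOG_DIR"
fi
```

If nothing shows up after loading a page, check that the Slogger button is red and that the folder is writable.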
Logging Full Text
Next step: log the full text of each page. Go to the Slogger Options dialog, select the Google Desktop profile, then:
1. Rename it to something like Full HTML log.
2. Under Settings for 'Full HTML log' -> 'Log File', select On log current page = set, Prompt = unset, During auto-logging = set.
Then open 'Configure Log File Format', and use:
<tr class="c1">
  <td nowrap="nowrap">$year-$month-$day $hour:$minute:$second.$millisecond</td>
  <td class="title">$title</td>
  <td><a href="$url">url</a></td>
  <td><a href="$savefile">local</a></td>
</tr>
<tr class="c1sup">
  <td class="label">url</td>
  <td colspan=3><a href="$url">$url</a></td>
</tr>
<tr class="c1sup">
  <td class="label">keywords</td>
  <td colspan=3>$keywords</td>
  <td colspan=3><a href="$url">$url</a></td>
</tr>
<tr class="c1sup">
  <td class="label">description</td>
  <td colspan=3>$desc</td>
</tr>
<tr class="c1sup">
  <td class="label">clip</td>
  <td colspan=3>$clip</td>
</tr>
<tr class="c1sup">
  <td class="label">image</td>
  <td colspan=3><img src="$imgsrc"></td>
</tr>
<tr class="c1sup">
  <td class="label">link</td>
  <td colspan=3><a href="$linkhref">link</a></td>
</tr>
(This adds the URL as a text string suitable for incremental text searching.)
3. Under Settings for 'Full HTML log' -> 'Save Pages', select On log current page = set, Prompt = unset, During auto-logging = set, 'Web Page, HTML only' = set.
4. Under Settings for 'Full HTML log' -> 'Filters', Allow URIs beginning with: 'http:' = set, all others unset (so that https pages, such as your online banking, are not logged and indexed).
5. Also under Settings for 'Full HTML log' -> 'Filters', enter any domains you do not want logged, if applicable.
Hit Apply, then Close.
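The 'http:' filter in step 4 keeps secure pages out of the log because a URL such as https://... does not begin with the string 'http:'. A small sketch of that prefix test, assuming Slogger's filter is a plain string-prefix match (that is how the option label reads, but it is not confirmed against Slogger's source). would_log is a hypothetical name:

```shell
#!/bin/sh
# Succeed if the URL begins with "http:", mimicking an
# "Allow URIs beginning with: http:" filter (assumed to be a prefix match).
would_log() {
    case $1 in
        http:*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Show the decision for a plain and a secure URL (example hosts only).
for url in "http://example.com/" "https://bank.example.com/login"; do
    if would_log "$url"; then
        echo "log:  $url"
    else
        echo "skip: $url"
    fi
done
```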
Searching
Google Desktop looks nice, but it isn't available on Linux, so I'll use mnoGoSearch instead.
sudo apt-get install mnogosearch-common mnogosearch-dev \
    mnogosearch-doc mnogosearch-sqlite
Select these config options when prompted:
1. 'Overwrite mnogosearch configuration files?' = Yes
2. 'What mode would you like the database to run?' = Single
3. 'What sort of database are you running?' = sqlite
4. 'Configure the database?' = Yes
Ensure the SQLite database is writable:
sudo chmod -R og+rw /var/lib/mnogosearch
Edit /etc/mnogosearch/indexer.conf and append to the end:
Server http://localhost/slog/
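If you repeat this setup, it is easy to append the Server line twice. A sketch of an append that only adds the line when it is missing; add_server_line is a hypothetical helper, and writing to /etc/mnogosearch/indexer.conf would need root:

```shell
#!/bin/sh
# Append the Server line to the given config file unless an identical
# line is already present.
add_server_line() {
    conf=$1
    line='Server http://localhost/slog/'
    # -q: quiet, -x: match whole lines only, -F: treat the pattern as a fixed string
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}
```

For example, run as root: add_server_line /etc/mnogosearch/indexer.conf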
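Now run indexer -am and it'll noisily index all the pages you've slogged so far. (The indexer fetches the pages over HTTP from localhost, so the Apache alias below needs to be in place first.)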
Edit /etc/apache/httpd.conf and append:
Alias /slog/ /home/jm/ftp/slog/
<Directory /home/jm/ftp/slog>
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
    Options Indexes FollowSymLinks MultiViews
</Directory>
Also, find the <Directory /usr/lib/cgi-bin/> block, and comment out these lines:
# Order allow,deny
# Allow from all
and add these lines:
Order deny,allow
Deny from all
Allow from 127.0.0.1
Restart Apache with sudo /etc/init.d/apache restart, and http://localhost/cgi-bin/search.cgi is now available!
Indexing on Demand
Create a file, ~/bin/search-slog:
#!/bin/sh
indexer
gnome-open "http://localhost/cgi-bin/search.cgi?q=$*"
Make it executable with chmod 755 ~/bin/search-slog. You can now perform a search, with an implicit reindex beforehand, from the command line:
search-slog Mason
It'll perform a quick reindex, then open Firefox with the search results.
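One caveat: the script puts $* into the URL as-is, so a query containing spaces, '&' or other special characters may not reach search.cgi intact. A minimal sketch of percent-encoding the query first, assuming ASCII-only input; urlencode is a hypothetical helper, not something mnoGoSearch provides:

```shell
#!/bin/sh
# Percent-encode a string for use in a URL query (ASCII input assumed).
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}        # everything after the first character
        c=${s%"$rest"}     # the first character itself
        case $c in
            # Unreserved characters are kept as they are.
            [A-Za-z0-9._~-]) out=$out$c ;;
            # Anything else becomes %XX, using the character's numeric value.
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}
```

With this function defined in search-slog, the last line of the script could become gnome-open "http://localhost/cgi-bin/search.cgi?q=$(urlencode "$*")", so that a search for two words arrives as a single query.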