Sisyphus - a log data-mining toolkit
------------------------------------

Sisyphus is an easy way to find important stuff in logs. Yes, it provides
fast index-based searches when you know what to look for, but its
distinguishing feature is that it helps you find anomalies when you don't
know what to look for. It ranks log files and colorizes words based on
information theory (answering the questions, "what is the strangest
logfile?" and "what is the strangest stuff in this logfile?"). It provides
useful file and word statistics in tabular and plot formats. It uses
Vaarandi's excellent "loghound" program to generate message templates for
you (which you can then use as regular expressions in monitoring programs
if you wish). Web and command line interfaces are included.

Sisyphus has been developed specifically for use with supercomputer
syslogs, based on the premise that similar computers correctly executing
similar workloads should produce similar logs (thus, anomalies warrant
investigation). But it is general enough to be used on any event logs (if
your logs are not syslog, you will have to tweak arguments and/or provide
a parser).

I hope you find Sisyphus useful. Please share feedback (positive or
negative).

- Jon Stearley

----- INSTALLATION PROCESS: -----

Debian:

  1. Add the line below to /etc/apt/sources.list:

       deb http://www.cs.sandia.gov/sisyphus/debian etch main

  2. Run `apt-get update; apt-get install sisyphus`

  3. [optional] Edit /usr/share/sisyphus/sweb/header.cgi:
       - replace CONTACT with your email address
       - replace WEBSERVER with your sweb server name

  4. You're done. Skip to SETTING UP and USING below.

Other *nix systems:

  Install dependencies:

  1. Perl, and the libraries below. One way to install them, as root:

       perl -MCPAN -e shell
       # then, at the cpan> prompt:
       install CGI::Session
       install Date::Parse
       install Date::Format
       install Statistics::Descriptive
       quit

  2. gnuplot

  3. tac - Linux users with "coreutils" should already have it. Mac users
     may get and install a copy, or just use the universal binary provided
     with sisyphus (eg, `cp misc/tac.mac /usr/bin/tac`).

  4. Apache webserver, with mod_perl enabled.

  Install Sisyphus:

  1. Run `./configure`.
     [optional] Use --with-contact=your.email.address and
     --with-webserver=http://your.sweb.server (or edit header.cgi yourself
     later in order to make the "Email URL" links useful).

  2. Run `make install`.

  Configure apache webserver:

  1. See sweb/README.sweb and, for an example,
     examples/apache2-swebtest.conf.

----- SETTING UP SISYPHUS -----

Test Data:

  1. Run `sisyphus-swebtest setup`. This has been tested on Debian only,
     so on other dists you may want to do the steps therein manually.

  2. Point Firefox at http://localhost/swebtest and read USING, TOUR, and
     INTERACTIVE below.

  NOTE: This data is purely contrived, with known properties designed for
  verification purposes. Its use as a sweb dataset is secondary, and while
  useful, it is unlikely to be satisfying to new users.

Live syslogs:

  1. Run as root `sisyphus-sweb setup`. This has been tested on Debian
     only, so on other dists you may want to do the steps therein
     manually. This will change syslog-ng.conf and other things, so you
     should look through the script before running it in any event.

  2. Point Firefox at http://localhost/sweb and read USING, TOUR, and
     INTERACTIVE below.

  See doc/examples/redstorm for an example of a complex live sweb setup.

Historical syslogs:

  1. Import historical syslogs:

       mkdir /var/log/sweb    # eg SISYPHUS_DATA=/var/log/sweb
       cd /var/log/sweb
       sprep -i /path/to/raw/logs

     NOTE: /path/to/raw/logs must contain ONLY syslog files

  2. Set up apache to serve this SISYPHUS_DATA (see sweb/README.sweb).

  3. See USING, TOUR, and INTERACTIVE below.

----- USING SISYPHUS -----

  1. See the demo videos at http://www.cs.sandia.gov/sisyphus

  2. Read QUICK TOUR below.

  3. Read sweb/help.html (linked as HELP in docs.cgi)

  4. Read the INTERACTIVE PLOTS below.

  5. Read the man pages - start with smerge and sprep

----- QUICK TOUR -----

  1. Point Firefox at: http://servername/sweb/

  2. Follow the "Select Documents for Analysis" link.

  3. Use the default selection criteria.

  4. Click the "List: Documents" button to display the list of nodehour
     documents matching your search criteria.

  5. Click on the column headings to sort on the values in that column.
     Sorting on the same column again will toggle ascending/descending
     order.

  6. Click on a document name to view the contents of that document. A
     new window will open to display the doc and allow further analysis.

     OR...

  6. Select one or more documents with the checkboxes on the left. The
     top checkbox will select/clear all others. The icon next to the top
     checkbox will invert the current selection. Holding down "shift"
     while selecting a checkbox will select all docs between the current
     and previously changed document.

  7. Click on one of the "Analyze" buttons to merge the selected
     documents, sort them, and view them in a new display window.
       - To view just the document contents, use the "Msgs." button.
       - To view both the messages and terms in a split window, use the
         "Terms" button.
       - To view both the messages and message templates in a split
         window, use the "Templates" button.
     (Don't worry too much about which view you want; you can easily
     access the others later. For now, "Templates" is a great choice.)

  8. At this point, the messages analysis page should be visible. Use the
     close icons ([X]) in the top right of each frame to hide that frame.
     Use the "Analyze" buttons at the top of the page to display hidden
     frames.

  9. Select one or more templates (or terms) to filter the messages frame
     and display only those messages matching the templates (or
     containing the selected terms).

  10. Select the "OUTLIERS" template link to drill down and perform
      further analysis on the messages that did not match any of the
      listed templates.
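
----- EXAMPLE: CHECKING A LOG DIRECTORY BEFORE IMPORT -----

The historical syslog import in SETTING UP SISYPHUS requires that
/path/to/raw/logs contain only syslog files. Below is a minimal sketch of a
pre-flight check you could run first. The function name check_raw_logs is
hypothetical and not part of Sisyphus. The test for "looks like syslog" is
also an assumption: it only checks that the first line of each file begins
with a traditional "Mon DD HH:MM:SS" timestamp, so adjust the pattern if
your logs use another format.

```shell
# check_raw_logs DIR
# Hypothetical helper (not shipped with Sisyphus). Prints one line for each
# entry in DIR that does not look like a syslog file and returns 1 if any
# were found, 0 otherwise.
check_raw_logs() {
    dir=$1
    bad=0
    for f in "$dir"/*; do
        # Empty directory: the glob did not match anything.
        [ -e "$f" ] || continue
        if [ ! -f "$f" ]; then
            echo "not a regular file: $f"
            bad=1
            continue
        fi
        # Assumption: a syslog line starts with e.g. "Jan  5 12:00:01 ".
        if ! head -n 1 "$f" |
            grep -Eq '^[A-Z][a-z]{2} +[0-9]{1,2} [0-9]{2}:[0-9]{2}:[0-9]{2} '
        then
            echo "does not look like syslog: $f"
            bad=1
        fi
    done
    return $bad
}

# Usage (paths are placeholders, as in the import steps):
#   check_raw_logs /path/to/raw/logs && cd /var/log/sweb && sprep -i /path/to/raw/logs
```

Anything the check reports can be moved out of the directory before
running sprep. The check is deliberately loose: it will not catch a file
whose first line is valid but whose later lines are not.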