DOWNLOAD: Due to the large size of these logs, we recommend using curl to download them, as it can resume partial transfers (e.g., if your network connection is dropped).

    curl -C - -o bgl.gz -v http://www.cs.sandia.gov/~jrstear/logs/bgl.gz
    curl -C - -o liberty.gz -v http://www.cs.sandia.gov/~jrstear/logs/liberty.gz
    curl -C - -o redstorm.gz -v http://www.cs.sandia.gov/~jrstear/logs/redstorm.gz
    curl -C - -o spirit.gz -v http://www.cs.sandia.gov/~jrstear/logs/spirit.gz
    curl -C - -o tbird.gz -v http://www.cs.sandia.gov/~jrstear/logs/tbird.gz

Run each command repeatedly until it completes successfully.

MD5SUMs: To verify the integrity of your downloaded files, run md5sum on each and compare the output to the values below:

    57754c0d3ff0bc9215904ced273fe5de  bgl.gz
    458c96e9ec0d2eda6b0ce587514671c2  liberty.gz
    982806366b3cfbc5dd38131b340dd698  redstorm.gz
    f90db23dccd4a8b3700494af04139a74  spirit.gz
    13be0486681ab29c62d77a4782d1b094  tbird.gz

FORMAT: Each line contains one message, plus four fields which we have added to facilitate parsing. Each line is formatted as

    Cat Utime YYYY.MM.DD Source Message

where Cat indicates the alert category ("-" indicates no alert), Utime is the message time as a Unix timestamp, YYYY.MM.DD is the message date, Source is the name of the device which generated the message, and Message is the full message in its original form. Messages which were corrupted such that these first four fields could not be determined are not present in these files.

SCRUBBING: We removed identifying user, group, and network information from the last four logs (liberty, redstorm, spirit, and tbird) per Sandia guidance (SAND 2007-4091W). Such strings have been replaced by #TOKEN#, where TOKEN is a unique integer. Livermore required no such process for the BG/L logs (which are of a more limited nature).

FEEDBACK: Our willingness and ability to release future logs depends in part on feedback we receive on these logs. Please send any comments or questions to jrstear@sandia.gov or oliner@stanford.edu.

UPDATE: The tools in http://www.cs.sandia.gov/~jrstear/logs/tagger.tar.gz revise the tagging based on additional understanding of the logs. A detailed description of the revision and an analysis of the resulting logs are given in "Alert Detection in System Logs", IEEE International Conference on Data Mining (ICDM), 2008.