Your friendly neighborhood AppSec advisor and honeypot enthusiast. Formerly @ Goldman Sachs and Ernst & Young. Find his thoughts in code form committed to GitHub.
Logs are important. We need them to investigate, monitor, and analyze. In cybersecurity we have many tools that generate logs, and extracting those logs to a central repository is a common routine. However, the methods and tools for capturing them are not always standardized. In this post, I’ll provide an overview of several options available to users of Signal Sciences for capturing request logs.
With Signal Sciences there are a few types of log data. At a high level, these are request data and diagnostic data. Request data contains information about requests hitting the site(s) you are protecting with Signal Sciences. This is the data you want for threat monitoring and analysis. Diagnostic data contains information about Signal Sciences modules and agents. This is the data you want if you need to monitor the health of the various Signal Sciences components you’ve deployed.
This post will focus on request data—I may cover capturing diagnostic data in a future post. In addition, this post is meant to be a high-level introduction to several available options, not a comprehensive guide.
The primary option for extracting request data from Signal Sciences is the request feed REST API endpoint, which is documented here, and there is a companion document with details on extracting your data here. I highly recommend you read the document on extracting your data as it explains some of the restrictions with the API endpoint.
Now that you are familiar with the API endpoint, you can always script your own solution in any language to download the data. However, if scripting is not your thing, here are some options on how you can download the data.
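If you do want to script it yourself, a minimal Python sketch might look like the following. Treat the details as assumptions to check against the API documentation linked above: the base URL, the /api/v0/corps/.../sites/.../feed/requests path, the x-api-user and x-api-token headers, the data and next.uri response fields, and the idea that timestamps should be minute-aligned and lag a few minutes behind the current time are my recollection of the public docs, not something this post specifies. The function names feed_window and fetch_feed are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumed base URL; confirm against the API documentation.
API_BASE = "https://dashboard.signalsciences.net"


def feed_window(now, hours=1, delay_minutes=5):
    """Return (from_ts, until_ts) as minute-aligned Unix timestamps.

    The window ends `delay_minutes` before `now` (assumed feed delay)
    and spans `hours` hours.
    """
    until = now.replace(second=0, microsecond=0) - timedelta(minutes=delay_minutes)
    start = until - timedelta(hours=hours)
    return int(start.timestamp()), int(until.timestamp())


def fetch_feed(corp, site, email, token, from_ts, until_ts):
    """Yield request records page by page (endpoint and fields are assumptions)."""
    url = (f"{API_BASE}/api/v0/corps/{corp}/sites/{site}"
           f"/feed/requests?from={from_ts}&until={until_ts}")
    headers = {"x-api-user": email, "x-api-token": token}
    while url:
        req = urllib.request.Request(url, headers=headers)
        with urllib.request.urlopen(req) as resp:
            page = json.load(resp)
        yield from page.get("data", [])
        # Follow the pagination link, if the response carries one.
        next_uri = (page.get("next") or {}).get("uri")
        url = API_BASE + next_uri if next_uri else None


# Example use (placeholder names and credentials):
#   start, until = feed_window(datetime.now(timezone.utc))
#   for record in fetch_feed("mycorp", "mysite", "me@example.com", "TOKEN", start, until):
#       print(json.dumps(record))
```

Printing one JSON object per line keeps the output easy to pipe into other tools.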
SigSciApiPy Python Script
This script is meant to be a sample reference script for calling the API, but it is very functional and handy. The following command can be used to pull the last 24 hours of request feed data:
./SigSci.py --feed --from=-24h
Note the default output format is JSON. You can change the output to CSV with the “format” command-line option, but you will notice some of the request metadata will be missing. The following is an example:
If you have Logstash, you know it can pump logs out to any destination via a variety of output plugins. Typically that destination is Elasticsearch. However, before you can pump out logs, you need to first get them via an input plugin. There is now a Logstash input plugin for Signal Sciences available that can get you up and running quickly.
REST APIs not your thing? Maybe your logging tools are centered around ingesting data in Syslog format. In addition, you may want to capture the data directly from Signal Sciences agents. There are a few options for you.
Tail the Log
The first option involves tailing the agent’s log file. Configure the agent to log to a file with the following (default location for the agent configuration file on Linux: /etc/sigsci/agent.conf):
debug-log-web-inputs = 1
debug-log-web-outputs = 1
log-out = "/var/log/sigsci.log"
Since Syslog wants a single flat line per record, we set both debug-log-web-inputs and debug-log-web-outputs to 1. This produces one flat line per record, rather than the default multi-line JSON format.
Next, start the agent and then run the following command as a background process to forward the logs to the system’s Syslog:
tail -F /var/log/sigsci.log | logger &
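If you would rather keep the forwarding in a script than in a shell pipeline, the same idea can be sketched in Python. This is only an illustration of what tail -F piped to logger does, not a Signal Sciences tool: the helper names (open_log, drain, forward) are hypothetical, and /dev/log is the usual local Syslog socket on Linux but may differ on your system.

```python
import logging
import logging.handlers
import os
import time


def open_log(path):
    """Open the log in binary mode and remember its inode to detect rotation."""
    fh = open(path, "rb")
    return fh, os.fstat(fh.fileno()).st_ino


def drain(fh):
    """Return every complete line currently available; leave partial lines unread."""
    lines = []
    while True:
        pos = fh.tell()
        raw = fh.readline()
        if not raw.endswith(b"\n"):
            fh.seek(pos)  # nothing new, or a half-written line: retry later
            return lines
        lines.append(raw.rstrip(b"\r\n").decode("utf-8", errors="replace"))


def forward(path="/var/log/sigsci.log", address="/dev/log", poll_seconds=1.0):
    """Follow `path` forever and send each new line to the local Syslog."""
    log = logging.getLogger("sigsci")
    log.setLevel(logging.INFO)
    log.addHandler(logging.handlers.SysLogHandler(address=address))
    fh, inode = open_log(path)
    while True:
        for line in drain(fh):
            log.info(line)
        time.sleep(poll_seconds)
        try:
            st = os.stat(path)
        except FileNotFoundError:
            continue  # file is mid-rotation; check again on the next poll
        if st.st_ino != inode:
            # File was replaced: flush what is left, then reopen by name.
            for line in drain(fh):
                log.info(line)
            fh.close()
            fh, inode = open_log(path)
        elif st.st_size < fh.tell():
            fh.seek(0)  # file was truncated in place: start over
```

Calling forward() blocks, so run it under a process supervisor or in the background, just as you would the tail command.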
When using this method you should also rotate the log file and remove older files to preserve disk space. The logrotate tool makes this easy to manage.
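To make the rotation idea concrete, here is a rough Python sketch of a size-based copy-and-truncate rotation. It is not a substitute for logrotate, and the function name rotate_if_large and its defaults (50 MB, three old copies) are hypothetical. It assumes the agent keeps the log file open, which is why the file is truncated in place rather than renamed. Lines written between the copy and the truncate can be lost.

```python
import os
import shutil


def rotate_if_large(path, max_bytes=50 * 1024 * 1024, keep=3):
    """Copy-and-truncate `path` once it exceeds `max_bytes`.

    Keeps at most `keep` old copies named path.1 (newest) to path.<keep>.
    Returns True if a rotation happened.
    """
    if not os.path.exists(path) or os.path.getsize(path) <= max_bytes:
        return False
    # Shift path.N to path.N+1, overwriting the oldest copy.
    for n in range(keep - 1, 0, -1):
        src = f"{path}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{path}.{n + 1}")
    # Copy the live file, then empty it without changing its inode.
    shutil.copyfile(path, f"{path}.1")
    with open(path, "r+b") as fh:
        fh.truncate(0)
    return True
```

You could call this from cron or a small loop, for example rotate_if_large("/var/log/sigsci.log").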
An alternative to using the logger and logrotate commands mentioned above is the sigsci-syslog-client utility. This tool is written in Go and helps to easily forward the logs to a local or remote Syslog server. It also has a companion script to manage the max file size for the local log file to preserve disk space.
Having the ability to extract request data and easily import it into your security information and event management (SIEM) system or other log analysis tools is a must. Whether you prefer JSON or Syslog format, the options outlined above will help you get started with capturing Signal Sciences request logs. If you are a Signal Sciences customer and have questions on capturing log data, please don’t hesitate to contact the support team or your Technical Account Manager.