Distributed logging with log_roller
Categories: erlang, logging
Where do I dump my big pile of logs?

The OTP solution: let SASL take care of it
The SASL application adds a handler that formats and writes log reports to disk as specified by the sasl config parameters
dev.config:
[{sasl, [
{error_logger_mf_dir, "."},
{error_logger_mf_maxbytes, 1024},
{error_logger_mf_maxfiles, 10}
]}].
boot sasl upon startup and browse logs with the report browser:
jvorreuter$ erl -config dev -boot start_sasl
Erlang R13B03 (erts-5.7.4) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]
...
=PROGRESS REPORT==== 10-Feb-2010::19:09:10 ===
application: sasl
started_at: nonode@nohost
Eshell V5.7.4 (abort with ^G)
1> rb:start().
rb: reading report...done.
{ok,<0.43.0>}
2> rb:list().
No Type Process Date Time
== ==== ======= ==== ====
5 progress <0.31.0> 2010-02-10 19:09:10
4 progress <0.31.0> 2010-02-10 19:09:10
3 progress <0.31.0> 2010-02-10 19:09:10
2 progress <0.31.0> 2010-02-10 19:09:10
1 progress <0.24.0> 2010-02-10 19:09:10
ok
3> rb:show(1).
PROGRESS REPORT <0.6.0> 2010-02-10 19:09:10
===============================================================================
application sasl
started_at nonode@nohost
ok
4>q().
ok
jvorreuter$ ls -1
1
2
dev.config
index
http://erlang.org/doc/man/rb.html
PROS
» easy setup: add sasl config values and rb:start()
» supports simple filtering and grepping: rb:list(error_report) and rb:grep("asdf")
CONS
» no remote access: rb runs locally as a gen_server and can't load logs from other nodes
» logs from separate VMs are siloed: it can be useful to view logs from distributed nodes in a unified timeline
» no web interface: this limits access by multiple users
» not very realtime: rb requires rescanning new log entries because it stores scanned logs in gen_server state
What does log_roller bring to the party?
» distributed nodes broadcast messages to a central disk log
» SASL-like configuration of max size and number of files in your rotating disk log
» filter log messages into partitioned "buckets" based on node names, log types, and regexp matching
» full featured web interface for log monitoring, grepping and polling
http://github.com/JacobVorreuter/log_roller
Overview
log_roller is made up of two different applications. They can be thought of as a client and a server. The client should run on the node you plan on monitoring. When the client starts it registers the log_roller_h event handler with the error_logger. This allows it to broadcast a copy of each log to any available log_roller_server processes. The log_roller_server application can be started on the local node or on a remote node. Active log_roller_server processes are members of the log_roller_server pg2 group. When the log_roller_h handler forwards a log message it does so to every member of the log_roller_server pg2 group. Once received, a log message is written to a rotating disk log using the disk_log module.
Distribution
log_roller and log_roller_server processes form a many-to-many relationship. In a grid of connected nodes, any node running the log_roller application will broadcast to all nodes running the log_roller_server application.
Partitioning
log_roller supports a system of filtering log messages into "buckets" as they are received by the log_roller_server process. A bucket is a separate disk log with a set of filters associated with it. For instance, you may want to filter error messages into a separate bucket from info messages. It is possible to filter based on message type, node name, and regular expression matching.
Configuration
If started without a config file, log_roller will create a single "bucket", named "default", with values of cache_size = 10485760, maxbytes = 10485760 and maxfiles = 10. Otherwise, these values come from the log_roller_server application config values. For example, here we define two buckets to store info and error messages separately:
[{log_roller_server, [
{logs, [
{info, [
{cache_size, 10485760},
{maxbytes, 10485760},
{maxfiles, 10},
{filters, [
{types, [progress, info]}
]}
]},
{errors, [
{cache_size, 10485760},
{maxbytes, 10485760},
{maxfiles, 10},
{filters, [
{types, [warning, error]}
]}
]}
]}
]}].
Browsing
Included in the log_roller_server application is the log_roller web interface. When the log_roller_server application starts it spawns a mochiweb server listening on port 8888 by default. The web server can be configured with the following values:
[{log_roller_server, [
{address, "127.0.0.1"},
{port, 8888},
{doc_root, "."}
]}].
Getting started
download, compile and install
git clone git://github.com/JacobVorreuter/log_roller.git
cd log_roller
make
sudo make install
start log_roller with your application
jvorreuter$ erl -name myapp@`hostname` -pa lib/log_roller/ebin
Erlang R13B03 (erts-5.7.4) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]
Eshell V5.7.4 (abort with ^G)
(myapp@jacob-vorreuters-macbook-pro-3.local)1> application:start(log_roller).
ok
(myapp@jacob-vorreuters-macbook-pro-3.local)2> erlang:get_cookie().
xxx
start a log_roller_server node with the same erlang cookie
jvorreuter$ erl -name logger@`hostname` -pa lib/log_roller/ebin lib/log_roller_server/ebin
Erlang R13B03 (erts-5.7.4) [source] [smp:2:2] [rq:2] [async-threads:0] [kernel-poll:false]
Eshell V5.7.4 (abort with ^G)
(logger@jacob-vorreuters-macbook-pro-3.local)1> application:start(mochiweb).
ok
(logger@jacob-vorreuters-macbook-pro-3.local)2> application:start(log_roller_server).
opened log: {ok,default}
info: [{name,default},
{file,"default"},
{type,wrap},
{format,external},
{size,{10485760,10}},
{items,110202},
{owners,[{<0.44.0>,true}]},
{users,0},
{head,none},
{mode,read_write},
{status,ok},
{node,'logger@jacob-vorreuters-macbook-pro-3.local'},
{distributed,local},
{no_current_bytes,7474946},
{no_current_items,45859},
{no_items,110202},
{no_written_items,0},
{current_file,2},
{no_overflows,{0,0}}]
ok
(logger@jacob-vorreuters-macbook-pro-3.local)3> erlang:get_cookie().
xxx
The log_roller web server should now be running on http://127.0.0.1:8888

Go back to the first node and log some info reports from your application and refresh the browser
(myapp@jacob-vorreuters-macbook-pro-3.local)3> [error_logger:info_report( {erlang:make_ref(), N}) || N <- lists:seq(1,100)].
=INFO REPORT==== 10-Feb-2010::20:29:53 ===
{#Ref<0.0.0.96>,1}
=INFO REPORT==== 10-Feb-2010::20:29:53 ===
{#Ref<0.0.0.97>,2}
...
=INFO REPORT==== 10-Feb-2010::20:29:53 ===
{#Ref<0.0.0.208>,100}
[ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,
ok,ok,ok,ok,ok,ok,ok,ok,ok,ok|...]

Now log some error reports so we can test out the filtering
(myapp@jacob-vorreuters-macbook-pro-3.local)3> [error_logger:error_report( {erlang:make_ref(), N}) || N <- lists:seq(1,100)].
=ERROR REPORT==== 11-Feb-2010::15:26:16 ===
{#Ref<0.0.0.286>,1}
=ERROR REPORT==== 11-Feb-2010::15:26:16 ===
{#Ref<0.0.0.287>,2}
...
=ERROR REPORT==== 11-Feb-2010::15:26:16 ===
{#Ref<0.0.0.396>,100}
[ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,
ok,ok,ok,ok,ok,ok,ok,ok,ok,ok|...]
Limit the log type to error messages only and grep for "100"

Now do the same, but for info messages
