Boost.Logging

The purpose of this wiki is to aggregate the different requirement definitions for an eventual Boost.Logging library. Feel free to add requirements to this page. Ideally, add your name and pick a number for each requirement.

Guidelines

Please add a definition for any word that you think may need one. A definition does not only *define* the meaning of something, it also makes that meaning *precise*, so we all know what we are talking about.

Definitions

Log record. A single pack of information, collected from the user's application, that is a candidate to be put in the log. In a simple case the log record will be represented as a line of text in the log file after being processed by the logging library.
Log attribute. An "attribute" is a piece of information that a log record consists of. Attributes may have different types (integrals, strings and more complex types). Some examples of attributes: timestamp, filename, line number, the trace itself.
Log sink. A target to which all log records are fed after being collected from the user's application.
Log source. An entry point through which the user's application puts log records into the library. In a simple case it is an object (logger) that maintains a set of attributes that will form a log record upon the user's request.
Log filter. A predicate that takes a log record and tells whether this record should be passed through or discarded.
Logging core. The global entity that maintains connection between sources and sinks and applies filters to records.
i18n. Internationalization. The ability to manipulate wide characters.

Major Requirements

These are the cornerstones that define the overall view and the direction in which the more detailed functional requirements are developed.

Simplicity. A small example code snippet should be enough to get the feel of the library and be ready to use its basic features.
Extensibility. A user should be able to extend the library's functionality in the ways it collects and stores information in logs.
Performance. The library should have as little performance impact on the user's application as possible.

Functional Requirements

This part describes the requested functionality. Please be as specific and precise as possible here, in order to avoid misunderstandings. If possible, add a short rationale.

1. Thread safety (Andrey Semashev)

The library should be able to operate in a multithreaded environment. The main thread-safety policy, as I see it, is as follows:

Logging core and sinks are thread-safe.
Log sources are not necessarily thread-safe. In particular, a logger object does not need to be thread-safe.

The main idea here, of course, is that the program should be able to generate log messages from any thread.

Rationale: For performance reasons threads should not share loggers.

John Torjo : There should be no need for a "logging core".

Andrey Semashev: The points of the core are (a) to interconnect loggers and sinks and (b) to provide the way to global filtering and storing global and thread-specific attributes.
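
A minimal sketch of the policy in requirement 1, assuming hypothetical names (memory_sink, thread_logger) that are not part of any proposed interface: the sink guards its state with a mutex, while each logger keeps unsynchronized state and is meant to be owned by a single thread.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sink: thread-safe, so any thread may feed it records.
class memory_sink {
public:
    void consume(const std::string& record) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_records.push_back(record);
    }

    // Returns a copy so the caller never touches shared state unlocked.
    std::vector<std::string> records() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_records;
    }

private:
    mutable std::mutex m_mutex;
    std::vector<std::string> m_records;
};

// Hypothetical logger: NOT thread-safe; each thread owns its own instance.
class thread_logger {
public:
    thread_logger(memory_sink& sink, std::string name)
        : m_sink(sink), m_name(std::move(name)) {}

    void write(const std::string& text) {
        m_buffer = m_name + ": " + text;  // unsynchronized per-logger state
        m_sink.consume(m_buffer);         // synchronization happens in the sink
    }

private:
    memory_sink& m_sink;
    std::string m_name;
    std::string m_buffer;
};
```

Because only the sink takes a lock, two threads that each hold their own thread_logger never contend on logger state, which is the performance argument given in the rationale.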

2. Scope logging (Andrey Semashev)

The library should offer the ability to maintain a stack of named scopes. The stack should be supported as a log attribute and be available for sinks to write to the output. The stack should be modified with scope guard objects provided by the library. The guard objects should have an optional ability to issue a log record upon construction and destruction. Since the stack is thread-local, the functionality is to be provided by the logging core. For performance reasons only C-string constants should be supported as scope names.

Rationale:

It is very convenient to see the sequence of calls that led to the given line of a log.

Example:

class A
{
  logger m_log;

  void foo(int n, string const& s)
  {
    log_scope scope("A::foo"); // Indicate that we have entered the function "A::foo", no log records are made
    // Another useful feature incorporating macros:
    BOOST_LOG_FUNCTION("A::foo", (n)(s)); // Will log something like "A::foo(n = 10, s = \"Hello\")"

    switch (n)
    {
      case 1:
      {
        // Equivalent to:
        //log_scope case_scope("case 1");
        //m_log << "Entered scope \"case 1\"";

        log_scope case_scope(m_log, "case 1", true); // Indicate that "case 1" has been selected
                                                     // The additional flag and logger indicate that
                                                     // an actual log record should be made on case_scope construction
                                                     // and destruction

        m_log.strm() << "Working..."; // This record would contain a stack [...->"A::foo"->"case 1"]

        break; // Here will be executed: m_log << "Left scope \"case 1\"";
      }

      case 2:
      {
        log_scope case_scope(m_log, "case 2", true);

        m_log.strm() << "Working too..."; // This record would contain a stack [...->"A::foo"->"case 2"]

        break;
      }
    }

    m_log.strm() << "Done working..."; // This record would contain only a stack [...->"A::foo"]
  }
};
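
One way the scope stack itself might be kept is sketched below. This is an illustration only: scope_stack, scope_guard and format_scopes are hypothetical names, and the optional "log on entry/exit" behavior of log_scope is left out. The stack is thread-local and stores plain pointers, which is why restricting scope names to C-string constants keeps entering a scope cheap.

```cpp
#include <string>
#include <vector>

// Hypothetical thread-local stack of scope names. Only pointers to string
// constants are stored, so entering a scope never copies a string.
inline std::vector<const char*>& scope_stack() {
    thread_local std::vector<const char*> stack;
    return stack;
}

// Hypothetical guard: pushes the name on construction, pops on destruction.
class scope_guard {
public:
    explicit scope_guard(const char* name) { scope_stack().push_back(name); }
    ~scope_guard() { scope_stack().pop_back(); }

    scope_guard(const scope_guard&) = delete;
    scope_guard& operator=(const scope_guard&) = delete;
};

// Renders the current thread's stack in the "A::foo->case 1" style.
inline std::string format_scopes() {
    std::string out;
    for (const char* name : scope_stack()) {
        if (!out.empty()) out += "->";
        out += name;
    }
    return out;
}
```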

3. Eliminate log statements from generated code (Michael Lacher)

Requirement: It should be possible to prevent all or a specified subset of log messages from generating any code, symbols or strings in the object file.

Rationale: Often log output used for debugging contains valuable information that would be useful for reverse engineering the code. Even if the log is not actually generated, fragments might remain as symbols or in the string table of an object file if they are not removed from compilation altogether (e.g. through macros). This might concern company secrets but also patents, or other cases where it is not up to the log user to decide if leaking information is acceptable or not.
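
A minimal sketch of the macro approach mentioned in the rationale, with hypothetical macro names (MYAPP_LOG_INFO, MYAPP_LOG_DEBUG, MYAPP_STRIP_DEBUG_LOG) and a string stream standing in for the real output. When the stripping switch is defined, the debug macro discards its argument during preprocessing, so the message text never reaches the compiler.

```cpp
#include <sstream>

// Hypothetical destination for log text; a real library would dispatch to sinks.
inline std::ostringstream& log_output() {
    static std::ostringstream out;
    return out;
}

// Normally supplied by the build system (e.g. -DMYAPP_STRIP_DEBUG_LOG) for
// release builds; defined here so the sketch shows the stripped variant.
#define MYAPP_STRIP_DEBUG_LOG

#define MYAPP_LOG_INFO(expr) (log_output() << expr << '\n')

#ifdef MYAPP_STRIP_DEBUG_LOG
// The argument tokens are dropped by the preprocessor, so neither the string
// literals nor the logged expressions are ever compiled.
#define MYAPP_LOG_DEBUG(expr) ((void)0)
#else
#define MYAPP_LOG_DEBUG(expr) (log_output() << expr << '\n')
#endif
```

A side effect worth knowing: expressions inside a stripped statement are not evaluated at all, so log arguments must not carry side effects the program relies on.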

4. Full lazy evaluation (Michael Lacher)

Requirement: Log arguments should not only be lazily formatted, but also lazily evaluated. Lazy in this context means: only if the log record will actually be visible in one of the output modules.

Rationale: In a recent project, benchmarking showed that formatting consumed only about 50%-70% of the time. The rest was spent actually evaluating the variables that were about to be logged. This is especially relevant because heavy calculations and preparations may be needed to make values suitable for readable output.

This issue was addressed on the mailing list using lambda constructs and lazy evaluation. This would work fine but, IMHO, is cumbersome to write. It would be helpful if a logging library provided some helper macros (which one might use or not, depending on one's needs) for such tasks.

Andrey Semashev: I think the most user-friendly and natural solution is to perform filtering before constructing the text message. I.e.:

if (logger.will_write_message()) // Filtering takes place in will_write_message
  logger.strm() << "The message text";
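
The filter-first idea and the helper macro asked for in the requirement could be combined roughly as follows. This is a sketch under assumptions: lazy_logger, will_write and LAZY_LOG are hypothetical names, and the filter is reduced to a severity threshold. The macro places the streaming expression in the else branch, so its operands are only evaluated when the filter passes.

```cpp
#include <sstream>
#include <string>

// Hypothetical logger with a simple severity threshold as its filter.
class lazy_logger {
public:
    explicit lazy_logger(int threshold) : m_threshold(threshold) {}

    bool will_write(int severity) const { return severity >= m_threshold; }
    std::ostream& strm() { return m_stream; }
    std::string text() const { return m_stream.str(); }

private:
    int m_threshold;
    std::ostringstream m_stream;
};

// Hypothetical helper macro. Everything streamed after it belongs to the
// else branch and is skipped entirely when the filter rejects the record.
#define LAZY_LOG(lg, severity) \
    if (!(lg).will_write(severity)) {} else (lg).strm()
```

Used as LAZY_LOG(lg, 1) << "value = " << expensive(); the call to expensive() is not made unless severity 1 passes the logger's threshold.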

5. Sinks (JD)

Sink nature (JD)

Multiple possible output media (files, sockets, syslog, Windows event log, etc) (JD on behalf of Caleb Epstein)

Independent output formatting and filtering (JD)

The library shall be able to output to different sinks.
Each sink shall be able to have a particular formatting.
Each sink shall be filtered independently. (cf definition of filter)

Andrey Semashev: See also req. #9
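
As an illustration of "independent formatting and filtering", the sketch below gives every sink its own filter and formatter. The names (simple_record, formatting_sink) are hypothetical, and a vector of strings stands in for the real output medium.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: a severity plus the message text.
struct simple_record {
    int severity;
    std::string message;
};

// Hypothetical sink that owns its own filter and its own formatter.
class formatting_sink {
public:
    using filter_fn = std::function<bool(const simple_record&)>;
    using format_fn = std::function<std::string(const simple_record&)>;

    formatting_sink(filter_fn filter, format_fn format)
        : m_filter(std::move(filter)), m_format(std::move(format)) {}

    // Applies this sink's filter first, then this sink's formatting.
    void consume(const simple_record& rec) {
        if (m_filter(rec))
            m_output.push_back(m_format(rec));
    }

    const std::vector<std::string>& output() const { return m_output; }

private:
    filter_fn m_filter;
    format_fn m_format;
    std::vector<std::string> m_output;  // stands in for a file, socket, syslog...
};
```

Feeding the same record to two such sinks lets a verbose log file and a terse alert channel coexist without either one knowing about the other.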

6. Exception safety (abingham)

This comprises at least two parts. First, the logging system should be stable in the face of exceptions that it generates. If, for instance, the logging system uses exceptions to indicate some sort of failure, the generation of such an exception should leave the logging system in a consistent state.

The second aspect is probably trickier. Consider the following notional logging statements (vaguely remembered from some posting on the list):

BOOST_LOG( data1() << data2() << data3() ); // data3() throws

In this case, what if anything gets logged when data3() throws?

Andrey Semashev: Regardless of the library implementation, the behavior is unspecified here. The compiler is permitted to reorder these function calls and operator << calls in any way. That means that it is not known which of the dataN functions were called and which results of their execution were put into the logger stream before the exception occurred. I guess the library will just output what it has at the point of the exception.

7. Configurable log message attributes (abingham)

Requirement: There are lots of pieces of data that someone might want to associate with a log message. These include timestamps, severity, stack info, file/line info, and so forth. However, it seems clear that there's no definitive set of attributes that everyone will want all of the time. So, the library should support the optional association of arbitrary data with log messages. This could be as simple as formatter objects that modify the logged string, or as complex as some sort of full-type-checked templated channel/sink system (notice my waving hands).

Rationale: Sometimes you want lots of extra information in a logged message (e.g. when debugging complex interactions), and sometimes you just want to print out a simple message (e.g. a simple heartbeat message).

8. The library shall manage i18n (JD)

Be able to manipulate wide characters, either by template instantiation or by macro definition at compile time.
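
The template-instantiation option could look roughly like this. It is a sketch only: basic_record_formatter and the two aliases are hypothetical names that follow the naming pattern of std::basic_string.

```cpp
#include <sstream>
#include <string>

// Hypothetical formatter parameterized on the character type.
template <typename CharT>
class basic_record_formatter {
public:
    using string_type = std::basic_string<CharT>;

    // Produces "[level] message" in the chosen character type.
    string_type format(const string_type& level, const string_type& message) const {
        std::basic_ostringstream<CharT> out;
        out << CharT('[') << level << CharT(']') << CharT(' ') << message;
        return out.str();
    }
};

// Narrow and wide flavors, selected by instantiation.
using record_formatter = basic_record_formatter<char>;
using wrecord_formatter = basic_record_formatter<wchar_t>;
```

The macro alternative would instead pick one of the two aliases for the whole build, which is simpler to use but prevents mixing narrow and wide logging in one program.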

9. Filtering support (Andrey Semashev)

The library should allow filters to be applied to the records being written to the output. There should be both global and per-sink filters.

Rationale: Per-sink filtering allows logging records to be routed to different sinks, and global filtering allows log verbosity to be reduced if needed.

Design note: Each filter is essentially a unary functor that receives an attribute set and returns bool. This approach allows filter checks to be composed easily with some lambda-like syntax:

sink.set_filter(attr("severity") > 2 && (attr("channel") == "IO" || attr("channel") == "Connections"));
logging_system::get()->set_filter(attr("severity") >= 1);

Limitation: This way of filtering will not be available for the message text because the text is not formatted at the point of filtering. Still, a sink implementation is allowed to discard the record after the regular filtering has been applied and the message text has been constructed. Such late filtering may involve the message text and, IMO, should not be covered by the initial library implementation.

John Torjo: The filter should not be linked to the logger. You can choose to make it thread-safe.

Andrey Semashev: Filters are completely unrelated to loggers and, in fact, to any of the major parts of the library. Filters never share any mutable data (unless they are designed that way by users, that is) and therefore they are thread-safe and lock-free by default.
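
To make the design note concrete, here is one way the attr(...) filter syntax might be built. It is an illustration under assumptions, not the library's implementation: attribute values are limited to int and string, kept in two maps, and only the operators used in the example are provided.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Simplified attribute set: integer and string attributes in separate maps.
struct attribute_set {
    std::map<std::string, int> ints;
    std::map<std::string, std::string> strings;
};

// A filter is a unary predicate over an attribute set.
struct filter {
    std::function<bool(const attribute_set&)> fn;
    bool operator()(const attribute_set& attrs) const { return fn(attrs); }
};

// Placeholder that names an attribute inside a filter expression.
struct attr {
    std::string name;
    explicit attr(std::string n) : name(std::move(n)) {}
};

// A missing attribute makes the comparison fail rather than throw.
inline filter operator>(const attr& a, int value) {
    std::string name = a.name;
    return filter{[name, value](const attribute_set& attrs) {
        auto it = attrs.ints.find(name);
        return it != attrs.ints.end() && it->second > value;
    }};
}

inline filter operator==(const attr& a, const std::string& value) {
    std::string name = a.name;
    return filter{[name, value](const attribute_set& attrs) {
        auto it = attrs.strings.find(name);
        return it != attrs.strings.end() && it->second == value;
    }};
}

// Composition builds a new predicate; evaluation still short-circuits.
inline filter operator&&(filter lhs, filter rhs) {
    return filter{[lhs, rhs](const attribute_set& attrs) { return lhs(attrs) && rhs(attrs); }};
}

inline filter operator||(filter lhs, filter rhs) {
    return filter{[lhs, rhs](const attribute_set& attrs) { return lhs(attrs) || rhs(attrs); }};
}
```

The composed filter holds only copies of names and constants, so it has no shared mutable state and can be called from several threads at once, in line with the remark that filters are thread-safe by default.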

10. Attribute sets (Andrey Semashev)

The library should support three attribute set categories:

Logger-associated. Each logger object has its own set of attributes it adds to each logging record.
Thread-associated. The logging library core maintains a thread-specific attribute set.
Global. This attribute set is a singleton. Each logging record ever made in the application has attributes from this set.

John Torjo: Not sure why we'd need attributes to be kept in the logger/thread/global.

Andrey Semashev: It is often very convenient, for example to log thread identifiers, the scope stack or some processing context (like the user name or the IP address whose request is currently being processed) with each record. Sometimes the user's application spreads over more than one module (dll/so), and thread-specific attributes are a very convenient way to tag logging records in other modules while processing a single request from the user.
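
A minimal sketch of how the three categories might be merged into one record, assuming string-valued attributes and hypothetical function names. The precedence used here (logger over thread over global) is an assumption for illustration; the requirement does not specify one.

```cpp
#include <map>
#include <string>

using attribute_map = std::map<std::string, std::string>;

// Global set: one instance for the whole application.
inline attribute_map& global_attributes() {
    static attribute_map attrs;
    return attrs;
}

// Thread-associated set: one instance per thread.
inline attribute_map& thread_attributes() {
    thread_local attribute_map attrs;
    return attrs;
}

// Builds the attributes of a record from all three categories. On a name
// clash the more specific set wins (assumed precedence, see above).
inline attribute_map make_record(const attribute_map& logger_attributes) {
    attribute_map record = global_attributes();
    for (const auto& kv : thread_attributes()) record[kv.first] = kv.second;
    for (const auto& kv : logger_attributes) record[kv.first] = kv.second;
    return record;
}
```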

11. Exception logging support (Andrey Semashev)

The library should provide means to log an exception being thrown. It would be nice if the logging record included information about the place in sources where the exception was thrown.

Rationale: It is often convenient to log an exception that indicates an error. This often helps in debugging the application and diagnosing its undesirable behavior.

Design note: Most likely this functionality will be implemented as a macro or a set of macros that constructs the exception, logs it and throws. Probably, Boost.Exception support should be implemented, though the functionality should not be restricted to it (at least <stdexcept> classes should work too). Preliminary syntax could be something like:

#define BOOST_LOG_THROW(logger, svty, exception_type, strm) \
  if (true) { \
    exception_type e; \
    wrap(e) << strm; \
    BOOST_LOG_SEV(logger, svty) \
      << "Exception occurred at " \
      << __FILE__ << ':' << __LINE__ \
      << ". " << e; \
    throw e; \
  } else ((void)0)


BOOST_LOG_THROW(lg, minor,
  std::runtime_error, "CPU not found");

BOOST_LOG_THROW(lg, major,
  std::invalid_argument, "The argument X has invalid value " << X);

BOOST_LOG_THROW(lg, fatal,
  boost::exception, "I'm dying, here's my dump: " << dump);

12. Log4j parser support for configuration files (Ilya Murav'jov)

I think Boost.Logging needs a parser to make configuring logging easy; the log4j file format is a reasonable standard for that: http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/PropertyConfigurator.html#doConfigure(java.lang.String,%20org.apache.log4j.spi.LoggerRepository%29

And Boost.Spirit helps excellently in that!

Andrey Semashev: This was already under consideration. Although it looks tempting, there is no common and convenient way to store settings. Log4j does not work very well since the proposed library architecture is way too different from log4j. Other solutions might be reading from a simple text file, an XML file, the Windows registry or the command line, but all these solutions are still not universal (e.g. the user's application might need its own settings and their format may differ from what the library offers). Therefore there will be no such feature out of the box, at least at first.

13. Choice for thread-safety (Ilya Murav'jov)

Currently Boost.Logging, when compiled with gcc, depends on the pthread library whether or not I choose thread locking for logger_format_write<> (see bl::array::shared_ptr_holder<>). So, if I do not want thread safety, I have to construct log_type manually.

Andrey Semashev: I hope to provide a compile-time choice for thread safety. That is, at the time of the library compilation you will have to choose whether you want it or not.
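
A compile-time choice of this kind might be sketched as below. MYLOG_NO_THREADS, null_mutex, basic_lock and basic_sink are hypothetical names; the point is that the threading header is only included, and a real mutex only instantiated, when thread safety is requested.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// No-op stand-in with the same lock()/unlock() interface as a real mutex.
struct null_mutex {
    void lock() {}
    void unlock() {}
};

// MYLOG_NO_THREADS is a hypothetical configuration macro set when the
// library is built. The threading header is pulled in only when needed.
#ifdef MYLOG_NO_THREADS
typedef null_mutex default_mutex;
#else
#include <mutex>
typedef std::mutex default_mutex;
#endif

// Minimal RAII lock that works with either mutex type.
template <typename Mutex>
class basic_lock {
public:
    explicit basic_lock(Mutex& m) : m_mutex(m) { m_mutex.lock(); }
    ~basic_lock() { m_mutex.unlock(); }

    basic_lock(const basic_lock&) = delete;
    basic_lock& operator=(const basic_lock&) = delete;

private:
    Mutex& m_mutex;
};

// Hypothetical sink whose locking policy is fixed at compile time.
template <typename Mutex = default_mutex>
class basic_sink {
public:
    void consume(const std::string& record) {
        basic_lock<Mutex> lock(m_mutex);
        m_records.push_back(record);
    }

    std::size_t size() {
        basic_lock<Mutex> lock(m_mutex);
        return m_records.size();
    }

private:
    Mutex m_mutex;
    std::vector<std::string> m_records;
};
```

With null_mutex the lock and unlock calls are empty inline functions, so a single-threaded build carries no locking cost and no dependency on a threading library.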

14. Minimal thread synchronization impact (Yuriy Sosov)

In a multithreaded application, transferring the log records from the log sources to the log sinks will require the thread with the log source to obtain the ownership of the sink (or the logging core if logging through the core) to transfer the log records. If several threads with the sources have to move their log records to the same sink at the same time, they would be able to do that only one after the other. This will cause synchronization of the execution in the logging threads. The purpose of using logs in a multithreaded application is often to reveal the sequence of events leading to some undesirable threading behavior such as deadlocks or races. The synchronization impact of logging may change the event sequence greatly so that the behavior in question will never happen when using logs. It is highly desirable to have an option where logging from the separate threads causes as little synchronization impact as possible. In practice this would probably mean that the number of operations used for the record transfer in between acquiring and releasing the ownership has to be brought to an absolute minimum.
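
One common way to keep the critical section short is sketched below, as an illustration rather than a proposed design: the record is fully formatted before the lock is taken, the lock covers only a move into a queue, and the slow output work happens on a batch that was swapped out under the lock. The name queue_sink is hypothetical.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

class queue_sink {
public:
    // Called by logging threads. The text is already formatted, so the
    // locked region is a single move into the queue (plus an occasional
    // reallocation of the vector).
    void push(std::string formatted) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(std::move(formatted));
    }

    // Called by a writer thread. The queue is swapped out under the lock;
    // the caller then does the slow I/O on the returned batch without
    // holding the lock.
    std::vector<std::string> drain() {
        std::vector<std::string> batch;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            batch.swap(m_pending);
        }
        return batch;
    }

private:
    std::mutex m_mutex;
    std::vector<std::string> m_pending;
};
```

This reduces, but does not remove, the synchronization effect described above: threads still serialize briefly on the mutex, and records may reach the output later than they were generated.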

Design Requirements

1. Configurable log message attributes (JD)

Req: Defining the format for a sink shall be as natural as possible. Ex:

 sink.set_format("[" << boost::logging::level << "]" << " - " << boost::logging::date("%d/%m/%Y") << "," << boost::logging::timestamp << "," << boost::logging::message);

2. Macro access (JD)

Req.: The library should define a macro to log anywhere in the code without the overhead of declaring an extern global variable. Ex:

 BOOST_LOG("employee::set_salary: salary = " << salary);

General Thoughts (An area to brainstorm unsorted things)

A feature that came in handy recently was a certain log level that caused a full flush of all logfile caches (which were needed to improve performance). The reason is that high-level log messages are often followed by application crashes. If the log files are not flushed, the most interesting log parts are lost; if they are flushed every time, then a verbose program experiences heavy delays. (Michael Lacher)

Andrey Semashev: I guess there's no actual relation between the crashability of the application and the severity of logging records. I mean, in a perfect world they should be related, but in reality we usually don't know where the bug is and when the app would crash. Therefore I would propose an "auto flush" mode that may be enabled per sink at run time. In this mode the sink would flush its buffers, if any, after each logging record is written. Such a mode would be useful for debugging, when performance is not a primary concern.

Log filters should apply per sink. Consider the following example: There exist two sinks, one is a typical logfile, the other is a network sink which will notify the system administrator (say per email) of critical conditions. The second sink should only report really critical conditions, while the first will need to be more verbose to be useful. (Michael Lacher)

Andrey Semashev: Added functional requirement #9.

In my experience, there are two completely unrelated tasks for logging with quite different requirements. Usually logging libraries map those two to different log levels, but this is not really correct. There are debug messages which need a very high level (like assertion failures) but which are nonetheless completely unneeded and maybe even harmful in a release build. On the other hand, many normal logging messages such as "user clicked button x" are not really important in most cases, and it is cumbersome to have to enable those just to be able to see debug log messages. (Michael Lacher)
- Debugging: print log messages useful for the developer. They document things like: "entering function foo()", or "the value of bar = 5".
- Logging: print log messages useful for the program user. They are needed to understand unexpected conditions during usage like: "file xyz not found", "directory baz is not readable".

In general, developers do need to compile certain log statements OUT depending on the build configuration (debug, release, profile, etc.). Boost.Logging should make it as easy as possible to achieve this. (Alessio M.)

Andrey Semashev: Such a feature looks like channels. It may be implemented as a special attribute which tells the channel name or id. With filtering, sinks will be able to extract specific channels' messages from the whole flow.

Another useful feature to support: sinks and sources should be independent enough to be able to run in different processes or on different machines. It is not uncommon to have all the logging sent via a socket to a different machine, where it is extracted from the socket and written to disk. It would be nice if the architecture allowed such separation to be implemented. (Zigmar)

Yes that's interesting. The library should be able to log to and stream from, but shall not manage interconnection.

     Log generated >- XXXX -> log written by sink.

What happens at XXXX is not managed by the library. XXXX is typically process pipes or network streaming. (JD)

I would actually make one or more "IPC" (inter process communication) sinks (based on say sockets, ...) and corresponding "IPC" sources. If the library is extensible then the user can add their own transport mechanisms, but at least everything would stay within the logging library. (Michael Lacher)

Andrey Semashev: Nice. In the light of the library development results I would suggest a slightly different scheme of such feature which is quite possible now. Application: Log generated by the app via usual loggers -> Log sent to the receiver by a special sink (say, syslog over the network). Log receiver: Log collected from the network with a special source object -> filtering, formatting, etc. -> Log written by a sink to a file.

Log filters should be changeable at runtime. It is very useful, often essential, to just ask a user to crank up the logging level to investigate an issue on-the-fly. This would of course apply only to the log statements that have made it through compilation. (Alessio M.)
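
For the simple case of a severity threshold, a run-time adjustable filter could be sketched as follows. severity_filter is a hypothetical name; the atomic variable lets one thread change the level while other threads keep logging, without a lock.

```cpp
#include <atomic>

// Hypothetical filter whose threshold can be changed while the program runs.
class severity_filter {
public:
    explicit severity_filter(int min_severity) : m_min(min_severity) {}

    // Safe to call from any thread, e.g. from an admin command handler.
    void set_threshold(int min_severity) {
        m_min.store(min_severity, std::memory_order_relaxed);
    }

    // Returns true if a record with this severity should pass.
    bool operator()(int severity) const {
        return severity >= m_min.load(std::memory_order_relaxed);
    }

private:
    std::atomic<int> m_min;
};
```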

Implementation (Andrey Semashev)

There is a preliminary library implementation in the Boost.Vault, available at the following link:

http://www.boost-consulting.com/vault/index.php?action=downloadfile&filename=BoostLog.zip&directory=&;

The actual library code is hosted on the SourceForge CVS and is available here:

http://sourceforge.net/projects/boost-log

Online documentation is available here:

http://boost-log.sourceforge.net/libs/log/doc/html/index.html

There is also another implementation here:

http://torjo.com/log2/

and the code is in the Boost sandbox SVN:

http://svn.boost.org/svn/boost/sandbox/logging/

Disclaimer: This site is not officially maintained by Boost Developers