Parsing

class apachelogs.LogParser(format, encoding='iso-8859-1', errors=None)

A class for parsing Apache access log entries in a given log format. Instantiate with a log format string, and then use the parse() and/or parse_lines() methods to parse log entries in that format.

Parameters
  • format (str) – an Apache log format

  • encoding (str) – the encoding used to decode certain strings in log entries (see Supported Directives); defaults to 'iso-8859-1'. Set to 'bytes' to have those strings returned as bytes values instead of str.

  • errors (str) – the error handling scheme to use when decoding; if left as None, 'strict' is used

Raises
parse(entry)

Parse an access log entry according to the log format and return a LogEntry object.

Parameters

entry (str) – an access log entry to parse

Return type

LogEntry

Raises

InvalidEntryError – if entry does not match the log format

parse_lines(entries, ignore_invalid=False)

Parse the elements in an iterable of access log entries (e.g., an open text file handle) and return a generator of LogEntry objects. If ignore_invalid is True, any entries that do not match the log format are silently discarded; otherwise, the first such entry causes an InvalidEntryError to be raised.

Parameters
  • entries – an iterable of str

  • ignore_invalid (bool) – whether to silently discard entries that do not match the log format

Return type

LogEntry generator

Raises

InvalidEntryError – if an element of entries does not match the log format and ignore_invalid is False

class apachelogs.LogEntry

A parsed Apache access log entry. The value associated with each directive in the log format is stored as an attribute on the LogEntry object; for example, if the log format contains a %s directive, the LogEntry for a parsed entry will have a status attribute containing the status value from the entry as an int. See Supported Directives for the attribute names & types of each directive supported by this library.

If the log format contains two or more directives that are stored in the same attribute (e.g., %D and %{us}T), the given attribute will contain the first non-None directive value.

The values of date & time directives are stored in a dict attribute named request_time_fields. If this dict contains enough information to assemble a complete (possibly naïve) datetime.datetime, then the LogEntry will also have a request_time attribute equal to that datetime.datetime.

entry = None

The original logfile entry with trailing newlines removed

format = None

The entry’s log format string

apachelogs.parse(format, entry, encoding='iso-8859-1', errors=None)

A convenience function for parsing a single logfile entry without having to directly create a LogParser object.

encoding and errors have the same meaning as for LogParser.

apachelogs.parse_lines(format, entries, encoding='iso-8859-1', errors=None, ignore_invalid=False)

A convenience function for parsing an iterable of logfile entries without having to directly create a LogParser object.

encoding and errors have the same meaning as for LogParser. ignore_invalid has the same meaning as for LogParser.parse_lines().