Parsing
- class apachelogs.LogParser(format, encoding='iso-8859-1', errors=None)
A class for parsing Apache access log entries in a given log format. Instantiate with a log format string, and then use the parse() and/or parse_lines() methods to parse log entries in that format.
- Parameters:
format (str) – an Apache log format
encoding (str) – the encoding to use for decoding certain strings in log entries (see Supported Directives); defaults to 'iso-8859-1'. Set to 'bytes' to cause the strings to be returned as bytes values instead of str.
errors (str) – the error handling scheme to use when decoding; defaults to 'strict'
- Raises:
InvalidDirectiveError – if an invalid directive occurs in format
UnknownDirectiveError – if an unknown directive occurs in format
- parse(entry)
Parse an access log entry according to the log format and return a LogEntry object.
- Parameters:
entry (str) – an access log entry to parse
- Return type:
LogEntry
- Raises:
InvalidEntryError – if entry does not match the log format
- parse_lines(entries, ignore_invalid=False)
Parse the elements in an iterable of access log entries (e.g., an open text file handle) and return a generator of LogEntry objects. If ignore_invalid is True, any entries that do not match the log format will be silently discarded; otherwise, such an entry will cause an InvalidEntryError to be raised.
- Parameters:
entries – an iterable of access log entries to parse
ignore_invalid (bool) – whether to silently discard entries that do not match the log format
- Return type:
LogEntry generator
- Raises:
InvalidEntryError – if an element of entries does not match the log format and ignore_invalid is False
- class apachelogs.LogEntry
A parsed Apache access log entry. The value associated with each directive in the log format is stored as an attribute on the LogEntry object; for example, if the log format contains a %s directive, the LogEntry for a parsed entry will have a status attribute containing the status value from the entry as an int. See Supported Directives for the attribute names & types of each directive supported by this library.
If the log format contains two or more directives that are stored in the same attribute (e.g., %D and %{us}T), the given attribute will contain the first non-None directive value.
The values of date & time directives are stored in a request_time_fields: dict attribute. If this dict contains enough information to assemble a complete (possibly naïve) datetime.datetime, then the LogEntry will have a request_time attribute equal to that datetime.datetime.
- directives
New in version 0.3.0.
A dict mapping individual log format directives (e.g., "%h" or "%<s") to their corresponding values from the log entry. %{*}t directives with multiple subdirectives (e.g., %{%Y-%m-%d}t) are broken up into one entry per subdirective (for %{%Y-%m-%d}t, this would become the three keys "%{%Y}t", "%{%m}t", and "%{%d}t"). This attribute provides an alternative means of looking up directive values besides using the named attributes.
- entry
The original logfile entry with trailing newlines removed
- format
The entry’s log format string
- apachelogs.parse(format, entry, encoding='iso-8859-1', errors=None)
A convenience function for parsing a single logfile entry without having to directly create a LogParser object. encoding and errors have the same meaning as for LogParser.
- apachelogs.parse_lines(format, entries, encoding='iso-8859-1', errors=None, ignore_invalid=False)
A convenience function for parsing an iterable of logfile entries without having to directly create a LogParser object. encoding and errors have the same meaning as for LogParser. ignore_invalid has the same meaning as for LogParser.parse_lines().