summaryrefslogtreecommitdiffstats
path: root/doc/mmutf8fix.html
diff options
context:
space:
mode:
Diffstat (limited to 'doc/mmutf8fix.html')
-rw-r--r--doc/mmutf8fix.html87
1 files changed, 87 insertions, 0 deletions
diff --git a/doc/mmutf8fix.html b/doc/mmutf8fix.html
new file mode 100644
index 00000000..c75e71bc
--- /dev/null
+++ b/doc/mmutf8fix.html
@@ -0,0 +1,87 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
+<html><head>
+<meta http-equiv="Content-Language" content="en">
+<title>Fix invalid UTF-8 Sequences (mmutf8fix)</title></head>
+
+<body>
+<a href="rsyslog_conf_modules.html">back</a>
+
+<h1>Fix invalid UTF-8 Sequences (mmutf8fix)</h1>
+<p><b>Module Name:&nbsp;&nbsp;&nbsp; mmutf8fix</b></p>
+<p><b>Author: </b>Rainer Gerhards &lt;rgerhards@adiscon.com&gt;</p>
+<p><b>Available since</b>: 7.5.4</p>
+<p><b>Description</b>:</p>
+<p>The mmutf8fix module permits to fix invalid UTF-8 sequences.
+Most often, such invalid sequences result from syslog sources sending
+in non-UTF character sets, e.g. ISO 8859. As syslog does not have a way
+to convey the character set information, these sequences are not properly
+handled. While they are typically uncritical with plain text files, they can
+cause big headache with database sources as well as systems like ElasticSearch.
+<p>The module is an experiement at "fixing" such encoding problems. It
+begun as a very simple replacer of non-control characters, and actually breaks
+some UTF-8 encoding right now. If the module turns out to be useful, it
+should be enhanced to support modes that really detect invalid UTF8. In the longer term
+it could also be evolved into an any-charset-to-UTF8 converter. But
+first let's see if it really gets into widespread enough use.
+<p>What it currently does is simply replace all US-ASCII control characters
+(characters ouside the range of 32 to 126) by a configured replacement
+character. For forward compatibility, this will remain the default mode
+in the future. However, as said above, more useful modes will be added
+based on user feedback and demand.
+
+<p><b>Proper Usage</b>:</p>
+<p>Some notes are due for proper use of this module. This is a message modification
+module utilizing the action interface, which means you call it like an action.
+This gives great flexibility on the question on when and how to call this module.
+Note that once it has been called, it actually modifies the message. The original
+messsage is then no longer available. However, this does <b>not</b> change any
+properties set, used or extracted before the modification is done.
+<p>One potential use case is to normalize all messages. This is done by simply calling
+mmutf8fix right in front of all other actions.
+<p>If only a specific source (or set of sources) is known to cause problems,
+mmutf8fix can be conditionally called only on messages from them. This also offers
+performance benefits. If such multiple sources exists, it probably is a good idea
+to define different listeners for their incoming traffic, bind them to specific
+<a href="multi_ruleset.html">ruleset</a> and call mmutf8fix as first action in this
+ruleset.
+
+<p><b>Module Configuration Parameters</b>:</p>
+<p>Currently none.
+<p>&nbsp;</p>
+<p><b>Action Confguration Parameters</b>:</p>
+<ul>
+<li><b>replacementChar</b> - default " " (space), a single character<br>
+This is the character that invalid sequences are replaced by.
+</ul>
+
+<p><b>Caveats/Known Bugs:</b>
+<ul>
+<li><b>only IPv4</b> is supported
+</ul>
+
+<p><b>Samples:</b></p>
+<p>In this snippet, we write one file without fixing UTF-8 and another one
+with the message fixed. Note that once mmutf8fix has run, access to the
+original message is no longer possible.
+<p><textarea rows="5" cols="60">module(load="mmutf8fix")
+action(type="omfile" file="/path/to/non-fixed.log")
+action(type="mmutf8fix")
+action(type="omfile" file="/path/to/fixed.log")
+</textarea>
+
+<p>In this sample, we fix only message originating from host 10.0.0.1.
+<p><textarea rows="5" cols="60">module(load="mmutf8fix")
+if $fromhost-ip == "10.0.0.1" then
+ action(type="mmutf8fix")
+# all other actions here...
+</textarea>
+
+<p>[<a href="rsyslog_conf.html">rsyslog.conf overview</a>] [<a href="manual.html">manual
+index</a>] [<a href="http://www.rsyslog.com/">rsyslog site</a>]</p>
+<p><font size="2">This documentation is part of the
+<a href="http://www.rsyslog.com/">rsyslog</a> project.<br>
+Copyright &copy; 2013 by <a href="http://www.gerhards.net/rainer">Rainer Gerhards</a> and
+<a href="http://www.adiscon.com/">Adiscon</a>. Released under the GNU GPL
+version 3 or higher.</font></p>
+
+</body></html>