Log Analysis

Discuss and get help to implement a CacheGuard Gateway into your networks
david
Posts: 163
Joined: 08 Aug 2015 20:38

Re: Log Analysis

Post by david »

Hi Miguel

Some forums are not very responsive, so you may have to wait a bit longer for an answer.
Anyway, did you try the following Stone Steps Webalizer configuration (clf replaced by apache)?

Code:

LogType apache
ApacheLogFormat %h %u %t \"%r\" %>s %b - -
Best Regards,
David Janeway
CacheGuard Technical Team
https://www.cacheguard.com
miguelp
Posts: 46
Joined: 17 Aug 2015 13:06

Re: Log Analysis

Post by miguelp »

Hello,
Thanks!
Yes, I tried that option too. Same result: "skipping bad record" for every record.
Br,
Miguel
david
Posts: 163
Joined: 08 Aug 2015 20:38

Re: Log Analysis

Post by david »

Dear Miguel

Maybe we should abandon Stone Steps Webalizer and use another product. Do you know AWStats (http://www.awstats.org/)? Personally, I don't like applications written in Perl, but it's worth testing. What do you think?

Best Regards
David Janeway
CacheGuard Technical Team
https://www.cacheguard.com
miguelp
Posts: 46
Joined: 17 Aug 2015 13:06

Re: Log Analysis

Post by miguelp »

Hello David,

Using this configuration for AWStats (http://www.awstats.org/):
LogFormat=1
LogFormat = "%host %logname %time1 %methodurl %code %bytesd %other %other"
It works. Thanks!

Since AWStats is geared more toward web server logs, I cannot get any statistics per authenticated user (%logname).
I will study this in more detail.

Anyway, I was thinking it would be really useful if CG could be configured to output a log with fields separated by |, for example. That way the logs could easily be loaded into Hadoop/Pig, for example, which I suppose is pretty common nowadays.

192.168.110.4|prueba|[07/Sep/2015:22:57:02 +0300]|"POST http://ocsp.digicert.com/ HTTP/1.1"|200|861|TCP_MISS|HIER_DIRECT

I will let you know if I can solve the user statistics issue.
Thanks,
Miguel
david
Posts: 163
Joined: 08 Aug 2015 20:38

Re: Log Analysis

Post by david »

Hi Miguel

Don't you think that Hadoop/Pig can be configured to accept a white space as a delimiter?

Best Regards,
David Janeway
CacheGuard Technical Team
https://www.cacheguard.com
miguelp
Posts: 46
Joined: 17 Aug 2015 13:06

Re: Log Analysis

Post by miguelp »

Hi,
A quick, one-line load like:

web_usage_logs = load 'cache_guardlog' using PigStorage(' ');

won't work, because other fields also contain spaces, such as the timestamp.
Also, the timestamp is not delimited as text with quotes ("), but with [].
Of course, it can be done, but you need to write more code.

If the separator were '|', you could load the log with one line:
web_usage_logs = load 'cache_guardlog' using PigStorage('|');
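
To illustrate the difference outside Pig, here is a minimal Python sketch of what a one-line split does with each delimiter. The space-delimited sample is an assumption: it is the pipe-delimited record posted earlier in this thread with the pipes swapped for single spaces, and naive_split is a hypothetical helper name.

```python
# Pipe-delimited sample record quoted earlier in this thread.
PIPE_LINE = ('192.168.110.4|prueba|[07/Sep/2015:22:57:02 +0300]|'
             '"POST http://ocsp.digicert.com/ HTTP/1.1"|200|861|TCP_MISS|HIER_DIRECT')

# Assumption: the classic format is the same record with single spaces
# between fields instead of pipes.
SPACE_LINE = PIPE_LINE.replace('|', ' ')


def naive_split(line, delimiter):
    """Split on one delimiter character, as a one-line loader would."""
    return line.split(delimiter)


space_fields = naive_split(SPACE_LINE, ' ')
pipe_fields = naive_split(PIPE_LINE, '|')

# The timestamp and the request are cut into pieces by the space split,
# so the two counts differ even though the record is the same.
print(len(space_fields), len(pipe_fields))
print(pipe_fields[2])
```

With the space split, the timestamp ends up in two elements and the request in three, which is why the extra parsing code is needed.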

Br,
Miguel
david
Posts: 163
Joined: 08 Aug 2015 20:38

Re: Log Analysis

Post by david »

Hi Miguel

We can consider your proposal to add a feature to CG that allows customising the delimiter character in logs (that shouldn't take us long). But what about situations where a URL contains that custom delimiter character (for instance, the pipe character)?

Best Regards,
David Janeway
CacheGuard Technical Team
https://www.cacheguard.com
miguelp
Posts: 46
Joined: 17 Aug 2015 13:06

Re: Log Analysis

Post by miguelp »

Hi,
Then it could be the TAB character. I do not think URLs will contain TABs, or will they?
Good catch.
Thanks,
Miguel
david
Posts: 163
Joined: 08 Aug 2015 20:38

Re: Log Analysis

Post by david »

Hi Miguel

Actually, a URL may contain a <TAB>, but it is encoded as %09, so the <TAB> character could be a good candidate to replace the white space as the default delimiter in CacheGuard log files. Thanks for the idea :-)
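
As a small illustration of the %09 encoding, the following Python sketch percent-encodes a URL that carries a literal TAB. The URL is hypothetical, and whether every client or proxy always logs the encoded form is an assumption, not something verified in this thread.

```python
from urllib.parse import quote, unquote

# Hypothetical URL with a literal TAB in its query string.
raw_url = "http://example.com/search?q=a\tb"

# Percent-encode characters that may not appear literally in a URL.
# ':', '/', '?' and '=' are kept because they structure the URL.
encoded_url = quote(raw_url, safe=":/?=")

print(encoded_url)
print("\t" in encoded_url)              # is a literal TAB left in the encoded form?
print(unquote(encoded_url) == raw_url)  # does decoding restore the original?
```

Because the encoded form holds no literal TAB, a record that uses TAB between fields would not be split inside the URL.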

In a future version we can consider making the delimiters in log files customizable. But this is not really compatible with the spirit of CacheGuard, which is to remain as simple/functional as possible. The decision requires discussion...

In the meantime, we can replace white spaces with <TAB> characters in log files and add this feature to the future maintenance release. Do you think that would resolve your issues? What about the integration with Webalizer, Stone Steps Webalizer and Hadoop/Pig?
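
For logs that already exist in the space-delimited form, the same replacement could be done outside CacheGuard. The following is a minimal Python sketch under stated assumptions: the eight-field layout is inferred from the sample record posted in this thread, and LOG_PATTERN and to_tab_delimited are hypothetical names, not CacheGuard features.

```python
import re

# Assumed layout, inferred from the sample record in this thread:
# host user [timestamp] "request" status bytes cache_result hierarchy
LOG_PATTERN = re.compile(
    r'(\S+) (\S+) (\[[^\]]+\]) ("[^"]*") (\S+) (\S+) (\S+) (\S+)$'
)


def to_tab_delimited(line):
    """Return the record with TABs between fields, or None if it does not match."""
    match = LOG_PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    return "\t".join(match.groups())


sample = ('192.168.110.4 prueba [07/Sep/2015:22:57:02 +0300] '
          '"POST http://ocsp.digicert.com/ HTTP/1.1" 200 861 TCP_MISS HIER_DIRECT')

converted = to_tab_delimited(sample)
print(converted.split("\t"))
```

Spaces inside the bracketed timestamp and the quoted request are left untouched; only the separators between fields become TABs.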

Best Regards,
David Janeway
CacheGuard Technical Team
https://www.cacheguard.com
miguelp
Posts: 46
Joined: 17 Aug 2015 13:06

Re: Log Analysis

Post by miguelp »

Hi,
The status so far is:

Webalizer: it works, but I cannot get the username; still checking.
Stone Steps: they released a patch today (14.09.2015); I will check.
Hadoop/Pig: custom code is needed to load.

If you replace the spaces with TABs:
Webalizer: I will try.
Stone Steps: won't work.
Hadoop/Pig: can be easily loaded.

But in my opinion, it should be configurable.

Option Classic (as it is now)
For classic (or older) tools: Webalizer, Stone Steps, etc.

Option Custom (tab or pipe)
For newer tools (Hadoop/Pig, Power BI, etc.).

If we choose pipe and there is a pipe in the URL, it won't break anything, since the URL is between quotes (").
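
This holds only for readers that honor the quotes. The following Python sketch contrasts a plain split with a quote-aware parse, using a hypothetical record whose URL contains a pipe. Whether PigStorage or any other loader treats quotes this way is not established in this thread, so take it as an illustration in Python only.

```python
import csv

# Hypothetical record: same layout as the sample in this thread,
# but with a pipe inside the quoted request URL.
line = ('192.168.110.4|prueba|[07/Sep/2015:22:57:02 +0300]|'
        '"GET http://example.com/a|b HTTP/1.1"|200|861|TCP_MISS|HIER_DIRECT')

# A plain split ignores the quotes and cuts the request in two.
naive_fields = line.split('|')

# A quote-aware reader keeps the quoted request together as one field.
quoted_fields = next(csv.reader([line], delimiter='|', quotechar='"'))

print(len(naive_fields), len(quoted_fields))
print(quoted_fields[3])
```

A tool that only splits on the delimiter would see one field too many for such a record, which is the case TAB avoids.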

Br,
Miguel