add logparser module by hamedbrd · Pull Request #4900 · netdata/netdata · GitHub

add logparser module#4900

Closed
hamedbrd wants to merge 16 commits into
netdata:masterfrom
hamedbrd:logparser

Conversation

@hamedbrd

@hamedbrd hamedbrd commented Dec 3, 2018

Contributor

Fix: #3729
This PR is the same as #4334, but I forked the latest version and added my new plugin to it.

I added this plugin under collectors/python.d.plugin/logparser/, because the directory structure in the previous version was completely different.

This module can monitor an application-specific log file and create a chart based on occurrences of a log line (like the number of occurrences in a day, or a line graph of when the occurrences happen). It supports multiple charts, each with multiple dimensions.

Config patterns

chart_name:
    log_path: path/log/file
    dimensions:
      dimension: regex_pattern

The config above shows how to define your charts and how to fetch metrics from custom log files.
Each dimension in each chart needs one regex, which is used to find the matching lines in that log file. This lets us define whatever charts we want to show in the dashboard.

A final config for more than one chart and more than one dimension could look something like this:

chart1_name:
    log_path: /path/logs/log.file
    dimensions:
      dimension_name1: .*GET.*
      dimension_name2: .*POST.*
      dimension_name3: .*PATCH.*
chart2_name:
    log_path: /path/logs/log2.file
    dimensions:
      dimension_name1: '[0-9]+'
      dimension_name2: '[A-Z]+'
      dimension_name3: '[a-z]+'
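
To make the counting idea concrete, here is a minimal sketch of how one regex per dimension could be turned into per-dimension counts. It is not the module's actual code: the `CONFIG` dict, the `count_matches` helper and the sample lines are all hypothetical, and the config is assumed to be already loaded into a plain dict.

```python
import re

# Hypothetical in-memory form of one chart from the config (names are illustrative).
CONFIG = {
    "chart1_name": {
        "log_path": "/path/logs/log.file",
        "dimensions": {
            "dimension_name1": ".*GET.*",
            "dimension_name2": ".*POST.*",
            "dimension_name3": ".*PATCH.*",
        },
    },
}


def count_matches(lines, dimensions):
    """Return {dimension_name: number of lines matching that dimension's regex}."""
    compiled = {name: re.compile(pattern) for name, pattern in dimensions.items()}
    counts = {name: 0 for name in compiled}
    for line in lines:
        for name, regex in compiled.items():
            if regex.search(line):
                counts[name] += 1
    return counts


# Made-up log lines, used only to exercise the helper.
sample_lines = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 POST /login 302",
    "10.0.0.1 GET /favicon.ico 404",
]
counts = count_matches(sample_lines, CONFIG["chart1_name"]["dimensions"])
print(counts)
```

In this sketch a line may be counted for several dimensions if several patterns match it.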

@ilyam8

ilyam8 commented Dec 3, 2018

Member

@hamedbrd

hamedbrd commented Dec 3, 2018

Contributor Author

@ilyam8 could you explain it a bit more?
Do you mean that instead of

regex.search(line)

it's better to use

regex.match(line)

Am I right?
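
For reference, a small sketch of how the two calls differ in Python's `re` module: `search` looks for the pattern anywhere in the line, while `match` only succeeds if the pattern matches at the start of the line. The sample line is made up.

```python
import re

regex = re.compile(r"GET")
line = "10.0.0.1 GET /index.html"

# search() scans the whole line for the pattern.
found_anywhere = regex.search(line) is not None

# match() only tries at the beginning of the line.
found_at_start = regex.match(line) is not None

print(found_anywhere, found_at_start)
```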

@ilyam8

ilyam8 commented Dec 3, 2018

Member

It depends on the method/value.

Example

foo:
  bar1: 'string=GET'
  bar2: 'string=^GOT'
  bar3: 'regex=G[QWERTY]T'

You should parse the value and use the appropriate matcher for it.

We need a matcher. It has only one method, match(self, row), which returns True/False.

We can use any logic for matching (value in row, row.startswith(value), value.match(row)); each of them gives a truthy or falsy result.

Matchers:

Matchers example

import re


class BaseStringMatcher:
    """Base for matchers that compare rows against a plain string."""

    def __init__(self, value):
        self.value = value

    def match(self, row):
        raise NotImplementedError


class BaseRegexMatcher:
    """Base for matchers that compare rows against a compiled regex."""

    def __init__(self, value):
        self.value = re.compile(value)

    def match(self, row):
        raise NotImplementedError


class StringMatcher(BaseStringMatcher):

    def match(self, row):
        return self.value in row


class StringPrefixMatcher(BaseStringMatcher):

    def match(self, row):
        return row.startswith(self.value)


class StringSuffixMatcher(BaseStringMatcher):

    def match(self, row):
        return row.endswith(self.value)


class RegexMatchMatcher(BaseRegexMatcher):

    def match(self, row):
        # re's match() returns a Match object or None, so convert to bool
        return self.value.match(row) is not None


class RegexSearchMatcher(BaseRegexMatcher):

    def match(self, row):
        # re's search() returns a Match object or None, so convert to bool
        return self.value.search(row) is not None

Matcher factory

METHOD_REGEX = "regex"
METHOD_STRING = "string"


def regex_matcher_factory(value):
    if value.startswith('^'):
        return RegexMatchMatcher(value)
    return RegexSearchMatcher(value)


def string_matcher_factory(value):
    # '^' and '$' are only markers for string values, so strip them before comparing
    if value.startswith('^'):
        return StringPrefixMatcher(value[1:])
    elif value.endswith('$'):
        return StringSuffixMatcher(value[:-1])
    return StringMatcher(value)


def matcher_factory(raw_value):
    # split on the first '=' only, so the value itself may contain '='
    method, value = raw_value.split('=', 1)

    if method == METHOD_REGEX:
        return regex_matcher_factory(value)

    if method == METHOD_STRING:
        return string_matcher_factory(value)

    raise ValueError('unknown search method')
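
As a usage illustration, here is a compact, self-contained variant of the same idea that builds plain predicate functions instead of classes. The `make_matcher` helper is hypothetical, and it assumes that a leading `^` or trailing `$` in a `string=` value is only a marker and is stripped before comparison. It is applied to the `foo` example values from the comment above.

```python
import re


def make_matcher(raw_value):
    """Build a row -> bool predicate from a 'method=value' string (hypothetical helper)."""
    method, value = raw_value.split("=", 1)
    if method == "string":
        if value.startswith("^"):
            prefix = value[1:]
            return lambda row: row.startswith(prefix)
        if value.endswith("$"):
            suffix = value[:-1]
            return lambda row: row.endswith(suffix)
        return lambda row: value in row
    if method == "regex":
        regex = re.compile(value)
        return lambda row: regex.search(row) is not None
    raise ValueError("unknown search method")


# Dimension values taken from the example config in the comment.
dimensions = {"bar1": "string=GET", "bar2": "string=^GOT", "bar3": "regex=G[QWERTY]T"}
matchers = {name: make_matcher(raw) for name, raw in dimensions.items()}

row = "GOT 200 /index"  # made-up log row
hits = {name: match(row) for name, match in matchers.items()}
print(hits)
```

Only `bar2` accepts this row: it starts with `GOT`, it does not contain `GET`, and `O` is not in the `[QWERTY]` character class.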

@netdatabot netdatabot added area/collectors Everything related to data collection area/docs area/external/python labels Dec 3, 2018
@ilyam8

ilyam8 commented Dec 3, 2018

Member

@hamedbrd

if you need any help with the module feel free to ask

@paulfantom paulfantom added the no changelog Issues which are not going to be added to changelog label Dec 4, 2018
@ilyam8

ilyam8 commented Dec 5, 2018

Member

@hamedbrd if you don't have time to finish it i can take over, np

@ilyam8 ilyam8 modified the milestone: v1.12-rc1 Dec 6, 2018
cakrit
cakrit previously requested changes Dec 6, 2018

@cakrit cakrit left a comment

Contributor


In readme and config instructions,

  1. Correct first paragraph to:
    This module is able to monitor an application specific log file and then create a chart based on occurrences of a log line (like amount of occurrences in a day, or a line graph of when the occurrences happen). It can read multiple files and produce multiple charts with multiple dimensions on each chart.

  2. Rename "Config patterns" to "Configuration" and write the following before the final example.

By default, the plugin does not read any files or produce any charts. To configure it, edit python.d/logparser.conf.

The sample config below shows the general definition of a chart named chart_name with a single dimension with name dimension_name. The metric for that dimension is a counter of the occurrences of lines matching the python regular expression regex_pattern in file path/logfile:

chart_name:
   log_path: path/logfile
   dimensions:
     dimension_name: regex_pattern

A final config for more than one chart and more than one dimension could be something like this

@hamedbrd

Contributor Author

@ilyam8 as you guessed, unfortunately I do not have time for these changes. I also have some other changes in mind to make this plugin more capable, but no free time right now.
I will update it in the next couple of weeks.

So I would appreciate your help with this.

@ilyam8

ilyam8 commented Dec 10, 2018

Member

@hamedbrd so what help do you need? do you want me to add this directly to your PR?

@cakrit cakrit dismissed their stale review December 10, 2018 08:53

Comments updated

@hamedbrd

Contributor Author

@ilyam8 could you please take a look at the new changes? If you think there are no problems, I will update the readme.

@ilyam8 ilyam8 changed the title add logparser module [wip] add logparser module Dec 10, 2018
@netdatabot

Member

This pull request introduces 1 alert when merging 13f5c48 into f1bb78a - view on LGTM.com

new alerts:

  • 1 for Unused import

Comment posted by LGTM.com

@ilyam8

ilyam8 commented Dec 10, 2018

Member

@hamedbrd logic should be

  1. create a [(dim_name, matcher), ...] list in check (use namedtuple, so it will be a list of namedtuples), populate self.data with zero values for the dimension names, create charts, etc.

  2. iterate over the (dim_name, matcher) list in get_data

for row in raw:
    for m in self.matchers:
        if m.match(row):
            self.data[m.name] += 1
            break

return self.data

example^^
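
The two steps above could be sketched roughly as follows. This is a standalone illustration, not the module's code: the `LogCounter` class, the `Matcher` namedtuple and the sample rows are hypothetical, and a plain `re.compile(...).search` stands in for the matcher objects discussed earlier.

```python
import re
from collections import namedtuple

# Hypothetical pairing of a dimension name with its row -> truthy/falsy predicate.
Matcher = namedtuple("Matcher", ["name", "match"])


class LogCounter:
    def __init__(self, dimensions):
        # Step 1: build the matcher list and zero-initialise the data dict.
        self.matchers = [
            Matcher(name, re.compile(pattern).search)
            for name, pattern in dimensions.items()
        ]
        self.data = {m.name: 0 for m in self.matchers}

    def get_data(self, raw):
        # Step 2: credit each row to the first matcher that accepts it.
        for row in raw:
            for m in self.matchers:
                if m.match(row):
                    self.data[m.name] += 1
                    break
        return self.data


counter = LogCounter({"get": r"GET", "any": r".*"})
result = counter.get_data(["GET /a", "POST /b", "GET /c"])
print(result)
```

Because of the `break`, each row is counted for at most one dimension, so the order of the matcher list matters: here `any` only receives rows that `get` did not claim. The counts also keep accumulating across calls to `get_data`.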

@hamedbrd

hamedbrd commented Dec 12, 2018

Contributor Author

@ilyam8
I don't understand your last feedback.
What do you mean? Could you please tell me more?

@hamedbrd

Contributor Author

@hamedbrd have you tested the latest version? please confirm that it works.

Yes, I've tested it many times, because it's used on different services.

@ilyam8

ilyam8 commented Dec 31, 2018

Member

i mean the latest, because we made a few changes in the last hour.

@netdata netdata deleted a comment Dec 31, 2018
@ilyam8

ilyam8 commented Dec 31, 2018

Member

@cakrit please have a look, i see you had some questions

code looks ok to me

Comment thread collectors/python.d.plugin/logparser/logparser.conf Outdated
@hamedbrd

hamedbrd commented Dec 31, 2018

Contributor Author

i mean the latest, because we made a few changes in the last hour.
Yes, each time I ran the latest changes to make sure it worked well.

@ilyam8

ilyam8 commented Dec 31, 2018

Member

@hamedbrd

One more - please add default info to the readme

after that - lgtm

@netdata netdata deleted a comment Dec 31, 2018
@cakrit

cakrit commented Jan 1, 2019

Contributor

Ok, so what I still don't see here is a time element. What I got from the code is that we just have an incremental counter for each pattern that starts at zero every time netdata is restarted and just keeps increasing with every matching line. However, the metric unit states jobs/s, so perhaps I missed something here?

@ilyam8

ilyam8 commented Jan 1, 2019

Member

jobs/s

imo should be matches/s or something

What I got from the code is that we just have an incremental counter for each pattern that starts at zero every time netdata is restarted and just keeps increasing with every matching line.

Yes
#3729 (comment)
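
To illustrate where a time element can come from with an ever-increasing counter: if the dimension is charted as the difference between consecutive samples divided by the elapsed time (an assumption here, not something confirmed from this PR's code), the chart shows matches per second rather than the raw total. A minimal sketch with a hypothetical `per_second_rates` helper and made-up samples:

```python
def per_second_rates(samples):
    """samples: list of (timestamp_seconds, counter_value) pairs, oldest first."""
    rates = []
    # Pair each sample with the next one and take the per-second difference.
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates


# Counter values collected once per second (illustrative numbers).
samples = [(0, 0), (1, 3), (2, 3), (3, 10)]
rates = per_second_rates(samples)
print(rates)
```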

@cakrit

cakrit commented Jan 1, 2019

Contributor

So what I'm saying is that we don't actually have a way here to show e.g. the occurrences in a day, as the issue and the README say we want to.
I guess it's the best we can do though.

@ilyam8

ilyam8 commented Jan 1, 2019

Member

The issue says "or":

and then create a chart based on occurrences of a log line (like amount of occurrences in a day, or a line graph of when the occurrences happen).

I think the PR implements the part after "or" 😄 So the issue will be fixed once this PR is merged.

@hamedbrd

Contributor Author

@ilyam8 @cakrit
What more does it need to be merged?

@cakrit

cakrit commented Jan 26, 2019

Contributor

@hamedbrd please remove the [wip] from the title, to signify it's ready. @ilyam8 please provide the final review after the automated checks are completed again

@hamedbrd hamedbrd changed the title [wip] add logparser module add logparser module Jan 27, 2019
@hamedbrd

hamedbrd commented Feb 5, 2019

Contributor Author

@ilyam8 any update?

@ilyam8

ilyam8 commented Feb 5, 2019

Member

@hamedbrd sorry for delay. The module is finished, good job 👍

But we will rewrite it in go (netdata has go.d.plugin) (you can do it yourself btw, if you're willing)

We need to wait until this PR is merged.

And we use new syntax for all that string/regex stuff:
https://github.com/netdata/go.d.plugin/tree/master/pkg/matcher#supported-format

@hamedbrd

hamedbrd commented Feb 5, 2019

Contributor Author

@ilyam8 I will take over the rewrite in go too, but you won't merge this MR, will you?
At the moment I don't have time for this rewrite, but I will definitely do it asap.

@ilyam8

ilyam8 commented Feb 5, 2019

Member

@hamedbrd

you won't merge this MR, will you?

No.

At the moment I don't have time for this rewrite but definitely I will do it asap.

Nice! As i said, let's wait for netdata/go.d.plugin#141.

I will ping you when it is merged.

@cakrit

cakrit commented May 29, 2019

Contributor

@hamedbrd you can follow netdata/go.d.plugin#141. I'm closing this one.

@anujw

anujw commented Jun 21, 2021


@Dim-P Dim-P mentioned this pull request Aug 30, 2022

Labels

area/collectors Everything related to data collection area/docs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Monitor application log files

6 participants