In big data scenarios that require high concurrency, effective analysis of Java error logs can reduce the O&M costs of Java applications. You can use Log Service to collect Java error logs from Alibaba Cloud services and use the data transformation feature to parse the collected logs.

Prerequisites

The Java error logs of Log Service, Object Storage Service (OSS), Server Load Balancer (SLB), and ApsaraDB RDS are collected and stored in a Logstore named cloud_product_error_log. For more information, see Use Logtail to collect logs.

Scenarios

For example, you have developed a Java application named Application A that uses multiple Alibaba Cloud services, such as OSS and Log Service. You have created a Logstore named cloud_product_error_log in the China (Hangzhou) region to store the Java error logs that are generated when the application calls the API operations of these services. To fix Java errors efficiently, you need to use Log Service to analyze the Java error logs at regular intervals.

To meet the preceding requirements, you must parse the log time, error code, status code, service name, error message, request method, and error line number from the collected logs, and then send the parsed logs to the Logstore of each cloud service for error analysis.

The following example shows a raw log:

__source__:192.0.2.10
__tag__:__client_ip__:203.0.113.10
__tag__:__receive_time__:1591957901
__topic__:
message: 2021-05-15 16:43:35 ParameterInvalid 400
com.aliyun.openservices.log.exception.LogException:The body is not valid json string.
   at com.aliyun.openservices.log.Client.ErrorCheck(Client.java:2161)
   at com.aliyun.openservices.log.Client.SendData(Client.java:2312)
   at com.aliyun.openservices.log.Client.PullLogs(Client.java:1397)
   at com.aliyun.openservices.log.Client.SendData(Client.java:2265)
   at com.aliyun.openservices.log.Client.GetCursor(Client.java:1123)
   at com.aliyun.openservices.log.Client.PullLogs(Client.java:2161)
   at com.aliyun.openservices.log.Client.ErrorCheck(Client.java:2426)
   at transformEvent.main(transformEvent.java:2559)

Procedure

The error logs of Application A are collected by using Logtail and are stored in the cloud_product_error_log Logstore. Then, the error logs are transformed and the transformed logs are sent to the Logstore of each cloud service for error analysis. The procedure consists of the following steps:
  1. Design a data transformation statement: In this step, analyze the transformation logic and write a transformation statement.
  2. Create a data transformation task: In this step, send logs to different Logstores of cloud services for error analysis.
  3. Query and analyze data: In this step, analyze error logs in the Logstore of each cloud service.

Step 1: Design a data transformation statement

Transformation procedure

To analyze error logs in a convenient manner, you must complete the following operations:
  1. Extract the log time, error code, status code, service name, error message, request method, and error line number from the message field.
  2. Send error logs to the Logstore of each cloud service.

Transformation logic

In this case, you must identify where the log time, error code, status code, service name, error message, request method, and error line number appear in the message field of the raw log, and then design a regular expression that extracts each of these values.

Syntax description

  1. Use the regex_match function to check whether a log contains LogException or OSSException. For more information, see regex_match.
  2. Use the e_switch function to select the transformation rule. If a log contains LogException, the log is transformed based on the transformation rule of Log Service error logs. If a log contains OSSException, the log is transformed based on the transformation rule of OSS error logs. For more information, see e_switch.
  3. Use the e_regex function to parse error logs for each cloud service. For more information, see e_regex.
  4. Use the e_drop_fields function to delete the message field, and use the e_output function to send error logs to the Logstore of the corresponding cloud service. For more information, see e_drop_fields and e_output and e_coutput.
  5. Use named capturing groups in the regular expressions to extract each field value. For more information, see the Group section in Regular expressions.
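The named capturing groups can be tried out locally with Python's re module before a statement is entered in the console. The following is a minimal sketch, not the production statement: the shortened sample log (two stack frames) and the simplified pattern are assumptions made for illustration.

```python
import re

# Shortened sample (assumption): header line, exception line, two stack frames.
message = (
    "2021-05-15 16:43:35 ParameterInvalid 400\n"
    "com.aliyun.openservices.log.exception.LogException:The body is not valid json string.\n"
    "   at com.aliyun.openservices.log.Client.PullLogs(Client.java:2161)\n"
    "   at transformEvent.main(transformEvent.java:2559)"
)

# Simplified pattern (assumption): one named group per field to extract.
pattern = re.compile(
    r"(?P<data_time>\S+\s\S+)\s(?P<error_code>[a-zA-Z]+)\s(?P<status>[0-9]+)\s"
    r"com\.aliyun\.openservices\.log\.exception\.(?P<product_exception>[a-zA-Z]+):"
    r"(?P<error_message>[a-zA-Z0-9:,\-\s]+)\."
    r"\s+at\scom\.aliyun\.openservices\.log\.Client\.(?P<method>[a-zA-Z]+)\S+"
    r"\s+at\stransformEvent\.main\(transformEvent\.java:(?P<error_line>[0-9]+)\)"
)

# groupdict() maps each group name to the text it captured.
match = pattern.search(message)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

Each group name becomes a field name, which is the behavior that the transformation statement relies on when it uses e_regex with named groups.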

Transformation statement syntax

The following example shows the specific syntax of a data transformation statement:
e_switch(
    regex_match(v("message"), r"LogException"),
    e_compose(
        e_regex(
            "message",
            r"(?P<data_time>\S+\s\S+)\s(?P<error_code>[a-zA-Z]+)\s(?P<status>[0-9]+)\scom\.aliyun\.openservices\.log\.exception\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9:,\-\s]+)\.(\s+\S+\s\S+){5}\s+\S+\scom\.aliyun\.openservices\.log\.Client\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java\:(?P<error_line>[0-9]+)\)",
        ),
        e_drop_fields("message"),
        e_output("sls-error"),
    ),
    regex_match(v("message"), r"OSSException"),
    e_compose(
        e_regex(
            "message",
            r"(?P<data_time>\S+\s\S+)\scom\.aliyun\.oss\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9,\s]+)\.\n\[ErrorCode\]\:\s(?P<error_code>[a-zA-Z]+)\n\[RequestId\]\:\s(?P<request_id>[a-zA-Z0-9]+)\n\[HostId\]\:\s(?P<host_id>[a-zA-Z-.]+)\n\S+\n\S+(\s\S+){3}\n\s+\S+\s+(.+)(\s+\S+){24}\scom\.aliyun\.oss\.OSSClient\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java:(?P<error_line>[0-9]+)\)",
        ),
        e_drop_fields("message"),
        e_output("oss-error"),
    ),
)
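The routing in this statement follows a first-match rule: the conditions are checked in order, and the operation paired with the first condition that matches is applied. The following Python sketch only emulates that selection logic to make the order of evaluation concrete. It is not the transformation runtime, and the ROUTES table, the pick_target helper, and the second and third sample messages are hypothetical.

```python
import re

# (condition pattern, target name) pairs, checked in order like the
# condition/operation pairs passed to e_switch.
ROUTES = [
    (re.compile(r"LogException"), "sls-error"),
    (re.compile(r"OSSException"), "oss-error"),
]

def pick_target(message):
    """Return the target name of the first matching condition, or None."""
    for condition, target in ROUTES:
        if condition.search(message):  # partial match anywhere in the message
            return target
    return None

samples = [
    "com.aliyun.openservices.log.exception.LogException:The body is not valid json string.",
    "com.aliyun.oss.OSSException:The specified key does not exist.",  # hypothetical sample
    "java.lang.NullPointerException",                                 # hypothetical sample
]
for text in samples:
    print(pick_target(text), "<-", text)
```

A message that matches neither condition yields None in this sketch. Where such logs end up in a real task depends on how the storage targets of the task are configured.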

Step 2: Create a data transformation task

  1. Go to the data transformation page.
    1. In the Projects section, click the name of the project that you want to view.
    2. Choose Log Storage > Logstores. On the Logstores tab, click the Logstore that you want to view.
    3. On the Search & Analysis page, click Data Transformation.
  2. In the upper-right corner of the page, specify a time range for the required log data.
    Make sure that log data exists on the Raw Logs tab.
  3. In the edit box, enter the following data transformation statement:
    e_switch(
        regex_match(v("message"), r"LogException"),
        e_compose(
            e_regex(
                "message",
                r"(?P<data_time>\S+\s\S+)\s(?P<error_code>[a-zA-Z]+)\s(?P<status>[0-9]+)\scom\.aliyun\.openservices\.log\.exception\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9:,\-\s]+)\.(\s+\S+\s\S+){5}\s+\S+\scom\.aliyun\.openservices\.log\.Client\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java\:(?P<error_line>[0-9]+)\)",
            ),
            e_drop_fields("message"),
            e_output("sls-error"),
        ),
        regex_match(v("message"), r"OSSException"),
        e_compose(
            e_regex(
                "message",
                r"(?P<data_time>\S+\s\S+)\scom\.aliyun\.oss\.(?P<product_exception>[a-zA-Z]+)\:(?P<error_message>[a-zA-Z0-9,\s]+)\.\n\[ErrorCode\]\:\s(?P<error_code>[a-zA-Z]+)\n\[RequestId\]\:\s(?P<request_id>[a-zA-Z0-9]+)\n\[HostId\]\:\s(?P<host_id>[a-zA-Z-.]+)\n\S+\n\S+(\s\S+){3}\n\s+\S+\s+(.+)(\s+\S+){24}\scom\.aliyun\.oss\.OSSClient\.(?P<method>[a-zA-Z]+)\S+\s+\S+\stransformEvent\.main\(transformEvent\.java:(?P<error_line>[0-9]+)\)",
            ),
            e_drop_fields("message"),
            e_output("oss-error"),
        ),
    )
  4. Click Preview Data.
  5. Create a data transformation task.
    1. Click Save as Transformation Rule.
    2. In the Create Data Transformation Rule panel, set the parameters and click OK. The following table describes the parameters.
      Parameter | Description
      Rule Name | The name of the transformation rule, for example, test.
      Authorization Method | Select Default Role to read data from the source Logstore.
      Storage Target
        Target Name | The name of the storage target, for example, sls-error or oss-error.
        Target Region | The region where the destination project resides, for example, China (Hangzhou).
        Target Project | The name of the destination project to which transformed data is saved.
        Target Logstore | The name of the destination Logstore to which transformed data is saved, for example, sls-error or oss-error.
        Authorization Method | Select Default Role to write transformation results to the destination Logstore.
      Processing Range
        Time Range | Select All.

    After you create a data transformation task, a dashboard is automatically created for the task. You can view the metrics of the task on the dashboard.

    On the Exception detail chart, you can view the logs that failed to be parsed, and then modify the regular expression.
    • If a log fails to be parsed, you can specify the severity of the log as WARNING to report the log. The data transformation task continues running.
    • If you specify the severity of the log as ERROR to report the log, the data transformation task stops running. In this case, you must identify the cause of the error and modify the regular expression until the data transformation task can parse all required types of error logs.
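Before a modified regular expression is saved to the task, it can help to run it locally against a few collected messages and list the ones that it does not match. The following is a minimal sketch of such a check. The HEADER pattern is deliberately small, and the split_by_parse_result helper and the second and third sample messages are hypothetical.

```python
import re

# Hypothetical, deliberately small pattern: timestamp, error code, status code.
HEADER = re.compile(
    r"(?P<data_time>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<error_code>[a-zA-Z]+)\s(?P<status>[0-9]+)"
)

def split_by_parse_result(messages, pattern):
    """Return (parsed field dicts, messages that the pattern did not match)."""
    parsed, failed = [], []
    for message in messages:
        match = pattern.search(message)
        if match:
            parsed.append(match.groupdict())
        else:
            failed.append(message)
    return parsed, failed

messages = [
    "2021-05-15 16:43:35 ParameterInvalid 400",
    "2021-05-15 16:44:02 Unauthorized 401",      # hypothetical sample
    "16:45:10 malformed header without a date",  # hypothetical sample
]
parsed, failed = split_by_parse_result(messages, HEADER)
print(len(parsed), "parsed;", len(failed), "failed")
for message in failed:
    print("needs a pattern fix:", message)
```

The messages in the failed list play the same role as the entries on the Exception detail chart: they show which log shapes the regular expression still has to cover.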

Step 3: Analyze error logs

After raw error logs are transformed, you can analyze the error logs. In this example, only the Java error logs of Log Service are analyzed.

  1. In the Projects section, click the name of the project that you want to view.
  2. Choose Log Storage > Logstores. On the Logstores tab, click the Logstore that you want to view.
  3. Enter a query statement in the search box.
    • To calculate the number of errors for each request method, execute the following query statement:
      * | SELECT COUNT(method) as m_ct, method GROUP BY method
    • To calculate the number of occurrences of each error message for the PutLogs API operation, execute the following query statement:
      * | SELECT error_message,COUNT(error_message) as ct_msg, method WHERE method LIKE 'PutLogs' GROUP BY error_message,method
    • To calculate the number of occurrences for each error code, execute the following query statement:
      * | SELECT error_code,COUNT(error_code) as count_code GROUP BY error_code
    • To query the error information of each request method by log time, execute the following query statement:
      * | SELECT date_format(data_time, '%Y-%m-%d %H:%i:%s') as date_time,status,product_exception,error_line, error_message,method ORDER BY date_time desc
  4. On the Search & Analysis page, click 15 Minutes(Relative) to specify a time range.
    You can select a relative time or a time frame. You can also specify a custom time range.
    Note: The query results may contain logs that are generated 1 minute earlier or later than the specified time range.
  5. Click Search & Analyze to view the query and analysis result.
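To make the grouping in the first two query statements concrete, the following Python sketch computes the same kind of counts over a handful of in-memory records. The records are hypothetical and only mimic the fields that are extracted in Step 1. The sketch illustrates what the queries count, not how Log Service executes them.

```python
from collections import Counter

# Hypothetical parsed records, shaped like the fields extracted in Step 1.
records = [
    {"method": "PullLogs", "error_code": "ParameterInvalid",
     "error_message": "The body is not valid json string"},
    {"method": "PutLogs", "error_code": "ParameterInvalid",
     "error_message": "The body is not valid json string"},
    {"method": "PutLogs", "error_code": "Unauthorized",
     "error_message": "Signature does not match"},
]

# Same idea as: SELECT COUNT(method) as m_ct, method GROUP BY method
errors_per_method = Counter(r["method"] for r in records)

# Same idea as the PutLogs query: count each error message for PutLogs only.
putlogs_messages = Counter(
    r["error_message"] for r in records if r["method"] == "PutLogs"
)

print(errors_per_method)
print(putlogs_messages)
```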