Until now, we have used the model to get a detailed picture of the status of our service. Status reporting focuses on that specific aspect: it looks at every entity discovered through the rules defined in the model as we configured it, calculates its full status, and correlates all that information into a summarized view of the service's overall health. Incidents in Service Operations, by contrast, focus on determining the causes of the detected issues and their respective impact, if any, on the service.
In the model we created and have been using throughout these sections of the documentation, there are two places to look at when dealing with incidents. First, any detected incident is listed in the upper right corner of the service overview module, just as the status of the map is presented on the left-hand side of the screen. Second, a dedicated incidents viewer module is accessible from the main menu of Service Operations.
Hover over the IMPACT: UNKNOWN element and expand it by clicking on the arrow button. Service Operations will display the full list of incidents identified within the specified time range.
Incidents in Service Operations are associated with an impact level based on how the affected entities potentially affect a number of elements (e.g., end-users). For that impact level to be properly calculated, impact queries have to be defined in the model so that Service Operations can determine how many elements, and which ones, are affected by any reported issue. To illustrate this, we will define impact queries for the current model with the following steps:
Head to the administration section of Service Operations and load the configuration of the e-commerce model.
On the model map, click on the application module and, in the entity details form, click on the impact header to show that subsection.
Set the following configuration:
Issue symptom: The application module reports a high number of 50x errors.
Next best action 1: Check servers status.
Next best action 2: Verify load balancing policies.
Impact evaluation query:
from demo.ecommerce.data
where isnotnull(clientIpAddress)
select str(clientIpAddress) as clientIp
select decode(true,
  uri->"addtocart","addtocart",
  uri->"purchase","purchase",
  uri->"product.screen","product_details",
  uri->"category.screen","category_details",
  uri->"view","checkout",
  "browse") as applicationModule
select ifthenelse(statusCode>=500,1.0,0.0) as applicationError
group every 1h by applicationModule,clientIp,userAgent
select sum(applicationError) as isError
where isError>0
group every 1h by applicationModule,userAgent,clientIp
select last(isError) as detectedErrors
Apply and then publish to save the configuration. Then, run the model to see how this configuration takes effect.
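To make the logic of the impact evaluation query easier to follow, here is a minimal Python sketch of what it appears to compute. This is not how Service Operations evaluates the query. It is an illustration under several assumptions: that `->` means "contains", that events arrive as dictionaries with the field names used in the query, and that each event carries a timestamp in a hypothetical `eventdate` field. The names `MODULE_RULES`, `application_module`, and `detected_errors` are invented for this example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical (substring, module) pairs mirroring the decode() branches, in order.
MODULE_RULES = [
    ("addtocart", "addtocart"),
    ("purchase", "purchase"),
    ("product.screen", "product_details"),
    ("category.screen", "category_details"),
    ("view", "checkout"),
]


def application_module(uri: str) -> str:
    """Return the first module whose substring appears in the URI, else 'browse'."""
    for needle, module in MODULE_RULES:
        if needle in uri:
            return module
    return "browse"


def detected_errors(events):
    """Sum 5xx responses per (hour, module, client IP, user agent).

    Only groups with at least one error are returned, as in `where isError>0`.
    The query's second grouping uses the same keys, so it is folded in here.
    """
    totals = defaultdict(float)
    for event in events:
        if event.get("clientIpAddress") is None:
            continue  # mirrors `where isnotnull(clientIpAddress)`
        # Truncate the (assumed) timestamp to the hour, like `group every 1h`.
        hour = event["eventdate"].replace(minute=0, second=0, microsecond=0)
        key = (
            hour,
            application_module(event["uri"]),
            str(event["clientIpAddress"]),
            event["userAgent"],
        )
        totals[key] += 1.0 if event["statusCode"] >= 500 else 0.0
    return {key: total for key, total in totals.items() if total > 0}


if __name__ == "__main__":
    sample = [
        {"eventdate": datetime(2024, 1, 1, 10, 15), "clientIpAddress": "10.0.0.1",
         "uri": "/cart/addtocart", "statusCode": 503, "userAgent": "ua"},
        {"eventdate": datetime(2024, 1, 1, 10, 20), "clientIpAddress": "10.0.0.2",
         "uri": "/home", "statusCode": 200, "userAgent": "ua"},
    ]
    for key, total in detected_errors(sample).items():
        print(key, total)
```

Each key that survives the filter identifies one client (IP address and user agent) that hit errors in a given application module during a given hour, which is the kind of result an impact level can be derived from.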