Defect Triage Meetings

Defect Management

Triage Meetings - Purpose and Process

If you've been following along with these blogs from the beginning, you have a good high-level understanding of what Defect Management is (from Blog #1), how to set up your defect repository (from Blog #2), and what is needed to train your eager team (from Blog #3). By now, everyone is testing and creating defects, but the defects are just being dumped into the defect repository tool and no one is working on them in a timely manner. What's the solution here? Triage Meetings!

But just what is a triage meeting, how do you run it, what gets accomplished during it, and what should you expect from it?

The Triage Meeting

    What is a Triage Meeting?

    A triage meeting, like triage of patients in a hospital, looks at defects, analyzes them, and routes them to the appropriate department to be taken care of. Some questions you might have about it:

    • What's Addressed, What's the Goal?
    • Who Organizes, Who Attends?
    • How Often, How Long?
    • What are the Take-Aways?

    1. What's Addressed, What's the Goal?

    Just what are we supposed to talk about during the triage meeting? What are we driving for?

    First of all, there are no SET rules for a triage other than those set by the Defect Manager, or by whoever is running the triage, and these can even change on a daily basis. Below are some of the more common agendas:

    • Review any NEW defects that have been logged since the last triage meeting, from highest to lowest severity, getting as far down the list as the duration of the meeting allows. If you find that defects are coming in too quickly to handle in a 60-minute triage, extend it to 90 minutes, or have 2 triage meetings per day (the same goes for ANY of the agendas you choose to follow)
    • Review all open Critical and High severity defects
    • Review defects from oldest to newest, working to address those that have been hanging around for a while (they may or may not still be valid defects, that is what the discussion is about), to get them fixed or deferred or closed if possible
    • Identify specific applications or application functionalities that are having more issues than others, and focus on getting through them
    • Focus on defects identified specifically from the testing of the last code build delivered to testers
    • Focus on Production issues found after a given deployment
    • Have an open forum and ask everyone (a day or two before a focused triage) if there are any specific defects they are having issues with, having trouble moving forward, lost in a dark hole, needing help from others in the triage team, etc.
    • Review defects that have been identified as incorrectly logged, with fields incorrectly filled, etc. These lists may be built by regularly querying the defects by field, sorting, filtering, etc., looking for anomalies: testing environments that don't agree with their cycles or releases, owners (department heads in charge of functional areas) that don't agree with the application area the defect is assigned to, etc.
    • … you can see that the defect manager can have a pretty free range of areas to cover, and even switch it up regularly if one agenda is no longer a reasonable focus. For instance, if defects are no longer pouring in, and the defect manager realizes that running the triage like the first bullet point only takes 5 minutes, and if following the agenda for the second bullet point they find they are running over the same ones over and over again, maybe throw something new in the agenda and follow the third bullet point, oldest to newest.
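
    The first three agendas above can each be expressed as a simple filter and sort over the defect list. The following is only a minimal sketch: the `Defect` fields, the status names ("New", "Open", "Closed"), the severity ranking and the chunk size are all assumptions for illustration, not something a particular defect tool provides.

```python
from dataclasses import dataclass
from datetime import date

# Assumed severity ranking and status names; substitute your tool's values.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass
class Defect:
    id: int
    severity: str
    status: str   # hypothetical values: "New", "Open", "Closed"
    opened: date


def new_since_last_triage(defects, last_triage):
    """Agenda 1: NEW defects logged since the last triage, highest severity first."""
    fresh = [d for d in defects if d.status == "New" and d.opened >= last_triage]
    return sorted(fresh, key=lambda d: (SEVERITY_RANK[d.severity], d.opened))


def open_critical_and_high(defects):
    """Agenda 2: every open Critical or High defect, Critical first."""
    urgent = [d for d in defects
              if d.status != "Closed" and d.severity in ("Critical", "High")]
    return sorted(urgent, key=lambda d: (SEVERITY_RANK[d.severity], d.opened))


def oldest_first(defects, chunk=20):
    """Agenda 3: open defects, oldest to newest, one manageable chunk per meeting."""
    still_open = [d for d in defects if d.status != "Closed"]
    return sorted(still_open, key=lambda d: d.opened)[:chunk]


# Made-up sample data for illustration only.
defects = [
    Defect(101, "Low", "New", date(2024, 3, 4)),
    Defect(102, "Critical", "New", date(2024, 3, 5)),
    Defect(103, "High", "Open", date(2024, 2, 1)),
    Defect(104, "Medium", "Closed", date(2024, 1, 15)),
    Defect(105, "High", "New", date(2024, 3, 5)),
]
last_triage = date(2024, 3, 4)

print([d.id for d in new_since_last_triage(defects, last_triage)])  # [102, 105, 101]
print([d.id for d in open_critical_and_high(defects)])              # [102, 103, 105]
print([d.id for d in oldest_first(defects, chunk=2)])               # [103, 101]
```

    Switching agendas then amounts to choosing a different function for that day's list, which matches the free range the defect manager has in deciding what to cover.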

    The goal of ALL of these agendas is to find errors on the defects and make corrections, to make sure they are assigned out to the appropriate resource (sometimes a developer may forget to assign to the tester when they are done, or vice versa), and to make sure that comments from the resource working on the defect at the time are up to date. The comment section is the team's guide to whether a defect is being worked on and whether it is moving toward resolution. If comments are not being filled out (the last one is several days old), this is a red flag that the defect may have dropped through the cracks and is not currently being addressed.
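
    The stale-comment red flag is easy to check for before each triage. The sketch below assumes you can export, for each defect ID, the date of its most recent comment; the three-day threshold is an arbitrary stand-in for "several days" and should be tuned to your project.

```python
from datetime import date, timedelta


def stale_defects(last_comment_dates, today, max_age_days=3):
    """Return IDs of defects whose latest comment is older than max_age_days.

    last_comment_dates maps defect ID -> date of the latest comment,
    or None if the defect has no comments at all (assumed export format).
    """
    cutoff = today - timedelta(days=max_age_days)
    return sorted(
        defect_id
        for defect_id, last in last_comment_dates.items()
        if last is None or last < cutoff
    )


# Made-up sample data for illustration only.
comments = {
    201: date(2024, 3, 7),
    202: date(2024, 3, 1),
    203: None,
    204: date(2024, 3, 5),
}
print(stale_defects(comments, today=date(2024, 3, 8)))  # [202, 203]
```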

    This is accomplished in every triage meeting through discussion with the team, usually coming quickly to a consensus. Longer discussions may ensue, and this is a GOOD thing if a solution is in sight and the discussion does not begin to take over the triage. If the defect manager sees that happening, it is up to them to suggest that this particular discussion be taken offline, and to get a resolution estimate entered into the comments ("Team A and B will discuss this afternoon and update defect comments for tomorrow's triage.").

    2. Who Organizes, Who Attends?

    Usually the person in the role of the Defect Manager is the one who at least organizes the meetings, if not actually runs them. They will determine the scope and frequency of the meetings, lay down the meeting structure (which defects will be discussed: most critical, newest, oldest, by department, by function, etc.), decide who should attend, and decide whether discussion on resolving a defect should continue or be taken offline. The defect manager will be charged with setting up, cancelling or moving the triage meetings, and sending out information (metrics on what was discussed, decisions made, next steps, or whatever else that group needs in order to act) to the attendees and/or Sr. Management.

    Okay, now just who SHOULD attend? Not a simple answer, as it will depend on availability, size of the teams, and especially the focus of the meetings. Groups you might consider are:

    • Sr. Management, if they are interested - they may have an interest in the beginning, then once they know Defect Triages are being run well, they may bow out
    • Department Heads (Directors, perhaps, or heads of the functional areas of the organization) - more likely to be involved longer term, as the defects are affecting their specific departments
    • Development Leads - probably the most important, as the individuals in this role SHOULD know everything that is going on in the development process, and understand where the defects are occurring in the application, who knows the code the best, who it should be assigned to, what progress the developers have made on each defect, etc.
    • Test Leads - also very important, as these individuals should know when, where and how these defects occurred, be able to answer general questions, assign back to testers to retest, etc.
    • Developers and Testers - except in specific cases (exploring a specific set of defects belonging to one or more developers or testers), the developers and testers should continue developing and testing rather than spending time in meetings. HOWEVER, when addressing a specific set of defects, it can be invaluable to have the tester who discovered the defect and the developer working on it in the same meeting together, able to talk through issues if needed and, with the help of others in the meeting, come to a resolution more quickly.

    If the focus is to quickly review and assign new defects out to the appropriate team/developer, the attendees can be team leads that know the systems and know their developers, have read and understood each of the defects in their area before the triage, and can assign them to the appropriate person. If the focus is to review old defects that are not being worked on as quickly as expected, the attendees should include the developers or testers that the defects belong to, or the team leads that know exactly what is happening (or NOT happening) with the defects, and can either defend why progress is not being made on them, or state definitively that they will be addressed right away.

    3. How Often, How Long?

    As mentioned in the last point, this too depends on the size of the project and the number and velocity of the defects. How long is probably easier to answer than how often. The duration of a triage meeting should probably be no longer than an hour, or 90 minutes if there are lots of defects that need to be addressed. Even if the number of open defects is fairly high, the Critical severity, and usually even the High severity, defects can be addressed in a 60-minute triage. If you find that the agenda you have chosen to address is taking longer, each day, than a 90-minute max, then you should probably think about adjusting the agenda to not be so all-encompassing, or if it is indeed just a matter of a LOT of defects to review, then set up two separate meetings during the day until you can get them under control. For instance, you might have a morning triage to review everything from the prior evening and morning, then another one mid-afternoon to review everything that has been logged since the morning meeting. OR, if you are, say, reviewing all defects, oldest to newest, you might just take a reasonable chunk each meeting until you feel you have addressed the oldest, and until the team understands that it is their duty to take care of these on their own.

    Keep in mind that these triage meetings NEED to be seen by both the rank and file and the senior management as beneficial and working toward a goal rather than being a burden to everyone involved. They should be walking away from those triage meetings not feeling that it was a waste of time, but rather that the whole team is getting a better and better handle on the wayward defects.

    4. What are the Take-Aways?

    Wonderful! Okay, we've just finished up with the last triage meeting. Now what? What should the resources in each of the main groups be doing for the rest of the day (to address the defects, not considering all the rest of their daily tasks)?

    • Sr. Management - The Senior Management, if they are still involved, should mostly be interested in progress metrics so that they can track, at the highest level, the progress that is being made against the existing defects, the rate of defects coming in from development, and the rate of testing being performed. Some examples of metrics that can be charted:
      • a. How many defects are coming in on a daily basis? How many defects are being closed on a daily basis? How many defects are open currently on a daily basis?

      • b. How OLD, on average, and by Severity, are the open defects?

      • c. Status of defects on a daily basis

      • d. Simple trend of open Critical and High defects by day

      • e. Defects by Status by Severity

      • f. SO many more than this are possible, but this might give you an idea of what can be presented to the Senior Management to start the ideas coming
    • Department Heads - The department heads should be at each of the triage meetings. Between the meetings, the department heads need to make sure they are clearing the path for the rest of the team to resolve issues, get defects fixed, and get them ready for testing. Are environments down? They need to drive to spin them back up. Is there dirty data or other data issues? They need to make sure data is fixed and flows correctly. Is there a huge number of defects coming from the developers, or from a specific development team? They need to address WHY (poor development processes, no unit testing, etc.). This level of resource clears the path for others to get the job done.
    • Development Leads - The development leads track the progress of each defect as it goes through the defect workflow from being assigned to a developer to be fixed, to the reassignment of the defect, after the fix, to the tester to retest. They need to know the status of each of the defects being worked on by their developers, so as to avoid the need for a bevy of developers to join the triage meetings.
    • Test Leads - Test Leads take the same position as the development leads; they need to know the status of all of the defects that have been fixed, but still need to be, or are in the middle of being, tested. They should be able to answer what is happening to any defect in a "Ready to Test" or "Retest Failed" status (or whatever the equivalent is for your system) - WHY is it not tested and closed yet? Has the tester simply not gotten to it yet, and if so, when will they be able to address it? If the retest of the fixed defect failed, why and how did it fail? Has the tester reached out to the developer to explain the issues, to go over the retest and explain what happened?
    • Developers and Testers - Developers and Testers should usually only join triage meetings if the team, in a previous triage meeting, had determined that one or more defects need direct attention of a developer and/or a tester in order to help resolve it quickly.
    • Defect Manager - The Defect Manager is a busy one between triage meetings. They need to be doing, among other things:
      • a. Reviewing newly detected and logged defects for inconsistencies, making corrections where obvious, taking notes to bring up during the next triage
      • b. Scouring the data for anomalies to bring up during future triages
      • c. Scouring the data for trends to report up to Sr. Management
      • d. Creating daily metrics on defects to report up to Sr. Management and the rest of the teams
      • e. Admin work on updating the defect management tool as needed/requested
      • f. Admin work on adding, removing and unlocking users in the system
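
    Two of the Sr. Management metrics listed above, the daily incoming/closed/open counts (a) and the average age of open defects by severity (b), can be computed from nothing more than each defect's severity, open date and close date. This is a rough sketch with made-up data; the tuple layout is an assumption about how you might export defects from your tool.

```python
from collections import defaultdict
from datetime import date, timedelta

# Assumed export format: (severity, opened, closed or None). Sample data is made up.
defects = [
    ("Critical", date(2024, 3, 1), date(2024, 3, 2)),
    ("High", date(2024, 3, 1), None),
    ("Medium", date(2024, 3, 2), date(2024, 3, 3)),
    ("High", date(2024, 3, 3), None),
]


def daily_counts(defects, start, end):
    """Map each day in [start, end] to (opened that day, closed that day, open at end of day)."""
    rows = {}
    day = start
    while day <= end:
        opened = sum(1 for _, o, _ in defects if o == day)
        closed = sum(1 for _, _, c in defects if c == day)
        still_open = sum(
            1 for _, o, c in defects if o <= day and (c is None or c > day)
        )
        rows[day] = (opened, closed, still_open)
        day += timedelta(days=1)
    return rows


def average_open_age_by_severity(defects, today):
    """Average age in days of the defects that are still open, per severity."""
    ages = defaultdict(list)
    for severity, opened, closed in defects:
        if closed is None:
            ages[severity].append((today - opened).days)
    return {sev: sum(days) / len(days) for sev, days in ages.items()}


rows = daily_counts(defects, date(2024, 3, 1), date(2024, 3, 3))
for day, (opened, closed, still_open) in rows.items():
    print(day, opened, closed, still_open)

print(average_open_age_by_severity(defects, today=date(2024, 3, 5)))  # {'High': 3.0}
```

    The resulting rows can be pasted into a spreadsheet or fed to whatever charting tool you already use for the daily report.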

    5. Tracking Timely Refactoring

    SLAs, Proposed Resolution Date, Metrics

    So now that the defects are flowing in, how do you ensure that they are being addressed in a timely fashion?

    Three major ways: 1) creating and socializing an expected due date for defect refactoring, 2) capturing, on each defect, the proposed due date for completion (which may or may not differ from the expected due date in #1), and 3) creating metrics to identify the aging of defects throughout the refactoring process, alerting Sr. Management to issues.

    1. Create SLAs (Service Level Agreements) for defect refactoring, depending on defect severity, so that everyone knows what is expected. Reasonable refactoring times might be 1 day for Critical severity, 2 days for High severity, 5 days for Medium severity and 10 days for Low severity. Note that these times may work well for some internal development teams, or for smaller, less complex applications, but may have to be increased a bit if the applications are more complex, or if your development teams are offsite, 3rd-party companies with their own sets of processes, or if the 3rd-party company is modifying code for which you are only one of many clients using that shared code (meaning that the company can't just change code willy-nilly in a way that would affect the base code for many other clients). Their process would include their own internal research, coding, unit testing and rollout schedule that just may not fit within your requested SLAs.
    2. One of the fields on each defect might be something like "Proposed Resolution Date", where, during the first triage of the defect, the development team estimates when they can deliver the fix back to the defect creator for retest.
    3. Metrics can be pulled for Sr. Management by comparing the open-to-close time of defects against the SLA by severity, the Proposed Resolution Date, or both, and can reveal any major differences between the hoped-for resolution time and the actual resolution time. Management can determine, given these results, if action needs to be taken to address the issue and work to improve turn-around time for the refactoring of the defects.
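
    The three steps above can be sketched as a small check per defect. The SLA values are the example figures from step 1; counting them in calendar days (rather than business days) and the function and field names are assumptions made for illustration.

```python
from datetime import date, timedelta

# Example SLA targets from step 1, assumed here to be calendar days.
SLA_DAYS = {"Critical": 1, "High": 2, "Medium": 5, "Low": 10}


def check_defect(severity, opened, closed, proposed, today):
    """Compare a defect's actual (or running) fix time against both targets.

    closed and proposed may be None: an open defect is measured up to today,
    and a defect without a Proposed Resolution Date skips that comparison.
    """
    sla_due = opened + timedelta(days=SLA_DAYS[severity])
    effective_end = closed or today
    days_over_proposed = None
    if proposed is not None:
        days_over_proposed = max(0, (effective_end - proposed).days)
    return {
        "sla_due": sla_due,
        "days_over_sla": max(0, (effective_end - sla_due).days),
        "days_over_proposed": days_over_proposed,
    }


today = date(2024, 3, 8)

# A High defect closed three days past its SLA and two days past its proposed date.
late = check_defect("High", date(2024, 3, 1), date(2024, 3, 6), date(2024, 3, 4), today)
print(late["days_over_sla"], late["days_over_proposed"])  # 3 2

# A Low defect that is still open but within its SLA, with no proposed date entered.
on_track = check_defect("Low", date(2024, 3, 1), None, None, today)
print(on_track["days_over_sla"], on_track["days_over_proposed"])  # 0 None
```

    Summing or averaging `days_over_sla` by severity gives the kind of aging metric that can alert Sr. Management when turn-around time is slipping.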

    Now all of the documentation and training has been completed, and your testers are starting to test and enter defects. But what happens if they didn't quite get all the classroom instructions correct, or if they just aren't quite filling out the defects the way they were instructed? These could be incorrect entries, such as stating UAT testing when the project is only in SIT, or stating the defect was found in Production when the application has not yet been released. It can ALSO be a matter of clarity of the summary/title field. This field is the one seen by everyone, and is how the defect is referred to, so it must be clear and concise. If the tester has not entered it in a clear and concise manner, it should be corrected, and the tester needs to be educated or reminded on how best to fill it out.
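
    Some of these entry problems can be caught automatically before the triage. The sketch below is only illustrative: the phase ordering, the field names (`found_in`, `summary`) and the 80-character limit are assumptions, and a length check is a crude proxy for "clear and concise" that still needs a human eye.

```python
# Assumed ordering of test phases; adjust to your own lifecycle.
PHASE_ORDER = ["Unit", "SIT", "UAT", "Production"]


def find_entry_problems(defect, current_phase, max_summary_len=80):
    """Return a list of human-readable problems found on one defect record."""
    problems = []

    # A defect cannot have been found in a phase the project has not reached yet.
    if PHASE_ORDER.index(defect["found_in"]) > PHASE_ORDER.index(current_phase):
        problems.append(
            f"found in {defect['found_in']} but project is only in {current_phase}"
        )

    summary = defect["summary"].strip()
    if not summary:
        problems.append("summary is empty")
    elif len(summary) > max_summary_len:
        problems.append(f"summary is longer than {max_summary_len} characters")

    return problems


# Made-up sample record for illustration only.
sample = {"found_in": "UAT", "summary": "Login button unresponsive on second click"}
print(find_entry_problems(sample, current_phase="SIT"))
```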

    So, there it is - Triage meetings and tracking timely refactoring in a nutshell. First, organize the triage: know what you want to discuss, make sure everyone knows their roles and their take-aways, and ensure defects are continually being addressed and moved through the system so that they don't age longer than they should. Second, verify that they are being addressed in a timely fashion, identifying trends and issues so they can be dealt with if needed.

    That's it for this blog. The next one will also combine two steps, and look at metrics creation to track and report progress and identify trends and issues.