Monday 2 November 2009

Brainstorming techniques for Enhancing Findbugs - Triage Mode

In the last entry for brainstorming techniques I discussed the possibility of enhancing Findbugs by implementing more sophisticated filtering abilities through the GUI. However, it seems the manual I used as evidence that such a feature did not exist was possibly out of date. The latest available version of Findbugs (1.3.9-rc1) has the kind of filtering I discussed already available, in the Swing GUI, not the Eclipse plugin. So that isn't really an option.

However, a paper written by Pugh and others[1] discussed the use of Findbugs on large scale projects, including at Google. One interesting point mentioned here is the way Findbugs was used - two developers were assigned to run Findbugs after certain builds, and categorise the bugs reported. The result of this "bug triage" was that only priority bugs reported would be raised with the relevant developer.

The triage model of workflow is an interesting one. One of the important evaluation criteria for a system which enhances Findbugs is that it has a very low cost of use. Machine learning techniques that use supervised learning may be off-putting for developers if they don't fit in with their workflow. However, if a developer is assigned to perform triage, their workflow is already tied to Findbugs. If this is a common approach to deploying Findbugs within a development team, a possible way to enhance the Findbugs system would be to provide an interface which is designed and streamlined for bug triage.

Currently the Findbugs GUI lets the user configure how reported errors are grouped. They can be listed by Category, Rank, Pattern and Kind (shown in the image below). The developer then has to manually look through the reported bugs to investigate each one. This results in a couple of mouse clicks per bug report. There are keyboard shortcuts available, but like all keyboard shortcuts, they have a barrier to use. Although the UI issue may seem trivial, if Findbugs reports hundreds of errors the inefficiency will begin to accumulate.


A triage mode could be created to address this. The basic premise is that a UI is made which is specialised for processing a large number of bug reports in one session. The system would roll through each bug report, show all necessary information in one screen, and ask the user for an action on the bug. The action could involve reclassifying the bug (is it 'mostly harmless' or 'must fix'?), or crucially for this project, creating a filter rule based on the current bug. The use of filters is probably the best available strategy for reducing false positives.
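To make the idea concrete, here is a minimal sketch of what one triage session might look like in code. Everything in it is hypothetical: the `TriageSession`, `Report` and `Action` names are mine and are not part of Findbugs, and the "reviewer" is just a function standing in for the UI prompt. DM_EXIT is the Findbugs pattern for a method that calls System.exit().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class TriageSession {
    // Hypothetical actions a reviewer can take on one report.
    public enum Action { MUST_FIX, MOSTLY_HARMLESS, FILTER_PATTERN }

    // Hypothetical, simplified bug report; real Findbugs reports carry far more detail.
    public static final class Report {
        public final String pattern;   // e.g. "DM_EXIT"
        public final String className;

        public Report(String pattern, String className) {
            this.pattern = pattern;
            this.className = className;
        }
    }

    public final List<Report> mustFix = new ArrayList<>();
    public final List<String> filteredPatterns = new ArrayList<>();

    // Roll through the reports one at a time, asking the reviewer for an action.
    public void run(List<Report> reports, Function<Report, Action> reviewer) {
        for (Report r : reports) {
            if (filteredPatterns.contains(r.pattern)) {
                continue; // suppressed by a filter created earlier in this session
            }
            switch (reviewer.apply(r)) {
                case MUST_FIX:
                    mustFix.add(r);
                    break;
                case FILTER_PATTERN:
                    filteredPatterns.add(r.pattern);
                    break;
                case MOSTLY_HARMLESS:
                    break; // nothing further to do in this sketch
            }
        }
    }
}
```

The point of the sketch is that a filter created part-way through a session immediately reduces the number of reports the reviewer is asked about, which is where the time saving would come from.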

Use of filters is probably the best strategy because what makes a specific bug report a false positive is not black and white. Reducing the number of false positives relies on having a consistent definition of what a false positive is. I'm moving towards the idea that a false positive is any bug reported that the user doesn't care about. Clearly this depends more on the user than Findbugs itself. For instance, reporting a call to System.exit() will make sense for a class library, but for a standalone GUI application the call may be perfectly acceptable. In one context the bug report is a false positive; in the other it is not. Therefore what constitutes a false positive is highly dependent on the codebase Findbugs is run over. Filters are created on a per-codebase basis, depending on the perspective of the developers. Using a triage mode which can define filters is a natural way for the system to build up an idea of the context of the codebase, based on the user's categorisation of bugs.
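As a rough illustration of "creating a filter rule based on the current bug", the sketch below builds a Findbugs filter file that suppresses one bug pattern in one class, using the `<FindBugsFilter>`, `<Match>`, `<Class>` and `<Bug>` elements of the filter file format. The `FilterRuleSketch` class and the `com.example.gui.MainWindow` class name are made up for the example, and I am assuming the inputs are plain identifiers so no XML escaping is done.

```java
public class FilterRuleSketch {
    // Builds a filter file that suppresses one bug pattern in one class.
    // Assumption: inputs are plain identifiers, so no XML escaping is done.
    public static String filterFor(String bugPattern, String className) {
        return "<FindBugsFilter>\n"
             + "  <Match>\n"
             + "    <Class name=\"" + className + "\"/>\n"
             + "    <Bug pattern=\"" + bugPattern + "\"/>\n"
             + "  </Match>\n"
             + "</FindBugsFilter>\n";
    }

    public static void main(String[] args) {
        // A standalone GUI application that is happy to call System.exit().
        System.out.print(filterFor("DM_EXIT", "com.example.gui.MainWindow"));
    }
}
```

A class library would simply never generate this rule, so the same DM_EXIT report would stay visible there. A real triage mode would probably offer a choice of scope (class, package or whole codebase) rather than always matching a single class.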

I would be keen for a system to be able to build this context solely from the filter definition file, which would have several advantages. However, it could also be possible to build the context from supervised learning, conducted transparently in each triage session. A triage mode will most likely be one of the prototypes developed and evaluated.



[1] Evaluating Static Analysis Defect Warnings On Production Software - Nathaniel Ayewah, William Pugh, J. David Morgenthaler, John Penix, YuQian Zhou
