Designing a Study

Once you have decided which data to use, you have to choose a methodology, that is, a plan for how to conduct your investigation. This book collects a number of successful approaches and offers plenty of ideas for investigations to pursue.

There is a caveat, though. Although many mining and research projects follow the same basic principles, individual software projects differ in many small ways that can lead your mining efforts into a dead end. Two such factors are the nature of the project and the underlying development process. A good example is the difference between open source software (e.g., Eclipse) and industrial software (e.g., Microsoft or SAP projects). Differences in the physical and organizational environments that surround software projects engender fundamental differences in development processes. For example, open source projects tend to draw developers from around the world, none of whom share an office, so pair programming and group code reviews become difficult if not impossible. Even a quick face-to-face chat about a problem requires advanced technology and must respect different time zones.

Differences in development environments and processes have a fundamental impact on the project history and thus must be considered when mining that history. Recent research projects and replication studies that mined data sets from both distributed open source projects and industrial projects showed that some mining activities succeed on one type of project and others on the other type [Bird et al. 2009b], [Zimmermann et al. 2009]. Many process-related metrics, such as socio-technical network metrics [Bird et al. 2009b] or metrics related to organizational structure [Nagappan et al. 2008], depend heavily on the software development process in place, and yield very different results on open source software than on industrial software.

Thus we recommend that, before designing your own study, you first replicate a study that has already shown valid results. If your results are as good as or better than the original ones, all is fine. But if your results differ, you need to investigate what makes them different, because that difference will also affect the design of your own studies.
