
Cross-project Reopened Pull Request Prediction in GitHub

EasyChair Preprint 2992

6 pages. Date: March 19, 2020

Abstract

In GitHub, pull requests may be reopened for further modification and code review. Within-project prediction of reopened pull requests works well when there is enough training data to build the model. However, new projects have only a limited number of pull requests, so training data from other projects can help predict their reopened pull requests. It is therefore important to study cross-project reopened pull request prediction to help integrators in new projects. In this paper, we propose a cross-project approach that builds a decision tree model on an external source project and uses it to predict the reopened pull requests in another project. We evaluate the effectiveness of cross-project prediction on 7 open source projects containing 100,622 pull requests. Experimental results show that cross-project prediction achieves accuracy from 78.76% to 96.52% and F1-measure from 53.34% to 90.58% across the 7 projects. We examine feature importance using the decision tree predictor and find that the number of commits is the most important feature in the majority of projects.
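
The cross-project setup described in the abstract can be illustrated with a minimal sketch: fit a model on pull requests from a source project, then score it on a different target project using accuracy and F1-measure. This is not the authors' implementation. It assumes a one-level decision tree (a stump) as a stand-in for a full decision tree, two hypothetical features (number of commits and number of comments), and invented toy data that does not come from the paper.

```python
from typing import List, Tuple

# One pull request: (features, label). Label 1 = reopened, 0 = not reopened.
# Features here are hypothetical: [number_of_commits, number_of_comments].
Sample = Tuple[List[int], int]

def train_stump(samples: List[Sample]) -> Tuple[int, int]:
    """Fit a one-level decision tree (a simplification of a full tree).

    Tries every (feature, threshold) pair and keeps the one with the
    fewest training errors. The rule predicts 1 when value > threshold.
    """
    best = None  # (errors, feature_index, threshold)
    n_features = len(samples[0][0])
    for f in range(n_features):
        for t in sorted({x[f] for x, _ in samples}):
            errors = sum((1 if x[f] > t else 0) != y for x, y in samples)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best[1], best[2]

def predict(model: Tuple[int, int], x: List[int]) -> int:
    feature, threshold = model
    return 1 if x[feature] > threshold else 0

def accuracy_and_f1(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for a, b in pairs if a == 1 and b == 1)
    fp = sum(1 for a, b in pairs if a == 0 and b == 1)
    fn = sum(1 for a, b in pairs if a == 1 and b == 0)
    acc = sum(1 for a, b in pairs if a == b) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return acc, f1

# Invented toy data, for illustration only (not from the paper).
source_project: List[Sample] = [
    ([1, 2], 0), ([2, 5], 0), ([1, 0], 0), ([6, 3], 1),
    ([8, 1], 1), ([3, 4], 0), ([7, 6], 1),
]
target_project: List[Sample] = [
    ([2, 1], 0), ([5, 2], 1), ([4, 7], 0),
    ([9, 3], 1), ([1, 1], 0), ([3, 2], 1),
]

# Train on the source project only, then evaluate on the target project.
model = train_stump(source_project)
y_true = [y for _, y in target_project]
y_pred = [predict(model, x) for x, _ in target_project]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"split on feature {model[0]} at threshold {model[1]}")
print(f"accuracy={acc:.4f}  F1={f1:.4f}")
```

The key point of the design is that no target-project labels are used during training; they are only used to score the predictions, which mirrors the situation of a new project with too few pull requests to train on.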

Keyphrases: GitHub, Reopened pull request prediction, cross-project

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:2992,
  author    = {Abdillah Mohamed and Li Zhang and Jing Jiang},
  title     = {Cross-project Reopened Pull Request Prediction in GitHub},
  howpublished = {EasyChair Preprint 2992},
  year      = {EasyChair, 2020}}