1 Discussion, Limitations and Conclusion

Having reliable and easy ways to compare algorithm performance across graphs with different features is an important aspect of graph drawing research.

In this work, we presented a comprehensive benchmark dataset collection for graph layout algorithms. Compiling, organizing, and making accessible a wide array of datasets with diverse characteristics not only facilitates rigorous and fair comparisons of algorithmic performance but also addresses critical issues of replicability and reproducibility in research. The Graph Benchmark Datasets website, together with our long-term archival efforts, aims to keep these valuable resources available to the community.

Our collection process is not without limitations. We focused on the Graph Drawing conference as the main venue from which to collect papers, which limits the completeness of our search; there may be many relevant datasets that we did not find. We therefore do not consider this collection comprehensive, but rather a starting point.

There are many interesting follow-up questions that could be tackled starting from the data we collected, and additional information could be gathered by cross-referencing the datasets with the literature. For instance, it would be interesting to study how widely a dataset has spread based on its features and how it has been distributed. The following chart compares the year of publication of a dataset with the years of publication of papers that use it:

Figure 1: Date of publication of a dataset, compared to the dates of publication of papers using the dataset. A gray dot indicates the publication of the dataset's own paper, while a red dot indicates the publication of a paper that uses the dataset. Darker red dots indicate that more papers using the dataset were published in the same year. Note that, in several instances, the paper presenting the dataset was published after the dataset had already been used several times.
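The data behind such a chart can be derived from usage records that pair each dataset with the papers citing it. A minimal sketch of this aggregation is shown below; the record format and all values are hypothetical, chosen only to illustrate the two quantities the chart encodes (per-year usage counts, and usages that predate the dataset's own paper):

```python
from collections import Counter

# Hypothetical usage records: (dataset name, year of the dataset's own
# paper, year of a paper that uses the dataset). Illustrative values only.
usages = [
    ("Rome", 1997, 1995),   # a dataset can be used before its paper appears
    ("Rome", 1997, 2001),
    ("Rome", 1997, 2001),
    ("North", 1995, 2004),
]

# Per dataset and year, count how many using papers appeared that year;
# higher counts would correspond to darker red dots in the chart.
counts = Counter((name, year) for name, _, year in usages)

# Flag usages that predate the dataset's own publication, i.e. red dots
# that fall to the left of the gray dot.
early = [(name, year) for name, pub, year in usages if year < pub]
```

From `counts` and `early`, a scatter plot like Figure 1 is a straightforward exercise with any plotting library.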

Additional questions that would be interesting to explore could be:

  • Has the type of benchmark datasets used in the literature changed over time?
  • The datasets we collected have changed over the years: some were merged, others were modified. How did these datasets evolve over time?
  • How has the inclusion of supplemental material in the literature changed over time?

We leave these questions for new and exciting future work. In the meantime, we hope that the Graph Benchmark Datasets website will be an appreciated resource for the community.