Published on
August 29, 2024

TRACER: Finding Patches for Open Source Software Vulnerabilities

Written By
Ding Sun
How can we effectively detect and address known vulnerabilities in open source software (OSS) to enhance software security and reliability?

We recently interviewed researchers Congying Xu from Fudan University and Prof. Liu Yang from NTU. Our discussion focused on the critical issues surrounding OSS vulnerabilities and how their solution, TRACER, can improve the tracking and mitigation of these vulnerabilities. The research was published in the Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '22). Read on for key takeaways and expert perspectives on enhancing software security and reliability.

Scantist: What motivated you to focus on the quality and characteristics of patches for OSS vulnerabilities in vulnerability databases?

Xu Congying: Our motivation stemmed from the increasing reliance on Open Source Software (OSS) in various applications, which, while beneficial, introduces significant security risks. Existing vulnerability databases often lack comprehensive or accurate patch information, making it challenging to mitigate these vulnerabilities effectively. We aimed to address this gap by conducting an empirical study to understand the quality and characteristics of patches in these databases and proposing an automated approach, Tracer, to improve patch tracking. By enhancing patch quality and coverage, we hope to strengthen the security of software systems relying on OSS.

Scantist: Can you explain the design and objectives of your empirical study, particularly regarding the five research questions (RQ1-RQ5) you aimed to answer?

Xu Congying: Our empirical study aimed to analyze the quality and characteristics of patches for OSS vulnerabilities in two major industrial vulnerability databases: Veracode and Snyk. We focused on five key research questions:

  • RQ1: Coverage Analysis – determining the prevalence of patches in these databases.
  • RQ2: Consistency Analysis – assessing how consistent patches are across the databases.
  • RQ3: Type Analysis – identifying common patch types.
  • RQ4: Cardinality Analysis – exploring the mapping relationships between vulnerabilities and their patches.
  • RQ5: Accuracy Analysis – evaluating the accuracy of patches in these databases.

The study provided insights into the limitations of current databases and informed the development of our Tracer approach.

Scantist: What were the main findings from your coverage and consistency analyses of OSS vulnerabilities in vulnerability databases?

Xu Congying: Our coverage analysis (RQ1) revealed that more than half of the vulnerabilities in Veracode and Snyk lacked patches. Specifically, Veracode had patches for 41.8% of the vulnerabilities, while Snyk had patches for 41.2%. In terms of consistency (RQ2), around 20% of the vulnerabilities had inconsistent patches across the two databases. These inconsistencies included cases where one database reported a patch, but the other did not, and cases where the patches differed. These findings highlighted the need for more comprehensive and accurate patch information in vulnerability databases to ensure effective vulnerability mitigation.
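
To make the two measures concrete, here is a minimal Python sketch of how coverage and consistency could be computed. It assumes each database is available as a mapping from a CVE identifier to a set of patch commit identifiers. The function names, the data layout, and the placeholder records are all hypothetical and are not taken from the study. It also assumes patches are compared by exact identifier, so equivalent patches on different branches would count as different.

```python
def coverage(db, cve_ids):
    """Fraction of the given CVEs for which the database records at least one patch."""
    covered = sum(1 for cve in cve_ids if db.get(cve))
    return covered / len(cve_ids)


def inconsistent(db_a, db_b, cve_ids):
    """CVEs whose recorded patch sets differ between the two databases.

    This covers both cases mentioned in the interview: one database has a
    patch and the other has none, or both have patches but they differ.
    """
    return [
        cve for cve in cve_ids
        if db_a.get(cve, set()) != db_b.get(cve, set())
    ]


# Made-up placeholder records, for illustration only.
db_a = {
    "CVE-0000-0001": {"c1"},
    "CVE-0000-0002": {"c2", "c3"},
    "CVE-0000-0003": set(),
}
db_b = {
    "CVE-0000-0001": {"c1"},
    "CVE-0000-0002": {"c2"},
    "CVE-0000-0004": {"c4"},
}
cves = ["CVE-0000-0001", "CVE-0000-0002", "CVE-0000-0003", "CVE-0000-0004"]

print(coverage(db_a, cves))            # 0.5
print(coverage(db_b, cves))            # 0.75
print(inconsistent(db_a, db_b, cves))  # ['CVE-0000-0002', 'CVE-0000-0004']
```

In this toy data, the third CVE has no patch in either database, so it lowers coverage but is not counted as an inconsistency.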

Scantist: Could you elaborate on the types of patches you identified and the mapping cardinalities between OSS vulnerabilities and their patches?

Xu Congying: Through our type analysis (RQ3), we found that the majority of patches (around 93%) were in the form of GitHub commits, with SVN commits and other Git platform commits being less common. Regarding cardinality (RQ4), we observed that 43.8% of the vulnerabilities had a one-to-one mapping with their patches. Additionally, 15.1% had multiple equivalent patches, and 41.2% had multiple distinct patches. This diversity in patch types and mappings underscores the complexity of accurately tracking patches. For instance, some vulnerabilities required multiple patches across different branches or repositories, complicating the tracking process.

Scantist: What did your accuracy analysis reveal about the patch accuracy of OSS vulnerabilities in the studied databases?

Xu Congying: Our accuracy analysis (RQ5) indicated that while both Veracode and Snyk achieved high precision for single-patch vulnerabilities, their recall was significantly lower for vulnerabilities with multiple patches. For vulnerabilities with one-to-many mappings, the databases often missed some patches, leading to incomplete vulnerability mitigation. Specifically, Veracode and Snyk had F1-scores of 0.793 and 0.771, respectively, which highlighted the need for more sophisticated methods to ensure comprehensive patch tracking. These findings emphasized the importance of having a reliable system like Tracer to enhance patch accuracy and coverage.
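
The pattern of high precision with lower recall on multi-patch vulnerabilities can be illustrated with the standard set-based definitions of precision, recall, and F1. The sketch below scores the patches reported for a single vulnerability against a manually verified set. The function name and the example identifiers are hypothetical, and how the study aggregates scores across vulnerabilities is not specified here, so this shows only the per-vulnerability calculation.

```python
def patch_prf(reported, truth):
    """Precision, recall, and F1 of reported patches against verified patches."""
    true_positives = len(reported & truth)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# A vulnerability fixed by three commits, of which the database lists one.
precision, recall, f1 = patch_prf({"c1"}, {"c1", "c2", "c3"})
print(precision, recall, f1)
```

Here the single reported patch is correct, so precision is 1.0, but only one of three true patches is found, so recall is 1/3 and the F1-score drops to 0.5.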

Scantist: How does Tracer improve upon existing heuristic-based approaches and industrial vulnerability databases in tracking patches?

Xu Congying: Tracer significantly outperforms existing heuristic-based approaches and industrial vulnerability databases in tracking patches for OSS vulnerabilities. Compared to heuristic methods, Tracer finds patches for up to 273.8% more vulnerabilities and achieves up to 116.8% higher F1-scores for patch accuracy. Additionally, Tracer enhances patch recall by up to 18.4% compared to Veracode and Snyk, while maintaining a comparable precision. These improvements are attributed to Tracer's automated reference network construction and patch selection heuristics, which enable more comprehensive and accurate patch tracking. The combination of multiple knowledge sources and a systematic approach to selecting high-confidence patches are key factors in its superior performance.

Scantist: What insights did the ablation analysis provide regarding the contribution of each component of Tracer to its overall effectiveness?

Xu Congying: The ablation analysis provided valuable insights into the significance of each component within Tracer. Removing any single advisory source (NVD, Debian, Red Hat, or GitHub) resulted in a decrease in the number of CVEs for which patches could be found, with NVD and GitHub contributing the most. The reference network construction was crucial for finding hidden patches, as it significantly increased patch coverage compared to direct references alone. The patch selection heuristics, particularly the connectivity-based and confidence-based approaches, were essential in balancing precision and recall. Finally, the patch expansion step ensured that multiple patches across different branches were accurately tracked, further enhancing the overall effectiveness of Tracer.
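
The idea of a reference network with connectivity-based selection can be sketched as follows. This is a simplified illustration, not Tracer's implementation: it assumes the references have already been collected into an in-memory graph, follows them breadth-first from each advisory source up to an assumed depth limit, and ranks each commit by how many sources can reach it. The graph contents, the URL pattern, the depth limit, and the ranking rule are all assumptions made for this example.

```python
import re
from collections import deque

# Assumed pattern for a GitHub commit URL; other patch types are ignored here.
COMMIT_RE = re.compile(r"^https://github\.com/[^/]+/[^/]+/commit/[0-9a-f]{7,40}$")


def reachable_commits(graph, root, max_depth=3):
    """Breadth-first walk over references, collecting commit URLs."""
    seen = {root}
    queue = deque([(root, 0)])
    commits = set()
    while queue:
        node, depth = queue.popleft()
        if COMMIT_RE.match(node):
            commits.add(node)  # commits are treated as leaves
            continue
        if depth == max_depth:
            continue
        for ref in graph.get(node, []):
            if ref not in seen:
                seen.add(ref)
                queue.append((ref, depth + 1))
    return commits


def rank_candidates(graph, roots, max_depth=3):
    """Rank commits by the number of advisory sources that reach them."""
    support = {}
    for root in roots:
        for commit in reachable_commits(graph, root, max_depth):
            support[commit] = support.get(commit, 0) + 1
    return sorted(support.items(), key=lambda item: (-item[1], item[0]))


# Made-up reference graph for a single vulnerability.
COMMIT_A = "https://github.com/example/project/commit/aaaaaaa"
COMMIT_B = "https://github.com/example/project/commit/bbbbbbb"
graph = {
    "nvd-advisory": ["issue-1", COMMIT_A],
    "debian-advisory": ["issue-1"],
    "redhat-advisory": [COMMIT_B],
    "issue-1": [COMMIT_A],
}
ranked = rank_candidates(
    graph, ["nvd-advisory", "debian-advisory", "redhat-advisory"]
)
print(ranked)
```

In this toy graph, the Debian advisory never links to a commit directly, yet the walk still reaches the first commit through the issue page, which is the kind of hidden patch that direct references alone would miss. That commit is supported by two sources and ranks above the one supported by a single source.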

Scantist: How did you evaluate the generality of Tracer, and what were the findings regarding its applicability to OSS vulnerabilities beyond the depth dataset?

Xu Congying: We evaluated the generality of Tracer by running it against two new datasets: one consisting of CVEs for which only one of the two industrial databases reported patches, and another consisting of CVEs for which neither database reported patches. Tracer successfully tracked patches for 67.7% of the CVEs in the first dataset and 51.5% in the second dataset, demonstrating its applicability beyond the initial depth dataset. Manual analysis of a sample of these CVEs showed that Tracer maintained high accuracy, with F1-scores of 0.784 and 0.867, respectively. These results indicate that Tracer is not overfitted to the initial dataset and can effectively generalize to a broader range of OSS vulnerabilities. This broad applicability is essential for ensuring the robustness and reliability of the patch tracking process across diverse OSS projects.

Scantist: Can you discuss the practical usefulness of Tracer as determined by your user study, and how it can be implemented in real-world scenarios?

Xu Congying: The user study conducted with 10 participants highlighted Tracer's practical usefulness. Participants were able to track patches more accurately and quickly with Tracer than without it. Specifically, Tracer reduced the time required to find patches by an average of 5.66 minutes per task and significantly improved patch accuracy. These findings suggest that Tracer can be a valuable tool for security professionals and organizations, helping them to efficiently manage OSS vulnerabilities. Implementing Tracer in real-world scenarios can enhance the reliability of vulnerability databases, reduce manual effort in patch tracking, and ultimately improve the overall security posture of software systems. By automating the patch tracking process, organizations can ensure timely and accurate vulnerability mitigation, which is critical in the fast-paced and evolving landscape of cybersecurity.