Published on September 11, 2024

Towards Understanding Third-party Library Dependency in C/C++ Ecosystem

Written by Ding Sun

How can we effectively detect and address vulnerabilities introduced by third-party library dependencies in C/C++ code bases to enhance software security and reliability?

We recently interviewed researchers Wei Tang and Prof. Yang Liu about their research on third-party library (TPL) dependencies in C/C++. The conversation focused on how their tool, CCScanner, helps identify and mitigate vulnerabilities in large-scale C/C++ projects by detecting TPLs precisely. Their research, presented at ASE 2022, sheds light on methods to enhance the security and reliability of C/C++ code bases. The key points of the conversation follow.

Scantist: Could you start by providing an overview of your research? What motivated you to explore third-party library (TPL) dependencies in the C/C++ ecosystem?

Wei Tang: Our research was motivated by the lack of a comprehensive understanding of TPL dependencies in the C/C++ ecosystem. Ecosystems such as Java and Python have established package managers, but C/C++ has no unified package management system. This leads to fragmented TPL management practices, which increase security risks and complicate dependency tracking. Our goal was to study how TPLs are reused, the scale of their use in large repositories, and the implications for security and software development practices. We also aimed to identify areas for future improvement.

Scantist: Could you explain the common methods developers use to handle dependencies in the C/C++ ecosystem? How are third-party libraries typically reused in these projects?

Wei Tang: Developers in C/C++ tend to handle dependencies through build scripts (such as CMake files or Makefiles), which account for 71% of the cases in which TPLs are introduced. These scripts are preferred over formal package managers such as Conan or Vcpkg. While this method allows flexibility, it complicates the tracking of dependencies. Without explicit version constraints and package management tools, dependency management becomes fragmented, making it difficult to detect vulnerabilities and manage security risks in a structured way.
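
To illustrate why dependencies declared in build scripts are harder to track than entries in a package manifest, here is a minimal sketch that pulls dependency names out of CMake find_package calls. It is not CCScanner's implementation; the function name extract_cmake_dependencies and the regular expressions are assumptions made for this example, and the sketch ignores multi-line calls, version ranges, and other ways of introducing a library (for example add_subdirectory or hand-written linker flags).

```python
import re

# Hypothetical helper patterns for this sketch (not taken from CCScanner).
# Matches: find_package(<Name> [args...]) on a single line.
FIND_PACKAGE = re.compile(
    r"\bfind_package\s*\(\s*([A-Za-z0-9_+.\-]+)([^)]*)\)", re.IGNORECASE
)
# A plain dotted version such as 1.1.1 (version ranges are not handled).
PLAIN_VERSION = re.compile(r"^\d+(\.\d+)*$")


def extract_cmake_dependencies(cmake_text):
    """Return a list of (name, version_or_None) for each find_package call."""
    deps = []
    for line in cmake_text.splitlines():
        code = line.split("#", 1)[0]  # drop CMake line comments
        for match in FIND_PACKAGE.finditer(code):
            name = match.group(1)
            args = match.group(2).split()
            # In find_package, an optional version directly follows the name.
            version = args[0] if args and PLAIN_VERSION.match(args[0]) else None
            deps.append((name, version))
    return deps
```

Calling extract_cmake_dependencies on the text of a CMakeLists.txt returns one tuple per find_package call, with None wherever the script does not state a version. Even this simple case needs custom parsing, which hints at why build-script dependencies are difficult to audit at scale.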

Scantist: You noted that only a small percentage of developers use application-level package managers in C/C++. What do you think contributes to this low adoption rate, and how does it affect dependency management?

Wei Tang: The fragmented nature of the C/C++ ecosystem contributes to the low adoption of application-level package managers. Many projects rely on system-level package managers (such as APT or Homebrew) or manual processes, which developers find convenient. However, this leads to incomplete dependency tracking and increases the risk of using vulnerable or outdated libraries. Package managers like Conan and Vcpkg offer more structured management but are not yet widely embraced, likely due to the learning curve and lack of a unified toolchain in the ecosystem.

Scantist: What were your findings on the scope of third-party library data for C/C++ projects? How comprehensive are the current TPL databases used in this ecosystem?

Wei Tang: We found significant fragmentation in the TPL data for C/C++. There is no single, comprehensive database that covers all C/C++ libraries. System libraries, provided by operating systems, account for 65.5% of dependencies, but many are overlooked in vulnerability detection. Package manager repositories and GitHub-hosted libraries provide additional coverage but are inconsistent. This fragmentation creates challenges for developers trying to maintain secure, up-to-date libraries in their projects, as vulnerabilities may not be uniformly tracked across different sources.

Scantist: How does this fragmentation in third-party library data affect the security and reliability of software development in C/C++?

Wei Tang: Fragmented TPL data leads to inconsistencies in version control and vulnerability reporting. Different versions of libraries may exist across system packages, GitHub, and package manager repositories, making it difficult to detect and mitigate vulnerabilities across projects. This fragmentation, coupled with the reliance on system-level package managers that don’t automatically update libraries, increases the likelihood of using outdated, vulnerable libraries, impacting both security and project reliability.

Scantist: What are the most frequently reused libraries in the C/C++ ecosystem, and why are these libraries so essential to software development?

Wei Tang: System libraries such as libm, libdl, libc, and OpenSSL are among the most frequently reused libraries. They provide foundational functionality across a wide range of systems and applications, making them indispensable for most C/C++ projects. These libraries are typically bundled with operating systems, which simplifies their use but also makes them critical points of failure in terms of security. Their essential role means that vulnerabilities in these libraries can have widespread consequences across the entire ecosystem.

Scantist: System libraries play a significant role in the C/C++ ecosystem. Why are they often overlooked in dependency management, and what impact does this have on security?

Wei Tang: System libraries are often overlooked because they come pre-installed with operating systems and are managed by system-level package managers. As a result, developers may not explicitly track or manage these libraries in their dependency management workflows. This can lead to vulnerabilities going unnoticed, especially if developers assume the libraries are secure due to their integration with the OS. The failure to track system library versions and updates increases the security risks, as vulnerabilities in widely used libraries like libc can propagate across multiple projects.

Scantist: What trends did you observe regarding version constraints and third-party libraries in the C/C++ ecosystem? How are versions typically managed in C/C++ projects?

Wei Tang: We found that only 27% of C/C++ dependencies have explicit version constraints, which is far lower than in other ecosystems. Many developers rely on system-level package managers, which do not let a project pin a specific library version. This creates a significant risk that projects use outdated or vulnerable library versions. Where version constraints are specified, only 9.5% of dependencies use the latest version. This lack of proper version management contributes to the overall security exposure of C/C++ projects.
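
As a small illustration of the metric discussed here, the share of dependencies with an explicit version constraint can be computed from a list of declared dependencies. This is a sketch only: the function name constraint_coverage and the input shape (pairs of name and constraint, with None for an unconstrained dependency) are assumptions for this example, not the study's methodology.

```python
def constraint_coverage(dependencies):
    """Return the fraction of dependencies that declare a version constraint.

    `dependencies` is an iterable of (name, constraint) pairs, where
    `constraint` is a string such as ">=1.1.1" or None when no version
    is specified. Returns 0.0 for an empty input.
    """
    deps = list(dependencies)
    if not deps:
        return 0.0
    constrained = sum(1 for _, constraint in deps if constraint)
    return constrained / len(deps)
```

A project could track this number over time as a rough signal of how much of its dependency set is pinned and therefore auditable.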

Scantist: How do vulnerable library versions impact the C/C++ ecosystem, and what steps can developers take to mitigate the risks associated with these vulnerabilities?

Wei Tang: Vulnerable library versions pose a serious threat in the C/C++ ecosystem, with 49.3% of dependencies using outdated and vulnerable versions. This is especially concerning for critical libraries like OpenSSL, which has had high-profile vulnerabilities such as Heartbleed. Developers can mitigate these risks by adopting tools that enforce version constraints and by updating dependencies regularly. Using application-level package managers like Conan or Vcpkg, combined with continuous monitoring for vulnerabilities, can significantly improve the security and reliability of C/C++ projects.
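
The kind of check that continuous monitoring performs can be sketched with Heartbleed (CVE-2014-0160) as the example: OpenSSL releases 1.0.1 through 1.0.1f were affected, and 1.0.1g contained the fix. The function names below are hypothetical, and the parser only handles the older release scheme of three numbers plus an optional letter; pre-release builds such as betas are rejected rather than classified. A real monitor would query a vulnerability database instead of hard-coding one range.

```python
import re

# Older OpenSSL release scheme: MAJOR.MINOR.PATCH plus an optional letter,
# e.g. "1.0.1f". Names in this sketch are hypothetical.
_OPENSSL_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)([a-z]?)$")


def parse_openssl_version(version):
    """Turn "1.0.1f" into a comparable tuple such as (1, 0, 1, 6)."""
    match = _OPENSSL_VERSION.match(version)
    if not match:
        raise ValueError(f"unsupported OpenSSL version string: {version!r}")
    major, minor, patch, letter = match.groups()
    # No letter ranks below "a", so 1.0.1 < 1.0.1a < 1.0.1b ...
    letter_rank = ord(letter) - ord("a") + 1 if letter else 0
    return (int(major), int(minor), int(patch), letter_rank)


# Heartbleed (CVE-2014-0160): affected from 1.0.1, fixed in 1.0.1g.
HEARTBLEED_FIRST_AFFECTED = parse_openssl_version("1.0.1")
HEARTBLEED_FIXED = parse_openssl_version("1.0.1g")


def is_heartbleed_affected(version):
    """Return True if a released OpenSSL version falls in the affected range."""
    parsed = parse_openssl_version(version)
    return HEARTBLEED_FIRST_AFFECTED <= parsed < HEARTBLEED_FIXED
```

Note that this check is only possible when the project records which version it depends on, which is why missing version constraints make vulnerability detection so much harder.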

Scantist: Based on your research, what do you see as the most important steps to improve dependency management and security in the C/C++ ecosystem?

Wei Tang: The most important steps involve developing a unified package management system that can manage both system-level and application-level dependencies effectively. Integrating version control and security alerts into this system will help prevent vulnerabilities from going unnoticed. Developers should be encouraged to adopt modern tools like Conan or Vcpkg for better dependency tracking. Additionally, improving tools for detecting TPL dependencies, such as CCScanner, and encouraging the use of structured version management practices will help reduce security risks in the ecosystem.