Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories

Abstract

Functionality-specific vulnerabilities, which mainly occur in Application Programming Interfaces (APIs) with specific functionalities, are crucial for software developers to detect and avoid. For detecting individual functionality-specific vulnerabilities, the two existing categories of approaches are ineffective because they consider only API bodies and cannot handle the diverse implementations of functionality-equivalent APIs. To detect functionality-specific vulnerabilities effectively, we propose APISS, the first approach to use API doc strings and signatures instead of API bodies. APISS first retrieves functionality-equivalent APIs for APIs with existing vulnerabilities and then migrates the Proof-of-Concepts (PoCs) of those vulnerabilities to the newly detected vulnerable APIs. To retrieve functionality-equivalent APIs, we leverage a Large Language Model for API embedding, which improves accuracy and addresses the effectiveness and scalability issues of the existing approaches. To migrate PoCs to newly detected vulnerable APIs, we design a semi-automatic schema that substantially reduces manual costs. We conduct a comprehensive evaluation that empirically compares APISS with four state-of-the-art approaches for detecting vulnerabilities and two state-of-the-art approaches for retrieving functionality-equivalent APIs. The evaluation covers 180 widely used Java repositories and uses 10 existing vulnerabilities along with their PoCs. The results show that APISS effectively retrieves functionality-equivalent APIs, achieving a Top-1 Accuracy of 0.81, while the best baseline achieves only 0.55. APISS is also efficient: the manual cost is within 10 minutes per vulnerability, and the end-to-end runtime overhead of testing one candidate API is less than 2 hours. APISS detects 179 new vulnerabilities and receives 60 new CVE IDs, bringing high value to security practice.
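
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function is a toy bag-of-words stand-in for the LLM embedding, and every API name, signature, and doc string below is hypothetical. The sketch only shows the general idea of ranking candidate APIs by the similarity of their signature plus doc string to those of a known-vulnerable API, and how a Top-1 Accuracy figure could be computed over such rankings.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an LLM embedding (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def describe(api):
    """Use only the signature and doc string, never the API body."""
    return api["signature"] + " " + api["doc"]


def retrieve(query_api, candidates, k=1):
    """Return the k candidates most similar to the known-vulnerable API."""
    query_vec = embed(describe(query_api))
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query_vec, embed(describe(c))),
        reverse=True,
    )
    return ranked[:k]


def top1_accuracy(queries, candidates, expected):
    """Fraction of queries whose top-ranked candidate is the expected one."""
    hits = sum(
        retrieve(q, candidates)[0]["name"] == expected[q["name"]]
        for q in queries
    )
    return hits / len(queries)


# Hypothetical data, for illustration only.
known_vulnerable = {
    "name": "ZipUtil.unzip",
    "signature": "void unzip(File zipFile, File destDir)",
    "doc": "Extracts all entries of a zip archive into the destination directory.",
}
candidates = [
    {
        "name": "ArchiveHelper.extract",
        "signature": "void extract(File archive, File targetDir)",
        "doc": "Extracts the entries of a zip archive into the target directory.",
    },
    {
        "name": "StringUtil.join",
        "signature": "String join(List<String> parts, String sep)",
        "doc": "Joins the strings with a separator.",
    },
]

best = retrieve(known_vulnerable, candidates)[0]
print(best["name"])
```

In a pipeline of the kind the abstract outlines, the top-ranked candidates would then be tested by migrating the PoC of the known vulnerability; that step is semi-automatic in the paper and is not sketched here.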

Keywords

Computer science; Graph; Convolutional neural network; Scalability; Artificial intelligence; ENCODE; Margin (machine learning); Theoretical computer science; Pattern recognition (psychology); Machine learning

Publication Info

Year: 2025
Type: preprint
Citations: 15795
Access: Closed

Cite This

APA Style

Thomas Kipf, Max Welling (2025). Detecting Functionality-Specific Vulnerabilities via Retrieving Individual Functionality-Equivalent APIs in Open-Source Repositories. Dagstuhl Research Online Publication Server. https://doi.org/10.4230/lipics.ecoop.2025.6

Identifiers

DOI: 10.4230/lipics.ecoop.2025.6
