How pairing SAST with AI dramatically reduces false positives in code security

The core problem: Context vs. rules

Traditional SAST tools, as we know, are rule-bound; they inspect code, bytecode, or binaries for patterns that match known security flaws. While effective, they often fall short on contextual understanding, missing vulnerabilities that involve complex logical flaws, multi-file dependencies, or hard-to-track code paths. This gap is why their precision rates, the share of true vulnerabilities among all reported findings, remain low. In our empirical study, the widely used SAST tool Semgrep reported a precision of just 35.7%.

Our LLM-SAST combination is designed to bridge this gap. LLMs, pre-trained on massive code datasets, possess pattern-recognition capabilities for code behavior and a knowledge of dependencies that deterministic rules lack. This allows them to reason about the code’s behavior in the context of the surrounding code, related files, and the entire code base.

A two-stage pipeline for intelligent triage

Our framework operates as a two-stage pipeline, leveraging a SAST core (in our case, Semgrep) to identify potential risks and then feeding that information into an LLM-powered layer for intelligent analysis and validation.
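
To make the idea concrete, here is a minimal sketch of such a pipeline in Python. It assumes Semgrep is installed and on the PATH, and it uses an ask_llm helper (hypothetical, standing in for whatever chat-completion client you use) to return the model's verdict; the CONTEXT_LINES window size is likewise an illustrative choice, not a value from our study.

    import json
    import subprocess
    from pathlib import Path

    # Hypothetical helper standing in for any chat-completion client.
    from my_llm_client import ask_llm

    CONTEXT_LINES = 30  # illustrative window of surrounding code shown to the model


    def run_semgrep(target_dir: str) -> list[dict]:
        # Stage 1: run Semgrep and collect its raw findings as JSON.
        result = subprocess.run(
            ["semgrep", "scan", "--config", "auto", "--json", target_dir],
            capture_output=True, text=True, check=True,
        )
        return json.loads(result.stdout).get("results", [])


    def triage_finding(finding: dict) -> str:
        # Stage 2: ask the LLM whether the finding is real, given surrounding code.
        path = Path(finding["path"])
        start = finding["start"]["line"]
        lines = path.read_text().splitlines()
        lo, hi = max(0, start - CONTEXT_LINES), start + CONTEXT_LINES
        snippet = "\n".join(lines[lo:hi])

        prompt = (
            f"Semgrep rule {finding['check_id']} flagged line {start} of {path}.\n"
            f"Surrounding code:\n{snippet}\n\n"
            "Considering the context, is this a genuine vulnerability? "
            "Answer TRUE_POSITIVE or FALSE_POSITIVE with a one-sentence reason."
        )
        return ask_llm(prompt)


    if __name__ == "__main__":
        for finding in run_semgrep("./src"):
            print(finding["check_id"], "->", triage_finding(finding))

In a real deployment the prompt would also pull in related files and data-flow hints, but even this small slice of surrounding code gives the model context that a pattern rule alone cannot see.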
