Leveraging Flawed Tutorials for Seeding Large-Scale Web Vulnerability Discovery

In »Leveraging Flawed Tutorials for Seeding Large-Scale Web Vulnerability Discovery« (PDF), researchers from TU Berlin, TU Braunschweig and Trend Micro test the hypothesis that people copy code from online tutorials and sites like Stack Overflow even when that code is bad.
That is, one rotten tutorial can spoil the lot:

Based on our assertion, we hypothesize that vulnerability discovery can be seeded by code snippets such as those found in top-ranked tutorials. Viewed from an adversarial standpoint, we present a novel approach for bootstrapping vulnerability discovery at scale. Our main intuition is that recurring vulnerabilities can be found by recognizing, and subsequently looking for patterns in code that correspond to the original vulnerability. We refer to instances of these patterns as code analogues throughout the rest of the paper. Our expectation is that if such a pattern recurs, so will the corresponding vulnerability.

and further down:

Thanks to our framework, we have uncovered over 100 vulnerabilities in web application code that bear a strong resemblance to vulnerable code patterns found in popular tutorials. More alarmingly, we have confirmed that 8 instances of a SQLi vulnerability present in different web applications are an outcome of code copied from a single vulnerable tutorial. Our results indicate that there is a substantial, if not causal, link between insecure tutorials and web application vulnerabilities.
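
To make the idea of a "code analogue" concrete, here is a minimal sketch in Python. It is not the paper's framework: it is a deliberately naive regex stand-in, and the pattern, the function name `find_analogues` and the two PHP samples are all made up for illustration. It flags calls where a raw request parameter is put straight into a SQL query, which is the general shape of a SQLi-prone tutorial snippet.

```python
import re

# Hypothetical "seed": the shape of a vulnerable tutorial snippet, in which a
# raw request parameter ends up inside the argument of a query call.
# This is an assumption for illustration, not a pattern taken from the paper.
SEED_PATTERN = re.compile(
    r"""(?:mysql_query|mysqli_query)\s*\(    # a query call ...
        [^;]*?                                # ... within the same statement ...
        \$_(?:GET|POST|REQUEST|COOKIE)\s*\[   # ... uses a raw request parameter
    """,
    re.VERBOSE,
)


def find_analogues(php_source: str) -> list[int]:
    """Return the 1-based line numbers where a match of the seed pattern starts."""
    return [
        php_source.count("\n", 0, match.start()) + 1
        for match in SEED_PATTERN.finditer(php_source)
    ]


# Made-up sample: user input concatenated directly into the query string.
VULNERABLE = '''<?php
$result = mysql_query("SELECT * FROM users WHERE id = " . $_GET['id']);
'''

# Made-up sample: the same lookup using a prepared statement.
SAFE = '''<?php
$stmt = $pdo->prepare("SELECT * FROM users WHERE id = ?");
$stmt->execute([$_GET['id']]);
'''

print(find_analogues(VULNERABLE))
print(find_analogues(SAFE))
```

A text pattern like this misses a lot, for example input that is first assigned to a variable and only later used in the query, so it should be read as an illustration of "look for code shaped like the bad tutorial", not as a usable scanner.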

So, overall, the conclusion is:

[other results]

  • our results give credence to the widely known anecdote that programmers copy and paste code from vulnerable tutorials. Our case study, involving 64,415 PHP projects hosted on GitHub, indicates that such ad hoc code re-use may endanger the security of software throughout the opensource landscape.