You can subscribe to our seminar announcements:
http://gallium.inria.fr/seminaires/

                      S E M I N A R

                    I N R I A - Paris
          2 rue Simone Iff (or: 41 rue du Charolais)
                Salle Lions 2, Building C

                Monday, 3 October, 14:30

                     --------------
                     Andreas Zeller
                     --------------

                   Saarland University

           ==================================
           Mining Input Grammars for Security
           ==================================

[This is not a Gallium seminar, but an IRILL seminar, taking place at
Inria Paris at 14:30.]

Knowing which part of a program processes which parts of an input can
reveal the structure of the input as well as the structure of the
program. In a URL "http://www.example.com/path/", for instance, the
protocol "http", the host "www.example.com", and the path "path" would
be handled by different functions and stored in different variables.
Given a set of sample inputs, we use _dynamic tainting_ to trace the
data flow of each input character, and aggregate those input fragments
that would be handled by the same function into lexical and syntactical
entities. The result is a _context-free grammar_ that accurately
reflects valid input structure; as it draws on function and variable
names, it can be as readable as textbook examples.

In my talk, I show how our AUTOGRAM prototype derives such grammars
automatically, and point out their uses in software engineering and
security:

* They facilitate reverse engineering of input formats as well as
  manually writing valid test inputs;

* They produce high numbers of varied and valid inputs, thus
  facilitating automated robustness testing and fuzzing;

* Integrated into a checking parser, they protect existing programs
  against invalid, unexpected, and malicious inputs and behaviors.

This work was conducted with Matthias Höschele and Konrad Jamrozik and
presented at ASE 2016 (https://www.st.cs.uni-saarland.de/models/autogram/)
and ICSE 2016 (http://www.boxmate.org).
It is part of the ERC SPECMATE project, funded by an ERC Advanced Grant.
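
To make the idea in the abstract concrete, here is a toy sketch in
Python. It is not the AUTOGRAM implementation and does no real dynamic
tainting: it assumes a hand-written URL splitter whose helper functions
(protocol, host, path, all hypothetical names) each log the input
fragment they receive, and it hard-codes the top-level rule. It only
illustrates how fragments handled by the same function could be
aggregated into grammar alternatives named after that function.

```python
# Toy illustration only: NOT the AUTOGRAM implementation.
# Logging each function's argument stands in for real dynamic tainting
# of individual input characters.

TRACE = []  # (function name, input fragment) pairs for the current input


def traced(func):
    """Record which function handled which input fragment."""
    def wrapper(fragment):
        TRACE.append((func.__name__, fragment))
        return func(fragment)
    return wrapper


@traced
def protocol(fragment):
    return fragment


@traced
def host(fragment):
    return fragment


@traced
def path(fragment):
    return fragment


def parse_url(url):
    """Hypothetical program under analysis: splits a URL into three parts."""
    proto, rest = url.split("://", 1)
    hostname, _, remainder = rest.partition("/")
    return protocol(proto), host(hostname), path(remainder.strip("/"))


def mine_grammar(samples):
    """Aggregate fragments by the name of the function that handled them."""
    # The top-level rule is hard-coded here (an assumption of this sketch).
    grammar = {"url": ["<protocol>://<host>/<path>/"]}
    for sample in samples:
        TRACE.clear()
        parse_url(sample)
        for name, fragment in TRACE:
            alternatives = grammar.setdefault(name, [])
            if fragment not in alternatives:
                alternatives.append(fragment)
    return grammar


samples = ["http://www.example.com/path/", "https://inria.fr/seminar/"]
grammar = mine_grammar(samples)
for nonterminal, alternatives in grammar.items():
    print(f"<{nonterminal}> ::= " + " | ".join(alternatives))
```

With the two sample URLs above, the rule for <protocol> ends up with the
alternatives "http" and "https", which shows how nonterminal names can
come straight from function names in the program.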