Ralf Treinen: Some Recent Talks
Mining Debian Maintainer Scripts
With Nicolas Jeannerod.
Debian sid contains more than 31,000 maintainer scripts, the
vast majority of which are written in the POSIX shell language. We
have written, in the context of the CoLiS project, the tool shstats
which allows us to do a statistical analysis of a large corpus of
shell scripts. The tool operates on concrete syntax trees of shell
scripts constructed by the Morbig shell parser. The Morbig parser has
already been presented in a devroom at FOSDEM 2018 and at the
MiniDebConf 2018 in Hamburg. The shstats tool comes with a number of
analysis modules, and it is easy to extend it by adding new modules;
a sketch of what such a module might look like is given after the
list below. In this talk we will present the design of the analyzer
tool, show how it can be extended with new analysis modules, and
discuss some of the results we have obtained so far on the corpus of
sid maintainer scripts. Among these are answers to questions like:
- are recursive functions used in maintainer scripts?
- how many variable assignments are in reality definitions of constants?
- how many shell scripts don't do any variable assignments, besides
definitions of constants?
- how often are shell constructs like while, for, if, etc. used?
- which UNIX commands are most commonly used in the corpus, and
with which options?
- are there any syntax errors in the arguments of complex commands like test?
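To give an idea, here is a minimal, self-contained OCaml sketch of an
analysis module in the spirit described above; the toy syntax tree type
and all function names are illustrative inventions, not the actual
shstats or Morbig API.

  (* Toy concrete syntax trees for shell scripts; the trees built by
     the real Morbig parser are much richer. *)
  type cst =
    | Word of string
    | While of cst list * cst list          (* condition, body *)
    | If of cst list * cst list * cst list  (* condition, then, else *)
    | Seq of cst list

  (* Does a syntax tree contain a while loop anywhere? *)
  let rec has_while = function
    | Word _ -> false
    | While _ -> true
    | If (c, t, e) -> List.exists has_while (c @ t @ e)
    | Seq l -> List.exists has_while l

  (* An analysis module in the style sketched above: it is fed one tree
     per script and prints a summary over the whole corpus at the end. *)
  let total = ref 0
  let with_while = ref 0

  let process_script cst =
    incr total;
    if has_while cst then incr with_while

  let report () =
    Printf.printf "%d of %d scripts use a while loop\n" !with_while !total

  let () =
    process_script (Seq [Word "true"]);                       (* toy corpus *)
    process_script (While ([Word "true"], [Word "sleep 1"]));
    report ()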
Talk given at
[Slides (PDF)]
Parsing Posix [S]hell
With Yann Régis-Gianas.
Parsing the POSIX shell language is challenging: lexical analysis
depends on the parsing context; parsing is specified by an ambiguous
BNF grammar annotated with ad hoc rules; the language specification is
informal... and, the icing on the cake, statically parsing a shell
script is actually undecidable!
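To illustrate the first point, here is a minimal, self-contained OCaml
sketch (illustrative only, not Morbig's actual code): whether the
characters 'for' form a reserved word or an ordinary word cannot be
decided by the lexer alone; it depends on whether the parser currently
expects the beginning of a command.

  type token = Keyword of string | Word of string

  let reserved = ["for"; "while"; "if"; "then"; "do"; "done"; "fi"]

  (* Promoting a word to a keyword is a parser decision: it depends on
     the parsing context, not on the characters alone. *)
  let classify ~command_position w =
    if command_position && List.mem w reserved then Keyword w else Word w

  let () =
    (* 'for' at the start of a command introduces a loop... *)
    assert (classify ~command_position:true "for" = Keyword "for");
    (* ...but the same characters as an argument of echo are a plain word. *)
    assert (classify ~command_position:false "for" = Word "for")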
Forget about your textbooks! Existing implementations of shell
interpreters contain hand-crafted, character-level, non-modular
syntactic analyzers, and yes, their source code is very hard to read.
What if you had to program the parser of a shell script verifier?
Would you accept throwing away your favorite lexer and parser
generators, which have let you write high-level declarative
implementations for decades? Would you accept mixing lexing, parsing
and evaluation in a single large bowl of spaghetti code? Of course
not. You would hardly trust such a parser.
In this talk, we will introduce "Morbig", a modular, high-level and
interoperable parser for POSIX shell. The implementation of Morbig
relies on key features of Menhir, an LR(1) parser generator for OCaml.
Talk given at
[Slides (PDF)]
Towards the formal verification of maintainer scripts
This talk describes a recently started research project named
[http://colis.irif.univ-paris-diderot.fr/ CoLiS] with the goal of
developing techniques and tools for the formal verification of
maintainer scripts (preinst, postinst, prerm, postrm).
Program verification aims at obtaining a formal assurance that a
program is correct with respect to a given specification. This is
achieved by constructing a formal proof of correctness of the
program. In contrast to program testing, the existence of a proof
assures that the program behaves correctly in any situation described
by the specification. Failure of an attempt to verify a program, on
the other hand, can often be used to generate useful test cases.
A possible example of a program specification is the absence of
execution errors under certain initial conditions. Automatic program
verification is an extremely challenging task even for this kind of
specification. In the case of Debian maintainer scripts we are faced
with even more challenging properties, like idempotency of scripts
(required by policy), or commutation of scripts.
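For instance, writing $\llbracket s \rrbracket$ for the effect of a
script $s$ on the system state, these two properties can be sketched
as follows (the notation is mine, not the project's formalization):

  \[ \llbracket s \rrbracket \circ \llbracket s \rrbracket = \llbracket s \rrbracket \qquad \text{(idempotency)} \]
  \[ \llbracket s_1 \rrbracket \circ \llbracket s_2 \rrbracket = \llbracket s_2 \rrbracket \circ \llbracket s_1 \rrbracket \qquad \text{(commutation)} \]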
The project is still in its early stages, so there are no results yet
to present. However, I will explain why I think that the case of
Debian maintainer scripts is very interesting for program
verification: some aspects of the scripts (POSIX shell, manipulation
of a complex data structure) make the problem very difficult, while
other aspects of the Debian case are likely to make it easier. I am
also very interested in input from the audience about possible use
cases of script verification in the context of Debian.
Talk given at
[Slides (PDF)]
Mancoosi tools for the analysis and quality assurance of FOSS
distributions
We will present a set of tools for the analysis and quality assurance
of FOSS distributions. These tools analyse package metadata (like
dependencies, build-dependencies, conflicts, etc.) and answer
questions like the following (the propositional encoding on which
such tools are based is sketched after the list):
- which packages are installable and which are not?
- for which packages is it possible to install a build environment?
- which packages are absolutely mandatory for the installation of a
package (which might have dependencies with alternatives, or on
virtual packages)?
- which packages will certainly become uninstallable when I upgrade
a certain package?
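All of these questions are answered on top of the same propositional
encoding of package metadata, sketched here in simplified form, with
one boolean variable $x_p$ per package $p$:

  \[ x_p \rightarrow (x_{d_1} \lor \cdots \lor x_{d_k}) \quad \text{for each dependency } d_1 | \cdots | d_k \text{ of } p \]
  \[ \lnot (x_p \land x_q) \quad \text{for each conflict between } p \text{ and } q \]

A package $p$ is then installable if and only if $x_p$ is satisfiable
together with all these clauses, a question that can be handed to a
SAT solver.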
This talk is not pkgsrc-specific, but will give an overview of the
work done by Mancoosi and discuss possible collaborations between
Mancoosi and the pkgsrc community.
This is joint work with
Pietro Abate,
Yacine Boufkhad,
Jaap Boender,
Roberto Di Cosmo,
Jérôme Vouillon,
and Stefano Zacchiroli.
Talk given at the
[Slides (PDF)]
News from EDOS: finding outdated packages
The best-known tool from the EDOS project, today called dose-debcheck,
analyses the installability of packages according to their metadata
(conflicts, dependencies, etc.). This allows us to find packages that
certainly are not installable, but it is not always very helpful for
pinpointing the origin of a problem. In this talk we present a new
notion: a package is outdated if it is not installable, and
furthermore the only way to make it installable is to update the
package itself, as opposed to merely upgrading some of its
dependencies. It turns out that this property can be defined in terms
of installability in possible future versions of a repository. We
explain how our algorithm can actually check this property, and we
report on our experience with running the tool on Debian.
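Slightly more formally (a sketch in my own notation): writing $F(R)$
for the set of possible futures of a repository $R$, i.e. repositories
obtained from $R$ by upgrading arbitrary packages other than $p$
itself, we have

  \[ p \text{ is outdated in } R \iff \forall R' \in F(R) :\ p \text{ is not installable in } R' \]

In other words, no upgrade of the other packages alone can ever make
$p$ installable.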
This is joint work with
Pietro Abate,
Roberto Di Cosmo,
and Stefano Zacchiroli.
Talk given at the
[Slides (PDF)]
Symbolic Protocol Analysis for Monoidal Equational Theories
(or: How Algebra Helps to Solve Intruder Constraints)
We consider the design of automated procedures for analyzing the (in)security
of cryptographic protocols in the Dolev-Yao model for a bounded number of
sessions when we take into account some algebraic properties satisfied by the
operators involved in the protocol. This leads to a more realistic model than
what we get under the perfect cryptography assumption, but it implies that
protocol analysis deals with terms modulo some equational theory instead of
terms in a free algebra.
The main goal of this work is to set up a general approach that works
for a whole class of so-called monoidal theories, which contains many
of the specific cases that have been considered so far in an ad hoc
way, e.g. exclusive or, Abelian groups, and exclusive or in
combination with the homomorphism axiom.
We follow a classical schema for cryptographic protocol analysis which
first proves a locality result and then reduces the insecurity problem
to a symbolic constraint solving problem. This approach strongly
relies on the correspondence between a monoidal theory E and a
semiring S_E, which we use to deal with the symbolic constraints. We
show that the well-defined symbolic constraints that are generated by
reasonable protocols can be solved provided that unification in the monoidal
theory satisfies some additional properties. The resolution process boils down
to solving particular quadratic Diophantine equations that are reduced to
linear Diophantine equations, thanks to linear algebra results and the
well-definedness of the problem. For theories that do not satisfy our
additional properties the problem appears to be undecidable, which
suggests that our characterization is reasonably tight.
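For instance, the semirings associated with some of the monoidal
theories mentioned above are (as established in the literature on
monoidal theories):

  \[ S_{\mathrm{ACUN}} = \mathbb{Z}/2\mathbb{Z} \qquad S_{\mathrm{AG}} = \mathbb{Z} \qquad S_{\mathrm{ACUNh}} = \mathbb{Z}/2\mathbb{Z}[h] \]

where ACUN is the theory of exclusive or, AG that of Abelian groups,
and ACUNh adds the homomorphism axiom to ACUN; unification modulo such
a theory then amounts to solving linear equations over S_E.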
This is joint work with
Stéphanie Delaune,
Pascal Lafourcade,
and Denis Lugiez.
Talk given at
[Slides (postscript)]
Symbolic Protocol Analysis in Presence of a Homomorphism Operator
and Exclusive-Or
Security of a cryptographic protocol for a bounded number of sessions is
usually expressed as a symbolic trace reachability problem. We show that
symbolic trace reachability for well-defined protocols is decidable in
the presence of the theory of exclusive or in combination with the
homomorphism axiom. These theories allow us to model basic properties of
important cryptographic operators.
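Concretely, the combined theory consists of the standard axioms of
exclusive or together with the homomorphism axiom:

  \[ x \oplus (y \oplus z) = (x \oplus y) \oplus z \qquad x \oplus y = y \oplus x \qquad x \oplus 0 = x \qquad x \oplus x = 0 \]
  \[ h(x \oplus y) = h(x) \oplus h(y) \]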
This trace reachability problem can be expressed as a system of symbolic
deducibility constraints for a certain inference system describing the
capabilities of the attacker. One main step of our proof consists in reducing
deducibility constraints to constraints for deducibility in one step of the
inference system. This constraint system, in turn, can be expressed as a
system of quadratic equations of a particular form over Z/2Z[h], the
ring of polynomials in one indeterminate over the finite field
Z/2Z. We show that satisfiability of such systems is
decidable.
This is joint work with
Stéphanie Delaune,
Pascal Lafourcade,
and Denis Lugiez.
Talk given at
[Slides (postscript)]
The PROUVÉ Project: PRotocoles cryptographiques, OUtils de VÉrification
automatique (Cryptographic Protocols: Tools for Automatic Verification)
PROUVÉ is a three-year RNTL project that brings together CRIL
Technology, France Télécom R&D, INRIA Lorraine, LSV, and VERIMAG. The
project is concerned with the verification of cryptographic protocols.
Its starting point is a critique of existing methods for the
verification of cryptographic protocols:
- existing verification tools lack a common input language and a
common semantics
- protocol specification languages are either too simplistic or too
abstract
- the abstraction of the encryption methods is too far removed from
real protocol implementations
The objectives of the project are:
- the development of an expressive specification language for
cryptographic protocols with a precise semantics, together with a
generic language for protocol properties in which many variants of
the classical properties can be expressed
- the integration of the existing verification tools of the partner
laboratories into a common toolbox, in particular the development of
translators from the specification language and the property language
into the input languages of the tools
- the verification of cryptographic protocols under weakenings of the
perfect cryptography assumption, in particular the replacement of
nonces by timestamps or counters, and taking into account the
algebraic properties of cryptographic primitives
- the development of a common interface with a uniform presentation
of proofs and attacks
The project will tackle two significant case studies, an electronic
purse and an auction protocol, which will serve to guide the research,
to exercise the tools, and to validate the results.
Talk given at
- IRISA, Rennes, 27/01/2005.
[Slides (postscript)]
Easy Intruder Deductions
We investigate extensions of the Dolev-Yao model of a passive intruder
eavesdropping on a cryptographic protocol by algebraic properties of
cryptographic primitives. We provide sufficient conditions under which
the intruder deduction problem is decidable in polynomial time. We
apply this result to the equational theory of homomorphism, and show
that in this case the intruder deduction problem can even be solved in
linear time, provided that the messages are in normal form.
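For reference, the core of the classical Dolev-Yao deduction system
for a passive intruder, here for pairing and symmetric encryption (the
equational theory is then added on top of these rules):

  \[ \frac{T \vdash u \quad T \vdash v}{T \vdash \langle u,v \rangle} \qquad \frac{T \vdash \langle u,v \rangle}{T \vdash u} \qquad \frac{T \vdash \langle u,v \rangle}{T \vdash v} \]
  \[ \frac{T \vdash u \quad T \vdash v}{T \vdash \{u\}_v} \qquad \frac{T \vdash \{u\}_v \quad T \vdash v}{T \vdash u} \]

The intruder deduction problem then asks, given a finite set of
messages T known to the intruder and a message s, whether T ⊢ s is
derivable modulo the equational theory.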
This is joint work with
Hubert Comon-Lundh.
Talk given at
[Slides (postscript)]
Tree Automata and Tree Constraints
We are interested in the question of whether automata techniques can
help to solve constraints over trees.
Tree constraints are first-order formulas built over a given set of
predicate symbols with a fixed interpretation in a domain of trees. The
semantics of such a tree constraint is a relation over trees. Tree
automata, on the other hand, provide a mechanism to define sets of trees.
Tree automata are well investigated; they are known to enjoy basically
the same nice properties as word automata.
Unfortunately, classical automata are not well equipped to recognize
the kinds of relations that arise as the semantics of constraints.
What is missing is
- a means to encode a (non-unary) relation, that is, to recognize a
language of n-tuples of trees; the standard technique is to consider
an alphabet of n-tuples of symbols (see the sketch after this list).
- the ability of the automaton to perform tests, in particular
equality and disequality tests; this is necessary since tree
constraints may contain explicit or implicit equations.
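As a minimal, self-contained OCaml illustration of the tuple encoding
(my example, not from the talk): a pair of trees is read as a single
tree over the alphabet of pairs of symbols, and a synchronized
traversal plays the role of an automaton over that alphabet; the
relation recognized here is simply equality of the two components.

  (* Binary trees over a ranked alphabet of characters. *)
  type tree = Leaf of char | Node of char * tree * tree

  (* A run over the pair alphabet: accept iff every position carries a
     symbol pair of the form (a, a), i.e. the two trees agree. *)
  let rec diagonal t1 t2 =
    match t1, t2 with
    | Leaf a, Leaf b -> a = b
    | Node (a, l1, r1), Node (b, l2, r2) ->
        a = b && diagonal l1 l2 && diagonal r1 r2
    | _ -> false  (* shape mismatch: the overlapping reads a padding symbol *)

  let () =
    let t = Node ('f', Leaf 'a', Leaf 'b') in
    assert (diagonal t t);
    assert (not (diagonal t (Leaf 'a')))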
In the case of word automata both extensions pose no problem, since
they are subsumed by the closure properties of word automata. The same
holds for the extension of tree automata to tuples, and it also holds,
at least in some interesting special cases, for the extension of tree
automata by tests. Unfortunately, we encounter severe problems when we
want to combine both extensions of tree automata.
We will discuss the (mostly negative) results that we have obtained
for our new class of tree automata, and we will illustrate the
automata approach with the (still open) problem of entailment of
non-structural subtype constraints.
This is partly joint work with Alex Aiken, Joachim Niehren, Tim
Priesnitz, and Zhendong Su.
Talk given at
- Tsukuba Software Science Seminar, Tsukuba, Japan, 26/11/2001.
- 11th Seminar on Algebra, Logic and Geometry in Informatics (ALGI),
Kinosaki, Japan, 4/12/2001.
- Séminaire du Laboratoire Spécification et Vérification, ENS Cachan,
France, 15/1/2002.
[Slides (postscript)]
Ralf Treinen
July 13, 2012.