The PreCA precondition inference framework is now released. Check out its features and how you can use it for research and code understanding.
Do you often need to reverse some code or simply to use not-so-well-documented functions? Then you surely asked yourself many times on which input a function must be called. Answering this question is called precondition inference and PreCA is a new framework to tackle this problem. PreCA is completely black-box and, thus, does not rely on static analysis. As such, it will not be impacted by code translations like optimizations or obfuscation. Moreover, if the source code is not available, it makes no difference for PreCA, you only need to be able to run the code.
So why not trying it?
But first, what is precondition inference exactly?
Given a function F
it may be useful to know on which input we can run F
to get the desired output. This is exactly what precondition inference does.
It takes the function F
under analysis and a predicate Q
called a postcondition, stating what is the desired output.
From F
and Q
, it infers a predicate P
, s.t., the execution of F
on any input agreeing with P
terminates, does not crash, and the output agrees with Q
.
Such predicate P
is called a precondition. The best precondition, i.e., the less restrictive one, is then called the weakest precondition.
Enough with theory, now to practice!
Example: understanding strcat
Here is a basic implementation of the standard strcat
function which appends the second string to the first one.
char * strcat(char *s, char* append)
{
char *save = s;
for (; *s; ++s);
while (*s++ = *append++);
return(save);
}
Take time to understand it and try to find on which input the execution terminates and does not crash i.e., the precondition when Q = true
.
At first glance, we just have two pointers that are dereferenced. As such, it seams natural to say that P = valid(s) and valid(append)
.
Unfortunately, this is not enough. Indeed, if the two pointers are valid, no crash will occur. However, if the strings
overlap, append
will rewrite itself and execution will never terminate.
As such, it seems reasonable to state that P = valid(s) and valid(append) and not strOverlap(s, append)
. This is indeed what we can
understand from the manual. However, this precondition is too restrictive. Indeed, if the append
string is empty, the two strings can overlap.
In the end, the weakest precondition is: P = valid(s) and valid(append) and [ strlen(append) > 0 => not strOverlap(s, append) ]
.
If we run PreCA over this function, this is indeed what we get.
$ java -ea -jar preca.jar -file examples/strcat_conacq.txt
...
*************Learned Network CL example ******
network var=4 cst=3
-------------------------
CONSTRAINTS:
Valid(var0)
Valid(var1)
NotOverlap(var0, var1)_or_StrlenEq0(var1)[0, 1, 2, 3]
-------------------------
*************Learned Network CL example (SMTLIB) ******
(and (valid v0) (valid v1) (or (not (overlap v0 v1)) (strleneq v1 #x00000000)))
PreCA: Precondition Constraint Acquisition
What is it?
PreCA it a tool that aims to infer the precondition of a function at binary level. It is completely black-box and, as such, is not impacted by syntactic complexity introduced by optimizations or obfuscation. But how does it works? PreCA relies on Constraint Acquisition, a symbolic machine learning framework, to infer the precondition from observed executions. PreCA generates itself the needed testcases (active learning), which enables it to enjoy good guarantees.
In our IJCAI’22 paper, we show that PreCA is more precise and faster that the state-of-the-art.
What it ships
PreCA 0.1.0 enables the following features:
- Proposes to use different Constraint Acquisition algorithms (Conacq, DCA)
- Exhibit a precondition language over pointers (overlap, aliasing …) and integers
- Gives all scripts to replicate experiments from the papers.
Documentation explaining how to run PreCA is available here.
Cannot help but to try it?
Do not hesitate to get the docker image and the source to try PreCA out. For more information about PreCA check the README, the PreCA paper, and the DCA paper.