Quick Dive into Trail of Bits' Slither
Intro
Slither is a static analyzer that has been developed by Trail of Bits to help smart contract developers find bugs in their code. In this post, I’ll try to get my hands dirty with Slither so you don’t have to. Moreover, having a background writing static analysis tools myself, I’m curious how Slither is architected and I’m excited at the prospect of writing detector for it…one day.
This post attempts to understand the work-flow of Slither. Target audience for this are folks who
- would like to understand the architecture/work-flow of Slither
- would like to start to write a detector (like me) but don’t know where to start
Treat this as a (shoddy) introduction to Slither, that at the time of writing addresses only the author’s curiosity. haha.
First things first, Slither itself is written in python3
, yaay!
One of the first things slither does is to use the solidity compiler (solc
binary) to obtain the AST of the program to be analyzed.
Therefore, before I proceed, let me install the Solidity compiler.
Since most of the test contracts in the slither code base are targeted at
compiler version 0.4.24, I chose to pick it up from the official GitHub page
here.
One could also fetch the officially distributed compiler for your Ubuntu
distribution like so:
sudo add-apt-repository ppa:ethereum/ethereum
sudo apt-get update
sudo apt-get install solc
Try Slither Out
After installing the solc
binary, I set up a python IDE to debug slither.
Essentially, the idea is to use a good debugger (I’m using Jet Brain’s PyCharm) to step through slither code and understand the steps involved in analyzing smart contracts.
The invocation that I am using for debugging is the elementary:
$ slither <name_of_contract>.sol
What this is supposed to do is analyze the source code of the contract and spit out bug reports, like so:
INFO:Detectors: Uninitialized state variable in ../solidity/001_name_references.sol, Contract: test, Variable: variable, Used in ['f']
INFO:Detectors: Contract 'test' is not in CapWords
INFO:Detectors: Parameter '' is not in mixedCase, Contract: '', Function: 'test''
What you’d notice when you run slither against buggy code are the following things
- The smart contract to be analyzed needs to be compilable but not necessarily runnable
- Bug reports are spit out on
stderr
- Each bug report is prefixed with the string
INFO:Detectors:
But this is too high level, let’s step through slither at a more easy pace
Entry point
The entry point for slither
is the main function of course.
This function is defined in a python file called __main__.py
in the slither distribution.
The very first thing this main function does is to fetch all detectors
and printers
.
Each detector
object in slither detects a class of bugs, and each printer
object logs useful information about the program under analysis e.g., its call graph, what a function is trying to do (so called function summary) etc.
Detectors
To get a sense of the kind of bugs Slither detects, let’s look at the default set of detectors that Slither provides. Here’s an exhaustive list at the time of writing
UninitializedStateVarsDetection,
ConstantPragma,
OldSolc,
Reentrancy,
UninitializedStorageVars,
LockedEther,
ArbitrarySend,
Suicidal,
UnusedStateVars,
TxOrigin,
Assembly,
LowLevelCalls,
NamingConvention,
ConstCandidateStateVars,
ExternalFunction
That makes it a total of 15 detectors for as many bug classes. A brief digression: Until we have a formalization of bug classes as in the C/C++ space (see the common weakness enumeration project), I’d expect bug classification for Solidity to be largely ad-hoc.
Let’s dive deep into an elementary bug class to see how bug detection is
implemented.
The Backdoor
detector (unlisted, but available in source) looks like an
example detector that makes for a good
starting example.
Here’s the backdoor.sol
contract that may be found in the slither code base
that the backdoor detector is meant to detect.
pragma solidity 0.4.24;
contract C{
function i_am_a_backdoor() public{
selfdestruct(msg.sender);
}
}
Evidently, this contract
- defines a function that calls the
selfdestruct
method on the msg sender
What’s the selfdestruct
method?
The only possibility that code is removed from the blockchain is when a contract at that address performs the selfdestruct operation. The remaining Ether stored at that address is sent to a designated target and then the storage and code is removed from the state.
In this intentionally buggy piece of code:
- When some other contract calls
C.i_am_a_backdoor()
the piece of code that points tomsg.sender
i.e., the caller ofC.i_am_a_backdoor()
is going to be removed from the blockchain. C.i_am_a_backdoor()
is a means to hide oneself
So, let’s see what happens when Slither analyzes this piece of code:
INFO:Detectors: Backdoor function found in C.i_am_a_backdoor
INFO:Detectors: Suicidal function in /home/bhargava/work/github/slither/tests/backdoor.sol Contract: C, Function: i_am_a_backdoor
INFO:Detectors: Function 'i_am_a_backdoor' is not in mixedCase, Contract: 'C'
INFO:Detectors: Public function in /home/bhargava/work/github/slither/tests/backdoor.sol Contract: C, Function: i_am_a_backdoor should be declared external
INFO:Slither:/home/bhargava/work/github/slither/tests/backdoor.sol analyzed (1 contracts), 4 result(s) found
Voila, the backdoor function is flagged and reported to the user (see first line of report). We will ignore the other bugs flagged by other detectors since our purpose is to get a general sense of how detection works, not understand the specifics of a particular detector. So, how does the detection work under the hood?
Well, to begin with, any static analyzer needs to “understand” the code being analyzed. What needs to be understood is essentially: “What is this program trying to do? Is there a bug in it?”. These two questions hinge on semantic program analysis which is a complex problem.
We can begin to get a semantic understanding of a program by first looking at its syntax tree. A syntax tree is a tree: A directed acyclic graph that remains acyclic even if directionality is removed. The nodes of the tree are syntactic elements of the programming language in which the analyzed program is written. Here’s a snippet of an actual AST (as a JSON string) of the backdoor program shown above.
{
"attributes" :
{
"absolutePath" : "tests/backdoor.sol",
"exportedSymbols" :
{
"C" :
[
11
]
}
},
"children" :
[
{
"attributes" :
{
"literals" :
[
"solidity",
"0.4",
".24"
]
},
"id" : 1,
"name" : "PragmaDirective",
"src" : "0:23:0"
},
{
"attributes" :
{
"baseContracts" :
[
null
],
"contractDependencies" :
[
null
],
"contractKind" : "contract",
"documentation" : null,
"fullyImplemented" : true,
"linearizedBaseContracts" :
[
11
],
"name" : "C",
"scope" : 12
},
...
}
...
}
Hope this gives you a sense of the AST.
The AST is essentially a dictionary object with certain top-level attributes
and a list of children.
For example, one of the children is the pragma
directive on line 1 of
backdoor.sol
.
This child contains an ID, mapping to the source file, and a list of string
literals it holds together.
In the following, I briefly describe what happens inside Slither even before
bug detection is attempted.
Step 1: Obtain AST
The first thing that slither does is obtain the AST of the analyzed
program in the form of a JSON string using the Solidity compiler, solc
.
solc
supports this off-the-shelf with such an
invocation as:
$ ./solc tests/backdoor.sol --ast-json --allow-paths .
Step 2: Parse AST into CFG
Once the AST (JSON string) has been obtained, the next thing Slither does is to parse it. Parsing the AST entails parsing the JSON of the AST. The AST parsing in Slither is quite sophisticated, not something I can describe succinctly here.
The main idea behind parsing the AST is to created a (cyclic) directed graph that shows control flow in the analyzed smart contract. This is necessary because the AST itself is not adequate to grasp control-flow.
Control-flow graph is created at the granularity of a function call i.e., each function in the analyzed smart contract maps to a corresponding CFG. You can find the function that does the AST parsing/CFG creation here.
Step 3: Drop to Slithir
Once the CFG has been created for all functions in the smart contract under analysis, Slither drops the AST/CFG representation of the analyzed smart contract into an SSA-based intermediate representation called Slithir. By “dropping”, I mean conversion from a higher-level program abstraction (AST/CFG) to a lower-level program abstraction (Slithir). But why?
I can only hazard the following guesses:
- Analysis based on an IR removes the dependency on the PL in which a smart contract is written. If tomorrow, a new smart contract PL is invented, Slither can still support it by adding a parser/converter to IR.
- SSA-based IR makes certain kinds of analysis simpler (see section called “Benefits” in the SSA wiki article)
Step 4: Detect Backdoor
Steps 1–3 are performed as the Slither python object is created. Once the analysis infrastructure is ready (AST,CFG,Slithir), detectors are processed sequentially. Each detector encodes the “business logic” of detection for the bug class that it is meant to detect.
So, let’s see what’s happening in the sample backdoor detector.
class Backdoor(AbstractDetector):
"""
Detect function named backdoor
"""
ARGUMENT = 'backdoor' # slither will launch the detector with slither.py --mydetector
HELP = 'Function named backdoor (detector example)'
IMPACT = DetectorClassification.HIGH
CONFIDENCE = DetectorClassification.HIGH
def detect(self):
ret = []
for contract in self.slither.contracts_derived:
# Check if a function has 'backdoor' in its name
for f in contract.functions:
if 'backdoor' in f.name:
# Info to be printed
info = 'Backdoor function found in {}.{}'.format(contract.name, f.name)
# Print the info
self.log(info)
# Add the result in ret
source = f.source_mapping
ret.append({'vuln': 'backdoor', 'contract': contract.name, 'sourceMapping' : source})
return ret
You’d notice that the business logic of bug detection is quite concise.
The detection logic resides in the detect
method of the Backdoor
object
that implements the AbstractDetector
interface.
To my mind, this is the python equivalent of a Clang Static Analyzer
checker.
Everything that a detector wants to know about the program is contained in
the self.slither
object.
This object contains the following fields:
contracts_derived
: This field holds the_data
: AST obtained from the Solidity compilerfunctions
: CFG of all functions in the contractslither
: Slithir representation of the contract
The detector uses this information to decide whether to flag a bug or not. A detector need only use the information that is necessary for the bug detection logic. For example, here’s what the backdoor detector is doing
- Iterate over all functions in the analyzed contract
- If a function is called “backdoor”
- Flag a bug saying “backdoor found”
- If a function is called “backdoor”
- return a nicely formatted bug diagnostics object (list of dictionaries, each dictionary being a distinct bug report)
In other words, the backdoor
detector is only using the function.name
field in the function’s CFG to flag a bug.
Of course, this is cheating cos you can’t simply conclude that a function is
a backdoor if it is called one.
However, the reason I picked up this specific detector is because it is meant
as an introduction to writing detectors.
In the real-world, you’d do some analysis on the IR (e.g., check if the
analyzed function makes a call to the selfdestruct
function) before
concluding that it is indeed a backdoor.
Perhaps, this entails listing all calls made by a function and checking if
selfdestruct
happens to be one of them.
Outro
So that was a quick dive into Slither. We laid out the work flow of Slither from (1) taking the AST of a smart contract as input, (2) producing its CFG, (3) reducing this to an SSA-based IR (4) and finally, detecting bugs based on program information contained in the IR.
If there is some specific aspect of Slither you’d want to know more about that this post didn’t cover, let me know. When I have the time, I’d be more than happy to write a part 2 of this post. That’s all folks.