<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Damien Octeau</title>
<link href="css.css" rel="stylesheet" type="text/css">
<script src="js/ga.js"></script>
</head>
<body>
<table cellspacing="0" cellpadding="5px">
<tr> <td class="myname">Damien Octeau</td> </tr>
<tr> <td id="nav" class="mylinks"> <myl> <a href="index.html">Home</a> ◊ <strong>Research</strong> ◊ <a href="publications.html" class="linknostyle">Publications</a> ◊ <a href="vita.html" class="linknostyle">Vita</a> ◊ <a href="tools.html" class="linknostyle">Tools</a> </myl> </td> </tr>
</table>
<div class="content">
<td class="body-research">
<h2>Retargeting Mobile Applications to Java Bytecode</h2>
<p class="text">
Android applications are developed in Java but compiled to platform-specific
Dalvik bytecode. Dalvik bytecode runs in a Dalvik virtual machine, which was
designed for resource-constrained platforms such as smartphones and tablets.
Since existing analysis frameworks target Java source code and bytecode, it is
necessary to convert Android applications to these well-known Java formats.
</p>
<p class="text">
ded is a project that aims to decompile Android applications. The ded
tool retargets Android applications in .dex format to traditional .class
files. These .class files can then be processed by existing Java tools,
including decompilers. Thus, Android applications can be analyzed using a
vast range of techniques developed for traditional Java applications.
</p>
<p class="text">
ded was the first tool able to reliably convert Android applications
to source code. It was used in a seminal large-scale analysis of Android
applications, in which we decompiled the 1,100 most popular applications
using ded. The decompiled code was then analyzed with Fortify Source Code
Analyzer (SCA), for which we implemented Android-specific detection rules.
While this analysis did not reveal any malware, we found that phone
identifiers and other personally identifiable information were widely used
by Android applications.
</p>
<center>
<div style="width:250px">
<img src="pics/dare_overview.jpg" alt="Dare retargeting process overview" style="max-width:100%">
</div>
</center>
<p class="text">
In contrast to ded, the Dare tool adopts a principled approach to Dalvik
retargeting. Its typed intermediate representation relies on a strong type
inference algorithm and allows translation to Java bytecode using only 9
rules for all 257 Dalvik opcodes. An important feature of Dare is
its ability to rewrite unverifiable input bytecode so that the output Java
bytecode is verifiable. These stronger methods make it a better retargeting
tool than ded, our first (ad hoc) retargeting tool: Dare is more reliable at
retargeting Android bytecode and generates verifiable Java bytecode in the
vast majority of cases.
In order to enable the analysis of retargeted Android code by other
researchers, we have made Dare available for download. Both binaries and
source code are available from the <a href="http://siis.cse.psu.edu/dare/">Dare webpage</a>.
</p>
<h3>Related Publications</h3>
<p>
Damien Octeau, Somesh Jha and Patrick McDaniel. <i>Retargeting
Android Applications to Java Bytecode.</i> 20th International Symposium
on the Foundations of Software Engineering (FSE). Cary, NC. November 2012.
<i><font color = "ff0000">Best Artifact Award</font></i>
</p>
<p>
William Enck, Damien Octeau, Patrick McDaniel and Swarat Chaudhuri.
<i>A Study of Android Application Security.</i>
Proceedings of the 20th USENIX Security Symposium. San Francisco, CA,
August 2011.
</p>
<p>
Damien Octeau, William Enck and Patrick McDaniel.
<i>The ded Decompiler.
</i>
Technical Report NAS-TR-0140-2010,
Network and Security Research Center,
Department of Computer Science and Engineering, Pennsylvania State
University, University Park, PA.
</p>
<h3>Related Tools</h3>
<p>
<a href="http://siis.cse.psu.edu/dare/">Dare</a>
</p>
<p>
<a href="http://siis.cse.psu.edu/ded/">ded</a>
</p>
<p>
<a href="http://siis.cse.psu.edu/tools/fsca_rules-final.html">Fortify SCA custom rules</a>
</p>
<hr />
<h2>Composite Constant Propagation and its Application to Program
Analysis for Security</h2>
<p>Many threats present in smartphones are the result of interactions
between application components, not just artifacts of single
components. For example, information may flow between components in an
unsafe manner. A component in an application may retrieve a user's
location data or contacts. It may subsequently send the sensitive
private information to a component in another application. The
receiving component may then leak the sensitive information to the
network, to an untrusted third party.
</p>
<p>We reduce the discovery of Inter-Component Communication (ICC) to an
instance of the Interprocedural Distributive Environment (IDE) data flow
problem. This approach is very accurate, conservatively keeping track of
multiple execution branches. It is path-sensitive, flow-sensitive,
inter-procedural and context-sensitive. Our implementation of this approach
is called Epicc (Effective and Precise ICC). It scales well, taking on
average less than two minutes per application in a large-scale study of
1,200 applications. Epicc uses Java classes as input, which can be generated
from Android bytecode using our Dare retargeting tool.
</p>
<center>
<div style="width:300px">
<img src="pics/env-transformers.jpg" alt="Environment transformers" style="max-width:100%">
</div>
</center>
<p>While Epicc is a significant improvement over state-of-the-art
approaches, it is still limited in coverage, due to the difficulty of
individually specifying data domains and transfer functions. We therefore
generalize the problem of inferring the values of objects with composite
types into composite constant propagation problems. We introduce the
COAL language to specify composite constant propagation problems and
implement a solver that automatically generates data domains and
transfer functions. Solutions are then found using existing algorithms,
requiring minimal intervention from the analyst.
</p>
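<p>As a rough illustration of what a composite constant propagation problem
looks like, the Python sketch below tracks the possible values of an object
with several fields across two branches. The field names (<i>action</i>,
<i>category</i>), the helper functions and the set-based merge are
assumptions made for this example. It is not the COAL language or its
solver.
</p>
<pre>
```python
# Hypothetical sketch: one composite constant is a frozenset of
# (field, constant) pairs, and an abstract value is the set of composite
# constants that an object may hold at a program point.

def make_value(**fields):
    """Build one composite constant from field=value pairs."""
    return frozenset(fields.items())

def set_field(values, field, constant):
    """Transfer function: overwrite `field` in every possible composite value."""
    updated = set()
    for value in values:
        kept = {(f, c) for (f, c) in value if f != field}
        updated.add(frozenset(kept | {(field, constant)}))
    return updated

def join(left, right):
    """Merge point: the object may hold any value coming from either branch."""
    return left | right

def describe(values):
    """Return a sorted, readable list of the possible composite values."""
    return sorted(sorted(value) for value in values)

# An empty intent-like object, then two branches that set different actions.
start = {make_value()}
branch_a = set_field(start, "action", "VIEW")
branch_b = set_field(start, "action", "SEND")
merged = join(branch_a, branch_b)

# A later statement sets the category on both possibilities.
result = set_field(merged, "category", "DEFAULT")
print(describe(result))
```
</pre>
<p>The sketch keeps the fields of each possible value together, so the
result records which action goes with which category. Tracking each field
as a separate constant would lose that correlation.
</p>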
<p>Using COAL, we build IC3, a tool for inferring ICC with
significantly better precision than Epicc. Unlike Epicc, it models all
ICC primitives. IC3 itself is used as the basis of inter-component
information flow analysis in the related IccTA tool. COAL was also used
with success to resolve reflection in Android applications.
</p>
<h3>Related Publications</h3>
<p>
Damien Octeau, Daniel Luchaup, Somesh Jha, and Patrick McDaniel.
<a href="pubs/octeau-tse16.pdf">Composite Constant Propagation and its
Application to Android Program Analysis</a>. <i>IEEE Transactions on
Software Engineering (TSE)</i>, vol. 42, no. 11, pp. 999-1014, November
2016.
</p>
<p>
Li Li, Tegawende F. Bissyande, Damien Octeau, and Jacques Klein.
<a href="pubs/li-issta16.pdf">DroidRA: Taming Reflection to Support
Whole-Program Analysis of Android Apps</a>. <i>Proceedings of the 25th
International Symposium on Software
Testing and Analysis (ISSTA)</i>. Saarbrucken, Germany, July 2016.
<i>Acceptance rate: 25.17%</i>.
</p>
<p>
Damien Octeau, Daniel Luchaup, Matthew Dering, Somesh Jha, and Patrick
McDaniel. <a href="pubs/octeau-icse15.pdf">Composite Constant
Propagation: Application to Android
Inter-Component Communication Analysis</a>. <i>Proceedings of the 37th
International Conference on Software Engineering (ICSE)</i>, May 2015.
Florence, Italy. <i>Acceptance rate: 18.5%</i>.
</p>
<p>
Li Li, Alexandre Bartel, Jacques Klein, Yves Le Traon, Steven Arzt,
Siegfried Rasthofer, Eric Bodden, Damien Octeau, and Patrick McDaniel.
<a href="pubs/li-icse15.pdf">I Know What leaked in Your Pocket:
Uncovering Privacy Leaks on Android
Apps with Static Taint Analysis</a>. <i>Proceedings of the 37th
International Conference on Software Engineering (ICSE)</i>, May 2015.
Florence, Italy. <i>Acceptance rate: 18.5%</i>.
</p>
<p>
Damien Octeau, Patrick McDaniel, Somesh Jha, Alexandre Bartel,
Eric Bodden, Jacques Klein, and Yves Le Traon.
<a href="pubs/octeau-sec13.pdf">Effective
Inter-Component Communication Mapping in Android with <i>Epicc</i>: An
Essential Step Towards Holistic Security Analysis</a>. <i>Proceedings of
the 22nd USENIX Security Symposium</i>, August 2013. Washington, DC.
<i>Acceptance rate: 16.2%</i>.
</p>
<h3>Related Tools</h3>
<p>
<a href="http://siis.cse.psu.edu/ic3/">IC3</a>
</p>
<p>
<a href="http://siis.cse.psu.edu/coal/">COAL</a>
</p>
<p>
<a href="http://siis.cse.psu.edu/epicc/">Epicc</a>
</p>
<hr />
<h2>Combining Static Analysis Results with Probabilistic Models</h2>
<p>Despite the many techniques devised to increase the precision of
static analysis, the precision of the results is often not high
enough for large-scale analysis. This is because the static inference
of many properties is undecidable, and that of others is too
computationally expensive. This is especially problematic with the rise of
centralized application markets, where market providers may want to verify
properties (e.g., security) across their entire corpus. In this case,
imprecise results are not acceptable.
</p>
<center>
<div style="width:300px">
<img src="pics/links.png" alt="ICC links" style="max-width:100%">
</div>
</center>
<p>We explore the use of probabilistic models in order to help sift
through large numbers of results and prioritize them by decreasing
order of likelihood. We apply this to the computation of links between
over 10,000 Android applications with our PRIMO tool. We find that
probabilistic models are an effective and accurate way to predict
which links computed with static analysis are most likely to be false
positives.
</p>
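<p>The prioritization step can be pictured with the minimal Python sketch
below. The component names, the probability values and the threshold are
invented for illustration. How PRIMO actually computes these probabilities
is not shown here.
</p>
<pre>
```python
# Hypothetical candidate links reported by a static analysis, each paired
# with an assumed probability that the link is real. The numbers are invented
# for illustration and do not come from PRIMO.
candidate_links = [
    ("app.a/Sender", "app.b/Receiver", 0.92),
    ("app.a/Sender", "app.c/Viewer", 0.08),
    ("app.d/Share", "app.b/Receiver", 0.55),
]

def prioritize(links):
    """Order links from most to least likely to be a true positive."""
    return sorted(links, key=lambda link: link[2], reverse=True)

def likely_false_positives(links, threshold=0.1):
    """Return links whose assumed probability falls below the threshold."""
    return [link for link in links if threshold > link[2]]

ranked = prioritize(candidate_links)
for source, target, probability in ranked:
    print(f"{probability:.2f}  {source} -> {target}")
```
</pre>
<p>An analyst would then review links from the top of the ranked list and
treat those near the bottom as probable false positives.
</p>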
<h3>Related Publications</h3>
<p>
Damien Octeau, Somesh Jha, Matthew Dering, Patrick McDaniel, Alexandre
Bartel, Li Li, Jacques Klein, and Yves Le Traon.
<a href="pubs/octeau-popl16.pdf">Combining Static Analysis with
Probabilistic Models to Enable Market-Scale Android
Inter-Component Analysis</a>. <i>Proceedings of the 43rd ACM
SIGPLAN-SIGACT Symposium on Principles of Programming Languages
(POPL)</i>, January 2016. St. Petersburg, Florida, USA. <i>Acceptance
rate: 23.3%</i>.
</p>
<h3>Related Tools</h3>
<p>
<a href="http://siis.cse.psu.edu/primo/">PRIMO</a>
</p>
<hr />
</td>
</div>
<br>
<div class="bott">
<table>
<tr> <td> </td> </tr>
</table>
</div>
</body>
</html>