Proceedings of the IEEE 42nd Annual Computers, Software and Applications Conference (COMPSAC '18), IEEE Computer Society (2018)
Donghong Zhang 1, Zhenyu Zhang 2, Bo Jiang 3, and T.H. Tse 4
[paper from IEEE Xplore | paper from IEEE digital library | technical report TR-2018-03]
ABSTRACT
Malicious software poses serious threats to our lives,
and detecting malware is becoming increasingly important.
An effective approach is to train a classifier using
known software samples and malware samples, and then use it to recognize
malware among new software.
To do so, a popular recent trend
is to use OpCode, which is extracted from executable modules, as
the representation of software entities that drives machine learning.
However, we found that the effectiveness of such a framework
suffers greatly from insufficient samples, a shortage caused by the
low success rate of disassembly due to the intrinsic complexity
of the problem.
In this paper, we propose to increase the success
rate of disassembly by tolerating inaccurate disassembly, with
the aim of increasing the number of successfully disassembled
samples and thus improving OpCode-driven malware detection.
We built a lightweight disassembler, D-light, based on the linear sweep
disassembly method to avoid known issues with the recursive
descent approach of IDA Pro.
We carried out experiments to
evaluate the performance, effectiveness, and other design factors
of adopting D-light and IDA Pro as disassemblers for malware detection.
The empirical study shows that D-light is both more
efficient and more effective than IDA Pro in supporting malware detection.
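
To make the idea of tolerant linear sweep disassembly concrete, the following is a minimal sketch in Python. It is not D-light's implementation: the instruction table `TOY_ISA`, the `db` placeholder for undecodable bytes, and the bigram features in `opcode_ngrams` are all hypothetical choices made for illustration. The sketch decodes bytes strictly in address order, never aborts on an unknown or truncated instruction, and turns the resulting opcode sequence into counts that a classifier could consume.

```python
from collections import Counter

# Hypothetical toy instruction set (an assumption, not a real ISA):
# first byte -> (mnemonic, total instruction length in bytes).
TOY_ISA = {
    0x01: ("mov", 3),
    0x02: ("add", 2),
    0x03: ("jmp", 2),
    0x04: ("ret", 1),
    0x90: ("nop", 1),
}


def linear_sweep(code):
    """Decode bytes in address order and return the opcode mnemonics.

    Unknown or truncated instructions are recorded as a one-byte 'db'
    placeholder so that the sweep always finishes, at the cost of
    possibly inaccurate output.
    """
    opcodes = []
    offset = 0
    while offset < len(code):
        mnemonic, length = TOY_ISA.get(code[offset], ("db", 1))
        if offset + length > len(code):
            # The instruction would run past the end of the buffer.
            mnemonic, length = "db", 1
        opcodes.append(mnemonic)
        offset += length
    return opcodes


def opcode_ngrams(opcodes, n=2):
    """Count overlapping opcode n-grams (assumed feature representation)."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


# The byte 0xFF is not in TOY_ISA, so it is kept as 'db' instead of failing.
sample = bytes([0x01, 0xAA, 0xBB, 0x02, 0x05, 0xFF, 0x03, 0x10, 0x04])
ops = linear_sweep(sample)
features = opcode_ngrams(ops)
print(ops)
print(features)
```

A recursive descent disassembler would instead follow control-flow targets such as the operand of `jmp`, which can be more accurate but may stop when a target cannot be resolved. The sweep above trades that accuracy for always producing an opcode sequence.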
Index Terms: Malware detection, OpCode, disassembly, D-light, IDA Pro, linear sweep