http://www.theregister.co.uk/2007/09/14/system_call_sploits/
By Federico Biancuzzi
14th September 2007
The world of multi-core CPUs we have just entered is facing a serious
threat.
A security researcher at Cambridge disclosed a new class of
vulnerabilities that takes advantage of concurrency to bypass security
protections such as antivirus software.
The attack exploits the assumption that software interacting with the
kernel does so without interference. The researcher, Robert Watson,
showed that a carefully written exploit can strike in the brief window
when this interaction happens, and literally change the "words" that the
two sides are exchanging.
Even if some of these dark aspects of concurrency were already known,
Watson proved that real attacks can be developed, and showed that
developers have to fix their code. Fast.
Watson presented his work at WOOT07, the USENIX Workshop on Offensive
Technologies, in a paper entitled "Exploiting Concurrency
Vulnerabilities in System Call Wrappers".
During the talk he showed how concurrency can be used to bypass security
protections applied by so-called syscall wrappers.
A system call, or syscall for short, is a basic function in the kernel
that is called by a program. For example, when you open a file it's
highly probable that the software you are using called the open()
syscall to open it.
A syscall wrapper sits between the kernel and the program itself, and
analyzes which syscalls are called and their arguments. A security
wrapper might be configured to block access to some files, so if the
program in the previous example tries to open() the file "secrets.txt",
the wrapper may stop the operation and return an error to the
application.
We contacted Robert to learn more...
How does the attack work?
System call wrapping is a widely-used technique for extending kernel
security, found in anti-virus systems and security policy enhancement
frameworks such as the GSWTK, Systrace, and CerbNG systems I examine in
the paper. System call interposition allows code running in the kernel
address space to "wrap" system calls, adding new security checks,
replacing the values of arguments to virtualize name spaces, or auditing
arguments for the purposes of logging or intrusion detection. It's a
very flexible technique, and appealing to software authors because it
doesn't require changing existing kernel code, and allows control at the
very well-understood system call interface.
This attack targets a weakness in the system call wrapper architecture,
in which system call arguments are separately copied by the system call
wrapper and the kernel, allowing the attacker to "race" to replace the
argument values between copies.
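The double copy described above can be sketched as a small, deterministic
user-space simulation in C. This is an illustration under stated
assumptions, not code from the paper or from any real wrapper:
policy_allows, wrapped_open, and swap_in_forbidden_path are hypothetical
names, and the attacker's concurrent write is modelled as a callback
invoked between the two copies rather than as a real second thread or
process.

```c
#include <stdio.h>
#include <string.h>

#define PATH_MAX_LEN 64

/* Hypothetical policy: the wrapper denies the path "secrets.txt". */
int policy_allows(const char *path) {
    return strcmp(path, "secrets.txt") != 0;
}

/* Stand-in for the attacker's concurrent code. It receives the
 * user-space buffer, which must be at least PATH_MAX_LEN bytes. */
typedef void (*race_hook)(char *user_buf);

/* Simulated wrapped system call.
 * Returns 0 if the wrapper denies the call. Otherwise returns 1 and
 * stores in `opened` the path that the simulated kernel actually used. */
int wrapped_open(char *user_buf, race_hook attacker,
                 char *opened, size_t opened_len) {
    char wrapper_copy[PATH_MAX_LEN];

    /* First copy: the wrapper fetches the argument and checks it. */
    snprintf(wrapper_copy, sizeof wrapper_copy, "%s", user_buf);
    if (!policy_allows(wrapper_copy))
        return 0;

    /* Race window: user memory is still writable by the attacker. */
    if (attacker != NULL)
        attacker(user_buf);

    /* Second copy: the kernel fetches the argument again and uses
     * whatever is in user memory now, not what the wrapper checked. */
    snprintf(opened, opened_len, "%s", user_buf);
    return 1;
}

/* Hypothetical attacker action: replace the path after the check. */
void swap_in_forbidden_path(char *user_buf) {
    snprintf(user_buf, PATH_MAX_LEN, "%s", "secrets.txt");
}
```

Calling wrapped_open on a 64-byte buffer holding "public.txt" with
swap_in_forbidden_path as the hook passes the wrapper's check, yet the
simulated kernel ends up with "secrets.txt". Passing "secrets.txt"
directly is denied.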
I was able to successfully bypass security in many system call wrappers
by creating unmanaged concurrency between the attacking processes and
the wrapper/kernel. This was possible on both uniprocessor systems and
multiprocessor systems.
The existence of some of these vulnerabilities has been known for years
(Ghormley 1998, Garfinkel 2003, Watson 2003), and I approached the
authors of many of these wrapper systems as early as 2002 to report the
problems. The contribution of this paper is in analyzing the
vulnerability class, thoroughly exploring the attack space (I identify
two previously undiscussed classes of race conditions, one of which is
more broadly applicable), and exploring exploit strategies, allowing us
to reason about the effectiveness of this attack approach. It turns out
that the approach is very effective indeed.
The paper [PDF] provides both a detailed discussion of the general class
of concurrency vulnerabilities, and more concrete discussion of these
specific vulnerabilities. I'd refer readers especially to the pictures
and code in the slides [PDF] associated with the talk, which should make
both the attack approach and simplicity of the exploits clear. In less
than 20 lines of C code, and using only standard OS calls for memory
access and management, the wrapper protections were completely disabled.
What is needed to succeed?
When I started working on this project, I was sure that the
vulnerabilities could be exploited easily on multiprocessor systems, but
didn't know to what extent uniprocessor systems would be susceptible. I
was also unsure of the software requirements -- were threads required,
etc. As it turns out, the attacks are broadly applicable, working on
uniprocessor OSes without threading. The attacker needs to be able to run
code in a local process constrained by a system call wrapper, which he
(or she) will then be able to bypass with relative ease.
On multiprocessor systems, we measure the size of the race window in
cycles, and I found that the width of the race varied enormously by
wrapper system. Most of the wrapper systems I looked at were
kernel-only, so 30,000 cycles might not be an unusual length. However,
Systrace performs control in user space, leading to race conditions of
500,000 cycles or more due to context switching. In the end, the size in
cycles doesn't make much difference, as both of those numbers are very
large compared to the cost of local memory access.
On uniprocessor systems, creating concurrency between the kernel and
user space may be done using page faults, introduced where the kernel
accesses user memory that has been paged to disk due to memory pressure.
They can also be introduced through network delays or other IPC, which
cause the kernel to yield. The key is that the user process is able to
execute during critical windows between access to a system call argument
by a wrapper and the kernel -- this turns out to be quite
straightforward.
Could it be used in a remote exploit? Or does it require timing too
short or precise to work with common internet latency?
These specific attacks require the attacker to be able to control a
process on the system -- either legitimately (perhaps they have an
unprivileged user account) or less legitimately (they have exploited a
vulnerability in a service, such as Apache, BIND, MySQL, etc., to gain
execution privilege). The attacker will then be able to escape from a
sandbox placed around their user process or vulnerable service, gaining
access to the remainder of the system.
The details vary based on the intended effects of the wrapper. For one
GSWTK wrapper, I show how to bypass intrusion detection when exploiting
a vulnerable IMAP daemon, preventing alarms from firing despite
accessing files outside the expected execution profile of an IMAP
daemon. For Sysjail, I show that access control limits on what IP
address can be bound may be entirely bypassed. For Sudo monitor mode, I
am able to prevent the arguments to commands from being properly
audited.
How much does the hardware platform affect the attack?
Multiprocessor systems are marginally easier to exploit since they do
not require forcing kernel context switches via paging or other
techniques. However, I was able to successfully bypass the same wrappers
on uniprocessor systems. I did my experimental work on Intel hardware,
but the attacks should work across a range of hardware architectures and
configurations.
And what about the OS?
These attack techniques target an architectural vulnerability in the
wrapper approach, and readily apply across operating systems and
hardware platforms. I was able to use the same C language exploits
across several operating systems, including Linux, FreeBSD, NetBSD, and
OpenBSD. They should apply equally well on other operating systems.
Is it something that might affect software written in any programming
language?
The broader class of concurrency vulnerabilities is relevant to all
concurrent systems, and is something all software developers need to be
aware of. These specific races require shared memory between the two
parties (processes and kernel/system call wrapper), so vulnerable
software would necessarily involve shared memory between two mutually
untrusting processes. You might find this construction in cases where
server and client processes share memory in order to optimize
inter-process communication, such as between databases and clients or in
windowing systems.
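For code that reads from memory shared with an untrusted peer, one
common defensive pattern is to fetch each value exactly once into
private memory and then validate and use only that private copy. The
following C sketch illustrates the idea. It is not taken from the paper,
and struct shared_msg and read_msg_snapshot are hypothetical names
chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical message layout in a region that an untrusted peer can
 * write at any time. */
struct shared_msg {
    volatile size_t len;   /* length claimed by the peer */
    char data[256];        /* payload written by the peer */
};

/* Copies the message into private memory using a single fetch of `len`.
 * Returns the number of bytes copied, or -1 if the claimed length does
 * not fit. Checking shm->len and then reading shm->len again for the
 * copy would reopen the race, because the peer could change it between
 * the two reads. */
int read_msg_snapshot(const struct shared_msg *shm,
                      char *out, size_t out_cap) {
    size_t len = shm->len;                 /* the only read of len */

    if (len > out_cap || len > sizeof shm->data)
        return -1;

    memcpy(out, shm->data, len);           /* uses the validated local len */
    return (int)len;
}
```

Any further validation of the payload should then be done on out, the
private copy, never on shm->data, since the peer can still modify the
shared region after the copy.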
While richer language systems, such as scripting languages, often
introduce opacity in memory access, in practice they behave fairly
predictably and must do so to use shared memory. If languages support
shared memory, improperly written programs might well be vulnerable.
Likewise, they might well support attacks against system call wrappers
using the techniques I've described.
Robert Watson has been actively involved with FreeBSD since 1999 and
started the TrustedBSD Project in 2000, with the goal of bringing more
advanced security features to the platform. In October 2005, he
returned to academia to work on a PhD at the University of Cambridge
Computer Laboratory, after spending about six years in industry working
in commercial and government-sponsored operating system and network
security research and development.