Java Magic. Part 4: sun.misc.Unsafe
2014-07-24 15:49
Java is a safe programming language and prevents programmers from making a lot of stupid mistakes, most of which are related to memory management. But there is a way to make such mistakes intentionally, using the Unsafe class.
This article is a quick overview of the sun.misc.Unsafe public API and a few interesting cases of its usage.
Unsafe instantiation
Before usage, we need to create an instance of the Unsafe object. There is no simple way to do it like Unsafe unsafe = new Unsafe(), because the Unsafe class has a private constructor. It also has a static getUnsafe() method, but if you naively try to call Unsafe.getUnsafe() you will probably get a SecurityException. This method is available only from trusted code.
```java
public static Unsafe getUnsafe() {
    Class cc = sun.reflect.Reflection.getCallerClass(2);
    if (cc.getClassLoader() != null)
        throw new SecurityException("Unsafe");
    return theUnsafe;
}
```
This is how Java validates whether code is trusted: it simply checks that our code was loaded by the primary classloader.
We can make our code “trusted”. Use the bootclasspath option when running your program and specify the path to the system classes plus your own class that will use Unsafe.
```
java -Xbootclasspath:/usr/jdk1.7.0/jre/lib/rt.jar:. com.mishadoff.magic.UnsafeClient
```
But it’s too hard.
The Unsafe class contains its instance called theUnsafe, which is marked as private. We can steal that variable via Java reflection.
```java
Field f = Unsafe.class.getDeclaredField("theUnsafe");
f.setAccessible(true);
Unsafe unsafe = (Unsafe) f.get(null);
```
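The later snippets in this article call a getUnsafe() helper that is never spelled out. A minimal sketch of such a helper, built on the same reflection trick, could look like this. The class name UnsafeAccess is an assumption made for illustration, not something from the original code.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical holder class; the name UnsafeAccess is not from the article.
final class UnsafeAccess {
    // Looked up once and cached, since the reflective lookup is not free.
    private static final Unsafe UNSAFE = load();

    private UnsafeAccess() {}

    private static Unsafe load() {
        try {
            // Read the private static field that holds the singleton instance.
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    static Unsafe getUnsafe() {
        return UNSAFE;
    }
}
```

Wrapping the checked reflection exceptions in an unchecked one keeps call sites short, which is presumably why the article's snippets can call getUnsafe() without any try/catch.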
Note: Ignore your IDE. For example, Eclipse shows the error “Access restriction…”, but if you run the code, everything works just fine. If the error is annoying, downgrade errors on Unsafe usage to warnings in:
Preferences -> Java -> Compiler -> Errors/Warnings -> Deprecated and restricted API -> Forbidden reference -> Warning
Unsafe API
Class sun.misc.Unsafe consists of 105 methods. There are, actually, a few groups of important methods for manipulating various entities. Here are some of them:
- Info. Just returns some low-level memory information.
  - addressSize
  - pageSize
- Objects. Provides methods for manipulating objects and their fields.
  - allocateInstance
  - objectFieldOffset
- Classes. Provides methods for manipulating classes and static fields.
  - staticFieldOffset
  - defineClass
  - defineAnonymousClass
  - ensureClassInitialized
- Arrays. Array manipulation.
  - arrayBaseOffset
  - arrayIndexScale
- Synchronization. Low-level primitives for synchronization.
  - monitorEnter
  - tryMonitorEnter
  - monitorExit
  - compareAndSwapInt
  - putOrderedInt
- Memory. Direct memory access methods.
  - allocateMemory
  - copyMemory
  - freeMemory
  - getAddress
  - getInt
  - putInt
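To get a feel for the Info and Arrays groups, here is a small sketch that prints a few of these values. The class name UnsafeInfoDemo is made up for this example, and the printed numbers depend on the JVM and platform, so no output is shown.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, not part of the original article.
final class UnsafeInfoDemo {
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Unsafe u = unsafe();
        // Size of a native pointer in bytes (4 or 8).
        System.out.println("addressSize: " + u.addressSize());
        // Size of a native memory page in bytes.
        System.out.println("pageSize: " + u.pageSize());
        // Offset of the first element inside a long[] object.
        System.out.println("long[] base offset: " + u.arrayBaseOffset(long[].class));
        // Distance in bytes between two neighbouring long[] elements.
        System.out.println("long[] index scale: " + u.arrayIndexScale(long[].class));
    }
}
```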
Interesting use cases
Avoid initialization
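The allocateInstance method can be useful when you need to skip the object initialization phase, bypass security checks in a constructor, or want an instance of a class that has no public constructor. Consider the following class:
```java
class A {
    private long a; // not initialized value

    public A() {
        this.a = 1; // initialization
    }

    public long a() { return this.a; }
}
```
Instantiating it using the constructor, reflection and Unsafe gives different results.
```java
A o1 = new A(); // constructor
o1.a(); // returns 1

A o2 = A.class.newInstance(); // reflection
o2.a(); // returns 1

A o3 = (A) unsafe.allocateInstance(A.class); // unsafe
o3.a(); // returns 0
```
Just think what happens to all your Singletons.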
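To make the Singleton remark concrete, here is a sketch of a textbook eager singleton losing its uniqueness. The Registry class and its fields are invented for this illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo; Registry is an invented example singleton.
final class SingletonBreaker {
    static final class Registry {
        static int constructorCalls;                      // counts constructor runs
        static final Registry INSTANCE = new Registry();  // the "only" instance
        final String name;

        private Registry() {
            constructorCalls++;
            name = "the one and only";
        }
    }

    // Creates a second Registry without ever running its private constructor.
    static Registry secondInstance() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe unsafe = (Unsafe) f.get(null);
            return (Registry) unsafe.allocateInstance(Registry.class);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Registry first = Registry.INSTANCE;
        Registry second = secondInstance();
        System.out.println(first == second);            // false: two instances exist
        System.out.println(second.name);                // null: the constructor never ran
        System.out.println(Registry.constructorCalls);  // 1
    }
}
```

Note that even the final field stays at its default value, because the memory is allocated but no constructor code is executed.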
Memory corruption
This one is familiar to every C programmer. By the way, it is a common technique for security bypass.
Consider a simple class that checks access rules:
```java
class Guard {
    private int ACCESS_ALLOWED = 1;

    public boolean giveAccess() {
        return 42 == ACCESS_ALLOWED;
    }
}
```
The client code is very secure and calls giveAccess() to check access rules. Unfortunately for clients, it always returns false. Only privileged users can somehow change the value of the ACCESS_ALLOWED constant and get access.
In fact, that’s not true. Here is the code that demonstrates it:
```java
Guard guard = new Guard();
guard.giveAccess();   // false, no access

// bypass
Unsafe unsafe = getUnsafe();
Field f = guard.getClass().getDeclaredField("ACCESS_ALLOWED");
unsafe.putInt(guard, unsafe.objectFieldOffset(f), 42); // memory corruption

guard.giveAccess(); // true, access granted
```
Now all clients will get unlimited access.
Actually, the same functionality can be achieved by reflection. What is interesting, though, is that we can modify any object, even one that we do not have a reference to.
For example, there is another Guard object in memory located next to the current guard object. We can modify its ACCESS_ALLOWED field with the following code:
```java
unsafe.putInt(guard, 16 + unsafe.objectFieldOffset(f), 42); // memory corruption
```
Note that we didn’t use any reference to this object. 16 is the size of a Guard object on a 32-bit architecture. We can calculate it manually or use the sizeOf method, which is defined… right now.
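For comparison, the reflection route mentioned above can be sketched as follows. Guard is repeated inside a hypothetical ReflectionBypass class so that the example is self-contained. Unlike the Unsafe version, this only works when we hold a reference to the object.

```java
import java.lang.reflect.Field;

// Hypothetical wrapper class for the demo; Guard mirrors the article's class.
final class ReflectionBypass {
    static final class Guard {
        private int ACCESS_ALLOWED = 1;

        public boolean giveAccess() {
            return 42 == ACCESS_ALLOWED;
        }
    }

    // Overwrites the private field through java.lang.reflect instead of Unsafe.
    static void grantAccess(Guard guard) {
        try {
            Field f = Guard.class.getDeclaredField("ACCESS_ALLOWED");
            f.setAccessible(true);
            f.setInt(guard, 42);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Guard guard = new Guard();
        System.out.println(guard.giveAccess()); // false
        grantAccess(guard);
        System.out.println(guard.giveAccess()); // true
    }
}
```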
sizeOf
Using the objectFieldOffset method we can implement a C-style sizeof function. This implementation returns the shallow size of an object:
```java
public static long sizeOf(Object o) {
    Unsafe u = getUnsafe();
    HashSet<Field> fields = new HashSet<Field>();
    Class c = o.getClass();
    while (c != Object.class) {
        for (Field f : c.getDeclaredFields()) {
            if ((f.getModifiers() & Modifier.STATIC) == 0) {
                fields.add(f);
            }
        }
        c = c.getSuperclass();
    }

    // get offset
    long maxSize = 0;
    for (Field f : fields) {
        long offset = u.objectFieldOffset(f);
        if (offset > maxSize) {
            maxSize = offset;
        }
    }

    return ((maxSize/8) + 1) * 8;   // padding
}
```
The algorithm is the following: go through all non-static fields, including those of all superclasses, get the offset of each field, find the maximum and add padding. Probably I missed something, but the idea is clear.
A much simpler sizeOf can be achieved if we just read the size value from the class struct for this object, which is located at offset 12 in JVM 1.7 32 bit.
```java
public static long sizeOf(Object object){
    return getUnsafe().getAddress(
        normalize(getUnsafe().getInt(object, 4L)) + 12L);
}
```
normalize is a method for casting a signed int to an unsigned long, for correct address usage.
```java
private static long normalize(int value) {
    if(value >= 0) return value;
    return (~0L >>> 32) & value;
}
```
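A quick worked example shows why normalize is needed: a 32-bit address with the top bit set is negative as a Java int, and a plain widening to long would drag the sign bits along. The NormalizeDemo wrapper class is made up for this sketch; Integer.toUnsignedLong is the standard library equivalent available since Java 8.

```java
// Hypothetical demo class; normalize() has the same logic as in the article.
final class NormalizeDemo {
    static long normalize(int value) {
        if (value >= 0) return value;
        // Mask with 0x00000000FFFFFFFF to drop the sign extension.
        return (~0L >>> 32) & value;
    }

    public static void main(String[] args) {
        int raw = 0xF0000010; // top bit set, so negative as a signed int
        System.out.println(Long.toHexString((long) raw));       // fffffffff0000010
        System.out.println(Long.toHexString(normalize(raw)));   // f0000010
        System.out.println(normalize(raw) == Integer.toUnsignedLong(raw)); // true
    }
}
```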
Awesome, this method returns the same result as our previous sizeof function.
In fact, for a good, safe and accurate sizeof function it is better to use the java.lang.instrument package, but it requires specifying the agent option in your JVM.
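A minimal sketch of the java.lang.instrument route, assuming an agent class named SizeOfAgent (a name invented here). The JVM calls premain before main when started with -javaagent pointing at a jar whose manifest contains a Premain-Class entry for this class. When packaging a real agent, the class would normally be public and live in its own file.

```java
import java.lang.instrument.Instrumentation;

// Hypothetical agent class; declare it public in its own file for real use.
final class SizeOfAgent {
    private static volatile Instrumentation instrumentation;

    // Invoked by the JVM before main() when the agent jar is on -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        instrumentation = inst;
    }

    // Shallow size of the object as reported by the JVM itself.
    static long sizeOf(Object o) {
        Instrumentation inst = instrumentation;
        if (inst == null) {
            throw new IllegalStateException(
                "agent not loaded; start the JVM with the -javaagent option");
        }
        return inst.getObjectSize(o);
    }
}
```

The value returned by getObjectSize is an implementation-specific approximation, but it needs no knowledge of header layouts or field offsets.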
Shallow copy
Having an implementation for calculating the shallow object size, we can simply add a function that copies objects. The standard solution requires modifying your code with Cloneable, or you can implement a custom copy function in your object, but it won’t be a multipurpose function.
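For reference, the standard Cloneable solution looks roughly like this. Point is an invented example class; the point is that every class that wants to be copied has to opt in this way.

```java
// Hypothetical demo of the standard Cloneable-based shallow copy.
final class CloneDemo {
    static final class Point implements Cloneable {
        int x;
        int[] tags;

        Point(int x, int[] tags) {
            this.x = x;
            this.tags = tags;
        }

        @Override
        public Point clone() {
            try {
                // Object.clone() copies fields one by one: a shallow copy.
                return (Point) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen, we implement Cloneable
            }
        }
    }

    public static void main(String[] args) {
        Point original = new Point(7, new int[] {1, 2});
        Point copy = original.clone();
        System.out.println(copy != original);           // true: a new object
        System.out.println(copy.tags == original.tags); // true: the array is shared
    }
}
```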
Shallow copy:
```java
static Object shallowCopy(Object obj) {
    long size = sizeOf(obj);
    long start = toAddress(obj);
    long address = getUnsafe().allocateMemory(size);
    getUnsafe().copyMemory(start, address, size);
    return fromAddress(address);
}
```
toAddress and fromAddress convert an object to its address in memory and vice versa.
```java
static long toAddress(Object obj) {
    Object[] array = new Object[] {obj};
    long baseOffset = getUnsafe().arrayBaseOffset(Object[].class);
    return normalize(getUnsafe().getInt(array, baseOffset));
}

static Object fromAddress(long address) {
    Object[] array = new Object[] {null};
    long baseOffset = getUnsafe().arrayBaseOffset(Object[].class);
    getUnsafe().putLong(array, baseOffset, address);
    return array[0];
}
```
This copy function can be used to copy an object of any type; its size will be calculated dynamically. Note that after copying you need to cast the object to a specific type.
Hide Password
One more interesting usage of direct memory access in Unsafe is removing unwanted objects from memory.
Most of the APIs for retrieving a user’s password have a signature returning byte[] or char[]. Why arrays?
It is purely for security reasons, because we can nullify array elements once we no longer need them. If we retrieve the password as a String, it is stored as an object in memory, and nullifying that string just performs a dereference operation. The object stays in memory until the GC decides to perform cleanup.
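The array-based approach can be sketched in a few lines of plain Java. The authenticate method and its length check are placeholders invented for this example; a real application would compare against a stored hash.

```java
import java.util.Arrays;

// Hypothetical demo of wiping a char[] password after use.
final class PasswordWipe {
    // Placeholder check; only the wiping in the finally block matters here.
    static boolean authenticate(char[] password) {
        try {
            return password.length >= 8;
        } finally {
            // Overwrite the characters so they no longer sit in memory.
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = "l00k@myHor$e".toCharArray();
        System.out.println(authenticate(password)); // true
        System.out.println(password[0] == '\0');    // true: contents are gone
    }
}
```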
This trick creates a fake String object with the same size and replaces the original one in memory:
```java
String password = new String("l00k@myHor$e");
String fake = new String(password.replaceAll(".", "?"));
System.out.println(password); // l00k@myHor$e
System.out.println(fake); // ????????????

getUnsafe().copyMemory(
          fake, 0L, null, toAddress(password), sizeOf(password));

System.out.println(password); // ????????????
System.out.println(fake); // ????????????
```
Feel safe.
UPDATE: That way is not really safe. For real safety we need to nullify the backing char array via reflection:
```java
Field stringValue = String.class.getDeclaredField("value");
stringValue.setAccessible(true);
char[] mem = (char[]) stringValue.get(password);
for (int i=0; i < mem.length; i++) {
  mem[i] = '?';
}
```
Thanks to Peter Verhas for pointing that out.
Multiple Inheritance
There is no multiple inheritance in Java.
Correct, except we can cast every type to any other one, if we want.
```java
long intClassAddress = normalize(getUnsafe().getInt(new Integer(0), 4L));
long strClassAddress = normalize(getUnsafe().getInt("", 4L));
getUnsafe().putAddress(intClassAddress + 36, strClassAddress);
```
This snippet adds the String class to the Integer superclasses, so we can cast without a runtime exception.
```java
(String) (Object) (new Integer(666))
```
One problem is that we must pre-cast to Object, to cheat the compiler.
Dynamic classes
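We can create classes at runtime, for example from a compiled .class file. To do that, read the class contents into a byte array and pass it properly to the defineClass method.
```java
byte[] classContents = getClassContent();
Class c = getUnsafe().defineClass(
              null, classContents, 0, classContents.length);
```
And reading from the file is defined as:
```java
private static byte[] getClassContent() throws Exception {
    File f = new File("/home/mishadoff/tmp/A.class");
    FileInputStream input = new FileInputStream(f);
    byte[] content = new byte[(int)f.length()];
    input.read(content);
    input.close();
    return content;
}
```
This can be useful when you must create classes dynamically, such as proxies or aspects for existing code.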
Throw an Exception
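Don’t like checked exceptions? Not a problem.
```java
getUnsafe().throwException(new IOException());
```
This method throws a checked exception, but your code is not forced to catch or rethrow it. Just like a runtime exception.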
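Here is a self-contained sketch of what that means in practice: a method with no throws clause lets an IOException escape. The class and method names (SneakyThrowDemo, readConfig) are invented for this illustration.

```java
import java.io.IOException;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class, not part of the original article.
final class SneakyThrowDemo {
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // No "throws IOException" here, yet an IOException leaves this method.
    static void readConfig() {
        unsafe().throwException(new IOException("disk on fire"));
    }

    public static void main(String[] args) {
        try {
            readConfig();
        } catch (Exception e) {
            // Catching IOException directly would not compile, because the
            // compiler believes the try block cannot throw it.
            System.out.println(e instanceof IOException); // true
        }
    }
}
```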
Fast Serialization
This one is more practical.
Everyone knows that the standard Java Serializable capability to perform serialization is very slow. It also requires an accessible no-argument constructor in the nearest non-serializable superclass.
Externalizable is better, but it needs a schema to be defined for the class to be serialized.
Popular high-performance libraries like kryo have dependencies, which can be unacceptable with low-memory requirements.
But a full serialization cycle can be easily achieved with the Unsafe class.
Serialization:
1. Build a schema for the object using reflection. It can be done once per class.
2. Use the Unsafe methods getLong, getInt, getObject, etc. to retrieve the actual field values.
3. Add a class identifier to be able to restore this object.
4. Write them to a file or any output.
You can also add compression to save space.
Deserialization:
1. Create an instance of the serialized class. allocateInstance helps, because it does not require any constructor.
2. Build the schema. The same as step 1 in serialization.
3. Read all fields from the file or any input.
4. Use the Unsafe methods putLong, putInt, putObject, etc. to fill the object.
Actually, there are many more details in a correct implementation, but the intuition is clear.
This serialization will be really fast.
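The steps above can be sketched as follows. This is a deliberately reduced illustration, not a complete serializer: it handles only int and long instance fields of a single class, sorts them by name to get a stable schema, and leaves out the class identifier step. All names (UnsafeSerializer, Account) are assumptions made for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import sun.misc.Unsafe;

// Hypothetical sketch: int/long fields only, no class identifier, no superclasses.
final class UnsafeSerializer {
    private static final Unsafe U = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Schema: the int/long instance fields, sorted by name for a stable order.
    static List<Field> schema(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            boolean supported = f.getType() == int.class || f.getType() == long.class;
            if (supported && !Modifier.isStatic(f.getModifiers())) fields.add(f);
        }
        fields.sort(Comparator.comparing(Field::getName));
        return fields;
    }

    // Reads raw field values through Unsafe and writes them to a byte array.
    static byte[] serialize(Object o) {
        List<Field> fields = schema(o.getClass());
        ByteBuffer out = ByteBuffer.allocate(fields.size() * Long.BYTES);
        for (Field f : fields) {
            long offset = U.objectFieldOffset(f);
            if (f.getType() == int.class) out.putInt(U.getInt(o, offset));
            else out.putLong(U.getLong(o, offset));
        }
        out.flip();
        byte[] bytes = new byte[out.remaining()];
        out.get(bytes);
        return bytes;
    }

    // Allocates the object without a constructor and fills the fields back in.
    static <T> T deserialize(Class<T> type, byte[] bytes) {
        try {
            T o = type.cast(U.allocateInstance(type));
            ByteBuffer in = ByteBuffer.wrap(bytes);
            for (Field f : schema(type)) {
                long offset = U.objectFieldOffset(f);
                if (f.getType() == int.class) U.putInt(o, offset, in.getInt());
                else U.putLong(o, offset, in.getLong());
            }
            return o;
        } catch (InstantiationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Invented example class with no no-argument constructor.
    static final class Account {
        final long id;
        int balance;

        Account(long id, int balance) {
            this.id = id;
            this.balance = balance;
        }
    }
}
```

A round trip would be `Account copy = UnsafeSerializer.deserialize(Account.class, UnsafeSerializer.serialize(account));`. Note that Account has no no-argument constructor and a final field, and neither gets in the way.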
By the way, there are some attempts in kryo to use Unsafe: http://code.google.com/p/kryo/issues/detail?id=75
Big Arrays
As you know, the Integer.MAX_VALUE constant is the maximum size of a Java array. Using direct memory allocation we can create arrays whose size is limited only by the available memory.
Here is the SuperArray implementation:
```java
class SuperArray {
    private final static int BYTE = 1;

    private long size;
    private long address;

    public SuperArray(long size) {
        this.size = size;
        address = getUnsafe().allocateMemory(size * BYTE);
    }

    public void set(long i, byte value) {
        getUnsafe().putByte(address + i * BYTE, value);
    }

    public int get(long idx) {
        return getUnsafe().getByte(address + idx * BYTE);
    }

    public long size() {
        return size;
    }
}
```
And sample usage:
```java
long SUPER_SIZE = (long)Integer.MAX_VALUE * 2;
SuperArray array = new SuperArray(SUPER_SIZE);
System.out.println("Array size:" + array.size()); // 4294967294
int sum = 0;
for (int i = 0; i < 100; i++) {
    array.set((long)Integer.MAX_VALUE + i, (byte)3);
    sum += array.get((long)Integer.MAX_VALUE + i);
}
System.out.println("Sum of 100 elements:" + sum);  // 300
```
In fact, this technique uses off-heap memory and is partially available in the java.nio package.
Memory allocated this way is not located in the heap and is not under GC management, so take care of it using Unsafe.freeMemory(). It also does not perform any boundary checks, so any illegal access may cause a JVM crash.
It can be useful for math computations, where code operates on large arrays of data. It can also be interesting for realtime programmers, for whom GC delays on large arrays can break the limits.
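The java.nio counterpart mentioned above is the direct ByteBuffer. A small sketch follows; DirectBufferDemo is an invented name. It shows the trade-off: the memory is off-heap as well, but accesses are bounds-checked and the capacity is an int, so it does not lift the Integer.MAX_VALUE limit the way SuperArray does.

```java
import java.nio.ByteBuffer;

// Hypothetical demo of off-heap memory through the public java.nio API.
final class DirectBufferDemo {
    static int sumOfThrees(int count) {
        // Allocated outside the Java heap; released when the buffer is collected.
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);
        int sum = 0;
        for (int i = 0; i < count; i++) {
            buf.put(i, (byte) 3);   // absolute, bounds-checked write
            sum += buf.get(i);      // absolute, bounds-checked read
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfThrees(100)); // 300
        try {
            ByteBuffer.allocateDirect(16).put(16, (byte) 1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out-of-range access is rejected instead of crashing the JVM");
        }
    }
}
```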
Concurrency
And a few words about concurrency with Unsafe. The compareAndSwap methods are atomic and can be used to implement high-performance lock-free data structures.
For example, consider the problem of incrementing a value in a shared object using a lot of threads.
First we define a simple interface Counter:
```java
interface Counter {
    void increment();
    long getCounter();
}
```
Then we define a worker thread CounterClient, that uses Counter:
```java
class CounterClient implements Runnable {
    private Counter c;
    private int num;

    public CounterClient(Counter c, int num) {
        this.c = c;
        this.num = num;
    }

    @Override
    public void run() {
        for (int i = 0; i < num; i++) {
            c.increment();
        }
    }
}
```
And this is the testing code:
```java
int NUM_OF_THREADS = 1000;
int NUM_OF_INCREMENTS = 100000;
ExecutorService service = Executors.newFixedThreadPool(NUM_OF_THREADS);
Counter counter = ... // creating instance of specific counter
long before = System.currentTimeMillis();
for (int i = 0; i < NUM_OF_THREADS; i++) {
    service.submit(new CounterClient(counter, NUM_OF_INCREMENTS));
}
service.shutdown();
service.awaitTermination(1, TimeUnit.MINUTES);
long after = System.currentTimeMillis();
System.out.println("Counter result: " + counter.getCounter());
System.out.println("Time passed in ms:" + (after - before));
```
The first implementation is a non-synchronized counter:
```java
class StupidCounter implements Counter {
    private long counter = 0;

    @Override
    public void increment() {
        counter++;
    }

    @Override
    public long getCounter() {
        return counter;
    }
}
```
Output:
```
Counter result: 99542945
Time passed in ms: 679
```
It works fast, but there is no thread management at all, so the result is inaccurate. Second attempt: add the easiest Java-way synchronization:
```java
class SyncCounter implements Counter {
    private long counter = 0;

    @Override
    public synchronized void increment() {
        counter++;
    }

    @Override
    public long getCounter() {
        return counter;
    }
}
```
Output:
```
Counter result: 100000000
Time passed in ms: 10136
```
Radical synchronization always works. But the timings are awful. Let’s try ReentrantReadWriteLock:
```java
class LockCounter implements Counter {
    private long counter = 0;
    private WriteLock lock = new ReentrantReadWriteLock().writeLock();

    @Override
    public void increment() {
        lock.lock();
        counter++;
        lock.unlock();
    }

    @Override
    public long getCounter() {
        return counter;
    }
}
```
Output:
```
Counter result: 100000000
Time passed in ms: 8065
```
Still correct, and the timings are better. What about atomics?
```java
class AtomicCounter implements Counter {
    AtomicLong counter = new AtomicLong(0);

    @Override
    public void increment() {
        counter.incrementAndGet();
    }

    @Override
    public long getCounter() {
        return counter.get();
    }
}
```
Output:
```
Counter result: 100000000
Time passed in ms: 6552
```
AtomicCounter is even better. Finally, try the Unsafe primitive compareAndSwapLong to see if it is really a privilege to use it.
```java
class CASCounter implements Counter {
    private volatile long counter = 0;
    private Unsafe unsafe;
    private long offset;

    public CASCounter() throws Exception {
        unsafe = getUnsafe();
        offset = unsafe.objectFieldOffset(CASCounter.class.getDeclaredField("counter"));
    }

    @Override
    public void increment() {
        long before = counter;
        while (!unsafe.compareAndSwapLong(this, offset, before, before + 1)) {
            before = counter;
        }
    }

    @Override
    public long getCounter() {
        return counter;
    }
}
```
Output:
```
Counter result: 100000000
Time passed in ms: 6454
```
Hmm, seems equal to atomics. Maybe atomics use Unsafe? (YES)
In fact this example is easy enough, but it shows some of the power of Unsafe.
As I said, the CAS primitive can be used to implement lock-free data structures. The intuition behind this is simple:
- Have some state
- Create a copy of it
- Modify it
- Perform CAS
- Repeat if it fails
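The loop described above can be sketched with the public AtomicReference class instead of raw Unsafe calls. The example is a simple lock-free stack; the class name LockFreeStack is an assumption made for this illustration, and the code is a teaching sketch rather than a production structure.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical lock-free stack built on compareAndSet.
final class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    // The shared state: a reference to the current top of the stack.
    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    void push(T value) {
        while (true) {
            Node<T> current = head.get();                  // read the state
            Node<T> updated = new Node<>(value, current);  // build the modified version
            if (head.compareAndSet(current, updated)) {    // perform CAS
                return;
            }
            // Another thread changed head in the meantime: repeat.
        }
    }

    // Returns the top value, or null when the stack is empty.
    T pop() {
        while (true) {
            Node<T> current = head.get();
            if (current == null) {
                return null;
            }
            if (head.compareAndSet(current, current.next)) {
                return current.value;
            }
        }
    }
}
```

Because nodes are immutable and never reused, the garbage collector keeps this particular sketch clear of the ABA problem; structures that recycle nodes do not get that for free.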
Actually, in reality it is harder than you can imagine. There are a lot of problems like the ABA problem, instruction reordering, etc.
If you are really interested, you can refer to the awesome presentation about the lock-free HashMap.
UPDATE: Added the volatile keyword to the counter variable to avoid the risk of an infinite loop. Kudos to Nitsan Wakart.
Bonus
The documentation for the park method from the Unsafe class contains the longest English sentence I’ve ever seen:
Block current thread, returning when a balancing unpark occurs, or a balancing unpark has already occurred, or the thread is interrupted, or, if not absolute and time is not zero, the given time nanoseconds have elapsed, or if absolute, the given deadline in milliseconds since Epoch has passed, or spuriously (i.e., returning for no “reason”). Note: This operation is in the Unsafe class only because unpark is, so it would be strange to place it elsewhere.
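Two of the return conditions from that sentence are easy to see through java.util.concurrent.locks.LockSupport, the public API that exposes park and unpark. This is a small sketch with invented names (ParkDemo and its methods); timings are measured rather than asserted, since park may also return spuriously.

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical demo of two park() return conditions.
final class ParkDemo {
    // "A balancing unpark has already occurred": park returns right away.
    static long parkAfterUnpark() {
        LockSupport.unpark(Thread.currentThread()); // hand ourselves a permit
        long start = System.nanoTime();
        LockSupport.park();                         // consumes the permit, no blocking
        return System.nanoTime() - start;
    }

    // "The given time nanoseconds have elapsed": park returns after a timeout
    // (or earlier, if it wakes up spuriously).
    static long parkWithTimeout(long nanos) {
        long start = System.nanoTime();
        LockSupport.parkNanos(nanos);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("park after unpark took ns: " + parkAfterUnpark());
        System.out.println("park with 1 ms timeout took ns: " + parkWithTimeout(1_000_000L));
    }
}
```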
Conclusion
Although Unsafe has a bunch of useful applications, never use it.
Source: http://mishadoff.github.io/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/