What are the security implications of deserializing untrusted data in Java?
Is it safe to deserialize untrusted data, provided my code makes no assumptions about the state or class of the deserialized object, or can the mere act of deserializing cause undesired operations?
(Threat model: the attacker can freely modify the serialized data, but that is all he can do.)
Solution
Deserialization itself can already be unsafe. A serializable class may define a readObject method (see the specification), which is called when an object of this class is deserialized from the stream. The attacker cannot provide this code, but with crafted input they can cause any such readObject method that is on the classpath to be invoked, with any input.
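To make this concrete, here is a minimal sketch of the effect. The class `Gadget` and its flag are hypothetical stand-ins for any Serializable class that happens to be on the classpath; the caller expects a String and makes no other assumptions about the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class ReadObjectDemo {
    // Hypothetical class: stands in for any Serializable class on the classpath.
    static class Gadget implements Serializable {
        private static final long serialVersionUID = 1L;
        static boolean sideEffectRan = false;

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            sideEffectRan = true; // placeholder for whatever a real readObject does
        }
    }

    // Serializes an object; here it plays the role of the attacker-chosen bytes.
    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The receiving code: reads one object and accepts it only if it is a String.
    static boolean readExpectingString(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            Object o = in.readObject();   // Gadget.readObject runs inside this call
            String s = (String) o;        // the type check happens only afterwards
            return s != null;
        } catch (ClassCastException e) {
            return false;                 // rejected, but too late
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] attackerChosen = serialize(new Gadget());
        boolean accepted = readExpectingString(attackerChosen);
        System.out.println("accepted=" + accepted
                + ", sideEffectRan=" + Gadget.sideEffectRan);
    }
}
```

The cast rejects the object, but only after `Gadget.readObject` has already executed, which is why making no assumptions about the result does not help.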
Code injection
A readObject implementation can open the door to arbitrary bytecode injection: it only has to read a byte array from the stream and pass it to ClassLoader.defineClass and ClassLoader.resolveClass() (see the Javadoc for the former and the latter). I don't know what the use of such an implementation would be, but it is possible.
Memory exhaustion
Writing a secure readObject method is hard. Until fairly recently, the readObject method of HashMap contained the following lines:
    int numBuckets = s.readInt();
    table = new Entry[numBuckets];
With these, an attacker can allocate several gigabytes of memory using just a few dozen bytes of serialized data, which can bring the system down with an OutOfMemoryError in no time.
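For comparison, a readObject that validates a stream-supplied size before allocating might look like the sketch below. The class `BoundedBuckets` and the cap `MAX_BUCKETS` are hypothetical, and the right limit depends on the application. Note that this only hardens your own class; it does nothing about other readObject methods on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class BoundedBuckets implements Serializable {
    private static final long serialVersionUID = 1L;

    // Assumed application-specific upper bound, not a JDK constant.
    static final int MAX_BUCKETS = 1 << 16;

    private transient int claimedBuckets; // the size written to the stream
    private transient Object[] table;

    BoundedBuckets(int claimedBuckets) {
        this.claimedBuckets = claimedBuckets;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(claimedBuckets); // an attacker can put any int here
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int numBuckets = in.readInt();
        // Reject the value before any allocation depends on it.
        if (numBuckets < 0 || numBuckets > MAX_BUCKETS) {
            throw new InvalidObjectException(
                    "Unreasonable bucket count: " + numBuckets);
        }
        claimedBuckets = numBuckets;
        table = new Object[numBuckets];
    }

    // Serializes an instance claiming the given size and tries to read it back.
    // Returns false if readObject rejected the stream.
    static boolean survivesRoundTrip(int claimedBuckets) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new BoundedBuckets(claimedBuckets));
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("16 buckets accepted: " + survivesRoundTrip(16));
        System.out.println("Integer.MAX_VALUE buckets accepted: "
                + survivesRoundTrip(Integer.MAX_VALUE));
    }
}
```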
The current implementation of Hashtable still seems to be vulnerable to a similar attack: it computes the size of the allocated array from the number of elements and the load factor, but there is no guard against unreasonable values of loadFactor, so an attacker can easily request a billion slots for each element in the table.
Excessive CPU load
The vulnerability in HashMap was fixed as part of changes that addressed another security issue related to hash-based maps. CVE-2012-2739 describes a denial-of-service attack based on CPU consumption, carried out by creating a HashMap with very many colliding keys (i.e. distinct keys with the same hash value). The documented attacks are based on query parameters in URLs or keys in HTTP POST data, but deserialization of a HashMap is vulnerable to the same attack.
The safeguards that were put into HashMap to prevent this type of attack focus on maps with String keys. This is adequate against the HTTP-based attacks, but it is easily circumvented with deserialization, for example by wrapping each String in an ArrayList (whose hashCode is also predictable). Java 8 includes a proposal (JEP 180) to further improve the behavior of HashMap in the face of many collisions, which extends the protection to all key types that implement Comparable, but that still allows an attack based on ArrayList keys.
As a result, an attacker can craft a byte stream such that the CPU effort required to deserialize an object from it grows quadratically with the size of the stream.
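How easy it is to produce such keys can be seen from a small sketch. It relies on the fact that "Aa" and "BB" have the same String.hashCode(), so every concatenation of the same number of these blocks collides, and wrapping each string in a single-element ArrayList preserves the collision. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CollidingKeys {
    // Builds 2^blocks distinct strings that all share one hash code,
    // by concatenating the equal-hash blocks "Aa" and "BB".
    static List<String> collidingStrings(int blocks) {
        List<String> result = new ArrayList<>();
        result.add("");
        for (int i = 0; i < blocks; i++) {
            List<String> next = new ArrayList<>(result.size() * 2);
            for (String prefix : result) {
                next.add(prefix + "Aa");
                next.add(prefix + "BB");
            }
            result = next;
        }
        return result;
    }

    // Wraps each colliding string in a single-element ArrayList.
    // List.hashCode() is derived from the element hash codes, so the
    // wrapped keys collide as well, and ArrayList is not Comparable.
    static List<ArrayList<String>> wrappedKeys(int blocks) {
        List<ArrayList<String>> keys = new ArrayList<>();
        for (String s : collidingStrings(blocks)) {
            ArrayList<String> key = new ArrayList<>(1);
            key.add(s);
            keys.add(key);
        }
        return keys;
    }

    // Counts how many different hash codes occur among the given keys.
    static int distinctHashCodes(List<?> keys) {
        Set<Integer> hashes = new HashSet<>();
        for (Object k : keys) {
            hashes.add(k.hashCode());
        }
        return hashes.size();
    }

    public static void main(String[] args) {
        List<ArrayList<String>> keys = wrappedKeys(10); // 2^10 = 1024 keys
        System.out.println(keys.size() + " keys, "
                + distinctHashCodes(keys) + " distinct hash code(s)");
    }
}
```

Running this reports 1024 keys sharing a single hash code. A serialized HashMap containing such keys forces the deserializing side to insert all of them into the same bucket.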
Summary
By controlling the input to the deserialization process, an attacker can trigger the invocation of any readObject deserialization method. In theory, such a method could allow bytecode injection. In practice, it is certainly possible to easily exhaust memory or CPU resources this way, resulting in denial-of-service attacks. Auditing your system against such vulnerabilities is very difficult: you would have to check every implementation of readObject, including those in third-party libraries and the runtime library.