Arbitrary data deserialization in Ruby
Deserializing untrusted data is usually dangerous and can lead to serious consequences like remote code execution. Remember not to deserialize user-input data unless you really know what you are doing.
Ruby provides several ways of serializing/deserializing an object in ruby-core
and ruby-stdlib. The most commonly used are Marshal
and YAML
(json
also
has the ability of processing objects in ruby but this will be another story)
The story starts from CVE-2013-0156
in Ruby on Rails, where YAML.load
could be called with arbitrary data when the
framework processes an HTTP request. There’re already some great
write-ups
on this issue so I’m not going to repeat the analysis of the payload.
Generally speaking, the exploits are based on a similar way:
- Find a call of
YAML.load
(orMarshal.load
) where the first parameter could be controlled. - Find a class with a “dangerous method” which will be used to execute the real payload afterwards (the trojan).
- Forge the object of our target class with required properties set as expected to complete the execution chain.
dump
(Marshal
orYAML
) the above object to bytes and send it to the target.
The descriptions above might be obscure for who didn’t look into the issue
carefully. I’ll take ERB
as an example since it exactly fullfills the
requirements of a “deserialization trojan”.
A call to ERB#result
or ERB#run
will eventually lead to a eval
where
the first parameter is the @src
property. This is totally safe while the most
common use case of ERB
is the load a template from local filesystem and render
it with a set of provided bindings.
But what will happen if we look into the following example:
The some_user_input
here actually expects the dump of an object of class Response.
But thanks to Ruby’s duck typing, a dump of an ERB
object is also eligible here.
Finally, the ERB#result
will be called at the puts
line instead of
Response#result
. Therefore, in this case, forging the trojan is considerably easy.
Conclusion
The above example is simplified as much as possible to make it clear enough to understand. Cases in the real world are probably much more complicated to exploit like this one.
Forging the invocation chain in the payload is usually the most tricky part.
During this procedure, some methods like method_missing
, send
or public_send
may worth a closer look.
Due to some historical reasons, the public resources about arbitrary data deserialization are often in the scope of Ruby on Rails instead of Ruby the language and its standard libraries. As Ruby on Rails and its dependencies actually provide many gadgets for completing the invocation chain. However it should be noticed that this issue also exists in common ruby projects. Although the payload may look greatly different to those were used in RoR.