Thursday, April 22, 2010

Is PHP the only language doing flexible Base64 decoding?

As a follow-up to the Base64 decoding post, I did some quick research on Base64 implementations.

http://www.google.com/codesearch?hl=en&sa=N&filter=0&q=base64+decode+lang:java

And some interesting results came out:

http://www.google.com/codesearch/p?hl=en#p9nGS4eQGUI/gnu/classpath/classpath-0.13.tar.gz|er25_rDDsHI/classpath-0.13/gnu/java/net/BASE64.java&q=base64+decode+lang:java

gnu.java.net.BASE64

public static byte[] decode(byte[] bs)
{
    int srclen = bs.length;
    while (srclen > 0 && bs[srclen - 1] == 0x3d)
    {
        srclen--; /* strip padding character */
    }

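That means trailing = padding is stripped before the decoding is actually done and, as the run below shows, a = sequence in the middle of the input does not stop the decoder either.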

$ java BASE64 -d "PHNjcm======PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg=="
PHNjcm======PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg== = <scro\ufffd\ufffd\ufffd\ufffd><script>alert(1)</script>

This is of course a bad implementation of Base64 decoding.

But it could fool a security control, since most decoders stop at the first = sequence.
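To make the mismatch concrete, here is a minimal sketch using java.util.Base64 from the JDK (Java 8 and later, so not one of the libraries examined in this post). The class and method names are hypothetical. decodeEverySegment is only a stand-in for a lenient decoder that keeps going past padding; it is not GNU Classpath's algorithm, so its output differs from the run above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: a control and a backend disagreeing on the same input.
public class PaddingMismatchSketch {

    // What a control sees if its decoder stops at the first '=' sequence.
    static String decodeUntilFirstPadding(String input) {
        int firstPad = input.indexOf('=');
        String head = firstPad < 0 ? input : input.substring(0, firstPad);
        return new String(Base64.getDecoder().decode(head), StandardCharsets.UTF_8);
    }

    // Stand-in for a lenient backend: treat every run of '=' as a separator
    // and decode each segment on its own. Assumes each segment has a
    // decodable length; a real lenient decoder may behave differently.
    static String decodeEverySegment(String input) {
        StringBuilder out = new StringBuilder();
        for (String segment : input.split("=+")) {
            if (segment.isEmpty()) {
                continue;
            }
            byte[] bytes = Base64.getDecoder().decode(segment);
            out.append(new String(bytes, StandardCharsets.UTF_8));
        }
        return out.toString();
    }

    // The JDK's basic decoder rejects data that follows the padding.
    static boolean strictAccepts(String input) {
        try {
            Base64.getDecoder().decode(input);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String token = "PHNjcm======PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==";
        System.out.println("control sees:   " + decodeUntilFirstPadding(token));
        System.out.println("backend sees:   " + decodeEverySegment(token));
        System.out.println("strict accepts: " + strictAccepts(token));
    }
}
```

The control only ever inspects the harmless prefix, while the lenient path recovers the script tag hidden after the padding. A strict decoder avoids the ambiguity by refusing the token outright.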

http://www.google.com/codesearch/p?hl=en#p6HPTpcXbFY/JPainter/painter.zip|Iy8ZaJ1-4W4/jsp/Base64.java&q=base64+decode+lang:java

com.izhuk.util.Base64

public static byte[] decode(String encoded) {
    int i;
    byte output[] = new byte[3];
    int state;

    ByteArrayOutputStream data = new ByteArrayOutputStream(encoded.length());

    state = 1;
    for(i=0; i < encoded.length(); i++)
    {
        byte c;
        {
            char alpha = encoded.charAt(i);
            if (Character.isWhitespace(alpha)) continue;

This one skips any whitespace character in the input.

And finally:

http://www.google.com/codesearch/p?hl=en#CskViEIa27Y/src/org/apache/commons/codec/binary/Base64.java&q=base64+decode+lang:java&sa=N&cd=19&ct=rc

org.apache.commons.codec.binary.Base64

public static byte[] decodeBase64(byte[] base64Data) {
    // RFC 2045 requires that we discard ALL non-Base64 characters
    base64Data = discardNonBase64(base64Data);

... act surprising.
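The two lenient behaviors quoted above, skipping whitespace and discarding non-Base64 characters, can be reproduced with the JDK alone. The following is a rough sketch, assuming Java 8 or later: java.util.Base64.getMimeDecoder() follows the RFC 2045 rule and ignores characters outside the alphabet, while getDecoder() rejects them. The class name, method names and the sample token are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: the same token seen by a strict and an RFC 2045 style decoder.
public class StrictVsMimeSketch {

    // RFC 2045 style decoding: characters outside the Base64 alphabet are ignored.
    static String mimeDecode(String input) {
        byte[] bytes = Base64.getMimeDecoder().decode(input);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict decoding: any character outside the alphabet is an error.
    static boolean strictAccepts(String input) {
        try {
            Base64.getDecoder().decode(input);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "aGVsbG8=" encodes "hello"; a space and a '!' are injected in the middle.
        String token = "aGVs !bG8=";
        System.out.println("strict accepts:  " + strictAccepts(token));
        System.out.println("MIME decodes to: " + mimeDecode(token));
    }
}
```

A filter built on the strict decoder would treat this token as invalid and might let it through unchecked, while an application using the MIME style decoder still recovers the payload.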

If anybody wants to continue the research on Base64 implementations, I'd appreciate a comment here :)

4 comments :

  1. http://www.ietf.org/rfc/rfc2045.txt
    Page 25:
    Any characters outside of the base64 alphabet are to be ignored in base64-encoded data.
    ...
    That's about the RFC.
    (found in the org.apache.commons.codec.binary.Base64 implementation)

  2. This comment has been removed by a blog administrator.

  3. Hi Andrew, thanks :)

    That exactly fits my thoughts.
    It is to some extent similar to the concept I tried to focus on when talking about HPP.

    The good news is that the next release of ModSecurity will probably have two Base64 decoders, strict and flexible ;)

    Stefano

  4. Hey Mate,
    I've just checked a few other Java Base64 implementations. Here you have:

    import java.io.IOException;
    import sun.misc.BASE64Decoder;

    public class Main {

        public static void main(String[] args) throws IOException {
            String token="a.GVsbG8gd29ybGQ=";

            // 1 - com.Ostermiller.util.Base64
            System.out.println("1 - "+com.Ostermiller.util.Base64.decode(token));
            // 2 - sun.misc.BASE64Decoder !deprecated!
            System.out.println("2 - "+new BASE64Decoder().decodeBuffer(token));
            // 3 - com.sun.faces.util.Base64
            System.out.println("3 - "+com.sun.faces.util.Base64.decode(token.getBytes()));
            // 4 - org.apache.axis.utils.Base64
            System.out.println("4 - "+org.apache.axis.utils.Base64.decode(token.getBytes()));

            //run:
            //1 - hello world
            //2 - [B@1d8957f
            //3 - [B@1abab88
            //4 - [B@16cd7d5
            //(2-4 return byte[], so the concatenation prints the array reference, not the decoded text)
        }
    }
