Serialization

From WikiLabs

Besides the input/output streams described in the stream chapter, the Java language also provides a powerful mechanism for serializing objects. We already know that a class is a structure that encapsulates data and functionality. Serialization is the procedure through which the data encapsulated in an instance of a class is converted to a sequence of bytes that can be sent and received through streams. The classes that do this are java.io.ObjectOutputStream and java.io.ObjectInputStream, and the methods used are writeObject(Object) and readObject().

Rule: For a class to be serializable (or, better said, for the objects of that class to be serializable), the class has to implement the java.io.Serializable interface:


import java.io.Serializable;

public class Question implements Serializable{

    public String questionBody;
    public String[] answers;
    public int correctAnswerIndex;

}

Object Serialization

The block diagram of an ObjectOutputStream with an associated OutputStream

An ObjectOutputStream is, just like a FilterOutputStream, a "wrapper" that needs another OutputStream in order to function. The role of ObjectOutputStream is to transform an object into a sequence of bytes, which can then be written to the underlying byte-level stream:

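import java.io.*;

public class ObjectWriter{

public static void main(String[] _args){
    try{
        // the byte-level stream that will receive the serialized form
        OutputStream _byteStream = getSomeOutputStream();
        // the wrapper that converts objects into bytes
        ObjectOutputStream _objectStream = new ObjectOutputStream(_byteStream);
        Question _q1 = new Question();
        _objectStream.writeObject(_q1);
        // closing the wrapper also flushes and closes the wrapped byte stream
        _objectStream.close();
    }catch(IOException _ioe){
        System.out.println("Unable to write object: " + _ioe.getMessage());
    }
}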

public static OutputStream getSomeOutputStream(){
    //...
}

}

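Static fields are not serialized, since they belong to the class rather than to any object, and serialization only deals with objects. If the serialized object contains references to other objects, those are serialized recursively, following the same rules; every referenced object must itself be serializable, otherwise writeObject() throws a java.io.NotSerializableException. Because the whole graph of referenced objects is written, reading it back yields what is effectively a "deep copy" of the original object.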
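The recursive behaviour can be observed with a small in-memory round trip. This is only a sketch: the classes Author and Quiz and the helper roundTrip() are hypothetical names invented for the illustration, and a ByteArrayOutputStream stands in for whatever byte stream an application would really use.

```java
import java.io.*;

class DeepCopyDemo {

    // Hypothetical classes, used only for this illustration.
    static class Author implements Serializable {
        String name;
        Author(String name) { this.name = name; }
    }

    static class Quiz implements Serializable {
        static int quizCount = 0;   // static: belongs to the class, never written to the stream
        String title;
        Author author;              // referenced object: serialized together with the Quiz
        Quiz(String title, Author author) { this.title = title; this.author = author; }
    }

    // Writes the object to an in-memory byte stream and reads it back.
    // Checked exceptions are rethrown unchecked only to keep the sketch short.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Quiz original = new Quiz("Streams", new Author("Ada"));
        Quiz copy = (Quiz) roundTrip(original);

        System.out.println(copy != original);               // true: a new Quiz object
        System.out.println(copy.author != original.author); // true: the Author was copied too
        System.out.println(copy.author.name);               // Ada
    }
}
```

The copy shares no objects with the original, which is why changing copy.author afterwards would leave original.author untouched.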

A field can be excluded from serialization by declaring it transient. After deserialization, a transient field holds the default value of its type (0, false or null):

import java.io.Serializable;

public class Question implements Serializable{

    public String questionBody;
    public String[] answers;
    public transient int correctAnswerIndex;

}
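The effect of transient is easiest to see after a round trip. The following sketch repeats the Question class as a nested class so that it is self-contained; the class name TransientDemo, the helper roundTrip() and the sample question are assumptions made for the example.

```java
import java.io.*;

class TransientDemo {

    // Same fields as the Question class above, nested here to keep the sketch in one file.
    static class Question implements Serializable {
        public String questionBody;
        public String[] answers;
        public transient int correctAnswerIndex;   // skipped by writeObject()
    }

    // In-memory round trip; checked exceptions are rethrown unchecked for brevity.
    static Question roundTrip(Question question) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(question);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Question) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Question q = new Question();
        q.questionBody = "2 + 2 = ?";
        q.answers = new String[] {"3", "4"};
        q.correctAnswerIndex = 1;

        Question copy = roundTrip(q);
        System.out.println(copy.questionBody);       // 2 + 2 = ?
        System.out.println(copy.answers.length);     // 2
        System.out.println(copy.correctAnswerIndex); // 0: the transient value was not stored
    }
}
```

This is convenient here because the correct answer never travels with the question, but it also means the receiving side must obtain that value some other way.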

Object De-Serialization

Block diagram of an ObjectInputStream with an associated InputStream

Reading serialized objects is done by using an object of type ObjectInputStream, which wraps another generic InputStream. As specified above, a serialized object is in fact a sequence of bytes representing the values of each non-static, non-transient field of the class. The sequence contains only minimal information about the class itself, so, in order for an object to be deserialized, the application that reads it needs to have available the class of which the object is an instance. If this class is not available, then the method readObject() will throw an exception of type java.lang.ClassNotFoundException.

The class checking is done at runtime (during the call to the readObject() method) and it also involves comparing a field defined as static final long serialVersionUID in the class with the value contained by the serialized object. So identical source code alone does not guarantee that the source and destination classes are compatible during serialization; they must also have the same value for the field serialVersionUID. If this value is not specified by the programmer, the serialization runtime computes a default one from the details of the class (its name, interfaces, fields and methods). Since some of these details depend on the compiler, the computed value may differ for the same source code compiled with different compilers, which is why declaring the field explicitly is recommended:

import java.io.Serializable;

public class Question implements Serializable{

    // the L suffix is required: the value does not fit in an int literal
    static final long serialVersionUID = 0x0000CAFEBABE0000L;

    public String questionBody;
    public String[] answers;
    public int correctAnswerIndex;

}

If the class is available but is not compatible with the serialized data (for example, because the serialVersionUID values differ), then the method readObject() will throw an exception of type java.io.InvalidClassException. The exceptions specific to object serialization extend java.io.ObjectStreamException.
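The serialVersionUID that the runtime associates with a class can be inspected through java.io.ObjectStreamClass, which helps when diagnosing an InvalidClassException. The sketch below uses two hypothetical nested classes, Declared and Undeclared, and a helper uidOf() introduced only for this example.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class VersionUidDemo {

    // Declares its version explicitly, like the Question class above.
    static class Declared implements Serializable {
        static final long serialVersionUID = 0x0000CAFEBABE0000L;
        String questionBody;
    }

    // Declares nothing: the runtime computes a default value from the class details.
    static class Undeclared implements Serializable {
        String questionBody;
    }

    // Returns the serialVersionUID the serialization runtime uses for a serializable class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(uidOf(Declared.class)));   // cafebabe0000
        // The computed default is not predictable from the source code alone.
        System.out.println(Long.toHexString(uidOf(Undeclared.class)));
    }
}
```

Running the same check on both the writing and the reading side shows quickly whether the two class versions agree.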

In order to be compatible with any kind of object, the method readObject() of class ObjectInputStream returns a reference to an Object. So, an explicit down-cast is required in order to use the object as an instance of its actual class:

import java.io.*;

public class ObjectReader{

public static void main(String[] _args){
    try{
        InputStream _byteStream = getSomeInputStream();
        ObjectInputStream _objectStream = new ObjectInputStream(_byteStream);
        Question _q1;
        _q1 = (Question)_objectStream.readObject();
        _objectStream.close();
    }catch (ClassNotFoundException _cnfe) {
        System.out.println("Class not available: " + _cnfe.getMessage());
    }catch(ObjectStreamException _ose){
        // ObjectStreamException must be caught before IOException, since it is a subclass of IOException
        System.out.println("Unable to deserialize (Object stream problem): " + _ose.getMessage());
    }catch(IOException _ioe){
        System.out.println("Unable to deserialize (Byte stream problem): " + _ioe.getMessage());
    }
}

public static InputStream getSomeInputStream(){
    //...
}

}
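Since getSomeOutputStream() and getSomeInputStream() are left open above, here is one possible concrete choice, a file, shown as a minimal sketch. The class name FileRoundTripDemo, the helpers save(), load() and viaTempFile(), and the temporary file name are all assumptions made for the example; the try-with-resources blocks close the streams even when an exception is thrown.

```java
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Path;

class FileRoundTripDemo {

    // Same fields as the Question class above, nested here to keep the sketch in one file.
    static class Question implements Serializable {
        static final long serialVersionUID = 1L;
        public String questionBody;
        public String[] answers;
        public int correctAnswerIndex;
    }

    // Serializes the question into the given file.
    static void save(Question question, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(question);
        }
    }

    // Reads a question back; the down-cast is needed because readObject() returns Object.
    static Question load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Question) in.readObject();
        }
    }

    // Saves to a temporary file, loads it back and deletes the file.
    // Checked exceptions are rethrown unchecked only to keep the sketch short.
    static Question viaTempFile(Question question) {
        try {
            Path file = Files.createTempFile("question", ".ser");
            try {
                save(question, file);
                return load(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Question q = new Question();
        q.questionBody = "2 + 2 = ?";
        q.answers = new String[] {"3", "4"};
        q.correctAnswerIndex = 1;

        Question copy = viaTempFile(q);
        System.out.println(copy.questionBody + " -> " + copy.answers[copy.correctAnswerIndex]);
    }
}
```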

Known Serialization Issues

  1. If an object is written then read from a stream, the read object will have a different reference than the original one, so it will, in fact, be a copy of the object, with the same content.
  2. Each ObjectOutputStream keeps a cache (a table of references) of the objects it has already written. So, if an object is written to the stream, modified, then written again, the change will not be reflected by the second write: the stream only emits a reference back to the copy written the first time. There are two work-arounds:
    • you can copy/ clone the original object, change it, and write the copy;
    • you can call ObjectOutputStream.reset() which will empty the cache.
  3. If you write a lot of different objects without clearing the cache, the stream keeps a reference to every one of them, so none can be garbage collected. The cache can eventually occupy all of the heap memory, crashing your program with a java.lang.OutOfMemoryError.
  4. If a resource contains both an InputStream and an OutputStream (like a Socket), and it is required to create an ObjectInputStream and an ObjectOutputStream on top of them, always create the ObjectOutputStream first and flush it. The ObjectInputStream constructor blocks until it has read the stream header that the ObjectOutputStream constructor on the other side writes, so if both sides create their ObjectInputStream first, each waits for the other and the application hangs.
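The caching behaviour described in issue 2, and the effect of reset(), can be reproduced in memory. The following is a sketch under assumptions: the Counter class and the helper writeTwice() are hypothetical names made up for the demonstration.

```java
import java.io.*;
import java.util.Arrays;

class ResetDemo {

    // Hypothetical class, used only for this illustration.
    static class Counter implements Serializable {
        int value;
    }

    // Writes the same object twice, changing it in between,
    // and returns the two values seen by the reader.
    static int[] writeTwice(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                Counter counter = new Counter();
                counter.value = 1;
                out.writeObject(counter);
                counter.value = 2;
                if (resetBetweenWrites) {
                    out.reset();          // forget every object written so far
                }
                out.writeObject(counter); // without reset(): only a back-reference is written
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                Counter first = (Counter) in.readObject();
                Counter second = (Counter) in.readObject();
                return new int[] {first.value, second.value};
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(writeTwice(false))); // [1, 1]: the change is lost
        System.out.println(Arrays.toString(writeTwice(true)));  // [1, 2]: the change is visible
    }
}
```

Calling reset() periodically also addresses issue 3, since the stream then releases its references to the objects already written.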