The primary purpose of Java serialization is to write an object to a stream so that it can be transported over a network and rebuilt on the other side.
Object serialization in Java is the process of converting an object into a binary format that can be persisted to disk or sent over a network to another running Java virtual machine. The reverse process, creating an object from a binary stream, is called deserialization.
The most impressive thing is that this process is JVM independent: an object can be serialized on one platform and deserialized on an entirely different platform.
Java provides a Serialization API for serializing and deserializing objects, which includes java.io.Serializable, java.io.Externalizable, ObjectInputStream, ObjectOutputStream and related classes.
The classes ObjectInputStream and ObjectOutputStream are high-level streams that contain the methods for serializing and deserializing an object.
Basic guidelines that should be followed for Object serialization:
- Any class that wants its objects to be serialized must implement the Serializable interface. The Serializable interface does not have any methods. It merely serves to flag an object as serializable to an ObjectOutputStream.
- If there are any members in a Serializable class, then the following guidelines apply:
- If they are primitives, they are automatically serializable.
- If they are non-primitive objects, they must implement Serializable. If we try to serialize an object that contains a reference to an object that does not implement Serializable, a java.io.NotSerializableException is thrown at run time while the object is being serialized.
- If we have a reference to a non-serializable object in our class, we have to mark the reference with the keyword transient. The transient keyword on a reference means that when the parent object is serialized, the object whose reference is marked as transient will not be serialized.
How serialization is performed in Java:
1. Java provides the Serialization API, a standard mechanism to handle object serialization. To persist an object in Java, the first step is to flatten the object. For that, the respective class should implement the "java.io.Serializable" interface. We don't need to implement any methods, as this interface does not have any. It is a marker interface (tag interface). Marking a class as Serializable indicates to the underlying API that objects of this class can be flattened.
2. The next step is to actually persist the object. To persist an object you need to
- Use a node stream to write to the file system or to transfer a flattened object across a network wire and have it rebuilt on the other side.
- Wrap that stream in a java.io.ObjectOutputStream, a filter stream that is a wrapper around a lower-level byte stream.
- To write an object, use the "writeObject(<<instance>>)" method of the "java.io.ObjectOutputStream" class; to read an object, use the "readObject()" method of the "java.io.ObjectInputStream" class.
- "readObject()" can read only serialized objects. If a class does not implement the "java.io.Serializable" interface, its instances cannot be written to the stream and therefore cannot be read back.
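The write and read steps above can be sketched as a small in-memory round trip. This is only a minimal illustration: the class names RoundTripDemo and Engine and the helper methods serialize() and deserialize() are made up for this sketch, and a ByteArrayOutputStream stands in for the file or network stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of the Serialization API.
class RoundTripDemo {

    // Flattens an object into a byte array through an ObjectOutputStream.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from the bytes through an ObjectInputStream.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Engine original = new Engine(7, "V8");
        Engine copy = (Engine) deserialize(serialize(original));
        System.out.println(copy.engineID + " " + copy.engineName); // prints "7 V8"
        System.out.println(copy == original); // prints "false": the copy is a new object
    }
}

// Hypothetical serializable class used only for this sketch.
class Engine implements Serializable {
    private static final long serialVersionUID = 1L;
    int engineID;
    String engineName;

    Engine(int engineID, String engineName) {
        this.engineID = engineID;
        this.engineName = engineName;
    }
}
```

Replacing the byte array streams with FileOutputStream and FileInputStream would persist the object to disk instead of keeping it in memory.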
See the following Example:
=========================================
import java.io.Serializable;

// a non-serializable class
public class NonSer
{
    private Integer modelID;
    private String modelName;
    //rest of the implementations
}
// a serializable class
public class Ser implements Serializable
{
    private int engineID;
    private String engineName;
    // other implementations
}
public class Car implements Serializable
{
    //a primitive, hence serializable
    private int carID;
    //a non-serializable object, hence transient
    private transient NonSer nonSerRef;
    //a serializable object, hence no transient
    private Ser serRef;
    // other implementations
}
=========================================
While serializing objects, Java builds a data structure similar to an object graph to determine which objects need to be serialized. It starts from the main object to serialize and recursively traverses all the objects reachable from it. Each object that needs serialization is assigned an identifier that marks it as already serialized to the given ObjectOutputStream instance. When Java encounters an object that has already been marked as serialized to that ObjectOutputStream, it does not serialize the object again; instead, a handle to the same object is written. This is how Java avoids re-serializing an already serialized object. The seemingly complex problem is solved by the simple method of assigning IDs to objects.
Now when we try to serialize an instance of the Car class shown above, no exception is thrown, because all the members of the Car object are either primitives, implement Serializable, or are marked with the keyword transient.
Note that if we had not marked the NonSer reference in our Car declaration as transient, we would have got a java.io.NotSerializableException while trying to serialize the Car object.
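Both outcomes can be checked with a short sketch. It reuses the names Car and NonSer from the example, but BrokenCar, TransientDemo and the helper methods are hypothetical names introduced here, and the fields are left package-private only to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class for this sketch.
class TransientDemo {

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Returns true if writing the object fails with NotSerializableException.
    static boolean failsToSerialize(Object obj) throws IOException {
        try {
            serialize(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        Car restored = (Car) deserialize(serialize(new Car(1, new NonSer())));
        System.out.println(restored.carID);             // prints "1"
        System.out.println(restored.nonSerRef == null); // prints "true": transient field was skipped
        System.out.println(failsToSerialize(new BrokenCar(2, new NonSer()))); // prints "true"
    }
}

// Does not implement Serializable.
class NonSer {
    String modelName = "demo";
}

// The reference to NonSer is transient, so serialization succeeds.
class Car implements Serializable {
    int carID;
    transient NonSer nonSerRef;

    Car(int carID, NonSer nonSerRef) {
        this.carID = carID;
        this.nonSerRef = nonSerRef;
    }
}

// Hypothetical variant without transient: writeObject() rejects it.
class BrokenCar implements Serializable {
    int carID;
    NonSer nonSerRef;

    BrokenCar(int carID, NonSer nonSerRef) {
        this.carID = carID;
        this.nonSerRef = nonSerRef;
    }
}
```

Note that NotSerializableException is a checked exception (a subclass of IOException), so it is declared or caught like any other I/O error.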
The above example was a very basic one to show how to make a class serializable in Java. Now let us look under the hood at how Java resolves objects during serialization. To state the problem, consider the following class declarations:
public class Ser implements Serializable
{
    private int ID;
    private String gearName;
    //other members if any
}
public class Car implements Serializable
{
    private int ID;
    private Ser serRef;
    public Car(int i, Ser sRef)
    {
        this.ID = i;
        this.serRef = sRef;
    }
    // accessor used by the identity checks that follow
    public Ser getSerType()
    {
        return serRef;
    }
}
Now let us serialize two Car objects:
Ser sRef = new Ser();
Car c1 = new Car(1, sRef);
Car c2 = new Car(2, sRef);
// both cars are written to the same stream
ObjectOutputStream out = new ObjectOutputStream(
    new FileOutputStream("out.dat"));
out.writeObject(c1);
out.writeObject(c2);
out.close();
In the above code snippet, we serialized two Car objects [note the usage of the same Ser object to construct two different Car objects]. The interesting question is: how many Ser objects were serialized?
There was only one Ser object that both the Car objects were sharing. Hence it should only serialize one Ser object. The answer indeed is one. This can be proved by checking if the serialized Ser object is the same for both the Car objects. Let us look into that:
ObjectInputStream in = new ObjectInputStream(
    new FileInputStream("out.dat"));
Car first = (Car) in.readObject();
Car second = (Car) in.readObject();
in.close();
// identity comparison, not equals()
System.out.println(first.getSerType() == second.getSerType());
The above code does print true. Note that here we are testing for object identity of the Ser object instead of logical equality [which is done via equals() method]. The identity test is required because we want to verify whether both the Ser objects are actually the same.
One important thing to note is that if we used different ObjectOutputStream instances to serialize the two Car objects, Java would serialize the same Ser object twice, once in each stream. The ID that Java assigns to the Ser object associates it with the first ObjectOutputStream only. When Java encounters the same object while writing to the second stream, it sees that the object has not yet been serialized to that stream, and so it serializes it again.
See the following example:
=========================================
Ser g = new Ser();
Car c1 = new Car(1, g);
Car c2 = new Car(2, g);
// each car is written to its own stream
ObjectOutputStream out1 = new ObjectOutputStream(
    new FileOutputStream("out1.dat"));
ObjectOutputStream out2 = new ObjectOutputStream(
    new FileOutputStream("out2.dat"));
out1.writeObject(c1);
out2.writeObject(c2);
out1.close();
out2.close();
Now to prove that the Ser object is indeed serialized twice, we read the streams back:
ObjectInputStream in1 = new ObjectInputStream(
    new FileInputStream("out1.dat"));
ObjectInputStream in2 = new ObjectInputStream(
    new FileInputStream("out2.dat"));
Car first = (Car) in1.readObject();
Car second = (Car) in2.readObject();
in1.close();
in2.close();
System.out.println(first.getSerType() == second.getSerType());
Now the above code prints false, which proves that the Ser object was serialized twice.
=========================================
Important points to remember:
- We use the Externalizable interface to control serialization. It provides the writeExternal() and readExternal() methods, which give us the flexibility to control the serialization mechanism instead of relying on Java's default serialization. A correct implementation of Externalizable can improve the performance of an application considerably, for example by writing fewer bytes than the default format.
- The Serializable interface lives in the java.io package and forms the core of the Java serialization mechanism. It doesn't have any methods and is therefore called a marker interface. When a class implements java.io.Serializable, its objects become serializable, and the serialization runtime (not the compiler) is allowed to apply the Java serialization mechanism to them.
- If we don't want a field to be part of the object's serialized state, we declare it either static or transient, depending on our need, and it will not be included during the Java serialization process.
- If we try to serialize an object of a class that implements Serializable, but the object includes a reference to an instance of a non-serializable class, a NotSerializableException will be thrown at run time. In this case we need to use the transient keyword.
- We all know that ObjectOutputStream.writeObject(saveThisObject) is invoked to serialize an object and ObjectInputStream.readObject() is invoked to read it back. In addition, Java lets you define two methods in your own class: private void writeObject(ObjectOutputStream out) and private void readObject(ObjectInputStream in). If you define these methods, the serialization runtime invokes them instead of applying the default serialization mechanism. Here you can customize the behavior of serialization and deserialization by doing any kind of pre- or post-processing. The important point is to make these methods private so that they cannot be inherited, overridden or called by other code. The serialization runtime calls them reflectively, so the integrity of your class is preserved and Java serialization works as normal.
- serialVersionUID is a version number that is recorded in the stream, together with the description of the class, when an object is serialized. If the class does not declare one, the serialization runtime computes a default value from details of the class such as its name, interfaces, fields and methods; it is not the hashCode of the object. You can use the serialver tool to see the serialVersionUID of a class. serialVersionUID is used for version control of serialized classes, and you can declare it explicitly in your class as a static final long field. The consequence of not declaring it is that when you add or modify a field in the class, objects that were already serialized may no longer be recoverable, because the serialVersionUID computed for the new class and the one stored with the old serialized object will differ. The Java serialization process relies on a matching serialVersionUID to recover the state of a serialized object and throws java.io.InvalidClassException in case of a mismatch.
- Suppose a superclass is serializable and we want to prevent a new subclass of it from being serialized. We can do this by implementing the writeObject() and readObject() methods in the subclass and throwing NotSerializableException from them. This is another benefit of customizing the Java serialization process.
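The two customization routes from the points above, private writeObject()/readObject() hooks and the Externalizable interface, can be sketched side by side. Everything named here (CustomSerializationDemo, Account, Point, and the string reversal used as a stand-in for real pre- and post-processing) is an assumption made for illustration; only the hook signatures and the Externalizable methods come from the Serialization API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class for this sketch.
class CustomSerializationDemo {

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account account = (Account) deserialize(serialize(new Account("alice", "secret")));
        System.out.println(account.user + " " + account.password); // prints "alice secret"

        Point point = (Point) deserialize(serialize(new Point(3, 4)));
        System.out.println(point.x + "," + point.y); // prints "3,4"
    }
}

// Serializable class with private hooks that the serialization runtime calls.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    String user;
    transient String password; // skipped by default serialization

    Account(String user, String password) {
        this.user = user;
        this.password = password;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes the non-transient fields
        // Toy pre-processing only: a real class would use proper encryption.
        out.writeObject(new StringBuilder(password).reverse().toString());
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // restores the non-transient fields
        password = new StringBuilder((String) in.readObject()).reverse().toString();
    }
}

// Externalizable class: it writes and reads its own state completely.
class Point implements Externalizable {
    int x;
    int y;

    public Point() { } // a public no-arg constructor is required for Externalizable

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }
}
```

With the hooks, default serialization still handles the ordinary fields and the class only adds to it. With Externalizable, the class is responsible for every field, and the fields must be read back in the same order in which they were written.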
Suppose we have serialized an object of a class and stored it in persistent storage, and later modified that class to add a new field. What happens if we deserialize the object that was already serialized?
It depends on whether the class has its own serialVersionUID or not. If we don't provide a serialVersionUID in our code, the serialization runtime computes one from the structure of the class (not from the hashCode of the object). After a new field is added, the serialVersionUID computed for the new class version will very likely differ from the one stored with the already serialized object, and in that case the Java Serialization API throws java.io.InvalidClassException. If the class declares its own serialVersionUID and keeps it unchanged, the old object can still be deserialized and the new field simply receives its default value. This is the reason it is recommended to declare your own serialVersionUID in code and to keep it the same for compatible versions of a class.
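As a small sketch of the difference, the following compares a class that declares serialVersionUID with one that relies on the computed default. The class names SerialVersionDemo, Versioned and Unversioned and the helper uidOf() are hypothetical; ObjectStreamClass.lookup() and getSerialVersionUID() are the java.io calls that report the value the serialization runtime would use.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical demo class for this sketch.
class SerialVersionDemo {

    // Returns the serialVersionUID the serialization runtime uses for the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Declared explicitly, so the value stays stable across class changes.
        System.out.println(uidOf(Versioned.class)); // prints "1"

        // Computed from the class structure; adding a field changes this value.
        System.out.println(uidOf(Unversioned.class));
    }
}

class Versioned implements Serializable {
    private static final long serialVersionUID = 1L;
    int id;
}

class Unversioned implements Serializable {
    int id;
}
```

The second value is deliberately not shown, because it depends on the details of the class and is exactly what changes when the class is edited.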
Uses of serialization:
- To persist data for future use.
- To send data to a remote computer using such client/server Java technologies as RMI or socket programming.
- To "flatten" an object into an array of bytes in memory.
- To exchange data between applets and servlets.
- To store user sessions in Web applications.
- To activate/passivate Enterprise JavaBeans.
- To send objects between the servers in a cluster.