A Brief Explanation of Serialization
By Ramakrishna on Jul 21, 2010 in Java Important Notes, Java Serialization
Serialization is the process of converting a set of object instances that contain references to each other into a linear stream of bytes, which can then be sent through a socket, stored to a file, or simply manipulated as a stream of data. Serialization is the mechanism used by RMI to pass objects between JVMs, either as arguments in a method invocation from a client to a server or as return values from a method invocation. In the first section of this book, I referred to this process several times but delayed a detailed discussion until now. In this chapter, we drill down on the serialization mechanism; by the end of it, you will understand exactly how serialization works and how to use it efficiently within your applications.
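The definition above can be illustrated with a minimal sketch using the standard `java.io` object streams. The `Money` class shown here, with its single `cents` field, is a hypothetical stand-in, and the helper names `toBytes` and `fromBytes` are assumptions made for illustration. RMI does this marshalling internally, so this is not how RMI itself is invoked.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationRoundTrip {
    // Hypothetical value class; the real Money class may look different.
    static class Money implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int cents;
        Money(int cents) { this.cents = cents; }
        int getCents() { return cents; }
    }

    // Converts an object (and everything it references) into a linear byte stream.
    static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        return buffer.toByteArray();
    }

    // Rebuilds an object from bytes that could have come from a socket or a file.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Money original = new Money(1250);
        byte[] wire = toBytes(original);
        Money copy = (Money) fromBytes(wire);
        System.out.println(copy.getCents());   // prints 1250
        System.out.println(copy == original);  // prints false: the result is a new object
    }
}
```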
What does it mean for the client to pass an instance of Money to the server? At a minimum, it means that the server is able to call public methods on the instance of Money. One way to do this would be to implicitly make Money into a server as well. For example, imagine that the client sends the following two pieces of information whenever it passes an instance as an argument:
- The type of the instance; in this case, Money.
- A unique identifier for the object (i.e., a logical reference), such as the address of the instance in memory.
The RMI runtime layer in the server can use this information to construct a stub for the instance of Money, so that whenever the Account server calls a method on what it thinks of as the instance of Money, the method call is relayed over the wire.
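To make the idea concrete, here is a rough in-process sketch of such a stub, built with `java.lang.reflect.Proxy`. Everything in it is an assumption made for illustration: the `MoneyInterface` interface, the `clientObjects` table that stands in for the client's memory, the `"money-1"` identifier, and the `relayedCalls` counter that marks where a real stub would perform a network round trip. It is not how the RMI runtime builds its stubs.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class ImplicitServerSketch {
    // Hypothetical interface the stub implements.
    interface MoneyInterface { int getCents(); }

    static class Money implements MoneyInterface {
        public final int cents; // a public field: a stub has no way to expose it
        Money(int cents) { this.cents = cents; }
        public int getCents() { return cents; }
    }

    // Stands in for the client side: identifier -> live object.
    static final Map<String, Object> clientObjects = new HashMap<>();
    // Counts calls that would each cost a network round trip.
    static final AtomicInteger relayedCalls = new AtomicInteger();

    // Builds a stub from the two pieces of information: the type and the identifier.
    static <T> T stubFor(Class<T> type, String id) {
        InvocationHandler relay = (proxy, method, args) -> {
            relayedCalls.incrementAndGet(); // a real stub would marshal and send here
            return method.invoke(clientObjects.get(id), args);
        };
        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, relay));
    }

    public static void main(String[] args) {
        clientObjects.put("money-1", new Money(1250));
        MoneyInterface stub = stubFor(MoneyInterface.class, "money-1");
        System.out.println(stub.getCents());     // prints 1250
        System.out.println(relayedCalls.get());  // prints 1: one relayed call per method use
    }
}
```

Note that the stub only has the methods of `MoneyInterface`; code holding it cannot read the `cents` field directly, and every method call is counted as a relayed call.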
Attempting to do things this way has three significant drawbacks:
- You can’t access fields on the objects that have been passed as arguments.
Stubs work by implementing an interface. They implement the methods in the interface by simply relaying the method invocation across the network. That is, the stub methods take all their arguments and simply marshal them for transport across the wire. Accessing a public field is really just dereferencing a pointer; there is no method invocation, and hence no method call to forward over the wire.
- It can result in unacceptable performance due to network latency.
Even in our simple case, the instance of Account is going to need to call getCents( ) on the instance of Money. This means that a simple call to makeDeposit( ) really involves at least two distinct networked method calls: makeDeposit( ) from the client and getCents( ) from the server.
- It makes the application much more vulnerable to partial failure.
Let’s say that the server is busy and doesn’t get around to handling the request for 30 seconds. If the client crashes in the interim, or if the network goes down, the server cannot process the request at all. Until all data has been requested and sent, the application is particularly vulnerable to partial failures.
This last point is an interesting one. Any time you have an application that requires a long-lasting and durable connection between client and server, you build in a point of failure. The longer the connection needs to last, or the higher the communication bandwidth the connection requires, the more likely the application is to occasionally break down.
TIP: The original design of the Web, with its stateless connections, serves as a good example of a distributed application that can tolerate almost any transient network failure.
These three reasons imply that what is really needed is a way to copy objects and send them over the wire. That is, instead of turning arguments into implicit servers, arguments need to be completely copied so that no further network calls are needed to complete the remote method invocation. Put another way, we want the result of makeWithdrawal( ) to involve creating a copy of the instance of Money on the server side. The runtime structure should resemble
The desire to avoid unnecessary network dependencies has two significant consequences:
- Once an object is duplicated, the two objects are completely independent of each other.
Any attempt to keep the copy and the original in sync would involve propagating changes over the network, entirely defeating the reason for making the copy in the first place.
- The copying mechanism must create deep copies.
If the instance of Money references another instance, then copies must be made of both instances. Otherwise, when a method is called on the second object, the call must be relayed across the wire. Moreover, all the copies must be made immediately; we can’t wait until the second object is accessed to make the copy because the original might change in the meantime.
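Both consequences can be observed with a small sketch that copies an object by serializing and deserializing it in memory. The `Currency` class, the fields on `Money`, and the `deepCopy` helper are hypothetical and chosen only to give Money something to reference.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class DeepCopySketch {
    // Hypothetical second object referenced by Money.
    static class Currency implements Serializable {
        String code;
        Currency(String code) { this.code = code; }
    }

    // Hypothetical Money with a reference to another instance.
    static class Money implements Serializable {
        int cents;
        Currency currency;
        Money(int cents, Currency currency) { this.cents = cents; this.currency = currency; }
    }

    // Copies an object graph by writing it to bytes and reading it back.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Money original = new Money(1250, new Currency("USD"));
        Money copy = deepCopy(original);

        // Deep copy: the referenced Currency was copied too.
        System.out.println(copy.currency == original.currency); // prints false

        // Independence: later changes to the original do not reach the copy.
        original.currency.code = "EUR";
        System.out.println(copy.currency.code);                 // prints USD
    }
}
```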
These two consequences have a very important third consequence:
- If an object is sent twice, in separate method calls, two copies of the object will be created.
In addition to arguments to method calls, this holds for objects that are referenced by the arguments. If you pass object A, which has a reference to object C, and in another call you pass object B, which also has a reference to C, you will end up with two distinct copies of C on the receiving side.
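The A, B, and C scenario can be reproduced in memory with a short sketch in which each call to the hypothetical `roundTrip` helper stands in for one remote method call, since each uses its own serialization stream. The `Holder` and `C` classes are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SentTwiceSketch {
    // Hypothetical shared object C.
    static class C implements Serializable { }

    // Hypothetical stand-in for both A and B: an object holding a reference to C.
    static class Holder implements Serializable {
        final C shared;
        Holder(C shared) { this.shared = shared; }
    }

    // One round trip uses one fresh stream, like one separate method call.
    static Object roundTrip(Object obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        C c = new C();
        Holder a = new Holder(c);
        Holder b = new Holder(c);

        // Sent in separate "calls": the receiver ends up with two distinct copies of C.
        Holder aCopy = (Holder) roundTrip(a);
        Holder bCopy = (Holder) roundTrip(b);
        System.out.println(aCopy.shared == bCopy.shared);     // prints false

        // For contrast: written together in one stream, the shared reference survives.
        Holder[] both = (Holder[]) roundTrip(new Holder[] { a, b });
        System.out.println(both[0].shared == both[1].shared); // prints true
    }
}
```

The second half of `main` is included only for contrast: within a single stream, `ObjectOutputStream` writes an object once and refers back to it afterwards, so the duplication arises from using separate streams.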
