Thursday, June 29, 2017
Java Code to check if two binary trees are identical or not
The idea is to traverse both trees and compare the values at their root nodes. If the values match, we recursively check whether the left subtree of the first tree is identical to the left subtree of the second tree, and the right subtree of the first tree is identical to the right subtree of the second tree. If the values at the root nodes differ, the trees violate the data property. If at any point in the recursion the first tree is empty and the second tree is non-empty, or the second tree is empty and the first tree is non-empty, the trees violate the structural property and cannot be identical.
// Java program to check if two trees are identical

// A binary tree node
class Node
{
    int data;
    Node left, right;

    Node(int item)
    {
        data = item;
        left = right = null;
    }
}

class BinaryTree
{
    Node root1, root2;

    /* Given two trees, return true if they are structurally identical */
    boolean identicalTrees(Node a, Node b)
    {
        /* 1. both empty */
        if (a == null && b == null)
            return true;

        /* 2. both non-empty -> compare them */
        if (a != null && b != null)
            return (a.data == b.data
                    && identicalTrees(a.left, b.left)
                    && identicalTrees(a.right, b.right));

        /* 3. one empty, one not -> false */
        return false;
    }

    /* Driver program to test identicalTrees() function */
    public static void main(String[] args)
    {
        BinaryTree tree = new BinaryTree();
        tree.root1 = new Node(1);
        tree.root1.left = new Node(2);
        tree.root1.right = new Node(3);
        tree.root1.left.left = new Node(4);
        tree.root1.left.right = new Node(5);

        tree.root2 = new Node(1);
        tree.root2.left = new Node(2);
        tree.root2.right = new Node(3);
        tree.root2.left.left = new Node(4);
        tree.root2.left.right = new Node(5);

        if (tree.identicalTrees(tree.root1, tree.root2))
            System.out.println("Both trees are identical");
        else
            System.out.println("Trees are not identical");
    }
}
What are Microservices?
The idea behind microservices is that some types of applications become easier to build and maintain when they are broken down into smaller, composable pieces which work together. Each component is developed separately, and the application is then simply the sum of its constituent components. This is in contrast to a traditional, "monolithic" application which is developed all in one piece.
There are many reasons why this approach is considered an easier way to develop large applications, particularly enterprise applications, and various types of software as a service delivered over the Internet.
One of the reasons is from a project engineering perspective. When the different components of an application are separated, they can be developed concurrently. Another is resilience. Rather than relying upon a single virtual or physical machine, components can be spread across multiple servers or even multiple data centers. If a component dies, you spin up another, and the rest of the application can continue to function. It also allows more efficient scaling: rather than scaling up with bigger and more powerful machines, or just more copies of the entire application, you can scale out with duplicate copies of the heaviest-used parts.
Is this a new concept?
The idea of separating applications into smaller parts is nothing new; there are other programming paradigms which address this same concept, such as Service Oriented Architecture (SOA). What may be new are some of the tools and techniques used to deliver on the promise of microservices.
The common definition of microservices generally relies upon each microservice providing an API endpoint, often but not always a stateless REST API which can be accessed over HTTP(S) just like a standard webpage. This method for accessing microservices makes them easy for developers to consume, as it only requires tools and methods many developers are already familiar with.
Microservices depend not just on the technology being set up to support this concept, but on an organization having the culture, know-how, and structures in place for development teams to be able to adopt this model. Microservices are a part of a larger shift in IT departments towards a DevOps culture, in which development and operations teams work closely together to support an application over its lifecycle, and go through a rapid or even continuous release cycle rather than a more traditional long cycle.
Why is open source important for microservices?
When you design your applications from the ground up to be modular and composable, it allows you to use drop-in components in many places where in the past you may have required proprietary solutions, either because of the licensing of the components or because of specialized requirements. Many application components can be off-the-shelf open source tools.
A focus on microservices may also make it easier for application developers to offer alternative interfaces to your applications. When everything is an API, communications between application components become standardized. All a component has to do to make use of your application and data is to be able to authenticate and communicate across those standard APIs. This allows both those inside and, when appropriate, outside your organization to easily develop new ways to utilize your application's data and services.
Where do Docker and container technologies come in?
Many people see Docker or other container technologies as enablers of a microservice architecture.
Unlike virtual machines, containers are designed to be pared down to the minimum viable pieces needed to run the one thing the container is designed to do, rather than packing multiple functions into the same virtual or physical machine. The ease of development that Docker and similar tools provide helps make possible rapid development and testing of services.
Of course, containers are just a tool, and microservice architecture is just a concept. So it is entirely possible to build an application which could be described as following a microservices approach without using containers, just as it would be possible to build a much more traditional application inside of a container (although this may not be a good idea).
How do you orchestrate microservices?
In order to actually run an application based on microservices, you need to be able to monitor, manage, and scale the different constituent parts. There are a number of different tools that might allow you to accomplish this. For containers, open source tools like Kubernetes, Docker Swarm, or Apache projects like Mesos or ZooKeeper might be a part of your solution. Alternatively, for non-container pieces of an application, other tools may be used for orchestrating components: for example, in an OpenStack cloud you might use Heat for managing application components. Another option is to use a Platform as a Service (PaaS) tool, which lets developers focus on writing code by abstracting some of the underlying orchestration technology and allowing them to easily select off-the-shelf open source components for certain parts of an application, like a database storage engine, a logging service, a continuous integration server, a web server, or other pieces of the puzzle. Some PaaS systems like OpenShift directly use upstream projects like Docker and Kubernetes for managing application components, while others try to re-implement management tools themselves.
What about existing applications?
While utilizing microservices may be an important component of an organization's IT strategy going forward, there are certainly many applications which don't meet this model, nor is it likely that those applications will be rearchitected overnight to meet this new paradigm. Microservices and traditional applications can work together in the same environments, provided the organization has a solid bi-modal IT strategy.
Thursday, June 1, 2017
Data Structures - Asymptotic Analysis (Big O, Theta θ, Omega Ω Notation)
Asymptotic analysis of an algorithm refers to defining the mathematical bounds of its run-time performance. Using asymptotic analysis, we can very well conclude the best case, average case, and worst case scenario of an algorithm.
Asymptotic analysis is input bound, i.e., if there's no input to the algorithm, it is concluded to work in constant time. Other than the "input", all other factors are considered constant. The main idea of asymptotic analysis is to have a measure of the efficiency of algorithms that doesn't depend on machine-specific constants and doesn't require algorithms to be implemented and the time taken by programs to be compared. Asymptotic notations are mathematical tools to represent the time complexity of algorithms for asymptotic analysis.
Asymptotic analysis refers to computing the running time of any operation in mathematical units of computation. For example, the running time of one operation may be computed as f(n) while for another operation it is computed as g(n^2). This means the running time of the first operation will increase linearly with the increase in n, while the running time of the second operation will grow quadratically as n increases. Similarly, the running times of both operations will be nearly the same if n is sufficiently small.
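To make this concrete, here is a minimal, hypothetical Java sketch (the method names are illustrative, not from any library): linearSum touches each element once, so its running time grows linearly with n, while countPairs uses two nested loops, so its running time grows as n^2.
// Hypothetical illustration of linear vs. quadratic growth
public class GrowthDemo
{
    // Runs in O(n): a single pass over the array
    static long linearSum(int[] arr)
    {
        long sum = 0;
        for (int value : arr)
            sum += value;
        return sum;
    }

    // Runs in O(n^2): nested loops visit every pair of elements
    static long countPairs(int[] arr)
    {
        long pairs = 0;
        for (int i = 0; i < arr.length; i++)
            for (int j = i + 1; j < arr.length; j++)
                pairs++;
        return pairs;
    }
}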
Usually, the time required by an algorithm falls under three types −
Best Case − Minimum time required for program execution.
Average Case − Average time required for program execution.
Worst Case − Maximum time required for program execution.
Asymptotic Notations
Following are the commonly used asymptotic notations to calculate the running time complexity of an algorithm.
Ο Notation
Ω Notation
θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time. It measures the worst case time complexity, or the longest amount of time an algorithm can possibly take to complete. Big O notation defines an upper bound of an algorithm; it bounds a function only from above. For example, consider the case of Insertion Sort. It takes linear time in the best case and quadratic time in the worst case. We can safely say that the time complexity of Insertion Sort is O(n^2). Note that O(n^2) also covers linear time.
If we use Θ notation to represent time complexity of Insertion sort, we have to use two statements for best and worst cases:
1. The worst case time complexity of Insertion Sort is Θ(n^2).
2. The best case time complexity of Insertion Sort is Θ(n).
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time. It measures the best case time complexity, or the least amount of time an algorithm can possibly take to complete. Just as Big O notation provides an asymptotic upper bound on a function, Ω notation provides an asymptotic lower bound. Ω notation can be useful when we have a lower bound on the time complexity of an algorithm.
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an algorithm's running time. The theta notation bounds a function from above and below, so it defines exact asymptotic behavior.
A simple way to get Theta notation of an expression is to drop low order terms and ignore leading constants. For example, consider the following expression.
3n^3 + 6n^2 + 6000 = Θ(n^3)
Dropping lower order terms is always fine because there will always be an n0 after which Θ(n^3) has higher values than Θ(n^2), irrespective of the constants involved.
Volatile vs Atomic in Java
Volatile and Atomic are two different concepts. Volatile ensures that a certain, expected (memory) state is visible across different threads, while Atomics ensure that operations on variables are performed atomically.
Take the following example of two threads in Java:
Thread A:
value = 1;
done = true;
Thread B:
if (done)
System.out.println(value);
Starting with value = 0 and done = false, the rules of threading tell us that it is undefined whether or not Thread B will print value. Furthermore, value is undefined at that point as well! To explain this you need to know a bit about Java memory management (which can be complex). In short: threads may keep local copies of variables, and the JVM can reorder code to optimize it, so there is no guarantee that the above code runs in exactly that order. Setting done to true and then setting value to 1 would be a possible outcome of the JIT.
volatile only ensures that at the moment such a variable is accessed, the new value is immediately visible to all other threads, and the order of execution ensures that the code is in the state you would expect it to be. So in the case of the code above, declaring done as volatile ensures that whenever Thread B checks the variable, it is either false or true, and if it is true, then value has been set to 1 as well.
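Here is the example above assembled into a complete, minimal sketch (the class name is illustrative); the volatile write to done guarantees that a reader seeing done == true also sees value == 1:
public class VolatileDemo
{
    static int value = 0;
    static volatile boolean done = false;

    public static void main(String[] args)
    {
        // Thread A: writes value, then publishes it via the volatile flag
        new Thread(() -> {
            value = 1;
            done = true;
        }).start();

        // Thread B: if it observes done == true, it is guaranteed to also see value == 1
        new Thread(() -> {
            if (done)
                System.out.println(value);
        }).start();
    }
}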
As a side-effect of volatile, such a variable is read and written atomically thread-wide (at a very minor cost in execution speed). This is, however, only important on 32-bit systems that use, e.g., long (64-bit) variables (or similar); in most other cases setting/reading a variable is atomic anyway. But there is an important difference between atomic access and an atomic operation: volatile only ensures that the access is atomic, while Atomics ensure that the operation is atomic.
Take the following example:
i = i + 1;
No matter how you define i, a different thread reading the value just as the above line executes might get i or i + 1, because the operation is not atomic. If the other thread sets i to a different value, in the worst case i could be set back to whatever it was before by thread A, because thread A was just in the middle of calculating i + 1 based on the old value, and then sets i again to that old value + 1. Explanation:
Assume i = 0
Thread A reads i, calculates i+1, which is 1
Thread B sets i to 1000 and returns
Thread A now sets i to the result of the operation, which is i = 1
Atomics like AtomicInteger ensure that such operations happen atomically. So the above issue cannot happen; i would be either 1000 or 1001 once both threads are finished.
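A minimal sketch of the same scenario using AtomicInteger (the class name is illustrative):
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicDemo
{
    static final AtomicInteger i = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException
    {
        // Thread A: the read-modify-write happens as one indivisible operation
        Thread a = new Thread(() -> i.incrementAndGet());
        // Thread B: sets i to 1000
        Thread b = new Thread(() -> i.set(1000));
        a.start();
        b.start();
        a.join();
        b.join();
        // No lost update is possible: i is either 1000 or 1001
        System.out.println(i.get());
    }
}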
Does making all fields final make the class immutable in Java?
One of the common misconceptions among many Java programmers is that a class with all final fields automatically becomes immutable. This is not correct: you can easily break the immutability of a class if a final field it contains refers to a mutable object, as we'll see in this article. One of the most common examples of this is java.util.Date. You have to be extra cautious to keep your class's immutability intact with mutable fields. When you return a reference to a mutable object, you are sharing ownership of that reference with whoever receives it, and this can break invariants such as immutability.
Another example of this kind of pattern which can break immutability is returning a collection or array from a getter method.
So, even though the field pointing to the Date, Collection, or array object is final, you can still break the immutability of the class by breaking encapsulation, i.e., by returning a reference to the original mutable object.
There are two ways to avoid this problem. First, don't provide getters to mutable objects if you can avoid it. If you must, then consider returning a copy or clone of the mutable object. If you are returning a collection, you can wrap it as an unmodifiable collection; since we cannot make an array final or unmodifiable in Java, return a defensive copy of the array instead.
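A minimal sketch of the defensive-copy idea (the Employee class is illustrative):
import java.util.Date;

// All fields are final, but immutability only holds because of the defensive copies
public final class Employee
{
    private final String name;
    private final Date hireDate;

    public Employee(String name, Date hireDate)
    {
        this.name = name;
        // Copy in: the caller keeps its own Date, we keep ours
        this.hireDate = new Date(hireDate.getTime());
    }

    public String getName()
    {
        return name; // String is immutable, so it is safe to share
    }

    public Date getHireDate()
    {
        // Copy out: returning the field directly would let callers mutate our state
        return new Date(hireDate.getTime());
    }
}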
Monday, May 1, 2017
How to make code Thread-Safe in Java
There are multiple ways to make code thread-safe in Java:
1) Use the synchronized keyword in Java and lock the getCount() method so that only one thread can execute it at a time, which removes the possibility of threads coinciding or interleaving.
2) Use AtomicInteger, which makes the ++ operation atomic; since atomic operations are thread-safe, this saves the cost of external synchronization (see the sketch after this list).
3) Immutable objects are thread-safe by default because their state cannot be modified once created. Since String is immutable in Java, it is inherently thread-safe.
4) Read-only or final variables in Java are also thread-safe.
5) Locking is one way of achieving thread-safety in Java.
6) Static variables, if not synchronized properly, become a major cause of thread-safety issues.
7) Examples of thread-safe classes in Java: Vector, Hashtable, ConcurrentHashMap, String, etc.
8) Atomic operations in Java are thread-safe, e.g. reading a 32-bit int from memory; because it's an atomic operation, it can't interleave with another thread.
9) Local variables are also thread-safe because each thread has its own copy, and using local variables is a good way to write thread-safe code in Java.
10) In order to avoid thread-safety issues, minimize the sharing of objects between multiple threads.
11) The volatile keyword in Java can also be used to instruct threads not to cache a variable and to read it from main memory, and it can also instruct the JVM not to reorder or optimize the code from a threading perspective.
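A minimal, hypothetical Counter sketch contrasting approaches 1 and 2 above (the getCount() method is assumed from the original example):
import java.util.concurrent.atomic.AtomicInteger;

public class Counter
{
    private int count = 0;
    private final AtomicInteger atomicCount = new AtomicInteger(0);

    // Approach 1: synchronized makes the read-increment-write sequence indivisible
    public synchronized int getCount()
    {
        return count++;
    }

    // Approach 2: AtomicInteger achieves the same without external synchronization
    public int getCountAtomically()
    {
        return atomicCount.getAndIncrement();
    }
}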
Tuesday, April 4, 2017
Top 10 Coding guidelines in Java
The top 10 are as follows
1) Do not expose methods that use reduced-security checks to untrusted code. Certain methods use a reduced-security check that checks only that the calling method is authorized rather than checking every method in the call stack. Any code that invokes these methods must guarantee that they cannot be invoked on behalf of untrusted code.
2) Do not use the clone() method to copy untrusted method parameters. Inappropriate use of the clone() method can allow an attacker to exploit vulnerabilities by providing arguments that appear normal but subsequently return unexpected values. Such objects may consequently bypass validation and security checks.
3) Document thread-safety and use annotations where applicable. The Java language annotation facility is useful for documenting design intent. Source code annotation is a mechanism for associating metadata with a program element and making it available to the compiler, analyzers, debuggers, or Java Virtual Machine (JVM) for examination. Several annotations are available for documenting thread-safety.
4) Be aware of numeric promotion behavior. Promotions in which the operands are converted from an int to a float or from a long to a double can cause a loss of precision.
5) Use a try-with-resources statement to safely handle closeable resources (see the sketch after this list). Using the try-with-resources statement prevents problems that can arise when closing resources with an ordinary try-catch-finally block, such as failing to close a resource because an exception is thrown as a result of closing another resource, or masking an important exception when a resource is closed.
6) Use the same type for the second and third operands in conditional expressions. The complexity of the rules that determine the result type of a conditional expression can result in unintended type conversions. Consequently, the second and third operands of each conditional expression should have identical types.
7) Avoid inadvertent wrapping of loop counters. Unless coded properly, a while or for loop may execute forever, or until the counter wraps around and reaches its final value.
8) Strive for logical completeness. Software vulnerabilities can result when a programmer fails to consider all possible data states.
9) Do not confuse abstract object equality with reference equality. Naïve programmers often confuse the intent of the == operation with that of the Object.equals() method. This confusion is frequently evident in the context of processing of String objects.
10) Understand how escape characters are interpreted when strings are loaded. Many classes allow inclusion of escape sequences in character and string literals. Correct use of escape sequences in string literals requires understanding how the escape sequences are interpreted by the Java compiler, as well as how they are interpreted by any subsequent processor.
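To illustrate guideline 5, here is a minimal sketch (the file name is hypothetical):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFirstLine
{
    public static void main(String[] args) throws IOException
    {
        // The reader is closed automatically even if readLine() throws,
        // and an exception from close() will not mask the original exception
        try (BufferedReader reader = new BufferedReader(new FileReader("data.txt")))
        {
            System.out.println(reader.readLine());
        }
    }
}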
String Deduplication in Java 8
Java 8 Update 20 has introduced a new feature called "String deduplication" which can be used to save the memory lost to duplicate String objects in a Java application, which can improve the performance of your Java application and prevent java.lang.OutOfMemoryError if your application makes heavy use of String. If you have profiled a Java application to check which objects take the bulk of memory, you will often find char[] objects at the top of the list, which are nothing but the internal character arrays used by String objects.
Since Java 7, String has stopped sharing its character array with substrings, so the memory occupied by String objects has gone up, which has made the problem even worse.
String deduplication tries to bridge that gap. It reduces the memory footprint of String objects on the Java heap space by taking advantage of the fact that many String objects are identical. Instead of each String object pointing to its own character array, identical String objects can point to the same character array.
By the way, this is not exactly the same as it was before Java 7 Update 6, where a substring also pointed into the same character array, but it can greatly reduce the memory occupied by duplicate Strings in the JVM. Anyway, in this article you will see how you can enable this feature in Java 8 to reduce the memory consumed by duplicate String objects.
How to enable String deduplication in Java 8
String deduplication is not enabled by default in the Java 8 JVM. You can enable the String deduplication feature by using the -XX:+UseStringDeduplication option. Unfortunately, String deduplication is only available for the G1 garbage collector, so if you are not using G1 GC then you cannot use the String deduplication feature. This means just providing -XX:+UseStringDeduplication will not work; you also need to turn on the G1 garbage collector using the -XX:+UseG1GC option. String deduplication also doesn't consider relatively young Strings for processing. The minimal age of a processed String is controlled by the -XX:StringDeduplicationAgeThreshold option, whose default value is 3.
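For example, a hypothetical application MyApp could be started with deduplication enabled and statistics printed like this:
java -XX:+UseG1GC -XX:+UseStringDeduplication -XX:+PrintStringDeduplicationStatistics MyApp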
Important points
Here are some of the important points about the String deduplication feature of Java 8:
1) This option is only available from the Java 8 Update 20 JDK release.
2) This feature will only work along with the G1 garbage collector; it will not work with other garbage collectors, e.g. the Concurrent Mark Sweep GC.
3) You need to provide both the -XX:+UseG1GC and -XX:+UseStringDeduplication JVM options to enable this feature; the first one enables the G1 garbage collector and the second one enables the String deduplication feature within G1 GC.
4) You can optionally use the -XX:+PrintStringDeduplicationStatistics JVM option to analyze what is happening through the command line.
5) Not every String is eligible for deduplication; in particular, young String objects are not considered, but you can control this by using the -XX:StringDeduplicationAgeThreshold option to change when Strings become eligible for deduplication.
6) It is observed that in general this feature may decrease heap usage by about 10%, which is very good considering you don't have to do any coding or refactoring.
7) String deduplication runs as a background task without stopping your application.
Wednesday, February 22, 2017
Types of NoSQL Database/Datastore
• Wide Row - Also known as wide-column stores, these databases store data in rows, and users are able to perform some query operations via column-based access. A wide-row store offers very high performance and a highly scalable architecture. Examples include: Cassandra, HBase, and Google BigTable.
• Columnar - Also known as column-oriented stores. Here the columns of all the rows are stored together on disk. A great fit for analytical queries because it reduces disk seeks and encourages array-like processing. Examples include: Amazon Redshift, Google BigQuery, and Teradata (with column partitioning).
• Key/Value - These NoSQL databases are some of the least complex, as all of the data within consists of an indexed key and a value. Examples include: Amazon DynamoDB, Riak, and Oracle NoSQL Database.
• Document - Expands on the basic idea of key-value stores, where "documents" are more complex, in that they contain data and each document is assigned a unique key, which is used to retrieve the document. These are designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data. Examples include: MongoDB and CouchDB.
• Graph - Designed for data whose relationships are well represented as a graph structure and has elements that are interconnected, with an undetermined number of relationships between them. Examples include: Neo4J, OrientDB, and TitanDB.
Tuesday, February 21, 2017
Why has PermGen been removed from Java 8?
The Java Virtual Machine (JVM) uses an internal representation of its classes containing per-class metadata such as class hierarchy information, method data and information (such as bytecodes, stack and variable sizes), the runtime constant pool, resolved symbolic references, and vtables.
In the past (when custom class loaders weren’t that common), the classes were mostly “static” and were infrequently unloaded or collected, and hence were labeled “permanent”. Also, since the classes are a part of the JVM implementation and not created by the application they are considered “non-heap” memory.
For the HotSpot JVM prior to JDK 8, these "permanent" representations would live in an area called the "permanent generation". This permanent generation was contiguous with the Java heap and was limited to -XX:MaxPermSize, which had to be set on the command line before starting the JVM or would default to 64M (85M for 64-bit scaled pointers). The collection of the permanent generation was tied to the collection of the old generation, so whenever either got full, both the permanent generation and the old generation would be collected. One of the obvious problems that you may be able to call out right away is the dependency on -XX:MaxPermSize: if the class metadata size goes beyond the bounds of -XX:MaxPermSize, your application will run out of memory and you will encounter an OOM (Out of Memory) error.
Following are the drawbacks of PermGen:
- Fixed size at startup – difficult to tune.
- Internal HotSpot types were Java objects: could move with full GC, opaque, not strongly typed, hard to debug, needed meta-metadata.
- Simplify full collections: special iterators for metadata for each collector.
- Want to deallocate class data concurrently and not during GC pause.
- Enable future improvements that were limited by PermGen.
The Permanent Generation (PermGen) space has been completely removed and replaced by a new space called Metaspace. The consequences of the PermGen removal are that the PermSize and MaxPermSize JVM arguments are ignored and you will never get a java.lang.OutOfMemoryError: PermGen space error.
Advantages of MetaSpace
- Take advantage of a Java Language Specification property: classes and associated metadata lifetimes match the class loader's.
- Per loader storage area – Metaspace
- Linear allocation only
- No individual reclamation (except for RedefineClasses and class loading failure)
- No GC scan or compaction
- No relocation for metaspace objects
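Although Metaspace grows dynamically by default, it can still be tuned and capped; for example, a hypothetical MyApp could be started with an initial and maximum Metaspace size like this (the sizes are illustrative):
java -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=512m MyApp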
Monday, August 8, 2016
Java Code for Converting Text to Speech
Jars Required
1) cmudict04.jar
2) cmulex.jar
3) cmutimelex.jar
4) cmu_time_awb.jar
5) cmu_us_kal.jar
6) en_us.jar
7) freetts-jsapi10.jar
8) freetts.jar
9) mbrola.jar
Code for Text to Speech
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import com.sun.speech.freetts.Voice;
import com.sun.speech.freetts.VoiceManager;

public class TextToSpeech {

    // Default voice is Kevin16
    private static final String VOICENAME = "kevin16";

    public static void main(String[] args) {
        Voice voice;
        // Taking an instance of voice from the VoiceManager factory.
        VoiceManager voiceManager = VoiceManager.getInstance();
        voice = voiceManager.getVoice(VOICENAME);
        // Allocating the voice
        voice.allocate();
        // Words per minute
        voice.setRate(120);
        voice.setPitch(100);
        System.out.print("Enter your text: ");
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        try {
            // Ready to speak
            voice.speak(br.readLine());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Saturday, July 30, 2016
How to decide whether to use newCachedThreadPool or newFixedThreadPool?
newFixedThreadPool()
Creates a thread pool that reuses a fixed number of threads...
This means that if you ask for 2 threads, it will start 2 threads and never start 3.
A FixedThreadPool does have its advantages when you do in fact want to work with a fixed number of threads, since then you can submit any number of tasks to the executor service while knowing that the number of threads will be maintained at the level you specified. If you explicitly want to grow the number of threads, then this is not the appropriate choice.
This does, however, mean that one issue you may have with the CachedThreadPool is limiting the number of threads that are running concurrently. The CachedThreadPool will not limit them for you, so you may need to write your own code to ensure that you do not run too many threads. This really depends on the design of your application and how tasks are submitted to the executor service.
On the other hand, the newCachedThreadPool()
Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available.
In your case, if you only have 2 threads to run, either will work fine since you will only be submitting 2 jobs to your pool. However, if you wanted to submit all 20 jobs at once but only have 2 jobs running at one time, you should use a newFixedThreadPool(2). Otherwise all 20 jobs will run at the same time, which may not be optimal depending on how many CPUs you have.
Typically, use newCachedThreadPool() when you need threads to be spawned immediately, even if all of the threads currently running are busy. I recently used it when I was spawning timer tasks. The number of concurrent jobs is immaterial because I never spawn very many, but I want them to run when they are requested, and I want them to re-use dormant threads.
Use newFixedThreadPool() when you want to limit the number of concurrent tasks running at any one point to maximize performance and not swamp the server. For example, if I am processing 100k lines from a file, one line at a time, I don't want each line to start a new thread, but I want some level of concurrency, so I allocate (for example) 10 fixed threads to run the tasks until the pool is exhausted.
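A minimal sketch of the fixed-pool case described above (the task bodies are illustrative):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolDemo
{
    public static void main(String[] args)
    {
        // At most 2 of the 20 submitted jobs run concurrently; the rest wait in the queue
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 20; i++)
        {
            final int jobId = i;
            pool.submit(() -> System.out.println("Running job " + jobId
                    + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown(); // stop accepting new tasks; queued tasks still run
    }
}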
Saturday, July 9, 2016
Internals of Garbage Collectors
Garbage collection works by employing several GC algorithms, e.g. Mark and Sweep. There are different kinds of garbage collectors available in Java to collect different areas of heap memory; e.g., you have serial, parallel, and concurrent garbage collectors in Java. A new collector called G1 (Garbage First) was also introduced in JDK 1.7. The first step to learning about GC is to understand when an object becomes eligible for garbage collection.
Since the JVM provides memory management, Java developers only care about creating objects; they don't care about cleaning up, as that is done by the garbage collector. However, it can only collect objects which have no live strong reference or are not reachable from any thread. If an object which is supposed to be collected still lives in memory due to an unintentional strong reference, that is known as a memory leak in Java. ThreadLocal variables in a Java web application can easily cause memory leaks.
Coming to garbage collection, let's note a few important points about the garbage collector:
1) Objects are created on the heap in Java irrespective of their scope, e.g. local or member variables, while it's worth noting that class variables or static members are created in the method area of the Java memory space, and both the heap and the method area are shared between different threads.
2) Garbage collection is a mechanism provided by the Java Virtual Machine to reclaim heap space from objects which are eligible for garbage collection.
3) Garbage collection relieves the Java programmer from memory management, which is an essential part of C++ programming, and gives more time to focus on business logic.
4) Garbage collection in Java is carried out by a daemon thread called the Garbage Collector.
5) Before removing an object from memory, the garbage collection thread invokes the finalize() method of that object, giving it an opportunity to perform any sort of cleanup required.
6) You as a Java programmer cannot force garbage collection in Java; it will only trigger if the JVM thinks it needs a garbage collection based on the Java heap size.
7) There are methods like System.gc() and Runtime.gc() which are used to send a garbage collection request to the JVM, but it's not guaranteed that garbage collection will happen.
8) If there is no memory space for creating a new object in the heap, the Java Virtual Machine throws OutOfMemoryError: Java heap space.
9) J2SE 5 (Java 2 Standard Edition) adds a new feature called Ergonomics; the goal of ergonomics is to provide good performance from the JVM with a minimum of command-line tuning.
Now the question is: when does an object become eligible for garbage collection?
An object becomes eligible for garbage collection (GC) if it is not reachable from any live thread or by any static reference. In other words, you can say that an object becomes eligible for garbage collection if all its references are null. Cyclic dependencies are not counted as references, so if object A has a reference to object B and object B has a reference to object A, and they don't have any other live reference, then both objects A and B will be eligible for garbage collection.
Generally, an object becomes eligible for garbage collection in Java in the following cases (a short sketch follows the list):
1) All references to that object are explicitly set to null, e.g. object = null.
2) The object is created inside a block and the reference goes out of scope once control exits that block.
3) The parent object is set to null: if an object holds a reference to another object and you set the container object's reference to null, the child or contained object automatically becomes eligible for garbage collection.
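A minimal, hypothetical sketch of cases 1 and 3:
public class GcEligibilityDemo
{
    static class Child { }

    static class Parent
    {
        Child child = new Child();
    }

    public static void main(String[] args)
    {
        // Case 1: explicit nulling makes the Parent (and its Child) unreachable
        Parent p = new Parent();
        p = null; // both objects are now eligible for GC

        // Case 3: nulling the container's reference orphans the contained object
        Parent q = new Parent();
        q.child = null; // the Child created in the constructor is now eligible
    }
}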
Heap Generations for Garbage Collection in Java
Java objects are created in the heap, and the heap is divided into three parts or generations for the sake of garbage collection in Java; these are called the young generation, the tenured or old generation, and the Perm area of the heap. The young generation is further divided into three parts known as the Eden space, Survivor 1, and Survivor 2 spaces. When an object is first created in the heap, it gets created in the young generation inside the Eden space, and after surviving subsequent minor garbage collections it gets moved to Survivor 1 and then Survivor 2, before a major garbage collection moves it to the old or tenured generation.
The permanent generation of the heap, or Perm area, is somewhat special: it is used to store metadata related to classes and methods in the JVM, and it also hosts the String pool provided by the JVM, as discussed in my String tutorial on why String is immutable in Java. There are many opinions about whether garbage collection in Java happens in the Perm area of the Java heap or not; as per my knowledge this is JVM-dependent, and it happens at least in Sun's implementation of the JVM. You can also try this by just creating millions of Strings and watching for garbage collection or OutOfMemoryError.
Types of Garbage Collector in Java
The Java runtime (J2SE 5) provides various types of garbage collection in Java which you can choose based upon your application's performance requirements. Java 5 adds three additional garbage collectors besides the serial garbage collector. Each is a generational garbage collector which has been implemented to increase the throughput of the application or to reduce garbage collection pause times.
1) Throughput garbage collector: this collector uses a parallel version of the young generation collector. It is used if the -XX:+UseParallelGC option is passed to the runtime via the JVM command line. The tenured generation collector is the same as the serial collector.
2) Concurrent low pause collector: this collector is used if -Xincgc or -XX:+UseConcMarkSweepGC is passed on the command line. It is also referred to as the Concurrent Mark Sweep (CMS) garbage collector. The concurrent collector is used to collect the tenured generation and does most of the collection concurrently with the execution of the application; the application is paused for short periods during the collection. A parallel version of the young generation copying collector is used with the concurrent collector. The Concurrent Mark Sweep garbage collector is the most widely used garbage collector in Java, and it uses an algorithm that first marks the objects which need to be collected when garbage collection triggers.
3) The incremental (sometimes called train) low pause collector: this collector is used only if -XX:+UseTrainGC is passed on the command line. It has not changed since Java 1.4.2 and is currently not under active development; it will not be supported in future releases, so it is not recommended to use this garbage collector. Please see the 1.4.2 GC Tuning document for information on this collector.
An important point to note is that -XX:+UseParallelGC should not be used with -XX:+UseConcMarkSweepGC. Argument parsing in the J2SE platform starting with version 1.4.2 should only allow legal combinations of command-line options for garbage collectors, but earlier releases may not detect all illegal combinations, and the results for an illegal combination are unpredictable.
Saturday, June 11, 2016
REST and SOAP: When Should I Use Which Web Services
Simple Object Access Protocol (SOAP) and Representational State Transfer (REST) are two answers to the same question: how to access Web services. The choice initially may seem easy, but at times it can be surprisingly difficult.
SOAP is a standards-based Web services access protocol that has been around for a while and enjoys all of the benefits of long-term use. Originally developed by Microsoft, SOAP really isn’t as simple as the acronym would suggest.
REST is the newcomer to the block. It seeks to fix the problems with SOAP and provide a truly simple method of accessing Web services. However, sometimes SOAP is actually easier to use; sometimes REST has problems of its own. Both techniques have issues to consider when deciding which protocol to use.
SOAP is a more rigid set of messaging patterns than REST. The rules in SOAP are important because without these rules you can't achieve any level of standardization. REST as an architectural style does not mandate that kind of rigid message processing and is naturally more flexible. Both SOAP and REST rely on well-established rules that everyone has agreed to abide by in the interest of exchanging information.
A Quick Overview of SOAP
SOAP relies exclusively on XML to provide messaging services. Microsoft originally developed SOAP to take the place of older technologies that don’t work well on the Internet such as the Distributed Component Object Model (DCOM) and Common Object Request Broker Architecture (CORBA). These technologies fail because they rely on binary messaging; the XML messaging that SOAP employs works better over the Internet.
After an initial release, Microsoft submitted SOAP to the Internet Engineering Task Force (IETF) where it was standardized. SOAP is designed to support expansion, so it has all sorts of other acronyms and abbreviations associated with it, such as WS-Addressing, WS-Policy, WS-Security, WS-Federation, WS-ReliableMessaging, WS-Coordination, WS-AtomicTransaction, and WS-RemotePortlets. In fact, you can find a whole laundry list of these standards on Web Services Standards.
The point is that SOAP is highly extensible, but you only use the pieces you need for a particular task. For example, when using a public Web service that’s freely available to everyone, you really don’t have much need for WS-Security.
The XML used to make requests and receive responses in SOAP can become extremely complex. In some programming languages, you need to build those requests manually, which becomes problematic because SOAP is intolerant of errors. However, other languages can use shortcuts that SOAP provides; that can help you reduce the effort required to create the request and to parse the response. In fact, when working with .NET languages, you never even see the XML.
Part of the magic is the Web Services Description Language (WSDL). This is another file that’s associated with SOAP. It provides a definition of how the Web service works, so that when you create a reference to it, the IDE can completely automate the process. So, the difficulty of using SOAP depends to a large degree on the language you use.
One of the most important SOAP features is built-in error handling. If there’s a problem with your request, the response contains error information that you can use to fix the problem. Given that you might not own the Web service, this particular feature is extremely important; otherwise you would be left guessing as to why things didn’t work. The error reporting even provides standardized codes so that it’s possible to automate some error handling tasks in your code.
An interesting SOAP feature is that you don’t necessarily have to use it with the HyperText Transfer Protocol (HTTP) transport. There’s an actual specification for using SOAP over Simple Mail Transfer Protocol (SMTP) and there isn’t any reason you can’t use it over other transports. In fact, developers in some languages, such as Python and PHP, are doing just that.
A Quick Overview of REST
Many developers found SOAP cumbersome and hard to use. For example, working with SOAP in JavaScript means writing a ton of code to perform extremely simple tasks because you must create the required XML structure absolutely every time.
REST provides a lighter-weight alternative. Instead of using XML to make a request, REST relies on a simple URL in many cases. In some situations you must provide additional information in special ways, but most Web services using REST rely exclusively on the URL approach to obtain the needed information. REST uses the HTTP 1.1 verbs (GET, POST, PUT, PATCH, and DELETE) to perform tasks.
Unlike SOAP, REST doesn't have to use XML to provide the response. You can find REST-based Web services that output the data in Comma Separated Values (CSV), JavaScript Object Notation (JSON), and Really Simple Syndication (RSS). The point is that you can obtain the output you need in a form that's easy to parse within the language you are using for your application.
Deciding Between SOAP and REST
Before you spend hours fretting over the choice between SOAP and REST, consider that some Web services support one and some the other. Unless you plan to create your own Web service, the decision of which protocol to use may already be made for you. A few Web services, such as Amazon, support both. The focus of your decision often centers on which Web service best meets your needs, rather than which protocol to use.
SOAP vs REST
SOAP is definitely the heavyweight choice for Web service access. It provides the following advantages when compared to REST:
- Language, platform, and transport independent (REST requires use of HTTP)
- Works well in distributed enterprise environments (REST assumes direct point-to-point communication)
- Standardized
- Provides significant pre-built extensibility in the form of the WS* standards
- Built-in error handling
- Automation when used with certain language products
REST is easier to use for the most part and is more flexible. It provides the following advantages when compared to SOAP:
- No expensive tools required to interact with the Web service
- Smaller learning curve
- Efficient (SOAP uses XML for all messages, REST can use smaller message formats)
- Fast (no extensive processing required)
- Closer to other Web technologies in design philosophy
HTTP Methods for RESTful Services
For more details on the differences between REST and SOAP, go through Rest vs Soap.
RESTful web services rely heavily on HTTP by design. They use the different HTTP methods to perform their job and use HTTP response codes to inform clients about the success or failure of a particular request. REST stands for Representational State Transfer, and it uses HTTP to allow two systems to communicate via remote calls. A RESTful web service is a collection of REST URIs that point to resources; a URI can point to a single resource or to a collection of resources. The server-side sketch below shows how these methods typically map onto a resource.
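As a rough server-side sketch (assuming a JAX-RS runtime such as Jersey is on the classpath; the BookResource class, the /books path, and the in-memory map are all hypothetical):
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.ws.rs.*;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
@Path("/books")
public class BookResource
{
    private static final Map<String, String> BOOKS = new ConcurrentHashMap<>();
    // GET /books/{id} -> 200 with the representation, or 404 if unknown
    @GET
    @Path("{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Response read(@PathParam("id") String id)
    {
        String book = BOOKS.get(id);
        return (book == null)
            ? Response.status(Response.Status.NOT_FOUND).build()
            : Response.ok(book).build();
    }
    // POST /books -> create a subordinate resource; 201 plus a Location header
    @POST
    @Consumes(MediaType.APPLICATION_JSON)
    public Response create(String body)
    {
        String id = Integer.toString(BOOKS.size() + 1); // naive ID assignment, fine for a sketch
        BOOKS.put(id, body);
        return Response.created(URI.create("/books/" + id)).build();
    }
    // DELETE /books/{id} -> 204 on success, 404 if it was already gone
    @DELETE
    @Path("{id}")
    public Response delete(@PathParam("id") String id)
    {
        return (BOOKS.remove(id) == null)
            ? Response.status(Response.Status.NOT_FOUND).build()
            : Response.noContent().build();
    }
}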
POST :- The POST verb is most-often utilized to **create** new resources. In particular, it's used to create subordinate resources, that is, resources subordinate to some other (e.g. parent) resource. In other words, when creating a new resource, POST to the parent, and the service takes care of associating the new resource with the parent, assigning an ID (the new resource URI), and so on.
On successful creation, return HTTP status 201 (CREATED), along with a Location header containing a link to the newly-created resource.
POST is neither safe nor idempotent. It is therefore recommended for non-idempotent resource requests. Making two identical POST requests will most-likely result in two resources containing the same information.
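As a minimal client-side sketch (assuming Java 11+'s java.net.http.HttpClient; the http://localhost:8080/books endpoint and the JSON body are hypothetical), a create-via-POST with the 201/Location handling described above might look like:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class PostExample
{
    public static void main(String[] args) throws Exception
    {
        HttpClient client = HttpClient.newHttpClient();
        // POST to the parent collection; the service assigns the new resource's ID
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/books"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString("{\"title\":\"REST in Practice\"}"))
            .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() == 201)
        {
            // the Location header points at the newly-created resource
            System.out.println("Created at: " + response.headers().firstValue("Location").orElse("?"));
        }
    }
}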
GET :- The HTTP GET method is used to **read** (or retrieve) a representation of a resource. In the “happy” (or non-error) path, GET returns a representation in XML or JSON and an HTTP response code of 200 (OK). In an error case, it most often returns a 404 (NOT FOUND) or 400 (BAD REQUEST).
According to the design of the HTTP specification, GET (along with HEAD) requests are used only to read data and not change it. Therefore, when used this way, they are considered safe. That is, they can be called without risk of data modification or corruption—calling it once has the same effect as calling it 10 times, or none at all. Additionally, GET (and HEAD) is idempotent, which means that making multiple identical requests ends up having the same result as a single request.
Do not expose unsafe operations via GET—it should never modify any resources on the server.
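A matching read sketch under the same assumptions (Java 11+ HttpClient, hypothetical endpoint); note that GET carries no body and changes nothing on the server:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class GetExample
{
    public static void main(String[] args) throws Exception
    {
        HttpClient client = HttpClient.newHttpClient();
        // GET only reads: safe (no side effects) and idempotent (repeatable)
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/books/1"))
            .GET()
            .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() == 200)
            System.out.println("Representation: " + response.body());
        else
            System.out.println("Error status: " + response.statusCode()); // e.g. 404 or 400
    }
}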
PUT :- PUT is most-often utilized for **update** capabilities, PUT-ing to a known resource URI with the request body containing the newly-updated representation of the original resource.
However, PUT can also be used to create a resource in the case where the resource ID is chosen by the client instead of by the server. In other words, if the PUT is to a URI that contains the value of a non-existent resource ID. Again, the request body contains a resource representation. Many feel this is convoluted and confusing. Consequently, this method of creation should be used sparingly, if at all.
Alternatively, use POST to create new resources and provide the client-defined ID in the body representation, presumably to a URI that doesn't include the ID of the resource (see POST above).
On successful update, return 200 (or 204 if not returning any content in the body) from a PUT. If using PUT for create, return HTTP status 201 on successful creation. A body in the response is optional—providing one consumes more bandwidth. It is not necessary to return a link via a Location header in the creation case since the client already set the resource ID.
PUT is not a safe operation, in that it modifies (or creates) state on the server, but it is idempotent. In other words, if you create or update a resource using PUT and then make that same call again, the resource is still there and still has the same state as it did with the first call.
If, for instance, calling PUT on a resource increments a counter within the resource, the call is no longer idempotent. Sometimes that happens and it may be enough to document that the call is not idempotent. However, it's recommended to keep PUT requests idempotent. It is strongly recommended to use POST for non-idempotent requests.
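A sketch of an idempotent update under the same assumptions (Java 11+ HttpClient, hypothetical endpoint); sending this request twice leaves the resource in the same state:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class PutExample
{
    public static void main(String[] args) throws Exception
    {
        HttpClient client = HttpClient.newHttpClient();
        // PUT the complete new representation to a known resource URI
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/books/1"))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString("{\"title\":\"Updated Title\"}"))
            .build();
        HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
        // expect 200 (OK), or 204 (NO CONTENT) if the server returns no body
        System.out.println("Status: " + response.statusCode());
    }
}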
PATCH :- PATCH is used for **modify** capabilities. The PATCH request only needs to contain the changes to the resource, not the complete resource.
This resembles PUT, but the body contains a set of instructions describing how a resource currently residing on the server should be modified to produce a new version. This means that the PATCH body should not just be a modified part of the resource, but in some kind of patch language like JSON Patch or XML Patch.
PATCH is neither safe nor idempotent. However, a PATCH request can be issued in such a way as to be idempotent, which also helps prevent bad outcomes from collisions between two PATCH requests on the same resource in a similar time frame. Collisions from multiple PATCH requests may be more dangerous than PUT collisions because some patch formats need to operate from a known base-point or else they will corrupt the resource. Clients using this kind of patch application should use a conditional request such that the request will fail if the resource has been updated since the client last accessed the resource. For example, the client can use a strong ETag in an If-Match header on the PATCH request.
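A conditional PATCH sketch under the same assumptions (the ETag value is made up; in practice it would come from a previous GET of the same resource). HttpClient has no PATCH shortcut, so the verb is set via method():
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class PatchExample
{
    public static void main(String[] args) throws Exception
    {
        HttpClient client = HttpClient.newHttpClient();
        String etag = "\"abc123\""; // from an earlier GET of the resource
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/books/1"))
            .header("Content-Type", "application/json-patch+json") // JSON Patch body
            .header("If-Match", etag) // fail if someone else changed the resource first
            .method("PATCH", HttpRequest.BodyPublishers.ofString(
                "[{\"op\":\"replace\",\"path\":\"/title\",\"value\":\"New Title\"}]"))
            .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        // 200/204 on success; 412 (PRECONDITION FAILED) if the ETag no longer matches
        System.out.println("Status: " + response.statusCode());
    }
}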
DELETE :- DELETE is pretty easy to understand. It is used to **delete** a resource identified by a URI.
On successful deletion, return HTTP status 200 (OK) along with a response body, perhaps the representation of the deleted item (which often demands too much bandwidth) or a wrapped response, or return HTTP status 204 (NO CONTENT) with no response body. In other words, a 204 status with no body, or a JSEND-style response with HTTP status 200, are the recommended responses.
HTTP-spec-wise, DELETE operations are idempotent. If you DELETE a resource, it's removed. Repeatedly calling DELETE on that resource ends up the same: the resource is gone. If calling DELETE say, decrements a counter (within the resource), the DELETE call is no longer idempotent. As mentioned previously, usage statistics and measurements may be updated while still considering the service idempotent as long as no resource data is changed. Using POST for non-idempotent resource requests is recommended.
There is a caveat about DELETE idempotence, however. Calling DELETE on a resource a second time will often return a 404 (NOT FOUND), since it was already removed and is therefore no longer findable. By some opinions this makes DELETE operations no longer idempotent; however, the end state of the resource is the same. Returning a 404 is acceptable and communicates accurately the status of the call.
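A sketch of that caveat under the same assumptions (Java 11+ HttpClient, hypothetical endpoint): the second, identical DELETE often returns 404, yet the end state is unchanged:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class DeleteExample
{
    public static void main(String[] args) throws Exception
    {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("http://localhost:8080/books/1"))
            .DELETE()
            .build();
        // first call: expect 200 or 204; the resource is gone afterwards
        HttpResponse<Void> first = client.send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println("First DELETE: " + first.statusCode());
        // second, identical call: often 404, but the resource's end state is the same
        HttpResponse<Void> second = client.send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println("Second DELETE: " + second.statusCode());
    }
}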
Wednesday, May 11, 2016
Serialization in Java
Object serialization in Java is the process of converting an object into a binary format that can be persisted to disk or sent over the network to another running Java virtual machine; the reverse process of recreating the object from the binary stream is called deserialization. Java provides a Serialization API for serializing and deserializing objects, which includes java.io.Serializable, java.io.Externalizable, ObjectInputStream, and ObjectOutputStream.
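A minimal round-trip sketch (the Employee class and the employee.ser file name are made up for illustration):
import java.io.*;
// the class to be serialized must implement java.io.Serializable
class Employee implements Serializable
{
    // explicit serialVersionUID, so the class can evolve without breaking old streams
    private static final long serialVersionUID = 1L;
    String name;
    Employee(String name) { this.name = name; }
}
public class SerializationDemo
{
    public static void main(String[] args) throws Exception
    {
        // serialize: writeObject() converts the object graph into bytes
        try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("employee.ser")))
        {
            oos.writeObject(new Employee("Alice"));
        }
        // deserialize: readObject() returns Object, which must be cast to the correct type
        try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("employee.ser")))
        {
            Employee e = (Employee) ois.readObject();
            System.out.println("Restored: " + e.name);
        }
    }
}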
Question 1) What is the difference between Serializable and Externalizable interface in Java?
Externalizable provides the writeExternal() and readExternal() methods, which give us the flexibility to control the serialization mechanism ourselves instead of relying on Java's default serialization. A correct implementation of the Externalizable interface can drastically improve an application's performance.
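A sketch of the Externalizable approach (the Point class is hypothetical). Note that an Externalizable class must keep a public no-arg constructor, because the runtime instantiates the object before calling readExternal():
import java.io.*;
class Point implements Externalizable
{
    int x, y;
    // Externalizable requires a public no-arg constructor:
    // the runtime creates the object first, then calls readExternal()
    public Point() { }
    public Point(int x, int y) { this.x = x; this.y = y; }
    @Override
    public void writeExternal(ObjectOutput out) throws IOException
    {
        // you decide exactly what goes on the wire, and in what order
        out.writeInt(x);
        out.writeInt(y);
    }
    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException
    {
        // read fields back in the same order they were written
        x = in.readInt();
        y = in.readInt();
    }
}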
Question 2) How many methods does Serializable have? If it has no methods, then what is the purpose of the Serializable interface?
The Serializable interface lives in the java.io package and forms the core of the Java serialization mechanism. It has no methods, which is why it is also called a marker interface in Java. When your class implements java.io.Serializable, it becomes serializable, signalling to the runtime that the Java serialization mechanism may be used to serialize its objects.
Question 3) What is serialVersionUID? What would happen if you don't define this?
serialVersionUID is an ID stamped on an object when it gets serialized; it is normally computed from the structure of the class, and you can use the serialver tool to see the serialVersionUID of a serializable class. serialVersionUID is used for version control of objects. You can also declare serialVersionUID explicitly in your class file. The consequence of not declaring it is that when you add or modify any field in the class, already-serialized objects can no longer be recovered, because the serialVersionUID generated for the new class version and for the old serialized object will differ. The Java serialization process relies on a matching serialVersionUID to recover the state of a serialized object and throws java.io.InvalidClassException in case of a serialVersionUID mismatch.
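Declaring it explicitly is a one-liner, and the serialver tool shipped with the JDK prints the value the runtime would otherwise compute (Employee here is a sample class name):
class Employee implements java.io.Serializable
{
    // keep this value stable across versions of the class
    private static final long serialVersionUID = 1L;
}
// from the command line, after compiling: serialver Employee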
Question 4) Suppose that while serializing you want some of the members not to be serialized. How do you achieve that?
Declare the member either static or transient, based on your need, and it will not be included in the Java serialization process.
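A sketch showing the effect (the User class is made up): after deserialization a transient field comes back as its type's default value, null here:
import java.io.*;
class User implements Serializable
{
    private static final long serialVersionUID = 1L;
    String name;
    transient String password; // excluded from serialization; null after deserialization
    User(String name, String password) { this.name = name; this.password = password; }
}
public class TransientDemo
{
    public static void main(String[] args) throws Exception
    {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer))
        {
            oos.writeObject(new User("bob", "secret"));
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray())))
        {
            User u = (User) ois.readObject();
            System.out.println(u.name + ", password=" + u.password); // prints: bob, password=null
        }
    }
}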
Question 5) What will happen if one of the members in the class doesn't implement Serializable interface?
If you try to serialize an object of a class which implements Serializable, but the object includes a reference to a non-Serializable class, then a NotSerializableException will be thrown at runtime.
Question 6) If a class is Serializable but its super class is not, what will be the state of the instance variables inherited from the super class after deserialization?
The Java serialization process only continues down the object hierarchy while each class is Serializable, i.e. implements the Serializable interface. The values of instance variables inherited from a non-Serializable super class are initialized during deserialization by calling the no-argument constructor of that non-Serializable super class. Once that constructor chain has started it cannot be stopped, so the constructors of all classes higher in the hierarchy will be executed, even ones that implement the Serializable interface.
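A sketch that makes this visible (Parent and Child are made up): the non-Serializable parent's no-arg constructor runs during deserialization and resets the inherited field:
import java.io.*;
class Parent // does NOT implement Serializable
{
    int parentValue = 0;
    Parent() { System.out.println("Parent() runs during deserialization"); }
}
class Child extends Parent implements Serializable
{
    private static final long serialVersionUID = 1L;
    int childValue;
}
public class SuperClassDemo
{
    public static void main(String[] args) throws Exception
    {
        Child c = new Child();
        c.parentValue = 42; // will NOT survive: Parent is not Serializable
        c.childValue = 7;   // will survive
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer))
        {
            oos.writeObject(c);
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray())))
        {
            Child restored = (Child) ois.readObject();
            // parentValue is back to 0 (set by Parent's constructor); childValue is still 7
            System.out.println(restored.parentValue + ", " + restored.childValue);
        }
    }
}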
Question 7) Can you Customize Serialization process or can you override default Serialization process in Java?
The answer is yes, you can. We all know that to serialize an object ObjectOutputStream.writeObject(saveThisObject) is invoked, and to read an object back ObjectInputStream.readObject() is invoked, but there is one more thing the Java Virtual Machine provides: you can define these two methods in your own class. If you define a writeObject() and a readObject() method in your class, the JVM will invoke them instead of applying the default serialization mechanism, and you can customize the behavior of serialization and deserialization by doing any kind of pre- or post-processing there. An important point is to make these methods private, to avoid their being inherited, overridden, or overloaded. Since only the Java Virtual Machine can call a private method, the integrity of your class remains intact and Java serialization works as normal.
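A sketch of these hooks (the Account class and its toy obfuscation are made up): both methods are private, call the default mechanism, and add pre/post-processing around it:
import java.io.*;
class Account implements Serializable
{
    private static final long serialVersionUID = 1L;
    String owner;
    transient String pin; // excluded from default serialization; handled manually below
    Account(String owner, String pin) { this.owner = owner; this.pin = pin; }
    // private, so it cannot be inherited, overridden, or overloaded
    private void writeObject(ObjectOutputStream oos) throws IOException
    {
        oos.defaultWriteObject(); // serialize non-transient fields as usual
        // toy pre-processing: store an obfuscated form of the pin
        oos.writeObject(new StringBuilder(pin).reverse().toString());
    }
    private void readObject(ObjectInputStream ois) throws IOException, ClassNotFoundException
    {
        ois.defaultReadObject(); // restore non-transient fields as usual
        // toy post-processing: undo the obfuscation
        pin = new StringBuilder((String) ois.readObject()).reverse().toString();
    }
}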
Question 8) Suppose the super class of a new class implements the Serializable interface. How can you avoid the new class being serialized?
If the super class of a class already implements the Serializable interface in Java, then the subclass is already serializable; since you cannot un-implement an interface, it is not really possible to make it a non-serializable class. There is, however, a way to avoid serialization of the new class: implement the writeObject() and readObject() methods in your class and throw NotSerializableException from them. This is another benefit of being able to customize the Java serialization process, as described in the previous question, and it is normally asked as a follow-up question as the interview progresses.
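A sketch (both classes are made up):
import java.io.*;
class SerializableParent implements Serializable
{
    private static final long serialVersionUID = 1L;
}
// inherits Serializable, but opts out by failing fast from both hooks
class NonSerializableChild extends SerializableParent
{
    private void writeObject(ObjectOutputStream oos) throws IOException
    {
        throw new NotSerializableException("NonSerializableChild must not be serialized");
    }
    private void readObject(ObjectInputStream ois) throws IOException
    {
        throw new NotSerializableException("NonSerializableChild must not be deserialized");
    }
}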
Question 9) Which methods are used during Serialization and DeSerialization process in Java?
Java serialization is done by the java.io.ObjectOutputStream class. That class is a filter stream which wraps a lower-level byte stream to handle the serialization mechanism. To store any object via the serialization mechanism we call ObjectOutputStream.writeObject(saveThisObject), and to deserialize that object we call ObjectInputStream.readObject(). The call to writeObject() triggers the serialization process. One important thing to note about the readObject() method is that it reads bytes from the persisted form, creates the object from those bytes, and returns an Object, which needs to be cast to the correct type.
Question 10) Suppose you have a class which you serialized and stored in persistent storage, and later you modified that class to add a new field. What will happen if you deserialize the already-serialized object?
It depends on whether the class has its own serialVersionUID or not. As we know from the question above, if we don't provide a serialVersionUID in our code the Java runtime will generate one, normally computed from the structure of the class. By adding a new field there is a chance that the serialVersionUID generated for the new class version will not match that of the already-serialized object, in which case the Java Serialization API will throw java.io.InvalidClassException. This is the reason it's recommended to declare your own serialVersionUID in code and make sure to keep it the same for a given class.
Question 11) What are the compatible changes and incompatible changes in the Java serialization mechanism?
The real challenge is what happens to already-serialized objects when the class structure changes, for example by adding or removing a field or method. As per the Java serialization specification, adding a field or method counts as a compatible change, while changing the class hierarchy or un-implementing the Serializable interface counts as an incompatible change. For the complete list of compatible and incompatible changes I would advise reading the Java serialization specification.
Question 12) Can we transfer a serialized object via a network?
Yes, you can transfer a serialized object via a network, because a Java serialized object remains in the form of bytes, which can be transmitted over the network. You can also store a serialized object on disk or in a database as a Blob.
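A client-side sketch (the host and port are placeholders): since a serialized object is just bytes, an ObjectOutputStream can be layered directly over a socket's output stream:
import java.io.ObjectOutputStream;
import java.net.Socket;
public class SendOverNetwork
{
    public static void main(String[] args) throws Exception
    {
        try (Socket socket = new Socket("example.com", 9000);
             ObjectOutputStream oos = new ObjectOutputStream(socket.getOutputStream()))
        {
            // any Serializable object can be written; the receiver reads it back
            // with an ObjectInputStream wrapped around its own input stream
            oos.writeObject(new java.util.Date());
        }
    }
}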