Storm event processor – GC log file per worker

For the last three months, I have been working with a new team, building a Big Data analytics product for the telecom domain.

The Storm event processor is one of the main frameworks we use, and it is really great. You can read more details in its official documentation (which has been improved).

Storm uses Workers to run your job. Each Worker is a single JVM that is managed internally by Storm (start, restart if unresponsive, move the Worker to another node of the cluster, etc.). For a single job you can run many Workers on your cluster (Storm decides how to distribute your Workers across the cluster nodes). By “node” I mean a running OS, either on a VM or on a physical machine.

The tricky point here is that all Workers on a node read the same configuration file (STORM_HOME/conf/storm.yaml), even if they are processing different kinds of jobs. Additionally, there is a single parameter in this file (worker.childopts) that all Workers on the same node use to initialize their JVMs (that is, to set the JVM options).

As we want to know how GC performs in each Worker, we need to monitor the GC log of each Worker/JVM.

As I said, the problem is that all Workers on a node read the same parameter from the same configuration file to initialize their JVMs, so it is not trivial to use a different GC log file for each Worker/JVM.

Fortunately, the Storm developers have exposed a “variable” that solves this problem. This variable is named “ID” and it is unique for each Worker on a node (the same Worker ID can exist on different nodes).

For the Worker JVM options, we use this entry in our “storm.yaml” file:

worker.childopts: "-Xmx1024m -XX:MaxPermSize=256m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -Xloggc:/opt/storm/logs/gc-storm-worker-%ID%.log"

Be aware that you have to add “%” before and after the “ID” string, so that Storm recognizes it as an internal variable.
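
To make the effect of the placeholder concrete, here is a minimal Java sketch. Storm performs this substitution internally, so this is not Storm code: the class name, the method name and the two worker IDs are hypothetical and only illustrate how one shared option string can end up as a different GC log file per Worker.

```java
// Illustration only: Storm does the %ID% substitution itself.
// Class name, method name and the worker IDs are hypothetical.
final class GcLogPathSketch {

    // The -Xloggc part of the worker.childopts entry shown above.
    static final String TEMPLATE =
            "-Xloggc:/opt/storm/logs/gc-storm-worker-%ID%.log";

    // Replace every %ID% placeholder with the given worker ID.
    static String resolve(String template, String workerId) {
        return template.replace("%ID%", workerId);
    }

    public static void main(String[] args) {
        // Two made-up worker IDs on the same node.
        String[] workerIds = {"6700", "6701"};
        for (String id : workerIds) {
            System.out.println(resolve(TEMPLATE, id));
        }
    }
}
```

Each Worker JVM on the node then writes to its own file, so the logs do not overwrite each other.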

Additionally, for the Supervisor JVM options (one process on each node), we use this entry in our “storm.yaml” file:

supervisor.childopts: "-Xmx512m -XX:MaxPermSize=256m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -Xloggc:/opt/storm/logs/gc-storm-supervisor.log"

I have also included some memory settings (“-Xmx” and “-XX:MaxPermSize”), but they are just an example.

Please keep in mind that Storm requires Oracle HotSpot JDK 6 (JDK 7/8 is not yet supported). This is a serious drawback, but we hope it will be fixed soon.

Hope it helps,
Adrianos Dadis.

Democracy Requires Free Software

Posted in Big Data, Java, Software Development

Set WildFly binding address and shutdown using CLI

It’s very easy to bind WildFly to a hostname/IP using just command-line parameters.
I have a simple GNU/Linux box that I use to play with various things, and one of them is WildFly.

I start WildFly listening on a specific IP using these commands:

$> cd /opt/wildfly/wildfly-8.0.0.Beta1/bin
$> ./standalone.sh -c standalone-full.xml -b=192.168.1.10 -bmanagement=192.168.1.10  

As you can see, I use my IP in two parameters. The first one (-b) is the regular address on which the server listens to serve traffic, and the second one (-bmanagement) is the management address. The first parameter (-b) is the well-known parameter that was also used in JBoss 4.x.

To shut down the server, I use the new CLI of JBoss/WildFly:

$> cd /opt/wildfly/wildfly-8.0.0.Beta1/bin
$> ./jboss-cli.sh --connect controller=192.168.1.10 command=:shutdown  

The JBoss/WildFly CLI is a really useful tool, as you can use it to create/change/view various resources on the server (datasources, JMS destinations, etc.) and to deploy/undeploy applications.
The JBoss/WildFly CLI is very similar to the WebLogic Scripting Tool (WLST), but the JBoss CLI is Open Source: the knowledge is in the public space and does not belong to a company or to a closed group. This is the vital virtue of Open Source Software and Free Software: it teaches people to share and innovate at the same time.

Hope it helps,
Adrianos Dadis.

Democracy Requires Free Software

Posted in Administration, JBoss

Add Apache Camel and Spring as jboss modules in WildFly

These days I am playing with WildFly, Apache Camel and Spring.

As Panagiotis suggests, a simple way to communicate between EARs / WARs is to use the direct-vm component of Camel. There are many ways to achieve this, with or without Camel. Camel works like a charm in WildFly without any need for extra configuration. Camel is great!!!

To avoid packing all the required Spring and Camel JARs with my applications, I created two modules using the great JBoss Modules framework (which WildFly already uses). Then I can reference these two frameworks without packing all these JARs inside my applications (EAR/WAR).

Create Spring module

  • Go to WildFly home dir: $> cd /home/torun/jboss/wildfly/wildfly-8.0.0.Beta1
  • Create Spring module directory structure:
    • $> mkdir -p modules/org/springframework/3.2.5.RELEASE
  • Inside this new directory, create module.xml file with this content:
<module xmlns="urn:jboss:module:1.3" name="org.springframework" slot="3.2.5.RELEASE">
  <resources>
    <resource-root path="aopalliance-1.0.jar"/>
    <resource-root path="aspectjrt-1.7.4.jar"/>
    <resource-root path="aspectjtools-1.7.4.jar"/>
    <resource-root path="aspectjweaver-1.7.4.jar"/>
    <resource-root path="org.aspectj.matcher-1.7.4.jar"/>
    <resource-root path="spring-aop-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-aspects-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-beans-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-context-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-context-support-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-core-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-expression-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-jdbc-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-orm-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-oxm-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-tx-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-web-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-webmvc-3.2.5.RELEASE.jar"/>
    <resource-root path="spring-webmvc-portlet-3.2.5.RELEASE.jar"/>
  </resources>

  <dependencies>
    <module name="javaee.api"/>
    <module name="org.apache.commons.logging"/>
    <module name="org.jboss.vfs"/>
    <module name="org.hibernate"/>
    <module name="javax.el.api" export="true"/>
    <module name="com.sun.xml.bind" export="true"/>
  </dependencies>
</module>
  • Then add all the JARs mentioned as “resource-root” inside this new directory
  • You are DONE with Spring module!!!
  • Now you can reference spring module using the next line in your “jboss-deployment-structure.xml”, from your EAR/WAR:
    • <module name="org.springframework" slot="3.2.5.RELEASE"/>

Create Camel module

  • Create Camel module directory structure:
    • $> mkdir -p modules/org/apache/camel/2.12.1
  • Inside this new directory, create module.xml file with this content:
<module xmlns="urn:jboss:module:1.3" name="org.apache.camel" slot="2.12.1">
  <resources>
    <resource-root path="camel-core-2.12.1.jar"/>
    <resource-root path="camel-spring-2.12.1.jar"/>
    <resource-root path="jaxb-impl-2.2.6.jar"/>
  </resources>
  <dependencies>
    <module name="org.springframework" slot="3.2.5.RELEASE" />
    <module name="org.slf4j"/>
    <module name="javax.xml.bind.api"/>
    <module name="javax.api"/>
    <module name="sun.jdk" />
  </dependencies>
</module>
  • Then add all the JARs mentioned as “resource-root” inside this new directory
  • You are DONE with Camel module!!!
  • Now you can reference camel module using the next line in your “jboss-deployment-structure.xml”, from your EAR/WAR:
    • <module name="org.apache.camel" slot="2.12.1" />

You may be able to cut out a few JAR dependencies from the Spring or Camel module, but these are just my current settings and I know they work 🙂

One more important note. While I was trying to find the correct JARs for these modules, I ran into a few exceptions… So, if you forget a JAR or a module dependency, you may see one of these exceptions:

Caused by: java.lang.NoClassDefFoundError: sun/misc/Unsafe
    at org.apache.camel.com.googlecode.concurrentlinkedhashmap.ConcurrentHashMapV8.getUnsafe(ConcurrentHashMapV8.java:4136) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.com.googlecode.concurrentlinkedhashmap.ConcurrentHashMapV8.<clinit>(ConcurrentHashMapV8.java:4101) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap.<init>(ConcurrentLinkedHashMap.java:221) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap.<init>(ConcurrentLinkedHashMap.java:104) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap$Builder.build(ConcurrentLinkedHashMap.java:1634) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.util.LRUCache.<init>(LRUCache.java:83) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.util.LRUSoftCache.<init>(LRUSoftCache.java:68) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.impl.EndpointRegistry.<init>(EndpointRegistry.java:39) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.impl.DefaultCamelContext.<init>(DefaultCamelContext.java:234) [camel-core-2.12.1.jar:2.12.1]
    at org.apache.camel.spring.SpringCamelContext.<init>(SpringCamelContext.java:67) [camel-spring-2.12.1.jar:2.12.1]
    at org.apache.camel.spring.CamelContextFactoryBean.newCamelContext(CamelContextFactoryBean.java:356) [camel-spring-2.12.1.jar:2.12.1]
    at org.apache.camel.spring.CamelContextFactoryBean.createContext(CamelContextFactoryBean.java:350) [camel-spring-2.12.1.jar:2.12.1]
    at org.apache.camel.spring.CamelContextFactoryBean.getContext(CamelContextFactoryBean.java:361) [camel-spring-2.12.1.jar:2.12.1]
    at org.apache.camel.spring.CamelContextFactoryBean.getContext(CamelContextFactoryBean.java:80) [camel-spring-2.12.1.jar:2.12.1]
    at org.apache.camel.core.xml.AbstractCamelContextFactoryBean.getContext(AbstractCamelContextFactoryBean.java:518) [camel-spring-2.12.1.jar:2.12.1]
    at org.apache.camel.core.xml.AbstractCamelContextFactoryBean.afterPropertiesSet(AbstractCamelContextFactoryBean.java:160) [camel-spring-2.12.1.jar:2.12.1]
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.invokeInitMethods(AbstractAutowireCapableBeanFactory.java:1571) [spring-beans-3.2.5.RELEASE.jar:3.2.5.RELEASE]
    at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.initializeBean(AbstractAutowireCapableBeanFactory.java:1509) [spring-beans-3.2.5.RELEASE.jar:3.2.5.RELEASE]
    ... 23 more
Caused by: java.lang.ClassNotFoundException: sun.misc.Unsafe from [Module "org.apache.camel:2.12.1" from local module loader @1a6e5d5 (finder: local module finder @3b3402 (roots: /home/torun/jboss/wildfly/wildfly-8.0.0.Beta1/modules,/home/torun/jboss/wildfly/wildfly-8.0.0.Beta1/modules/system/layers/base))]
    at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:197) [jboss-modules.jar:1.3.0.Final]
    ...

OR

Caused by: java.lang.NoClassDefFoundError: org/w3c/dom/Node
    at java.lang.Class.getDeclaredConstructors0(Native Method) [rt.jar:1.7.0_40]
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2483) [rt.jar:1.7.0_40]
    at java.lang.Class.getConstructor0(Class.java:2793) [rt.jar:1.7.0_40]
    at java.lang.Class.getDeclaredConstructor(Class.java:2043) [rt.jar:1.7.0_40]
    at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:105)
    at org.springframework.beans.factory.xml.DefaultNamespaceHandlerResolver.resolve(DefaultNamespaceHandlerResolver.java:129)
    ... 29 more
Caused by: java.lang.ClassNotFoundException: org.w3c.dom.Node from [Module "org.apache.camel:2.12.1" from local module loader @1a6e5d5 (finder: local module finder @3b3402 (roots: /home/torun/jboss/wildfly/wildfly-8.0.0.Beta1/modules,/home/torun/jboss/wildfly/wildfly-8.0.0.Beta1/modules/system/layers/base))]
    at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:197) [jboss-modules.jar:1.3.0.Final]
    ...

OR

Caused by: java.lang.NoClassDefFoundError: javax/xml/bind/JAXBException
    at java.lang.Class.getDeclaredConstructors0(Native Method) [rt.jar:1.7.0_40]
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2483) [rt.jar:1.7.0_40]
    at java.lang.Class.getConstructor0(Class.java:2793) [rt.jar:1.7.0_40]
    at java.lang.Class.getDeclaredConstructor(Class.java:2043) [rt.jar:1.7.0_40]
    at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:105)
    at org.springframework.beans.factory.xml.DefaultNamespaceHandlerResolver.resolve(DefaultNamespaceHandlerResolver.java:129)
    ... 29 more
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.JAXBException from [Module "org.apache.camel:2.12.1" from local module loader @1a6e5d5 (finder: local module finder @3b3402 (roots: /home/torun/jboss/wildfly/wildfly-8.0.0.Beta1/modules,/home/torun/jboss/wildfly/wildfly-8.0.0.Beta1/modules/system/layers/base))]
    at org.jboss.modules.ModuleClassLoader.findClass(ModuleClassLoader.java:197) [jboss-modules.jar:1.3.0.Final]
    ...

Hope it helps,
Adrianos Dadis.

Democracy Requires Free Software

Posted in Java, Java EE, JBoss, Software Development

Java bytecode verification is always REQUIRED

Many Java performance tuning articles suggest disabling bytecode verification when running a Java program (such as a Java application server or a web container like Tomcat).
This is WRONG and you must NOT apply it to your installations.

OK then, but what is bytecode verification in Java?
The full details are in the JVM Specification. In short, it is the procedure that checks that the program is type-safe at all program points.

To make your program run faster, many optimization guides/articles recommend using one of the following parameters:

  • -Xverify:none
  • -noverify

You must NOT use any of the above parameters, as they may expose you to security problems!!!

It is highly recommended to remove all of the above parameters from your startup parameters. If you do so, the default value “-Xverify:remote” becomes active, which is an acceptable solution.
Alternatively, you can use the parameter “-Xverify:all” to apply full bytecode verification.
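
If you want to check a running installation, a small sketch like the following can help. It reads the JVM startup arguments through the standard RuntimeMXBean and looks for the two parameters listed above. The class and method names are hypothetical, and I assume here that the flags show up in the reported argument list in one of these two spellings.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Sketch: detect whether the current JVM was started with bytecode
// verification disabled. Class and method names are hypothetical.
final class VerifyFlagCheck {

    // True if any startup argument is one of the two flags from the list above.
    static boolean disablesVerification(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.equals("-noverify") || arg.equals("-Xverify:none")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself (not to the main class).
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (disablesVerification(jvmArgs)) {
            System.out.println("WARNING: bytecode verification is disabled");
        } else {
            System.out.println("No verification-disabling flag found");
        }
    }
}
```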

If you need to investigate the issue in depth, please check the CERT advisory “Do not disable bytecode verification”.

Regards,
Adrianos Dadis.

Democracy requires Free Software

Posted in Administration, Java, Java EE

Never set GC parameter -XX:MaxTenuringThreshold greater than 15

When tuning the Java Garbage Collector you have to be very careful, as you may end up with worse performance instead of better.

The GC parameter “-XX:MaxTenuringThreshold” defines how many minor GC cycles an object can stay in the survivor spaces until it finally gets tenured to the old space.

Up to Java version 1.5.0_05 the maximum value for this parameter was 31. In all newer versions of Java (1.5.0_06, JDK 6, …) the maximum value changed to 15, and the GC treats any value greater than 15 as infinity. This is very bad, as it makes the old space useless and the survivor spaces are consumed indefinitely by old objects (which normally should be moved to the old space). This will soon lead to heap fragmentation.

As your old space is useless (it stays empty), the server will do many more full ‘stop-the-world’ GCs to defragment the heap. This will have an impact on your applications, as you will see many unnecessary pauses.

To avoid this situation, set -XX:MaxTenuringThreshold to a value between 0 and 15 (0 means objects get tenured immediately, and 15 means objects get aged at most 15 times before being tenured).
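
As a quick sanity check, the following sketch reads the effective value from a running JVM and compares it with the 0 to 15 range described above. It assumes a HotSpot JVM on Java 7 or later, because it uses the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean. The class and method names are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Sketch, assuming a HotSpot JVM (Java 7+): read the effective
// MaxTenuringThreshold and check it against the range from the text.
final class TenuringThresholdCheck {

    // The range recommended above: 0 to 15, inclusive.
    static boolean isInRecommendedRange(int threshold) {
        return threshold >= 0 && threshold <= 15;
    }

    public static void main(String[] args) {
        HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // VM option values are reported as strings.
        String value = hotspot.getVMOption("MaxTenuringThreshold").getValue();
        int threshold = Integer.parseInt(value);
        System.out.println("MaxTenuringThreshold = " + threshold
                + (isInRecommendedRange(threshold) ? " (OK)" : " (out of range)"));
    }
}
```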

Regards,
Adrianos Dadis.

Democracy requires Free Software

Posted in Administration, Java, Java EE

A movie about Free Software – Software Wars

Back in 2009, Keith Curtis (@keithccurtis) wrote the book “After the Software Wars”. It’s a book about Free Software and how this idea could change our lives.

Now Keith is trying to create a movie about Free Software and software in general. The movie is now seeking crowdfunding on Indiegogo. If you want to help, just contribute to the SoftwareWars campaign.

Check out the latest trailer of Software Wars on Keith’s channel. It looks very promising.

Adrianos Dadis.

Democracy requires Free Software

Posted in Freedom, GNU/Linux

Java heap space, native heap and memory problems

Recently, I was discussing with a friend why a Java process uses more memory than the maximum heap we set when starting it.

All Java objects that your code creates live inside the Java heap space, whose size is limited by the -Xmx option. But a Java process consists of many spaces, not only the Java heap space. A few of the spaces that make up a Java process are the following:

  • Loaded libraries (including jar and class files)
  • Control structures for the java heap
  • Thread Stacks
  • Generated (JITed) code
  • User native memory (malloced in JNI)
  • … more…
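
A running JVM can report part of this picture itself. The sketch below prints the maximum heap next to the memory used by the non-heap pools that the JVM manages. Keep in mind that, as far as I know, the non-heap figure from MemoryMXBean covers only those JVM-managed pools and not thread stacks or memory malloced in JNI, so the total process size still has to be measured with OS tools. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Sketch for inspecting a running JVM; class and method names are hypothetical.
final class MemorySpacesSketch {

    // Maximum memory the JVM will attempt to use for the Java heap.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Memory currently used by the JVM-managed non-heap pools.
    static long nonHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("Max heap (MB):      " + maxHeapBytes() / mb);
        System.out.println("Non-heap used (MB): " + nonHeapUsedBytes() / mb);
    }
}
```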

On a 32-bit architecture, the total process size cannot exceed 4 GB. So, a 32-bit Java process consists of many spaces (Java heap, native memory (C-Heap) and other spaces), and all of them together cannot exceed 4 GB.

Assume that on a 32-bit production system you have been running a Java application server with -Xmx set to 1.5 GB (a 1.5 GB Java heap) for a long time, with many applications deployed. After some time, the customer wants to deploy more applications on the same application server. The system operators understand that, as the server will have to process more requests, it will also need to create more objects and do more processing. So, as a future-proof solution, the operators decide to increase the maximum heap of the Java process to 2 GB.

OK, it looks like a good approach, but what really happened on this production application server? (This is a real case.)

The application server crashed with OutOfMemoryError !!!
Can you think about the possible causes?

My first thought was that 2 GB were not enough for all these applications with this load. Unfortunately, the problem was something else.

What do you think now? I will help you a little.

java.lang.OutOfMemoryError: requested 55106896 bytes for Chunk::new.

The real cause was that the already deployed (old) applications needed a very large amount of native (C-Heap) memory. Before the operators increased the heap size (from 1.5 GB to 2 GB), they had not monitored how much native memory the old applications required. The side effect of this change was to automatically decrease the maximum native memory available to the Java process (from 2.5 GB to 2 GB). As the old applications were already using so much native memory, this change crashed the server!!!
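
The arithmetic behind this can be sketched in a few lines. This is a simplification that uses the 4 GB process limit from the text and treats everything outside the Java heap as one budget. The class and method names are hypothetical.

```java
// Sketch of the 32-bit address space budget described above.
// Simplification: everything outside the Java heap is one budget.
final class AddressSpaceBudget {

    // 4 GB process limit, expressed in MB.
    static final long PROCESS_LIMIT_MB = 4096;

    // Space left for everything that is not Java heap, in MB.
    static long nonHeapBudgetMb(long maxHeapMb) {
        return PROCESS_LIMIT_MB - maxHeapMb;
    }

    public static void main(String[] args) {
        // -Xmx at 1.5 GB leaves 2560 MB (2.5 GB) outside the heap.
        System.out.println("Heap 1536 MB -> " + nonHeapBudgetMb(1536) + " MB left");
        // -Xmx at 2 GB leaves only 2048 MB (2 GB) outside the heap.
        System.out.println("Heap 2048 MB -> " + nonHeapBudgetMb(2048) + " MB left");
    }
}
```

Raising the heap by 512 MB takes exactly 512 MB away from the rest of the process.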

The only acceptable solution in this case was to avoid increasing the maximum heap size, deploy the new applications and live with less throughput. It is not a perfect solution, but it was the only viable one in this case (as our Java process has to be 32-bit).

Especially on 32-bit systems, be aware of how much native memory your Java process requires before you increase the Java heap size. If you are in a situation where these two spaces conflict, the solution may not be easy. If you cannot change your code to overcome this situation, the most common solution is to move to a 64-bit system, where the maximum process size limit is much larger.

There are four major things to remember:

  • A process has a maximum size limit
  • A Java process consists of more than just the Java heap
  • The size of the native (C-Heap) memory of a Java process cannot be configured explicitly, unlike the Java heap space
  • The amount of Java heap space and native (C-Heap) memory an application requires is determined only by the application, and there is no standard ratio between these two spaces

Happy new year 🙂
Adrianos Dadis.

Democracy Requires Free Software

Posted in Java