2016-04-05

Giving an old wireless router (WRT54G) a new life

Over the holiday break, I had some spare time to work on projects around the house.  One of the projects was to give my old wireless router (WRT54G) a new life as a repeater.

I can't say enough good things about my current wireless router, an Almond+ by Securifi: it works great, it's easy to use, and it has tons of automation features.  However, it's located at the front of my house, and coverage isn't great at the back of my yard.  To solve my first-world problem, I reconfigured my old WRT54G.

Here are a couple of things to keep in mind during the process:

  • It's good to have separate internet access throughout the process so that if you have to look anything up, you can find a solution quickly.  My smartphone did the trick for me.
  • Be patient.  After restarting the WRT54G, the web interface (LuCI) and SSH access were sometimes slow to come up (connection lost message, etc.).  Wait a minute and try again.

To start, I dusted off my WRT54G and got to work.  I checked out OpenWrt, found my router in the OpenWrt Table of Hardware (make sure you pay attention to the version number listed on the bottom of your router), and read the WRT54G device page.  It's a good idea to read the entire page, but the most important section is Installing OpenWrt.

I installed Backfire 10.03.1 (brcm-2.4) on my v3 router.  The steps on the device page were simple to follow, and I had no problems at all.

Next, because I wanted to extend my existing WLAN, I configured the router to be a wireless client bridge (bridgedclient mode).  You can find the recipe here.

The most complicated part of the process was configuring the router for my WLAN.  Here are a couple of notes to help:
  • You'll need a telnet client and/or SSH client installed on the computer you're using to access and configure the router.
  • The router runs Linux, so I used vi (vi cheat sheet) to edit the files.  vi is a text editor, and it's a bit confusing if you're not familiar with it.
  • All the configuration files are in /etc/config/
  • The default IP address of the WRT54G is in /etc/config/network, and was set to 192.168.1.1, so I first had to give my laptop an IP address on the same network so that I could access the router (via a wired Ethernet connection).
  • To change my laptop's IP:
    • Disable DHCP (Control Panel -> Network Connections -> LAN connection properties -> TCP/IP v4 properties), then set the IP to 192.168.1.200 and the default gateway to 192.168.1.1
  • Make sure you re-enable DHCP on your laptop afterwards so that you can access the WRT54G after it is restarted
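
For orientation, the part of /etc/config/network that holds the router's address looks roughly like the excerpt below.  This is a sketch from memory of the OpenWrt config format, not a copy of my file: only the 192.168.1.1 address comes from my router, the netmask is an assumed default, and the other options in the stanza are left out.  Check your own file before editing.

```
# /etc/config/network (excerpt, other options in this stanza omitted)
config interface 'lan'
        option proto   'static'
        option ipaddr  '192.168.1.1'    # default address; change this to set the router's static IP
        option netmask '255.255.255.0'  # assumed default netmask
```

A change here only takes effect once the network is restarted or the router is rebooted.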

So that I don't forget the static IP of my WRT54G, I wrote it on the bottom of the device.  Then I plugged it in near the back of my house, and now I have extended WLAN coverage.

2016-02-27

Why is the JVM on my Linux Server using so much CPU?

Once in a while the CPU usage on one of our development servers rails, hitting 100% and never letting up.  A couple of times it actually hit 200% (2 threads burning 2 cores at 100%) or more before we noticed.  These are not good situations, and they are something you want to get to the bottom of and fix before they start happening in your production environment.
The lazy thing to do is restart the entire server, or restart the process that's railed.  But that just ignores the problem.  What you should be doing is finding the cause so you can keep it from happening again.  Here's how you get to the bottom of what's going on:
1. Get the PID of the process causing the problem.  Run top to list the processes using the most CPU; it will be the first entry.  As you can probably guess, it was my JVM.
2. Get the IDs of the threads in the process causing the problem (my PID was 2089): top -H -p 2089
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11405 liferay   20   0 11.7g 6.8g  14m R 99.4 88.5  23074:07 java
10000 liferay   20   0 11.7g 6.8g  14m R 99.4 88.5  23079:00 java
 3390 liferay   20   0 11.7g 6.8g  14m S  3.9 88.5  22:42.06 java
 2089 liferay   20   0 11.7g 6.8g  14m S  0.0 88.5   0:00.02 java
 2098 liferay   20   0 11.7g 6.8g  14m S  0.0 88.5   0:01.85 java
 2111 liferay   20   0 11.7g 6.8g  14m S  0.0 88.5   3:15.03 java
 2112 liferay   20   0 11.7g 6.8g  14m S  0.0 88.5   3:11.32 java

From the top listing, it's easy to see that the first two threads are the problem: 11405 and 10000 are each using almost 100% of a CPU core.  Convert the thread IDs to hex for step 4 (jstack prints them in lowercase).
11405 -> 0x2c8d
10000 -> 0x2710
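
If you'd rather not convert by hand, the shell can do it.  The following is a minimal sketch, assuming a POSIX shell with printf and awk.  The function names tid_to_nid and hot_threads and the 90% threshold are made up for illustration, and hot_threads assumes the column layout of the top listing above (thread ID in column 1, %CPU in column 9).

```shell
# Convert a decimal thread ID from top -H into the lowercase hex
# form that jstack prints in its nid= field.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Read top -H rows on stdin and print "TID nid" for every thread at or
# above a %CPU threshold (default 90).  Assumes the column layout shown
# above: thread ID in column 1, %CPU in column 9.  Header and summary
# lines are skipped because their first field is not a number.
hot_threads() {
    awk -v min="${1:-90}" \
        '$1 ~ /^[0-9]+$/ && $9 + 0 >= min { printf "%d 0x%x\n", $1, $1 }'
}

tid_to_nid 11405    # prints 0x2c8d
tid_to_nid 10000    # prints 0x2710
```

With top in batch mode, something like top -H -b -n 1 -p 2089 | hot_threads should list the busy threads together with their hex IDs in one go, provided your top uses the same columns.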

3. Take a thread dump of your JVM: /usr/java/jdk1.6.10/bin/jstack 2089 > /tmp/td.log
4. Search your thread dump for the threads causing your problem (found in step 2).  The hex thread ID shows up in the nid field of each thread's header line.  For example, I found the following:
"pool-400-thread-53" prio=10 tid=0x00007fde4801efd0 nid=0x2c8d runnable [0x00007fdde3ba4000]
   java.lang.Thread.State: RUNNABLE
        at java.util.HashMap.getEntry(HashMap.java:347)
        at java.util.HashMap.containsKey(HashMap.java:335)
        at java.util.HashSet.contains(HashSet.java:184)
        at com.vaadin.ui.CustomTable.unregisterPropertiesAndComponents(CustomTable.java:2322)
        at com.vaadin.ui.CustomTable.getVisibleCellsNoCache(CustomTable.java:2219)
        ...
Looking through the stack trace for 0x2c8d, I was eventually able to find a custom class and a line number that allowed me to narrow down the source of the problem.
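
The search in step 4 can be scripted as well.  Here is a rough sketch, assuming a POSIX awk and a dump where each thread's stack is separated from the next by a blank line, as in the jstack output above.  The function name find_thread is hypothetical.

```shell
# Print the full stack of the thread whose nid matches, from a jstack dump.
#   $1 = hex thread ID as jstack prints it, e.g. 0x2c8d
#   $2 = path to the thread dump
find_thread() {
    # RS="" puts awk in paragraph mode, so each blank-line-separated
    # thread stanza is one record.  The trailing space in the search
    # string keeps 0x2c8d from also matching a longer ID like 0x2c8d1.
    awk -v nid="nid=$1 " 'BEGIN { RS = "" } index($0, nid) { print; print "" }' "$2"
}

# Example (not run here): find_thread 0x2c8d /tmp/td.log
```

This prints the whole stack for the matching thread, which makes it easier to scroll down to the first class you recognize as your own.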
Finding the general area of the problem is usually the easy part.  The hard part is eliminating the source of the problem, which I'll leave up to you.
Good luck.

2016-01-25

Beware of Liferay on a GlassFish Application Server

If you're using Liferay to build a portal or website, I highly recommend using Liferay bundled with Tomcat.  Learn from others' headaches: it's not worth the trouble of trying to use something other than Tomcat.

Originally, we were trying to use GlassFish 3.x.x because we have a support contract.  We got our environments up and running reasonably quickly.  Our environments were the following:

  • Local Dev (Windows 7) - Tomcat 7 (bundled with Liferay Developer Studio), MySQL 5
  • Dev (Ubuntu VM) - Liferay 6.1.2 deployed to GlassFish, MySQL 5
  • Test (RHEL) - Liferay 6.1.2 deployed to GlassFish, Oracle 10g
  • QA (RHEL) - Liferay 6.1.2 deployed to GlassFish, Oracle 10g
  • Production (RHEL) - Liferay 6.1.2 deployed to GlassFish, Oracle 10g

Once we started doing significant work, developing portlets and customizing Liferay (hooks, themes, ext plugin, etc.), we started running into problems.

I had difficulties migrating data between environments, which was important early in development.  While trying to migrate the database from one environment to another (Control Panel -> Server -> Administration -> Data Migration), I repeatedly got errors and the migration failed.  I determined that migrating from one environment to another never worked when Liferay was deployed to GlassFish.  The migrations always failed with a null pointer exception.  Neither analysing the error in the logs nor searching Google led to a solution.

I did eventually find a workaround for migrating data between environments, but it wasn't practical due to our network configuration (firewalls blocking traffic between certain environments).  The workaround was to temporarily modify my portal-ext.properties file so that my Local Dev environment connected to the source database, and then run the data migration from there.  Basically, I found that the version of Liferay bundled with Tomcat was the only Liferay installation able to perform the data migration successfully.
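
For reference, the temporary override in portal-ext.properties looks roughly like the fragment below.  The jdbc.default.* keys are Liferay's standard data source properties, but everything on the right-hand side is a placeholder: the host name, schema name and credentials are hypothetical, and an Oracle source would need the Oracle driver class and URL instead.

```properties
# Temporary override: point Local Dev at the source database (placeholder values)
jdbc.default.driverClassName=com.mysql.jdbc.Driver
jdbc.default.url=jdbc:mysql://source-db.example.com/lportal?useUnicode=true&characterEncoding=UTF-8
jdbc.default.username=liferay_user
jdbc.default.password=changeme
```

Restore the original values once the migration is done, so Local Dev goes back to its own database.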

We did encounter other problems using GlassFish as well.  We put a lot of effort into clustering Liferay deployed to GlassFish without success.  After scouring the Liferay forums and discussing our options with Liferay support we decided to give up on GlassFish and use Tomcat instead.

After switching to Tomcat, we had clustering working with minimal effort and our data migration issues were solved.  In hindsight, we would have saved a lot of time and effort by simply using Tomcat from the start.  It's also important to note that we haven't had any stability or reliability issues due to Tomcat, and we have been live for almost 2 years.