Cygwin is a free software project: a collection of tools that provides a Linux-like console environment on Windows.
When internationalizing an application, matching keys in the source code with lines in properties files is a relatively large task. Going through the entire application for each language and ensuring that every property referenced in the source code is also listed in the property files is time-consuming.
A quicker and more efficient way to find missing keys is to write scripts that parse the source code for keys and match them against the keys in the property files. Such scripts can be written using Cygwin and some creativity. Cygwin provides a command prompt that gives access to a collection of common Linux/UNIX tools. With tools such as find, xargs, cat, grep, sed, wc, sort and diff, mismatched keys are easy to find. This includes keys that appear in the source code or in a property file but not in the other, as well as duplicate keys.
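As a rough illustration of the idea, here is a minimal sketch that finds duplicate keys in a properties file and keys that are used in the source but missing from it. The file names (sample.properties, used_keys.txt) and their contents are made up for the demo, and uniq is used in addition to the tools listed above.

```shell
#!/bin/bash
# Sketch only: sample.properties and used_keys.txt are hypothetical demo files.
workdir=$(mktemp -d)

# A tiny properties file in which the key "title" is defined twice.
printf '%s\n' 'title=Hello' 'save=Save' 'title=Hi' > "$workdir/sample.properties"

# Keys assumed to have been extracted from the source code already, sorted, one per line.
printf '%s\n' 'cancel' 'save' 'title' > "$workdir/used_keys.txt"

# Drop everything from the first "=" onward, then sort so duplicates become adjacent.
sed 's/=.*//' "$workdir/sample.properties" | sort > "$workdir/property_keys.txt"

# Keys defined more than once in the properties file.
duplicate_keys=$(uniq -d "$workdir/property_keys.txt")

# Keys used in the source but absent from the properties file.
# diff marks lines that exist only in the first file with "< ".
sort -u "$workdir/property_keys.txt" > "$workdir/unique_keys.txt"
missing_keys=$(diff "$workdir/used_keys.txt" "$workdir/unique_keys.txt" | grep '^<' | sed 's/^< //')

echo "Duplicate keys: $duplicate_keys"
echo "Missing keys: $missing_keys"
```

Swapping the two arguments to diff (or filtering on ">" instead) would list keys that are defined in the properties file but never used.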
These tools can be used for other tasks as well:
- Searching projects for strings
- Counting instances of specific strings for estimating effort
- Listing files that require translation so that they can be added to tasks/work items
- Calculating line counts of the files to estimate effort
- Generating repetitive script files that apply changes to a database
- Mass renaming of files
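A few of the tasks in this list can be sketched in a handful of lines. The example below is only an illustration: the directory, the file names and the search string getLabel are invented for the demo, and it assumes a find/xargs pair that supports -print0 and -0 (the GNU versions shipped with Cygwin do).

```shell
#!/bin/bash
# Sketch only: the demo files and the "getLabel" string are hypothetical.
demo=$(mktemp -d)
printf '%s\n' 'a = getLabel("x");' 'b = getLabel("y");' > "$demo/One.java"
printf '%s\n' 'c = getLabel("z");' > "$demo/Two.java"
printf '%s\n' 'nothing to translate' > "$demo/notes.txt"

# Searching a project for a string: list the Java files that mention getLabel.
matching_files=$(find "$demo" -name '*.java' -print0 | xargs -0 grep -l 'getLabel' | sort)

# Counting instances: grep -c counts matching lines (not individual occurrences).
match_count=$(find "$demo" -name '*.java' -print0 | xargs -0 cat | grep -c 'getLabel')

# Mass renaming: give every .txt file in the directory a .bak extension.
for f in "$demo"/*.txt; do
    mv "$f" "${f%.txt}.bak"
done

echo "$matching_files"
echo "Lines containing getLabel: $match_count"
```

The list of matching files is the kind of output that can be pasted straight into a task or work item.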
#!/bin/bash
# copy Java files
echo Cleaning local directory...
rm -f *.java *.properties PROPERTY_* JAVA
echo Copying Java source files...
find /cygdrive/c/s
rm Internationalizer.java
# copy properties files
echo Copying properties files...
cp /cygdrive/c/s
# get all property entries from Java source into file JAVA
echo Parsing Java source files...
cat *.java | sed "s/in\.get/\nin\.get/g;s/\")/\
# get all property entries from property files into file PROPERTY
echo Parsing properties files...
sed "s/=.*//" PIOS_en_CA.properties | sort > PROPERTY_EN
# Output properties missing in properties file
echo Missing English Properties \(PIOS_en_CA.properties\)
diff --ignore-blank-lines --ignore-all-space JAVA PROPERTY_EN | grep "^<"
echo \*\* PROPERTY KEYS THAT ARE USED MULTIPLE TIMES IN JAVA SOURCE WILL BE REPORTED \*\*
echo Removing copied files...
rm -f *.java *.properties PROPERTY_* JAVA
I used this script frequently to match keys referenced in the source code with property file keys. This ensured I wouldn't get exceptions at run time, and it quickly caught common typos.
These tools are also really useful for estimating effort. In the same project, translation effort was estimated by the number of lines of code that needed internationalization. I took the complete list of files and broke them down into functional categories to make testing/verification easier. Then, to quickly determine the number of lines (which was later converted to effort in hours), I ran the following, which outputs the total number of lines:
cat file1.java file2.java file3.java file4.java | wc -l
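When a category contains many files, typing every name gets tedious. One possible variation, sketched below, keeps the file names of a category in a text file and feeds them to cat through xargs. The list file category.txt and the sample files are hypothetical, and the approach assumes the paths contain no spaces.

```shell
#!/bin/bash
# Sketch only: category.txt and the sample files are hypothetical.
demo=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2' > "$demo/file1.java"
printf '%s\n' 'line 1' 'line 2' 'line 3' > "$demo/file2.java"

# One list file per functional category, one path per line (no spaces in paths).
printf '%s\n' "$demo/file1.java" "$demo/file2.java" > "$demo/category.txt"

# xargs turns the listed paths into arguments for cat; wc -l counts the combined lines.
total_lines=$(xargs cat < "$demo/category.txt" | wc -l)

echo "Total lines in category: $total_lines"
```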
The last example, also relatively simple, is breaking a massive log file into pieces. A client didn't have rotating logs implemented on their server, which resulted in a 2 GB log file. Obviously the log file is much too large to open and analyze as is. However, with split it's really easy to break the file into multiple manageable files.
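A minimal sketch of that approach follows. It uses a ten-line stand-in instead of the real log, and the file name server.log, the part_ prefix and the piece size of four lines are arbitrary choices for the demo; for a real log a much larger line count would make sense.

```shell
#!/bin/bash
# Sketch only: server.log is a small hypothetical stand-in for the huge log file.
demo=$(mktemp -d)
seq 1 10 | sed 's/^/log entry /' > "$demo/server.log"

# Break the log into pieces of at most 4 lines each.
# split appends aa, ab, ac, ... to the given prefix.
split -l 4 "$demo/server.log" "$demo/server.log.part_"

# Count the pieces that were produced.
piece_count=$(ls "$demo"/server.log.part_* | wc -l)

echo "Created $piece_count pieces:"
ls "$demo"/server.log.part_*
```

Because the pieces are named in order, running cat on server.log.part_* reassembles the original file if it is ever needed again.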
The nice thing is that all of this is command-line based, so it's resource-efficient and generally won't tie up your computer. It's also extremely easy to save bash scripts for reference or repeated execution.