Overview

These tips help committers keep our Subversion (SVN) repository clean. See also the Committers FAQ and the PMC FAQ.

Coding Style Guidelines

We do not want to be overly concerned about style, apart from a few obvious things such as whitespace. Follow the style already used in the files that you are working on. We loosely follow the Sun Java Style Guide.

Consistent whitespace

Whitespace can cause big problems with SVN. If it is inconsistent, then diffs are very hard to follow - actual changes become lost in the noise of whitespace changes. Some developers use editors that attempt to automatically format the whitespace. The trouble is that if SVN files are inconsistent, some of those editors just make it worse.

The solution comes in two parts:

The first part is that when a committer adds a new file, they must ensure that the file has the correct line endings for their own operating system. Either their SVN client automatically sets the svn:eol-style property for the file based on its extension, or they must do:

svn add myfile.txt
svn propset svn:eol-style native myfile.txt
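
For the automatic option mentioned above, Subversion clients can apply properties to newly added files through the auto-props feature of the runtime configuration file. The following is a minimal sketch, not a project-mandated configuration: the function name print_autoprops and the choice of file extensions are assumptions made for illustration. It only prints a fragment, which you would review and merge by hand into your own config file (commonly ~/.subversion/config on UNIX).

```shell
# print_autoprops: print an auto-props fragment for manual review.
# The function name and the extension list are illustrative assumptions;
# nothing is written to disk.
print_autoprops() {
  cat <<'EOF'
[miscellany]
enable-auto-props = yes

[auto-props]
*.txt = svn:eol-style=native
*.java = svn:eol-style=native
*.xml = svn:eol-style=native
EOF
}

print_autoprops
```

Printing rather than appending avoids creating duplicate [miscellany] or [auto-props] sections in a config file that already has them.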

The second part is that committers who have a proper editor should occasionally correct the whitespace across all SVN files, applying the following rules.

For all text files:

For all Java source files, as for text files plus:

For all XML source files, as for text files plus:

dos2unix and other control-characters

If you are on a UNIX system and you receive a patch from a contributor on Windows, run dos2unix on it before applying it. If you are on a Windows system, ensure that you have a proper SVN client (it is supposed to convert to UNIX line-endings when you commit).
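
If dos2unix is not installed, the same conversion can be approximated with standard tools. The sketch below is one possible approach, not the dos2unix implementation: the function name strip_cr is hypothetical, and it assumes that mktemp is available. It removes a carriage-return only where it immediately precedes the end of a line, and rewrites the file in place.

```shell
# strip_cr FILE: convert CR-LF line endings in FILE to LF, in place.
# Hypothetical helper; assumes mktemp exists on the system.
strip_cr() {
  cr=$(printf '\r')                # a literal carriage-return character
  tmp=$(mktemp) || return 1
  # Delete a CR only at the end of a line, then copy the result back
  # over the original so that its permissions are kept.
  sed "s/${cr}\$//" "$1" > "$tmp" && cat "$tmp" > "$1"
  status=$?
  rm -f "$tmp"
  return $status
}
```

For example, strip_cr contributor.patch (the filename is a placeholder) would normalise a patch before it is applied.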

Here is one UNIX way to find all plain-text files that have DOS line-endings (and maybe mixed line-endings). There seem to be many images and jar archives that contain carriage-returns, so to list only the plain-text files:

find . -type f | xargs grep -l '^M' | xargs file | \
grep -i -w text | cut -f1 -d:

The ^M shown above stands for a literal carriage-return character. To enter it, type "Ctrl-v Ctrl-m" at the command-line.
Note that copy-and-paste will not work.
The -w can be omitted, but might then match some extra filenames.
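
A variation on the command above avoids typing a literal ^M and copes with filenames that contain spaces. This is a sketch under assumptions, not a replacement for the documented command: the function name list_crlf_text is hypothetical, and the -I option (skip files that grep considers binary) is a GNU/BSD grep extension rather than POSIX. It relies on grep's own binary detection instead of the file utility, and it skips the .svn administrative directories.

```shell
# list_crlf_text [DIR]: list text files under DIR (default: .) that
# contain a carriage-return. Hypothetical helper.
list_crlf_text() {
  cr=$(printf '\r')                # generate the CR instead of typing ^M
  # -name .svn -prune : do not descend into SVN administrative directories
  # grep -l           : print only the names of matching files
  # grep -I           : treat binary files as non-matching (GNU/BSD grep)
  find "${1:-.}" -name .svn -prune -o -type f -exec grep -lI "$cr" {} +
}
```

Because grep and file use different heuristics to decide what counts as text, the two approaches may disagree on a few files.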

To instead find all files (including images and foreign jar archives) that have DOS line-endings:

find . -type f | xargs grep -l '^M'

Here is one way to find files that have any hidden control characters:

find . -type f | xargs grep -l '[[:cntrl:]]'
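
The [[:cntrl:]] class also matches the tab and carriage-return characters, so the command above lists many files that are otherwise clean. The sketch below narrows the search to the remaining ASCII control characters. It is an illustration under assumptions: the function name list_ctrl_files is hypothetical, and the byte ranges in the bracket expression are interpreted by value only because the C locale is forced.

```shell
# list_ctrl_files [DIR]: list files under DIR (default: .) that contain
# an ASCII control character other than tab, LF or CR. Hypothetical helper.
list_ctrl_files() {
  # Bracket expression covering octal 001-010, 013, 014, 016-037 and 177,
  # which leaves out tab (011), LF (012) and CR (015).
  pat=$(printf '[\001-\010\013\014\016-\037\177]')
  LC_ALL=C find "${1:-.}" -name .svn -prune -o -type f \
    -exec grep -l "$pat" {} +
}
```

Binary files such as images and jar archives will normally still be listed, as with the original command.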

Valid XML instances

Many of us have wasted time on a broken build caused by XML validation errors. Would all committers please either use a proper XML editor or validate their xdocs with one of the following commands:

onsgmls -c $COCOON_HOME/src/webapp/WEB-INF/entities/catalog \
-wall -wxml -s mydoc.xml
      
export SGML_CATALOG_FILES=$COCOON_HOME/src/webapp/WEB-INF/entities/catalog
xmllint --valid --catalogs --noout mydoc.xml
      
forrest validate-xdocs
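
To check a whole directory rather than a single document, the xmllint command above can be wrapped in a small loop. This is a minimal sketch, not part of the project's build: the function name validate_all_xml and the XML_VALIDATOR override variable are hypothetical, and it assumes that SGML_CATALOG_FILES has already been exported as shown above. It is written for sh or bash, where an unquoted variable is split into separate words.

```shell
# validate_all_xml [DIR]: run a validator over every .xml file under DIR.
# XML_VALIDATOR is a hypothetical override; by default the xmllint
# command shown above is used.
validate_all_xml() {
  # Left unquoted on purpose so that the command and its options split
  # into separate words. find returns non-zero if any invocation fails.
  find "${1:-.}" -name .svn -prune -o -type f -name '*.xml' \
    -exec ${XML_VALIDATOR:-xmllint --valid --catalogs --noout} {} +
}
```

A non-zero exit status means that at least one document failed validation, which makes the function usable as a check before committing.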