
Two simple classes for text processing in Java

FileCharSequence adapts a java.io.File as a CharSequence, which has nice consequences. For example, you can run Java regular expressions directly against a File, and you can easily send part or all of a file to a StringBuilder or Writer:
/**
 * Adapts a text file as a character sequence so that it can be directly
 * manipulated by regular expressions and other character utilities. The
 * file may be at most 2 GB in size and encoded with {@code ISO-8859-1};
 * otherwise behaviour is undefined.
 */
public final class FileCharSequence implements CharSequence {
...
}
If you like this, feel free to use the code in your projects.
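The body of the class is elided above, so here is a rough sketch of how such an adapter could work. It is not the actual FileCharSequence source. The class name Latin1FileSequence and its open method are made up for illustration, and memory-mapping the file is an assumption about one reasonable approach. The sketch leans on the same restriction the Javadoc states: in ISO-8859-1 every byte is exactly one character, so charAt is a single byte lookup.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for FileCharSequence; the name and design are assumptions. */
final class Latin1FileSequence implements CharSequence {
  private final ByteBuffer bytes;

  private Latin1FileSequence(ByteBuffer bytes) {
    this.bytes = bytes;
  }

  /** Memory-maps the file read-only. The mapping outlives the closed channel. */
  static Latin1FileSequence open(Path file) throws IOException {
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      ByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
      return new Latin1FileSequence(mapped);
    }
  }

  @Override public int length() {
    return bytes.limit();
  }

  @Override public char charAt(int index) {
    // ISO-8859-1 maps each byte value 0x00..0xFF to the same code point.
    return (char) (bytes.get(index) & 0xFF);
  }

  @Override public CharSequence subSequence(int start, int end) {
    if (start < 0 || end > length() || start > end) {
      throw new IndexOutOfBoundsException();
    }
    // Share the underlying bytes instead of copying them.
    ByteBuffer view = bytes.duplicate();
    view.limit(end);
    view.position(start);
    return new Latin1FileSequence(view.slice());
  }

  @Override public String toString() {
    ByteBuffer view = bytes.duplicate();
    view.rewind();
    return StandardCharsets.ISO_8859_1.decode(view).toString();
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("sketch", ".txt");
    file.toFile().deleteOnExit();
    Files.write(file, "alpha 123 beta 4567\n".getBytes(StandardCharsets.ISO_8859_1));
    CharSequence text = open(file);

    // Run a regex directly against the file-backed sequence.
    Matcher digits = Pattern.compile("\\d+").matcher(text);
    while (digits.find()) {
      System.out.println(digits.group());
    }

    // Send part of the file to a StringBuilder.
    StringBuilder head = new StringBuilder().append(text, 0, 5);
    System.out.println(head);
  }
}
```

Because Matcher and StringBuilder.append both accept any CharSequence, nothing else in the calling code needs to know the characters come from a file.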

I prefer to use Java for one-off text processing tools. Partly this is because that's what my development environment is already set up to do, and partly it's because I'm not very productive in Python. With that constraint, I've written Strip.java. It uses FileCharSequence behind the scenes to strip all occurrences of a regex from a file. It uses Java's regex syntax and supports switches like (?m) for multi-line regexes. Just like the Rip.java tool, it can be executed directly from your command line:

jessewilson:~$ Strip.java
Usage: Strip <regex> [files]

regex: a Java regular expression, with groups
http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html
you can (parenthesize) groups
\s whitespace
\S non-whitespace
\w word characters
\W non-word

files: files to strip. These will be overwritten!

flags:
--clober: overwrite the passed in files rather than creating new ones
-c:

Use 'single quotes' to prevent bash from interfering
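For a feel of what the stripping step amounts to, here is a minimal sketch. It is not the Strip.java source. The class StripSketch and its methods are hypothetical, it reads the whole file into a String rather than going through FileCharSequence, and it always writes to a separate target path, which sidesteps the overwrite behaviour of the real tool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

/** Hypothetical illustration only; names and behaviour are assumptions. */
final class StripSketch {
  /** Returns the input with every match of the pattern removed. */
  static String strip(CharSequence input, Pattern pattern) {
    return pattern.matcher(input).replaceAll("");
  }

  /** Reads source as ISO-8859-1, strips all matches, writes the result to target. */
  static void stripFile(Path source, Path target, String regex) throws IOException {
    String text = new String(Files.readAllBytes(source), StandardCharsets.ISO_8859_1);
    String stripped = strip(text, Pattern.compile(regex));
    Files.write(target, stripped.getBytes(StandardCharsets.ISO_8859_1));
  }

  public static void main(String[] args) throws IOException {
    if (args.length != 3) {
      System.err.println("Usage: StripSketch <regex> <source> <target>");
      return;
    }
    stripFile(Paths.get(args[1]), Paths.get(args[2]), args[0]);
  }
}
```

With the (?m) switch, a regex such as `(?m)\s*//.*$` removes trailing `//` comments line by line, because `$` then matches at the end of every line rather than only at the end of the input.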

This code is also Apache-licensed for your enjoyment. Download Strip.java, make it executable (chmod a+x Strip.java) and put it somewhere on your path!
Maybe a charset parameter would make it more useful.
Jesse, have you played with Beanshell or Groovy?
Karnok - Unfortunately, I cannot accept a charset 'cause this code relies on a strict mapping of one-byte-per-character.

Rob - yeah, Groovy sounds like a great idea. One wrench in that plan is that I'm also using FileCharSequence in a slightly more complex app that reorders the methods in a .java file. I'm unsure how complicated I want to go with the Groovy.