Tagsoup: An easy API for parsing HTML in Java

I needed an API to convert source code embedded in HTML into plain text. I also needed to keep track of where the <pre> tags start and stop, plus some extra state for my parsing heuristic.

The Java API I chose for this task is Tagsoup and I'm very impressed. Why Tagsoup is awesome:
  • Very easy setup.
  • Familiar SAX API, so there's less to learn.
  • It just works.

    If you're writing Java code and you need to work with HTML, I highly recommend Tagsoup!
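
To give a flavor of the SAX approach, here is a minimal sketch rather than the actual code from my project. The class name `PreTextExtractor` and its `extract` method are hypothetical, made up for illustration. The handler uses only the standard `org.xml.sax` API, tracks when a <pre> element starts and stops, and collects the text in between.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: collects the plain text inside each <pre> element. */
final class PreTextExtractor extends DefaultHandler {
    private final List<String> blocks = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private int preDepth = 0; // greater than 0 while inside a <pre>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (isPre(localName, qName)) {
            preDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // When the outermost <pre> closes, store the text gathered so far.
        if (isPre(localName, qName) && preDepth > 0 && --preDepth == 0) {
            blocks.add(current.toString());
            current.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than assign.
        if (preDepth > 0) {
            current.append(ch, start, length);
        }
    }

    private static boolean isPre(String localName, String qName) {
        // Parsers that are not namespace-aware leave localName empty.
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        return "pre".equalsIgnoreCase(name);
    }

    /** Parses html with the given SAX reader and returns the text of each <pre>. */
    static List<String> extract(XMLReader reader, String html) throws IOException, SAXException {
        PreTextExtractor handler = new PreTextExtractor();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(html)));
        return handler.blocks;
    }
}
```

Because TagSoup's `org.ccil.cowan.tagsoup.Parser` implements `XMLReader`, it should be possible to call `PreTextExtractor.extract(new org.ccil.cowan.tagsoup.Parser(), html)` with the TagSoup jar on the classpath. The handler itself does not depend on TagSoup, so it can also be tried against any other SAX reader with well-formed input.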
  • Thanks for another post without any information :(