Tagsoup: An easy API for parsing HTML in Java
I needed an API to convert source code embedded in HTML into plain text. I needed to keep track of where the <pre> tags start and stop, plus extra stuff for my parsing heuristic. The Java API I chose for this task is Tagsoup, and I'm very impressed. Why Tagsoup is awesome:
If you're writing Java code and you need to work with HTML, I highly recommend Tagsoup!
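To give a feel for the <pre>-tracking task described above, here is a minimal sketch rather than my actual code. It relies on the fact that Tagsoup's parser class, org.ccil.cowan.tagsoup.Parser, implements the standard SAX XMLReader interface, so the handler only needs the JDK's SAX classes. The class name PreTextExtractor and the method extract are made up for this illustration, and the extra heuristic state is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Tagsoup): collects the text of each <pre> block.
class PreTextExtractor extends DefaultHandler {
    private final List<String> blocks = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private int depth = 0; // > 0 while we are inside a <pre> element

    // Namespace-aware readers fill in localName; others only give qName.
    private static boolean isPre(String localName, String qName) {
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        return "pre".equalsIgnoreCase(name);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (isPre(localName, qName)) {
            depth++; // a <pre> starts here
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (isPre(localName, qName) && depth > 0 && --depth == 0) {
            // the outermost <pre> stops here: store its text and reset the buffer
            blocks.add(current.toString());
            current.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            current.append(ch, start, length); // keep only text inside <pre>
        }
    }

    // Parses html with the given SAX reader and returns the text of every <pre> block.
    static List<String> extract(XMLReader reader, String html)
            throws IOException, SAXException {
        PreTextExtractor handler = new PreTextExtractor();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(html)));
        return handler.blocks;
    }
}
```

With the Tagsoup jar on the classpath, the call would look like PreTextExtractor.extract(new org.ccil.cowan.tagsoup.Parser(), html). Taking the reader as a parameter keeps the handler independent of any one parser, which also makes it easy to try out on well-formed input with the JDK's built-in SAX parser.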