PUBLIC OBJECT

Handling a byte order mark in Java

Occasionally you'll need to read text files that begin with a byte order mark (BOM). The BOM is the character U+FEFF written as the first bytes of the file, and its encoded form signals which charset should be used to decode the remaining bytes into characters.
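
To make the byte patterns concrete, here is a small sketch that encodes U+FEFF in the three charsets handled in this post and prints the resulting bytes. The class name `BomBytes` and the helper `bomHex` are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BomBytes {
  /** Encodes U+FEFF in the given charset and formats the resulting bytes as hex. */
  static String bomHex(Charset charset) {
    StringBuilder sb = new StringBuilder();
    for (byte b : "\uFEFF".getBytes(charset)) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(String.format("%02X", b & 0xFF));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println("UTF-8:    " + bomHex(StandardCharsets.UTF_8));    // EF BB BF
    System.out.println("UTF-16BE: " + bomHex(StandardCharsets.UTF_16BE)); // FE FF
    System.out.println("UTF-16LE: " + bomHex(StandardCharsets.UTF_16LE)); // FF FE
  }
}
```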

Unfortunately, the standard library doesn't include a facility that inspects the BOM and picks the charset automatically. Instead, copy and paste this code to convert any raw InputStream into a Reader that uses the appropriate charset.

  public Reader inputStreamToReader(InputStream in) throws IOException {
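    // mark/reset is optional for InputStream, so buffer streams that lack it
    // (needs java.io.BufferedInputStream).
    if (!in.markSupported()) {
      in = new BufferedInputStream(in);
    }
    in.mark(3);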
    int byte1 = in.read();
    int byte2 = in.read();
    if (byte1 == 0xFF && byte2 == 0xFE) {
      return new InputStreamReader(in, "UTF-16LE");
    } else if (byte1 == 0xFE && byte2 == 0xFF) {
      return new InputStreamReader(in, "UTF-16BE");
    } else {
      int byte3 = in.read();
      if (byte1 == 0xEF && byte2 == 0xBB && byte3 == 0xBF) {
        return new InputStreamReader(in, "UTF-8");
      } else {
        // No BOM found: rewind to the start and use the platform default charset.
        in.reset();
        return new InputStreamReader(in);
      }
    }
  }
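
As a quick way to try the idea end to end, the following is a minimal, self-contained sketch rather than a definitive implementation. It assumes a hypothetical `BomReaders` class and `readAll` helper, and it swaps mark/reset for a `PushbackInputStream` so that it also works on streams that don't support marking. The UTF-16BE mark is taken to be FE FF.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BomReaders {
  /** Returns a Reader for in, choosing the charset from a leading BOM if one is present. */
  public static Reader inputStreamToReader(InputStream in) throws IOException {
    // A 3-byte pushback buffer lets us return any bytes that turn out not to be a BOM.
    PushbackInputStream pushback = new PushbackInputStream(in, 3);
    byte[] head = new byte[3];
    int n = 0;
    // read() may return fewer bytes than requested, so loop until 3 bytes or end of stream.
    while (n < 3) {
      int r = pushback.read(head, n, 3 - n);
      if (r == -1) break;
      n += r;
    }
    int b1 = n > 0 ? head[0] & 0xFF : -1;
    int b2 = n > 1 ? head[1] & 0xFF : -1;
    int b3 = n > 2 ? head[2] & 0xFF : -1;

    Charset charset = Charset.defaultCharset(); // fallback when no BOM is found
    int bomLength = 0;
    if (b1 == 0xEF && b2 == 0xBB && b3 == 0xBF) {
      charset = StandardCharsets.UTF_8;
      bomLength = 3;
    } else if (b1 == 0xFF && b2 == 0xFE) {
      charset = StandardCharsets.UTF_16LE;
      bomLength = 2;
    } else if (b1 == 0xFE && b2 == 0xFF) {
      charset = StandardCharsets.UTF_16BE;
      bomLength = 2;
    }
    // Push back everything after the BOM so the Reader sees only content bytes.
    pushback.unread(head, bomLength, n - bomLength);
    return new InputStreamReader(pushback, charset);
  }

  /** Hypothetical helper: drains a Reader into a String. */
  public static String readAll(Reader reader) throws IOException {
    StringBuilder sb = new StringBuilder();
    int c;
    while ((c = reader.read()) != -1) sb.append((char) c);
    return sb.toString();
  }

  public static void main(String[] args) throws IOException {
    byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
    byte[] utf16be = {(byte) 0xFE, (byte) 0xFF, 0, 'h', 0, 'i'};
    System.out.println(readAll(inputStreamToReader(new ByteArrayInputStream(utf8))));    // hi
    System.out.println(readAll(inputStreamToReader(new ByteArrayInputStream(utf16be)))); // hi
  }
}
```

In both cases the BOM is consumed and only the text "hi" reaches the caller. A stream without a BOM is decoded with the platform default charset, as in the method above.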