Charles 4 has HTTP/2

Karl von Randow has released Charles 4.

“With Charles 4 you can now see HTTP 2 working, and you can use all of your familiar tools; Repeat, Breakpoints, and so on. You’ll spot HTTP 2 hosts in Charles as they use a different icon—with a lightning bolt!”

I just took it for a quick spin. The updated HTTP/2 support works as advertised.

If you’d like to try out Charles’ new HTTP/2 support with OkHttp, it takes three steps to configure OkHttp to trust Charles’ MITM certificate.

Export the certificate from Charles

From the Help menu, pick SSL Proxying and then Save Charles Root Certificate. Save it as a .pem file somewhere, then open it in your favorite text editor. You’ll see something like this:

-----BEGIN CERTIFICATE-----
MIIFfDCCBGSgAwIBAgIGAU9cHuWJMA0GCSqGSIb  
IFByb3h5IEN1c3RvbSBSb290IENlcnRpZmljYXR  
...
ieHLSeZtdIrL7cI45ILT4TkahmPVeZmVVCRwQ==  
-----END CERTIFICATE-----

Import the certificate into OkHttp

OkHttp has an executable custom trust example that we’ll start with. Remove the example’s certificates. Paste the contents of the .pem file instead.

  private InputStream trustedCertificatesInputStream() {
    String charlesRootCa = ""
        + "-----BEGIN CERTIFICATE-----\n"
        + "MIIFfDCCBGSgAwIBAgIGAU9cHuWJMA0GCSqGSIb\n"
        + "IFByb3h5IEN1c3RvbSBSb290IENlcnRpZmljYXR\n"
        + "...\n"
        + "ieHLSeZtdIrL7cI45ILT4TkahmPVeZmVVCRwQ==\n"
        + "-----END CERTIFICATE-----\n";
    return new Buffer()
        .writeUtf8(charlesRootCa)
        .inputStream();
  }

Point OkHttp at the Proxy

The custom trust example

Java’s new HTTP client is upside down

One of Java 9’s new features is a replacement for HttpURLConnection. And while I’m thrilled to get rid of that old garbage, its replacement has its own surprises.

The Good

The API is small. The new package is java.net.http and the Javadoc shows only 12 new HTTP types plus 6 more for websockets.

It implements HTTP/2. There’s NIO and java.util.concurrent APIs throughout so it might be efficient for many parallel requests.

There’s builders! And an HttpClient type. Java’s built in TLS stack has been upgraded to support ALPN which means it can negotiate HTTP/2 without the awkward jetty-alpn bootclasspath trick.

The Bad

There’s an HttpHeaders type, but it’s an interface and not an immutable value object. There’s no builder for the headers, only for the HttpRequest.

Though HttpURLConnection is now unnecessary, this new client continues to rely on several of its obsolete support classes. The most frustrating of these is URI, which can’t represent real-world URLs such as those in Google’s image charts. And if you want query parameters you have to concatenate strings.
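To make the URI complaint concrete, here is a minimal sketch. It assumes the trouble with chart-style URLs is an unescaped `|` in the query string; the example.com URLs in the note below are made up for illustration, and `UriLimits` is a hypothetical class name. The `URLEncoder.encode(String, Charset)` overload needs Java 10 or newer.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class UriLimits {
  /** Returns true if java.net.URI accepts the string exactly as written. */
  static boolean parses(String url) {
    try {
      new URI(url);
      return true;
    } catch (URISyntaxException e) {
      return false; // URI rejects characters such as '|' unless they are escaped.
    }
  }

  /** There is no query builder, so the caller encodes and concatenates by hand. */
  static URI withQuery(String base, String name, String value) {
    String encoded = URLEncoder.encode(value, StandardCharsets.UTF_8);
    return URI.create(base + "?" + name + "=" + encoded);
  }
}
```

With this sketch, `UriLimits.parses("http://example.com/chart?chl=Hello|World")` is false, while the hand-escaped form built by `withQuery("http://example.com/chart", "chl", "Hello|World")` parses fine.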

Many useful knobs and features are absent from the API

The Last HttpURLConnection

An awkward API

OkHttp 1.0 started out as an optimized implementation of HttpURLConnection. This old API is awkward to implement because there is an implicit state machine that corresponds to the underlying network I/O.

GET / HTTP/1.1  
Host: publicobject.com  
Accept: text/html

HTTP/1.1 200 OK  
Content-Length: 300

<html>  
<head><title>Public Object</title></head>  
...

For example, calling getResponseCode() forbids future calls to setRequestProperty() because request headers can’t be edited after they’ve been transmitted.

HttpURLConnection connection = urlFactory.open(url);  
int responseCode = connection.getResponseCode();

// This fails with an exception at runtime!
connection.setRequestProperty("Accept", "text/html");  

What’s safe to call is especially awkward because the state machine itself is dynamic: setFixedLengthStreamingMode() and setChunkedStreamingMode() cause request body writes to lock in request headers.

An uncomfortable implementation

With OkHttp 2.0 we introduced a new API to complement HttpURLConnection. Instead of an implicit state machine the new API uses types to make an explicit model: Request has the inputs, Response has the outputs, and Call does the action.

Request request = new Request.Builder()  
    .url(url)
    .header("Accept", "text/html")
    .build();

Call call = client.newCall(request

The Open Source Maintainer’s Dilemma

We’re in a golden age of reusable open source code. With GitHub and Maven Central it’s never been easier to create and share code.

This is excellent! Android developers have access to a steady stream of new projects. I keep up by following some Android developers on Twitter and by subscribing to Android Weekly and #AndroidDev Digest.

Releasing a new open source project is fun. Take your most reusable code, polish it off, and publish it to the world! Give back to the development community and earn your reputation.

Free as in puppy

There’s a catch. When you release an open source library you implicitly volunteer to be the ongoing maintainer of that library. It’s an unpaid job that imposes real time commitments.

The work as maintainer is to earn trust. This comes by implementing features, fixing bugs, answering Stack Overflow questions, and by responding to (often inane) GitHub issues. If you succeed your userbase will grow and hopefully it’s mostly fun.

Another option is to not do the work. This feels bad in the same way that ignoring street poverty feels bad. Users of my code ask me for things “Please help with this

Disable Cleartext in OkHttp

Alex Klyubin posted instructions on disabling cleartext networking in Android’s built-in HTTP stack.

Ideally, your app should use secure traffic only, such as by using HTTPS instead of HTTP. Such traffic is protected against eavesdropping and tampering.

Unfortunately that approach requires Android 6 or better. But if you’re using OkHttp you can disable cleartext networking for all versions of Android. Just configure your client’s connection specs:

OkHttpClient client = new OkHttpClient.Builder()  
    .connectionSpecs(Arrays.asList(
        ConnectionSpec.MODERN_TLS,
        ConnectionSpec.COMPATIBLE_TLS))
    .build();

If you want even more control, the HTTPS page on OkHttp’s wiki shows you how.

Deliberate Disobedience

Some friends and I do regular boardgame nights. Our favorite is Tichu. The first few times we played we got the rules wrong: we thought certain plays were legal but they weren’t. By playing more, studying the rules, and Googling the edge cases, we eventually got the rules right. We played the right way for a long time! But after years of doing it by the book we came to agree that one of the rules made the game worse. So collectively we decided to ignore it! And we’ve been enjoying Tichu immensely ever since.

Our understanding of Tichu followed this progression:

  1. Fail to follow the rules because we don’t know them.
  2. Strictly follow the rules.
  3. Deeply understand the rules and deliberately disobey the one we don’t like.

Though we aren’t following the rules in either the first or third cases, the reasons why are quite different. #1 comes from a lack of understanding while #3 comes from an abundance of it. I’ve seen this pattern elsewhere.

Coding

Software developers have no shortage of guidelines to follow spanning all aspects of our craft. Some favorites are Effective Java, Google’s Java Style Guidelines, and

Reflection-friendly Value Objects

There’s plenty of good advice on how to do value objects well in Java. Ryan Harter’s AutoValue intro is particularly handy.

When I’m not using AutoValue, I like to do my builders as recommended in Item 2 of Effective Java Second Edition. In it Josh Bloch recommends a private constructor that accepts the Builder as a constructor parameter:

public final class Pizza {  
  public final Size size;
  public final List<Topping> toppings;
  public final Cheese cheese;

  private Pizza(Builder builder) {
    this.size = builder.size;
    this.toppings = immutableList(builder.toppings);
    this.cheese = builder.cheese;
  }

  public static final class Builder {
    private Size size = Size.MEDIUM;
    private List<Topping> toppings = new ArrayList<>();
    private Cheese cheese = Cheese.MOZZA;

    // …setters…

    public Pizza build() {
      return new Pizza(this);
    }
  }
}

This example also assigns default values in the builder. But these defaults are lost when Moshi or Gson use reflection to decode a pizza from JSON. This problem is explained in Moshi’s readme:

If the class doesn’t have a no-arguments constructor, Moshi can’t assign the field’s default value, even if it’s specified in the field declaration. Instead, the field’s default is always 0 for

Reflection Machines

Conventional wisdom says that reflection is slow and that it should be avoided. As always, the truth is more nuanced.

Let’s start with some code.

class Movie {  
  static final List<Movie> BEST_MOVIES = Arrays.asList(
      new Movie("Back to the Future", 1985),
      new Movie("Back to the Future 2", 1989),
      new Movie("Jurassic Park", 1993),
      new Movie("Starship Troopers", 1997));

  final String name;
  final int releaseYear;

  public Movie(String name, int releaseYear) {
    this.name = name;
    this.releaseYear = releaseYear;
  }
}

We’ve got a movie class. Next we’ll write some gratuitous reflection. How about a logging API with reflection-based string interpolation?

  for (Movie movie : Movie.BEST_MOVIES) {
    FancyLogger.log("I loved $name. Did it win best picture in $releaseYear?!", movie);
  }

The hasty implementation is easy:

public final class FancyLogger {  
  public static void log(String message, Object value) {
    try {
      for (Field field : value.getClass().getDeclaredFields()) {
        field.setAccessible(true);
        String fieldValue = String.valueOf(field.get(value));
        message = message.replaceAll("\\$" + field.getName(), fieldValue);
      }
      System.out.println(message);
    } catch (IllegalAccessException e) {
      throw new AssertionError(e);
    }
  }
}

Unfortunately, the code above is why reflection is slow. The problem isn’t that we’re doing reflection, but that we’re doing reflection every single time
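One way to stop repeating the lookup is to cache each class’s fields the first time they’re needed. The sketch below is an assumption about the shape of a fix, not necessarily where this post goes next; `CachingFancyLogger`, its `FIELDS` map, and the `Film` stand-in are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CachingFancyLogger {
  // Hypothetical cache: one Field[] per class, computed once and then reused.
  private static final Map<Class<?>, Field[]> FIELDS = new ConcurrentHashMap<>();

  private static Field[] fieldsOf(Class<?> type) {
    return FIELDS.computeIfAbsent(type, t -> {
      Field[] fields = t.getDeclaredFields();
      for (Field field : fields) {
        field.setAccessible(true);
      }
      return fields;
    });
  }

  /** Same interpolation as the hasty version, minus the repeated field lookup. */
  static String format(String message, Object value) {
    try {
      for (Field field : fieldsOf(value.getClass())) {
        String fieldValue = String.valueOf(field.get(value));
        // replace() is literal, so a '$' inside fieldValue isn't treated as a regex group.
        message = message.replace("$" + field.getName(), fieldValue);
      }
      return message;
    } catch (IllegalAccessException e) {
      throw new AssertionError(e);
    }
  }

  /** A tiny stand-in for the Movie class so this sketch compiles on its own. */
  static final class Film {
    final String name;
    final int releaseYear;

    Film(String name, int releaseYear) {
      this.name = name;
      this.releaseYear = releaseYear;
    }
  }
}
```

The reflective `field.get()` call still happens on every log statement; only the discovery and `setAccessible()` work is paid once per class.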

Ceilings & Floors

I’ve been thinking about policies and incentives. A lot of motivation comes down to ceilings and floors.

  • Floors: the worst case. We want to raise the floor, limiting the pain in the worst case. Minimum wage is a floor on labor.
  • Ceilings: the best case. We also want to raise the ceiling, improving the pleasure of the best case. A company that expands into new countries raises the ceiling on how many customers they can reach.

Once I start thinking in these terms, they appear everywhere.

Performance Tradeoffs in UTF-8 encoding

In Okio we implemented our own UTF-8 encoder to save the overhead of allocating byte arrays. The encoder was faster for most strings, but not very long ones.

By switching to our own UTF-8 encoder, we made the fastest case faster (raising the ceiling) but made the slowest case slower (lowering the floor).

Security tradeoffs in web session duration

Sign into any website and measure how long it takes for your session to expire. At my bank, the sessions are painfully short, like 60 minutes or less. On Pocket my web session seems to last forever.

Here the short session raises the floor on my security, making session

OkHttp Certificate Pinning Vulnerability!

Bad news. When I added certificate pinning in OkHttp 2.1, I didn’t sanitize the server’s certificate chain. An attacker could exploit this weakness to defeat the protection offered by certificate pinning.

The vulnerability was disclosed to Square by security researcher John Kozyrakis. Whew! We fixed it in OkHttp 3.1.2, and backported that to OkHttp 2.7.4. Matthew McPherrin has requested a CVE for this vulnerability.

If you’re using OkHttp, you should upgrade to the latest version immediately! Staying up to date on OkHttp is a good idea – we track the latest HTTPS cipher suites & TLS versions to balance connectivity with security. It’s like keeping the browser up to date on your computer: staying current is the best way to stay safe.

LinkedHashMap is always better than HashMap

Deterministic code is much easier to debug & understand than nondeterministic code. Got a test that’s failing? Launch the debugger, step through the code, and find the exact statement where an assumption is violated. For me this often looks like a binary search. I add breakpoints and make assertions like “everything is fine before we call pruneCache()”. I’ll run the app, learn something, and repeat.

When you use HashMap in your programs, you’re introducing needless nondeterminism. This is because HashMap’s iteration order is different between platforms (Android vs. Java) and between executions for classes that don’t override equals() and hashCode().

This nondeterminism is toxic to debugging. I think it’s even worse than the nondeterminism caused by multithreading because it’s so unexpected.

Does this program always return the same result?

private static float massPerUnit(  
    float totalMass, float w, float h, float d) {
  Map<CharSequence, Float> dimensions = new HashMap<>();
  dimensions.put(dimension("width", "x"), w);
  dimensions.put(dimension("height", "y"), h);
  dimensions.put(dimension("depth", "z"), d);

  float massPerUnit = totalMass;
  for (Float dimension : dimensions.values()) {
    massPerUnit /= dimension;
  }

  return massPerUnit;
}

Surprise, it doesn’t. And HashMap’s non-deterministic iteration order is to
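For contrast, here is a minimal sketch of the same kind of map built with LinkedHashMap, which iterates in insertion order no matter how the keys hash. `DeterministicOrder` and its `Key` class are hypothetical; `Key` deliberately leaves equals() and hashCode() alone so its hash code can change from run to run.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DeterministicOrder {
  /** A key that inherits identity-based equals() and hashCode() from Object. */
  static final class Key {
    final String name;

    Key(String name) {
      this.name = name;
    }
  }

  /** Builds the map with LinkedHashMap so iteration follows insertion order. */
  static Map<Key, Float> dimensions(float w, float h, float d) {
    Map<Key, Float> result = new LinkedHashMap<>();
    result.put(new Key("width"), w);
    result.put(new Key("height"), h);
    result.put(new Key("depth"), d);
    return result;
  }

  /** Returns the key names in the order the map iterates them. */
  static List<String> iterationOrder(Map<Key, Float> map) {
    List<String> names = new ArrayList<>();
    for (Key key : map.keySet()) {
      names.add(key.name);
    }
    return names;
  }
}
```

Here `iterationOrder(dimensions(w, h, d))` is always width, height, depth, so any arithmetic that walks the values happens in the same order on every run and every platform.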

Your strict naming conventions are a liability

Programmers tend to be pretty obsessive over consistency. Most of that consistency is worthwhile, but some of it is foolish.

For example, let’s take some JSON from GitHub’s gists API:

{
  "files": {
    "Hello.txt": {
      "type": "text/plain",
      "content": "Hello World!\n"
    }
  },
  "created_at": "2014-05-27T02:31:35Z",
  "updated_at": "2015-08-29T14:01:51Z"
}

And we’ll map this JSON structure to some Java classes:

public final class Gist {  
  Map<String, GistFile> files;
  Date createdAt;
  Date updatedAt;
}
public final class GistFile {  
  String type;
  String content;
}

It’s light work for Gson to decode the document into Java objects:

Gist gist = gson.fromJson(json, Gist.class);  

And moments later, heartbreak! Because createdAt and updatedAt fields are both null! The problem of course is that these field names are camelCase in Java (createdAt) vs. snake_case in JSON (created_at).

Well that sucks. We’ve got three bad choices.

Choice 1: Get the framework to flex its muscle

Gson has built-in magic to convert snakes into camels with its LOWER_CASE_WITH_UNDERSCORES FieldNamingPolicy. We make a global configuration change to our Gson instance and the problem is solved.

Gson gson = new GsonBuilder()  
    .setFieldNamingPolicy(LOWER_CASE_WITH_UNDERSCORES)
    .create();

Choice 2

Sneaking Data into an OkHttp Interceptor

OkHttp Interceptors are a fun & powerful way to implement cross-cutting behavior in your app. You can use ’em for authentication, performance, monitoring, and more. There’s even open source interceptors for OAuth signing and curl logging!

Suppose we want to write an interceptor that tracks which activities use the network the most. For each HTTP call we’ll measure how big it is, how long it took, and which activity triggered it:

public final class CallMeasurement {  
  long requestSize;
  long responseSize;
  long requestDurationMillis;
  long responseDurationMillis;
  String activityName;
}

Our interceptor watches all calls & measures them:

public final class MeasuringInterceptor implements Interceptor {  
  final List<CallMeasurement> measurements = new ArrayList<>();

  @Override public Response intercept(Chain chain) {
    ...
  }
}

We have a problem: there’s no obvious way for the interceptor to find out which activity triggered the call. We could cheat by doing pattern matching on the URLs, or by looking at which activity is in the foreground, but both are clumsy and could be unreliable.

It turns out that we can sneak data into an interceptor with a request header! When initiating each request, include a header that contains the data of interest:

Request request = new Request.Builder()  
    .url("...")
    .header

The Play Store is bad at their job

Today Google removed Podcast Addict from the Play Store. Sometimes people publish podcasts containing sexually explicit material, and Podcast Addict is capable of accessing such material.

“Podcast Addict was the #1 podcast app on the Play Store with 4M downloads, 175K reviews and an average rating of 4.6/5 The app almost had 500K daily users and more than 1M episodes were listened to everyday through the app...”

So the Play Store prevents podcast listeners from using a popular podcasting app.

Last week I discovered that my app, Shush Ringer Restorer, has an unauthorized clone in the Play Store. That app has the exact same name as my app, which I assume is an attempt to trick users into downloading the wrong app. I petitioned Google to take down this lookalike app, with links to all the media attention Shush has received.

They declined because I haven’t registered a trademark:

“Thanks for submitting your trademark complaint, dated 01/04/2015. We will not be taking action against the apps in question at this time as we are unable to ascertain the validity of your trademark rights. If you'd like us to investigate further, please send us evidence

4 Brainy Books

Like books? I don’t have the attention span for reading but I listen to plenty of audiobooks. Here are some of my all-time favorites.

On Intelligence

Non-fiction by Jeff Hawkins

How your brain works.
Amazon, Audible

Daemon

Fiction by Daniel Suarez

In Terminator, Skynet just launches the nukes. In Daemon, the AI is much better: it uses coercion!
Amazon, Audible

Antifragile

Non-fiction by Nassim Nicholas Taleb

Feedback loops and failure.
Amazon, Audible

The Martian

Fiction by Andy Weir

“I'm gonna have to science the shit out of this.”
Amazon, Audible

OkUrlFactory is going away

OkHttp exposes three APIs:

  • Request/Response. This is the most fully-featured API. It can be used synchronously or asynchronously. It avoids mutable state. Plus interceptors!
  • OkUrlFactory, the HttpURLConnection API. This is a complete implementation of an awkward API. The API’s awkwardness has resulted in an implementation that is complex and fragile.
  • The Apache HTTP client shim. This is intended to simplify migration. It’s a partial implementation; some methods throw UnsupportedOperationException when invoked.

In OkHttp 3.0, we’re deprecating both OkUrlFactory and the Apache HTTP client shim. And we’ll be deleting them in a follow-up release.

If you’re using the Request/Response API, this is good news. It means we’ll be able to significantly simplify the OkHttp internals yielding better performance and stability.

If you’re using OkUrlFactory this is good news. Finally an excuse to shed some of your own tech debt and adopt a better API. Did I mention interceptors?! If for whatever reason you can’t upgrade, you’ll need to rely on the platform’s built-in HttpURLConnection going forward.

And if you’re on the Apache HTTP client shim, this is merely annoying. You can make your own shim by forking ours

com.squareup.okhttp3

We’re about to start working on OkHttp 3.0. It’ll be like OkHttp 2.7, but with small backwards-incompatible API changes. For example, we’re going to make multipart better.

How can we roll this out without breaking everything? As Jake explains, we’re going to cheat! By changing the package name and Maven group ID, you’ll be able to have both OkHttp 2.7 and OkHttp 3.0 in the same app at the same time.

Shipping two copies of a library is a terrible long-term solution! It bloats your app, consumes more memory, and exacerbates the 65K dex limit. But it’ll enable you to update OkHttp in your application independently from its libraries. Preferably you can get everything upgraded by the time you release your app.

Note that our approach is quite different from Guava’s versioning philosophy. Our package renames will require you to find & replace imports – how annoying! But it also means we can change things without a drawn out deprecation period. Like ripping off the band-aid!

We’re doing this for both OkHttp 3 and Retrofit 2. I’m looking forward to both.

A Convincing Argument

I’m always disagreeing.

I think this might be inevitable. The stuff we agree on is boring, and so the conversation accelerates until it slams into something interesting: a disagreement.

  1. With a neighbour: climate change.
  2. With a family member: basic income.
  3. With a colleague: HBase.
  4. With a friend: messaging software.

I’m not sure where I got this from, but I’ve found a handy–though potentially manipulative–way to advance the conversation:

“What would convince you?”

Suppose we’re arguing about whether to use a blue theme vs. a black theme for a new app. Rather than scattering a bunch of reasons why blue is superior, I ask “What would convince you?” inviting you to tell me exactly which evidence I must present.

You’ll either become convincible by providing concrete problems I must address: “Convince me it won’t get lost in the blue sea of Facebook, Twitter and Inbox. Convince me it will be usable by people who are colorblind. Convince me that blue will strengthen our brand.”

I won’t necessarily be able to address these concerns! Maybe I do the research and discover blue won’t strengthen our brand. But now our disagreement is actionable

Don’t use interfaces for values

If you’re defining a Java API, value objects are your friend. And interfaces aren’t suited to that task.

Java gets this wrong all over:

  • Annotations. They use interfaces to define value objects, and consequently it’s quite awkward for tools like Guice to define annotation values. NamedImpl.java is boilerplate hell.

  • Generic type descriptors. Hidden inside each of Guava, Gson, and Guice are mechanical implementations of GenericArrayType, ParameterizedType, TypeVariable, and WildcardType. This wasted code is particularly tragic because Java 8’s TypeVariable interface broke backwards compatibility.

  • Collections. The List and Set interfaces are too big to implement directly, so we have AbstractList and AbstractSet. If List and Set were just abstract classes, Java 8 wouldn’t have needed default methods!

I’m annoyed that this mistake continues to be made. JAX-RS defined its headers type as an interface without a good way to build an instance in a unit test. Instead every application needs its own implementation for testing.

Interfaces are wonderful! They’re useful in all kinds of modeling problems. But for defining value objects, interfaces are a bad fit.

Publishing Javadoc for GitHub Projects

I recently standardized the location of Javadoc for several of Square’s Java open source projects. Here’s a sample:

We post Javadoc for the latest release of each major version. Each artifact’s docs are published separately.

I wrote Osstrich to make this simple & repeatable. It copies Javadocs from Maven Central to GitHub Project Pages.

If you want to follow this pattern to document your own GitHub projects, release to Maven Central and create a gh-pages branch. Then run the following:

git clone git@github.com:square/osstrich.git  
cd osstrich  
mvn compile  
mvn exec:java \  
  -Dexec.mainClass=com.squareup.osstrich.JavadocPublisher \
  -Dexec.args="temp git@github.com:square/moshi.git com.squareup.moshi"

In this example, Osstrich will download Javadoc for the latest com.squareup.moshi artifacts from Maven Central and post ’em to the git@github.com:square/moshi.git GitHub project.

FlatBuffers aren’t fast, they’re lazy

Lots of great Android developers have been promoting FlatBuffers.

Miroslaw Stanek recently posted benchmarks comparing JSON, FlatBuffers, and a hybrid that uses both. He mentioned that decoding JSON to FlatBuffers is about 30-40% faster than decoding JSON to Java models with Gson.

Colt McAnlis has been referring Android developers to his video on FlatBuffers, where he explains some FlatBuffers implementation details:

“This layout allows FlatBuffers to be type safe and further allows you to read from the serialized type format without having to do any type conversions, memory allocations, or unpacking during load time which is a huge speed boost”

Note that FlatBuffers don’t do memory allocations during load time. Well, then when does the memory allocation happen?

FlatBuffers allocate every single time you access a non-primitive property of your object. Every. Single. Time. This is bad. It means allocation is likely happening in your onDraw() method. The dangers of this are well-explained by Ian Ni-Lewis in Avoiding Allocations in onDraw().

When you use FlatBuffers in an Android app, you’re moving work from the background I/O thread to the main thread. The best way to avoid this is by avoiding FlatBuffers.

OkHttp, HTTP/2 & NGINX 1.9.5

OkHttp’s HTTP/2 doesn’t interop with NGINX 1.9.5. HTTP requests made from OkHttp to impacted NGINX servers will fail like this:

java.io.IOException: stream was reset: PROTOCOL_ERROR  
    at com.squareup.okhttp.internal.spdy.SpdyStream.getResponseHeaders(SpdyStream.java:145)
    at com.squareup.okhttp.internal.http.SpdyTransport.readResponseHeaders(SpdyTransport.java:104)
    at com.squareup.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:830)
    at com.squareup.okhttp.internal.http.HttpEngine.access$200(HttpEngine.java:95)
    at com.squareup.okhttp.internal.http.HttpEngine$NetworkInterceptorChain.proceed(HttpEngine.java:823)
    at com.squareup.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:684)
    at com.squareup.okhttp.Call.getResponse(Call.java:272)

They’ve merged a fix. In the interim, you can either downgrade your NGINX or disable HTTP/2 in OkHttp:

OkHttpClient client = new OkHttpClient();

// Disable HTTP/2 for interop with NGINX 1.9.5.
// TODO: remove this hack after 2015-12-31.
client.setProtocols(Collections.singletonList(Protocol.HTTP_1_1));  

HPACK is what HTTP/2 uses to compress headers. It’s a bit tricky to get right, partially because there are multiple ways to encode headers. In this case OkHttp’s compression strategy triggered a different code path than the major browsers

#enumsmatter

Colt McAnlis has been arguing that Android developers avoid enums because they have a non-negligible runtime cost compared to int constants. In his talk he measures it, and suggests that pervasive use of enums can harm your Android app.

“And the worst part is that you don't really know that enums are causing a problem until they're already infecting your codebase. And by that point trying to fix it is really a horrible process.”

The implied alternative here is a codebase that uses a single type – int – instead of a variety of different enum types. He wants us to go from this:

  public PaymentState cancel(String id, CancellationReason reason);

to this:

  public int cancel(String id, int reason);

That codebase is harder to maintain and I wouldn’t want to work on it! The reason the entire Internet was disgusted by Colt’s advice is that it takes one of the few tools we have to help us to write readable code (types!) and discourages its use.
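A minimal sketch of what gets lost. The enum constants and int constants below are hypothetical, made up to show the difference between the two signatures, and `Payments` is an illustrative class name.

```java
final class Payments {
  enum PaymentState { PENDING, CANCELLED }
  enum CancellationReason { CUSTOMER_REQUEST, FRAUD }

  /** The typed signature: the compiler rejects anything that isn't a reason. */
  static PaymentState cancel(String id, CancellationReason reason) {
    return PaymentState.CANCELLED;
  }

  // Hypothetical int constants standing in for the two enums above.
  static final int STATE_PENDING = 0;
  static final int STATE_CANCELLED = 1;
  static final int REASON_CUSTOMER_REQUEST = 0;
  static final int REASON_FRAUD = 1;

  /** The int signature: any int compiles, including a state passed as a reason. */
  static int cancelWithInts(String id, int reason) {
    return STATE_CANCELLED;
  }
}
```

`Payments.cancelWithInts("p1", Payments.STATE_CANCELLED)` compiles without complaint even though it passes a state where a reason belongs. The equivalent mistake with `cancel()` is a compile error.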

#enumsmatter

ProtoParser is Going Away

ProtoParser is a simple Java project that parses .proto files. It’s a dumb parser: whenever you reference a type, it lacks any mechanism to resolve that reference.

Wire is a much more sophisticated Java project that builds upon ProtoParser. It performs that essential linking to make ProtoParser’s output useful, and then uses that message graph to generate nice model classes.

I’m in the process of folding ProtoParser into Wire, combining the two projects into one. The result will be a new subproject (called wire-schema) that does both parsing and linking. This is how we should have done it all along. That new API isn't stable yet, but it will be soon.

If you’re using ProtoParser, be warned: we’re halting development on that project. That’ll allow us to move faster on Wire, which will have a better, more powerful schema model.

Reactions to JMH

Today I’m playing with JMH, the Java benchmarking harness written by Oracle. I’m hoping to try out some optimizations for Moshi.

Some History

After years of struggling with one-off benchmarking harnesses, Kevin B and I started Caliper in 2009. It started out as a project for benchmarking Android and Java. It knew about dalvikvm and java, and could benchmark an attached device. Later, Greg Kick took over the project and gave it a better understanding of HotSpot JVM, plus a new web UI.

The JMH project was first published in 2013. It addresses the same problems as Caliper but with even better access to HotSpot internals.

JMH’s Runner Is Very Sophisticated

It has flexible concurrency control, a powerful State abstraction, and a thoughtful API. JMH has some amazing features like @CompilerControl(DONT_INLINE) which is going to be quite fun for me later!

The JMH runner is more precise than Caliper’s. Caliper uses reflection to invoke the benchmarking method; this has lots of bad consequences such as the need for a reps parameter. JMH just code-gens the entry point to avoid interference from reflection. Way better.

No Love For Android

Because of broken history between Oracle

Null

In announcing OkHttp's new URL class, I wrote about how parsing returns null instead of throwing exceptions:

Instead, parse() just returns null when it doesn’t understand what you passed it.

Several developers thought this was a lousy API. Derek Morr tweeted:

"parse() just returns null" so, not sane.

And Alex Hvostov redditted:

Are you fucking kidding me? What is this, PHP?

Well, the method returns null because I think it’s the best tool for the job. Let’s review the options:

Checked Exceptions

I reject this option immediately because it punishes callers who know their input is valid.

Unchecked Exceptions

This is tempting. But now I’m punishing callers who expect some of the inputs to parse() to fail. For example, OkHttp’s own redirect handler needs to parse an arbitrary HTTP header, which should be an HTTP URL but could be anything.

public @Nullable Request followUpRequest() throws IOException {  
  String location = response.header("Location");
  if (location == null) {
    return null; // No "Location" header? No follow up.
  }

  HttpUrl redirectUrl = userRequest.httpUrl().resolve(location);
  if (redirectUrl == null) {
    return null; // Location header isn't an HTTP URL? No follow up.
  }

  ...

  return requestBuilder.url(redirectUrl).build();

Wrapping this code in try/catch ceremony

OkHttp 2.4.0-RC1 has HttpUrl

Neither of Java’s built-in APIs makes it easy to get at a URL’s query parameters. Instead, you end up having to muck around with string concatenation, string splitting, URLEncoder, UnsupportedEncodingException, and MalformedURLException.

HttpUrl is a new class that makes URLs easy:

   HttpUrl url = new HttpUrl.Builder()
       .scheme("https")
       .host("www.google.com")
       .addPathSegment("search")
       .addQueryParameter("q", "polar bears")
       .build();

The Javadoc goes on and on explaining why Java needs yet another URL class, and why this one is different.

Get HttpUrl in OkHttp 2.4.0-RC1.

It’s ready on Maven Central and eager to simplify your URL-manipulation code.

<dependency>  
  <groupId>com.squareup.okhttp</groupId>
  <artifactId>okhttp</artifactId>
  <version>2.4.0-RC1</version>
</dependency>  

I invite you to try out this release candidate right away. Unless there are surprises we’ll do a final 2.4.0 release soon.

No beer emoji for java.io.Reader

We’re in the age of Emoji. We fought in the charset wars, obsoleted old bad encodings like ISO-8859-1, and emerged victorious. Being able to sprinkle our writing with sprinkled donut emoji is the English speaker’s upside to ubiquitous UTF-8.

Unfortunately, java.io.Reader is stuck in the UTF-16 ghetto: all of the hard work of internationalization but without the donut emoji.

An example

Let’s read characters from a Reader until we hit ‘🍺’ (0x1f37a), and then we'll stop. The naïve solution doesn't work:

byte[] data = new byte[] {  
    (byte) 0x68, (byte) 0x65, (byte) 0x6c, (byte) 0x6c,
    (byte) 0x6f, (byte) 0xf0, (byte) 0x9f, (byte) 0x8d,
    (byte) 0xa9, (byte) 0x77, (byte) 0x6f, (byte) 0x72,
    (byte) 0x6c, (byte) 0x64, (byte) 0xf0, (byte) 0x9f,
    (byte) 0x8d, (byte) 0xba
};
Reader reader = new InputStreamReader(new ByteArrayInputStream(data));
for (int c; (c = reader.read()) != 0x1f37a; ) {  
  System.out.printf("%08x: %s%n", c, new String(new int[] { c }, 0, 1));
}

This crashes because the single codepoint ‘🍺’ is returned in two halves. We miss the beer altogether and run off the end of the string.

00000068: h  
00000065: e  
0000006c: l  
0000006c: l  
0000006f: o  
0001f369: 🍩  
00000077: w  
0000006f: o  
00000072: r  
0000006c: l  
00000064: d  
0000d83c: ?  
0000df7a
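One possible way to get whole code points out of a Reader is to pair up the surrogate halves by hand. This is a minimal sketch that assumes UTF-8 input; `CodePoints`, `readCodePoint()`, and `countBefore()` are hypothetical helpers, and this isn’t necessarily the fix the post arrives at.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class CodePoints {
  /** Reads one full code point, or returns -1 at the end of the stream. */
  static int readCodePoint(Reader reader) throws IOException {
    int high = reader.read();
    if (high == -1 || !Character.isHighSurrogate((char) high)) {
      return high; // End of stream, or a character that fits in one UTF-16 unit.
    }
    int low = reader.read();
    if (low == -1 || !Character.isLowSurrogate((char) low)) {
      throw new IOException("unpaired surrogate");
    }
    return Character.toCodePoint((char) high, (char) low);
  }

  /** Counts the code points before target, or returns -1 if target never appears. */
  static int countBefore(byte[] utf8, int target) {
    Reader reader = new InputStreamReader(
        new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
    try {
      int count = 0;
      for (int c; (c = readCodePoint(reader)) != target; count++) {
        if (c == -1) return -1;
      }
      return count;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Comparing against 0x1f37a now works because both halves of the surrogate pair are combined before the comparison.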

Checked exceptions vs. API design

Java has checked exceptions and unchecked exceptions.

Checked exceptions are for expected problems that you should deal with and recover from at runtime. Flaky network? Catch the IOException and deal with it.

Unchecked exceptions are for unexpected problems that should trigger a crash. Is the byte count negative? Got a string when you expected a URL? Don't recover; just crash. In 2015 even crashing is luxurious with cloud services to make failure delightful.

But when designing general purpose APIs, it's not always possible to differentiate unexpected vs. expected problems. The classic example is NumberFormatException. When you accept raw data from a user and Integer.parseInt() fails, you probably shouldn't crash. But if you're parsing a trusted JSON file and the input is malformed, you should give up so the problem can be found & fixed.

Today I'm annoyed because I'm parsing data that will be clean sometimes and dirty at other times. Do I risk crashing good programs on bad input? Or do I punish safe calls with needless try/catch ceremony?

Guava works around this in LoadingCache with two methods that only differ in the types of exceptions they throw: get() and getUnchecked(). This both works and is sad
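The two-method pattern can be sketched with nothing but the JDK. `Ints`, `parse()`, and `parseUnchecked()` are hypothetical names chosen to mirror the get() and getUnchecked() pairing; this is not Guava’s code.

```java
import java.text.ParseException;

final class Ints {
  /** For dirty input: the checked exception forces the caller to handle failure. */
  static int parse(String s) throws ParseException {
    try {
      return Integer.parseInt(s.trim());
    } catch (NumberFormatException e) {
      throw new ParseException("not an integer: " + s, 0);
    }
  }

  /** For trusted input: a failure is a bug, so it surfaces as a crash. */
  static int parseUnchecked(String s) {
    try {
      return parse(s);
    } catch (ParseException e) {
      throw new IllegalArgumentException(e);
    }
  }
}
```

Callers handling user input reach for `parse()` and write the try/catch; callers reading a trusted file use `parseUnchecked()` and skip the ceremony. The cost is that every such API needs two entry points.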

How do HTTP caching heuristics work?

Suppose you’ve requested a webpage with a Last-Modified date and no other caching headers:

HTTP/1.1 200 OK  
Last-Modified: Tue, 16 Dec 2014 06:00:00 GMT

...

The HTTP client will happily store this response in the cache indefinitely. But it won't serve it from the cache unless it’s still fresh at the time of request.

How do we decide whether it’s still fresh? There are three timestamps we’re interested in:

  • Last requested at: Timestamp when we made the last request.
  • Last modified at: What the Last-Modified header said on that response.
  • Now: Timestamp at the time of the current request.

We use the time between last requested at and last modified at to estimate how frequently a document is edited. If a page was modified 5 minutes before the request, it’s assumed to be frequently modified. If it was last modified 5 years before the request, it’s assumed to be infrequently modified.

A page is fresh for 10% of that duration: 10% of 5 minutes is 30 seconds; 10% of 5 years is 6 months. A page is considered fresh until that 10% has elapsed since the document was last requested.
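The arithmetic above can be sketched in a few lines. This is a simplified model under stated assumptions: it looks only at the three timestamps and ignores any caps or other caching headers a real HTTP cache would apply. `HeuristicFreshness` and its method names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

final class HeuristicFreshness {
  /** 10% of the time between "last modified at" and "last requested at". */
  static Duration freshnessLifetime(Instant lastModified, Instant lastRequested) {
    Duration sinceModified = Duration.between(lastModified, lastRequested);
    if (sinceModified.isNegative()) {
      return Duration.ZERO; // Last-Modified is after the request: treat as not fresh.
    }
    return sinceModified.dividedBy(10);
  }

  /** Fresh until the lifetime has elapsed since the document was last requested. */
  static boolean isFresh(Instant lastModified, Instant lastRequested, Instant now) {
    Duration sinceRequested = Duration.between(lastRequested, now);
    return sinceRequested.compareTo(freshnessLifetime(lastModified, lastRequested)) < 0;
  }
}
```

For a page modified 5 minutes before it was requested, `freshnessLifetime()` returns 30 seconds, so a second request 29 seconds later is served from the cache and one 31 seconds later is not.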

An