How to Test API Usability: Part 1

Disclaimer: this is a translation of the article written 2 years ago for a corporate blog. Bear in mind, at the moment of the writing I was testing SOAP services and Excel-based import/export at big government project, so most of the examples relate to that experience.

Usability is one of the most crucial quality attributes of an API. Let’s talk about why, when, and how we can assess this characteristic.

Today (hopefully) no one doubts the necessity of usability testing of GUI. Yet, according to ISO 9241, usability is the effectiveness, efficiency and satisfaction with which specified users achieve specified goals in particular environments. There is no mention of menus, fonts, or buttons color. Hence, we can evaluate usability of any product, be it a mobile app, a vacuum cleaner, or an API.

For testing API usability we can use methods developed in the field of HCI; same as used for GUI. Generally, these methods can be divided into two categories: analytical and empirical.

Analytical Methods

Analytical methods involve exploration based on some expert knowledge. Loosely speaking, you and/or the whole dev team try to evaluate and find hypothetical usability problems without users input.

Heuristic Evaluation

Easiest way is to use heuristics. There are no strict lists of what criteria you should check; all depends on what kind of API you have (e.g., library or REST service).

For instance, a paper on a structural analysis of usability problem categories mentions this set of heuristics:

Let’s try to apply some of these heuristics. There was a time when every new tester came to me during the onboarding and asked about the error message “House with ID <> was not found.” I told them to use internal system id instead of global FIAS id (Russian index system for buildings). And every one looked startled and answered me that there is no such parameter in the API request! Well, the problem was that you had to use the same parameter named FIASHouseGUID. For some reason when system was designed no one thought that the better name would have been HouseID as it could be filled either with FIAS id or with internal id. Even though current name was misleading (naming heuristic), it was no longer possible to change without breaking backward compatibility.

Next example is about error handling. One service I tested had a very common error “Access is denied.” There were numerous reasons for this error: no entitling documents, documents are not in the status “published,” other organization already created the same object. Reasons were different, but the received error message was the same; users couldn’t guess what was their problem.

There are other, more “serious” heuristics for API. Often they target specific technical details. You need to be able to code to understand them. For example, criteria from Joshua Bloch. Or a usability research made by Microsoft to determine which constructor is better: default constructor with setters and getters or constructor with required parameters. Results showed that the first method was more preferable; this became a heuristic for library design.

Cognitive Dimensions

These are distinct criteria used predominately for evaluating usability of notations, user interfaces, and programming languages — or, generally speaking, information artifacts. In my view, they intersect with some heuristics, but there is a difference: heuristics are contextually selected by some experts, whereas cognitive dimensions are more or less stable set of principles. You can read about the main set described by Thomas R.G. Green and Marian Petre in the Wikipedia.

Some companies customize cognitive dimensions for their needs, like a framework suggested by Visual Studio usability group:

Here is an example for domain correspondence. Service main entity was a house. Common apartment building can have several entryways, each leading to set of apartments. But in Kaliningrad this doesn’t apply: a typical address there can look like “2-4 Green Street,” where entryways are house 2 and house 4. This bizarre (and initially unknown) domain model broke the whole logic behind API design. For instance, we had to allow users to add house-level metering devices to entryways if it’s actually a house.

Cognitive Walkthrough

While the first two methods are based on checking API against some kind of list of criteria, cognitive walkthrough is closer to scenario-based testing. Essentially, an expert comes up with typical API usage scenarios and attempts to perform them.

Cognitive walkthrough example

You can combine this method with heuristics. When we analyzed services, we found out that there were problems with the consistency: when you sent a request to create an entity, some services responded with entity version id, while others provided root id. Moreover, most of the services required entity id for creation of other entities, and again, it could be either root or version id. It didn’t look that bad, until we tried walking through a business scenario:

  1. Create an entitling document
  2. Create a metering device providing document root id

In existing API workflow you had to do it in 3 steps instead of 2:

  1. Create an entitling document → server responds with document version id
  2. Retrieve the document using provided version id and get document root id from the response
  3. Create a metering device providing document root id

This middle step is objectively unnecessary and generates additional server load. Here, using cognitive walkthrough, we also detected an inconsistency with heuristic “minimal working code size.”

API Peer Review

Heuristics and walkthroughs are great methods, but they could be quite subjective. For better objectivity you’d better use group evaluations, where several persons analyze API. You can read about how and why this method can find usability problems which are rarely found by empirical methods in this Microsoft paper.

Peer reviews involve these four roles:

During the planning process, a usability expert and a chunk owner should discuss:

You should start a peer review session with the explanation of how this meeting will proceed and communicate basic information related to the evaluated API chunk. Next you distribute code examples and discuss them, asking these main questions:

Based on the answers, a usability expert asks to elaborate details. For example, hearing that naming is weird, expert should ask why a person thinks that way and what different name would be better.

The final step is to analyze problems. Here is where an API unit owner can help to identify the most significant issues and could they be resolved or not.

That’s the end of part one. Empirical methods are covered in part two.