
Writing a custom ESLint rule to spot undeclared props hiding in plain sight

johnny

Nov 1, 2016 — 6 min read

This post walks through designing a custom ESLint rule and assumes no prerequisites. That said, some experience with JavaScript Abstract Syntax Trees (ASTs) will be helpful. Consider watching my talk about getting started with ASTs.

ESLint is a fantastic linting tool to help enforce your team's code conventions. I recently introduced ESLint into a codebase to enforce the convention of allowing a script to use dependencies only by explicitly importing them using the module syntax^[1]. Using the built-in ESLint rule Disallow Undeclared Variables (no-undef), it was easy to find (and remove) most dependencies that were declared using the outdated Namespaces convention.
However, I also found cases where the implicit dependencies exist as properties of other objects. Because the no-undef rule only handles variables, it doesn't detect them. Thankfully, ESLint can still be used to solve these cases as well. As they state in their About page:

The primary reason ESLint was created was to allow developers to create their own linting rules

So I created my own set of ESLint rules to catch cases where undeclared dependencies hide out as object properties. This post explains how ESLint analyzes scope, and shows how to write custom ESLint rules by walking through how these rules were written. If you want to use these rules in your own project, you can find them on npm and GitHub.

The problem

First, let's give an example of an "undeclared property" that we would like our new rule to be able to identify.
Consider the result of linting this code with ESLint with just the no-undef rule:

/*eslint-env browser*/
window.location.reload();
Foo.bar();
window.Foo.baz();

ESLint's output for this will be 'Foo' is not defined. (at 3:1), so only line #3 is classified as a violation while line #4 is not, despite the fact that no browser exposes a property called window.Foo and that this is just another way of referencing the undefined variable Foo.
Let's go line-by-line to understand exactly why that is:

1. A special comment that ESLint recognizes, telling it that the script is going to run in a browser environment.
2. window is not declared as a variable, a parameter or an imported dependency. But it's considered an allowed global because of the comment in line #1. So no violation is triggered, and this is the desired behavior.
3. Foo is not declared as a variable, a parameter or an imported dependency. no-undef will trigger a violation, and this is the desired behavior.
4. ESLint sees that this line refers to window, which is an allowed global in browser code, so no violation will be triggered.

In browsers, any object that is accessible globally is also accessible as a property of the window object, but no-undef disregards the latter. By making use of this distinction, devs can get around the convention we're trying to establish by attaching variables to the window object.
I lovingly refer to this foul practice as "piggybacking":
[Image: Bear Piggybacking]

The desired behavior is that our new rule will be able to distinguish between valid window properties such as window.location and invalid properties such as window.Foo, and alert accordingly.

TDD: Theft Driven Design

One of the easiest ways to get started with writing your own rule is to find an existing rule that does something similar and to understand how it works. Since no-undef already distinguishes between declared and undeclared variables, we can take a look at its source code and ~~steal~~ draw inspiration.
Let's focus on the part of the code that's responsible for identifying and reporting on undeclared references (the rest of the code is mostly ESLint rule boilerplate, which you can read about in the official guide).

create(context) {
  return {
    "Program:exit"(/* node */) {
      const globalScope = context.getScope();
      globalScope.through.forEach(function(ref) {
        context.report({node: ref.identifier, message: "'{{name}}' is not defined.", data: ref.identifier});
      });
    }
  };
}

Let's evaluate this line-by-line to understand what's going on:

1. "Program:exit" defines a handler that ESLint calls when it finishes traversing a Program node (that's the root node, so in this case it will be called only once).
2. context.getScope() returns an object that represents the scope for the current node (which is the global scope in the case of the root node).
3. The forEach call iterates over the references in globalScope.through.
4. context.report reports an undefined violation for each one.

As you can see, this is just a few lines of code and globalScope.through already holds a list of undefined references, so all this rule does is read from that list and report a violation for each item. This means the heavy lifting of parsing the AST and evaluating scope is done internally by ESLint, which exposes this to the rule author. To really understand what's going on, we need to learn how ESLint analyzes scope.
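To make the flow above concrete, here is a minimal sketch that runs a condensed copy of the no-undef excerpt against a hand-built fake context, without ESLint installed. The names interpolate, noUndefExcerpt, fakeNoUndefContext and undefMessages are made up for this illustration, and interpolate is only a simplified stand-in for ESLint's own placeholder substitution.

```javascript
// Simplified stand-in (assumption) for how ESLint fills {{placeholders}} from `data`.
function interpolate(message, data) {
  return message.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
    key in data ? String(data[key]) : match
  );
}

// Condensed copy of the no-undef excerpt shown above.
const noUndefExcerpt = {
  create(context) {
    return {
      "Program:exit"() {
        const globalScope = context.getScope();
        globalScope.through.forEach((ref) => {
          context.report({
            node: ref.identifier,
            message: "'{{name}}' is not defined.",
            data: ref.identifier, // the identifier node has a `name` property
          });
        });
      },
    };
  },
};

// Hand-built fake context: `through` holds a single unresolved reference to Foo.
const undefMessages = [];
const fakeNoUndefContext = {
  getScope: () => ({
    through: [{ identifier: { type: "Identifier", name: "Foo" } }],
  }),
  report: ({ message, data }) => undefMessages.push(interpolate(message, data)),
};

noUndefExcerpt.create(fakeNoUndefContext)["Program:exit"]();
console.log(undefMessages[0]); // 'Foo' is not defined.
```

The rule itself never inspects the AST here; everything it reports comes straight from the through list it is handed.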

Detour: How ESLint analyzes scope

ESLint uses a library called Escope to analyze scope. Escope is the de facto tool for evaluating scope in a given AST. The awkwardly named property through comes from Escope. The Escope docs tell us what it contains:

through: The references that are not resolved with this scope.

Escope has no concept of the environment where the code is being executed. This means it can't differentiate between window and wind0w if neither of them is defined in the script, and both will end up in the through array.
However, ESLint doesn't use Escope as-is. It complements its use with a library called Globals, which is just a well-maintained JSON file documenting "Global identifiers from different JavaScript environments."
For example, for a browser execution environment, Globals lists 651 different globals such as alert, document, etc. So if the user configures ESLint to lint browser code, ESLint fetches the list of well-known browser globals and removes their occurrences from the through array.
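The two-step filtering described above can be pictured with a toy model. This is only a sketch that works on plain arrays of names; Escope and ESLint operate on real scope objects built from the AST, and the function names and the three-entry browserGlobals list below are invented for illustration.

```javascript
// Toy model only: real scope analysis works on AST scopes, not on string arrays.

// Mimics Escope's `through`: references that no declaration in the script resolves.
function unresolvedReferences(declaredNames, referencedNames) {
  const declared = new Set(declaredNames);
  return referencedNames.filter((name) => !declared.has(name));
}

// Mimics ESLint's extra step: drop references the configured environment provides.
function withoutKnownGlobals(through, knownGlobals) {
  const known = new Set(knownGlobals);
  return through.filter((name) => !known.has(name));
}

// Hypothetical, tiny stand-in for the browser list in the Globals package.
const browserGlobals = ["window", "document", "alert"];

const through = unresolvedReferences(
  ["myVar"],
  ["myVar", "window", "wind0w", "Foo"]
);
const reported = withoutKnownGlobals(through, browserGlobals);

console.log(through);  // [ 'window', 'wind0w', 'Foo' ]
console.log(reported); // [ 'wind0w', 'Foo' ]
```

Without the environment list, window and wind0w are indistinguishable; with it, only the names the browser doesn't provide remain.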

This explains why no-undef works so well with just a few lines of code, but it also means that we can't reuse the globalScope.through array for our purposes. through is a blacklist of unresolved global references, and we need a whitelist of resolved global references to compare against. Luckily, ESLint exposes this as well. If we debug the no-undef rule and look at what other properties globalScope has other than through, we can find what we're looking for:

globalScope.variables

globalScope.variables is an array of variables that are available in the global scope. ESLint fetches the list of well-known browser globals and adds them to this list.

Putting it all together

Equipped with the list of known globals, writing our rule becomes quite simple. Here's how it will work:

1. Iterate over expressions in our code with the format {{object}}.{{property}} (these would be the MemberExpression nodes in our AST).
2. Check if the {{object}} part is a global scope identifier (e.g. window) and the {{property}} is not in the list of valid global variables (by comparing to globalScope.variables).
3. If so, report a "piggybacking" violation.

Here's the code that implements this. It's a simplified version, so I recommend taking a look at the actual rule on GitHub:

create(context) {
  var globalScope;

  return {
    // Capture the global scope once, when traversal enters the root node.
    "Program": function() {
      globalScope = context.getScope();
    },
    // Visit every {{object}}.{{property}} expression.
    "MemberExpression": function(node) {
      if (node.object.name === "window" && !isGlobalProperty(node.property)) {
        context.report(node, "'{{propertyName}}' piggybacks on '{{objectName}}' to extend the global scope", { propertyName: node.property.name, objectName: node.object.name });
      }
    }
  };

  // True if the property's name matches a variable known in the global scope.
  function isGlobalProperty(node) {
    return globalScope.variables.some(function(variable) {
      return variable.name === node.name;
    });
  }
}

With this rule in place, linting a script like this:

/*eslint-env browser*/
window.Foo.baz();

will yield 'Foo' piggybacks on 'window' to extend the global scope (at 2:1).
Et voilà!
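If you want to poke at the handler logic without wiring up ESLint, one option is to call it with hand-built stand-ins. The sketch below wraps a condensed copy of the simplified rule in an object and feeds it fake nodes. The names piggybackRule, fakePiggybackContext, piggybackReports and memberExpression are invented here, and the two-entry variables list is an assumption standing in for the real global scope.

```javascript
// Condensed copy of the simplified rule, wrapped in an object so it can be called directly.
const piggybackRule = {
  create(context) {
    let globalScope;

    function isGlobalProperty(node) {
      return globalScope.variables.some((variable) => variable.name === node.name);
    }

    return {
      Program() {
        globalScope = context.getScope();
      },
      MemberExpression(node) {
        if (node.object.name === "window" && !isGlobalProperty(node.property)) {
          context.report(
            node,
            "'{{propertyName}}' piggybacks on '{{objectName}}' to extend the global scope",
            { propertyName: node.property.name, objectName: node.object.name }
          );
        }
      },
    };
  },
};

// Hand-built stand-ins (assumptions): a fake context and a tiny list of known globals.
const piggybackReports = [];
const fakePiggybackContext = {
  getScope: () => ({ variables: [{ name: "window" }, { name: "location" }] }),
  report: (node, message, data) => piggybackReports.push({ node, message, data }),
};

// Builds a minimal fake MemberExpression node for `objectName.propertyName`.
const memberExpression = (objectName, propertyName) => ({
  type: "MemberExpression",
  object: { type: "Identifier", name: objectName },
  property: { type: "Identifier", name: propertyName },
});

const handlers = piggybackRule.create(fakePiggybackContext);
handlers.Program();
handlers.MemberExpression(memberExpression("window", "location")); // known global, allowed
handlers.MemberExpression(memberExpression("window", "Foo"));      // unknown, reported

console.log(piggybackReports.map((r) => r.data.propertyName)); // [ 'Foo' ]
```

This only checks the handler's branching. It doesn't replace running the rule through ESLint on real source code.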

Bonus: A rule to spot pesky jQuery plugins

The published version of this rule is part of an ESLint plugin that includes another custom rule called no-jquery-extend. It solves a similar problem to the one we've just solved, that of jQuery plugins "piggybacking" on jQuery Core.
Consider this code:

import $ from 'jquery';
 
$.ajax( ... );
$.cookie( ... );
$.when( ... );

While these jQuery calls look the same, $.cookie is not a valid jQuery method. It's a jQuery plugin. Using jQuery plugins is also an outdated pattern^[2], and it's better to import and use a similar service that provides the same functionality without extending jQuery^[3].
Linting this code will yield 'cookie' piggybacks on '$' to extend jQuery (at 4:1).
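For a rough idea of how such a check could work, here is a hypothetical sketch. It is not the published no-jquery-extend implementation, which may work quite differently. It assumes a short hand-written allowlist (jQueryCoreMembers) in place of jQuery Core's real API surface, and it uses invented names and a fake context so it runs without ESLint.

```javascript
// Hypothetical sketch: NOT the published no-jquery-extend implementation.
// Assumption: a short hand-written allowlist stands in for jQuery Core's full API.
const jQueryCoreMembers = new Set(["ajax", "when", "each", "extend"]);

const noJqueryExtendSketch = {
  create(context) {
    return {
      MemberExpression(node) {
        const isJquery = node.object.name === "$" || node.object.name === "jQuery";
        // Skip computed access such as $[name], where the property isn't a plain identifier.
        if (isJquery && !node.computed && !jQueryCoreMembers.has(node.property.name)) {
          context.report({
            node,
            message: "'{{propertyName}}' piggybacks on '{{objectName}}' to extend jQuery",
            data: { propertyName: node.property.name, objectName: node.object.name },
          });
        }
      },
    };
  },
};

// Fake context and nodes (assumptions) so the sketch runs without ESLint.
const jqueryReports = [];
const fakeJqueryContext = { report: (descriptor) => jqueryReports.push(descriptor) };
const jqueryMember = (propertyName) => ({
  type: "MemberExpression",
  computed: false,
  object: { type: "Identifier", name: "$" },
  property: { type: "Identifier", name: propertyName },
});

const jqueryHandlers = noJqueryExtendSketch.create(fakeJqueryContext);
["ajax", "cookie", "when"].forEach((name) =>
  jqueryHandlers.MemberExpression(jqueryMember(name))
);

console.log(jqueryReports.map((r) => r.data.propertyName)); // [ 'cookie' ]
```

A hand-maintained allowlist is the weak point of this approach, since it has to track whichever jQuery version the project actually uses.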

Feel free to reach out to me on Twitter for feedback or questions about tinkering with ESLint and ASTs.
It's also worth mentioning that in my experience, the ESLint team is super friendly and always willing to help out on GitHub and Gitter.

Follow @cowchimp

[1] Following this convention has many merits, mainly that the dependency tree can be automatically inferred from the code, so developers don't have to manually manage which scripts get bundled together.
[2] Probably a relic from the days when this was the norm.
[3] For example, the author of the $.cookie plugin created a library called js-cookie, which supersedes $.cookie and has no dependency on jQuery.