Corey Prophitt


Revisiting Bookmarklets

Published on August 17, 2020

···

Everything Old is New Again

Bookmarklets are old news. According to Wikipedia, they have been around since the year 2000. Their usefulness and popularity have certainly waned over the last 20 years as browser extensions have grown in popularity. However, you can still create bookmarklets, and unlike modern browser extensions, they work on all major web browsers (even on mobile).

I maintain a few products that rely on browser extensions to provide convenient product functionality. When it comes to your product and business, depending on Google's web store is risky, to say the least. I have been experimenting with the idea of offering a bookmarklet in addition to my product's browser extension. At the very least, I would like to offer a bookmarklet as a fallback in the event things ever go south with Google.

Unfortunately (or fortunately, depending on how you look at it), bookmarklets have been stripped of some useful functionality. Due to security concerns, most websites place content security policies on their webpages. These policies limit which resources and domains can be used within the context of the page.
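These policies are typically delivered in a `Content-Security-Policy` response header. Here is a minimal sketch of how such a header breaks down into directives, assuming a made-up policy string and a hypothetical `parseCsp` helper. It only illustrates the shape of a policy and does not reproduce the browser's enforcement logic:

```javascript
// Hypothetical helper: split a Content-Security-Policy header value
// into a map of directive name -> list of allowed sources.
function parseCsp(headerValue) {
  var directives = {};

  headerValue.split(";").forEach(function (part) {
    var tokens = part.trim().split(/\s+/);
    var name = tokens.shift();

    if (!name) return; // skip empty segments such as a trailing ";"

    name = name.toLowerCase();

    // Keep only the first occurrence of a directive name.
    if (!Object.prototype.hasOwnProperty.call(directives, name)) {
      directives[name] = tokens;
    }
  });

  return directives;
}

// Example policy, made up for illustration.
var policy = parseCsp(
  "default-src 'self'; script-src 'self' https://cdn.example.com; connect-src 'none'"
);

console.log(policy["script-src"]);
console.log(policy["connect-src"]);
```

In this made-up policy, a directive like `connect-src 'none'` is the kind of rule that stops code running in the page from making `fetch` requests to other services.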

As a result of the content policies bookmarklets face a few restrictions:

You are probably thinking bookmarklets are useless with such restrictions. However, there are quite a few upsides to bookmarklets.

Restricted but not useless.

Bookmarklets may not share the same level of control and freedom browser extensions have, but they do offer a number of upsides browser extensions are unable to provide. Here are a few upsides to consider:

  1. You don't need to rely on a third-party web store for distribution.
  2. The same bookmarklet code works everywhere, even on mobile browsers.
  3. No automatic updates. I prefer this over the extension model of automatic updates. In the extension world, you never know what code is running in your browser at any given point in time. If an extension is hijacked or sold to a malicious third party, you could be running malicious code.
  4. Bookmarklets can skirt by a strict IT department. Believe it or not, I have enterprise customers that are unable to use my product's extension because their IT department forbids browser extensions.

Bookmarklets are far from perfect and may not be ideal for your specific needs. There are two common patterns where I see bookmarklets remain viable alternatives to browser extensions:

  1. You want to modify the target page in some way.
  2. You want to extract data and present it locally or transfer the data to another service.

I will demonstrate both of the patterns above with a few bookmarklets I created for this blog post.

Pattern 1: Modifying Pages (Div Highlighter)

The simplest of all use cases for a bookmarklet is to make a temporary modification to a web page. For example, removing an auth wall, changing a page from "light" mode to "dark" mode, etc. To demonstrate how to accomplish this functionality with a bookmarklet, I took a Chrome browser extension and converted it into a bookmarklet with identical functionality.

After some digging around, I found the div-highlighter extension by itsHobbes. When clicked, the extension highlights all divs on the current page with a random border and background color. This is a common debugging technique, and the extension is a great example of this pattern. I took the code from the extension and wrapped it inside a function to avoid polluting the global namespace.

The complete code can be seen below:

(function () {
  var MIN_PARENT_COUNT = 0;
  var MAX_PARENT_COUNT = 3;

  function rgba() {
      var o = Math.round, r = Math.random, s = 255;
      return 'rgba(' + o(r() * s) + ',' + o(r() * s) + ',' + o(r() * s) + ',' + 0.4 + ')';
  }

  var divs = document.getElementsByTagName('div');

  for (var i = 0; i < divs.length; i++) {
      if (divs[i].offsetHeight == 0 || divs[i].offsetWidth == 0) {
          continue;
      }

      var parents = 0;
      var node = divs[i];

      while (node != null) {
          if (node.tagName == 'DIV') {
              parents++;
          }

          node = node.parentNode;
      }

      if (parents >= MIN_PARENT_COUNT && parents <= MAX_PARENT_COUNT) {
          var color = rgba();

          divs[i].style.boxSizing = 'border-box';
          divs[i].style.border = '2px solid ' + color;
          divs[i].style.backgroundColor = color;
      }
  }
})();

The minified location string is as follows:

(function(){var MIN_PARENT_COUNT=0;var MAX_PARENT_COUNT=3;function rgba(){var o=Math.round,r=Math.random,s=255;return"rgba("+o(r()*s)+","+o(r()*s)+","+o(r()*s)+","+.4+")"}var divs=document.getElementsByTagName("div");for(var i=0;i<divs.length;i++){if(divs[i].offsetHeight==0||divs[i].offsetWidth==0){continue}var parents=0;var node=divs[i];while(node!=null){if(node.tagName=="DIV"){parents++}node=node.parentNode}if(parents>=MIN_PARENT_COUNT&&parents<=MAX_PARENT_COUNT){var color=rgba();divs[i].style.boxSizing="border-box";divs[i].style.border="2px solid "+color;divs[i].style.backgroundColor=color}}})();

You may notice the code for itsHobbes' extension and the bookmarklet are identical. Bookmarklets and extension content scripts work in essentially the same way.
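To install a string like the one above, it has to be saved as a bookmark's URL with a `javascript:` prefix. Here is a minimal sketch of that step, assuming a hypothetical helper named `toBookmarklet` (it is not part of the extension or this post's original code). Percent-encoding the source is a conservative choice: the browser percent-decodes a `javascript:` URL before running it, so a raw `%` sequence in the source could otherwise be altered.

```javascript
// Hypothetical helper: turn bookmarklet source code into a bookmark URL.
function toBookmarklet(source) {
  // Trim surrounding whitespace, then percent-encode so quotes,
  // spaces, and "%" characters survive being stored in a URL.
  return "javascript:" + encodeURIComponent(source.trim());
}

var bookmarkletSource = '(function () { alert("50% done"); })();';
var bookmarkletUrl = toBookmarklet(bookmarkletSource);

console.log(bookmarkletUrl);
```

The resulting string can be pasted into the URL field of a new bookmark, or used as the `href` of a link that readers drag to their bookmarks bar.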

Pattern 2: Remote Data Extraction (Wayback Machine)

Another commonly seen pattern involves the extraction of some data from the page and the transfer of the data to another remote location. This pattern is often used to "share a post", "post a message", etc. For example, the Hacker News bookmarklet uses this pattern.

To demonstrate this pattern, I created a simple bookmarklet that determines the hostname and protocol for the current web page and opens the page within Wayback Machine's archive. Essentially, we are extracting the URL and transferring it to Wayback Machine in a new browser tab.

The complete code can be seen below:

(function () {
  var host = location.protocol + "//" + location.hostname;
  window.open("https://web.archive.org/web/*/" + host, "_blank", "noreferrer noopener");
})();

The minified location string is as follows:

(function(){var host=location.protocol+"//"+location.hostname;window.open("https://web.archive.org/web/*/"+host,"_blank","noreferrer noopener")})();

The demonstration above only extracted the protocol and host address from the web page, but you can extract any information from the page. The only limitation is how much data you can safely fit in a URL.

The general consensus is that you should limit URL length to 2,000 characters or fewer for maximum browser support. I have tested Chrome and Firefox with very large URLs (over 40,000 characters) and both worked as intended. However, it is probably a good idea to go with the general consensus unless you have a good reason not to.
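When a bookmarklet sends more than a hostname, it helps to encode each value and check the final length before opening the new tab. Here is a minimal sketch, assuming a hypothetical `buildShareUrl` helper and a made-up `https://example.com/share` endpoint; neither comes from a real service.

```javascript
// Conservative length budget, following the general consensus mentioned above.
var MAX_URL_LENGTH = 2000;

// Hypothetical helper: build a URL with encoded query parameters and
// refuse to return one that exceeds the length budget.
function buildShareUrl(base, params, maxLength) {
  var limit = maxLength || MAX_URL_LENGTH;
  var parts = [];

  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      parts.push(encodeURIComponent(key) + "=" + encodeURIComponent(params[key]));
    }
  }

  var url = parts.length ? base + "?" + parts.join("&") : base;

  // Return null instead of a truncated URL so the caller can shorten the payload.
  return url.length <= limit ? url : null;
}

// The endpoint and parameter names below are assumptions for illustration.
var shareUrl = buildShareUrl("https://example.com/share", {
  u: "https://news.example.org/a?b=1",
  t: "Hello & welcome"
});

console.log(shareUrl);
```

Inside a real bookmarklet, the values would come from the page (for example `location.href` and `document.title`), and a `null` result would be the cue to trim the data before calling `window.open`.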

Pattern 2: Local Data Extraction (LinkedIn Search Result Scraper)

The data extraction example above extracted simple information from a web page and transferred it to another domain via a new tab. Another common variation of the pattern involves locally downloading data instead of transferring the data to a new location.

To demonstrate this I created a bookmarklet that scrapes leads from a LinkedIn search results page and downloads the leads in a CSV file for local processing.

The complete code can be seen below:

(function () {
  var links = document.querySelectorAll('a[data-control-name="search_srp_result"]');

  if (links.length === 0) return;

  function escape(s) {
    // Use a global regex so every embedded quote is doubled, not just the first.
    return '"' + s.replace(/"/g, '""') + '"';
  }

  function download(data) {
    var blob = new Blob([data], {type: 'text/csv'}),
        e    = document.createEvent('MouseEvents'),
        a    = document.createElement('a');

    // Point a temporary link at the in-memory CSV and simulate a click on it.
    a.download = 'data.csv';
    a.href = window.URL.createObjectURL(blob);
    a.dataset.downloadurl = ['text/csv', a.download, a.href].join(':');
    e.initMouseEvent('click', true, false, window, 0, 0, 0, 0, 0, false, false, false, false, 0, null);
    a.dispatchEvent(e);
  }

  // Generate CSV rows.

  var rows = [ ['name', 'occupation', 'location', 'has_premium', 'linkedin_url'] ];

  for (var i = 0; i < links.length; i += 1) {
    var l  = links[i];
    var n  = l.querySelector('.actor-name');
    var s1 = l.parentNode.querySelector('.subline-level-1');
    var s2 = l.parentNode.querySelector('.subline-level-2');
    var p  = l.querySelector('.premium-icon');

    if (!n) continue;

    rows.push([escape(n.innerText), s1 ? escape(s1.innerText) : null, s2 ? escape(s2.innerText) : null, p ? true : false, l.href]);
  }

  // Generate the CSV data.

  var csv = '';

  for (var j = 0; j < rows.length; j += 1) {
    csv += rows[j].join(",") + "\n";
  }

  download(csv);
})();

The minified location string is as follows:

(function(){var links=document.querySelectorAll('a[data-control-name="search_srp_result"]');if(links.length===0)return;function escape(s){return'"'+s.replace(/"/g,'""')+'"'}function download(data){var blob=new Blob([data],{type:"text/csv"}),e=document.createEvent("MouseEvents"),a=document.createElement("a");a.download="data.csv";a.href=window.URL.createObjectURL(blob);a.dataset.downloadurl=["text/csv",a.download,a.href].join(":");e.initMouseEvent("click",true,false,window,0,0,0,0,0,false,false,false,false,0,null);a.dispatchEvent(e)}var rows=[["name","occupation","location","has_premium","linkedin_url"]];for(var i=0;i<links.length;i+=1){var l=links[i];var n=l.querySelector(".actor-name");var s1=l.parentNode.querySelector(".subline-level-1");var s2=l.parentNode.querySelector(".subline-level-2");var p=l.querySelector(".premium-icon");if(!n)continue;rows.push([escape(n.innerText),s1?escape(s1.innerText):null,s2?escape(s2.innerText):null,p?true:false,l.href])}var csv="";for(var j=0;j<rows.length;j+=1){csv+=rows[j].join(",")+"\n"}download(csv)})();
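Field values scraped from a page can contain commas, quotes, or line breaks, so each one needs quoting before it is joined into a row. Here is a minimal sketch of that escaping, following the usual CSV convention of doubling embedded quotes. `csvField` and `toCsv` are hypothetical helper names, not part of the bookmarklet above.

```javascript
// Hypothetical helper: escape a single value for use in a CSV row.
function csvField(value) {
  if (value === null || value === undefined) return "";

  var s = String(value);

  // Quote only when the value contains a delimiter, quote, or line break,
  // and double every embedded quote.
  if (/[",\r\n]/.test(s)) {
    return '"' + s.replace(/"/g, '""') + '"';
  }

  return s;
}

// Hypothetical helper: turn an array of rows into CSV text.
function toCsv(rows) {
  return rows.map(function (row) {
    return row.map(csvField).join(",");
  }).join("\r\n") + "\r\n";
}

// Sample data, made up for illustration.
var sampleCsv = toCsv([
  ["name", "occupation", "has_premium"],
  ['Ann "AJ" Lee', "Engineer, Platform", true],
  ["Bo", null, false]
]);

console.log(sampleCsv);
```

Running every column through the same function, including URLs and booleans, avoids having to decide case by case which fields might contain a comma.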

Pattern 2: Local Data Extraction (GitHub Email Finder)

A final variation of the data extraction pattern involves extracting information beyond what is available on the page itself. Bookmarklet code is run within the context of the webpage and can make web requests and access any data the target page has access to.

To demonstrate this, I created a bookmarklet that works on a GitHub user's profile page to locate their email address. The bookmarklet extracts the user's GitHub username from the URL, then queries GitHub's API for the user's public events and looks for commit author emails in their push events.

The complete code can be seen below:

(function () {
  var name = document.querySelector('.vcard-fullname');
  var match = location.href.match(/github\.com\/([a-z0-9-_]+)/i);

  if (!name || !match || !match[1]) return;

  fetch("https://api.github.com/users/"+ match[1] +"/events").then(function (r) {
    r.json().then(function (data) {
      var emails     = [];
      var seen       = {};
      var nameTokens = [];

      name.innerText.trim().toLowerCase().split(" ").forEach(function (n) {
        nameTokens.push(n.slice(0, 5));
      });

      data.forEach(function (e) {
        if (e.type == "PushEvent" && e.actor.login == match[1]) {
          if (e.payload.commits && e.payload.commits[0]) {
            var email = e.payload.commits[0].author.email.trim().toLowerCase();

            if (!email.match(/noreply\.github\.com/i) && !seen[email]) {
              nameTokens.forEach(function (t) {
                if (email.indexOf(t) > -1 && !seen[email]) {
                  seen[email] = true;
                  emails.push(email);
                }
              });
            }
          }
        }
      });

      if (emails.length) prompt("We found their email(s):", emails.splice(0, 5).join(", "));
    });
  });
})();

The minified location string is as follows:

(function(){var name=document.querySelector(".vcard-fullname");var match=location.href.match(/github\.com\/([a-z0-9-_]+)/i);if(!name||!match||!match[1])return;fetch("https://api.github.com/users/"+match[1]+"/events").then(function(r){r.json().then(function(data){var emails=[];var seen={};var nameTokens=[];name.innerText.trim().toLowerCase().split(" ").forEach(function(n){nameTokens.push(n.slice(0,5))});data.forEach(function(e){if(e.type=="PushEvent"&&e.actor.login==match[1]){if(e.payload.commits&&e.payload.commits[0]){var email=e.payload.commits[0].author.email.trim().toLowerCase();if(!email.match(/noreply\.github\.com/i)&&!seen[email]){nameTokens.forEach(function(t){if(email.indexOf(t)>-1&&!seen[email]){seen[email]=true;emails.push(email)}})}}}});if(emails.length)prompt("We found their email(s):",emails.splice(0,5).join(", "))})})})();
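The filtering step in the bookmarklet above can be separated from the network call, which makes it easier to try against sample data. Here is a minimal sketch, assuming a hypothetical `extractEmails` function and made-up event objects that only carry the fields the bookmarklet reads (`type`, `actor.login`, and `payload.commits[0].author.email`). It follows the same matching rule, with small guards added for missing fields and empty name tokens.

```javascript
// Hypothetical refactoring: the email filtering logic as a pure function.
function extractEmails(events, login, fullName) {
  // Use the first five characters of each name part as a matching token.
  var tokens = fullName.trim().toLowerCase().split(/\s+/).map(function (n) {
    return n.slice(0, 5);
  });
  var seen = {};
  var emails = [];

  events.forEach(function (e) {
    if (e.type !== "PushEvent" || !e.actor || e.actor.login !== login) return;

    var commits = (e.payload && e.payload.commits) || [];
    if (!commits[0] || !commits[0].author || !commits[0].author.email) return;

    var email = commits[0].author.email.trim().toLowerCase();

    // Skip GitHub's placeholder addresses and anything already collected.
    if (/noreply\.github\.com/i.test(email) || seen[email]) return;

    // Keep the address only if it contains part of the profile name.
    var matchesName = tokens.some(function (t) {
      return t && email.indexOf(t) > -1;
    });

    if (matchesName) {
      seen[email] = true;
      emails.push(email);
    }
  });

  return emails.slice(0, 5);
}

// Sample events, made up for illustration.
var sampleEvents = [
  { type: "PushEvent", actor: { login: "octocat" },
    payload: { commits: [{ author: { email: "Mona.Lisa@example.com" } }] } },
  { type: "PushEvent", actor: { login: "octocat" },
    payload: { commits: [{ author: { email: "octocat@users.noreply.github.com" } }] } },
  { type: "WatchEvent", actor: { login: "octocat" }, payload: {} }
];

var foundEmails = extractEmails(sampleEvents, "octocat", "Mona Lisa");
console.log(foundEmails);
```

In the bookmarklet itself, `events` would be the parsed JSON from the `fetch` call, `login` the username matched from the URL, and `fullName` the text of the profile's name element.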

Although the usefulness of bookmarklets has gone down over the years, they continue to have a place on the web. I was pleasantly surprised by the viability of replacing some browser extensions with simple bookmarklets. At the very least, they may provide a fallback for extension functionality that is critical for a product or business.