
Link Terms On Page To Wikipedia Articles In Pure Javascript

While browsing I came across this blog post about using the Wikipedia API from JavaScript to link a single search term to its definition. At the end of the blog post the author m

Solution 1:

Perhaps something like this might help:

Assuming very simple HTML/Text like so:

<div id="theText">Testing the auto link system here...</div>

And two very small scripts.

dictionary.js sets up your list of terms. My thought was that this could be generated in PHP by querying the articles database if you wanted. It can also be loaded cross-domain (as it sets window.termsRE). If you don't need to generate the list from the database, you could also put it manually in termlinker.js.

The code that generates the RegExp assumes that your terms array contains strings that are already valid regular expression source, so be sure to use \\ to escape any of these characters: []\.?*+|(){}^$

// dictionary.js - define some terms
var terms = ['testing', 'auto link'];
window.termsRE = new RegExp("\\b(" + terms.join("|") + ")\\b", 'gi');
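If you would rather keep the terms as plain text and not escape them by hand, something like the following might work. This is only a minimal sketch: escapeTerm and buildTermsRE are hypothetical helper names, not part of the scripts above, and the extra 'node.js' term is made up to show why escaping matters.

```javascript
// Hypothetical helper: escape regex metacharacters in a plain-text term.
function escapeTerm(term) {
  return term.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Hypothetical helper: build the same kind of RegExp as dictionary.js,
// but from unescaped terms.
function buildTermsRE(terms) {
  return new RegExp('\\b(' + terms.map(escapeTerm).join('|') + ')\\b', 'gi');
}

var re = buildTermsRE(['testing', 'auto link', 'node.js']);
var matches = 'Testing the auto link system with Node.js and nodexjs'.match(re);
// matches is ['Testing', 'auto link', 'Node.js'];
// 'nodexjs' is not matched because the dot was escaped.
```

The 'gi' flags make the match global and case-insensitive, and the \b anchors keep the terms from matching inside longer words.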

termlinker.js is just a simple regexp search and replace on the defined terms. It could be an inline <script> too. It requires that dictionary.js has been loaded before you run it.

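// termlinker.js - add some tags
var element = document.getElementById("theText");

// Wrap every matched term in a link to the Wikipedia article of the same name.
// encodeURIComponent also handles non-ASCII terms correctly.
element.innerHTML = element.innerHTML.replace(termsRE, function(term) {
  return "<a href='http://en.wikipedia.org/wiki/" + encodeURIComponent(term) + "'>" + term + "</a>";
});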

This simply searches for any of the words in the terms array and replaces each one with a link to the term. Of course, it will also match attribute names and values inside HTML tags, which could break your markup a little.
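One way to avoid touching the tags themselves might be to split the HTML string into tags and text, and only run the replacement on the text parts. The following is a rough sketch under that assumption; linkTermsOutsideTags is a hypothetical name, and the title attribute in the sample is added only to show that it is left alone.

```javascript
// Hypothetical helper: apply the term replacement only to text between tags.
function linkTermsOutsideTags(html, termsRE) {
  // Splitting on a capturing group keeps the tags in the result:
  // even indexes hold text, odd indexes hold the tags themselves.
  return html.split(/(<[^>]*>)/).map(function (part, index) {
    if (index % 2 === 1) return part; // a tag: leave it untouched
    return part.replace(termsRE, function (term) {
      return "<a href='http://en.wikipedia.org/wiki/" + encodeURIComponent(term) + "'>" + term + "</a>";
    });
  }).join('');
}

var sample = '<div id="theText" title="testing">Testing the auto link system here...</div>';
var linked = linkTermsOutsideTags(sample, /\b(testing|auto link)\b/gi);
// The title attribute keeps its plain "testing" value; only the visible text is linked.
```

This is still string-based, so it has limits: an attribute value containing ">" would confuse the split, and text that already sits inside an <a> or <script> element would still be processed. Walking the DOM text nodes would be more robust if you need that.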

All thrown together you get this (jsbin preview)


Using the API

Building on the "minimum case" from before, here is a code sample that uses the API to receive the list of terms directly (jsbin preview):

// Utility function: escape regular expression metacharacters in text
RegExp.escape = function(text) {
  if (!RegExp.escape.sRE) {
    var specials = [
      '/', '.', '*', '+', '?', '|', '^', '$',
      '(', ')', '[', ']', '{', '}', '\\'
    ];
    // Build the matcher once and cache it on the function itself
    RegExp.escape.sRE = new RegExp(
      '(\\' + specials.join('|\\') + ')', 'g'
    );
  }
  return text.replace(RegExp.escape.sRE, '\\$1');
};

// JSONP callback for receiving the API response
function receiveAPI(data) {
  var terms = [];
  if (!data || !data['query'] || !data['query']['allpages']) return false;
  var pages = data.query.allpages;
  for (var x in pages) {
    terms.push(RegExp.escape(pages[x].title));
  }
  window.termsRE = new RegExp("\\b(" + terms.reverse().join("|") + ")\\b", 'gi');
  linkterms();
}

function linkterms() {
  var element = document.getElementById("theText");

  element.innerHTML = element.innerHTML.replace(termsRE, function(term) {
    return "<a href='http://en.wikipedia.org/wiki/" + encodeURIComponent(term) + "'>" + term + "</a>";
  });
}
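A note on the terms.reverse() call in receiveAPI: my guess is that it is there so that longer titles sharing a prefix come before shorter ones, because a regex alternation takes the first alternative that matches. If you want that ordering regardless of how the API sorts the titles, sorting by length might be a safer option. A small sketch, where longestFirst and the sample titles are made up for illustration:

```javascript
// Hypothetical helper: return a copy of the terms with the longest ones first.
function longestFirst(terms) {
  return terms.slice().sort(function (a, b) {
    return b.length - a.length;
  });
}

var titles = ['Test', 'Test case', 'Test case design'];
var text = 'A test case design review';

var shortFirstRE = new RegExp('\\b(' + titles.join('|') + ')\\b', 'gi');
var longFirstRE = new RegExp('\\b(' + longestFirst(titles).join('|') + ')\\b', 'gi');

var shortFirstMatches = text.match(shortFirstRE); // ['test']
var longFirstMatches = text.match(longFirstRE);   // ['test case design']
```

The ordering mostly matters for multi-word titles. For single words the \b at the end already forces the regex to backtrack to a longer alternative.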


// The apfrom=testing can be removed; it is only there so that
// we can get some useful terms near "testing" to work with.
// We are limited to 500 terms for the purpose of this demo:
var url = 'http://en.wikipedia.org/w/api.php?action=query&list=allpages&aplimit=500&format=json&callback=receiveAPI' + '&apfrom=testing';
var elem = document.createElement('script');
elem.setAttribute('src', url);
elem.setAttribute('type','text/javascript');
document.getElementsByTagName('head')[0].appendChild(elem);
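If you want to change the starting title or the limit without editing the URL string by hand, the query string could be assembled from an object. This is just a sketch: buildAllPagesUrl and its options object are hypothetical, and it only uses the parameters already present in the URL above.

```javascript
// Hypothetical helper: build the allpages request URL from a few options.
function buildAllPagesUrl(options) {
  var params = {
    action: 'query',
    list: 'allpages',
    aplimit: options.limit || 500,
    format: 'json',
    callback: options.callback   // name of the global JSONP callback
  };
  if (options.from) params.apfrom = options.from; // optional starting title

  var pairs = [];
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      pairs.push(encodeURIComponent(key) + '=' + encodeURIComponent(params[key]));
    }
  }
  return 'http://en.wikipedia.org/w/api.php?' + pairs.join('&');
}

var demoUrl = buildAllPagesUrl({ callback: 'receiveAPI', from: 'test case' });
// demoUrl is:
// http://en.wikipedia.org/w/api.php?action=query&list=allpages&aplimit=500&format=json&callback=receiveAPI&apfrom=test%20case
```

Encoding each value means a starting title with spaces or ampersands will not break the query string.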
