By Dan Gravell, Founder, elsten software
Content marketing is a cheap way to attract visitors to your site, but how can you achieve a long tail content marketing strategy as a single founder? As the single founder of a Micro-ISV, I wanted to develop a scalable way to attract both visitors and inbound links. I came up with a way of combining Wikipedia, the software I sell and oDesk to generate hundreds of pages of useful content, which has increased the number of visitors to my website.
You see a lot of discussion about content marketing nowadays. Content marketing is an approach to marketing where you drive interest in your product or service via the provision of information to prospects. This information could be anything: simple Web pages, promotional videos, or the now ubiquitous ‘infographic’. The great news for solo founders and small companies is that the Web’s democratisation of information publication has meant it is now easy and cheap to publish such content.
That doesn’t mean any old startup can fling up some Web pages about “credit cards” or “insurance” and expect the world to beat a path to their door. The same old economics of competition persist: entrenched competitors with enormous marketing budgets will beat you in marketing content about highly competitive, high worth products and services. Instead, you should be targeting less competitive, ‘long tail’ content that people still search for.
The nature of capturing long tail interest is that you aim to cover many different bases and expect to only receive a small amount of attention for each. Covering many different bases means creating lots of content. So here’s the paradox: how can a solo founder or startup hope to create the content needed for a long tail content marketing strategy? The answer? Outsourcing and automation!
If Patrick McKenzie hasn’t written the book on “scalable content generation”, he’s certainly written a number of useful blog posts on the subject. Patrick applied scalable content generation to his product, Bingo Card Creator, by generating bingo cards for different themes, for example Mother’s Day bingo cards, Classical composer bingo cards and so on. He automated the generation of bingo cards by hiring freelancers to write simple word lists of intended entries for a bingo card, then generating the printable bingo card and uploading it to the Web for others to find.
The results? In Patrick’s own words:

This really works. Some of the activities, like Summer bingo cards or Baby Shower Bingo cards, have resulted in thousands of dollars in sales in the last year. $3.50 in investment, thousands in returns.

Aside from the positives of more website traffic and (hopefully) more sales, you also have the chance to make the Internet a better place. People type the most obscure or detailed questions into Google. By combining data in different and innovative ways and paying freelancers to produce fresh, new content, you’re more likely to be answering those questions. The more people with more questions answered, the happier those people are.
So I decided to start my own scalable content generation project.

bliss-ful content generation

My company, elsten software, writes software called bliss for organising large music collections. Now, that concept is fairly vague and hard to sell, so initially I focused on auto-populating album artwork in large music collections. That’s much more specific, speaks to a pain point, and as it happens was moderately successful (enough to quit my day job after a year or so).
The reason I chose to concentrate on album artwork first of all is because quite a few people search Google for terms like [cover art finder] and [album art downloader]. They are specifically searching for tools that will find, download and install album art, so I thought it rude not to deliver a software solution to them. Then, I noticed people are also searching for album artwork for specific albums or for album artwork for specific artists. For example, [AC/DC album covers]. This seemed a promising first target for my content generation project.
My plan developed fairly quickly: use bliss to find artwork for a set of albums, generating a page for each artist with album and artwork. There were a number of reasons I aimed for a page per artist rather than a page per album:

  • I predicted [“artist name” album covers] queries would be less competitive
  • I would be able to generate larger pages with more content, pleasing Google
  • Some people would genuinely want to see a list of “Bo Diddley album covers”, so I would have made the Web a better place

What were my goals from the project? Twofold: first, to gather a little extra interest in my product, and maybe convert some visitors to downloaders; second, to encourage visitors (maybe fans of the artists depicted) to link to my pages, to further improve the number of inbound links to my domain and therefore improve my search engine rankings.

How I scaled my content generation

If you’re a developer-by-training like me and you run a software company, any old excuse to adopt a technical solution to a marketing problem is to be greeted enthusiastically. And so I came up with my approach to scalable content generation.
First, I needed a way of generating the artist and album names. As I’ve already established, the aim is to generate a lot of content, so this means lots of artists and lots of albums. One option is to go to Wikipedia and copy a list of artists and perform some grep-tastic string mangling, but what about all the albums for those artists to ensure each artist page is long enough? That would mean clicking through each artist, mangling their discography… the thought of all that clicking was not attractive.
bliss finds artwork from a number of sources, and one of these is Wikipedia. It does this via DBpedia, a structured version of the Wikipedia corpus. DBpedia builds its database by reading Wikipedia articles and extracting their structured elements, particularly ‘Infoboxes’ (those boxes of information you often see at the top right of Wikipedia articles). You query DBpedia to retrieve the data you want using a language called SPARQL. Using SPARQL you can connect and query multiple different items in the DBpedia dataset, for instance from Thriller you can look up Michael Jackson, and from there his birth town, and then its population… you get the picture.
So, to generate my artist and album list, I wrote a SPARQL query to retrieve all artists in the Rock and Roll Hall of Fame, and from there each of those artists’ albums.

PREFIX rdf:
PREFIX dbpedia2:
PREFIX owl:
SELECT DISTINCT str(?name), (sql:SAMPLE(?artistNames) AS ?ArtistName)
WHERE {
  ?subject dbpedia2:name ?name .
  ?subject rdf:type owl:MusicalWork .
  ?subject  ?artist .
  ?artist  ?artistNames .
  ?artist   .
}
ORDER BY ?ArtistName

This query takes each ‘MusicalWork’ (album, single etc) in the Wikipedia dataset, links its associated artist (this is like a SQL join) and only includes results where the artist is an inductee of the Rock and Roll Hall of Fame. I won’t bother with a tutorial for SPARQL, mainly because I would be incapable of providing one (SPARQL gurus will no doubt recognise inefficiencies in the above; I very much learned-what-I-had-to for this project).
I won’t hot-link to the results to avoid excessive strain on DBpedia’s servers, but if you want to try the query out, copy and paste it into the DBpedia query interface. Suffice to say you’ll get a list of album and artist names.
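For readers who would rather script the lookup than paste it into a browser, here is a minimal Python sketch of the same idea. It is not the query I ran: the prefixes, the predicates (`dbo:artist`, `foaf:name`, `dct:subject`), the category name, the endpoint URL and the `format` parameter are all assumptions, and the helper names are made up for illustration. The sample rows are hand-written in the standard SPARQL JSON results layout so that the grouping step can be tried without touching DBpedia’s servers.

```python
from collections import defaultdict
from urllib.parse import urlencode

# Assumed endpoint: DBpedia's public SPARQL service.
ENDPOINT = "https://dbpedia.org/sparql"

# Assumption: these prefixes and predicates are a guess at an equivalent
# query; they are not copied from the query shown in the article.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbc: <http://dbpedia.org/resource/Category:>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT DISTINCT ?album ?artistName WHERE {
  ?work a dbo:MusicalWork ;
        foaf:name ?album ;
        dbo:artist ?artist .
  ?artist foaf:name ?artistName ;
          dct:subject dbc:Rock_and_Roll_Hall_of_Fame_inductees .
}
ORDER BY ?artistName
"""


def build_request_url(query: str) -> str:
    """Return a GET URL asking the endpoint for JSON results (assumed parameters)."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return ENDPOINT + "?" + urlencode(params)


def group_albums_by_artist(results: dict) -> dict:
    """Turn SPARQL JSON results into {artist name: sorted album names}."""
    albums = defaultdict(set)
    for row in results["results"]["bindings"]:
        albums[row["artistName"]["value"]].add(row["album"]["value"])
    return {artist: sorted(names) for artist, names in albums.items()}


# Hand-written sample rows, not real query output.
SAMPLE = {
    "head": {"vars": ["album", "artistName"]},
    "results": {"bindings": [
        {"album": {"type": "literal", "value": "Hot Rats"},
         "artistName": {"type": "literal", "value": "Frank Zappa"}},
        {"album": {"type": "literal", "value": "Freak Out!"},
         "artistName": {"type": "literal", "value": "Frank Zappa"}},
        {"album": {"type": "literal", "value": "School's Out"},
         "artistName": {"type": "literal", "value": "Alice Cooper"}},
    ]},
}

print(group_albums_by_artist(SAMPLE))
```

Grouping by artist at this stage matters because the whole plan depends on one page per artist rather than one page per album.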
Step two was to use those album/artist names to look up the artwork using bliss. This bit was easy; I wrote bliss and have access to its source, and the API for finding cover art is fairly mature. I hacked up a bit of Scala (my current scripting language of choice and easily interoperable with my existing code) to find the artwork for each album. These results were then stored as JSON in a results file. In all there were around 2500 separate queries made.
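The real lookup was done in Scala against bliss’s own API, which I won’t reproduce here. The shape of the step, though, is roughly the following Python sketch. The `find_artwork` callable, the `fake_finder` stand-in and the record layout are all hypothetical; they only illustrate looping over the artist/album list and serialising the hits as JSON.

```python
import json
from typing import Callable, Optional


def collect_artwork(albums_by_artist: dict,
                    find_artwork: Callable[[str, str], Optional[str]]) -> list:
    """Look up one cover per album and return JSON-ready records."""
    records = []
    for artist, albums in sorted(albums_by_artist.items()):
        covers = []
        for album in albums:
            url = find_artwork(artist, album)  # hypothetical lookup function
            if url is not None:                # skip albums with no art found
                covers.append({"album": album, "artwork": url})
        records.append({"artist": artist, "covers": covers})
    return records


def fake_finder(artist: str, album: str) -> Optional[str]:
    """Stand-in for the real lookup; returns a placeholder path."""
    if album == "Unknown":
        return None
    return f"covers/{artist}/{album}.jpg"


records = collect_artwork({"Frank Zappa": ["Hot Rats", "Unknown"]}, fake_finder)
results_json = json.dumps(records, indent=2)  # what would go in the results file
print(results_json)
```

Keeping the results in a plain JSON file means the slow part (thousands of lookups) only has to run once, while the page generation can be re-run as often as the template changes.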
The final piece of work was to generate the Web pages for my results. The website (for that was the destination for this content) is actually a set of static HTML files, hosted on Amazon S3, but generated by Jekyll. The advantages of this are: cheaper hosting and less server software to go wrong (and I guess there are some security advantages too). I wrote a Jekyll plugin to generate the pages by parsing the JSON file and then writing the results by looking up a Liquid template. This template defined a <title> and <h1> tag for “[artist name] album covers” and contained the layout for the art on each page. I also added buttons for Twitter and Facebook to encourage sharing of the content and generate some more inbound links.
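The plugin itself is Ruby and Liquid, but the idea is simple enough to sketch in a few lines of Python. This is an illustration rather than the plugin: the template, the `slugify` rule for file names and the JSON record layout (`artist`, `covers`, `album`, `artwork`) are assumptions made for the example.

```python
import html
import json
import re
from string import Template

# Assumed page layout: the same heading text goes in <title> and <h1>.
PAGE = Template("""<html>
<head><title>$heading</title></head>
<body>
<h1>$heading</h1>
$covers
</body>
</html>
""")


def slugify(name: str) -> str:
    """Lower-case a name and replace runs of non-alphanumerics with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def render_page(record: dict) -> str:
    """Render one artist record as an HTML page."""
    heading = html.escape(f"{record['artist']} album covers")
    covers = "\n".join(
        '<img src="{}" alt="{}">'.format(
            html.escape(c["artwork"]), html.escape(c["album"]))
        for c in record["covers"]
    )
    return PAGE.substitute(heading=heading, covers=covers)


def render_site(results_json: str) -> dict:
    """Map output file names to rendered HTML, one page per artist."""
    return {slugify(r["artist"]) + ".html": render_page(r)
            for r in json.loads(results_json)}


sample = json.dumps([{"artist": "AC/DC", "covers": [
    {"album": "Back in Black", "artwork": "covers/back-in-black.jpg"}]}])
pages = render_site(sample)
print(sorted(pages))
```

Because the output is just a dictionary of file names and HTML strings, writing the files to disk and syncing them to static hosting is a separate, trivial step.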
Finally, this work was complete! The pages took just a few seconds to generate, were uploaded and were now live.

Advice from the master

Emboldened by a promise on Patrick McKenzie’s website I shot him an email with a link to the newly generated pages. Patrick replied a couple of days later with some practical on-page SEO advice. The main piece of advice was to add more textual content; around 300 words about the artist on each page.
I experimented by setting up an oDesk job to write content for ten artists (I simply picked the first ten artists in the list). I quickly found a contractor who wrote the content, and I incorporated it into the page generation. Here’s Frank Zappa’s album covers sans description, and here’s Alice Cooper’s artwork with a bio. I’m moderately pleased with the content; although it is a little clumsy, it was reasonably cheap at $3 per article. Anyone wanting to write further bios can get in touch with me; I’ll credit you on the site.

The results

The pages were only picked up by Google on the 11th March, but traffic has been building since then. The current average is around seventy visits per day landing on one of these generated pages.
In terms of my goals, the first was to draw more visitors to the website, convert them to downloaders of my product and (hopefully eventually) customers. So far, conversion rates (visit to download) have been disappointing. Currently only around 1.1% of visitors landing on these pages download the product, whereas the site-wide average is around 25%. That said, I’ve made no real attempt to optimise this yet. There are no calls-to-action and few obvious links to guide the visitor to other parts of the site.
Patrick’s advice of adding extra artist biographies appears to be helpful; of the top twenty pages by visits, 30% have this extra textual content, despite such pages making up only 6% of all the pages. By tracking the number of visits I receive in Google Analytics and tracking these pages’ search rankings, I can build a list of new pages to add biographies to and get the best return on investment.
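To put a number on that: pages with biographies are 30% of the top twenty but 6% of all pages, so they turn up about five times as often as their share would suggest. Choosing which pages get the next batch of biographies could be sketched as below. The visit counts and page names are made up for illustration, the function names are hypothetical, and the only figure taken from this project is the $3 cost per article.

```python
def bio_lift(percent_of_top_pages: int, percent_of_all_pages: int) -> float:
    """How over-represented pages with bios are among the top pages."""
    return percent_of_top_pages / percent_of_all_pages


def next_bios(visits: dict, has_bio: set, budget: float,
              cost_per_bio: float = 3.0) -> list:
    """Pick the most-visited pages without a bio that the budget covers."""
    affordable = int(budget // cost_per_bio)
    candidates = [page for page in visits if page not in has_bio]
    candidates.sort(key=lambda page: visits[page], reverse=True)
    return candidates[:affordable]


# Made-up visit counts, standing in for an analytics export.
visits = {"frank-zappa": 40, "alice-cooper": 55, "ac-dc": 90, "bo-diddley": 12}
has_bio = {"alice-cooper"}

print(bio_lift(30, 6))
print(next_bios(visits, has_bio, budget=6))
```

Ranking by visits alone is the simplest possible rule; search ranking could be folded into the sort key in the same way.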
There have been no inbound links posted to the content yet, nor social media shares, but with an increasing number of visitors I am hopeful.
It appears that, as ever, generating the content is the first step of this content marketing strategy: next will be promotion and optimisation!

Dan Gravell is the founder of elsten software, a London based company which aims to make your large music collection manageable. bliss is an MP3 organizer whose aim is to audit the consistency, completeness and correctness of your computer music library.


  1. Great article, thank you and it gave me some useful pointers for a project I’m working on. One question: how do you ensure that the descriptions written through odesk aren’t stolen/slightly edited from other websites? (Duplicate content might harm your search engine ranking). I hadn’t come across odesk before so any tips you have for specifying tiny writing jobs like that would be greatly appreciated.

  2. Hi Sean. Short answer is: I didn’t check all of the articles. Once I was happy with the first five or so, I trusted the rest. There are duplicate content checkers out there but I figured I’d wait to see if Google Webmaster Tools flagged up any duplicate content warnings.
    Using oDesk: all of the contractors I’ve hired on oDesk have been very professional, although in my experience not all of them have quite the level of English skills they claim. I always seem to attract a lot of applicants for the jobs I post, but they are easily filtered; many of them simply file copy and paste applications. I always look for some indication that the contractor has actually read the job spec. Always ask for portfolio examples and start with small jobs, dangling the carrot of more-of-the-same. Once you have hired and trust a contractor, you can create jobs and invite them direct.

  4. Hi Dan. Sorry – I didn’t realise I hadn’t got back to you sooner. Thanks for your reply. The tips on oDesk are particularly helpful for my project.

  5. Dan Gravell

    Rob Walling covers virtual assistants in greater detail in his book, Start Small Stay Small. I recommend you get a copy.
