In the interests of SEO, readability and copyright laws, it’s important for all the material on your website to be original and not plagiarised, but how original can content really be?
How often have you come up with a question that you thought was one of life’s great mysteries, then Googled it only to find that it’s been asked and answered many times before? Have you ever come up with what you thought was a great idea, joke or play on words only to find that the internet has beaten you to it?
When Russian billionaire Roman Abramovich bought Chelsea FC in 2003, I remember immediately joking to my dad that they would now become known as ‘Chelski’. I thought I’d made that up, but the media were straight onto it, and even t-shirts with the theme of the club’s Russian revolution were available for purchase right away. It shows that clever ideas are not always plagiarised, but are often arrived at independently by many individuals all at once.
Recently, I discovered that any page of text we can possibly come up with can already be found on the internet – on one particular website.
The Library of Babel, named after a 1941 short story written by Jorge Luis Borges, is an online tool that contains every possible 3,200-character passage of text that can be put together using the 26 letters of the alphabet, spaces, commas and full stops. This means that the overwhelming majority of it is indecipherable gibberish, but that hidden somewhere in this virtual library are countless pages that could change the world.
The idea can be compared to the infinite monkey theorem, although the Library of Babel is a finite work, albeit an unimaginably large one. The number of ‘books’ in the library, if written down, would run to 4,678 digits, and each book is 410 pages long. Somewhere in all this text, you could find the cure for cancer, or the winning National Lottery numbers for the next five weeks, along with the names and addresses of the people who will win. You could even stumble upon your own obituary, detailing when, where and how you will die, and even what your last words will be.
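For the curious, the 4,678-digit figure can be sanity-checked in a few lines of Python. This is only a rough sketch: it assumes the number of books is roughly the number of distinct 3,200-character pages (29 possible characters in each position) divided by 410 pages per book, which may not be exactly how the site itself counts. The function name is made up for illustration.

```python
import math

def digit_count_of_books(alphabet_size=29, chars_per_page=3200, pages_per_book=410):
    """Estimate how many digits the number of 'books' has.

    Assumption (not confirmed by the site): books = distinct pages / pages per book.
    Logarithms are used so the enormous number never has to be written out.
    """
    log10_pages = chars_per_page * math.log10(alphabet_size)
    log10_books = log10_pages - math.log10(pages_per_book)
    # A positive number N has floor(log10(N)) + 1 decimal digits.
    return math.floor(log10_books) + 1

print(digit_count_of_books())  # prints 4678
```

Under that assumption, the result agrees with the figure quoted above.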
So, does this mean plagiarism of huge articles could happen by pure coincidence?
Not really. If you go to the Library of Babel and select a random ‘book’, you will almost certainly be greeted with utter nonsense. You can “Anglishize” the text you get so that any real English words are highlighted in yellow, and you’ll probably find that the longest word picked out is only five or six characters.
Even the odds of finding a text beginning with a given word like ‘hello’ are ridiculously small. The chance of an ‘h’ cropping up as the first letter is one in 29 (26 letters plus the comma, full stop and space), so ‘he’ as the first two letters would be one in 29 x 29, which is 841. For the full five-letter word ‘hello’, the odds become one in more than 20 million, so you can see how quickly the probability of a coherent passage of text becomes ludicrously small.
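The arithmetic above is easy to reproduce. Here is a minimal Python sketch, assuming every character on a page is equally likely and independent of the others (a simplification for illustration). The helper name `one_in` is hypothetical.

```python
ALPHABET_SIZE = 29  # 26 letters plus the comma, full stop and space

def one_in(prefix: str) -> int:
    """Return N such that a random page starts with `prefix` one time in N.

    Assumes each character is chosen uniformly and independently,
    so every extra character multiplies N by 29.
    """
    return ALPHABET_SIZE ** len(prefix)

for prefix in ["h", "he", "hello"]:
    print(f"{prefix!r}: one in {one_in(prefix):,}")
# 'h': one in 29
# 'he': one in 841
# 'hello': one in 20,511,149
```

Try a full sentence as the prefix and the number quickly outgrows anything you could reasonably search through by chance.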
This shows that mass plagiarism by accident is all but impossible, so keep your content fresh and original, even though it’s all already sitting in the Library of Babel waiting to be discovered – just like this article!