Bible versions may be OK, but not at Bible Gateway

I mentioned in a comment on my last post here that I had searched Bible Gateway for the word “OK” in several modern Bible translations, and found no results. For example, my search of The Message gives the following response:

No results found.

No results were found for ok in the version(s):The Message.
Try refining your search using the form above.

You can find more about refining searches and using the search form effectively, visit the frequently-asked questions page.

However, it turns out that the information given here is incorrect and misleading, and, as I will show, that this error is quite “OK” with the staff at Bible Gateway. The word “OK” is probably not in The Message (although “okay” is), or in any other modern English Bible translation, but if it was I would not be able to find that out at Bible Gateway.

The problem is not simply with the word “OK”, because if I change the search term to “and” I get the same “No results found” message, although quite clearly the word “and” is commonly used in The Message, and every other English version. I “visit[ed] the frequently-asked questions page” (there was no direct link so I had to follow a long rabbit trail to get there) and could find nothing to explain why these searches were not working, and indeed nothing at all “about refining searches and using the search form effectively”.

On 18th February I contacted Bible Gateway about this issue, and wrote:

I searched “The Message” for “ok”, with “match complete words” not checked. No results. But when I checked for “okay” I found three results. The first search should match all words beginning “ok” but doesn’t. See http://betterbibles.com/2011/02/18/modern-bible-translations-not-ok/#comment-20553.

On 25th February I received a response with the heading

Your request (#17945) has been marked as solved by our support staff.

and a message from Sandy Hall, presumably one of that staff:

Dear Peter,

Greetings from BibleGateway.com.

Thank you for contacting us. The following list of words is a sample listing of common words not recognized in a general search: a, an, the, is, of, and, by, be, for, to, this, I, O (as “O Lord”). The current search engine we use indexes all words in the Bible that are three characters or longer. It’s a technical restriction. To do more than that would index a lot more words making searches take longer and probably a lot more resources on our servers.

If we can be of any further assistance to you, please feel free to contact us again. Thank you for your interest in Bible Gateway.com. We appreciate the opportunity to serve you online.

Sincerely,

BibleGateway.com Customer Care
www.biblegateway.com

Well, it was nice of them to say so, after a week. But if so, why didn’t they say so in the original results? It would have been very easy for them to explain in the “No results found” message, or in some page with a clear direct link from that message, that the search had failed because it was for a word that had been deliberately excluded. I note also that there is not a hint of apology for giving incomplete results or misleading their users like myself.

I have several issues with what they wrote in the message above:

  • This is highly confusing and inconsistent. Have they excluded from their searches all one and two letter words, or only ones they consider common? They claim to index “all words in the Bible that are three characters or longer”, but they have also excluded “and” and “this” from their search. So does that mean that some words are indexed but not recognised in searches, or should I take “all” to exclude “common words”? Why did they not give me a complete list of these “common words”?
  • “OK” is NOT a common word in any English Bible version, although it might be in a broader corpus of the language, and so there is no reason for including it in their undisclosed list of common words.
  • I note that it is not true that no two letter combinations can be searched for. I can search The Message for “qu”, indeed also for “q”, and I get a valid result for all 758 words beginning like this. Why can’t I do the same for “ok”, when words beginning with these letters are probably much less common?
  • Technically, their claim that indexing these common words is a large burden is unsupportable. Probably only around one third of the words in any English Bible version would be on any such list. That means that on any reasonable indexing strategy their indexes would grow by a maximum of 50%. They would have to index perhaps 200,000 to 300,000 more words in each of the less than 100 versions they support. Allowing four bytes per index entry that is a total of about 100 MB extra storage they would need – a trivial amount for any current server.
  • If these searches are not of interest to their users, then few people will make them and the impact on their resources would be trivial. It would actually be more of a burden to maintain a list of excluded words (which would have to be different from language to language) than to include every word.
  • If on the other hand a significant number of people, like myself, are coming to the site for searches like this, then it would surely be in the interests of Bible Gateway, and their advertisers, to support these searches – to provide for their customers what they actually want, rather than what someone at the company thinks they ought to want.
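The storage estimate in the list above can be checked with a few lines of Python. This is only a back-of-envelope sketch using the figures quoted there (one third of words excluded, 200,000 to 300,000 extra words per version, fewer than 100 versions, four bytes per index entry); the function names are made up for illustration, and a real index entry may well be larger than four bytes.

```python
# Back-of-envelope check of the storage estimate, using the post's own figures.
EXTRA_WORDS_PER_VERSION = (200_000, 300_000)  # low and high estimates from the post
VERSIONS = 100        # "less than 100 versions", so this is an upper bound
BYTES_PER_ENTRY = 4   # the post's assumption for one index entry


def extra_storage_mb(words_per_version, versions=VERSIONS,
                     bytes_per_entry=BYTES_PER_ENTRY):
    """Extra index storage in megabytes (1 MB taken as 1,000,000 bytes)."""
    return words_per_version * versions * bytes_per_entry / 1_000_000


def index_growth(excluded_fraction):
    """Relative growth of an index when previously excluded words are added."""
    return excluded_fraction / (1 - excluded_fraction)


low, high = (extra_storage_mb(n) for n in EXTRA_WORDS_PER_VERSION)
print(f"extra storage: {low:.0f} to {high:.0f} MB")
print(f"index growth: {index_growth(1 / 3):.0%}")
```

On these assumptions the extra storage comes out at 80 to 120 MB, which is where the "about 100 MB" figure comes from, and adding back an excluded third of the words grows the index by half.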

I note that in the 19th century Strong was able to index every word in the KJV, however common, using just pen and paper. Is it really too much for Bible Gateway to do the same using modern computer technology? After all, they have behind them the vast resources of the controversial Murdoch media empire: although their About page does not mention it, Bible Gateway is owned by Zondervan, which is owned by HarperCollins, which is owned by News International.

No, I’m sorry, Bible Gateway, but this issue has not been “solved”. In fact you have done nothing at all to solve it. Your response has caused me to lose the trust I used to have in your product and your website. I will be looking elsewhere in future. And I can’t help recommending readers of this post to do the same.

39 thoughts on “Bible versions may be OK, but not at Bible Gateway”

  1. Peter Kirk says:

    As a footnote here: we can be grateful for one thing, that at least so far the Murdoch family has not stepped in to kill this golden egg laying goose, Bible Gateway, by hiding it behind a subscription wall, in the same way that they have killed off the website of The Times. See this article, admittedly from a competitor.

  2. Dannii says:

    How strange. When you search for a phrase it does indeed mention that the common words are not indexed, but if you search for a common word alone it says nothing!

  3. Peter Kirk says:

    Dannii, I see I am now getting a message “‘ok’ is a very common word, and was not included in your search.” Perhaps it was there all the time. But it is in the banner part of the page, next to the ads (which I automatically skip over) and above the main heading, separate from the main “No results found” message. Anyway, this message is an untruth because “ok” is NOT a very common word, at least in Bible translations.

  4. John Hobbins says:

    Hi Peter,

    Upon reaching the end of your post, I could not help asking the question: is the real problem Rupert Murdoch? Is this guilt by association?

    You are free to boycott Zondervan and HarperCollins since they are owned by the Empire and Darth Vader. But it might be pointed out that Darth Vader has bigger fish to fry and therefore does not exercise any editorial control over Z or HC.

    My theory: in order to get his way, Murdoch puts drugs in people’s Nescafe, which he also owns.

  5. Peter Kirk says:

    John, my main point here is not about Murdoch. I don’t appreciate his empire, but unlike our cabinet minister Vince Cable I have not “declared war on Mr Murdoch”.

    So I am happy to recommend Murdoch products when they are of good quality. You will find that I have written here, and at my own blog, plenty of positive things about NIV and TNIV, published by the Murdoch company Zondervan. I have also been positive about Murdoch controlled blogs like Koinonia and Ruth Gledhill’s now inaccessible blog.

    On the other hand, when a product is of poor quality, like Bible Gateway on this matter, I am quick to point out its deficiencies – no matter who owns it.

    I did not intend to suggest that the Murdoch family exercises any editorial control over Zondervan or Bible Gateway – although I would be surprised to see them publish a book by Vince Cable at the moment! My point was simply that Bible Gateway, unlike many Christian initiatives, cannot claim to have cash flow problems or lack access to investment funds.

    Also, just in case anyone takes you more seriously than you probably intended, Nescafé is not owned by Murdoch or his companies but by the Swiss public company Nestlé.

  6. Peter Kirk says:

    David, thank you for recommending Bible Study Tools. Its searches seem better featured than those of Bible Gateway, and properly documented. Not so many versions are available, but there are some, including RSV and NRSV, which are not at Bible Gateway. Also they seem to have only the 1984 NIV, at least at the moment.

  7. Chaka says:

    Really, Peter? You’re going to boycott this website because of a trivial error in their search engine? I have done a tiny bit of Bible-text-related programming, and I can tell you that the intuitive behavior one wants from a search engine is not easy to develop. If you’re convinced that it’s easy, I encourage you to give it a try. 🙂

  8. Peter Kirk says:

    Chaka, I did not say I was going to boycott this site, just that I am not recommending it and am looking elsewhere. But I do not consider it a trivial matter that the site gives me incorrect results without warning me properly of this – and that the site owners fail to acknowledge this as a bug or even as an undesirable feature.

  9. Josiah says:

    I can’t say for certain, but I’ve a hunch that the majority of searches for words like OK and AND are intrinsically junk, simply random catch phrases that people stick into a search engine to see what it spits out.
    I don’t know what your rationale for putting in that search term was, but BibleGateway wants to improve their service to the majority of their legitimate clients. These want to find out where Melchizadech is mentioned in the Bible or something of that nature, not how many ANDs there are.

  10. Peter Kirk says:

    Josiah, of course there may be some random or junk searches. But why is it more valid to search for “Melchizedek” (who you won’t find with the spelling you used) than to search for “ok” or “and”? Presumably people who search for the latter, unless it is just random, have some purpose in doing so. I had a good purpose in searching for “ok” which you can find by following the link I gave. I resent any suggestion that some searches are not proper or genuine. Am I somehow not a “legitimate client”?

    If by the way anyone reading this is opposed to the expansion of the Murdoch empire, follow this link.

  11. Jordan Doty says:

    Thanks for the link to http://www.biblestudytools.com. I liked the easy access to the “parallel bible” tool; however, I put in the HCSB and NLT translations just to see which version they used, and they are using the HCSB 2003 and NLT 1996. There are such significant differences there that I would prefer the more updated versions.

    I am thankful that Biblegateway.com at least has updated the NIV 2011 (2010), while still for the time being keeping the NIV 1984 and TNIV 2005. They just recently updated the NLT from 2004 to 2007, and I am hoping that they will do the same with the HCSB soon (2003 to 2009). Not every site has the ESV (2001 to 2007) changes, but they are not as drastic in such cases. Youversion.com seems to have the latest updates, and Biblos.com seems to have them, minus the NIV update and no HCSB translation. I like the quick search features best in Biblegateway, but I love the quick parallel readings and hebrew/greek/interlinear options in Biblos.com. Of course netbible.org has the cool Net notes and parallel readings; however, for a long while they still had the NLT 1996. I can’t remember if they’ve recently updated it or not.

    Any other ideas out there?

  12. Josiah says:

    “I resent any suggestion that some searches are not proper or genuine.”

    I did not mean to suggest that your search was not proper. I personally distrust any argument that expounds upon the selection of a single (especially an English) word over another but I realize that in some situations such analysis is more valid and important.

    I do think that purely random searches for OK probably outweigh meaningful (where the searcher intends to do something with the result) searches. How often it appears in the Bible is probably less relevant than how strongly it is implanted in people’s vocabulary, and that is probably quite high.
    I also think that in terms of recalling a particular reference in the scripture OK is a thoroughly useless word, because it is not specific to any event nor typically a theme for examining in its own right.

    And I saw the misspell as soon as I submitted, but annoyingly not before.

  13. Peter Kirk says:

    Josiah, I too distrust the kind of word study that expounds the selection of a single English word. But it is not usually words like “ok” or “and” which are used for such studies. Again I would say that it is not for the site owners to decide which kinds of search are permissible or valuable, but they should allow whatever searches their users want, unless there are genuine technical reasons not to.

  14. Ryan says:

    Being a database developer, I understand completely why they chose to exclude words under 3 letters. These servers have to re-index all the texts constantly, and having to search for 2 letters or 1 letter and index that would be hugely inefficient. Perhaps they could have done a better job telling their users that they only search for 3 letter words and greater, but I completely agree with restricting that to 3. Every time they reindex their search, the code goes through every text and builds an entry in a database for that search result to include where in the text it was retrieved from. This has the potential to create huge databases. When you submit a search, you are not asking a script to scan through all the texts they have on their site. What you are doing is entering a query to retrieve that information from a database that was already built by another piece of code and run in the background. It definitely is not an easy task to create a search engine that yields results in a fast manner, and if they were to include two letter or 1 letter combinations, then the whole site would come to a screeching halt as the code worked feverishly to re-index the site every time they changed anything on their site.

    Just something to consider.
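A minimal sketch of the kind of background indexing Ryan describes, assuming a simple in-memory inverted index. Nothing here is known about Bible Gateway's actual code: the names `build_index` and `search`, the three-verse KJV sample, and the stop list (taken from the sample in the support reply) are all illustrative. The point is only to show the two separate steps, building the index ahead of time and then answering a query from it, and how a stop list or minimum length silently removes words.

```python
import re
from collections import defaultdict

# Sample stop list from the support reply quoted in the post (lowercased).
STOP_WORDS = frozenset(
    {"a", "an", "the", "is", "of", "and", "by", "be", "for", "to", "this", "i", "o"}
)
MIN_LENGTH = 3  # "three characters or longer", per the same reply

# Tiny illustrative corpus (KJV); a real index would cover whole versions.
VERSES = {
    "Genesis 1:1": "In the beginning God created the heaven and the earth.",
    "Genesis 1:3": "And God said, Let there be light: and there was light.",
    "John 11:35": "Jesus wept.",
}


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(verses, stop_words=frozenset(), min_length=1):
    """Build a word -> list of references index, done once in the background."""
    index = defaultdict(list)
    for ref, text in verses.items():
        for word in tokenize(text):
            if len(word) < min_length or word in stop_words:
                continue  # excluded words never reach the index
            if ref not in index[word]:
                index[word].append(ref)
    return dict(index)


def search(index, word):
    """Answer a query from the prebuilt index, without rescanning the text."""
    return index.get(word.lower(), [])


restricted = build_index(VERSES, STOP_WORDS, MIN_LENGTH)
full = build_index(VERSES)

print(search(restricted, "and"))  # excluded word: no results, and no explanation
print(search(full, "and"))        # same query against an index of every word
```

In this sketch the query side cannot tell "the word is absent from the text" apart from "the word was never indexed", which is exactly the confusion the post complains about.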

  15. josiah says:

    Ryan, you say that their system would have to reindex every time they changed anything on their site. Yet the search is limited to a single version of the Bible, so surely it would be more correct to say it reindexes every time the text of the Bible version was changed. But that clearly doesn’t happen very often, which would seem to cast doubt on your reason. I’ll acknowledge however that having an index of perhaps double the size due to these 1 and 2 letter combos would make searches much slower even if the index didn’t ever need rebuilding.

    Incidentally most of the blocked words are not two letters at all, they’re just very common words. I’d assumed that the block was primarily for user benefit, so that a search for “David and Jonathan” didn’t return all the “and”s in the Bible. But perhaps there is a technical reason behind that as well.

  16. Ryan says:

    There is a technical reason behind that, and no, the entire site has to be re-indexed as the search may only be limited to one bible (your query) but the database is indexed for the entire site. Also, the size would not double but be closer to 1 x 10^26 in size due to having to index every letter (i.e. it would not index just I as in myself, but also every time the letter i is used such as in ‘i’ndex). I’m not trying to be combative, just trying to shed a little light on how database search engines work. For example, search in google for the letter ‘a’ and see how many results turn up, and then try it again and note the difference in results. Also consider Google’s primary business is search technology and they have over 15,000 servers re-indexing the web and it still takes a week or so for their “spiders” (code that indexes sites) to index a new page that crops up on the internet.

    BTW, love the blog!

  17. josiah says:

    Yet the issue doesn’t seem to be two letter words at all.

    “I note that it is not true that no two letter combinations can be searched for. I can search The Message for “qu”, indeed also for “q”, and I get a valid result for all 758 words beginning like this. Why can’t I do the same for “ok”, when words beginning with these letters are probably much less common?”

  18. Ryan says:

    Hmm… good question Josiah. I do not know the intricacies of their database search design. I know the php script I use on my site limits all 1 or 2 letter words due to the amount of indexing and time that would take (and I have a small site). Perhaps the dbase engineer who designed it for them felt those common words would require too many resources to index?

  19. Peter Kirk says:

    Ryan, thank you for your helpful comments. But I am still somewhat confused – although I too have worked as a relational database designer. So I have a question for clarification:

    Is the performance loss actually dependent on the length of the word, or only on how common that word, or any word beginning with those letters, is in the text?

    If the latter, as I suspect, surely it would be more important to exclude longer initial strings which are relatively common, like “inter”, than shorter but less common ones (at least in biblical texts) like “ok”. I note that “qu” and “q” have not been excluded, showing that the problem is not just with one and two letter search strings.

    It seems to me that the system is designed rather inefficiently if all the indexes have to be rebuilt on any change to any version, especially if no cross-version search is possible. Nevertheless new versions are added rather rarely, and the total corpus size is less than half a gigabyte (less than 100 versions, typically 4-5 MB per whole Bible), so this hardly seems a great burden.

  20. Peter Kirk says:

    Ryan, in response to your last comment (6:21 pm), perhaps the issue is that some database engineer decided for himself or herself which combinations to exclude based on general English usage, without reference to the nature of the texts or the fact that many of them are not in English at all. This could have unfortunate consequences if, for example, the word for “sin” or “love” in another language was one of those common three letter English words. If words like this are to be excluded, the choice needs to be made carefully and subject to review.

  21. Ryan says:

    @ Peter,

    No, the performance loss comes in during the indexing portion, specifically with word length. Imagine logging every single instance the letter a was found in scripture and returning that query! The query itself might indeed come back quick (as long as the tables were designed efficiently), but the indexing would take a lot of processing time. The designer may very well have listed some “dirty” words to exclude from indexing. This is fairly common in search engine technologies, however I tend to agree that if one and two letter words are searchable then why not include them all?

    As to your 6:37 PM comment, I could not agree more. It seems a little arbitrary to include qu and exclude ok especially considering what you’ve already noted with languages other than English such as the Greek ous or en (transliterated of course).

  22. Ryan says:

    It does seem to be random though. For example, try ins and Gen 3:16; it does not return a result although pai will show a result using the same word ‘pains’. I suspect that the script they have for indexing their site does a lousy job of searching the entire string in the database entry for Genesis 3:16 (and others). They should probably get new code which will index correctly all entries in their databases (the website Bible translations).

    Just goes to show you programmers are not perfect!

  23. josiah says:

    I disagree. The purpose of this search utility isn’t to find sub-strings in words in the Bible. It’s to find where the Bible refers to a certain concept, possibly being a single key verse that someone cannot remember the reference to. As such it would be counter-productive for “jam” to return references to Benjamin. It also makes it easier to index by removing from the sheer volume of letter combos those that do not start a word.

    Having a start of word capable search is more worthwhile however. For one thing human minds are more likely to recall the start of a complex word (such as the start of a name) than any other part. It also means that searches for “love” will return “loves” and “loved”, which are typically good matches. It sounds to me like a good compromise between matching good matches and ignoring red herrings.
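The difference Josiah describes between substring matching and start-of-word matching can be sketched as follows. The word list and the function names are made up for illustration, and this says nothing about how Bible Gateway implements its matching.

```python
# Illustrative word list only, not taken from any particular version.
WORDS = ["Benjamin", "love", "loves", "loved", "beloved"]


def substring_matches(words, query):
    """Match the query anywhere inside a word."""
    q = query.lower()
    return [w for w in words if q in w.lower()]


def prefix_matches(words, query):
    """Match the query only at the start of a word."""
    q = query.lower()
    return [w for w in words if w.lower().startswith(q)]


print(substring_matches(WORDS, "jam"))  # picks up "Benjamin", a red herring
print(prefix_matches(WORDS, "jam"))     # no match at the start of any word
print(prefix_matches(WORDS, "love"))    # "love", "loves", "loved"
```

Start-of-word matching drops the "Benjamin" hit for "jam" while still returning the inflected forms of "love", which is the compromise described above.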

  24. fireandsalt says:

    Josiah,

    It looks like you’ve figured out their coding for their query. It will only return results of words that start with the specific query string. For example it will return Gen 36:5 for a search on Ko; but no results are found when searching for orah.

    I also don’t disagree with the premise that a search engine for Biblical texts should return results in this manner. When preparing a Bible study, I’d look up a keyword using the start of a word rather than something in the middle or end of a word as well.

    Search engines can be designed a multitude of different ways it seems.

  25. Peter Kirk says:

    Ryan and Josiah, thanks for the further comments. I had already understood that matches were only on the starts of words, although I had not made that explicit. I agree that that is a sensible policy, both for meeting users’ needs and also because it would, I think, greatly simplify indexing and searching.

    My comments on indexing rather assumed that it was based on a list of words in the text in alphabetical order, which can be searched very easily for start of word partial matches – find the first word that matches the string, then iterate through the list until you hit a word which does not match. But I don’t know if that is the actual strategy used.
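The strategy Peter outlines (find the first word that matches, then iterate until a word does not match) can be sketched with a binary search over a sorted word list. This is a hedged illustration of that idea only, not a claim about the site's actual design; `prefix_search` and the sample word list are hypothetical, and the words are assumed to be lowercased already.

```python
from bisect import bisect_left


def prefix_search(sorted_words, prefix):
    """Return every word in a sorted list that starts with the given prefix."""
    # Binary search for the first position at or after the prefix.
    start = bisect_left(sorted_words, prefix)
    matches = []
    for i in range(start, len(sorted_words)):
        word = sorted_words[i]
        if not word.startswith(prefix):
            break  # sorted order: no later word can match either
        matches.append(word)
    return matches


# Hypothetical vocabulary, lowercased and sorted once when the index is built.
WORDS = sorted(
    {"and", "oil", "ointment", "okay", "old", "quail", "queen", "quench", "quick"}
)

print(prefix_search(WORDS, "ok"))
print(prefix_search(WORDS, "qu"))
```

With this layout the work per query is one binary search plus one step per matching word, so the cost depends on how many words match rather than on how short the search string is.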

  26. Peter Kirk says:

    By the way, Ryan “fireandsalt”, welcome to this blog – and welcome back to the world above water! I have skimmed both the blogs you linked to. I can’t agree with you on the Trinity, and don’t have a clue about the theological basis for the objection to men wearing shorts (apparently a fundamental part of your confession of faith). But it will be good to keep in touch both here and, if you like, on my blog.

  27. Ryan says:

    Kirk,

    Thanks for the kind words. I have fixed the problem with the spam trap on my blog, it seems I have the filter set too high, and when it detected a link, it marked you as a spammer. Problem fixed. As to the theological differences we may have, I’ll be glad to discuss on either my blog or yours, but as this site states specifically that theological discussions are unwelcome, I’ll refrain from commenting on “Better Bibles”.

    Thanks again for the welcome, and I look forward to engaging you in conversation in the future.

    God bless,

    Ryan

  28. Jack says:

    The issue isn’t of BibleGateway.com, but of search engine algorithms. “And” is so prolific that it’s universally ignored ==> because the results are generally worthless to the user.

    From all the issues raised in this ‘complaint’, I don’t see any which have merit or a real impact to users.

    Cheers,

    Jack

  29. Peter Kirk says:

    Jack, I realise that “and” is a special case because it is sometimes used, like “or”, as a search engine operator. But surely it is for the user, not the software designer, to decide which results are of use to the user. I can think of many scenarios in which a user might want to locate the word “and”, not least because in some languages this is a significant content word – it means “vow” in the language I was working with. Thus the problem with your argument is that you “don’t see any [issues] which have merit or a real impact to users” because you are looking in quite the wrong direction.

  30. Danny says:

    Did I really just spend all that time reading this?

    “No one can do anything unless God in heaven allows it.”-John the Baptist

  31. Seth Knorr says:

    I know how you feel when it comes to Bible search software and sites. I was a volunteer youth Pastor for over nine years, and I always thought Bible searches should be better. Since I am also a computer programmer I decided to create a Bible search engine that would be comparable to secular search engines.

    I am still in the process of adding more translations and I do not make any word found in the translation irrelevant.

    Here are some of the key features.
    As you type Bible passages appear in a drop down to auto suggest results that match your criteria. All results are returned by relevance so you find the verses you were looking for appearing first. The search includes an integrated spell checker and results are returned in either a literal or thought for thought and topical manner.

    You can read a full page of information on what makes it different here:
    http://www.smartbiblesearch.com/about_bible_search.php

    And you can try the search engine out here:
    http://www.smartbiblesearch.com/

    In Christ,

    Seth Knorr
    Founder
    Smart Bible Search

  32. Peter Kirk says:

    Thank you, Seth. That looks like a great tool. I would like to see a broader range of Bible versions with it, but I guess that will only work if you can either get publishers on board with it or have it integrated into one of the big well known online Bible search packages.

    Out of interest, do you see this as a commercial venture or as a continuation of your volunteer ministry?

  33. Seth Knorr says:

    Peter, sorry for the delayed response. I never got an email notification.

    Thanks for the link.

    The site is a contribution to help Christians search the Bible. Thus no ads. I am planning on trying to get more translations on there, I have done a couple at a time so far. I am currently working on putting the ISV on the site. 

    The only no I have had thus far was from the NKJV that basically sent me a letter that said they only license it if it is financially beneficial for them. I guess I was kind of confused by the letter.

    My next translation I want to add is the NLT and then probably the NIV and RSV. I think that will give me a good mix to start. Then down the road I will add more. I have to rewrite the letter I send to reference the newer features.

    I also want to then add commentaries. My end result would look like traditional Bible software you buy, but for free.

  34. Peter Kirk says:

    Seth, that’s a great vision you have. I am all in favour of what “would look like traditional Bible software you buy, but for free”. I wish you well with getting licenses.

    You could remind the NKJV team that just having their text on your site is giving them free advertising.
