LSI…a Google SEO myth or reality?

December 2, 2009 by · 5 Comments
Filed under: SEO 

So what is the Google LSI and why should I care?

LSI, or latent semantic indexing, is supposedly a method Google uses to make its growing index of pages and keywords rank somewhat “closer” to how a human being would rank them. That is, if a person read your web pages and then recorded in the Google index what they thought you meant by your keywords and phrases, that is what LSI supposedly attempts to do algorithmically for similar words and their usage…at least that’s the common interpretation among the web SEO collective.

First, you should know that Google has NO LSI patents out there…there are some from 2006 that are “sorta” on this, but not on point…test this on Google itself!

So what is LSI in practical terms? 

The premise is that similar words can carry similar weight in Google…i.e. cars and automobiles…follow me here? You see, semantics is the study of what words mean, and historically search engines have not looked at words semantically at all…it was decided that meaning was too tough to figure out, say whether car and auto were a match. But supposedly today, Google and its LSI consider the way two terms are used as a clue to what the searcher wants to find…and therefore, if you were to map the use of ALL words, you would see relationships linking some of them. Again, car and auto is a valid example, whereas car and snowshoe are not used in the same way and so would not be linked semantically…hence, supposedly, Google’s LSI indexing links all words together and then attempts to figure out which of them relate to each other…i.e. the level of meaning is what’s important!
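To make the "map the use of all words" idea concrete, here is a minimal sketch in Python. It is NOT Google's method and not full LSI either (real LSI goes a step further and applies a truncated singular value decomposition to the term-document matrix); it only compares how often two terms turn up in the same documents. The toy corpus and the function names are made up for illustration.

```python
import math
from collections import Counter

# Toy corpus, invented purely for illustration.
DOCS = [
    "the car dealer sells a used car and a new automobile",
    "automobile insurance covers your car after an accident",
    "rent a car or automobile at the airport",
    "a snowshoe helps you walk on deep snow",
    "winter hiking with a snowshoe and poles on snow",
]


def term_vector(term, docs):
    """Count how many times `term` occurs in each document."""
    return [Counter(doc.split())[term] for doc in docs]


def cosine(a, b):
    """Cosine similarity of two equal-length count vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def term_similarity(t1, t2, docs=DOCS):
    """Similarity of two terms based on which documents they appear in."""
    return cosine(term_vector(t1, docs), term_vector(t2, docs))


print(term_similarity("car", "automobile"))  # high: they share documents
print(term_similarity("car", "snowshoe"))    # zero: they never co-occur
```

In this toy data, car and automobile score high because they keep showing up in the same documents, while car and snowshoe never meet. That is the "usage" relationship described above, in miniature.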

For instance, back to snowshoe and snowshoes: the plural of a term is similar to the singular, hence linked…but there are some real clinkers in this indexing feature. Think about tense and word form…shrink, shrinking and shrunken all pertain to the act of getting smaller, but notice how the spelling changes too? Or how about words that are spelled exactly the same but have different meanings? Lead and lead…one is a metal and one means to forge ahead. Or smith and smith, where one is a surname and the other an occupation? Or synonyms, terms that are related but not identical, like student and pupil…or happy and gay…or baby and infant…do you see the issues an algorithm needs to overcome to provide valid LSI search query answers? Ah, but wait, there’s more…
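Here is a rough Python sketch of why these cases are hard for a purely string-based approach. The `naive_stem` function is a deliberately crude, made-up suffix stripper (not a real stemming library and not anything Google is known to use); it handles the easy plural and "-ing" cases and falls over on exactly the clinkers listed above.

```python
def naive_stem(word):
    """Strip a trailing 'ing' or plural 's'; deliberately crude."""
    word = word.lower()
    if word.endswith("ing") and len(word) > 5:
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def same_term(a, b):
    """Treat two words as 'the same term' if their crude stems match."""
    return naive_stem(a) == naive_stem(b)


print(same_term("snowshoe", "snowshoes"))  # True: plural handled
print(same_term("shrink", "shrinking"))    # True: '-ing' handled
print(same_term("shrink", "shrunken"))     # False: irregular form missed
print(same_term("lead", "lead"))           # True: metal vs. verb not told apart
print(same_term("student", "pupil"))       # False: synonyms share no spelling
```

Spelling alone gets the plurals right, misses the irregular forms and synonyms, and cannot separate two meanings of an identically spelled word.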

So here’s a test to see whether Google’s supposed LSI is incorporated…do a Google search for the singular and plural of a term…try car and cars. If LSI is working, the two should come out the same, or very, very close…did they? No, they didn’t, did they…the total result counts are off and the two Top 10s share only 1 common hit, a Wikipedia listing. Then try a term in different tenses, say shrink and shrunken…close? Nope, though they do share that same 1 common Top 10 listing at Wikipedia, not really a surprise there, is it! Then try synonyms…car and automobile…again, not close, are they?
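If you want to run this kind of comparison yourself, the overlap between two Top 10 lists can be measured with a few lines of Python. This is only a sketch: the URLs below are made-up placeholders (not real search results) arranged to share a single hit, and you would paste in the listings you actually get.

```python
def top_n_overlap(results_a, results_b, n=10):
    """Return the shared URLs and the Jaccard overlap of two top-n lists."""
    top_a, top_b = set(results_a[:n]), set(results_b[:n])
    shared = top_a & top_b
    union = top_a | top_b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Placeholder listings, invented for illustration: one common hit only.
car_results = ["wikipedia.example/car"] + [f"site-a{i}.example" for i in range(9)]
cars_results = [f"site-b{i}.example" for i in range(9)] + ["wikipedia.example/car"]

shared, jaccard = top_n_overlap(car_results, cars_results)
print(sorted(shared))     # the URLs both lists have in common
print(round(jaccard, 3))  # 1.0 would mean identical top-n sets
```

A Jaccard score near 1.0 would mean the two searches return nearly the same pages; one shared hit out of ten on each side lands close to zero.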

If LSI were incorporated into Google, it should show up in these results, but we do not find it, do we? There is no evidence after a few test searches, is there? And if you’re an SEO practitioner, then what “works” is what counts for your SERPs…all else is futile!

So…what does that mean for SEO? Ignore the talk on LSI, is what I’d offer up here…if you have a site for an autobody shop, then use “collision repair” and “accident repairs”, but don’t expect to rank anywhere for keyword phrases like “car repairs” (which would include, say, mechanical engine work) or “wheel replacements” or “best body shop”…’cause without using those keywords IN your on-page placements, you just won’t show up for those terms!
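A quick way to audit this on your own pages is to check which target phrases literally appear in the page copy. The Python sketch below assumes exact phrase matching, which is the working assumption of this post and not a description of how Google actually scores pages; the sample page text and the function name are made up.

```python
import re


def phrases_on_page(page_text, phrases):
    """Map each phrase to True/False: does it literally appear in the text?"""
    # Lowercase and collapse whitespace so line breaks do not hide a phrase.
    normalized = re.sub(r"\s+", " ", page_text.lower())
    return {phrase: phrase.lower() in normalized for phrase in phrases}


# Sample copy, invented for illustration.
page = """Our autobody shop handles collision repair
and accident repairs for all makes."""

targets = ["collision repair", "accident repairs", "car repairs",
           "wheel replacements", "best body shop"]

for phrase, found in phrases_on_page(page, targets).items():
    print(f"{phrase}: {'on page' if found else 'MISSING'}")
```

Note that this is plain substring matching, so "collision repair" would also be found inside "collision repairs". Tighten it with word boundaries if that matters for your keyword list.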

LSI is a myth!

Comments

5 Responses to “LSI…a Google SEO myth or reality?”
  1. Nice write up. I often use the LSI features of Google for keyword research. Works well to see what they think matches up =)

    Josh

  2. Dave says:

    Hey there… uhm, interesting stuff but WHERE does it say that Google ever employed LSI? The original hype was a misunderstood purchase of Applied Semantics back in 2003 which used LSI… interestingly enough they had a system (for Ads) called; AdSense… sooo…. ever since people have been yammering about LSI and Google…

    More here;

    http://www.huomah.com/Search-Engines/Search-Engine-Optimization/Latent-Semantic-Indexing-and-Google-One-more-time.html

    And an older one;

    http://www.huomah.com/search-engines/algorithm-matters/stay-off-the-lsi-bandwagon.html

    Those should give you the idea… It is far more likely they’re using (more robust) methods such as PLSA or even PaIR (phrase based indexing and retrieval). Point being that they certainly DO use various semantic analysis methods, but most certainly LSI isn’t one of them (beyond AdSense).

    Also, you mentioned, “there are some from 2006 that are “sorta” on this” – are we talking about the ‘Phrase Based Indexing and Retrieval’ patents from Anna Patterson? (now running Cuil) – because that is NOT even close to LSI actually…although I’ve seen a few SEOs try to imply this.

    I am a new reader of your blog and had been enjoying some of the posts – this one is problematic. I have a habit of ranting about SEOs that don’t understand search engines; this one does raise the blood level some.. LSI in Google’s organic SERPs is definitely a myth, but Google’s use of semantic analysis is not.

    If you want some more details or reading do get in touch (or come join us at the SEO Dojo – we have a ton of geeky reading). I am very passionate about stopping such misconceptions

    Oh… or read all the stuff in this post; http://www.huomah.com/Search-Engines/Algorithm-Matters/SEO-Higher-learning.html

    Anyway, all in all still some good posts (from fellow Canucks) – this one was def ‘Dave Bait’ (as Michael Martinez is now fond of saying). I look forward to more goodiness

    Ciao… and happy holidays!

  3. Jim says:

    @Dave….

    hmmm…well to begin with I totally do agree with you…hopefully, my last paragraph tells the reader to IGNORE the LSI tactics, same as your own posts do. Not to be difficult here, but being an SEO forum “haunter” I see LSI questions almost every day, so I thought it behooved me to discuss it. Yes, the SEO community may well believe that G practices LSI…but as a simple set of tests shows, it does not exist at an empirical level..hence my opinion…and this piece. Nice tho to see that we really do agree…..and no, this is NOT < dave bait > either :-)))

  4. Dan Lew says:

    Interesting stuff, I somewhat agree that LSI is a myth. There was some discussion about it many years ago, but still nothing has really been taken seriously. As for now, Google’s backlink patent seems to work well for me :)
