Could Twitter Build a Better Full Text Search Engine?

Twitter's data holds different quality signals from Google's. Could Twitter use them to build a better search engine? This article puts the case for Twitter and argues that it might have the edge in creating a very different index, one which Google could not easily replicate.

It isn’t beyond the realms of possibility, and if I were Twitter I would be thinking about it.

It used to be that if you searched for a keyword on Twitter, Twitter would just look at the text in the tweets themselves. Then they started to look at the text in the URLs. I noticed later that they were also able to parse URL text via redirects, and now of course they have changed their own URL shortening system, which gives them more weight.

But is there an opportunity here for Twitter to create a better algorithm for sentiment analysis and a better algorithm for crawl priorities, which would give them a search engine edge over the likes of Google and Bing? I think there is, and I think that Twitter knows that the Twitter Fire Hose (the name given to the API that can dump every Twitter post out to a third party, if only the third party had the bandwidth to take it) has the ability to be a game changer in search. This is why they would have held out and were ultimately prepared to walk away from the deal with Google.

The thing is, I don’t think Google had been properly leveraging the data. It was being used for “realtime search”, but it does not look like Google had been confident enough about the deal to let it change Google’s main search methodology by weighting its main index based on the link mentions within the Fire Hose feed. If they had – and if they had let the weight of a page rise (and eventually fall) based on these links and the clout (or Klout) of the people shouting out on Twitter – then the results could have looked very different on Google.
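To make the "rise and eventually fall" idea concrete, here is a minimal sketch of how a page's weight might be computed from tweet mentions. Everything in it is an assumption for illustration: the function name `link_weight`, the 24-hour half-life, and the idea of a single numeric influence score per author are not taken from anything Twitter or Google has published.

```python
import math

# Assumed half-life: a mention loses half its weight every 24 hours.
HALF_LIFE_HOURS = 24.0


def link_weight(mentions, now_hours):
    """Return a time-decayed weight for one URL.

    mentions: iterable of (posted_at_hours, author_influence) pairs,
              where author_influence is a hypothetical clout score.
    now_hours: the current time, on the same clock as posted_at_hours.
    """
    decay = math.log(2) / HALF_LIFE_HOURS
    return sum(
        influence * math.exp(-decay * (now_hours - posted_at))
        for posted_at, influence in mentions
        if posted_at <= now_hours  # ignore mentions from the "future"
    )


# One mention by an author with influence 1.0, and one by an author
# with influence 5.0 posted twelve hours later.
mentions = [(0.0, 1.0), (12.0, 5.0)]
fresh = link_weight(mentions, now_hours=12.0)
stale = link_weight(mentions, now_hours=240.0)
```

With these assumptions `fresh` is much larger than `stale`: the page rises while influential people are talking about it and sinks once they stop.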

Twitter – of course – have this data all the time. It could be the basis for prioritising a crawler, and that’s half the battle in developing a search agent that needs to scale up to the size needed to index the web. Building a full text index is a massive task – as the Majestic 12 project has found – but you do not have to do it all at once. Twitter already index the URL and maybe the page title. What about the first paragraph only? Or the snippet from when the article appears on a blog home page? Basically, enough to get the drift of the article.
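A crawler prioritised this way could be as simple as a priority queue keyed on how loudly each URL is being mentioned. The sketch below is hypothetical: the class `CrawlFrontier`, its method names, and the per-mention `influence` value are made up for illustration, and a real crawler would also need politeness rules, deduplication of redirects and so on.

```python
import heapq


class CrawlFrontier:
    """Hands out URLs to crawl, most-mentioned first (illustrative only)."""

    def __init__(self):
        self._scores = {}  # url -> accumulated mention score
        self._heap = []    # (-score, url); may contain stale entries

    def record_mention(self, url, influence=1.0):
        """Add one tweet mention of url, weighted by author influence."""
        self._scores[url] = self._scores.get(url, 0.0) + influence
        # heapq is a min-heap, so push the negated score.
        heapq.heappush(self._heap, (-self._scores[url], url))

    def next_url(self):
        """Pop the highest-scoring URL not yet crawled, or None if empty."""
        while self._heap:
            neg_score, url = heapq.heappop(self._heap)
            # Skip entries left over from before the score was updated
            # or from before the URL was handed out.
            if self._scores.get(url) == -neg_score:
                del self._scores[url]
                return url
        return None


frontier = CrawlFrontier()
frontier.record_mention("http://example.com/a")
frontier.record_mention("http://example.com/b", influence=3.0)
frontier.record_mention("http://example.com/a")
first = frontier.next_url()  # the URL with the highest total score
```

Each URL popped from the frontier would then only need a shallow fetch (title, first paragraph or snippet) rather than a full text index of the page.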

Now – combine this with sentiment – which they can get from the context of the Tweets much more readily than Google can from crawling review sites – and Twitter really do have something potentially special.

All that remains would be the business case of trying to create yet another search engine. However – one of the factors in such a decision must be the knowledge that Google is developing its own competitor to Twitter and if Twitter are not careful, they will find their advantages erode from underneath them. So they need to fight back and soon.

2 thoughts on “Could Twitter Build a Better Full Text Search Engine?”

  1. Wasn’t Facebook going to push forward its own search functions a few months ago? It certainly has enough page views to make it a threat to Google. Would be nice to see people using other search options too!

    1. I think that Facebook’s plans involve just searching within Facebook – but of course they could also try that. Somehow, I think that Twitter are in more need of a strategic think about its model, though. Twitter is great, but can get undermined if it starts to lose market share.
