The Internet is a treasure trove of knowledge, especially for students in search of immediate information gratification. However, the ‘Net contains billions of files, and unless you know the exact URL of the one you want, you’re going to have to rely on search engines to help you unearth the info you need.
Search engines are tools that allow you to search for information available on the Web using keywords and search terms. Rather than searching the Web itself, however, you are actually searching the engine’s database of files.
Search engines are actually three separate tools in one. The spider is a program that “crawls” through the Web, moving from link to link, looking for new web pages. Once it finds new sites or files, they are added to the search engine’s index. This index is a searchable database of all the information that the spider has found on the Web. Some engines index every word in each document, while others select certain words. The search engine itself is a piece of software that allows users to search the engine’s database. Clearly, an engine’s search is only as good as the index it’s searching.
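To make the three parts concrete, here is a minimal sketch in Python. It is an illustration only, not how any real engine is built: the "web" is a small made-up dictionary rather than live pages, and the names TOY_WEB, crawl, build_index, and search are hypothetical choices for this example.

```python
from collections import deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("Search engines index the web", ["b.example", "c.example"]),
    "b.example": ("Spiders crawl from link to link", ["c.example"]),
    "c.example": ("The index is a searchable database", ["a.example"]),
    "d.example": ("An orphan page that no other page links to", []),
}

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def crawl(start_url, web):
    """Spider: follow links breadth-first and return the pages reached."""
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

def build_index(pages):
    """Index: map every word to the set of URLs that contain it."""
    index = {}
    for url, text in pages.items():
        for word in tokenize(text):
            index.setdefault(word, set()).add(url)
    return index

def search(index, query):
    """Search software: return the URLs that contain every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

pages = crawl("a.example", TOY_WEB)
index = build_index(pages)
print(search(index, "index"))   # ['a.example', 'c.example']
print(search(index, "orphan"))  # [] because the spider never reached d.example
```

Because nothing links to d.example, the spider never finds it, so it never enters the index and no query can return it. This sketch indexes every word on each page; an engine that selects only certain words would simply filter inside build_index.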
When you run a query using a search engine, you’re really only searching the engine’s index of what’s on the Web, as opposed to the entire Web. No one search engine is capable of indexing everything on the Web; there’s just too much information out there! Additionally, many spiders cannot or will not enter databases or index files. Consequently, much information is left out of search engine results, including breaking news, documents, multimedia files, images, tables, and other data. Collectively, these types of resources are referred to as the deep or invisible Web: they’re buried deep in the Web and are invisible to search engines. While many search engines cover some areas of the deep web, most of these resources require special tools to unearth them.
Estimates vary, but the deep web is much larger than the surface web. By some estimates, it holds approximately 500 times as much information as the surface web. This consists of multimedia files, including audio, video, and images; software; documents; dynamically changing content such as breaking news and job postings; and information that’s stored in databases, for example, phone book records, legal information, and business data. Clearly, the deep web has something to offer almost any student researcher.
The easiest way to find information on the deep web is to use a specialized search engine. Many search engines index only a very small portion of the deep web; however, some engines target the deep web specifically. If you need to find a piece of information that’s likely to be classified as part of the deep web, search engines that focus on such content are your best bet.
Like surface web engines, deep web search engines may also sell advertising in the form of paid listings. They differ in their coverage of deep web content and offer dissimilar advanced search options. Engines that search the deep web can be classified as first vs. second generation, individual vs. meta, and/or separate vs. collated retrieval, just as with surface web engines. Thus, you’ll need to familiarize yourself with the options that are available and gradually add the best engines to your bag of research tricks.