Search Patent Filings from 12-20-06 – Reranking on Information Redundancy and on Searcher Affinities
Some processes covered in patents granted last week by the US Patent and Trademark Office:
- Microsoft reranking results based upon redundancy of information,
- Ask reranking pages based upon affinities between searchers,
- Hewlett-Packard creating queries for searchers to investigate from scanned documents, and
- Exalead forming dynamic query refinements from words found in documents located within search results.
A number of new patent applications that involve search in some manner were also published. Microsoft describes targeted contextual advertising in alerts, and debuts two versions of a way to transform large-screen web pages to smaller sizes for handheld devices. Nokia echoes Microsoft with a similar process of its own.
IBM defines a way to visually provide information about the depth of a link on a site, and shows us a search engine that would allow searchers to refine search results during a session using dimensions such as time, place, and topic. A Qualcomm employee filed a patent application that would allow searchers to search based upon ratings of pages from experts in the subject matter associated with those pages. E-learning company Platform Learning, Inc., describes a search engine that would assign reading levels to pages, and allow searchers to find pages based in part upon those reading levels.
More about the patent filings…
Earlier this summer, Microsoft was granted a patent with the same name as the one below. I wrote about it in Microsoft Reranking and Filtering Redundant Information. This one calls itself a continuation of that patent.
The process described in this patent attempts to understand how similar or diverse a set of the highest ranking results might be for a search query, and may rerank the results served to searchers based upon that information. It could act to filter out repetitive documents and off-topic documents.
Utilizing information redundancy to improve text searches (7,152,057)
Invented by Eric D. Brill and Susan T. Dumais; Granted December 19, 2006; Filed January 20, 2006
Architecture for improving text searches using information redundancy. A search component is coupled with an analysis component to rerank documents returned in a search according to a redundancy values. Each returned document is used to develop a corresponding word probability distribution that is further used to rerank the returned documents according to the associated redundancy values. In another aspect thereof, the query component is coupled with a projection component to project answer redundancy from one document search to another. This includes obtaining the benefit of considerable answer redundancy from a second data source by projecting the success of the search of the second data source against a first data source.
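To make the reranking idea a little more concrete, here is a minimal sketch in Python. It builds a unigram word probability distribution for each returned document and scores each document by how much it overlaps with the others. The similarity measure (cosine), the averaging, and the function names are assumptions made for illustration; the abstract does not say how the redundancy values are actually computed.

```python
from collections import Counter
import math


def word_distribution(text):
    """Unigram probability distribution over the words of one document."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse word distributions."""
    dot = sum(p[w] * q.get(w, 0.0) for w in p)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def rerank_by_redundancy(docs):
    """Order documents so those that agree most with the rest come first.

    Assumption: the redundancy value of a document is its mean cosine
    similarity to every other returned document.
    """
    dists = [word_distribution(d) for d in docs]
    scores = []
    for i, p in enumerate(dists):
        others = [cosine(p, q) for j, q in enumerate(dists) if j != i]
        scores.append(sum(others) / len(others) if others else 0.0)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order]


results = [
    "jaguar car speed engine",
    "jaguar car engine review",
    "banana bread recipe",
]
reranked = rerank_by_redundancy(results)
```

In this toy example the off-topic recipe page shares no vocabulary with the other results, so it drops to the bottom of the list. A filter for near-duplicates could use the same scores with an upper threshold instead.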
Ask.com (IAC Search & Media, Inc.)
There are two parts to this patent – one involving monitoring and understanding affinities between searchers, and the other looking at their locations. An analogy might be what happens when you search for a book at Amazon.com. You might be told that users who looked at that book also looked at or purchased certain other books. Similarly, if you submit a query and click through to a certain result, later searches in that session may show results reranked according to how other searchers who made that same first choice behaved in their own query sessions.
Methods and systems for providing a response to a query (7,152,061)
Invented by Andy Curtis, Alan Levin, and Apostolos Gerasoulis; Granted December 19, 2006; Filed September 16, 2004
Methods and systems for providing a response to a user’s query based on other users’ picks. For one embodiment of the invention, user responses are correlated to determine an affinity among users. User affinity is then used to modify the presentation of the search results. For one embodiment the location of other user’s is used to modify the presentation of the search results.
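A small Python sketch of the affinity idea follows. It treats each searcher's history as a set of (query, picked URL) pairs, measures affinity as the Jaccard overlap between two searchers' picks, and boosts results that high-affinity searchers picked for the same query. The data layout, the Jaccard measure, and all names and URLs here are hypothetical; the patent abstract only says that user responses are correlated to determine affinity.

```python
def affinity(picks_a, picks_b):
    """Jaccard overlap between two searchers' sets of (query, url) picks."""
    union = picks_a | picks_b
    return len(picks_a & picks_b) / len(union) if union else 0.0


def rerank_by_affinity(results, query, user, picks_by_user):
    """Move results picked by similar searchers toward the top.

    Assumption: each other searcher votes for the URLs they picked for
    this query, weighted by their affinity with the current user.
    """
    my_picks = picks_by_user.get(user, set())
    boost = {url: 0.0 for url in results}
    for other, picks in picks_by_user.items():
        if other == user:
            continue
        weight = affinity(my_picks, picks)
        for picked_query, url in picks:
            if picked_query == query and url in boost:
                boost[url] += weight
    # sorted() is stable, so unboosted results keep their original order.
    return sorted(results, key=lambda url: boost[url], reverse=True)


picks_by_user = {
    "me": {("python", "python.org")},
    "dev": {("python", "python.org"), ("jaguar", "jaguar-docs.example")},
    "driver": {("cars", "cars.example"), ("jaguar", "jaguar-cars.example")},
}
ranked = rerank_by_affinity(
    ["jaguar-cars.example", "jaguar-docs.example"], "jaguar", "me", picks_by_user
)
```

Because "me" and "dev" made the same earlier pick, the result "dev" chose for "jaguar" moves ahead of the one chosen by a searcher with no shared history.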
You scan a document filled with text, or an object that has words and labels upon it, or a combination of both, and send the scanned copy off to a search engine. The process described in this patent involves taking that scanned image and creating a set of phrases related to it that a searcher can use to find more information about what was scanned. The patent is fairly narrow, and doesn’t cover any type of object recognition other than letters.
Information research initiated from a scanned image media (US Patent 7,151,864)
Invented by Steven G. Henry, Kristin M. Smith, and John P. Wolf; Granted December 19, 2006; Filed September 18, 2002
A device scans text of an image media and generates text data corresponding to the scanned text. A research component of the device generates a phrase list from the text data and initiates research for information corresponding to the phrase list.
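As a rough illustration of the phrase list step, the Python sketch below takes text that has already been recognized from a scan and pulls out runs of consecutive non-stopwords as candidate search phrases. The stopword list, the minimum phrase length, and the sample text are assumptions; the abstract does not describe how the phrase list is generated, and the OCR step itself is left out.

```python
import re

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "on", "for", "is", "are", "with", "by"}
)


def phrase_list(text, min_words=2):
    """Return unique multi-word phrases found between stopwords."""
    phrases = []
    seen = set()
    # Split on punctuation first so a phrase never spans two sentences.
    for chunk in re.split(r"[.,;:!?\n]+", text.lower()):
        current = []
        # The trailing None flushes the last run of words in the chunk.
        for word in re.findall(r"[a-z0-9']+", chunk) + [None]:
            if word is not None and word not in STOPWORDS:
                current.append(word)
                continue
            if len(current) >= min_words:
                phrase = " ".join(current)
                if phrase not in seen:
                    seen.add(phrase)
                    phrases.append(phrase)
            current = []
    return phrases


scanned_text = "Installation guide for the LaserJet printer. Replace the toner cartridge."
queries = phrase_list(scanned_text)
```

Each phrase in the resulting list could then be submitted to a search engine as its own query.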
This patent, for French search engine Exalead, shows a mix of suggested query refinements to the right of search results. Some come from an existing taxonomy of topics or categories, and others are keywords derived from the documents returned by the search.
Searching tool and process for unified search using categories and keywords (7,152,064)
Invented by Francois Bourdoncle, Patrice Bertin, and Eric Jeux; Granted December 19, 2006; Filed August 14, 2001
A database of entries, such as Web pages and sites, is provided. The entries are at least partially mapped to a set of predetermined categories. The entries are also associated with keywords, for instance, by automatic indexing of documents. In response to a query into the database, a user is provided with a series of refinement strategies, in addition to search results. Refinement strategies comprise categories relevant for the search, selected among the set of predetermined categories. Refinement strategies also include keywords dynamically selected among keywords associated with the entries. The user may easily navigate among the results to the query, and formulate new queries.
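The two kinds of refinement in the abstract can be sketched in a few lines of Python. Each entry in the result set is assumed to carry a list of predetermined categories and a list of indexed keywords; the sketch counts how many results fall in each category and how many mention each keyword, then suggests the most common ones. The entry layout, the frequency-based selection, and the names are assumptions for illustration only.

```python
from collections import Counter


def refinements(results, query, top_n=3):
    """Suggest categories and keywords for narrowing a result set."""
    query_terms = set(query.lower().split())
    category_counts = Counter()
    keyword_counts = Counter()
    for entry in results:
        # Count each category and keyword at most once per result.
        category_counts.update(set(entry["categories"]))
        keyword_counts.update(
            k for k in set(entry["keywords"]) if k not in query_terms
        )
    return {
        "categories": [c for c, _ in category_counts.most_common(top_n)],
        "keywords": [k for k, _ in keyword_counts.most_common(top_n)],
    }


results = [
    {"categories": ["Animals"], "keywords": ["jaguar", "habitat", "rainforest"]},
    {"categories": ["Animals"], "keywords": ["jaguar", "habitat"]},
    {"categories": ["Cars"], "keywords": ["jaguar", "dealer"]},
]
suggested = refinements(results, "jaguar", top_n=1)
```

Words already in the query are skipped, since offering "jaguar" as a refinement of a search for "jaguar" would not narrow anything.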
We’ve seen advertisements with our emails, and advertisements in feeds. Adding targeted advertisements based upon the content of an alert doesn’t seem like a stretch.
Advertisements in an alert interface (US Patent Application 20060282312)
Invented by Matthew C. Carlson, Todd S. Biggs, and Mark T.K. Looi; Published December 14, 2006; Filed June 10, 2005
Advertisements are incorporated into alerts generated for a user. A server-side alert delivery system can be used by a partner server, client device, or other server to deliver an alert to a user about some event the user has requested. A relevant advertisement is selected based upon the content of the alert and other data. In some cases, the time of day of the alert and/or the location to which the alert is delivered may be considered in selecting the advertisement. The alert may be provided to a user through different services and devices, including an instant messaging service, mail service, or through a mobile device.
Phones, PDAs, and other handheld devices that can access the internet aren’t often planned for by site designers. Microsoft describes a way that it might be able to break down and understand how a web page should be displayed on smaller screens, based upon a combination of analyzing the HTML on the page and looking at borders, whitespace, and the way information is chunked together on a page into headers, footers, sidebars, and content areas. I wrote about some of the implications of a search engine better understanding how a page is put together in Smaller Screens Make Smarter Search Engines. In that post, I point out three documents from Microsoft and Google that delve deeper into how a process like this works.
Small Form Factor Web Browsing (US Patent Application 20060282444)
Invented by Yu Chen, Wei-Ying Ma, Ming-Yu Wang, and Hong Jiang Zhang; Published December 14, 2006; Filed August 18, 2006
A large web page is analyzed and partitioned into smaller sub-pages so that a user can navigate the web page on a small form factor device. The user can browse the sub-pages to find and read information in the content of the large web page. The partitioning can be performed at a web server, an edge server, at the small form factor device, or can be distributed across one or more such devices. The analysis leverages design habits of a web page author to extract a representation structure of an authored web page. The extracted representation structure includes high level structure using several markup language tag selection rules and low level structure using visual boundary detection in which visual units of the low level structure are provided by clustering markup language tags. User viewing habits can be learned to display favorite parts of a web page.
A variation of the above patent application was also filed and published by Microsoft:
In addition to the search engines, some phone companies and service providers are considering the best ways to break a larger web page down into smaller parts. The following patent application from Nokia describes one approach:
System and method for identifying segments in a web resource (US Patent Application 20060282758)
Invented by Kevin Simons, Robert Katta, Mitri Abou-Rizk, and William Papp; Published December 14, 2006; Filed June 10, 2005
A robust, lightweight, bottom-up segmentation method for Internet content. According to the present invention, individual segments are created based upon weights assigned according to document structure and markup elements and semantics. Smaller segments are then merged into larger segments by determining which portions of the content page are related to each other. The remaining segments are then intelligently divided based upon device constraints.
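The merge-then-divide flow in that abstract might look something like the following Python sketch. It starts from a flat list of (tag, text) blocks, merges each block into the segment before it unless its tag carries a heavy enough boundary weight, and then splits any segment that is too long for the device. The weight table, the threshold, and the character limit standing in for "device constraints" are all hypothetical values chosen for the example.

```python
# Hypothetical boundary weights: higher means "more likely to start a new segment".
BOUNDARY_WEIGHT = {"h1": 3, "h2": 2, "hr": 2, "p": 0, "li": 0}


def segment(blocks, threshold=2, max_chars=200):
    """Group (tag, text) blocks into device-sized lists of text."""
    # Step 1: merge each block into the previous segment unless its tag
    # signals a boundary at or above the threshold.
    merged = []
    for tag, text in blocks:
        if merged and BOUNDARY_WEIGHT.get(tag, 0) < threshold:
            merged[-1].append(text)
        else:
            merged.append([text])

    # Step 2: divide any segment whose text would exceed the device limit.
    pages = []
    for seg in merged:
        page, size = [], 0
        for text in seg:
            if page and size + len(text) > max_chars:
                pages.append(page)
                page, size = [], 0
            page.append(text)
            size += len(text)
        pages.append(page)
    return pages


blocks = [
    ("h1", "News"),
    ("p", "a" * 120),
    ("p", "b" * 120),
    ("h2", "Sports"),
    ("p", "c" * 50),
]
pages = segment(blocks)
```

Here the "News" section is too long for one small screen and is divided in two, while the short "Sports" section stays together as a single sub-page.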
Not all links are equal, at least based upon how deeply into a site they might have visitors travel. Imagine being able to tell that from looking at a link. I’ve wondered if the adoption of something like this might have implications for indexing the pages on a site.
Depth indicator for a link in a document (US Patent Application 20060282765)
Invented by Gregory Richard Hintermeister and Michael D. Rahn; Published December 14, 2006; Filed June 9, 2005
A method, apparatus, system, and signal-bearing medium that, in an embodiment, determine a tree representing links embedded in documents, create a depth indicator having a size proportional to the size of the tree, and display the depth indicator with a root link in a root document. The tree is determined by repeatedly probing the links to retrieve the documents. In various embodiments, the size of the tree may be the number of levels in the tree or the number of links in the tree. The depth indicator may include representations of the links and represents a possible future context of the root document. In various embodiments, a graphical representation of the tree may be displayed, hover help that includes the tree size may be displayed, and an indication of a condition reported by a document may be displayed. In various embodiments, the condition may include a message, updated content, new content, or an error.
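Here is a minimal Python sketch of measuring the tree behind a link and turning that into a simple text indicator. Instead of probing live pages, it walks an in-memory map of page to outgoing links, which is an assumption made to keep the example self-contained. The function names, the level cap, and the bar-style indicator are also hypothetical.

```python
from collections import deque


def link_tree_size(root, links, max_levels=5):
    """Breadth-first walk from root; returns (levels, link_count).

    `links` maps each page to the pages it links to. Pages already seen
    are skipped, so cycles back toward the root do not inflate the tree.
    """
    seen = {root}
    queue = deque([(root, 0)])
    levels = 0
    link_count = 0
    while queue:
        url, depth = queue.popleft()
        if depth >= max_levels:
            continue
        for target in links.get(url, []):
            if target in seen:
                continue
            seen.add(target)
            link_count += 1
            levels = max(levels, depth + 1)
            queue.append((target, depth + 1))
    return levels, link_count


def depth_indicator(levels, width=5):
    """Text bar whose filled portion grows with the number of levels."""
    filled = min(levels, width)
    return "[" + "#" * filled + "." * (width - filled) + "]"


site = {"/": ["/a", "/b"], "/a": ["/a/1"], "/a/1": ["/"], "/b": []}
levels, link_count = link_tree_size("/", site)
indicator = depth_indicator(levels)
```

Either number could drive the indicator, which matches the abstract's note that the size of the tree may be the number of levels or the number of links.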
Imagine being able to search and then add refinements to a query based upon a number of different dimensions, such as time, content type (newspaper articles, business magazines, books, etc.), geography, and topic. The process in this patent application would enable those choices.
System and method for performing a high-level multi-dimensional query on a multi-structural database (US Patent Application 20060282411)
Invented by Ronald Fagin, Ramanathan V. Guha, Phokion Gerasimos Kolaitis, Jasmine Gina Novak, Shanmugasundaram Ravikumar, Dandapani Sivakumar, and Andrew Stephen Tomkins; Published December 14, 2006; Filed June 13, 2005
A multi-structural query system performs a high-level multi-dimensional query on a multi-structural database. The query system enables a user to navigate a search by adding restrictions incrementally. The query system uses a schema to discover structure in a multi-structural database. The query system leaves a choice of nodes to return in response to a query as a constrained set of choices available to the algorithm. The query system further casts the selection of a set of nodes as an optimization. The query system uses pairwise-disjoint collections to capture a concise set of highlights of a data set within the allowed schema. The query system further comprises efficient algorithms that yield approximately optimal solutions for several classes of objective functions.
Would you use a search engine that allowed you to sort results based upon ratings from experts in the subjects involved? This patent application, invented by an employee of Qualcomm, describes what such a search engine might be like.
Internet search engine with critic ratings (20060282336)
Invented by Ian Tzeung Huang; Published December 14, 2006; Filed June 8, 2006
The present invention provides an internet search engine and associated website which provides users with ranked website search results. In an aspect of the present invention, the search engine and associated website provides a critical rating function. Critics can be human experts who review websites on the internet and rate and comment on them. Users apply to become critics, and their applications are reviewed for acceptability by other critics. Critics are selected in particular professions for their expertise in those areas. The critics provide a rating and comments in relation to a site, or to other online content, including text, audio and video, among other things. Ratings and comments are also available to users. In other words, the present invention provides for at least two levels of critical review: critics’ review and users’ review. In an aspect of the present invention, an advanced critic sorting mechanism is provided.
Platform Learning, Inc.
The following is assigned to an e-learning company, and describes a search engine that would score pages with readability tests and allow results to be sorted based upon those scores.
System and method for a search engine using reading grade level analysis (20060282413)
Invented by Victor Joseph Bondi; Published December 14, 2006; Filed November 3, 2005
A system and method presents search results relevant to a search query of a database based on user criteria, such as reading grade level. Reading grade level is used to rank and characterize relevant search results. The determined reading grade level of the search results provides quick and easy access to relevant documents and provides a measure of cognitive ability indicative of the content of the search page result. The system and method obtains an initial set of relevant search results from a corpus of documents in a database and determines the reading grade level of the search result documents. The system and method displays the determined reading grade level of the search results with the search results to provide an easy index or ranking.
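To show what scoring and sorting by reading level could involve, the Python sketch below uses the well-known Flesch-Kincaid grade level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. Choosing that formula is an assumption; the abstract does not name the readability test used. The syllable counter is a crude vowel-group heuristic, and the function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: number of vowel groups, at least one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def reading_grade(text):
    """Flesch-Kincaid grade level of a passage (0.0 for empty text)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def annotate_by_grade(results):
    """Pair each result with its grade level, easiest reading first."""
    return sorted(((reading_grade(t), t) for t in results), key=lambda pair: pair[0])


easy = "The cat sat. The dog ran."
hard = "Multidimensional optimization necessitates sophisticated computational infrastructure."
ordered = annotate_by_grade([hard, easy])
```

The grade stored with each result could be shown next to it on the results page, or used to filter results to a range a searcher asks for.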
Disclaimer: Patents are filed to protect ideas and methods developed as part of the intellectual property of a company, and may be used to exclude others from using the same, or similar processes, but the granting of a patent or publication of a patent application doesn’t necessarily mean that the processes involved have been fully developed, or will be in the future. Yet, the documents can provide some insight into the ideas that an organization is working upon, and may act as a starting point for more research.
Opinions expressed in this article are those of the guest author and not necessarily Search Engine Land.