System, methods, and user interface for presenting information from unstructured data
1 Assignment
Abstract
A system, methods, and user interface are disclosed for extracting information from unstructured data sources and presenting that information in a structured or semi-structured format for better information search and utilization. They can also be applied to replace the conventional methods of displaying search results. The methods identify terms representing topics and related comments in various types of text content, including documents and Web pages, then extract those terms and present them as a topic-comment or object-properties hierarchy, including a heading+list format and a heading+cloud or group format. Methods and an interface object are provided to make a file object a non-terminal node in a computer file system, with information extracted from the file content displayed as deeper levels of the file system hierarchy. Methods for displaying information extracted from unstructured document contents in terms of class-members and topic-attributes are also disclosed.
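The heading+list format mentioned in the abstract can be pictured with a minimal sketch. The function name `render_heading_list` and the sample topic and comments are hypothetical and chosen only for illustration; the patent does not prescribe any particular rendering code. The sketch expresses the subordinate relation purely by position and indentation.

```python
def render_heading_list(topic, comments):
    """Render a topic as a heading with its comments as an indented list.

    Illustrative assumption: the hierarchy is shown by placing the topic on
    its own line and indenting each comment beneath it.
    """
    lines = [topic]
    lines.extend(f"  - {comment}" for comment in comments)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_heading_list(
        "camera",
        ["takes sharp pictures", "has a short battery life"],
    ))
```

A heading+cloud or group format would keep the same topic line but lay the comments out as an unordered cluster rather than a list.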
49 Citations
12 Claims
1. A computer-implemented method for processing information in a closed document residing on a computer file system, comprising:
displaying, in a computer file system user interface, a name or icon of an electronic file, wherein the electronic file is in a closed state, wherein the electronic file is an object in the computer file system, wherein the computer file system comprises one or more folders or subfolders and at least one file in a tree structure;
allowing a user to act on the name or icon of the electronic file, or on a user interface object associated with the name or icon of the electronic file, wherein the user action comprises moving a pointing device over, clicking, or touching, or a voice or visually activated action;
in response to a user action, receiving a first term and a second term extracted from a text content associated with the electronic file;
determining a two-part display format that represents a hierarchical relation between the first term and the second term, wherein the two-part display format comprises a first part and a second part;
displaying the first term in the first part; and
displaying the second term in the second part, wherein the second term is displayed as an item subordinate to the first term in the hierarchical format, wherein the subordinate relationship is defined by a visual format including a heading-body relation, or a difference in size, color or character style, or position, or annotation, wherein the first term and the second term are obtained by:
(a) receiving a user-generated text content in the electronic file,
(b) tokenizing the text content into terms, each term comprising a word or a phrase or a sentence,
(c) identifying a first term in the text content,
(d) identifying an attribute associated with the first term using a machine-based algorithm, wherein the attribute comprises a grammatical, semantic, positional, or frequency attribute,
(e) assigning an importance measure to the first term based on the attribute,
(f) selecting the first term for extraction if the importance measure is above a threshold,
(g) identifying a sentence containing the first term and the second term,
(h) identifying a grammatical structure in the sentence, wherein the grammatical structure comprises components and one or more types of relations between the components, wherein the components and relations comprise a grammatical subject in relation to a non-subject portion of the sentence, or a multi-word phrase comprising a head term in relation to a modifier term,
(i) determining the first term and the second term as two components in one of the one or more types of relations in the grammatical structure, and
(j) extracting the first term and the second term based on the type of relation.
Dependent Claims (2, 3, 4, 5)
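Steps (a) through (j) of claim 1 can be sketched roughly as follows. This is only an illustration under strong assumptions, not the patented implementation: the attribute is a plain word frequency, the importance measure is that raw count, and a naive "first word after any leading determiner is the subject" rule stands in for real grammatical analysis. All names (`STOPWORDS`, `DETERMINERS`, `important_terms`, `extract_pairs`) are hypothetical.

```python
import re
from collections import Counter

# Assumed word lists for the sketch; a real system would use richer resources.
STOPWORDS = {"the", "a", "an", "is", "are", "has", "have", "and", "of", "in", "to", "it"}
DETERMINERS = {"the", "a", "an"}


def split_sentences(text):
    """Split text into sentences at ., !, or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Step (b): tokenize the text into lowercase word terms."""
    return re.findall(r"[a-z]+", text.lower())


def important_terms(text, threshold):
    """Steps (c)-(f): use frequency as the attribute and importance measure,
    keeping terms whose count is above the threshold."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return {term for term, n in counts.items() if n > threshold}


def extract_pairs(text, threshold=1):
    """Steps (g)-(j): for each sentence, treat the first non-determiner word
    as the subject (first term) and the rest as the non-subject portion
    (second term). This heuristic is a stand-in for a real parser."""
    selected = important_terms(text, threshold)
    pairs = []
    for sentence in split_sentences(text):
        words = sentence.rstrip(".!?").split()
        while words and words[0].lower() in DETERMINERS:
            words.pop(0)
        if len(words) < 2:
            continue
        subject, rest = words[0].lower(), " ".join(words[1:])
        if subject in selected:
            pairs.append((subject, rest))
    return pairs


if __name__ == "__main__":
    sample = ("The camera takes sharp pictures. "
              "The camera has a short battery life. The lens is heavy.")
    for first, second in extract_pairs(sample):
        print(first, "->", second)
```

In this sketch only "camera" occurs more than once, so it is the only term selected as a first term; each (first term, second term) pair could then feed the two-part display recited in the claim.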
6. A computer-implemented method for processing information in a closed document residing on a computer file system, comprising:
displaying, by a computing device, a first user interface object associated with or comprising a name or icon of an electronic file, wherein the electronic file is in a closed state, wherein the electronic file is an object in a computer file system, wherein the name or icon of the electronic file is a node in a computer file system hierarchy, wherein the computer file system comprises one or more folders or subfolders and at least one file in a tree structure, wherein the folders or subfolders are non-terminal nodes in the tree structure;
enabling the first user interface object to respond to a user action;
receiving a user action on the first user interface object, wherein the action comprises moving a pointing device over the first user interface object, clicking, or touching on the first user interface object, or a voice or visually activated action;
in response to the user action, changing the first user interface object to a non-terminal node in the file system hierarchy if the first user interface object is a terminal node in the file system hierarchy;
creating a lower-level node under the non-terminal node in the file system hierarchy, wherein the lower-level node comprises at least a first display area associated with the first user interface object;
obtaining a first term automatically extracted by a machine from the content in the electronic file, wherein the first term comprises a word or a phrase; and
displaying the first term in the first display area, wherein the first term is obtained by:
(a) tokenizing the text content of the electronic file into terms, each term comprising a word or a phrase or a sentence,
(b) identifying a first term in the text content,
(c) identifying an attribute associated with the first term using a machine-based algorithm, wherein the attribute comprises a grammatical, semantic, positional, or frequency attribute,
(d) assigning an importance measure to the first term based on the attribute, and
(e) extracting the first term from the text content if the importance measure is above a threshold.
Dependent Claims (7, 8, 9, 10, 11, 12)
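The node handling recited in claim 6 can be pictured with a small tree sketch. The `Node` class, the `kind` labels, and the functions `expand_file_node` and `render` are hypothetical names introduced only for this illustration. The sketch assumes the extracted terms are already available and models each lower-level node's display area as a child node carrying one term.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a file system hierarchy as shown in a user interface."""
    name: str
    kind: str  # assumed labels: "folder", "file", or "term"
    children: list = field(default_factory=list)

    @property
    def is_terminal(self):
        # A node with no children is a terminal (leaf) node.
        return not self.children


def expand_file_node(node, terms):
    """On a user action, turn a terminal file node into a non-terminal node
    by creating lower-level nodes that display the extracted terms."""
    if node.kind != "file":
        raise ValueError("only file nodes can be expanded this way")
    if node.is_terminal:
        node.children.extend(Node(term, "term") for term in terms)
    return node


def render(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


if __name__ == "__main__":
    root = Node("docs", "folder", [Node("review.txt", "file")])
    expand_file_node(root.children[0], ["camera", "battery"])
    print("\n".join(render(root)))
```

The file stays closed throughout: only its node in the displayed hierarchy changes, gaining children that show content-derived terms.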
Specification