Smart string replacement
Abstract
String replacement is performed in text using linguistic processing. The linguistic processing identifies the existence of direct or indirect links between the string to be replaced and other strings in the text. Morphological, syntactic, anaphoric, or semantic inconsistencies, which are introduced in strings with the identified direct or indirect links to the string that is to be replaced, are detected and corrected.
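The idea in the abstract can be pictured with a small sketch. It is only an illustration under stated assumptions: the lexicon (`NOUN_GENDER`, `AGREEING_FORMS`), the `Token` class, and the hand-written agreement links are all hypothetical and do not come from the patent, which does not prescribe an implementation. Replacing a feminine French noun with a masculine one leaves the linked determiner and adjective inconsistent, so the sketch re-inflects them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy lexicon: gender of each noun, and gendered surface forms
# of the words that must agree with it. Not taken from the patent.
NOUN_GENDER = {"voiture": "f", "camion": "m"}
AGREEING_FORMS = {
    ("le", "m"): "le", ("le", "f"): "la",
    ("vert", "m"): "vert", ("vert", "f"): "verte",
}

@dataclass
class Token:
    text: str
    lemma: str
    head: Optional[int] = None  # index of the noun this token agrees with

def replace_with_agreement(tokens, source, target):
    """Replace `source` with `target`, then re-inflect tokens linked to it."""
    out = [Token(t.text, t.lemma, t.head) for t in tokens]  # leave input intact
    gender = NOUN_GENDER[target]
    for i, tok in enumerate(out):
        if tok.text != source:
            continue
        tok.text = tok.lemma = target
        # Correct every token whose form depends on the replaced noun.
        for dep in out:
            form = AGREEING_FORMS.get((dep.lemma, gender))
            if dep.head == i and form is not None:
                dep.text = form
    return " ".join(t.text for t in out)

sentence = [
    Token("la", "le", head=1),
    Token("voiture", "voiture"),
    Token("est", "être"),
    Token("verte", "vert", head=1),
]
print(replace_with_agreement(sentence, "voiture", "camion"))  # le camion est vert
```

A plain search-and-replace would have produced "la camion est verte"; the links are what allow the determiner and adjective to be corrected.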
20 Claims
1. A method to replace a source string in a document with a target string, comprising:
- preprocessing textual content via a computer by;
1) tokenizing text in said document
2) collecting information concerning morphological information, part-of-speech disambiguation, syntactic dependencies, anaphoric dependencies and semantic relationships in the textual content and
3) labeling the document with such information, wherein said tokenizing includes multiword tokenization wherein textual input is tokenized into non-isolated units;
- selecting a target string that is placed at one or more locations within the document via the computer;
- selecting a source string to replace the target string via the computer;
- morpho-syntactically disambiguating textual content of the document via the computer;
- identifying a set of string dependencies, via the computer, to detect grammatical or anaphoric dependencies, or both, between the strings in the textual content of the document;
- disambiguating one or more of gender, number, or part of speech and prompting user-specified disambiguation, via the computer, if the source string or the target string have more than at least one of one possible meaning, gender, and number;
- identifying occurrences of the source string in the document that satisfy the user specifications via the computer;
- identifying string relations from the set of string dependencies that define direct or indirect links, or both, to the source string via the computer;
- assessing whether replacing the source string with the target string is semantically coherent via the computer;
- replacing each occurrences of the source string in the document that satisfy the user specifications with the target string via the computer;
- correcting grammatical and anaphoric inconsistencies beyond the phrase level, via the computer, in the string relations in the document that are introduced when the source string is replaced with the target string; and
- outputting the document.
Dependent claims: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
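Two steps of claim 1, multiword tokenization and the identification of direct or indirect links to the source string, can be sketched as follows. This is a minimal illustration, not the claimed method itself: the multiword list `MULTIWORDS`, the greedy longest-match strategy, and the representation of dependencies as `(controller, dependent)` index pairs are assumptions made for the example.

```python
from collections import deque

# Hypothetical multiword units that should stay together as one token.
MULTIWORDS = {("pomme", "de", "terre"), ("New", "York")}

def tokenize(text, multiwords=MULTIWORDS):
    """Whitespace tokenizer that keeps known multiword units as single tokens."""
    words = text.split()
    max_len = max((len(m) for m in multiwords), default=1)
    tokens, i = [], 0
    while i < len(words):
        # Try the longest candidate first; a single word always matches.
        for n in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if n == 1 or chunk in multiwords:
                tokens.append(" ".join(chunk))
                i += n
                break
    return tokens

def linked_tokens(start, dependencies):
    """Indices whose form depends, directly or indirectly, on token `start`.

    `dependencies` holds (controller, dependent) pairs, an assumed encoding
    of grammatical or anaphoric dependencies.
    """
    graph = {}
    for controller, dependent in dependencies:
        graph.setdefault(controller, set()).add(dependent)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen

print(tokenize("la pomme de terre est cuite"))
# ['la', 'pomme de terre', 'est', 'cuite']
print(sorted(linked_tokens(1, [(1, 0), (1, 3), (3, 5), (2, 4)])))
# [0, 3, 5]
```

In the second call, tokens 0 and 3 are linked directly to token 1, while token 5 is reached only through token 3, which is the kind of indirect link the claim refers to.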
12. A method to replace a source string in a document with a target string, comprising:
- selecting a target string that is placed at one or more locations within the document via a computer;
- selecting a source string, via the computer, to replace the target string;
- selecting one of more of a part of speech, a gender, and a number for each of the target string and the source string, via the computer, to replace in the document;
- morpho-syntactically disambiguating textual content of the target and source strings, via the computer, using two levels of analysis;
1) examining the parts-of-speech on their own, and if ambiguities are found,
2) examining each occurrence of said source string in the textual content of the document, wherein disambiguating the textual content includes prompting user-specified disambiguation if the source string or the target string have one or more of one possible meaning, gender, and number;
- identifying occurrences of the source string in the document that satisfy user specifications via the computer;
- identifying a first set of possible senses for the source string and a second set of possible senses for the target string via the computer;
- assessing, via the computer, whether replacing the source string having the first set of possible senses with the target string having the second set of possible senses is semantically coherent;
- replacing each occurrences of the source string in the document that satisfy the user specifications with the target string via the computer;
- outputting a warning, via the computer, when the replacement of the source string with the target string is not semantically coherent;
- correcting grammatical and anaphoric inconsistencies, via the computer, in the document that are introduced when the source string is replaced with the target string; and
- outputting the document.
Dependent claims: 13, 14, 15
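The sense-set comparison and warning in claim 12 might look roughly like the sketch below. The claim does not say how coherence is assessed, so the criterion used here (the two sense sets share at least one coarse semantic class) is purely an assumption, as are the `SENSES` inventory and the `check_coherence` helper.

```python
# Hypothetical sense inventory: string -> set of coarse semantic classes.
SENSES = {
    "car": {"vehicle"},
    "truck": {"vehicle"},
    "bank": {"institution", "riverside"},
    "banana": {"fruit"},
}

def check_coherence(source, target, senses=SENSES):
    """Return (is_coherent, warning). Assumed criterion: overlapping sense sets."""
    source_senses = senses.get(source, set())  # first set of possible senses
    target_senses = senses.get(target, set())  # second set of possible senses
    if source_senses & target_senses:
        return True, None
    warning = (
        f"warning: replacing '{source}' with '{target}' "
        "may not be semantically coherent"
    )
    return False, warning

print(check_coherence("car", "truck"))       # (True, None)
print(check_coherence("car", "banana")[1])
```

In line with the claim, the warning accompanies the replacement rather than blocking it; whether to proceed is left to the caller.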
16. A method to replace a source string in a document with a target string, comprising:
- preprocessing textual content, via a computer, to;
1) tokenize text in said document,
2) collect information concerning morphological information, part-of-speech disambiguation, syntactic dependencies, anaphoric dependencies and semantic relationships in the textual content and
3) label the document with such information, wherein said tokenizing includes multiword tokenization wherein textual input is tokenized into non-isolated units;
- selecting a target string, via the computer, that is placed at one or more locations within the document;
- selecting a source string, via the computer, to replace the target string;
- selecting one of more of a part of speech, a gender, and a number for each of the target string and the source string, via the computer, for replacement in the document;
- morpho-syntactically disambiguating textual content of the target and source strings, via the computer, using two levels of analysis;
1) examining the parts-of-speech on their own, and if ambiguities are found,
2) examining each occurrence of said source string in the textual content of the document;
- identifying a set of string dependencies by detecting grammatical dependencies between strings in the textual content of the document via the computer;
- disambiguating one or more of gender, number, or part of speech and prompting a user to provide user-specified disambiguation, via the computer, if the source string or the target string have one or more of various possible meanings, genders, and numbers;
- identifying occurrences of the source string in the document, via the computer, that satisfy the user specifications;
- identifying string relations from the set of string dependencies, via the computer, that define direct or indirect links, or both, to the source string;
- replacing each occurrences of the source string in the document, via the computer, that satisfy the user specifications with the target string;
- correcting grammatical and anaphoric inconsistencies, beyond the phrase level, in the string relations in the document, via the computer, that are introduced when the source string is replaced with the target string; and
- outputting the document;
wherein the disambiguation of the source string or the target string is performed before replacing each occurrences of the source string in the document that satisfy the user specifications with the target string.
Dependent claims: 17, 18, 19, 20
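The two levels of analysis recited in claim 16 can be illustrated with a toy part-of-speech disambiguator. Everything named here is hypothetical: the `LEXICON`, the cue-word sets, the `disambiguate_pos` function, and the `ask_user` callback standing in for the user prompt. The context rule (look at the preceding word) is a deliberately crude assumption, not the analysis the patent describes.

```python
# Hypothetical lexicon and context cues, for illustration only.
LEXICON = {"book": {"NOUN", "VERB"}, "table": {"NOUN"}}
NOUN_CUES = {"the", "a", "this"}
VERB_CUES = {"to", "will"}

def disambiguate_pos(word, words, ask_user):
    """Map each occurrence index of `word` in `words` to a part of speech."""
    candidates = LEXICON.get(word, set())
    positions = [i for i, w in enumerate(words) if w == word]
    # Level 1: the string on its own. One candidate settles every occurrence.
    if len(candidates) == 1:
        tag = next(iter(candidates))
        return {i: tag for i in positions}
    # Level 2: each occurrence examined in its context.
    result = {}
    for i in positions:
        prev = words[i - 1] if i > 0 else None
        if prev in NOUN_CUES and "NOUN" in candidates:
            result[i] = "NOUN"
        elif prev in VERB_CUES and "VERB" in candidates:
            result[i] = "VERB"
        else:
            # Still ambiguous: fall back to a user-specified choice.
            result[i] = ask_user(i, sorted(candidates))
    return result

words = "I will book the book".split()
print(disambiguate_pos("book", words, ask_user=lambda i, options: options[0]))
# {2: 'VERB', 4: 'NOUN'}
```

Consistent with the final wherein clause, such a function would run before any replacement, so that only occurrences matching the user's chosen part of speech are replaced.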
Specification