User interface operation based on token frequency of use in text
Abstract
Operation of a user interface includes performing token-based analysis of a baseline text corpus and a targeted text listing. For a selected token in the targeted text listing, a matching baseline token is identified. From a plurality of contexts corresponding to the matching baseline token, context-matched and non-context-matched usage data for the matching baseline token is identified and provided to a user interface. Similar processing may be performed on the basis of a related, but non-matching, baseline token. In another embodiment, instances of similar spelling errors are identified on the basis of a plurality of tokens identified in the targeted text listing.
24 Claims
1. A method for providing a user interface of a machine for production of electronic text-based documents and, the machine comprising at least one processor and a display, the method comprising:
identifying, by the at least one processor, at least one baseline token in a baseline text corpus comprising text corresponding to a selected domain;
identifying, by the at least one processor, for each of the at least one baseline token, a plurality of corresponding baseline contexts;
for each of the at least one baseline token, and for each of the plurality of corresponding baseline contexts for the baseline token, determining, by the at least one processor, frequency of use data;
storing, by the at least one processor, the frequency of use data in association with the baseline token and the corresponding baseline context;
identifying, by the at least one processor, at least one token in a targeted text listing;
identifying, by the at least one processor, for each of the at least one token, a corresponding context based on the targeted text listing;
for a selected token of the at least one token in the targeted text listing, identifying, by the at least one processor, context-matched usage data and non-context-matched usage data for a matching baseline token that matches the selected token, wherein the context-matched usage data comprises frequency of use data for the matching baseline token in a first baseline context that matches the corresponding context of the selected token, wherein the non-context-matched usage data further comprises frequency of use data for the matching baseline token in a second baseline context that does not match the corresponding context of the selected token, and wherein the matching baseline token, the first baseline context and the second baseline context are based on the baseline text corpus; and
providing, by the at least one processor to the user interface on the display, the context-matched usage data and the non-context-matched usage data.
Dependent claims: 2, 3, 4, 5, 6, 7, 8
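The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not define what a "context" is or how text is tokenized, so this sketch assumes that a context is a section label attached to each passage and that tokens are lowercased words. The names `tokenize`, `build_baseline`, and `usage_data`, and the toy corpus, are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Naive tokenizer (an assumption): lowercased runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_baseline(corpus):
    """Count how often each token occurs in each context.

    corpus: iterable of (context_label, text) pairs.
    Returns a mapping {token: Counter({context_label: count})}.
    """
    freq = defaultdict(Counter)
    for context, text in corpus:
        for token in tokenize(text):
            freq[token][context] += 1
    return freq


def usage_data(freq, token, context):
    """Split the baseline counts for `token` into context-matched and
    non-context-matched parts, relative to `context`."""
    by_context = freq.get(token, Counter())
    matched = by_context.get(context, 0)
    unmatched = {c: n for c, n in by_context.items() if c != context}
    return matched, unmatched


# Toy baseline corpus; the section labels stand in for "contexts" (assumption).
baseline_corpus = [
    ("requirements", "The system shall log each request. The system shall encrypt data."),
    ("summary", "The system will log requests and shall remain simple."),
]
freq = build_baseline(baseline_corpus)

# Targeted text listing: one passage tagged with its context.
targeted = [("requirements", "The service shall retry")]
for context, text in targeted:
    for token in tokenize(text):
        matched, unmatched = usage_data(freq, token, context)
        print(f"{token!r} in {context!r}: matched={matched}, other contexts={unmatched}")
```

In this sketch a token that never appears in the baseline corpus simply yields a zero matched count and an empty set of other contexts. A real user interface would presumably present that case differently.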
9. An apparatus for production of electronic text-based documents based on a user interface, the apparatus comprising:
at least one processor operatively connected to a display;
a storage device operatively connected to the at least one processor and having stored thereon instructions that, when executed by the at least one processor, cause the at least one processor to:
identify at least one baseline token in a baseline text corpus comprising text corresponding to a selected domain;
identify, for each of the at least one baseline token, a plurality of corresponding baseline contexts;
for each of the at least one baseline token, and for each of the plurality of corresponding baseline contexts for the baseline token, determine frequency of use data;
store the frequency of use data in association with the baseline token and the corresponding baseline context;
identify at least one token in a targeted text listing;
identify, for each of the at least one token, a corresponding context based on the targeted text listing;
for a selected token of the at least one token in the targeted text listing, identify context-matched usage data and non-context-matched usage data for a matching baseline token that matches the selected token, wherein the context-matched usage data comprises frequency of use data for the matching baseline token in a first baseline context that matches the corresponding context of the selected token, wherein the non-context-matched usage data further comprises frequency of use data for the matching baseline token in a second baseline context that does not match the corresponding context of the selected token, and wherein the matching baseline token, the first baseline context and the second baseline context are based on the baseline text corpus; and
provide, to the user interface on the display, the context-matched usage data and the non-context-matched usage data.
Dependent claims: 10, 11, 12, 13, 14, 15, 16
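Claim 9 recites storing the frequency of use data "in association with the baseline token and the corresponding baseline context" but does not describe a storage layout. One possible layout, sketched here purely as an assumption, is a table keyed on the (token, context) pair, using Python's standard `sqlite3` module. The table name `baseline_usage`, the column names, and the helpers `store` and `lookup` are hypothetical, as are the counts.

```python
import sqlite3

# In-memory database for illustration; a real apparatus would use persistent storage.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baseline_usage ("
    " token TEXT NOT NULL,"
    " context TEXT NOT NULL,"
    " n_uses INTEGER NOT NULL,"
    " PRIMARY KEY (token, context))"
)


def store(token, context, n_uses):
    """Store one frequency-of-use figure keyed by (token, context)."""
    conn.execute(
        "INSERT OR REPLACE INTO baseline_usage VALUES (?, ?, ?)",
        (token, context, n_uses),
    )


def lookup(token, context):
    """Return (context-matched uses, uses summed over all other contexts)."""
    matched = conn.execute(
        "SELECT COALESCE(SUM(n_uses), 0) FROM baseline_usage"
        " WHERE token = ? AND context = ?",
        (token, context),
    ).fetchone()[0]
    unmatched = conn.execute(
        "SELECT COALESCE(SUM(n_uses), 0) FROM baseline_usage"
        " WHERE token = ? AND context <> ?",
        (token, context),
    ).fetchone()[0]
    return matched, unmatched


# Illustrative counts only; these are not taken from any real corpus.
store("shall", "requirements", 2)
store("shall", "summary", 1)
store("shall", "appendix", 3)

print(lookup("shall", "requirements"))
```

Summing the non-matching contexts into a single figure is a design choice of this sketch. The claim only requires usage data for at least one second baseline context, so reporting each non-matching context separately would fit it equally well.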
17. A non-transitory, machine-readable medium having stored thereon instructions that, when executed by at least one processor, cause the at least one processor to operate as an apparatus for production of electronic text-based documents based on a user interface and to:
identify at least one baseline token in a baseline text corpus comprising text corresponding to a selected domain;
identify, for each of the at least one baseline token, a plurality of corresponding baseline contexts;
for each of the at least one baseline token, and for each of the plurality of corresponding baseline contexts for the baseline token, determine frequency of use data;
store the frequency of use data in association with the baseline token and the corresponding baseline context;
identify at least one token in a targeted text listing;
identify, for each of the at least one token, a corresponding context based on the targeted text listing;
for a selected token of the at least one token in the targeted text listing, identify context-matched usage data and non-context-matched usage data for a matching baseline token that matches the selected token, wherein the context-matched usage data comprises frequency of use data for the matching baseline token in a first baseline context that matches the corresponding context of the selected token, wherein the non-context-matched usage data further comprises frequency of use data for the matching baseline token in a second baseline context that does not match the corresponding context of the selected token, and wherein the matching baseline token, the first baseline context and the second baseline context are based on the baseline text corpus; and
provide, to the user interface on a display operatively connected to the at least one processing device, the context-matched usage data and the non-context-matched usage data.
Dependent claims: 18, 19, 20, 21, 22, 23, 24
Specification