Efficient computation of character offsets for token-oriented representation of program code
First Claim
1. A software engineering tool encoded in one or more computer readable media as instructions executable to represent program code as a doubly-linked list of lexical tokens and to maintain, consistent with operations thereon, both a token-coordinates representation and a character-coordinates representation of an insertion point.
2 Assignments
0 Petitions
Accused Products
Abstract
An editor, software engineering tool or collection of such tools may be configured to encode (or employ an encoding of) an insertion point in both token-coordinates and character-coordinates. Efficient implementations of insert, remove and replace operations that employ and maintain such a representation are described herein. Some realizations further maintain a total buffer size encoding consistent with each such operations. Computational costs of such operations typically scale at worst with the size of fragments inserted into and/or removed from such a token-oriented representation, rather than with buffer size. Accordingly, such implementations are particularly well-suited to providing efficient support for programming tool environments in which a token stream is updated incrementally in correspondence with user edits.
118 Citations
45 Claims
- 1. A software engineering tool encoded in one or more computer readable media as instructions executable to represent program code as a doubly-linked list of lexical tokens and to maintain, consistent with operations thereon, both a token-coordinates representation and a character-coordinates representation of an insertion point.
-
16. A method of providing character-oriented coordinates for an insertion point in an edit buffer represented as a sequence of lexical tokens, the method comprising:
-
representing the edit buffer as a doubly-linked list of nodes, each node corresponding to a respective one of the lexical tokens; and
representing the insertion point in the edit buffer, the insertion point representation identifying;
(i) a particular one of the lexical tokens corresponding to the insertion point; and
(ii) a total character offset of the insertion point into the edit buffer. - View Dependent Claims (17, 18, 19, 20, 21, 22, 23, 24)
-
-
25. One or more computer readable media encoding a data structure that represents contents of an edit buffer as a sequence of lexical tokens, the encoded data structure comprising:
-
a doubly linked list of nodes;
token representations each corresponding to at least one respective node of the list, wherein at least some of the token representations have associated string encodings; and
an insertion point representation identifying, for the insertion point, both a particular one of the lexical tokens and a total character offset into the edit buffer. - View Dependent Claims (26, 27, 28, 29, 30, 31, 32, 33, 34, 35)
-
-
36. A method of supporting access by one or more software engineering tools to program code, wherein at least one such tool operates on the program code as a token sequence and at least one such tool operates on the program code as a character sequence, the method comprising:
-
maintaining a representation of the program code as a doubly-linked list of nodes, each node corresponding to a lexical token thereof; and
responsive to repositioning of an insertion point, updating a representation thereof that identifies;
a particular one of the lexical tokens;
a character offset into a string associated with the particular lexical token; and
a total character offset into the program code. - View Dependent Claims (37, 38, 39, 40, 41)
-
-
42. An apparatus comprising:
-
storage for a computer readable encoding of an edit buffer represented as a sequence of lexical tokens; and
means for representing an insertion point in the edit buffer, the insertion point identifying both a particular one of the lexical tokens and a total character offset into the edit buffer. - View Dependent Claims (43, 44, 45)
-
Specification