Efficient computation of character offsets for token-oriented representation of program code

US 20040003374A1
Filed: 06/28/2002
Published: 01/01/2004
Est. Priority Date: 06/28/2002
Status: Abandoned Application

Abstract

An editor, software engineering tool or collection of such tools may be configured to encode (or employ an encoding of) an insertion point in both token-coordinates and character-coordinates. Efficient implementations of insert, remove and replace operations that employ and maintain such a representation are described herein. Some realizations further maintain a total buffer size encoding consistent with each such operation. Computational costs of such operations typically scale at worst with the size of fragments inserted into and/or removed from such a token-oriented representation, rather than with buffer size. Accordingly, such implementations are particularly well-suited to providing efficient support for programming tool environments in which a token stream is updated incrementally in correspondence with user edits.
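
The cost property described in the abstract can be illustrated with a minimal sketch. It is not the applicants' implementation: the names `Node`, `TokenBuffer`, `point`, `point_offset` and `total_size` are hypothetical, the insertion point is assumed to sit on a token boundary, and sentinel nodes are an added convenience. Each edit touches only the tokens inserted or removed, while the character offset and total size are kept current.

```python
class Node:
    """A token node in a doubly-linked list (hypothetical layout)."""

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


class TokenBuffer:
    """Token list with an insertion point held in both coordinate systems."""

    def __init__(self):
        # Sentinel nodes (an assumption of this sketch) simplify splicing.
        self.head = Node("")
        self.tail = Node("")
        self.head.next = self.tail
        self.tail.prev = self.head
        self.point = self.tail   # token coordinates: point sits just before this node
        self.point_offset = 0    # character coordinates: offset from start of buffer
        self.total_size = 0      # total characters in the buffer

    def insert(self, texts):
        """Insert tokens before the insertion point; work is O(len(texts))."""
        for text in texts:
            node = Node(text)
            before = self.point.prev
            before.next = node
            node.prev = before
            node.next = self.point
            self.point.prev = node
            self.point_offset += len(text)
            self.total_size += len(text)

    def remove(self, count):
        """Remove up to `count` tokens before the insertion point; O(count)."""
        removed = []
        while count > 0 and self.point.prev is not self.head:
            node = self.point.prev
            node.prev.next = self.point
            self.point.prev = node.prev
            self.point_offset -= len(node.text)
            self.total_size -= len(node.text)
            removed.append(node.text)
            count -= 1
        removed.reverse()
        return removed

    def replace(self, count, texts):
        """Replace the `count` tokens before the point with new tokens."""
        removed = self.remove(count)
        self.insert(texts)
        return removed

    def text(self):
        """Concatenate all token strings (full scan; for inspection only)."""
        parts = []
        node = self.head.next
        while node is not self.tail:
            parts.append(node.text)
            node = node.next
        return "".join(parts)
```

In this sketch, only `text()` scans the whole buffer. The edit operations never do, so their cost follows the fragment size rather than the buffer size.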

45 Claims

1. A software engineering tool encoded in one or more computer readable media as instructions executable to represent program code as a doubly-linked list of lexical tokens and to maintain, consistent with operations thereon, both a token-coordinates representation and a character-coordinates representation of an insertion point.
- Dependent claims (2 to 15):
  - 2. The software engineering tool of claim 1, wherein the operations include insertion point repositioning operations.
  - 3. The software engineering tool of claim 1, wherein the operations include edit operations.
  - 4. The software engineering tool of claim 3, wherein the edit operations include one or more of:
    - insertion operations;
    - removal operations; and
    - replacement operations.
  - 5. The software engineering tool of claim 1, wherein the token-coordinates representation identifies both a particular one of the lexical tokens and a substring offset into a substring associated with the particular lexical token.
  - 6. The software engineering tool of claim 1, wherein the character-coordinates representation identifies a character offset into the program code.
  - 7. The software engineering tool of claim 6, wherein the character offset is a total character offset from a start of buffer position in the program code.
  - 8. The software engineering tool of claim 6, wherein the character offset is from a particular position in the program code.
  - 9. The software engineering tool of claim 1, wherein, coincident with movement of the insertion point to a new position in the program code, the maintaining includes traversing from a particular position in the doubly-linked list toward the new position, and updating both the token-coordinates representation and character-coordinates representation of the insertion point in correspondence therewith.
  - 10. The software engineering tool of claim 9, wherein the particular position corresponds to the insertion point.
  - 11. The software engineering tool of claim 9, wherein the instructions are further executable to maintain a representation of total character count at an end of the represented program code;
    - and wherein the particular position corresponds to one of a beginning of represented program code, the insertion point, and the end of the represented program code, selected based on comparison with the new position to generally minimize computational overhead associated with the scanning and updating.
  - 12. The software engineering tool of claim 9, wherein computational overhead associated with the scanning and updating is generally insensitive to length of the represented program code, and instead exhibits no greater than O(N) scaling behavior, where N corresponds to scale of displacement from the particular position to the new position.
  - 13. The software engineering tool of claim 1, wherein the instructions are further executable to maintain, consistent with each operation that modifies the program code, a total character count.
  - 14. The software engineering tool of claim 1, configured as one or more of:
    - an editor;
    - a source level debugger;
    - a class viewer;
    - a profiler; and
    - an integrated development environment.
  - 15. The software engineering tool of claim 1, wherein the one or more computer readable media are selected from the set of a disk, tape or other magnetic, optical, or electronic storage medium and a network, wireline, wireless or other communications medium.
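
As a rough illustration of claims 5, 9, 10 and 12, the sketch below repositions an insertion point by traversing the doubly-linked list from its current position, updating the token coordinates and the character offset together. The names `Token`, `link`, `InsertionPoint` and `move_by` are hypothetical and do not come from the application; clamping at the buffer ends is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """One lexical token; a node in a doubly-linked list (hypothetical layout)."""
    text: str
    prev: Optional["Token"] = None
    next: Optional["Token"] = None


def link(texts):
    """Build a doubly-linked token list and return its first node."""
    head = tail = None
    for text in texts:
        node = Token(text)
        if tail is None:
            head = node
        else:
            tail.next, node.prev = node, tail
        tail = node
    return head


@dataclass
class InsertionPoint:
    token: Token       # token coordinates: which token ...
    offset: int        # ... and the offset into that token's string
    char_offset: int   # character coordinates: offset from start of buffer

    def move_by(self, delta: int) -> None:
        """Move by `delta` characters, visiting only the tokens in between."""
        while delta > 0:
            room = len(self.token.text) - self.offset
            if delta <= room or self.token.next is None:
                step = min(delta, room)          # clamp at end of buffer
                self.offset += step
                self.char_offset += step
                return
            self.char_offset += room             # skip the rest of this token
            delta -= room
            self.token, self.offset = self.token.next, 0
        while delta < 0:
            back = -delta
            if back <= self.offset or self.token.prev is None:
                step = min(back, self.offset)    # clamp at start of buffer
                self.offset -= step
                self.char_offset -= step
                return
            self.char_offset -= self.offset      # skip to the start of this token
            delta += self.offset
            self.token = self.token.prev
            self.offset = len(self.token.text)
```

The number of loop iterations depends on how many tokens lie between the old and new positions, not on the length of the buffer, which is the scaling behavior claim 12 describes.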

16. A method of providing character-oriented coordinates for an insertion point in an edit buffer represented as a sequence of lexical tokens, the method comprising:
- representing the edit buffer as a doubly-linked list of nodes, each node corresponding to a respective one of the lexical tokens; and
- representing the insertion point in the edit buffer, the insertion point representation identifying:
  - (i) a particular one of the lexical tokens corresponding to the insertion point; and
  - (ii) a total character offset of the insertion point into the edit buffer.
- Dependent claims (17 to 24):
  - 17. The method of claim 16, wherein the insertion point representation further identifies:
    - (iii) a string offset into a string associated with the particular lexical token.
  - 18. The method of claim 16, further comprising:
    - maintaining, coincident with an operation that modifies contents of the edit buffer, the insertion point representation.
  - 19. The method of claim 18, wherein the operation that modifies contents of the edit buffer includes one or more of an insert, remove, split, join or replace operation performed at the insertion point.
  - 20. The method of claim 16, further comprising:
    - maintaining, coincident with movement of the insertion point to a new position, the insertion point representation.
  - 21. The method of claim 20, wherein the maintaining includes scanning from a particular position in the doubly-linked list toward the new position, and updating the identification of both the particular lexical token and the total character offset.
  - 22. The method of claim 21, wherein the particular position corresponds to the insertion point.
  - 23. The method of claim 21, further comprising maintaining a representation of total character count for the edit buffer;
    - and wherein the particular position corresponds to one of a beginning of the edit buffer, the insertion point, and an end of the edit buffer, selected based on comparison with the new position to generally minimize computational overhead associated with the scanning and updating.
  - 24. The method of claim 21, further comprising:
    - updating, consistent with each operation that modifies contents of the edit buffer, a representation of total character count.
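
The selection described in claim 23 can be sketched as a comparison of the new position against three known character offsets: the beginning of the buffer (0), the insertion point, and the end of the buffer (the maintained total character count). The function name `choose_start` is hypothetical, and using character distance as a stand-in for scanning cost is an assumption of this sketch rather than something the claims state.

```python
def choose_start(target, point_offset, total_chars):
    """Pick the scan starting position closest to `target`.

    Returns a (name, character_offset) pair. Character distance is used
    as a proxy for the number of tokens a scan would visit.
    """
    anchors = [
        ("beginning", 0),
        ("insertion point", point_offset),
        ("end", total_chars),
    ]
    # min() returns the first anchor on a tie, so ties favor earlier entries.
    return min(anchors, key=lambda anchor: abs(target - anchor[1]))
```

Without a maintained total character count, the end of the buffer would have no known character offset and could not serve as a starting position.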

25. One or more computer readable media encoding a data structure that represents contents of an edit buffer as a sequence of lexical tokens, the encoded data structure comprising:
- a doubly linked list of nodes;
- token representations each corresponding to at least one respective node of the list, wherein at least some of the token representations have associated string encodings; and
- an insertion point representation identifying, for the insertion point, both a particular one of the lexical tokens and a total character offset into the edit buffer.
- Dependent claims (26 to 35):
  - 26. The encoded data structure of claim 25, wherein the insertion point representation further identifies a character offset into a string associated with the particular lexical token.
  - 27. The encoded data structure of claim 25, embodied as a software object that defines at least one operation that repositions the insertion point, wherein performance of the repositioning operation updates both the particular lexical token and the total character offset.
  - 28. The encoded data structure of claim 27, wherein performance of the repositioning operation includes traversing from a particular position in the doubly-linked list toward a new position, and updating both the particular lexical token and the total character offset in correspondence therewith.
  - 29. The encoded data structure of claim 28, wherein the particular position corresponds to the insertion point.
  - 30. The encoded data structure of claim 28, further comprising a representation of total character count for the edit buffer, wherein the particular position corresponds to one of:
    - a beginning of the edit buffer, the insertion point, and an end of the edit buffer, selected based on comparison with the new position to generally minimize computational overhead associated with the scanning and updating.
  - 31. The encoded data structure of claim 28, wherein computational overhead associated with the scanning and updating is generally insensitive to length of the edit buffer, and instead exhibits no greater than O(N) scaling behavior, where N corresponds to scale of the repositioning from the particular position to the new position.
  - 32. The encoded data structure of claim 25, wherein the total offset is explicitly represented.
  - 33. The encoded data structure of claim 26, wherein the total offset is represented as a value derived from plural explicitly represented values including the character offset.
  - 34. The encoded data structure of claim 25, embodied as a software object that defines edit operations on contents of the edit buffer, wherein, consistent with semantics thereof, each of the edit operations performed on the edit buffer updates a representation of total character count for the edit buffer contents.
  - 35. The encoded data structure of claim 25, wherein the one or more computer readable media are selected from the set of a disk, tape or other magnetic, optical, or electronic storage medium and a network, wireline, wireless or other communications medium.
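
Claims 32 and 33 distinguish a total offset that is stored directly from one that is derived from other stored values. One possible reading of the derived form is sketched below: the insertion point stores the buffer offset at which its token begins plus the offset into that token's string, and computes the total on demand. The class name `DerivedPoint` and its fields are hypothetical; the application may derive the value differently.

```python
from dataclasses import dataclass


@dataclass
class DerivedPoint:
    """Insertion point whose total offset is derived, not stored."""
    token_text: str      # stand-in for a reference to the token node
    token_start: int     # buffer offset at which the token's string begins
    string_offset: int   # character offset into the token's string

    @property
    def total_offset(self) -> int:
        # Derived from two explicitly represented values.
        return self.token_start + self.string_offset
```

With this layout, moving within a token only changes `string_offset`, and the total follows without a separate update.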

36. A method of supporting access by one or more software engineering tools to program code, wherein at least one such tool operates on the program code as a token sequence and at least one such tool operates on the program code as a character sequence, the method comprising:
- maintaining a representation of the program code as a doubly-linked list of nodes, each node corresponding to a lexical token thereof; and
- responsive to repositioning of an insertion point, updating a representation thereof that identifies:
  - a particular one of the lexical tokens;
  - a character offset into a string associated with the particular lexical token; and
  - a total character offset into the program code.
- Dependent claims (37 to 41):
  - 37. The method of claim 36, wherein the repositioning includes traversing from a particular position in the doubly-linked list toward a new position and updating the insertion point representation in correspondence therewith.
  - 38. The method of claim 37, wherein the particular position corresponds to the insertion point.
  - 39. The method of claim 37, further comprising:
    - maintaining a representation of total character count for the program code, wherein the particular position corresponds to one of a beginning of the program code, the insertion point, and an end of the program code, selected based on comparison with the new position to generally minimize computational overhead associated with the scanning and updating.
  - 40. The method of claim 36, wherein the tool that operates on the program code as a token sequence and the tool that operates on the program code as a character sequence are different tools.
  - 41. The method of claim 36, wherein the tool that operates on the program code as a token sequence and the tool that operates on the program code as a character sequence are a same tool.

42. An apparatus comprising:
- storage for a computer readable encoding of an edit buffer represented as a sequence of lexical tokens; and
- means for representing an insertion point in the edit buffer, the insertion point identifying both a particular one of the lexical tokens and a total character offset into the edit buffer.
- Dependent claims (43 to 45):
  - 43. The apparatus of claim 42, wherein the insertion point representation means further identifies a substring offset into a substring associated with the particular lexical token.
  - 44. The apparatus of claim 42, further comprising:
    - means for maintaining a total character count in correspondence with an operation that modifies contents of the edit buffer.
  - 45. The apparatus of claim 42, further comprising:
    - means for updating the insertion point in correspondence with a repositioning operation.

Current Assignee: Sun Microsystems Incorporated (Oracle Corporation)
Original Assignee: Sun Microsystems Incorporated (Oracle Corporation)
Inventors: Urquhart, Kenneth B.; Van De Vanter, Michael L.
Application Number: US10/185,753
Publication Number: US 20040003374A1
US Class Current: 717/112
CPC Class Codes: G06F 8/33 Intelligent editors
