Token-oriented representation of program code with support for textual editing thereof

US 20040003373A1
Filed: 06/28/2002
Published: 01/01/2004
Est. Priority Date: 06/28/2002
Status: Abandoned Application

Abstract

An editor, software engineering tool or collection of such tools may be configured to encode (or employ an encoding of) an insertion point representation that identifies both a particular token of a token-oriented representation and a character offset thereinto. Efficient implementations of insert, remove and replace operations that employ such a representation are described herein. Computational costs of such operations typically scale at worst with the size of fragments inserted into and/or removed from such a token-oriented representation, rather than with buffer size. Accordingly, such implementations are particularly well-suited to providing efficient support for programming tool environments in which a token stream is updated incrementally in correspondence with user edits.
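
As an illustration of the representation described in the abstract, the following is a minimal sketch, not the applicants' implementation. The names `Token`, `TokenBuffer`, `point` and `insert_text` are hypothetical, the buffer is assumed to be non-empty, and re-lexing of the edited token is left out. It shows an insertion point held as a (token, offset) pair and a character insertion whose cost depends on the edited token and the inserted fragment rather than on the size of the whole buffer.

```python
class Token:
    """One node of the doubly-linked token list (hypothetical name)."""

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


class TokenBuffer:
    """Edit buffer held as a doubly-linked list of tokens.

    Assumption: the buffer is built from at least one token text.
    """

    def __init__(self, texts):
        self.head = None
        self.tail = None
        for text in texts:
            self._append(Token(text))
        # Insertion point: a token plus a character offset into its text.
        self.point = (self.head, 0)

    def _append(self, token):
        token.prev = self.tail
        if self.tail is None:
            self.head = token
        else:
            self.tail.next = token
        self.tail = token

    def insert_text(self, fragment):
        """Insert characters at the insertion point and advance it.

        Only the token under the insertion point is touched; no other
        node of the list is visited.
        """
        token, offset = self.point
        token.text = token.text[:offset] + fragment + token.text[offset:]
        self.point = (token, offset + len(fragment))

    def text(self):
        """Flatten the token list back into a character string."""
        parts = []
        token = self.head
        while token is not None:
            parts.append(token.text)
            token = token.next
        return "".join(parts)


buf = TokenBuffer(["int", " ", "x", ";"])
buf.point = (buf.head.next.next, 1)  # just after the "x"
buf.insert_text("y")
```

A real tool would presumably re-run lexical analysis on the edited token afterwards, since the new text may no longer be a single token; that step is outside this sketch.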

32 Claims

1. A method of efficiently supporting operations on contents of an edit buffer represented as a sequence of lexical tokens, the method comprising:
- representing the edit buffer as a doubly-linked list of nodes, each node corresponding to a respective one of the lexical tokens; and
- representing an insertion point in the edit buffer, the insertion point representation identifying both a particular one of the lexical tokens and an offset into a text string associated with the particular token.
- Dependent claims (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 15)
  - 2. The method of claim 1, further comprising:
    - maintaining the insertion point representation, including the particular lexical token identification and the substring offset, consistent with each edit operation performed on the edit buffer.
  - 3. The method of claim 2, wherein the edit operation includes one or more of an insert, remove, split, join or replace operation performed on or with one or more lexical tokens.
  - 4. The method of claim 2, wherein the edit operation includes one or more of an insert, remove, split, join or replace operation performed on or with a string of one or more characters.
  - 5. The method of claim 1, further comprising:
    - maintaining the insertion point representation, including the particular lexical token identification and the substring offset, consistent with each navigation operation performed.
  - 6. The method of claim 5, wherein the navigation operation moves the insertion point forward or backward from a current position in the edit buffer.
  - 7. The method of claim 5, wherein the navigation operation repositions the insertion point to a particular position in the edit buffer.
  - 8. The method of claim 1, further comprising:
    - maintaining the insertion point representation, including the particular lexical token identification and the substring offset, consistent with each insertion of one or more lexical tokens into the edit buffer.
  - 9. The method of claim 1, further comprising:
    - maintaining the insertion point representation, including the particular lexical token identification and the substring offset, consistent with each insertion of one or more characters into the edit buffer.
  - 10. The method of claim 1, further comprising:
    - maintaining the insertion point representation, including the particular lexical token identification and the substring offset, consistent with each deletion of one or more lexical tokens from the edit buffer.
  - 11. The method of claim 1, further comprising:
    - maintaining the insertion point representation, including the particular lexical token identification and the substring offset, consistent with each deletion of one or more characters from the edit buffer.
  - 12. The method of claim 1, wherein, for a given state of the edit buffer, two or more particular identical text strings are represented as a single instance thereof, the single instance being associated with plural corresponding nodes of the list.
  - 15. The method of claim 1, wherein, for a given state of the edit buffer, each identical one of the text strings is represented using a single instance thereof, the single instance associated with each of the corresponding nodes of the list.
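
The navigation operations of claims 5 to 7 can be pictured with a small sketch. It is only one possible reading: `Token`, `link` and `move` are hypothetical names, and clamping at the ends of the buffer is an assumption rather than something the claims state. The function moves a (token, offset) pair forward or backward by a number of characters, stepping across token boundaries as needed.

```python
class Token:
    """One node of the doubly-linked token list (hypothetical name)."""

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


def link(texts):
    """Build a doubly-linked token list and return its head."""
    head = prev = None
    for text in texts:
        token = Token(text)
        token.prev = prev
        if prev is None:
            head = token
        else:
            prev.next = token
        prev = token
    return head


def move(token, offset, delta):
    """Move the insertion point by `delta` characters.

    Positive `delta` moves forward, negative moves backward. Movement
    stops at either end of the buffer (an assumption of this sketch).
    """
    while delta > 0:
        room = len(token.text) - offset
        if delta <= room or token.next is None:
            offset += min(delta, room)
            break
        delta -= room
        token, offset = token.next, 0
    while delta < 0:
        if -delta <= offset or token.prev is None:
            offset -= min(-delta, offset)
            break
        delta += offset
        token = token.prev
        offset = len(token.text)
    return token, offset


head = link(["int", " ", "x", ";"])
x_token = head.next.next
forward = move(head, 1, 3)       # from "i|nt" three characters forward
backward = move(x_token, 1, -3)  # from "x|" three characters backward
clamped = move(head, 0, 100)     # runs off the end, stops at the last token
```

Note that a position on a token boundary has two equivalent encodings (end of one token, start of the next); this sketch returns whichever it reaches first and does not normalize.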

13. The method of claim 12, wherein at least one additional text string, which is identical to the particular strings, is represented as a separate instance.
- Dependent claims (14)
  - 14. The method of claim 13, wherein the representation of the two or more identical text strings as a single instance reduces storage consumed in the representation of the edit buffer.
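
The sharing of identical text strings described in claims 12 to 15 might look like the sketch below. `SharedText`, `TextPool` and `Node` are hypothetical names chosen for illustration; the claims do not prescribe a pool keyed by string contents. Several nodes point at a single stored instance, while a separately created identical string remains its own instance.

```python
class SharedText:
    """A stored text string that several nodes may reference."""

    def __init__(self, chars):
        self.chars = chars


class TextPool:
    """Hands out one SharedText instance per distinct string."""

    def __init__(self):
        self._pool = {}

    def get(self, chars):
        instance = self._pool.get(chars)
        if instance is None:
            instance = SharedText(chars)
            self._pool[chars] = instance
        return instance

    def __len__(self):
        return len(self._pool)


class Node:
    """List node holding a reference to its (possibly shared) text."""

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


pool = TextPool()
nodes = [Node(pool.get(s)) for s in ["x", "=", "x", "+", "x"]]

# An identical string created outside the pool is a separate instance.
separate = Node(SharedText("x"))
```

With sharing in place, an edit to one token should replace that node's reference rather than mutate the shared instance, otherwise every node using the same string would change with it.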

16. One or more computer readable media encoding a data structure that represents as a sequence of lexical tokens an edit buffer of functionally descriptive program code, the encoded data structure comprising:
- a doubly linked list of nodes;
- token representations each corresponding to at least one respective node of the list, wherein at least some of the token representations include associated text string encodings; and
- an insertion point encoding a position in the edit buffer, the insertion point identifying both a particular one of the lexical tokens and an offset thereinto.
- Dependent claims (17, 18, 19, 20, 21, 22, 23)
  - 17. The encoded data structure of claim 16, wherein the associated text string encodings are referencable through respective pointers encoded in respective ones of the nodes.
  - 18. The encoded data structure of claim 17, wherein, for a given state of the edit buffer, two or more particular identical ones of the associated text string encodings are represented as a single instance thereof, the single instance being associated with respective nodes of the list.
  - 19. The encoded data structure of claim 16, wherein the associated text string encodings are encoded in respective ones of the nodes.
  - 20. The encoded data structure of claim 16, embodied as a software object that defines one or more edit operations on the edit buffer, wherein, consistent with semantics thereof, the edit operations performed on the edit buffer maintain the insertion point, including the particular lexical token identification and the offset thereinto.
  - 21. The encoded data structure of claim 20, wherein a particular one of the access operations implements one or more of an insert, remove, split, join or replace on or with one or more lexical tokens.
  - 22. The encoded data structure of claim 20, wherein a particular one of the access operations implements one or more of an insert, remove, split, join or replace on or with a string of one or more characters.
  - 23. The encoded data structure of claim 16, wherein the one or more computer readable media are selected from the set of a disk, tape or other magnetic, optical, or electronic storage medium and a network, wireline, wireless or other communications medium.
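
The split and join operations named in claims 21 and 22 can be sketched as follows. This is a hedged illustration: `Node`, `split` and `join` are hypothetical names, and passing the insertion point in and out as a (node, offset) tuple is an assumption of the sketch. Each operation relinks a constant number of nodes and re-expresses the insertion point so that it still denotes the same character position.

```python
class Node:
    """One token node of the doubly-linked list (hypothetical name)."""

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


def split(node, at, point):
    """Split `node` at character offset `at`.

    Returns the new right-hand node and the adjusted insertion point.
    """
    right = Node(node.text[at:])
    node.text = node.text[:at]
    right.prev, right.next = node, node.next
    if node.next is not None:
        node.next.prev = right
    node.next = right
    token, offset = point
    if token is node and offset > at:
        # The point fell in the part that moved to the new node.
        point = (right, offset - at)
    return right, point


def join(left, point):
    """Merge `left` with its successor.

    Returns the merged node and the adjusted insertion point.
    """
    right = left.next
    if right is None:
        raise ValueError("no successor to join with")
    boundary = len(left.text)
    left.text += right.text
    left.next = right.next
    if right.next is not None:
        right.next.prev = left
    token, offset = point
    if token is right:
        # The point was in the absorbed node; shift it past the old text.
        point = (left, boundary + offset)
    return left, point


a, b = Node("foobar"), Node(";")
a.next, b.prev = b, a
right, after_split = split(a, 3, (a, 5))
split_texts = (a.text, right.text)
merged, after_join = join(a, after_split)
```

Here the insertion point starts five characters into `foobar`, becomes offset 2 of the new `bar` node after the split, and returns to offset 5 of the merged node after the join.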

24. A software engineering tool that represents program code as a stream of lexical tokens and represents a cursor position therein by identifying both a particular one of the lexical tokens and an offset thereinto corresponding to the cursor position.
- Dependent claims (25, 26)
  - 25. The software engineering tool of claim 24, configured as one or more of:
    - an editor;
    - a source level debugger;
    - a class viewer;
    - a profiler; and
    - an integrated development environment.
  - 26. The software engineering tool of claim 24, embodied as software encoded in one or more computer readable media and executable on a processor.

27. A method of supporting access by one or more software engineering tools to program code, wherein at least one such tool operates on the program code as a token sequence and at least one such tool operates on the program code as a character sequence, the method comprising:
- maintaining a representation of the program code as a doubly-linked list of nodes, each node corresponding to a lexical token, wherein at least some of the nodes have associated text string encodings; and
- responsive to updates to the program code and consistent with state of the program code representation, maintaining an insertion point identifier that identifies both a particular one of the nodes and an offset into a corresponding one of the text string encodings, if any.
- Dependent claims (28, 29)
  - 28. The method of claim 27, wherein the tool that operates on the program code as a token sequence and the tool that operates on the program code as a character sequence are different tools.
  - 29. The method of claim 27, wherein the tool that operates on the program code as a token sequence and the tool that operates on the program code as a character sequence are a same tool.
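
Claims 27 to 29 describe tools that read the same program code either as a token sequence or as a character sequence. The sketch below is one way to picture that, under stated assumptions: `Node`, `build`, `tokens`, `characters` and `locate` are hypothetical names, and the linear scan in `locate` is a simple fallback for absolute positions, not a technique taken from the application. Both views are derived from one doubly-linked list.

```python
class Node:
    """One token node of the doubly-linked list (hypothetical name)."""

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


def build(texts):
    """Build a doubly-linked token list and return its head."""
    head = prev = None
    for text in texts:
        node = Node(text)
        node.prev = prev
        if prev is None:
            head = node
        else:
            prev.next = node
        prev = node
    return head


def tokens(head):
    """Token view: yield each node in order."""
    node = head
    while node is not None:
        yield node
        node = node.next


def characters(head):
    """Character view: yield each character of each token in order."""
    for node in tokens(head):
        yield from node.text


def locate(head, index):
    """Translate an absolute character index into (node, offset).

    Scans tokens from the head, so its cost grows with the number of
    tokens before the target position.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    for node in tokens(head):
        if index <= len(node.text):
            return node, index
        index -= len(node.text)
    raise IndexError("position past end of buffer")


head = build(["int", " ", "x", ";"])
token_view = [node.text for node in tokens(head)]
char_view = "".join(characters(head))
node, offset = locate(head, 5)
```

A character-oriented tool can use `locate` once to obtain a (node, offset) pair and then work relative to that pair, which avoids rescanning the list on every edit.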

30. An apparatus comprising:
- storage for a computer readable encoding of an edit buffer represented as a sequence of lexical tokens; and
- means for representing an insertion point encoding a position in the edit buffer, the insertion point identifying both a particular one of the lexical tokens and an offset thereinto.
- Dependent claims (31, 32)
  - 31. The apparatus of claim 30, further comprising:
    - means for representing two or more identical text strings corresponding to lexical tokens as a single instance, the single instance being associated with plural corresponding nodes of the sequence.
  - 32. The apparatus of claim 30, further comprising:
    - means for maintaining the insertion point in correspondence with an edit operation on the edit buffer.
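
Keeping the insertion point in correspondence with an edit, as in claim 32 (and claim 10 for token deletion), might be sketched as below. `Node`, `Buffer` and `remove` are hypothetical names, and the policy of moving the point to the start of the following token (or the end of the preceding one) is an assumption of this sketch. The removal walks only the removed fragment, in line with the abstract's statement that cost scales with fragment size rather than buffer size.

```python
class Node:
    """One token node of the doubly-linked list (hypothetical name)."""

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


class Buffer:
    """Token list plus an insertion point held as (node, offset)."""

    def __init__(self, texts):
        self.head = None
        prev = None
        for text in texts:
            node = Node(text)
            node.prev = prev
            if prev is None:
                self.head = node
            else:
                prev.next = node
            prev = node
        self.point = (self.head, 0)

    def remove(self, first, last):
        """Unlink nodes `first` through `last` (inclusive).

        Assumption: `last` is reachable from `first` via `next` links.
        """
        before, after = first.prev, last.next
        # Walk only the removed fragment to see if it holds the point.
        hit = False
        node = first
        while True:
            if node is self.point[0]:
                hit = True
            if node is last:
                break
            node = node.next
        # Relink the neighbours around the gap.
        if before is None:
            self.head = after
        else:
            before.next = after
        if after is not None:
            after.prev = before
        if hit:
            if after is not None:
                self.point = (after, 0)
            elif before is not None:
                self.point = (before, len(before.text))
            else:
                self.point = (None, 0)  # buffer is now empty

    def text(self):
        parts = []
        node = self.head
        while node is not None:
            parts.append(node.text)
            node = node.next
        return "".join(parts)


buf = Buffer(["int", " ", "x", ";"])
space = buf.head.next
x_node = space.next
semicolon = x_node.next
buf.point = (x_node, 1)
buf.remove(space, x_node)
```

After the removal the buffer reads `int;` and the insertion point sits at the start of the `;` token, which is the same place on screen where the deleted text used to end.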

Current Assignee: Sun Microsystems Incorporated (Oracle Corporation)
Original Assignee: Sun Microsystems Incorporated (Oracle Corporation)
Inventors: Urquhart, Kenneth B.; Van De Vanter, Michael L.

Application Number: US10/185,752
Publication Number: US 20040003373A1

Field of Search
US Class Current: 717/112
CPC Class Codes:
G06F 40/106   Display of layout of docume...
G06F 40/169   Annotation, e.g. comment da...
G06F 40/284   Lexical analysis, e.g. toke...
