Natural language image spatial and tonal localization

US 9,412,366 B2
Filed: 11/21/2012
Issued: 08/09/2016
Est. Priority Date: 09/18/2012
Status: Active Grant

2 Assignments

0 Petitions

Abstract

Natural language image spatial and tonal localization techniques are described. In one or more implementations, a natural language input is processed to determine spatial and tonal localization of one or more image editing operations specified by the natural language input. Performance is initiated of the one or more image editing operations on image data using the determined spatial and tonal localization.
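
The flow in the abstract can be illustrated with a minimal sketch, assuming a tiny hand-written vocabulary. The tables `OPERATION_SYNONYMS`, `SPATIAL_TERMS`, and `TONAL_TERMS` and the function `interpret` are hypothetical names invented for this example; the patent record does not publish a vocabulary or an implementation.

```python
# Hypothetical vocabulary tables, invented for illustration only.
OPERATION_SYNONYMS = {
    "brighten": "brightness_up",
    "lighten": "brightness_up",
    "darken": "brightness_down",
    "sharpen": "sharpen",
    "blur": "blur",
}
SPATIAL_TERMS = {"top", "bottom", "left", "right", "center"}
TONAL_TERMS = {"shadows", "midtones", "highlights"}


def interpret(phrase):
    """Split a phrase into an operation plus spatial and tonal terms."""
    tokens = [t.strip(".,!?") for t in phrase.lower().split()]
    result = {"operation": None, "spatial": [], "tonal": [], "unmapped": []}
    for token in tokens:
        if token in OPERATION_SYNONYMS and result["operation"] is None:
            # Translate an arbitrary verb to a constrained operation name.
            result["operation"] = OPERATION_SYNONYMS[token]
        elif token in SPATIAL_TERMS:
            result["spatial"].append(token)
        elif token in TONAL_TERMS:
            result["tonal"].append(token)
        else:
            # Words with no translation are kept for later handling.
            result["unmapped"].append(token)
    return result


parsed = interpret("Brighten the shadows on the left.")
print(parsed["operation"], parsed["spatial"], parsed["tonal"])
```

A real system would need far richer parsing than token lookup; the point here is only the separation of the operation from where (spatial) and on which tones (tonal) it applies.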

20 Claims

1. A method comprising:
- receiving a natural language input by one or more computing devices, the natural language input formed using arbitrary language;
  
  parsing the natural language input by the one or more computing devices into action data containing arbitrary vocabulary, the arbitrary vocabulary comprising nouns and verbs referencing an action to be performed;
  
  translating portions of the arbitrary vocabulary contained in the action data to constrained vocabulary data by the one or more computing devices and mapping non-translated portions of the arbitrary vocabulary contained in the action data to a generalized vocabulary specifying parameters for image processing;
  
  determining spatial and tonal localizations of one or more image editing operations as specified by the generalized vocabulary and constrained vocabulary data by the one or more computing devices; and
  
  initiating performance of the one or more image editing operations on image data using the determined spatial and tonal localization by the one or more computing devices.
  - 2. A method as described in claim 1, wherein the performance of the one or more image editing operations includes forming one or more localization masks.
  - 3. A method as described in claim 2, wherein the one or more localization masks are a combination of a spatial localization mask with a tonal region localization mask.
  - 4. A method as described in claim 3, wherein the spatial localization mask has spatial components that define a shape to convey a local operation, a spatial gradient function, or a border function.
  - 5. A method as described in claim 3, wherein the tonal region localization mask has components configured to modulate tonal shapes.
  - 6. A method as described in claim 2, wherein at least one said localization mask is formed for each filter used to perform a respective said image editing operation.
  - 7. A method as described in claim 2, wherein a single said localization mask is formed for a composite filter operation involved in performance of a plurality of said image editing operations.
  - 8. A method as described in claim 1, wherein the spatial localization of the natural language input is specified using at least two directions.
  - 9. A method as described in claim 1, wherein at least one of the spatial localization or the tonal localization of the natural language input is specified at least in part using a gesture.
  - 10. A method as described in claim 9, wherein the gesture is formed from a series of touch inputs that define at least part of a boundary of the portion of an image displayed using the image data.
  - 11. A method as described in claim 9, wherein the gesture identifies a base of a portion of an image displayed using the image data that is to be subject of further processing by an object identification module to determine a boundary of the portion.
  - 12. A method as described in claim 11, wherein the object identification module employs:
    - one or more facial recognition algorithms to determine the boundary of the portion; or
      
      one or more algorithms to identify landmarks to determine the boundary of the portion.
  - 13. A method as described in claim 1, wherein the natural language input does not name an object included in an image displayed using the image data.
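
Claims 2 through 5 describe a localization mask that combines a spatial mask with a tonal region mask. The sketch below shows one possible reading, assuming a grayscale image stored as nested lists of values in [0, 1], a linear spatial gradient, and an elementwise product as the combination rule. All function names and the product rule are assumptions made for illustration; the claims say only that the masks are combined.

```python
def spatial_gradient_mask(width, height, direction="left"):
    """Weight falls linearly from 1 at the named edge to 0 at the opposite edge."""
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            u = x / (width - 1) if width > 1 else 0.0
            v = y / (height - 1) if height > 1 else 0.0
            weights = {"left": 1 - u, "right": u, "top": 1 - v, "bottom": v}
            row.append(weights[direction])
        mask.append(row)
    return mask


def tonal_mask(image, region="shadows"):
    """Weight each pixel by how strongly it belongs to a tonal region."""
    shapes = {
        "shadows": lambda p: 1 - p,
        "highlights": lambda p: p,
        "midtones": lambda p: 1 - abs(2 * p - 1),
    }
    shape = shapes[region]
    return [[shape(p) for p in row] for row in image]


def combine(spatial, tonal):
    """Assumed combination rule: elementwise product of the two masks."""
    return [[s * t for s, t in zip(srow, trow)]
            for srow, trow in zip(spatial, tonal)]


def brighten(image, mask, strength):
    """Add strength scaled by the mask, clamped to the valid range."""
    return [[min(1.0, p + strength * m) for p, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]


image = [[0.0, 0.5, 1.0]]
mask = combine(spatial_gradient_mask(3, 1, "left"), tonal_mask(image, "shadows"))
print(brighten(image, mask, 0.4))
```

Because the mask depends on both pixel position and pixel value, the same edit affects dark pixels near the left edge most and leaves bright pixels on the right untouched.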

14. A method comprising:
- generating one or more localization masks by one or more computing devices based on a phrase in a natural language input, where the natural language input comprises arbitrary verbs and nouns, each of the one or more localization masks being a combination of a spatial localization mask and a tonal region localization mask, respectively;
  
  identifying one or more image editing operations that are included in the phrase by the one or more computing devices, where the one or more image editing operations are identified by using lexical ontologies and semantic distances to map the arbitrary verbs and nouns included in the phrase to the one or more specific image editing operations; and
  
  initiating performance of the one or more identified image editing operations on image data by the one or more computing devices using the generated one or more localization masks.
  - 15. A method as described in claim 14, wherein the spatial localization mask has spatial components that define a shape to convey a local operation, a spatial gradient function, or a border function.
  - 16. A method as described in claim 14, wherein the tonal region localization mask has components configured to modulate tonal shapes.
  - 17. A method as described in claim 14, wherein at least one said localization mask is a function of the image data and parameterized by a set of spatial localization parameters as well as tonal localization parameters.
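
Claim 14 identifies operations by using lexical ontologies and semantic distances to map arbitrary verbs and nouns to specific operations. A minimal sketch of that idea follows, assuming a toy ontology stored as an undirected graph and using shortest path length as the distance. The edge list, the operation set, and the function names are hypothetical; a production system would more likely draw on an established lexical database.

```python
from collections import deque

# Hypothetical toy ontology: each pair links two related words.
EDGES = [
    ("brighten", "lighten"),
    ("lighten", "illuminate"),
    ("darken", "dim"),
    ("dim", "shade"),
    ("sharpen", "crisp"),
]
OPERATIONS = ["brighten", "darken", "sharpen"]

# Build an undirected adjacency map from the edge list.
GRAPH = {}
for a, b in EDGES:
    GRAPH.setdefault(a, set()).add(b)
    GRAPH.setdefault(b, set()).add(a)


def semantic_distance(source, target):
    """Return the shortest path length between two words, or None if unrelated."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for neighbor in GRAPH.get(word, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def nearest_operation(word):
    """Map an arbitrary word to the closest known operation, if any."""
    best = None
    for op in OPERATIONS:
        dist = semantic_distance(word, op)
        if dist is not None and (best is None or dist < best[1]):
            best = (op, dist)
    return best


print(nearest_operation("illuminate"))
print(nearest_operation("rotate"))
```

Words with no path to any operation return `None`, which leaves room for a fallback such as asking the user to rephrase.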

18. One or more computer-readable storage media comprising instructions stored thereon that, responsive to execution on a computing device, causes the computing device to perform operations comprising:
- determining a strength as well as spatial and tonal localization of one or more image editing operations specified by a natural language input, where the natural language input comprises arbitrary nouns and verbs, the determining of the spatial and tonal localization based at least in part of identification of a direction and a modifier included in the natural language input, the one or more image editing operations specified identified by using lexical ontologies and semantic distances to map the arbitrary nouns and verbs in the natural language input to one or more specific image editing operations; and
  
  initiating performance of the one or more image editing operations on image data using the determined spatial and tonal localization and strength.
  - 19. One or more computer-readable storage media as described in claim 18, wherein the modifier references another direction that is substantially perpendicular to the identified direction.
  - 20. One or more computer-readable storage media as described in claim 18, wherein the modifier is added to a cluster of phrases collectively stating a current state of modification.
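
Claims 18 and 19 involve a strength, a direction, and a modifier that may reference a perpendicular direction, as in a phrase like "upper left". The sketch below is one hypothetical way to extract those three pieces from a phrase. The word tables, the numeric strengths, and the rule that the last direction word is the direction and the one before it is the modifier are all assumptions for illustration, not taken from the patent.

```python
# Hypothetical tables: strength adverbs and unit vectors for direction words.
STRENGTH_WORDS = {"slightly": 0.25, "somewhat": 0.5, "much": 0.75, "extremely": 1.0}
DIRECTION_VECTORS = {
    "left": (-1, 0), "right": (1, 0), "top": (0, -1), "bottom": (0, 1),
    "upper": (0, -1), "lower": (0, 1),
}


def localize(phrase, default_strength=0.5):
    """Extract strength, direction, and modifier from a phrase."""
    tokens = [t.strip(".,!?") for t in phrase.lower().split()]
    strength = default_strength
    found = []
    for token in tokens:
        if token in STRENGTH_WORDS:
            strength = STRENGTH_WORDS[token]
        elif token in DIRECTION_VECTORS:
            found.append(token)
    # Assumed rule: the last direction word is the head, the previous one modifies it.
    direction = found[-1] if found else None
    modifier = found[-2] if len(found) >= 2 else None
    perpendicular = False
    if direction and modifier:
        dx, dy = DIRECTION_VECTORS[direction]
        mx, my = DIRECTION_VECTORS[modifier]
        # A zero dot product means the two directions are perpendicular.
        perpendicular = (dx * mx + dy * my) == 0
    return {
        "strength": strength,
        "direction": direction,
        "modifier": modifier,
        "perpendicular": perpendicular,
    }


print(localize("Make the upper left slightly brighter"))
```

The resulting direction and modifier could then drive two spatial gradients, with the strength scaling the edit.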

Current Assignee: Adobe Inc.
Original Assignee: Adobe Systems Incorporated (Adobe Inc.)
Inventors: Wilensky, Gregg D.; Chang, Walter W.; Dontcheva, Lubomira A.; Laput, Gierad P.; Agarwala, Aseem O.
Primary Examiner(s): Godbold, Douglas

Application Number: US13/683,416
Publication Number: US 20140081625A1
Time in Patent Office: 1,357 Days
Field of Search: 704/275
US Class Current: 1/1
CPC Class Codes:
G06F 3/04845   for image manipulation, e.g...
G06F 3/04883   for inputting data by handw...
G06F 3/167   Audio in a user interface, ...
G06F 40/186   Templates
G06F 40/205   Parsing
G06F 40/253   Grammatical analysis; Style...
G06F 40/279   Recognition of textual enti...
G06F 40/284   Lexical analysis, e.g. toke...
G06F 40/30   Semantic analysis
G10L 15/22   Procedures used during a sp...
