System and method for speech-based navigation and interaction with a device's visible screen elements using a corresponding view hierarchy
First Claim
1. A computer-implemented method comprising:
obtaining (i) a view hierarchy that is of a user interface and that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of a particular viewable element in the user interface, and (ii) a transcription of a speech input that was spoken while the user interface was displayed;
determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken; and
in response to determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken, initiating an action that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken.
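The three steps of the claim (obtain, match, act) can be illustrated with a minimal sketch. It is not the patented implementation: the `ViewNode` class, its field names, the `visible` property, and the exact, case-insensitive comparison are all assumptions made for illustration, since the claim only requires that the transcription "matches" text associated with a displayed element.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ViewNode:
    """One viewable element. Every field name here is hypothetical."""
    text: str = ""                 # textual representation (a label, or a description for an icon)
    bounds: tuple = (0, 0, 0, 0)   # location data: left, top, right, bottom
    properties: dict = field(default_factory=dict)  # e.g. {"visible": True, "kind": "icon"}
    action: Optional[Callable[[], str]] = None      # action associated with the element
    children: list = field(default_factory=list)


def normalize(s: str) -> str:
    """Lowercase and collapse whitespace so that ' Settings ' equals 'settings'."""
    return " ".join(s.lower().split())


def find_match(root: ViewNode, transcription: str) -> Optional[ViewNode]:
    """Depth-first search for a displayed element whose text equals the transcription."""
    target = normalize(transcription)
    stack = [root]
    while stack:
        node = stack.pop()
        if not node.properties.get("visible", True):
            continue  # assumed rule: a hidden node hides its whole subtree
        if node.text and normalize(node.text) == target:
            return node
        stack.extend(reversed(node.children))  # keep document order
    return None


def handle_speech(root: ViewNode, transcription: str) -> Optional[str]:
    """Initiate the matched element's action, if there is a match and an action."""
    node = find_match(root, transcription)
    if node is None or node.action is None:
        return None
    return node.action()


# A tiny example screen: a text button, a non-text icon, and a hidden element.
screen = ViewNode(children=[
    ViewNode(text="Settings", bounds=(0, 0, 100, 40),
             properties={"visible": True, "kind": "button"},
             action=lambda: "opened settings"),
    ViewNode(text="Search", bounds=(0, 40, 40, 80),
             properties={"visible": True, "kind": "icon"},
             action=lambda: "opened search"),
    ViewNode(text="Delete", properties={"visible": False},
             action=lambda: "deleted"),
])
```

With this example screen, `handle_speech(screen, "  settings ")` returns `"opened settings"`, while `handle_speech(screen, "delete")` returns `None` because the matching element was not displayed when the input was spoken.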
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enabling screen-specific user interfacing with elements of viewable screens presented by an electronic device are disclosed. In one aspect, a method includes the actions of identifying a character sequence representing a first input that is received while displaying a viewable screen having at least one selectable viewable element. The actions further include accessing an electronic file that provides a text representation of one or more of the at least one selectable viewable element. The actions further include comparing the character sequence to the text representation. The actions further include selecting, within the viewable screen, a selectable viewable element whose text representation matches the character sequence. The actions further include triggering any action linked to selecting the selectable viewable element.
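The abstract's file-based variant can be sketched in the same spirit. The abstract does not specify the format of the electronic file, so the JSON layout, the `id`, `text`, and `selectable` keys, and the `ACTIONS` registry below are hypothetical, and exact case-insensitive equality is only one possible reading of "matches".

```python
import json

# Hypothetical contents of the electronic file describing the current screen.
SCREEN_FILE = """
[
  {"id": "btn_ok", "text": "OK", "selectable": true},
  {"id": "btn_cancel", "text": "Cancel", "selectable": true},
  {"id": "lbl_title", "text": "Confirm", "selectable": false}
]
"""

# Hypothetical registry linking element ids to the actions they trigger.
ACTIONS = {
    "btn_ok": lambda: "confirmed",
    "btn_cancel": lambda: "cancelled",
}


def select_and_trigger(character_sequence, file_text=SCREEN_FILE, actions=ACTIONS):
    """Compare the input to each selectable element's text and trigger its action.

    Returns (element_id, action_result), or (None, None) when nothing matches.
    """
    wanted = character_sequence.strip().casefold()
    for element in json.loads(file_text):
        if not element.get("selectable"):
            continue  # only selectable elements are candidates
        if element["text"].strip().casefold() == wanted:
            handler = actions.get(element["id"])
            return element["id"], (handler() if handler else None)
    return None, None
```

Here `select_and_trigger("cancel")` returns `("btn_cancel", "cancelled")`, and `select_and_trigger("Confirm")` returns `(None, None)` because the title label is not selectable.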
19 Citations
19 Claims
1. A computer-implemented method comprising:
obtaining (i) a view hierarchy that is of a user interface and that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of a particular viewable element in the user interface, and (ii) a transcription of a speech input that was spoken while the user interface was displayed;
determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken; and
in response to determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken, initiating an action that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken.
Dependent Claims (2, 3, 4, 5, 6, 7, 8, 9)
10. A system comprising:
one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising:
obtaining (i) a view hierarchy that is of a user interface and that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of a particular viewable element in the user interface, and (ii) a transcription of a speech input that was spoken while the user interface was displayed;
determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken; and
in response to determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken, initiating an action that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken.
Dependent Claims (11, 12, 13, 14, 15)
16. A non-transitory computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising:
obtaining (i) a view hierarchy that is of a user interface and that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of a particular viewable element in the user interface, and (ii) a transcription of a speech input that was spoken while the user interface was displayed;
determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken; and
in response to determining that the transcription of the speech input matches text that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken, initiating an action that is associated with the particular viewable element (i) that is included in the view hierarchy that includes textual representations, location data, and properties of text-based and non-text-based viewable elements in the user interface and of the particular viewable element in the user interface and (ii) that was displayed when the speech input was spoken.
Dependent Claims (17, 18, 19)
Specification