Multi-domain natural language processing architecture
Abstract
An arrangement and corresponding method are described for multi-domain natural language processing. Multiple parallel domain pipelines are used for processing a natural language input. Each domain pipeline represents a different specific subject domain of related concepts. Each domain pipeline includes a mention module that processes the natural language input using natural language understanding (NLU) to determine a corresponding list of mentions, and an interpretation generator that receives the list of mentions and produces a rank-ordered domain output set of sentence-level interpretation candidates. A global evidence ranker receives the domain output sets from the domain pipelines and produces an overall rank-ordered final output set of sentence-level interpretations.
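The arrangement summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `DomainPipeline` class, the toy vocabulary that stands in for each domain's NLU models, the numeric scores, and the use of a thread pool for the parallel step are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Interpretation:
    domain: str
    text: str
    score: float  # higher is better; the scale is an assumption


class DomainPipeline:
    """One subject domain: a mention module followed by an interpretation generator."""

    def __init__(self, domain, vocabulary):
        self.domain = domain
        self.vocabulary = vocabulary  # hypothetical stand-in for the domain's NLU models

    def find_mentions(self, utterance):
        # Mention module: keep the words this domain recognizes.
        return [w for w in utterance.lower().split() if w in self.vocabulary]

    def generate(self, utterance):
        # Interpretation generator: produce a rank-ordered domain output set.
        candidates = [
            Interpretation(self.domain, f"{self.domain}({m})", self.vocabulary[m])
            for m in self.find_mentions(utterance)
        ]
        return sorted(candidates, key=lambda c: c.score, reverse=True)


def global_evidence_ranker(output_sets):
    # Merge the per-domain output sets into one overall rank-ordered list.
    merged = [c for output_set in output_sets for c in output_set]
    return sorted(merged, key=lambda c: c.score, reverse=True)


def interpret(utterance, pipelines):
    # Run every domain pipeline on the same input in parallel.
    with ThreadPoolExecutor() as pool:
        output_sets = list(pool.map(lambda p: p.generate(utterance), pipelines))
    return global_evidence_ranker(output_sets)


pipelines = [
    DomainPipeline("music", {"play": 0.6, "jazz": 0.9}),
    DomainPipeline("weather", {"rain": 0.8, "tomorrow": 0.5}),
]
final = interpret("Play some jazz tomorrow", pipelines)
```

Here `final` holds candidates from both domains in a single ordering. A real global ranker would presumably weigh evidence across domains rather than compare raw scores directly.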
20 Claims
1. A method comprising:
receiving, by a computing system, data representing an utterance spoken by a user and comprising a query;
for each group of a plurality of groups of words included in the utterance and responsive to a determination that words included in the group are related to one another by a semantic concept and a semantic attachment, adding, by the computing system, the group to a pool of groups shared by a plurality of different domain pipelines, wherein the semantic concept is associated with a surface form of an individual mention and the semantic attachment is related only to the individual mention;
processing, by the computing system, in parallel, and for each domain pipeline of the plurality of different domain pipelines, the data in accordance with one or more natural language understanding (NLU) models of the domain pipeline based on the individual mention and the semantic attachment to produce a plurality of output sets for the query, each domain pipeline corresponding to a different subject domain of a plurality of related concepts, and each output set of the plurality of output sets comprising a ranking of a plurality of interpretation candidates determined by the domain pipeline for the query; and
re-ranking, by the computing system and for each output set of the plurality of output sets, the plurality of interpretation candidates to determine an interpretation of the query.
Dependent claims: 2, 3, 4, 5, 6, 7.
8. A system comprising:
at least one processor; and
a memory comprising instructions that when executed by the at least one processor cause the system to:
receive data representing an utterance spoken by a user and comprising a query;
for each group of a plurality of groups of words included in the utterance and responsive to a determination that words included in the group are related to one another by a semantic concept and a semantic attachment, add the group to a pool of groups shared by a plurality of different domain pipelines, wherein the semantic concept is associated with a surface form of an individual mention and the semantic attachment is only related to the individual mention;
process, in parallel, and for each domain pipeline of a plurality of different domain pipelines, the data in accordance with one or more natural language understanding (NLU) models of the domain pipeline to produce a plurality of output sets for the query, each domain pipeline corresponding to a different subject domain of a plurality of related concepts, and each output set of the plurality of output sets comprising a ranking of a plurality of interpretation candidates determined by the domain pipeline for the query; and
re-rank, for each output set of the plurality of output sets, the plurality of interpretation candidates to determine an interpretation of the query.
Dependent claims: 9, 10, 11, 12, 13, 14.
15. One or more non-transitory computer-readable media comprising instructions that when executed by one or more computers cause the one or more computers to:
receive data representing an utterance spoken by a user and comprising a query;
for each group of a plurality of groups of words included in the utterance and responsive to a determination that words included in the group are related to one another by a semantic concept and a semantic attachment, wherein the semantic concept is associated with a surface form of an individual mention and the semantic attachment is related only to the individual mention, add the group to a pool of groups shared by a plurality of different domain pipelines;
process, in parallel, and for each domain pipeline of a plurality of different domain pipelines, the data in accordance with one or more natural language understanding (NLU) models of the domain pipeline based on the individual mention and the semantic attachment to produce a plurality of output sets for the query, each domain pipeline corresponding to a different subject domain of a plurality of related concepts, and each output set of the plurality of output sets comprising a ranking of a plurality of interpretation candidates determined by the domain pipeline for the query; and
re-rank, for each output set of the plurality of output sets, the plurality of interpretation candidates to determine an interpretation of the query by selecting a best-ranked interpretation candidate from among the plurality of output sets.
Dependent claims: 16, 17, 18, 19, 20.
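The independent claims share two steps that a short sketch may make more concrete: building a pool of word groups shared by all domain pipelines, and re-ranking each output set before selecting the best-ranked candidate. Everything named below is hypothetical: the `LEXICON` table that decides whether a word group carries a semantic concept and attachment, the two toy pipelines, and the per-domain weights used for re-ranking are illustrative assumptions, and the claims do not specify any of them. Parallel execution is omitted here for brevity.

```python
from dataclasses import dataclass


@dataclass
class Mention:
    surface: str      # surface form: the words as spoken
    concept: str      # semantic concept associated with the surface form
    attachment: dict  # semantic attachment: data about this mention only


# Hypothetical lexicon mapping word groups to (concept, attachment).
LEXICON = {
    "new york": ("CITY", {"code": "NYC"}),
    "tomorrow": ("DATE", {"offset_days": 1}),
}


def build_shared_pool(utterance, max_len=2):
    """Add every word group with a known concept and attachment to the pool."""
    words = utterance.lower().split()
    pool = []
    for start in range(len(words)):
        for end in range(start + 1, min(start + max_len, len(words)) + 1):
            group = " ".join(words[start:end])
            if group in LEXICON:
                concept, attachment = LEXICON[group]
                pool.append(Mention(group, concept, attachment))
    return pool


def flights_pipeline(pool):
    # Toy domain pipeline: returns a ranked list of (candidate, score).
    concepts = {m.concept for m in pool}
    candidates = [("book_flight", 0.7 if "CITY" in concepts else 0.1),
                  ("flight_status", 0.4)]
    return sorted(candidates, key=lambda c: c[1], reverse=True)


def weather_pipeline(pool):
    concepts = {m.concept for m in pool}
    score = 0.5
    score += 0.2 if "CITY" in concepts else 0.0
    score += 0.2 if "DATE" in concepts else 0.0
    return [("get_forecast", score)]


def rerank(output_set, weight):
    # Re-rank one output set; a fixed per-domain weight stands in for real evidence.
    rescored = [(name, score * weight) for name, score in output_set]
    return sorted(rescored, key=lambda c: c[1], reverse=True)


def best_interpretation(output_sets, weights):
    reranked = [rerank(s, w) for s, w in zip(output_sets, weights)]
    # Each re-ranked set is sorted, so its head is that domain's best candidate.
    return max((s[0] for s in reranked if s), key=lambda c: c[1])


pool = build_shared_pool("weather in new york tomorrow")
output_sets = [flights_pipeline(pool), weather_pipeline(pool)]
best = best_interpretation(output_sets, weights=[0.8, 1.0])
```

Both pipelines read the same `pool`, so a mention such as "new york" is detected once and reused by every domain rather than being recomputed per pipeline.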
Specification