Real-time overlay placement in videos for augmented reality applications
Abstract
Textual overlays/labels add contextual information in Augmented Reality (AR) applications. The spatial placement of labels is a challenging task, particularly for real-time video. Embodiments of the present disclosure provide systems and methods for optimal placement of contextual information in AR applications, overcoming occlusion of the object/scene of interest by placing labels so as to aid interpretation of the scene. This is achieved by combining saliency maps computed for each frame of an input video with the Euclidean distance between the current and previous overlay positions for each frame, based on an initial overlay position of the label, to calculate an updated overlay position for label placement in the video. Overlay placement is formulated as an objective function that minimizes visual saliency around the object of interest and minimizes temporal jitter, facilitating coherence in real-time AR applications.
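The abstract formulates placement as an objective combining per-frame saliency occluded by the label with the Euclidean distance to the previous overlay position. A minimal sketch under assumed details (the disclosure does not specify its saliency model, search strategy, or weighting; gradient-magnitude saliency, an exhaustive window search, and the weight `lam` are stand-ins):

```python
import numpy as np

def saliency_map(frame):
    """Stand-in saliency: normalized gradient magnitude of a grayscale frame
    (the disclosure does not specify a saliency model)."""
    gy, gx = np.gradient(frame.astype(float))
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-9)

def best_overlay_position(frame, label_hw, prev_pos, lam=0.05):
    """Return the label's top-left (y, x) minimizing
    (saliency covered by the label) + lam * (distance from the previous position)."""
    sal = saliency_map(frame)
    h, w = label_hw
    H, W = sal.shape
    # Integral image: any window's saliency sum in O(1).
    ii = np.pad(sal.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    best, best_cost = prev_pos, float("inf")
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            occlusion = ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]
            jitter = np.hypot(y - prev_pos[0], x - prev_pos[1])
            cost = occlusion + lam * jitter
            if cost < best_cost:
                best, best_cost = (y, x), cost
    return best
```

On a frame whose left half is textured and whose right half is flat, the minimizer lands just inside the flat region at the smallest displacement from the previous position, illustrating the saliency/jitter trade-off.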
12 Claims
1. A processor implemented method, comprising:
receiving, in real time, (i) an input video comprising a plurality of frames and an object of interest in the plurality of frames, and (ii) a label for which an initial overlay position is pre-computed for placement on a center frame of the input video (202);
computing, in real time, a saliency map for each of the plurality of frames to obtain a plurality of saliency maps (204);
computing, in real time, for each of the plurality of frames, Euclidean distance between a current overlay position and a previous overlay position based on the initial overlay position of the label to obtain a plurality of Euclidean distances (206), wherein the Euclidean distance for each of the plurality of frames is computed for controlling, in real time, temporal jitter in a position of the label to be placed in the input video; and
calculating, in real time, an updated overlay position of the label for placement in the input video based on the plurality of saliency maps and the plurality of Euclidean distances (208).
(Dependent claims: 2, 3, 4)
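Step 206 uses the frame-to-frame Euclidean distance solely to control temporal jitter. One plausible realization, not taken from the claim, is a dead-band rule: ignore sub-threshold moves so the label does not tremble from frame to frame (the threshold `min_move` is a hypothetical parameter):

```python
import math

def stabilize(prev_pos, new_pos, min_move=3.0):
    """Dead-band jitter control (hypothetical rule): keep the previous overlay
    position unless the per-frame optimum moved at least `min_move` pixels."""
    d = math.hypot(new_pos[0] - prev_pos[0], new_pos[1] - prev_pos[1])
    return new_pos if d >= min_move else prev_pos

def label_path(initial_pos, per_frame_optima, min_move=3.0):
    """Apply the rule across frames, returning the label position per frame."""
    path, pos = [], initial_pos
    for p in per_frame_optima:
        pos = stabilize(pos, p, min_move)
        path.append(pos)
    return path
```

Small oscillations of the per-frame optimum are absorbed, while a genuine scene change still relocates the label.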
5. A system (100), comprising:
a memory (102) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
receive, in real time, (i) an input video comprising a plurality of frames and an object of interest in the plurality of frames, and (ii) a label for which an initial overlay position is pre-computed for placement on a center frame of the input video;
compute, in real time, a saliency map for each of the plurality of frames to obtain a plurality of saliency maps;
compute, in real time, for each of the plurality of frames, Euclidean distance between a current overlay position and a previous overlay position based on the initial overlay position of the label to obtain a plurality of Euclidean distances, wherein the Euclidean distance for each of the plurality of frames is computed for controlling, in real time, temporal jitter in a position of the label to be placed in the input video; and
calculate, in real time, an updated overlay position of the label for placement in the input video based on the plurality of saliency maps and the plurality of Euclidean distances.
(Dependent claims: 6, 7, 8)
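The system claim describes a per-frame loop: receive a frame, score overlay positions by saliency, penalize movement away from the previous position, and update. A discrete-candidate sketch of that loop (the fixed slot set, per-slot saliency scores, and weight `lam` are all illustrative assumptions, not disclosed by the claim):

```python
import math

def choose_slot(slots, saliency_of, prev_pos, lam=0.05):
    """One iteration: among fixed candidate slots, pick the slot minimizing
    saliency at the slot + lam * Euclidean distance from the previous slot."""
    return min(slots, key=lambda s: saliency_of(s) + lam * math.dist(s, prev_pos))

def run_pipeline(slots, saliency_per_frame, initial_pos, lam=0.05):
    """saliency_per_frame: one dict {slot: saliency score} per frame."""
    pos, path = initial_pos, []
    for frame_scores in saliency_per_frame:
        pos = choose_slot(slots, lambda s: frame_scores[s], pos, lam)
        path.append(pos)
    return path
```

The distance term keeps the label in place under small saliency fluctuations; only a clearly better slot, net of the movement penalty, triggers a jump.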
9. One or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors cause:
receiving, in real time, (i) an input video comprising a plurality of frames and an object of interest in the plurality of frames, and (ii) a label for which an initial overlay position is pre-computed for placement on a center frame of the input video;
computing, in real time, a saliency map for each of the plurality of frames to obtain a plurality of saliency maps;
computing, in real time, for each of the plurality of frames, Euclidean distance between a current overlay position and a previous overlay position based on the initial overlay position of the label to obtain a plurality of Euclidean distances, wherein the Euclidean distance for each of the plurality of frames is computed for controlling, in real time, temporal jitter in a position of the label to be placed in the input video; and
calculating, in real time, an updated overlay position of the label for placement in the input video based on the plurality of saliency maps and the plurality of Euclidean distances.
(Dependent claims: 10, 11, 12)
Specification