Patent Application 18298834 - METHOD AND SYSTEM FOR CREATING GROUP HIGHLIGHT - Rejection

Title: METHOD AND SYSTEM FOR CREATING GROUP HIGHLIGHT REELS OF CONSUMERS CONSUMING MEDIA CONTENT AT DIFFERENT LOCATIONS/TIMES

Application Information

  • Invention Title: METHOD AND SYSTEM FOR CREATING GROUP HIGHLIGHT REELS OF CONSUMERS CONSUMING MEDIA CONTENT AT DIFFERENT LOCATIONS/TIMES
  • Application Number: 18298834
  • Submission Date: 2025-05-19
  • Effective Filing Date: 2023-04-11
  • Filing Date: 2023-04-11
  • National Class: 345
  • National Sub-Class: 629000
  • Examiner Employee Number: 90408
  • Art Unit: 2615
  • Tech Center: 2600

Rejection Summary

  • 102 Rejections: 0
  • 103 Rejections: 2

Cited Patents

The following patents were cited in the rejection:

  • Roh et al. (US 2015/0015690 A1)
  • Ni et al. (US 2018/0027307 A1)
  • Mattila (US 2014/0301645 A1)

Office Action Text


    DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Comments
The computer readable storage device recited in claim 17 excludes “propagating electromagnetic signals,” as shown in the published Specification [0105].

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3, 6, 8-9, 11, 13, 16-17, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Roh et al. (US 2015/0015690 A1) in view of Ni et al. (US 2018/0027307 A1).
Regarding claim 1, Roh teaches:
An electronic device (Abstract) comprising: 
a network interface (FIG. 1) 
a memory (FIG. 1) having program code for a consumer experience collage generation (CECG) application (FIG. 5, 101) that enables the electronic device to combine one or more images received … generate a combined output comprising the one or more images and a portion of the first media content from a presentation time at which each of the one or more images was generated …; ([0236] “The above-described method of controlling the electronic device may be written as computer programs and may be implemented in digital microprocessors that execute the programs using a computer readable recording medium. The method of controlling the electronic device may be executed through software. The software may include code segments that perform required tasks. Programs or code segments may also be stored in a processor readable medium or may be transmitted according to a computer data signal combined with a carrier through a transmission medium or communication network.” Abstract: “a controller configured to display the execution screen of a predetermined application on the touch screen, control the camera and the microphone to operate upon entering reaction capture mode, acquire a reaction image by capturing a video or still image of a user through the camera upon detecting the user making a facial expression or gesture through the camera or the user's voice through the microphone, and display the acquired reaction image on the touch screen.”)
a controller (FIG. 1) communicatively coupled to the at least one network interface and the memory, the controller processing the program code of the CECG application, (FIG. 5) which causes the electronic device to: 
receive, … the one or more images and at least one of the portion of the first media content or an associated time marker identifying the presentation time within the first media content; ([0143], “For example, if the user's smile lasts for not less than 5 seconds, the controller 180 may capture a video of the user and save it.”)
in response to receiving the associated time marker, access a copy of the first media content and retrieve, based on the associated time marker, the portion of the first media content from the copy; ([0223], “That is, continuously captured reaction images may serve as indices which indicate the points in time when the individual reaction images are captured. Accordingly, when any one of the thumbnail reaction images is chosen, the video may start to play at the playback position in the video where the chosen reaction image is captured.”) and 
generates a composite product comprising a combination of the at least one image and the portion of the first media content. ([0144]-[0145], “The controller 180 may display the acquired reaction image on the touch screen (S109). This means that the controller 180 can display the user's image on the touch screen 151 as soon as it detects the user's reaction through the camera 121. The controller 180 may display the acquired reaction image in an overlapping way on the execution screen (e.g., video playback screen) of a predetermined application displayed on the touch screen.” FIGS. 6 and 20)
However, Roh does not teach that the emotion reaction is shared among different client devices. Ni, on the other hand, teaches a method of sharing emotional reactions among different client devices. Specifically, Ni teaches: 
a network interface (FIG. 1) that communicatively couples the electronic device to at least one second electronic device that can each operate in a consumer experiencing capturing (CEC) mode while locally presenting a first media content in a respective, locally monitored area; (Abstract: “For example, a client device captures video of a user viewing content, such as a live stream video. Landmark points, corresponding to facial features of the user, are identified and provided to a user reaction distribution service that evaluates the landmark points to identify a facial expression of the user, such as a crying facial expression. The facial expression, such as landmark points that can be applied to a three-dimensional model of an avatar to recreate the facial expression, are provided to client devices of users viewing the content, such as a second client device. The second client device applies the landmark points of the facial expression to a bone structure mapping and a muscle movement mapping to create an expressive avatar having the facial expression for display to a second user.”)
a memory (FIG. 1) having program code for a consumer experience collage generation (CECG) application (FIG. 5, 101) that enables the electronic device to combine one or more images received from the at least one second electronic device to generate a combined output comprising the one or more images and a portion of the first media content from a presentation time at which each of the one or more images was generated by a corresponding one of the at least one second electronic device; (Abstract: “For example, a client device captures video of a user viewing content, such as a live stream video. Landmark points, corresponding to facial features of the user, are identified and provided to a user reaction distribution service that evaluates the landmark points to identify a facial expression of the user, such as a crying facial expression. The facial expression, such as landmark points that can be applied to a three-dimensional model of an avatar to recreate the facial expression, are provided to client devices of users viewing the content, such as a second client device. The second client device applies the landmark points of the facial expression to a bone structure mapping and a muscle movement mapping to create an expressive avatar having the facial expression for display to a second user.”)
receive, via the network interface from the at least one second electronic device, the one or more images and at least one of the portion of the first media content (FIG. 6, FIG. 7)
Roh teaches a method of capturing and displaying emotional reactions. Ni teaches a method of capturing emotional reactions and sharing them among different client devices.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Roh with the specific teachings of Ni to allow the emotional reaction to be shared in a distributed environment. 
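For illustration only, and not as code from the application or from Roh or Ni (neither of which prescribes a particular implementation), the claim-1 flow summarized above can be sketched in Python: reaction images arrive with an associated time marker, the corresponding portion of the first media content is retrieved from a local copy, and the two are combined into a composite product. All names and the frame-list data model are hypothetical.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class ReactionImage:
        device_id: str         # which second electronic device captured the image
        capture_time_s: float  # presentation time of the media content at capture
        pixels: bytes          # encoded image payload (placeholder)

    @dataclass
    class CompositeProduct:
        media_segment: bytes          # portion of the first media content
        reaction_images: List[bytes]  # the received images

    def retrieve_portion(media_copy: List[bytes], time_marker_s: float,
                         frame_rate: float = 30.0, window_frames: int = 60) -> bytes:
        """Retrieve the portion of the media content identified by the time marker.

        The copy of the first media content is modeled as a list of encoded
        frames; the time marker is converted to a frame index and a short
        window around it is returned.
        """
        center = int(time_marker_s * frame_rate)
        start = max(0, center - window_frames // 2)
        end = min(len(media_copy), center + window_frames // 2)
        return b"".join(media_copy[start:end])

    def build_composite(media_copy: List[bytes],
                        received: List[ReactionImage]) -> CompositeProduct:
        """Combine received reaction images with the time-marked media portion."""
        # Use the earliest capture time as the shared time marker for the group.
        time_marker = min(img.capture_time_s for img in received)
        segment = retrieve_portion(media_copy, time_marker)
        return CompositeProduct(media_segment=segment,
                                reaction_images=[img.pixels for img in received])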

Regarding claim 3, Roh in view of Ni teaches:
The electronic device of claim 1, further comprising: a device interface that communicatively couples the electronic device to a local media output device that presents the first media content in a local monitored area; (Roh FIG. 1, the controller communicatively couples to the display as shown in FIG. 2c to display captured expression image.) at least one image capturing device having a field of view of the local monitored area including at least one consumer of the first media content; (Roh [0009], “If the user's facial expression or gesture is detected through the camera for a predetermined time or more, or the user's voice is detected through the microphone for the predetermined time or more, the controller may acquire the reaction image as a video, and if the user's facial expression, gesture, or voice is detected for less than the predetermined time, the controller may acquire the reaction image as a still image.”) and a timer which monitors a runtime of the first media content presented to the local monitored area; (Roh [0219]-[0223], “FIG. 19 is a view for explaining a fourth embodiment of the present invention. As shown in FIG. 19, the controller 180 may display reaction images 33, 34, 35, 36, 37, 38, and 39 on the video playback screen 20 by continuously capturing a user P15 a plurality of times. The reaction images 33, 34, 35, 36, 37, 38, and 39 may be displayed in thumbnail form. As shown in (a) of FIG. 1, the controller 180 may receive a touch input for choosing any one of the reaction images 33, 34, 35, 36, 37, 38, and 39 displayed in thumbnail form with a finger F5. As shown in (b) of FIG. 1, the controller 180 may allow the reaction image 36 to jump to a point in time in the video when the reaction image 36 is captured and display the video, based on the input for choosing the thumbnail reaction image 36. That is, continuously captured reaction images may serve as indices which indicate the points in time when the individual reaction images are captured.”) wherein the controller is communicatively connected to the device interface and the at least one image capturing device, (Roh FIG. 1) and the controller: triggers the image capturing device to capture at least one local image at a time that overlaps with the presentation time of the portion of the first media content; and incorporates the at least one local image into the composite product. (Roh [0126]-[0127], “When the execution screen (e.g., video playback screen) of a predetermined application is displayed on the touch screen 151, the electronic device 100 according to the embodiment of the present invention may operate in reaction capture mode for capturing an image of a user seeing the touch screen 151 by operating the camera and the microphone. That is, upon detecting a reaction such as a user gesture while capturing the user watching a video by operating the camera and the microphone, the controller may operate in reaction capture mode that captures an image of the user and displays the captured image on the touch screen.” FIG. 6)

Regarding claim 6, Roh in view of Ni teaches:
The electronic device of claim 1, wherein: the first media content is broadcast content presented to each local and remote monitored area contemporaneously. (Ni, Abstract: “One or more computing devices, systems, and/or methods for emotional reaction sharing are provided. For example, a client device captures video of a user viewing content, such as a live stream video. Landmark points, corresponding to facial features of the user, are identified and provided to a user reaction distribution service that evaluates the landmark points to identify a facial expression of the user, such as a crying facial expression. The facial expression, such as landmark points that can be applied to a three-dimensional model of an avatar to recreate the facial expression, are provided to client devices of users viewing the content, such as a second client device. The second client device applies the landmark points of the facial expression to a bone structure mapping and a muscle movement mapping to create an expressive avatar having the facial expression for display to a second user.” The combination rationale of claim 1 is incorporated here.)

Regarding claim 8, Roh in view of Ni teaches:
The electronic device of claim 1, wherein: the at least one second electronic device comprises multiple second electronic devices; (Ni, FIG. 1) the time marker represents a corresponding runtime of the first media content when a trigger event occurred at a first one of the multiple second electronic devices; (Roh, [0219]-[0223]) and the controller transmits to each other second electronic device a request to retrieve and forward an image from a background video monitoring of the FOV of a local image capturing device while a same portion of the first media content is being presented at respective locations of the other second electronic devices. (Ni, FIG. 9 and corresponding paragraphs. [0038] teaches real-time sharing of emotional reactions.) The combination rationale of claim 1 is incorporated here.
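For illustration only (the claim and the cited references do not specify any request format or transport), the fan-out described in this limitation can be sketched in Python as follows; the ImageRequest message, the callable peer abstraction, and all names are hypothetical:

    from dataclasses import dataclass
    from typing import Callable, Dict, List

    @dataclass
    class ImageRequest:
        content_id: str       # identifies the first media content
        time_marker_s: float  # runtime at which the trigger event occurred

    def request_peer_images(peers: Dict[str, Callable[[ImageRequest], bytes]],
                            request: ImageRequest) -> List[bytes]:
        """Ask each other second electronic device for the frame its local camera
        captured while the same portion of the content was being presented.

        Each peer is modeled as a callable standing in for a network call over
        the device's network interface.
        """
        images = []
        for device_id, send in peers.items():
            # Each peer returns an image from its background video monitoring.
            images.append(send(request))
        return images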

Regarding claim 9, Roh in view of Ni teaches:
The electronic device of claim 3, wherein: the media output device comprises a video display device; (Roh FIG. 1) the portion of the first media content comprises one of a still frame image or video segment copied from a video stream and a screen shot captured from content presented on the video display device; (Roh [0138], “The controller 180 may acquire a reaction image by capturing a video or still image of the user through the camera” FIG. 5) the composite product is a combination of the still frame image, video segment, or screen shot and the captured at least one image; (Roh FIG. 5, displaying the image as shown in FIG. 6) and the controller stores the composite product in one of a local storage of the electronic device and a remote storage. (Roh [0146], “The controller 180 may save the screen including the reaction image (S110).” Ni teaches storage on different devices in FIGS. 2 and 9. The combination rationale of claim 1 is incorporated here.)

Claims 11, 13, and 16 recite limitations similar to those of claims 1, 3, and 8, respectively, and are thus rejected accordingly.
Claims 17 and 19 recite limitations similar to those of claims 1 and 3, respectively, and are thus rejected accordingly.

Claims 2, 4-5, 7, 10, 12, 14-15, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Roh in view of Ni, and further in view of Mattila (US 2014/0301645 A1).
Regarding claim 2, Roh in view of Ni teaches:
The electronic device of claim 1, wherein the controller: retrieves, …time data indicating a time of capture of a corresponding image relative to the presentation time at which the portion is presented within the first media content; synchronizes the time data for each received image with the presentation time of the portion of the media content; and generates the composite product from the at least one image having overlapping time data with the presentation time of the portion of the media content. (Roh [0219]-[0223], “FIG. 19 is a view for explaining a fourth embodiment of the present invention. As shown in FIG. 19, the controller 180 may display reaction images 33, 34, 35, 36, 37, 38, and 39 on the video playback screen 20 by continuously capturing a user P15 a plurality of times. The reaction images 33, 34, 35, 36, 37, 38, and 39 may be displayed in thumbnail form. As shown in (a) of FIG. 1, the controller 180 may receive a touch input for choosing any one of the reaction images 33, 34, 35, 36, 37, 38, and 39 displayed in thumbnail form with a finger F5. As shown in (b) of FIG. 1, the controller 180 may allow the reaction image 36 to jump to a point in time in the video when the reaction image 36 is captured and display the video, based on the input for choosing the thumbnail reaction image 36. That is, continuously captured reaction images may serve as indices which indicate the points in time when the individual reaction images are captured. Accordingly, when any one of the thumbnail reaction images is chosen, the video may start to play at the playback position in the video where the chosen reaction image is captured.”)
However, Roh in view of Ni does not teach, but Mattila does teach:
retrieves, from a header of each of the one or more images, time data ([0036], “In one embodiment, the user-captured images are stored in files that may include metadata in their file headers, including a time of capture, positioning data, and camera pose information. By way of example, the image headers may store the metadata according to the exchangeable image file format (EXIF).” [0038], “the image sharing service may be queried based on a time of capture,”)
Roh in view of Ni teaches tracking and saving the image capture time and retrieving images based on that time, but does not teach how the time information is saved. Mattila teaches saving the time information as metadata in the image file header.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Roh in view of Ni with the specific teachings of Mattila to easily save and retrieve time information of an image. 
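As an illustration of the general practice Mattila [0036] describes (a capture time stored in the image file header as EXIF metadata), and not as code from any cited reference, such a field can be read in Python with the Pillow imaging library; the library choice, file name, and helper name are assumptions and not part of the record:

    from typing import Optional
    from PIL import Image  # pip install Pillow (a reasonably recent version assumed)

    EXIF_IFD_POINTER = 0x8769    # pointer to the Exif sub-IFD in IFD0
    DATETIME_ORIGINAL = 0x9003   # EXIF DateTimeOriginal: the time of capture
    DATETIME = 0x0132            # IFD0 DateTime, often present as a fallback

    def read_capture_time(path: str) -> Optional[str]:
        """Return the capture time stored in the image header, if any."""
        with Image.open(path) as img:
            exif = img.getexif()                       # base (IFD0) tags
            exif_ifd = exif.get_ifd(EXIF_IFD_POINTER)  # Exif sub-IFD tags
            return exif_ifd.get(DATETIME_ORIGINAL) or exif.get(DATETIME)

    if __name__ == "__main__":
        # Hypothetical file name; prints e.g. "2023:04:11 20:15:03" when present.
        print(read_capture_time("reaction_image.jpg"))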

Regarding claim 4, Roh in view of Ni teaches:
The electronic device of claim 3, wherein the controller: detects a trigger event (Roh [0126]-[0127], “When the execution screen (e.g., video playback screen) of a predetermined application is displayed on the touch screen 151, the electronic device 100 according to the embodiment of the present invention may operate in reaction capture mode for capturing an image of a user seeing the touch screen 151 by operating the camera and the microphone. That is, upon detecting a reaction such as a user gesture while capturing the user watching a video by operating the camera and the microphone, the controller may operate in reaction capture mode that captures an image of the user and displays the captured image on the touch screen.”) logs the runtime of the timer tracking the presentation time of the first media content, wherein the time marker represents a corresponding runtime of the first media content when the trigger event occurred; (Roh [0219]-[0223], each reaction image represents a time marker.) activates the at least one image capturing device to capture the at least one image encompassing the field of view; (Roh [0126]-[0127]) performs a time-grab of a portion of the first media content being presented at that presentation time; and tags the portion with the runtime. (Roh [0219]-[0223], “FIG. 19 is a view for explaining a fourth embodiment of the present invention. As shown in FIG. 19, the controller 180 may display reaction images 33, 34, 35, 36, 37, 38, and 39 on the video playback screen 20 by continuously capturing a user P15 a plurality of times. The reaction images 33, 34, 35, 36, 37, 38, and 39 may be displayed in thumbnail form. As shown in (a) of FIG. 1, the controller 180 may receive a touch input for choosing any one of the reaction images 33, 34, 35, 36, 37, 38, and 39 displayed in thumbnail form with a finger F5. As shown in (b) of FIG. 1, the controller 180 may allow the reaction image 36 to jump to a point in time in the video when the reaction image 36 is captured and display the video, based on the input for choosing the thumbnail reaction image 36. That is, continuously captured reaction images may serve as indices which indicate the points in time when the individual reaction images are captured. Accordingly, when any one of the thumbnail reaction images is chosen, the video may start to play at the playback position in the video where the chosen reaction image is captured.”)
However, Roh in view of Ni does not teach, but Mattila does teach:
stores the runtime in a metadata of each of the at least one image; ([0036], “In one embodiment, the user-captured images are stored in files that may include metadata in their file headers, including a time of capture, positioning data, and camera pose information. By way of example, the image headers may store the metadata according to the exchangeable image file format (EXIF).” [0038], “the image sharing service may be queried based on a time of capture,”)
Roh in view of Ni teaches tracking and saving the image capture time and retrieving images based on that time, but does not teach how the time information is saved. Mattila teaches saving the time information as metadata in the image file header.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Roh in view of Ni with the specific teachings of Mattila to easily save and retrieve time information of an image. 
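As an illustration of the same general practice applied to the claim 4 limitation of storing the runtime in image metadata, and again not as code from the application or any cited reference, a runtime value can be written into an image's EXIF header with Pillow; the tag choice, file names, and library are assumptions:

    from PIL import Image  # pip install Pillow

    DATETIME = 0x0132  # IFD0 DateTime tag, used here to hold the logged runtime

    def tag_image_with_runtime(src: str, dst: str, runtime: str) -> None:
        """Copy an image while recording the logged runtime in its EXIF header.

        The runtime is written as an EXIF date/time string purely for
        illustration; a real system could use any agreed-upon metadata field.
        """
        with Image.open(src) as img:
            exif = img.getexif()
            exif[DATETIME] = runtime  # e.g. "2023:04:11 20:15:03"
            img.save(dst, exif=exif.tobytes())

    if __name__ == "__main__":
        # Hypothetical file names.
        tag_image_with_runtime("capture.jpg", "capture_tagged.jpg",
                               "2023:04:11 20:15:03")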

Regarding claim 5, Roh in view of Ni teaches:
The electronic device of claim 4, wherein the controller: associates each of the logged runtime, the captured at least one image, and time-grabbed portion of the first media content with one another; (Roh [0219]-[0223], “FIG. 19 is a view for explaining a fourth embodiment of the present invention. As shown in FIG. 19, the controller 180 may display reaction images 33, 34, 35, 36, 37, 38, and 39 on the video playback screen 20 by continuously capturing a user P15 a plurality of times. The reaction images 33, 34, 35, 36, 37, 38, and 39 may be displayed in thumbnail form. As shown in (a) of FIG. 1, the controller 180 may receive a touch input for choosing any one of the reaction images 33, 34, 35, 36, 37, 38, and 39 displayed in thumbnail form with a finger F5. As shown in (b) of FIG. 1, the controller 180 may allow the reaction image 36 to jump to a point in time in the video when the reaction image 36 is captured and display the video, based on the input for choosing the thumbnail reaction image 36. That is, continuously captured reaction images may serve as indices which indicate the points in time when the individual reaction images are captured.”) stores …at least one image, and portion of the user selected media content in a local memory; and stores the composite product to the local memory. (Roh [0011], “the controller may save the execution screen of the predetermined application displaying the acquired reaction image as a single file in the memory.”) 
However, Roh in view of Ni does not teach, but Mattila does teach:
stores the associated runtime, at least one image ([0036], “In one embodiment, the user-captured images are stored in files that may include metadata in their file headers, including a time of capture, positioning data, and camera pose information. By way of example, the image headers may store the metadata according to the exchangeable image file format (EXIF).” [0038], “the image sharing service may be queried based on a time of capture,”)
Roh in view of Ni teaches tracking and saving the image capture time and retrieving images based on that time, but does not teach how the time information is saved. Mattila teaches saving the time information as metadata in the image file header.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Roh in view of Ni with the specific teachings of Mattila to easily save and retrieve time information of an image. 

Regarding claim 7, Roh in view of Ni and Mattila teaches:
The electronic device of claim 4, wherein: the first media content is presented to a second monitored area at a later time from when the first media content is presented to the local monitored area. (Roh [0219]-[0223] teaches that the images and video can be displayed at a later time, when the user selects a thumbnail image that represents a starting time of a replay. Mattila teaches that the images and video can be displayed on other devices. It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Roh in view of Ni with the specific teachings of Mattila to provide a flexible user application.)

Regarding claim 10, Roh in view of Ni and Mattila teaches:
The electronic device of claim 7, wherein the controller transmits a copy of the composite product to each of the at least one second electronic device registered to participate in a collaborative media content consumption experience. (Ni, Abstract: “For example, a client device captures video of a user viewing content, such as a live stream video. Landmark points, corresponding to facial features of the user, are identified and provided to a user reaction distribution service that evaluates the landmark points to identify a facial expression of the user, such as a crying facial expression. The facial expression, such as landmark points that can be applied to a three-dimensional model of an avatar to recreate the facial expression, are provided to client devices of users viewing the content, such as a second client device. The second client device applies the landmark points of the facial expression to a bone structure mapping and a muscle movement mapping to create an expressive avatar having the facial expression for display to a second user.” FIGS. 1 and 9 show a collaborative environment. The combination rationale of claim 1 is incorporated here.)

Claims 12 and 14-15 recite limitations similar to those of claims 2 and 4-5, respectively, and are thus rejected accordingly.

Claims 18 and 20 recite limitations similar to those of claims 2 and 4, respectively, and are thus rejected accordingly.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YANNA WU whose telephone number is (571)270-0725. The examiner can normally be reached Monday-Thursday 8:00-5:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alicia Harrington, can be reached at 571-272-2330. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/YANNA WU/
Primary Examiner, Art Unit 2615