Patent Application 17804376 - IDENTIFYING AND LOCALIZING EDITORIAL CHANGES TO IMAGES UTILIZING DEEP LEARNING - Rejection

Title: IDENTIFYING AND LOCALIZING EDITORIAL CHANGES TO IMAGES UTILIZING DEEP LEARNING

Application Information

  • Invention Title: IDENTIFYING AND LOCALIZING EDITORIAL CHANGES TO IMAGES UTILIZING DEEP LEARNING
  • Application Number: 17804376
  • Submission Date: 2025-05-14
  • Effective Filing Date: 2022-05-27
  • Filing Date: 2022-05-27
  • National Class: 382
  • National Sub-Class: 100000
  • Examiner Employee Number: 88165
  • Art Unit: 2673
  • Tech Center: 2600

Rejection Summary

  • 102 Rejections: 0
  • 103 Rejections: 2

Cited Patents

The following patent publications were cited in the rejection:

  • Arslan et al., US Pub. No. 2022/0027732 A1
  • Dong, US Pub. No. 2024/0273852 A1

Office Action Text


    DETAILED ACTION

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1 and 3-4 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (“ObjectFormer for Image Manipulation Detection and Localization”) in view of Arslan et al. (US Pub. No. 2022/0027732 A1).
Regarding claim 1, Wang discloses, a non-transitory computer readable medium comprising instructions that, when executed by at least one processor, cause the at least one processor to perform operations comprising: (The image processing is inherently executed by a processor using instructions stored on a computer readable medium.)
generating a fused feature vector by combining deep features from the aligned first image and the second image; (See Wang p. 3 Section 3.1, “We then generate spatial patches of the same sizes using Gr and Gf and further flatten them to a sequence of C-d vectors with the length of L. We concatenate the two sequences to obtain a multimodal patch vector p.” Further see Fig. 2)
and generating one or more visual indicators, from the fused feature vector utilizing one or more neural network layers, identifying locations of editorial modifications in the first image relative to the second image. (See Wang p. 4 Section 3.4, “While for manipulation localization, we progressively upsample Gout by alternating convolutional layers and linear interpolation operations to obtain a predicted mask M^.” See Fig. 2 and Fig. 5)
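For illustration only, the following is a minimal PyTorch-style sketch of the fusion and localization operations quoted above from Wang: two branch feature maps are flattened into patch sequences and concatenated into a fused (multimodal) sequence, and a decoded feature map is progressively upsampled into a predicted mask by alternating convolution and interpolation. Module and variable names (e.g., FuseAndLocalize) are illustrative assumptions and are not taken from Wang's paper or any released code.

```python
# Illustrative sketch only: fuse two patch-embedding sequences and upsample a
# decoded feature map into a localization mask, following the passages quoted
# from Wang. Names and dimensions are assumptions, not Wang's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FuseAndLocalize(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        # decoder head: alternate convolution and (bilinear) interpolation
        self.conv1 = nn.Conv2d(channels, channels // 2, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels // 2, 1, kernel_size=3, padding=1)

    def fuse(self, g_r, g_f):
        # g_r, g_f: (B, C, H, W) feature maps from the two branches
        seq_r = g_r.flatten(2).transpose(1, 2)   # (B, H*W, C) patch sequence
        seq_f = g_f.flatten(2).transpose(1, 2)   # (B, H*W, C) patch sequence
        return torch.cat([seq_r, seq_f], dim=1)  # fused multimodal sequence (B, 2*H*W, C)

    def localize(self, g_out):
        # g_out: (B, C, H, W) decoded feature map; progressively upsample to a mask
        x = F.interpolate(F.relu(self.conv1(g_out)), scale_factor=2,
                          mode="bilinear", align_corners=False)
        x = F.interpolate(self.conv2(x), scale_factor=2,
                          mode="bilinear", align_corners=False)
        return torch.sigmoid(x)                  # predicted mask in [0, 1]

model = FuseAndLocalize()
g_r = torch.randn(1, 256, 16, 16)
g_f = torch.randn(1, 256, 16, 16)
fused = model.fuse(g_r, g_f)   # (1, 512, 256)
mask = model.localize(g_r)     # (1, 1, 64, 64)
```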
Wang discloses the above limitations but fails to disclose the following limitation.
However, Arslan discloses, aligning a first image with a second image to generate an aligned first image; (See Arslan ¶34, “The first and second input image data 102-A, 102-B correspond to a digital representation of first and second images, respectively. Preferably, apparatus 100 further comprises a preprocessor (not shown) which is configured to preprocess the first and second input image data 102-A, 102-B in order to align the corresponding first and second images based on a plurality of predefined image points.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the preprocessing alignment of images prior to their comparison via a parallel neural network, as suggested by Arslan, to Wang’s images that are input to the parallel neural network. This can be done using known engineering techniques, with a reasonable expectation of success. The motivation for doing so is to ensure that the images are aligned so that the comparison between them is more accurate.
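For illustration only, a minimal OpenCV sketch of the kind of point-based preprocessing alignment Arslan describes in ¶34: estimate a homography from corresponding image points and warp the first image into the second image's coordinate frame. The specific point correspondences and the use of a RANSAC homography are assumptions for illustration; Arslan's "predefined image points" are not specified here.

```python
# Illustrative sketch only: align a first image to a second image using a set of
# corresponding image points. The points below are placeholders.
import numpy as np
import cv2

def align_first_to_second(first_img, second_img, pts_first, pts_second):
    # Estimate a homography mapping the first image's points onto the second
    # image's points, then warp the first image into the second image's frame.
    H, _ = cv2.findHomography(pts_first, pts_second, method=cv2.RANSAC)
    h, w = second_img.shape[:2]
    return cv2.warpPerspective(first_img, H, (w, h))

# Hypothetical example with four corresponding points
first = np.zeros((480, 640, 3), dtype=np.uint8)
second = np.zeros((480, 640, 3), dtype=np.uint8)
pts_f = np.float32([[10, 10], [600, 20], [620, 450], [15, 460]])
pts_s = np.float32([[12, 8], [598, 25], [615, 455], [18, 458]])
aligned_first = align_first_to_second(first, second, pts_f, pts_s)
```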

Regarding claim 3, Wang and Arslan disclose, the non-transitory computer readable medium of claim 1, wherein the operations further comprise: generating a first set of deep features for the aligned first image utilizing a neural network feature extractor; (See Wang p. 3 left column first para, “We first extract the feature map Gr ∈ RHs×Ws×Cs and generate patch embeddings using a few convolutional layers, parameterized by g, for faster convergence.”)
and generating a second set of deep features for the second image utilizing the neural network feature extractor.  (See Wang p. 3 Section 3.1, “After that, we input Xh to several convolutional layers to extract frequency features Gf .”)

Regarding claim 4, Wang and Arslan disclose, the non-transitory computer readable medium of claim 3, wherein generating the fused feature vector by combining deep features from the aligned first image and the second image comprises: generating a combination of the first set of deep features and the second set of deep features; and generating the fused feature vector from the combination utilizing a neural network encoder.  (See Wang p. 3 Section 3.2, “The object encoder aims to learn a group of mid-level representations automatically that attend to specific regions in Gr/Gf and identify whether these regions are consistent with each other. To this end, we use a set of learnable parameters o ∈ RN×C as object prototypes, which are learned to represent objects that may appear in images.”)
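For illustration only, a minimal PyTorch sketch of an encoder that uses learnable object prototypes to attend over the fused patch sequence, in the spirit of the object encoder quoted from Wang Section 3.2. The dimensions and the use of a standard multi-head cross-attention layer are assumptions for illustration, not Wang's exact architecture.

```python
# Illustrative sketch only: learnable prototype vectors attend over fused patch
# features to produce mid-level representations.
import torch
import torch.nn as nn

class ObjectEncoder(nn.Module):
    def __init__(self, num_prototypes=16, dim=256):
        super().__init__()
        # learnable object prototypes o in R^(N x C)
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, dim))
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)

    def forward(self, patch_seq):
        # patch_seq: (B, L, C) fused multimodal patch sequence
        b = patch_seq.size(0)
        queries = self.prototypes.unsqueeze(0).expand(b, -1, -1)  # (B, N, C)
        # prototypes attend to patch features, yielding mid-level representations
        out, _ = self.cross_attn(queries, patch_seq, patch_seq)
        return out                                                # (B, N, C)

encoder = ObjectEncoder()
fused = torch.randn(2, 512, 256)
reps = encoder(fused)   # (2, 16, 256)
```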


Claims 2, 18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (“ObjectFormer for Image Manipulation Detection and Localization”) in view of Arslan et al. (US Pub. No. 2022/0027732 A1) and in further view of Dong (US Pub. No. 2024/0273852 A1).
Regarding claim 2, Wang and Arslan disclose, the non-transitory computer readable medium of claim 1, but they fail to disclose the following limitations.
However, Dong discloses, wherein aligning the first image with the second image comprises: generating an optical flow between the first image and the second image; and warping the first image utilizing a de-warping unit based on the optical flow to generate the aligned first image.  (See Dong ¶4, “A conventional image alignment method is: calculating an optical flow field between a target image and a reference image, taking the optical flow field as a dense registration relation between the target image and the reference image, and finally aligning the target image to the reference image by means of back-warping.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the optical-flow-based image alignment suggested by Dong to Wang and Arslan’s image pair alignment, using known engineering techniques and with a reasonable expectation of success. The motivation for doing so is that optical flow enables accurate determination of a moving object’s position.
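For illustration only, a minimal OpenCV sketch of the conventional alignment Dong ¶4 describes: compute a dense optical flow field between the first (target) image and the second (reference) image, then back-warp the first image onto the second image's pixel grid. Farneback flow is used here only as one concrete flow estimator; neither the claim nor Dong requires that particular method.

```python
# Illustrative sketch only: dense optical flow between two images followed by
# back-warping of the first image into alignment with the second image.
import numpy as np
import cv2

def align_by_optical_flow(first_img, second_img):
    g1 = cv2.cvtColor(first_img, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(second_img, cv2.COLOR_BGR2GRAY)
    # flow[y, x] maps second-image coordinates toward the corresponding
    # location in the first image
    flow = cv2.calcOpticalFlowFarneback(g2, g1, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    h, w = g2.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    # back-warp (de-warp) the first image onto the second image's pixel grid
    return cv2.remap(first_img, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```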

Regarding claim 18, Wang, Arslan, and Dong disclose, the computer-implemented method comprising: aligning a first image with a second image to generate an aligned first image based on an optical flow between the first image and the second image; generating a first set of deep features for the aligned first image and a second set of deep features for the second image utilizing a neural network feature extractor; generating a fused feature vector by combining the first set of deep features and second set of deep features; and generating one or more visual indicators from the fused feature vector utilizing one or more neural network layers, the one or more visual indicators identifying locations of editorial modifications in the first image relative to the second image.  (See the rejection of claim 2 as it is equally applicable for claim 18 as well.)

Regarding claim 20, Wang, Arslan, and Dong disclose, the computer-implemented method of claim 18, wherein generating a fused feature vector by combining the first set of deep features and second set of deep features comprises passing a combination of the first set of deep features and second set of deep features through a neural network encoder.  (See the rejection of claim 4 as it is equally applicable for claim 20 as well.)



	Allowable Subject Matter
Claims 5-8 and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Regarding claim 5, the non-transitory computer readable medium of claim 1, wherein generating the one or more visual indicators, from the fused feature vector utilizing the one or more neural network layers, identifying locations of editorial modifications in the first image relative to the second image comprises: generating a heat map from the fused feature vector utilizing a multilayer perceptron; and overlaying the one or more visual indicators on the first image based on the heat map. (The disclosed prior art of record fails to disclose the limitations of this claim.)
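For illustration only, a minimal sketch of the data flow recited in claim 5: a multilayer perceptron maps per-patch entries of the fused feature vector to scores, the scores are reshaped and upscaled into a heat map, and the heat map is blended over the first image as a visual indicator. The shapes, the MLP architecture, and the overlay style are illustrative assumptions, not taken from the application.

```python
# Illustrative sketch only: MLP scores per patch -> heat map -> overlay.
import numpy as np
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))

def heat_map_and_overlay(fused_seq, image, grid_hw=(16, 16)):
    # fused_seq: (L, C) per-patch fused features; image: (H, W, 3) uint8 array
    scores = torch.sigmoid(mlp(fused_seq)).detach().numpy().reshape(grid_hw)
    h, w = image.shape[:2]
    # nearest-neighbor upscale of the patch-level scores to image resolution
    heat = np.kron(scores, np.ones((h // grid_hw[0], w // grid_hw[1])))
    heat = heat[:h, :w, None]
    # blend a single-channel heat map over the image as the visual indicator
    overlay = (0.6 * image + 0.4 * 255 * np.concatenate(
        [heat, np.zeros_like(heat), np.zeros_like(heat)], axis=2)).astype(np.uint8)
    return heat[..., 0], overlay

img = np.zeros((256, 256, 3), dtype=np.uint8)
fused = torch.randn(256, 256)   # 16x16 = 256 patches, 256-dim features
hm, vis = heat_map_and_overlay(fused, img)
```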

Regarding claim 6, the non-transitory computer readable medium of claim 1, wherein operations further comprise generating a classification for modifications of the first image relative to the second image as benign, editorial, or a different image.  (The disclosed prior art of record fails to disclose the limitations of this claim.)
	
	Regarding claims 7 and 8, these claims are objected to since they depend from objected-to claim 6.

Regarding claim 19, the computer-implemented method of claim 18, wherein generating the one or more visual indicators comprises: generating a heat map from the fused feature vector utilizing a multilayer perceptron; and overlaying the one or more visual indicators on the first image based on the heat map. (The disclosed prior art of record fails to disclose the limitations of this claim.)

Claims 9-17 are allowed.
The following is an examiner’s statement of reasons for allowance: 
Regarding claim 9, a system comprising: one or more memory devices comprising a set of trusted digital images; and one or more processors that are configured to cause the system to: search the set of trusted digital images for a trusted near-duplicate image to a query image; align the query image to the trusted near-duplicate image to generate an aligned query image; generating a fused feature vector by combining deep features from the aligned query image and the trusted near-duplicate image; and determine whether changes to the query image relative to the trusted near-duplicate image comprise benign changes or editorial changes by generating a classification from the fused feature vector utilizing one or more neural network layers.  (The closest found prior art references are Wang et al. (“ObjectFormer for Image Manipulation Detection and Localization”) in view of Arslan et al. (US Pub. No. 2022/0027732 A1). However, Wang in view of Arslan does not disclose all the limitations of this claim.)
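For illustration only, a high-level sketch of the data flow recited in claim 9: search a trusted image set for a near-duplicate of the query image, align the query to it, fuse deep features from both images, and classify the changes as benign or editorial. Every function name below is a hypothetical placeholder; the claim does not specify particular algorithms for these steps.

```python
# Illustrative sketch only: the claimed steps expressed as a pipeline over
# caller-supplied components. All callables are hypothetical placeholders.
def analyze_query(query_image, trusted_images, find_near_duplicate, align,
                  extract_features, fuse, classify):
    trusted = find_near_duplicate(query_image, trusted_images)  # retrieval step
    aligned_query = align(query_image, trusted)                 # alignment step
    fused = fuse(extract_features(aligned_query), extract_features(trusted))
    return classify(fused)                                      # "benign" or "editorial"
```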
	
	Regarding claims 10-17, these claims are allowed since they depend from allowed claim 9.

Any comments considered necessary by applicant must be submitted no later than the payment of the issue fee and, to avoid processing delays, should preferably accompany the issue fee.  Such submissions should be clearly labeled “Comments on Statement of Reasons for Allowance.”

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID PERLMAN whose telephone number is (571) 270-1417.
The examiner can normally be reached Monday - Friday, 10:00 am - 6:30 pm.
Examiner interviews are available via telephone and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.  
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chineyere Wills-Burns, can be reached at (571) 272-9752. The fax phone number for the organization where this application or proceeding is assigned is (571) 273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at (866) 217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call (800) 786-9199 (IN USA OR CANADA) or (571) 272-1000.

/DAVID PERLMAN/Primary Examiner, Art Unit 2673
