18209024. EFFICIENT DATA DISTRIBUTION PRESERVING TRAINING PARADIGM (Oracle International Corporation)

From WikiPatents
Revision as of 07:31, 19 December 2024 by Wikipatents (talk | contribs) (Creating a new page)

EFFICIENT DATA DISTRIBUTION PRESERVING TRAINING PARADIGM

Organization Name

Oracle International Corporation

Inventor(s)

Renata Khasanova of Zurich (CH)

Aneesh Dahiya of Zurich (CH)

Felix Schmidt of Baden-Dattwil (CH)

This abstract first appeared for US patent application 18209024 titled 'EFFICIENT DATA DISTRIBUTION PRESERVING TRAINING PARADIGM'.



Original Abstract Submitted

A computer performs deduplication of an original training corpus for maintaining accuracy of accelerated training of a reconstructive or other machine learning (ML) model. Distinct multidimensional points are detected in the original training corpus that contains duplicates. Based on duplicates in the original training corpus, a respective observed frequency of each distinct multidimensional point is increased. In a reconstructive embodiment and based on a particular distinct multidimensional point as input, a reconstruction of the particular distinct multidimensional point is generated by a reconstructive ML model. Based on increasing the observed frequency of the particular distinct multidimensional point, a scaled error of the reconstruction of the particular distinct multidimensional point is increased. Based on the scaled error of the reconstruction of the particular distinct multidimensional point, accuracy of the reconstructive model is increased. In an embodiment, the reconstructive ML model is an artificial neural network that is a denoising autoencoder that detects anomalous database statements.
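
The idea in the abstract can be illustrated with a minimal sketch. It is not the applicants' implementation: the abstract does not say how the error is scaled, so the sketch assumes the simplest choice, a weight proportional to each distinct point's observed frequency. The names `deduplicate`, `scaled_reconstruction_loss`, and `toy_reconstruct` are hypothetical, and `toy_reconstruct` is a fixed stand-in for a trained reconstructive model such as a denoising autoencoder.

```python
import numpy as np


def deduplicate(corpus):
    """Return the distinct rows of the corpus and each row's observed frequency."""
    points, counts = np.unique(corpus, axis=0, return_counts=True)
    return points, counts


def scaled_reconstruction_loss(points, counts, reconstruct):
    """Frequency-weighted mean squared reconstruction error.

    Assumption: the scaling is linear in the observed frequency.
    """
    # Per-point error, averaged over the dimensions of each point.
    errors = np.mean((reconstruct(points) - points) ** 2, axis=1)
    return np.sum(counts * errors) / np.sum(counts)


def toy_reconstruct(x):
    # Hypothetical stand-in for a trained model: shrinks inputs toward zero.
    return 0.9 * x


# Original corpus with duplicates: the first point occurs three times.
corpus = np.array([[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [3.0, 4.0]])

points, counts = deduplicate(corpus)

# Loss computed on the two distinct points only, scaled by frequency.
weighted = scaled_reconstruction_loss(points, counts, toy_reconstruct)

# Reference: unweighted loss computed on every row of the original corpus.
full = np.mean(np.mean((toy_reconstruct(corpus) - corpus) ** 2, axis=1))

print(points, counts, weighted, full)
```

Under this linear weighting, the loss on the deduplicated points matches the loss on the full corpus, while the model is evaluated on fewer rows. That is one way to read the claim that training is accelerated while the original data distribution is preserved.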