17951587. TRAINING NEURAL NETWORKS WITH CONVERGENCE TO A GLOBAL MINIMUM simplified abstract (International Business Machines Corporation)


TRAINING NEURAL NETWORKS WITH CONVERGENCE TO A GLOBAL MINIMUM

Organization Name

International Business Machines Corporation

Inventor(s)

Lam Minh Nguyen of Ossining, NY (US)

TRAINING NEURAL NETWORKS WITH CONVERGENCE TO A GLOBAL MINIMUM - A simplified explanation of the abstract

This abstract first appeared for US patent application 17951587, titled 'TRAINING NEURAL NETWORKS WITH CONVERGENCE TO A GLOBAL MINIMUM'.

Simplified Explanation

The patent application describes a method for training a neural network whose loss surface is non-convex: a convex optimization sub-problem is repeatedly approximated and used to update the weights, in order to learn a common classifier from training data.

  • Initial weight vector selection: Choose an initial weight vector for the convex optimization sub-problem associated with the neural network.
  • Approximation and updating: With at least one processor, approximate a solution to the convex optimization sub-problem (this yields a search direction), then update the weight vector by subtracting the approximate solution times a learning rate.
  • Convergence to global minimum: Repeat the approximation and update steps for multiple iterations, with each iteration's updated vector serving as the starting vector for the next, until convergence to a global minimum is achieved; the final weight vector implements the common classifier (see the code sketch after this list).
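
Read procedurally, the claimed loop resembles the minimal sketch below. Note the assumptions: the abstract does not define the convex sub-problem, so the function `approx_subproblem_solution` here is a hypothetical placeholder that returns the gradient of a convex surrogate (logistic loss) as the search direction; only the update rule and the iterate-until-stable structure follow the claim language, not the patent's actual solver.

```python
import numpy as np

def approx_subproblem_solution(w, X, y):
    # Hypothetical placeholder for the patent's convex optimization
    # sub-problem solver: here, the gradient of a convex surrogate
    # (logistic loss) serves as the search direction.
    preds = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (preds - y) / len(y)

def train(X, y, lr=0.1, iterations=1000, tol=1e-8):
    w = np.zeros(X.shape[1])                  # initial weight vector
    for _ in range(iterations):
        direction = approx_subproblem_solution(w, X, y)
        w_next = w - lr * direction           # subtract solution times learning rate
        if np.linalg.norm(w_next - w) < tol:  # proxy stopping test for convergence
            break
        w = w_next                            # updated vector seeds the next iteration
    return w_next

# Toy usage: learn a linear classifier on separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)
w_final = train(X, y)
```

The structural point in the claims is that the updated weight vector from one iteration becomes the initial vector for the next, so the inner solver always works from the freshly updated weights.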

---

1. Potential Applications

This technology can be applied in various fields such as image recognition, natural language processing, and autonomous driving for developing efficient and accurate classifiers.

2. Problems Solved

This technology addresses the challenge that non-convex loss surfaces can trap standard training methods in local minima; by approximating convex optimization sub-problems at each step, the method aims to converge to a global minimum rather than a local one.

3. Benefits

The benefits of this technology include faster training times, improved accuracy of classifiers, and the ability to handle complex neural network architectures effectively.

4. Potential Commercial Applications

This technology can be utilized in industries such as healthcare for medical image analysis, finance for fraud detection, and e-commerce for personalized recommendations, offering enhanced machine learning capabilities.

5. Possible Prior Art

Prior art may include techniques for training neural networks using gradient descent, stochastic gradient descent, or other optimization algorithms to minimize loss functions and improve model performance.
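
For contrast, a generic stochastic gradient descent loop of the kind cited here as prior art might look like the following sketch. The name `sgd`, the caller-supplied per-sample gradient `grad_fn`, and the least-squares example gradient are all illustrative assumptions, not part of the patent; unlike the claimed method, this loop steps directly on the (possibly non-convex) loss and carries no global-minimum guarantee.

```python
import numpy as np

def sgd(X, y, grad_fn, lr=0.01, epochs=10, seed=0):
    # Plain stochastic gradient descent: step against a per-sample
    # gradient of the full (possibly non-convex) loss. No convex
    # sub-problem is solved, and only local convergence is expected.
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w = w - lr * grad_fn(w, X[i], y[i])
    return w

# Example per-sample gradient for least-squares regression.
lsq_grad = lambda w, x_i, y_i: (x_i @ w - y_i) * x_i
```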

---

Unanswered Questions

1. How does this method compare to other approaches for training neural networks with non-convex architectures?

The article does not provide a comparison with other methods for training neural networks with non-convex architectures, such as genetic algorithms or reinforcement learning.

2. What are the computational requirements of implementing this method on large-scale datasets?

The article does not discuss the computational resources needed to apply this method to training neural networks on extensive datasets, which could be crucial for real-world applications.


Original Abstract Submitted

Select an initial weight vector for a convex optimization sub-problem associated with a neural network having a non-convex network architecture loss surface. With at least one processor, approximate a solution to the convex optimization sub-problem that obtains a search direction, to learn a common classifier from training data. With the at least one processor, update the initial weight vector by subtracting the approximate solution to the convex optimization sub-problem times a first learning rate. With the at least one processor, repeat the approximating and updating steps, for a plurality of iterations, with the updated weight vector from a given one of the iterations taken as the initial weight vector for a next one of the iterations, to obtain a final weight vector for the neural network, until convergence to a global minimum is achieved, to implement the common classifier.