Researchers hope Deep Learning algorithms can run on FPGAs and Supercomputers

25 March 2015, 13:05
TipMyPip

Machine learning has made big advances in the past few years, thanks in no small part to new methods for scaling out compute-intensive workloads across more cores. A batch of newly funded National Science Foundation research projects suggests we might just be seeing the tip of the iceberg in terms of what’s possible, as researchers try to scale techniques such as deep learning across more computers and new types of processors.

One particularly interesting project, which is being carried out by a team at the State University of New York at Stony Brook, aims to prove that field-programmable gate arrays, or FPGAs, are superior to graphics processing units, or GPUs, when it comes to running deep learning algorithms faster and more efficiently. This flies in the face of current conventional wisdom, which holds that GPUs, with their thousands of cores per device, are the default choice for speeding up these power-hungry models.

According to the project abstract: “The [principal investigators] anticipate demonstrating that the slowest portion of the algorithm on the GPU will achieve significant speedup on the FPGA, arising from the efficient support of irregular fine-grain parallelism. Meanwhile, the fastest portion of the algorithm on the GPU is anticipated to run with comparable performance on the FPGA, but at dramatically lower power consumption.”
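To make that contrast concrete, here is a minimal sketch (not from the Stony Brook project) of the two kinds of work the abstract distinguishes: a dense layer, whose single large, regular matrix multiply maps naturally onto a GPU’s thousands of lockstep cores, and a sparsely connected layer, whose scattered, fine-grained operations are the sort of irregular parallelism the researchers expect FPGAs to handle more efficiently. The layer sizes and the 99 percent sparsity are illustrative assumptions.

```python
# Illustrative only: contrasts the regular, GPU-friendly dense case with the
# irregular fine-grain case the Stony Brook abstract refers to. Shapes and
# sparsity level are arbitrary assumptions, not taken from the project.
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
batch, n_in, n_out = 128, 4096, 4096

x = rng.standard_normal((batch, n_in)).astype(np.float32)

# Dense weights: one large, regular matrix multiply -- the kind of work that
# keeps thousands of GPU cores busy in lockstep.
w_dense = rng.standard_normal((n_out, n_in)).astype(np.float32)
y_dense = w_dense @ x.T                      # shape (n_out, batch)

# Sparse weights (99% zeros): the useful work is scattered and fine-grained,
# which wastes most of a GPU's lockstep lanes but suits custom FPGA pipelines
# built around the nonzero structure.
w_sparse = sparse.random(n_out, n_in, density=0.01, format="csr",
                         dtype=np.float32, random_state=0)
y_sparse = w_sparse @ x.T                    # compiled CSR kernel, nonzeros only
```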

The idea of running these types of models on hardware other than GPUs isn’t entirely new, though. IBM, for example, recently made a splash with a new brain-inspired chip it claims could be ideal for running neural networks and other cognitive-inspired workloads. And in July, Microsoft Research demonstrated its Project Adam work, which reworked a popular deep learning technique to run on everyday Intel CPUs.

Because of their customizable nature, FPGAs have been picking up a little momentum themselves, too. In June, Microsoft explained how it’s speeding up Bing search by offloading certain parts of the process to FPGAs. Later that month, at Gigaom’s Structure conference, Intel announced a forthcoming hybrid chip architecture that will co-locate an FPGA alongside a CPU (they’ll actually share memory), primarily targeting specialized big data workloads like those Microsoft has with Bing.

However, FPGAs aren’t the only possible new infrastructure choices for deep learning models. The NSF has also funded a project from a New York University researcher to test out deep learning algorithms, as well as other workloads, on Ethernet-based Remote Direct Memory Access, or RDMA. Most commonly used in supercomputers, but now making its way into some enterprise systems, RDMA interconnects speed up the transfer of data between computers by sending messages directly to memory and avoiding the CPU, switches, and other components that add latency to the process.
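As a rough illustration of the step such interconnects accelerate, the sketch below (hypothetical, not from the NYU project) shows workers averaging their gradients after each training batch using an MPI all-reduce; MPI libraries typically carry this call over RDMA transports such as InfiniBand or RoCE when the fabric supports them. The worker count and gradient size are assumptions for illustration.

```python
# A minimal sketch of the communication step that RDMA-capable interconnects
# accelerate in distributed deep learning: every worker averages its local
# gradients with everyone else's after each batch. MPI implementations
# typically move this data over RDMA when the hardware supports it.
# Run with, for example:  mpiexec -n 4 python allreduce_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Stand-in for the gradients this worker computed on its shard of the data.
local_grad = np.random.default_rng(rank).standard_normal(10_000_000).astype(np.float32)
summed = np.empty_like(local_grad)

# Sum gradients across all workers, then divide to get the average update.
comm.Allreduce(local_grad, summed, op=MPI.SUM)
avg_grad = summed / size

if rank == 0:
    print(f"averaged {avg_grad.nbytes / 1e6:.0f} MB of gradients across {size} workers")
```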

Speaking of supercomputers, another new NSF-funded project — this one led by machine learning expert Andrew Ng of Stanford (and Baidu and Coursera), and supercomputing gurus Jack Dongarra of the University of Tennessee and Geoffrey Fox of Indiana University — aims to make deep learning models programmable using Python and port them to supercomputers and scale-out cloud systems. The project, which received nearly $1 million in NSF grants, is called Rapid Python Deep Learning Infrastructure.

According to its abstract: “RaPyDLI will be built as a set of open source modules that can be accessed from a Python user interface but executed interoperably in a C/C++ or Java environment on the largest supercomputers or clouds with interactive analysis and visualization. RaPyDLI will support GPU accelerators and Intel Phi coprocessors and a broad range of storage approaches including files, NoSQL, HDFS and databases.”
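The abstract doesn’t spell out an API, but the division of labor it describes, a Python user interface over compiled kernels, is straightforward to sketch. The toy layer below is entirely hypothetical (its names are invented, and it is not RaPyDLI code): the Python class only orchestrates, while the arithmetic runs in NumPy’s compiled C/BLAS routines, standing in for the C/C++ back ends the project would run on supercomputers or clouds.

```python
# Hypothetical sketch of the RaPyDLI-style split described in the abstract:
# a thin Python user interface whose numerical work executes in compiled
# C/C++ code (here NumPy's BLAS-backed kernels stand in for the HPC back end).
# Class and method names are invented for illustration.
import numpy as np

class DenseLayer:
    """Fully connected layer: Python-level API, compiled-kernel execution."""
    def __init__(self, n_in, n_out, rng):
        self.w = rng.standard_normal((n_in, n_out)).astype(np.float32) * 0.01
        self.b = np.zeros(n_out, dtype=np.float32)

    def forward(self, x):
        # The @ operator dispatches to NumPy's compiled GEMM, not Python loops.
        return np.maximum(x @ self.w + self.b, 0.0)   # ReLU activation

rng = np.random.default_rng(0)
net = [DenseLayer(784, 256, rng), DenseLayer(256, 10, rng)]

batch = rng.standard_normal((32, 784)).astype(np.float32)
out = batch
for layer in net:
    out = layer.forward(out)
print(out.shape)  # (32, 10)
```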

All the work being done to make deep learning algorithms more accessible and improve their performance — and these three projects are just a small fraction of it — will be critical if the approach is ever going to make its way into commercial settings beyond giant web companies, or into research centers and national labs using computers to tackle truly complicated problems.

This broader future for deep learning, and artificial intelligence in general, is the theme of our upcoming Gigaom meetup, which takes place Sept. 17 in San Francisco. Ng will be one of the presenters, along with experts from Google and Microsoft Research, and several researchers and entrepreneurs trying to streamline the process of putting these techniques to work.
