Menu

Stronger. Faster. Better.

August 6, 2018

Pushing performance through computer architecture and algorithm development.

By Helen Hill for MGHPCC
David Kaeli heads the Northeastern University Computer Architecture Research (NUCAR) Laboratory, a group focused on the performance and design of high-performance computer systems and software.
While the remarkable advances that have been made in sensor-based and data-driven science over the past two decades have largely been occurred on the back of Moore’s Law, there is a growing consensus that similarly rapid growth cannot be relied on to last for ever.  To address this Kaeli and his group are exploring how advances in software and algorithm development can accelerate computation to increase the scale and complexity of problems that can be tackled.
Several students from the NUCAR lab shared their research as part of the HPC Day 2018 workshop held at Northeastern University in May. Here we spotlight three projects, one from a group who have been testing how machine learning techniques can be used to improve medical imaging, another from a group who have been trying to develop a better understanding of transient errors, and a third who have been assessing the effectiveness of novel high level synthesis approaches in increasing the attractiveness of FPGA architectures.

Denoising Monte Carlo Photon Transport Simulations
Leiming Yu, Yaoshen Yuan, Zhao Hang, David Kaeli, and Qianqian Fang
click here for pdf

Monte Carlo (MC) methods are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. They are considered to be the gold standard for modeling light propagation in turbid media, such as human tissue, and as such can be found at the heart of many medical imaging technologies. However MC-based simulations are inherently noisy, and while increasing the number of simulated photons reduces the amount of stochastic noise, it does so at the cost of proportionately increased run times.
Having previously explored using a Graphics Processing Unit (GPU) accelerated noise-adaptive non-local mean (ANLM) filter to improve low-photon MC simulations, new work from Leiming Yu et al, has been exploring neural network techniques to help denoise its MC photon transport simulations. Implementing a denoising neural network model to learn the noise, and to learn the photon energy degradation contour, in place of using an ANLM filter, the team were able to demonstrate a four fold improvement in the signal-to-noise ratio from their ANLM filtered simulations. They also showed that their denoised low-photon simulations result can attain comparable quality to one generated from a simulation with two or three orders of magnitude more photons.
The team is now turning its attention to using their neural network model to improve the intensity of the light source.
Transient Fault Handling and GPU Program Vulnerability
Charu Kalra, Fritz Previlon, and David Kaeli

click here for pdf

While transient faults remain a major concern for the High Performance Computing (HPC) community, computer scientists continue to lack a clear understanding of how such faults propagate within applications.
Working with Fritz Previlon and David Kaeli, Charu Kalra has been exploring two particular aspects of the vulnerabilities of HPC applications run on GPUs: their dependence on execution parameters (input data and block size), and their correlation with specific types of instructions, in particular scalar and vector type instructions.
The results Kalra presented at HPC Day 2018 show that the vulnerability of most of the programs studied was insensitive to changes in input values, except in less common cases when input values were highly biased but that corruption rates could vary by up to 8% when the block size changed.
In their work distinguishing the error propagation characteristics when faults occur in scalar, versus vector instructions, results left room for further work with the team finding that while some scalar opcodes showed positive correlation with outcome behaviors, others do not.
High Level Synthesis Approaches
Nicolas Bohm Agostini, Elmira Karimi, Jose Abellan, and David Kaeli.

click here for pdf
With a stalling of CPU clock frequency, a measure of processor speed, the HPC community has started looking into alternative architectures to speed processing time. Chief among the candidates are multicore-CPU, GPU, FPGA (Field Programmable Gate Array) and ASIC (applications Specific Integrated Circuit) architectures, however although each has its strengths and weaknesses, until now, despite performance and power consumption being critical factors where CPUs and GPUs tend to lag, the HPC community has typically focused its attention on them due to ease of use and performance considerations.
Motivated by recent advances in High Level Synthesis (HLS) tools which promise to bridge the “Ease of Use” gap that prevents FPGA accelerators from becoming mainstream, Bohm et al. have been exploring how the rise of new HLS development tools capable of making FPGA architectures easier to leverage could impact the HPC architectural landscape.
The team found that FPGA platforms exhibit good performances with great power efficiency, as compared to CPU and GPU platform solutions, while HLS tools are already capable of delivering similar performance to CPU platforms while also simplifying their programming flow. They expect increased use of FPGAs to increase platform adoption, in turn increasing interest from the research community, likely leading to more breakthroughs.
Story image: Shutterstock

Links

David Kaeli
Northeastern University Computer Architecture Research (NUCAR) Laboratory

Related

HPC Day 2018

Research projects

A Future of Unmanned Aerial Vehicles
Yale Budget Lab
Volcanic Eruptions Impact on Stratospheric Chemistry & Ozone
The Rhode Island Coastal Hazards Analysis, Modeling, and Prediction System
Towards a Whole Brain Cellular Atlas
Tornado Path Detection
The Kempner Institute – Unlocking Intelligence
The Institute for Experiential AI
Taming the Energy Appetite of AI Models
Surface Behavior
Studying Highly Efficient Biological Solar Energy Systems
Software for Unreliable Quantum Computers
Simulating Large Biomolecular Assemblies
SEQer – Sequence Evaluation in Realtime
Revolutionizing Materials Design with Computational Modeling
Remote Sensing of Earth Systems
QuEra at the MGHPCC
Quantum Computing in Renewable Energy Development
Pulling Back the Quantum Curtain on ‘Weyl Fermions’
New Insights on Binary Black Holes
NeuraChip
Network Attached FPGAs in the OCT
Monte Carlo eXtreme (MCX) – a Physically-Accurate Photon Simulator
Modeling Hydrogels and Elastomers
Modeling Breast Cancer Spread
Measuring Neutrino Mass
Investigating Mantle Flow Through Analyses of Earthquake Wave Propagation
Impact of Marine Heatwaves on Coral Diversity
IceCube: Hunting Neutrinos
Genome Forecasting
Global Consequences of Warming-Induced Arctic River Changes
Fuzzing the Linux Kernel
Exact Gravitational Lensing by Rotating Black Holes
Evolution of Viral Infectious Disease
Evaluating Health Benefits of Stricter US Air Quality Standards
Ephemeral Stream Water Contributions to US Drainage Networks
Energy Transport and Ultrafast Spectroscopy Lab
Electron Heating in Kinetic-Alfvén-Wave Turbulence
Discovering Evolution’s Master Switches
Dexterous Robotic Hands
Developing Advanced Materials for a Sustainable Energy Future
Detecting Protein Concentrations in Assays
Denser Environments Cultivate Larger Galaxies
Deciphering Alzheimer’s Disease
Dancing Frog Genomes
Cyber-Physical Communication Network Security
Avoiding Smash Hits
Analyzing the Gut Microbiome
Adaptive Deep Learning Systems Towards Edge Intelligence
Accelerating Rendering Power
ACAS X: A Family of Next-Generation Collision Avoidance Systems
Neurocognition at the Wu Tsai Institute, Yale
Computational Modeling of Biological Systems
Computational Molecular Ecology
Social Capital and Economic Mobility
All Research Projects

Collaborative projects

ALL Collaborative PROJECTS

Outreach & Education Projects

See ALL Scholarships
100 Bigelow Street, Holyoke, MA 01040