[RE-HPC'16 --- Final Deadline Aug. 22]: First Int'l Workshop on Resilience and/or Energy-aware techniques for High-Performance Computing

First International Workshop on Resilience and/or Energy-aware
techniques for High-Performance Computing (RE-HPC)

In conjunction with the International Green and Sustainable 
Computing Conference (IGSC), 2016. 

November 7-9, Hangzhou, China. 


NEWS: Selected papers presented at the workshop will be invited to a special issue of the Elsevier journal Sustainable Computing: Informatics and Systems (SUSCOM).

Resilience and energy consumption have become two important concerns for high-performance computing (HPC) systems. With increasing core counts and technology miniaturization, today's large computing platforms (datacenters, clusters, supercomputers, etc.) are increasingly prone to failures. Faults are becoming the norm rather than the exception. Besides the classical fail-stop errors (such as hardware failures), soft errors (such as silent data corruptions, or SDCs) constitute another threat that can no longer be ignored by the HPC community.

Another concern is energy. Large computing centers are currently among the largest consumers of energy, so measures must be taken to reduce energy consumption. Energy is needed not only to power the individual cores but also to provide cooling for the system. In today's datacenters, a large proportion of energy is spent on cooling and thermal-related activities. It is anticipated that the power dissipated to perform communications and I/O transfers will also make up a much larger share of the overall power consumption. The relative cost of communication is expected to increase dramatically, both in terms of latency/overhead and of consumed energy. Re-designing algorithms for HPC systems to ensure resilience and to reduce energy consumption will be crucial to achieving sustained performance.

The link between resilience and energy must also be carefully tackled. Better resilience often requires redundancy (replication and/or checkpointing, rollback, and recovery), which consumes extra energy. Hot cores may lead to less resilient computing or increase the probability of individual failures. On the other hand, reducing energy consumption via voltage/frequency scaling techniques will increase the application running time, and hence the expected number of failures during execution.

This workshop will encompass a broad range of topics related to resilience and energy efficiency for HPC. Its objective is to facilitate the exchange of valuable information and ideas among researchers and practitioners. Topics of interest include (but are not limited to):
- Fault-tolerant algorithms, tools, and protocols
- Checkpointing, replication, and recovery techniques
- Detection and prediction of soft errors and SDCs
- System reliability, testing, and verification
- Resilience models, algorithms, and simulations
- Energy-efficient scheduling and resource management
- Power-aware runtime systems
- Energy-efficient I/O, storage, and networking
- Thermal behavior modeling, control, and management
- Cooling-aware optimizations and evaluations
- Tradeoffs between performance, reliability, energy, and temperature

Important Dates: 

Paper Submission:   August 22, 2016 (Final Extension)
Author Notification: September 15, 2016
Camera-ready Paper: October 1, 2016

Author Information:

Full papers following the guidelines of the International Green and Sustainable Computing Conference (IGSC) (http://igsc.eecs.wsu.edu/cfp_16) are sought. Authors should select Resilience and/or Energy-aware techniques for High-Performance Computing (RE-HPC'16) when submitting their papers on EasyChair (https://easychair.org/conferences/?conf=rehpc16). All submitted manuscripts will be reviewed and evaluated on correctness, originality, technical strength, significance, quality of presentation, and interest and relevance to the scope of the workshop.

Papers presented at the workshop will be published in the official conference proceedings (through IEEE Digital Library) contingent on two conditions: (1) one author of each accepted paper must register for the conference at the time of the submission of the final manuscript, and (2) one of the authors must appear to present the paper at the workshop. Please note that each accepted workshop paper will require a full IGSC registration at the IEEE member or non-member rate (NOT the student rate). There is no separate workshop-only registration.

Workshop Co-Chairs:

Anne Benoit, ENS de Lyon, France
Jean-Marc Pierson, University of Toulouse, France
Hongyang Sun, ENS de Lyon, France

Program Committee:

Guillaume Aupy, Vanderbilt University, USA
Leonardo Bautista-Gomez, Barcelona Supercomputing Center, Spain
Pascal Bouvry, University of Luxembourg, Luxembourg
Georges Da Costa, IRIT, University of Toulouse, France
Zhihui Du, Tsinghua University, China
Amina Guermouche, The University of Tennessee, Knoxville, USA
Sebastien Lafond, Åbo Akademi University and Turku Center for Computer Science, Finland
Hermann de Meer, University of Passau, Germany
Rami Melhem, University of Pittsburgh, USA
Ariel Oleksiak, Poznan Supercomputing and Networking Center, Poland
Dana Petcu, West University of Timisoara, Romania
Enrique Quintana-Orti, HPCA, Universitat Jaume I, Spain
Leonel Sousa, INESC, Portugal
Patricia Stolf, IRIT, University of Toulouse, France


Please email hongyang.sun@ens-lyon.fr with any questions.


