HPC 2012
High Performance Computing, Grids and Clouds
An International Advanced Workshop
June 25-29, 2012

Final Programme
International Programme Committee

Organizing Committee

Sponsors

IBM
HEWLETT PACKARD
MICROSOFT
JUELICH SUPERCOMPUTING CENTER, Germany
Nvidia Corporation
Advance project
Amazon Web Services
   Free Amazon Web Services credits for all HPC 2012 delegates: Amazon is very pleased to be able to provide $200 in service credits to all HPC 2012 delegates. Amazon Web Services provides a collection of scalable high performance and data-intensive computing services, storage, connectivity and integration tools. From GPUs to tightly coupled workloads on EC2, and from 50k-core scale-out systems to map/reduce and Hadoop, utility computing is a good fit for a variety of HPC workloads. For more information, visit: http://aws.amazon.com/hpc
Bull
The Chain Project
Convey Computer
Cray Inc.
E4 Computer Engineering
ENEA – Italian National Agency for New Technologies, Energy and the Environment (t.b.c.)
eXludus
Fujitsu
Loongson
National Research Council of Italy, ICAR - Institute for High Performance Computing and Networks
2012 Media Sponsors
HPCwire is the #1 resource for news and information from the high performance computing industry. HPCwire continues to be the portal of choice for business and technology professionals from the academic, government, industrial and vendor communities who are interested in high performance and computationally-intensive computing, including systems, software, tools and applications, middleware, networking and storage. To receive your complimentary subscription, visit: http://www.hpcwire.com/xs/register.

HPC in the Cloud is the only portal dedicated to covering data-intensive cloud computing in science, industry and the data center. The publication provides technology decision-makers and stakeholders in the high performance computing industry (spanning government, industry, and academia) with the most accurate and current information on developments happening at the point where high performance and cloud computing intersect. Subscribing is free! Visit: http://www.hpcinthecloud.com/xs/register.

Datanami is a news portal dedicated to providing insight, analysis and up-to-the-minute information about emerging trends and solutions in big data. The portal sheds light on all cutting-edge technologies including networking, storage and applications, and their effect upon business, industry, government, and research. The publication examines the avalanche of unprecedented amounts of data and the impact the high-end data explosion is having across the IT, enterprise, and commercial markets. Subscriptions are complimentary! Visit: www.datanami.com
 
Speakers

Frank Baetke, Global HPC Programs Academia and Scientific Research, Hewlett Packard, Palo Alto, CA, USA
Natalie Bates, Energy Efficient HPC Working Group
Bill Blake, Cray Inc.
Marian Bubak, Department of Computer Science and ACC Cyfronet, AGH, and Informatics Institute, Amsterdam, The Netherlands
Charlie Catlett, Math & Computer Science Div. and Computation Institute
Alok Choudhary, Northwestern University, Evanston, IL, USA
Marcello Coppola, STMicroelectronics, Advanced System Technology Grenoble Lab
Timothy David, Centre for Bioengineering, Christchurch, New Zealand
Erik D’Hollander, Ghent University, Belgium
Beniamino Di Martino, Dipartimento di Ingegneria dell'Informazione, Seconda Università di Napoli, Aversa (CE)
Jack Dongarra, Innovative Computing Laboratory, University of Tennessee, USA
Sudip S. Dosanjh, Sandia National Labs, Albuquerque, NM, USA
Ron Dror, D. E. Shaw Research, New York, USA
Massimiliano Fatica, NVIDIA Corporation, Santa Clara, CA, USA
Ian Foster, Argonne National Laboratory and Dept. of Computer Science, The University of Chicago, Argonne & Chicago, IL, USA
Geoffrey Fox, Community Grid Computing Laboratory
Guang Gao, Department of Electrical and Computer Engineering, Newark, Delaware, USA
Carlos Garcia Garino, Information and Communication Technologies Institute, Universidad Nacional de Cuyo
Dale Geldart, eXludus Technologies, Inc., Corporate Headquarters, Montréal, Québec
Wolfgang Gentzsch, HPC Consultant, formerly SUN Microsystems
Vladimir Getov
Alfredo Goldman, Department of Computer Science
Sergei Gorlatch, Universitaet Muenster, Institut fuer Informatik, Muenster
Weiwu Hu, Institute of Computing Technology
Peter Kacsuk, MTA SZTAKI
Odej Kao, Complex and Distributed IT Systems, Technische Universität
Janusz Kowalik, University of Gdansk, Poland
Thomas Lippert, Institute for Advanced Simulation and John von Neumann Institute for Computing (NIC); also European PRACE IP Projects and the DEEP Exascale Project
Yutong Lu
Bob Lucas, Computational Sciences Division, Information Sciences Institute
Patrick Martin, School of Computing, Queen’s University
Ken Miura, Center for Grid Research and Development, National Institute of Informatics
Jean-Pierre Panziera, Extreme Computing Division, Bull
Valerio Pascucci, Center for Extreme Data Management, Analysis and Visualization, Scientific Computing and Imaging Institute, School of Computing
Dana Petcu, Computer Science Department
Tadeusz Puzniakowski
Judy Qiu, Pervasive Technology Institute, Bloomington, IN, USA
Mark Seager, INTEL Corporation, Santa Clara, CA, USA
Alex Shafarenko, Department of Computer Science, Hatfield
Sunil Sherlekar, Parallel Computing Research, INTEL Labs
Thomas Sterling
Alex Szalay, Department of Physics and Department of Computer Science, Baltimore, MD, USA
Gregory Tallant, Lockheed Martin Aeronautics Company
Kenji Takeda, Microsoft Research, Cambridge, UK
Domenico Talia, Dept. of Electronics, Informatics and Systems
Yoshio Tanaka, AIST – National Institute of Advanced Industrial Science and Technology, Tsukuba
William M. Tang, Dept. of Astrophysical Sciences, Plasma Physics Section, Fusion Simulation Program and Princeton Institute for Computational Science and Engineering, Princeton, USA
Jose Luis Vazquez-Poletti, Dpt. de Arquitectura de Computadores y Automática, Universidad Complutense de Madrid
Steve Wallach, Convey Computer Corporation
Amy Wang, Institute for Interdisciplinary Information Sciences
Akinori Yonezawa, RIKEN Advanced Institute of Computational Science and Department of Computer Science, Japan
 
Workshop Agenda
Monday, June 25th
9:00 – 9:10    Welcome Address

Session: State of the Art and Future Scenarios
9:15 – 9:45    J. Dongarra: On the Future of High Performance Computing: How to Think for Peta and Exascale Computing
9:45 – 10:15   I. Foster
10:15 – 10:45  G. Fox: Scientific Computing Supported by Clouds, Grids and Exascale Systems
10:45 – 11:15  K. Takeda
11:15 – 11:45  COFFEE BREAK
11:45 – 12:15  A. Szalay
12:15 – 12:45  S. Wallach
12:45 – 13:00  CONCLUDING REMARKS

Session: Emerging Computer Systems and Solutions
17:00 – 17:30  F. Baetke
17:30 – 18:00  J.P. Panziera
18:00 – 18:30  W. Gentzsch
18:30 – 19:00  COFFEE BREAK
19:00 – 19:30  B. Blake: Supercomputing and Big Data: where are the real boundaries and opportunities for synergy
19:30 – 20:00  S. Wallach
20:00 – 20:10  CONCLUDING REMARKS
 
Tuesday, June 26th
Session: Advances in HPC Technology and Systems I
9:00 – 9:25    S. Sherlekar
9:25 – 9:50    W. Hu
9:50 – 10:15   D. Geldart
10:15 – 10:40  M. Coppola: From Multi-Processor System-on-Chip to High Performance Computing
10:40 – 11:05  E. D’Hollander: Programming and Performance of a combined GPU/FPGA Super Desktop
11:05 – 11:35  COFFEE BREAK
11:35 – 12:00  M. Fatica: Efficient utilization of computational resources in hybrid clusters
12:00 – 12:25  J. Kowalik: Is heterogeneous computing a next mainstream technology in HPC?
12:25 – 12:50  T. Puzniakowski
12:50 – 13:00  CONCLUDING REMARKS

Session: Advances in HPC Technology and Systems II
17:00 – 17:30  S. Gorlatch: A Uniform High-Level Approach to Programming Systems with Many Cores and Multiple GPUs
17:30 – 18:00  G. Gao: A Codelet Based Execution Model and Its Memory Semantics
18:00 – 18:30  M. Bubak: Environments for Collaborative Applications on e-Infrastructures
18:30 – 19:00  COFFEE BREAK
19:00 – 19:30  A. Yonezawa: Applications on K computer and Advanced Institute of Computational Science
19:30 – 20:00  K. Miura
20:00 – 20:10  CONCLUDING REMARKS
 
Wednesday, June 27th
Session: Software and Architecture for Extreme Scale Computing I
9:00 – 9:30    M. Seager
9:30 – 10:00   R. Nair
10:00 – 10:30  T. Sterling
10:30 – 11:00  B. Lucas
11:00 – 11:30  COFFEE BREAK
11:30 – 12:00  S. Dosanjh
12:00 – 12:30  T. Lippert: The EU Exascale Project DEEP - Towards a Dynamical Exascale Entry Platform
12:30 – 13:00  Y. Lu
13:00 – 13:10  CONCLUDING REMARKS

Session: Software and Architecture for Extreme Scale Computing II
16:30 – 17:00  W. Tang: Extreme Scale Computational Science Challenges in Fusion Energy Research
17:00 – 17:30  N. Bates
17:30 – 18:00  COFFEE BREAK
18:00 – 20:00  PANEL DISCUSSION: Five years into exascale exploration: what have we learned? Chairman: P. Messina. Participants: F. Baetke, N. Bates, W. Blake, S. Dosanjh, T. Lippert, Y. Lu, B. Lucas, K. Miura, R. Nair, M. Seager, T. Sterling, W. Tang, S. Wallach
 
Thursday, June 28th
Session: Cloud Computing Technology and Systems I
9:00 – 9:25    V. Getov
9:25 – 9:50    P. Martin
9:50 – 10:15   J. Vazquez-Poletti
10:15 – 10:40  O. Kao
10:40 – 11:05  D. Talia: A Cloud Framework for Knowledge Discovery Workflows on Azure
11:05 – 11:35  COFFEE BREAK
11:35 – 12:00  G. Fox: FutureGrid exploring Next Generation Research and Education
12:00 – 12:25  P. Kacsuk: Executing Multi-workflow simulations on a mixed grid/cloud infrastructure using the SHIWA Technology
12:25 – 12:50  D. Petcu: Open-source platform-as-a-service: requirements and implementation challenges
12:50 – 13:00  CONCLUDING REMARKS

Session: Cloud Computing Technology and Systems II
15:45 – 16:10  Y. Tanaka: Building Secure and Transparent Inter-Cloud Infrastructure for Scientific Applications
16:10 – 16:35  J. Qiu
16:35 – 17:00  A. Goldman
17:00 – 17:30  COFFEE BREAK

Session: BIG DATA and Data-Intensive Computing
17:30 – 17:55  V. Pascucci
17:55 – 18:20  W. Gentzsch: EUDAT - European scientists and data centers turn to big data collaboration
18:20 – 18:45  C. Catlett: Smart Cities and Opportunities for Convergence of Open Data and Computational Modeling
18:45 – 19:10  A. Choudhary: Discovering Knowledge from Massive Social Networks and Science Data - Next Frontier for HPC
19:15 – 20:15  PANEL DISCUSSION: Cloud Computing and Big Data: Challenges and Opportunities. Chairmen: C. Catlett and V. Getov. Participants: A. Choudhary, P. Martin, V. Pascucci, D. Talia
 
Friday, June 29th
Session: Challenging Applications of HPC, Grids and Clouds
9:00 – 9:25    G. Tallant: High Performance Computing Challenges from an Aerospace Perspective
9:25 – 9:50    T. David: Macro-scale phenomena of arterial coupled cells: a Massively Parallel simulation
9:50 – 10:15   R. Dror
10:15 – 10:40  C. Garcia Garino: Job scheduling of parametric computational mechanics studies on cloud computing infrastructure
10:40 – 11:05  V. Pascucci
11:05 – 11:35  COFFEE BREAK

Session: Advanced Infrastructures and Projects of HPC, Grids and Clouds
11:35 – 12:00  B. Di Martino
12:00 – 12:25  A. Wang: Smart Sensing for Discovering and Reducing Energy Wastes in Office Buildings
12:25 – 12:50  A. Shafarenko: Project ADVANCE: Ant Colony Optimisation (ACO) using coordination programming based on S-Net
12:50 – 13:00  CONCLUDING REMARKS
 
CHAIRMEN
Paul Messina, Argonne National Laboratory, Argonne, IL
Gerhard Joubert
Jack Dongarra, Innovative Computing Laboratory, University of Tennessee
Ian Foster, Argonne National Laboratory and Department of Computer Science, The University of Chicago, USA
Bill Blake, Cray Inc.
Wolfgang Gentzsch, HPC Consultant, formerly SUN Microsystems
Bob Lucas, Computational Sciences Division, Information Sciences Institute
Patrick Martin, School of Computing, Queen’s University
PANELS
Five years into exascale exploration: what have we learned?
It has already been five years since the first three workshops on exascale computing were organized. Literally dozens of additional workshops on various aspects of exascale computing have been held, research and development efforts have been launched by various countries, computer manufacturers have worked on roadmaps that would lead to affordable exascale systems, and computational scientists have identified myriad exciting advances that such systems would enable. What lessons have we learned from these activities that might help guide the considerable additional R&D that is needed on component technologies, system architecture integration, programming models, and system and application software? The panelists will voice their opinions about the lessons learned and debate the most fruitful future directions.
Chairman: P. Messina
Panelists: F. Baetke, N. Bates, W. Blake, S. Dosanjh, T. Lippert, Y. Lu, B. Lucas, K. Miura, R. Nair, M. Seager, T. Sterling, W. Tang, S. Wallach
 
Cloud Computing and Big Data: Challenges and Opportunities
Cloud computing represents a fundamental shift in the delivery of information technology services and has been changing the computing landscape over the last several years. Concurrently, an increasing number of application areas are grappling with challenges related to the scale and/or complexity of data, collectively called "big data" challenges. In both areas we see commercial successes as well as continuing research challenges. What are the overlaps between cloud computing, particularly at global scale, and big data? Is there room for working towards joint solutions? What classes of "big data" problems can be addressed via a cloud approach, and are there classes of data that are less effectively handled in a cloud environment? In this panel session, each of the panelists will present their position statements covering certain important aspects of this subject, followed by a discussion of future directions for research and development.
Chairmen: C. Catlett and V. Getov
Participants: A. Choudhary, P. Martin, V. Pascucci, D. Talia
 
ABSTRACTS
| 
On the Future of High Performance Computing: How to Think for Peta and Exascale Computing. Jack Dongarra. In this talk we examine how high performance computing has changed over the last ten years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile-time and run-time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run-time environment variability will make these problems much harder.
 
| 
  
   Ian
  Foster Computation Institute Argonne National Laboratory &  We have made much progress over the past decade
  toward effectively harnessing the collective power of IT resources
  distributed across the globe. In fields such as high-energy physics,
  astronomy, and climate, thousands benefit daily from tools that manage and
  analyze large quantities of data produced and consumed by large collaborative
  teams. But we now face a far greater challenge: Exploding
  data volumes and powerful simulation tools mean that far more—ultimately
  most?--researchers will soon require capabilities not so different from those
  used by these big-science teams. How is the general population of researchers
  and institutions to meet these needs? Must every lab be filled with computers
  loaded with sophisticated software, and every researcher become an
  information technology (IT) specialist? Can we possibly afford to equip our
  labs in this way, and where would we find the experts to operate them? Consumers and businesses face similar challenges, and
  industry has responded by moving IT out of homes and offices to so-called
  cloud providers (e.g., Google, Netflix, Amazon, Salesforce),
  slashing costs and complexity. I suggest that by similarly moving research IT
  out of the lab, we can realize comparable economies of scale and reductions
  in complexity. More importantly, we can free researchers from the
  burden of managing IT, giving them back their time to focus on research and
  empowering them to go beyond the scope of what was previously possible. I describe work we are doing at the Computation
  Institute to realize this approach, focusing initially on research data
  lifecycle management. I present promising results obtained to date with the Globus Online system, and suggest a path towards
  large-scale delivery of these capabilities.  | 
 
| 
   Scientific Computing Supported by Clouds,
  Grids and Exascale Systems Geoffrey
  Fox Community Grid Computing Laboratory We analyze scientific computing into classes of
  applications and their suitability for different architectures covering both
  compute and data analysis cases and both high end and long tail users. We
  propose an architecture for next generation Cyberinfrastructure
  and outline some of the research challenges.  | 
 
| 
Cloud computing for research and innovation. Kenji Takeda, Microsoft Research Connections EMEA. Cloud computing is challenging the way we think about parallel and distributed computing, particularly in the context of HPC and the Grid. It opens up many possibilities for how
  research, development and businesses can exploit compute, storage and
  services on-demand to exploit new opportunities across the whole spectrum of
  applications and domains. In this talk we discuss how the community has been
  exploring the use of Cloud Computing, including through the European Union
  Framework Programme 7 VENUS-C project, and the global Azure Research Engagement
  programme. We conclude with thoughts on how cloud computing is potentially
  reshaping the landscape of research and innovation.  | 
 
| 
Extreme Data-Intensive Scientific Computing. A. Szalay, Department of Physics and Department of Computer Science. Scientific computing is increasingly revolving around massive amounts of data. From physical sciences to numerical simulations to high throughput genomics and homeland security, we are soon dealing with Petabytes if not Exabytes of data. This new, data-centric computing requires a new look at computing architectures and strategies. We will revisit Amdahl's Law establishing the relation between CPU and I/O in a balanced computer system, and use this to analyze current computing architectures and workloads. We will discuss how existing hardware can be used to build systems that are much closer to an ideal Amdahl machine. We have deployed various scientific test cases, mostly drawn from astronomy, over different architectures and compare performance and scaling laws. We discuss a hypothetical cheap, yet high performance multi-petabyte system currently under consideration at JHU. We will also explore strategies of interacting with very large amounts of data, and compare various large scale data analysis platforms.
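For readers unfamiliar with the balance rule the abstract revisits, one common statement of it is the "Amdahl number" form popularised by Gray and Szalay; this is a sketch of that form, not necessarily the exact formulation used in the talk:

    \[
      \alpha_{\mathrm{I/O}} \;=\; \frac{\text{bits of sequential I/O per second}}{\text{instructions per second}},
      \qquad \alpha_{\mathrm{I/O}} \approx 1 \ \text{for a balanced ("Amdahl") machine.}
    \]

Typical compute-centric clusters fall far below this ratio, which is the imbalance the data-centric architectures discussed above aim to correct.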
 
| 
  
   Steve
  Wallach Convey Computer Corporation Big Data has been processed for decades. Classically
  the database size was constant or 
  gradually increasing. With the advent of searching, directed
  advertisement, social networking, worldwide electronic messaging and
  web-based applications, the database increases in real-time. This coupled
  with the availability of petabytes of storage,
  naturally leads to the need for new types of power aware computer
  architectures and knowledge discovery algorithms. This talk will focus on new types of algorithms, and
  architectures that are dynamically chosen based on the data type and data
  base size.  | 
 
| 
   Technology Trends in High Performance
  Computing Frank Baetke Global HPC Programs Academia and Scientific Research Hewlett Packard Palo
Alto, CA, USA. HP's HPC product portfolio, which has always been based on standards at the processor, node and interconnect level, has led to a successful penetration of the High Performance Computing market across all application segments. The rich portfolio of the ProLiant BL-series and the well-established rack-based ProLiant DL family of nodes has been complemented by the SL-series with proven Petascale scalability and leading energy efficiency. Very recently this portfolio has been extended by a new family of servers announced under the name “Moonshot”. Power and cooling efficiency is primarily an issue of cost, but also extends to the power and thermal density of what can be managed in a data center. To leverage economies of scale, established HPC centers as well as providers of HPC Cloud services are evaluating new concepts which have the potential to make classical data center designs obsolete. Those new concepts provide significant advantages in terms of energy efficiency, deployment flexibility and manageability. Examples of this new approach, often dubbed POD for Performance Optimized Datacenter, including a concept to scale to multiple PFLOPS at highest energy efficiency, will be shown. Finally, an outlook will be given towards system families due at the end of the decade that will provide performance in excess of 1000 Petaflops, or 1 Exaflop.
 
| 
   Efficient Architecture for Exascale Applications Jean-Pierre
  Panziera Extreme Computing Division Bull, France Now that more Petaflop
  systems are becoming available, the HPC industry is turning to the next
  challenge: Exascale.   | 
 
| 
  
   Wolfgang
Gentzsch, Executive HPC Consultant, Fujitsu (external). With the K Computer installed at the RIKEN Advanced Institute for Computational Science in Kobe, Japan, and its commercial counterpart PRIMEHPC FX-10 having joined the top systems, the x86-based PRIMERGY systems are completing the pyramid's mid and bottom layers for mainstream HPC. All Fujitsu
  HPC systems are bundled into user-friendly ready-to-go solutions consisting
  of HPC hardware, middleware, HPC portal, and services, providing ease-of-use
HPC for the different application segments. In addition, collaborations are enabled by the SynfiniWay integrated software framework for virtualized distributed HPC. This
  presentation aims at providing an overview of Fujitsu’s HPC solution
  portfolio, from top-end supercomputing, to mid-market HPC, and technical
  cloud computing. We will demonstrate how Fujitsu as the world's third-largest
  IT services provider drives innovation in high performance computing for
  industry and research.  | 
 
| 
   Supercomputing and Big Data: where are the real boundaries
and opportunities for synergy. Bill Blake, CTO and SVP, Cray, Inc. Supercomputing provides an increasingly high-fidelity view of the world
  through numerically intensive modeling and
  simulation techniques that support complex decision making and discoveries in
  the scientific and technical fields. Big Data Analytics, as it is called
  today, also provides an accurate view of the world through data intensive
  search, aggregation, sorting and grouping techniques that support complex
  decision making and knowledge discovery in the web and business transaction
  fields. The talk will explore the architectures, data models and programming
models of Supercomputing and Big Data and, in particular, the implications for Cray's Adaptive Supercomputing Vision.
 
| 
  
   Steve Wallach Convey Computer Corporation An overview of the architectural aspects, both
hardware and software, of Convey’s thrust into data-intensive computing.
 
| 
  
   A confluence of Technology, Architectures &
  Algorithms Sunil Sherlekar Parallel Computing Research INTEL Labs For an engineer
  engaged in the design of (say) an aircraft, the ideal design tool is a
  computational Appliance — one that
  would be optimised for his/her computational needs in terms of performance,
  cost of capital (hardware), cost of operation (power consumption) and user
  interface. For aircraft design, these computational needs would typically
  involve computing the lift that the wings would generate and the atmosphericdrag that the aircraft would experience. A
  similar scenario can be painted for any designer who uses simulation and
  optimisation in his/her design flow. A custom-built appliance — right down to
  the compute engines in silicon — is, however, an expensive proposition, both
  in terms of design and fabrication. This becomes more so as we progress
  further into nanometre semiconductor fabrication technologies. Over the
  years, therefore, using general-purpose compute engines or processors has
become commonplace. For the last three decades or so, processors have shown a steady improvement in performance. Most of this is an outcome of steadily increasing clock frequency. A virtuous cycle has been established between the semiconductor industry and
  application developers: while application developers eagerly use up
  increasing processor performance, they also set the expectation of higher
  performance from future processors. Over the
  last few years, however, this fairy-tale-like increase in clock frequency has
  hit a wall. This is because increasing clock frequency means increasing power
  consumption. Besides the economic downside of higher operating costs for HPC,
  this has now created the additional problem of dissipating the resulting
  heat. The only
  way to tackle the problem of heat dissipation is to produce less heat! The
  only way to produce less heat is to operate the processors at a lower clock
  frequency and lower operating voltage. If this is done, it also means — unfortunately
  — that each processor also has a lower performance! A lower performance at
  the system level is, of course, not
  acceptable. The
  semiconductor industry has tackled this dilemma by providing increasing
  performance through a technique that the HPC community has always used:
  increasing parallelism! The increasing proliferation of multi-core and
  many-core chips is a result of this strategy. The
  multi-core chips from Intel’s Xeon family provide for fine-grained
  parallelism through vector instructions or SIMD and coarse-gained parallelism
  through several cores on the same chip. This idea is taken further in Intel’s
  Knights or MIC (Many-Integrated Core) family. MIC provides for even greater
  parallelism through a larger SIMD width and a much larger number of cores on
  one chip. KNC, the first in this family to be made commercially available,
provides a 1 TF performance on DGEMM as announced during SC’11. In the future, as we go into smaller fabrication process geometries, increasing performance will be provided through increasing parallelism on a chip while attempting to keep the power dissipation per chip constant. This will require addressing several issues:
- Reducing operating voltage while avoiding bit errors, or minimising them and handling them at “higher levels” through error correction.
- Reducing bus power by using techniques such as current-mode signalling.
- Developing circuit design techniques to handle variations in transistor characteristics with a minimum impact on performance.
- Avoiding clocking and using “transition signalling” where possible.
The other
  serious “wall” the semiconductor industry faces today is that of moving data.
  This problem has two facets. One, while the speed of moving data is
  increasing, it is not keeping pace with the increasing speed of computation.
  This is true both for moving data to and from memory into processors and for
  moving data between compute nodes in a system. This means the overall speed
  of computation is increasingly being limited by the bandwidth of memory and
  of interconnect networks. Secondly, the reduction in power consumed per unit
  of computation is happening faster than the reduction in power consumed to
  move data. This means it is getting increasingly cheaper, in terms of power
  consumption, to perform computation on data than to move it around! The
  technologies being pursued by the semiconductor industry to tackle the data
movement wall include:
- Bringing the memory closer to the processors and increasing the data bus width by using the chip area and not just the perimeter (3D chip stacking with TSVs, or Through-Silicon Vias).
- Increasing the data rate by using optical signals. While Intel’s silicon photonics technologies help achieve this, electro-optical conversion at a miniaturised level is still a challenge.
- Better interconnect topologies.
- Obviating the constraints of topologies by using free-space communication with steered laser beams: still up in the air!
Even if
  all of the above technologies were to bear fruition, the problem will only be
  alleviated; it is quite unlikely that it will actually go away. The key,
therefore, is to develop “communication avoiding” algorithms — those that
  reduce data movement even at the cost of increased computation. This can be
  done at several levels of abstraction. Going
  forward, we at Intel are committed to expand our design strategy to encompass
  a top-down approach. This means designing architectures that explicitly take the requirements of
  application developers into account. In the near to mid-term, the following
are some of the ideas that may deserve consideration:
- Should the high-speed memory that can be created using 3D chip stacking be a program-addressable memory or a (last-level) cache?
- Should we have cache memory at all, or should all memory be program-addressable? What are the implications for power consumption?
- For a program-addressable memory hierarchy — when the data traffic is program generated and not for cache coherence — what on-chip interconnect architectures would be most suitable?
- If all memory is program-addressable, can compiler technology alleviate the programmer’s burden to manage data transfer between various levels of memory?
- If, say for legacy reasons, it is necessary to have a cache hierarchy, would it help if the cache replacement policy were to take care of the data access patterns of a given application? Can data access patterns be characterised for this purpose? Would it still help to allow the programmer to define his/her own cache replacement policy?
- With a large — and perhaps increasing — SIMD width such as that on Intel’s MIC processors, would it help if, instead of SIMD, we could carry out more than one operation on different parts of the SIMD register? In particular, is VLIW better than SIMD? Should the architecture allow a programmer-controlled, application-driven trade-off?
- Are hardware blocks specific to application domains a good idea?
The point
  about all the above ideas — and many others — is not that they are
  particularly radical. It is that evaluating their impact in terms of various
  applications needs a huge investment in design time and prototyping costs. If
  this analysis can be carried out without the need of prototyping, it would be
  a great boon. As a first step, it would help create a formal description of
  hardware that is more abstract than RTL so as to be tractable but less
  abstract than ISA so as to be useful. As a
  company, Intel’s commitment to application-driven architecture design is
  enabled by the fact that we can optimise all aspects of the design and
  fabrication process. In the final analysis, the biggest problems that need to
  be solved in the long-term are
  those that involve fabrication. We are also committed to ensure backward
compatibility (to support all “legacy applications”) and to support a
  continuity of programming paradigms (to minimise programming effort). This
  brings us back to the issue of providing Design
  Appliances which are tailored to specific application domains. Especially
  with the increasing cost of foundries that cater to nanometre-scale
  geometries, it seems impractical to use hardware that is application
  specific. We can arrive at a solution, however, by looking at the exact requirements
of an HPC appliance:
- Efficient computing that can solve HPC problems in a reasonable amount of time at a reasonable cost.
- A user interface that is tailored to the application domain and “talks” the language of the domain (instead of the language of computer science or electronics).
- A service that is provided on-demand and independent of the location of the user.
A
  possible way of providing such appliances would be to use the “Cloud” model.
This would entail:
- Setting up several petascale HPC systems based on standard, general-purpose processors and a generous repertoire of application software.
- Connecting these systems to one another and to all the users through a high-speed network.
- Implementing application-specific user-interface software on end-user devices for visualisation and interaction with the application software on the HPC systems.
Besides
  the continuing improvements in computing technologies, creating such
appliances will need:
a) Developing highly reliable, truly high-bandwidth wireless communication technologies. This is needed to support the transfer of the huge amounts of data that some applications generate to end-user devices on the go, and
b) Flexible display panels that can be rolled up or folded to be easily carried, and temporarily pinned or stuck on walls for use. This is to support high-quality visualisation of simulation results on the go.
If this
  is done, we would have created, for each application domain, a Virtual Appliance — something that
  combines the customised experience of a real appliance with the economy of a
  general-purpose shared system.  | 
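As a concrete, if simplified, illustration of the data-movement theme above: the classic cache-blocked matrix multiply reorganises the loop nest so that each operand tile is reused many times while it is resident in cache, cutting traffic to main memory without changing the arithmetic. This is a generic textbook sketch in C (the sizes N and B are arbitrary), not code from the talk.

    #include <stdio.h>
    #include <stdlib.h>

    #define N 512   /* matrix dimension (arbitrary, divisible by B) */
    #define B 64    /* tile size chosen so a few BxB tiles fit in cache */

    /* Naive triple loop: rows of A and columns of Bm are streamed from main
       memory over and over, so the kernel is bandwidth bound. */
    static void matmul_naive(const double *A, const double *Bm, double *C) {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                double s = 0.0;
                for (int k = 0; k < N; k++)
                    s += A[i*N + k] * Bm[k*N + j];
                C[i*N + j] = s;
            }
    }

    /* Blocked (tiled) version: the same arithmetic, but each tile fetched
       from memory is reused ~B times before eviction, so far less data moves. */
    static void matmul_blocked(const double *A, const double *Bm, double *C) {
        for (int i = 0; i < N*N; i++) C[i] = 0.0;
        for (int ii = 0; ii < N; ii += B)
            for (int kk = 0; kk < N; kk += B)
                for (int jj = 0; jj < N; jj += B)
                    for (int i = ii; i < ii + B; i++)
                        for (int k = kk; k < kk + B; k++) {
                            double a = A[i*N + k];
                            for (int j = jj; j < jj + B; j++)
                                C[i*N + j] += a * Bm[k*N + j];
                        }
    }

    int main(void) {
        double *A  = malloc(sizeof(double)*N*N);
        double *Bm = malloc(sizeof(double)*N*N);
        double *C  = malloc(sizeof(double)*N*N);
        double *D  = malloc(sizeof(double)*N*N);
        for (int i = 0; i < N*N; i++) { A[i] = 1.0; Bm[i] = 2.0; }
        matmul_blocked(A, Bm, C);
        matmul_naive(A, Bm, D);
        printf("blocked C[0]=%g, naive C[0]=%g (expected %g)\n",
               C[0], D[0], 2.0 * N);
        free(A); free(Bm); free(C); free(D);
        return 0;
    }

Communication-avoiding algorithms in the sense used above push the same idea further, sometimes accepting redundant computation in exchange for even less data movement.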
 
| 
   The Chinese Godson Microprocessor for HPC Weiwu HU Institute of Computing Technology The presentation will briefly introduce the
Godson CPU roadmap for high performance computers (HPC). Servers and HPCs used the same CPU before 2012. Against the background of building a 100 PFLOPS HPC system in 2015, the CPU for HPC should reach TeraFLOPS performance. Different CPUs will be designed for servers
  and HPCs. Server CPU will take the traditional
  multi-core architecture, while HPC CPU will take many-core or long vector
  architecture. Bandwidth limitation and power consumption limitation
  will be the big challenge for HPC CPU design.  | 
 
| 
  
Dale Geldart, eXludus Technologies, Inc., Corporate Headquarters, Montréal, Québec. As core counts continue to rise, the need to safely and reliably run
  more concurrent tasks on each system also increases if we are to maximize
  processor and energy efficiency. Concurrently running more tasks, however,
  can lead to increased shared resource conflicts that can degrade efficiency,
  especially as in many cases memory per core is decreasing, which puts more
  pressure on memory resources. New lightweight micro-virtualization strategies
  can help users improve system efficiency while avoiding these shared resource
  conflicts.  | 
 
| 
   From Multi-Processor System-on-Chip to High
  Performance Computing Marcello
Coppola, STMicroelectronics, Advanced System Technology, Grenoble Lab. Current high-end multicore architectures, when designed for maximum speed, waste available transistors, computation time, memory bandwidth and pipeline flow (optimized for sequential operation), resulting in a power efficiency that is one or two orders of magnitude away from what HPC demands. Today, architectures designed for mobile and embedded systems, employing energy-efficient components, represent a valid alternative to standard multicore architectures. In this presentation, we first present some examples of MPSoC architectures used in high-end consumer markets. Next, we
  introduce how technology and innovative heterogeneous architecture could be
  used to implement modern HPC. Finally we conclude the presentation showing
  the power of MPSoC architectures in delivering
  substantial performance improvements in high-performance computing
  applications.  | 
 
| 
   Programming
and Performance of a combined GPU/FPGA Super Desktop. Erik D’Hollander. The high performance of GPUs has made personal supercomputing a reality in many applications exhibiting single-program multiple-data parallelism.
  Programs with less obvious parallelism may be accelerated by field
  programmable gate arrays or FPGAs, which complement
  the computing power by a very flexible and massively parallel architecture. Field programmable
gate arrays provide a programmable architecture which allows an algorithm to be embedded into hardware and driven with data streams. A multicore
  CPU accelerated by GPUs and FPGAs
  is a hybrid heterogeneous system with a huge computational power and a large
  application area. We present a super
  desktop computer consisting of a GPU and two FPGAs
  and describe the interconnections, the tool chain and the programming
  environment. The performance of GPUs and FPGAs as accelerators
  of desktops and supercomputers is restricted by the traffic lanes between the
  processor and the accelerator. The roofline model by Williams et al. is able
  to represent both the raw computing performance and the input-output
  bottleneck in a single graph. Whereas the roofline is completely determined
  by the characteristics of processors with a fixed architecture, this is not the
  case for reconfigurable processing elements such as FPGAs.
  On the contrary, in this case the roofline model may be used to optimize the
  resource utilization and the input-output channels as to obtain the maximum
performance for a particular application. The design and quality of different hardware implementations of the same algorithm are enhanced by the strength of
  modern high-level synthesis tools such as AutoESL
  and ROCCC, which facilitate the development of powerful reconfigurable
  systems. We present the results of a number of image processing algorithms
  where the roofline model was used to obtain the maximum performance with a
  balanced resource usage and maximum input-output yield. It is shown that the
  modern high level tools vary significantly with respect to development time
  and performance of the resulting computational architecture.  | 
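For context, the roofline model referenced above (Williams et al.) bounds attainable performance by the lesser of the machine's peak compute rate and the bandwidth-limited rate; a sketch of the standard form, with symbols as commonly defined rather than taken from the talk:

    \[
      P_{\text{attainable}} \;=\; \min\bigl(P_{\text{peak}},\; \beta \times I\bigr)
    \]

where P_peak is the peak floating-point rate (flop/s), β is the sustained bandwidth of the limiting data channel in bytes/s (for an accelerator this may be the PCIe link rather than DRAM), and I is the kernel's operational intensity in flop/byte. For an FPGA, as the abstract notes, both the peak and the usable bandwidth depend on how the reconfigurable resources are allocated, so the roofline itself becomes a design variable.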
 
| 
   Efficient utilization of computational resources in
  hybrid clusters Massimiliano Fatica NVIDIA Corporation Santa Clara, CA,
USA. Hybrid clusters composed of nodes accelerated with Graphics Processing Units (GPUs) are moving quickly from the experimental stage into production systems. This talk will present two examples in which the computational workload is split between CPU cores and GPUs in order to fully utilize the computational capabilities of hybrid clusters. The first example will describe a library that accelerates matrix multiplications, currently used in the CUDA-accelerated HPL code and in quantum chemistry codes. The second example is from TeraTF, a CFD code that is part of the SPEC-MPI suite. In both cases close to optimal performance could be achieved by taking particular care of data movement and by using a combination of MPI, OpenMP and CUDA.
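Fatica's library and application codes are not reproduced here; the sketch below only illustrates, under stated assumptions, the general CPU/GPU work-splitting idea for a column-major DGEMM: the columns of C are partitioned so that a cuBLAS call on the GPU overlaps with a host BLAS call on the CPU cores. The matrix sizes and the split ratio gpu_frac are made-up tuning parameters, not values from the talk.

    #include <stdio.h>
    #include <stdlib.h>
    #include <cblas.h>
    #include <cuda_runtime.h>
    #include <cublas_v2.h>

    /* Hedged sketch: C = A*B (column-major), split by columns between GPU
       (cuBLAS) and CPU (host BLAS). gpu_frac is a hypothetical tuning knob. */
    int main(void) {
        const int m = 2048, n = 2048, k = 2048;   /* illustrative sizes */
        const double gpu_frac = 0.7;              /* assumed split ratio */
        const int n_gpu = (int)(n * gpu_frac);
        const int n_cpu = n - n_gpu;
        const double alpha = 1.0, beta = 0.0;

        double *A = malloc(sizeof(double)*m*k);
        double *B = malloc(sizeof(double)*k*n);
        double *C = malloc(sizeof(double)*m*n);
        for (long i = 0; i < (long)m*k; i++) A[i] = 1e-3;
        for (long i = 0; i < (long)k*n; i++) B[i] = 1e-3;

        /* Device buffers for A and the GPU's share of B and C. */
        double *dA, *dB, *dC;
        cudaMalloc((void**)&dA, sizeof(double)*m*k);
        cudaMalloc((void**)&dB, sizeof(double)*k*n_gpu);
        cudaMalloc((void**)&dC, sizeof(double)*m*n_gpu);
        cudaMemcpy(dA, A, sizeof(double)*m*k, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, B, sizeof(double)*k*n_gpu, cudaMemcpyHostToDevice);

        cublasHandle_t handle;
        cublasCreate(&handle);

        /* GPU part: C(:, 0:n_gpu) = A * B(:, 0:n_gpu); returns asynchronously. */
        cublasDgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n_gpu, k,
                    &alpha, dA, m, dB, k, &beta, dC, m);

        /* CPU part: remaining columns, computed by the (multithreaded) host BLAS,
           overlapping with the GPU DGEMM above. */
        if (n_cpu > 0)
            cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans, m, n_cpu, k,
                        alpha, A, m, B + (size_t)k*n_gpu, k, beta,
                        C + (size_t)m*n_gpu, m);

        /* Copying the GPU's share of C back waits for the GPU DGEMM to finish. */
        cudaMemcpy(C, dC, sizeof(double)*m*n_gpu, cudaMemcpyDeviceToHost);

        printf("C[0] = %g (expected %g)\n", C[0], 1e-3*1e-3*k);
        cublasDestroy(handle);
        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        free(A); free(B); free(C);
        return 0;
    }

Because cublasDgemm returns control to the host before the GPU finishes, the host DGEMM runs concurrently with it; in a real code the split ratio would be tuned or measured at runtime for each node.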
 
| 
   Is heterogeneous computing a next
  mainstream technology in HPC? Janusz Kowalik Heterogeneous
  computing is regarded as a technology on the path to the exascale
  computation. However current architectural and programming trends point to
significant changes that may replace the notion of heterogeneous computing by the classic idea of SMP with massive parallelism. Hence the answer to the title
  question is a good topic for a workshop discussion.  | 
 
| 
  
Tadeusz Puźniakowski. OpenCL is a relatively new standard that allows for computation on heterogeneous architectures. The first part of the presentation summarizes basic rules and
  abstractions used in OpenCL. The main part will
  contain the experimental results related to a linear algebra algorithm
  implemented with different methods of optimization and run on different
  hardware as well as the same algorithm run using OpenMP.  | 
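As a small illustration of the OpenMP side of the comparison described above (a generic sketch, not the author's benchmark code; the matrix size is arbitrary), here is a dense matrix-vector product parallelised with a single OpenMP directive:

    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    /* y = A*x with the row loop split across CPU cores by OpenMP. */
    int main(void) {
        const int n = 4096;                 /* illustrative problem size */
        double *A = malloc(sizeof(double)*n*n);
        double *x = malloc(sizeof(double)*n);
        double *y = malloc(sizeof(double)*n);
        for (int i = 0; i < n; i++) x[i] = 1.0;
        for (long i = 0; i < (long)n*n; i++) A[i] = 1.0;

        double t0 = omp_get_wtime();
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < n; i++) {
            double s = 0.0;
            for (int j = 0; j < n; j++)
                s += A[(long)i*n + j] * x[j];
            y[i] = s;
        }
        printf("y[0] = %g (expected %d), %d threads, %.3f s\n",
               y[0], n, omp_get_max_threads(), omp_get_wtime() - t0);
        free(A); free(x); free(y);
        return 0;
    }

Compile with, e.g., gcc -O2 -fopenmp. An OpenCL version of the same kernel additionally needs explicit platform/device setup, buffer management and a separate kernel source, which is part of the trade-off the talk examines.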
 
| 
   A Uniform High-Level Approach to Programming Systems with Many Cores
  and Multiple GPUs Universitaet Münster Institut für Informatik Application programming for modern heterogeneous
  systems which comprise multiple multi-core CPUs and GPUs
  is complex and error-prone. Approaches like OpenCL
  and CUDA are low-level and offer neither support for multiple GPUs within a stand-alone computer nor for systems that
  integrate several computers. Distributed systems require programmers to use a
  mix of different programming models, e.g., MPI together with Pthreads, OpenCL or CUDA. We propose a uniform approach based on the OpenCL standard for programming both stand-alone and
  distributed systems with GPUs. The approach is based on two parts: 1) the SkelCL library
  for high-level application programming on stand-alone computers with
  multi-core CPUs and multiple GPUs, and 2) the dOpenCL
  middleware for transparent execution of OpenCL
  programs on several stand-alone computers connected over a network. Both parts are built on top of the OpenCL standard which ensures their high portability
  across different kinds of processors and GPUs. The SkelCL library
  offers a set of pre-implemented patterns (skeletons) of parallel computation
  and communication which greatly simplify programming for multi-GPU systems.
  The library also provides an abstract vector data type and a high-level data
  (re)distribution mechanism to shield the programmer from the low-level data
transfers between a system's main memory and multiple GPUs. The
  dOpenCL middleware extends OpenCL,
  such that arbitrary computing devices (multi-core CPUs and GPUs) in a distributed system can be used within a single
  application, with data and program code moved to these devices transparently. In this talk, we describe SkelCL
  and dOpenCL and illustrate how they are used
  together to simplify programming of heterogeneous HPC systems with many cores
  and multiple GPUs.  | 
 
| 
   Environments
  for Collaborative Applications on e-Infrastructures Marian Bubak Department of
  Computer Science and ACC Cyfronet, AGH  Institute for
  Informatics,  Development and
execution of e-science applications is a very demanding task. They are collaborative, used in dynamic scenarios (similar to experiments), and there is a need to link them with publications [8]. Most of them are used to solve problems which are multi-physics and multi-scale, which results in various levels of coupling of application components. Besides being compute intensive, they are more and more often also data intensive. This talk presents
  and evaluates a few approaches to development and execution of such e-science
  applications on currently available e-infrastructures like grids and clouds
[1]. Resources of these infrastructures are shared between different
  organisations and may change dynamically, so there is a need for methods and
  tools to master them in an efficient way [2]. We present the
  WS-VLAM workflow system which aims at covering the entire life cycle of
scientific workflows: end-users are able to share workflows, reuse each other's workflow components, and execute workflows on resources across multiple
  organizations [3]. GridSpace [4] is a novel virtual laboratory framework
  enabling to conduct virtual experiments on grid-based infrastructures. It
  facilitates exploratory development of experiments by means of scripts which
  can be expressed in a number of popular languages, including Ruby, Python and
Perl. Among the most demanding applications are those from the area of the Virtual
  Physiological Human. Cloud Data and Compute Platform enables efficient
  development and execution of such applications by providing methods and tools
  to install services on available resources, execute workflows and standalone
  applications, and to manage data in a hybrid cloud-grid infrastructure [5]. Common Information
Space is a service-based framework for processing sensor data streams, running early warning system applications and managing their results. Although
  originally it was elaborated for building and running flood early warning
  systems, it may be applicable as an environment for any e-science
  applications [6]. On top of the GridSpace we have elaborated an environment for composing
  multi-scale applications [7] built from single scale models implemented as
scientific software components, distributed in various e-infrastructures. The application structure is described with the Multiscale Modelling
  Language (MML). The environment consists of a semantic-aware persistence
  store to record metadata about models and scales, a visual composition tool
  transforming high level MML description into executable GridSpace
  experiment, and finally, the GridSpace supports
  execution and result management of generated experiments. The talk will be
  concluded with an analysis and evaluation of these different approaches to
  construction of environments supporting collaborative e-science applications. References [1] M. Bubak, T. Szepieniec, K. Wiatr (Eds.): Building a National Distributed e-Infrastructure -
  Pl-Grid. Scientific and Technical Achievements. Springer, LNCS 7136, 2012. [2] J.T. Moscicki; M. Lamanna; M.T. Bubak and P.M.A. Sloot:
  Processing moldable tasks on the grid: Late job
  binding with lightweight user-level overlay, Future Generation Computer
  Systems, vol. 27, nr 6 pp. 725-736. June 2011. ISSN 0167-739X. (DOI:
  10.1016/j.future.2011.02.002) [3] Adam Belloum, Márcia A. Inda, Dmitry Vasunin, Vladimir Korkhov, Zhiming Zhao, Han Rauwerda, Timo M. Breit, Marian Bubak, Louis O. Hertzberger: Collaborative e-Science Experiments and
  Scientific Workflows. IEEE Internet Computing (INTERNET) 15(4):39-47 (2011) [4] E. Ciepiela, D. Harezlak, J. Kocot, T. Bartynski, M. Kasztelnik, P. Nowakowski, T. Gubała, M. Malawski, M. Bubak: Exploratory Programming in the Virtual Laboratory.
  In: Proceedings of the International Multiconference
  on Computer Science and Information Technology, pp. 621-628 (October 2010),
  [5] VPH-Share Cloud Platform:
  http://dice.cyfronet.pl/projects/details/VPH-Share [6] Bartosz Balis, Marek Kasztelnik, Marian Bubak, Tomasz Bartynski, Tomasz
  Gubala, Piotr Nowakowski, Jeroen Broekhuijsen: The UrbanFlood
  Common Information Space for Early Warning Systems. Procedia
  CS 4: 96-105 (2011) [7] Katarzyna Rycerz and Marian Bubak: Building and Running Collaborative Distributed Multiscale Applications, in: W. Dubitzky,
  K. Kurowsky, B. Schott (Eds),
  Chapter 6, Large Scale Computing, J. Wiley and Sons, 2012 [8] Marian Bubak, Piotr Nowakowski, Tomasz Gubala, Eryk Ciepiela: QUILT –
  Interactive Publications, FET11 – The European Future Technologies Conference
  and Exhibition,   | 
 
| 
   Applications
  on K computer and Advanced Institute of Computational Science Akinori Yonezawa Advanced
  Institute of Computational Science (AICS) Some notable
  applications running on the K supercomputer will be presented, which include
  Tsunami simulations and mitigation of their damage as well as simulation of a
whole human heart. The talk also describes the RIKEN Advanced Institute of Computational Science (AICS), which is the research organization for the K computer and next-generation HPC.
 
| 
   Open Petascale
Libraries (OPL) Project. Dr. Kenichi Miura, National Institute of Informatics and Fujitsu Laboratories Limited. With the advent of the petascale
  supercomputing systems, we need to rethink the programming model and
  numerical libraries. For one thing, we need to make efficient use of
  multi-core CPUs. For example, the K Computer at RIKEN contains over 700
  thousand cores, and features fast inter-core communication, sharing of the
  programmable L2 cache, and so on.  The Open Petascale
  Libraries Project has been initiated by Fujitsu Laboratories of Europe (FLE)
  to address this issue. It is a global collaboration that aims to promote the
  development of open-source thread-parallel and hybrid numerical libraries. My
  talk introduces the project, provides an update on progress, and seeks to
  obtain feedback from the wider community on future directions. At this time,
  the project includes: dense linear algebra, sparse solvers and adaptive
  meshing, Fast Fourier Transforms, and random number generators. In
  particular, I am interested in the development of highly scalable parallel
  random number generators, and a wider use of the  Further information on the OPL Project is
  available at http://www.openpetascale.org/.  | 
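The scalable parallel random number generation mentioned above hinges on giving every thread (or rank) its own statistically independent stream. The sketch below is only a naive illustration of the per-thread-stream idea using OpenMP and POSIX rand_r; production libraries of the kind OPL targets use skip-ahead or counter-based generators rather than ad-hoc per-thread seeds.

    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    /* Monte Carlo estimate of pi: each OpenMP thread draws from its own
       rand_r() state, so threads never contend for a shared generator.
       Seeding each stream as base_seed + thread id is simplistic; scalable
       parallel RNG libraries derive provably independent streams instead. */
    int main(void) {
        const long samples_per_thread = 1000000;
        const unsigned int base_seed = 12345u;
        long inside = 0;
        int nthreads = 1;

        #pragma omp parallel reduction(+:inside)
        {
            #pragma omp master
            nthreads = omp_get_num_threads();
            unsigned int seed = base_seed + (unsigned int)omp_get_thread_num();
            for (long i = 0; i < samples_per_thread; i++) {
                double x = (double)rand_r(&seed) / RAND_MAX;
                double y = (double)rand_r(&seed) / RAND_MAX;
                if (x * x + y * y <= 1.0) inside++;
            }
        }

        long total = samples_per_thread * nthreads;
        printf("pi ~= %f (%ld samples, %d threads)\n",
               4.0 * (double)inside / total, total, nthreads);
        return 0;
    }

Build with gcc -O2 -fopenmp; the point of the sketch is only the independent-stream structure, not statistical quality.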
 
| 
   Future Exascale
systems, so what’s different? Mark Seager, INTEL Corporation. The challenges of Exascale have been discussed at length. Addressing the power and resiliency challenges requires aggressive near-threshold voltage (NTV) circuit designs that actually make the resiliency problem worse. In this talk, I discuss a hierarchical approach to dealing with these issues and
  also the impacts on applications, algorithms, computation &
  communications methods and IO.  | 
 
| 
   Software
  Implications of New Exascale Technologies Continuing on the high-end
  high-performance computing trajectory towards Exascale
  requires the overcoming of several obstacles. A lot of attention has been
  paid in the community to the hardware challenges arising principally from the
slowing down of Dennard scaling. Several innovative approaches have been proposed for dealing with these challenges. However, most
  of these approaches only add to the software hurdles that already need to be
  overcome in order to make Exascale systems
  successful. This talk will provide examples of hardware innovations that have
  been proposed or would be needed to build an Exascale
  system and will describe new software challenges that these innovations would
  present.  | 
 
| 
   Achieving Scalability in the Presence
of Asynchrony. Thomas Sterling, Ph.D., Professor of Informatics and Computing. The last 35 years of mainstream parallel computing
  have depended upon the assumption of synchronous operation; the expectation
  that the time measure of actions was knowable and exploitable in the
  management of physical resources and abstract actions. This was true with the
  architecture and programming methods for basic vector computing of the
  1970’s, the SIMD Array processing systems of the 1980’s, and the
  communicating sequential processes based message passing programming of MPPs and commodity clusters of the 1990’s. This
  philosophy promoted explicit programmer specification of resource management
  and task scheduling with compile time assistance. Now in the Petaflops era with the inflation of number of cores
  (either multicore sockets or GPU structures),
  widely disparate latencies, and algorithms exhibiting increasingly irregular
  structures and time varying response times, asynchronous behavior
  is increasingly manifest in terms of degradation of efficiency and
  limitations to scalability. Combined with the effects of overhead in
  determining effective granularity, and therefore indirectly concurrency,
  these factors may demand a revolutionary change to dynamic adaptive
  strategies through the implementation and application of runtime system
  software as an intermediary to mitigate asynchrony; the independence and
  uncertainty of timing of execution events. This presentation will borrow from
  the experimental ParalleX execution model to
consider a set of runtime mechanisms (some from prior art in computer science
  research) that address these interrelated challenges all contributing to
  growing asynchrony within high performance computing systems and their
  implications for future architectures and programming methods that will
  enable Exascale computing by the end of this
  decade.  | 
 
| 
  
   Bob Lucas Computational Sciences Division Information Sciences Institute  With the end of Dennard
  scaling, there appear to be three paths forward to greater computing
  capability: massive scaling of general purpose processors, purpose built
systems, or the pursuit of new physical phenomena to exploit. Adiabatic quantum computing is an example of the latter. It is a new model of computing, first proposed in 2000. The   | 
 
| 
   Exascale Design Space Exploration Sudip Dosanjh Extreme-scale Computing Sandia National Laboratories The U.S.
  Department of Energy's mission needs in energy, national security and science
  require a thousand-fold increase in supercomputing technology during the next
  decade. It will not be possible to build a usable exascale
  system within an affordable power budget based on computer industry roadmaps.
  Both architectures and applications will need to change dramatically.
  Although exascale is an important driver, these
  changes will impact all scales of computing from single nodes to racks to
  supercomputers. The entire computing industry faces the same power, memory,
  concurrency and programmability challenges. Exascale
  computing has additional challenges, notably scalability and reliability,
  that are related to the extreme size of systems of interest. In order
  to influence the design of future systems we must partner with computer
  companies and application developers to explore the design space. Benefits of
  proposed changes must be quantified relative to costs. Costs could be related
  to energy and silicon area as well as software development. In order for
  computer companies to adopt changes the benefits must be quantified with
  trusted and validated models across a broad range of applications. The wider
  this range the easier it will be to leverage industry roadmaps. In the past
  it has been difficult to perform design tradeoff
  studies due to the lack of validated simulation/emulation tools and the
  complexity of HPC applications, which can be millions of lines of code. Our
  proposed methodology for design space exploration is to use multi-scale
architectural simulation coupled with mini- and skeleton-applications to
  analyze a range of abstract machine models. Close collaboration with
  application teams will be needed to enable the reformulation of key
  algorithms that accommodate machine constraints.  | 
 
| 
   The EU Exascale Project DEEP - Towards a Dynamical Exascale Entry Platform Thomas Lippert Institute for
Advanced Simulation, Jülich Supercomputing Centre, and John von Neumann Institute for Computing (NIC), also of the European PRACE IP projects and of the DEEP Exascale project. Since the beginning of   | 
 
| 
   Hybrid system
architecture and application Yutong Lu With more and more Petaflops systems deployed, much debate centers on how we could use them efficiently. This talk introduces the efforts on the hybrid architecture and software of Tianhe-1A to address performance, scalability and reliability issues. In addition, updated applications running on the Tianhe-1A will be introduced to analyse the usability and feasibility of the hybrid system. A brief outlook on the next-generation HPC system will also be given.  | 
 
| 
   Extreme Scale
  Computational Science Challenges in Fusion Energy Research William M. Tang Advanced computing is
  generally recognized to be an increasingly vital tool for accelerating
  progress in scientific research in the 21st Century. The imperative is to
  translate the combination of the rapid advances in super-computing power
  together with the emergence of effective new algorithms and computational
  methodologies to help enable corresponding increases in the physics fidelity
  and the performance of the scientific codes used to model complex physical
  systems. If properly validated against experimental measurements and verified
  with mathematical tests and computational benchmarks, these codes can provide
  reliable predictive capability for the behavior of
  fusion energy relevant high temperature plasmas. The fusion energy research
  community has made excellent progress in developing advanced codes for which
  computer run-time and problem size scale well with the number of processors
  on massively parallel supercomputers. A good example is the effective usage
  of the full power of modern leadership class computational platforms from the
  terascale to the petascale
  and beyond to produce nonlinear particle-in-cell simulations which have
  accelerated progress in understanding the nature of plasma turbulence in
  magnetically-confined high temperature plasmas. Illustrative results provide
  great encouragement for being able to include increasingly realistic dynamics
  in extreme-scale computing campaigns to enable predictive simulations with
  unprecedented physics fidelity. Some key aspects of application issues for
  extreme scale computing will be included within this brief overview of
  computational science challenges in the Fusion Energy Sciences area.  | 
 
| 
   Achieving the
  20MW Target: Energy Efficiency for Exascale Natalie Bates Energy Efficient
  HPC Working Group  The growth rate in
  energy consumed by data centers in the   | 
 
| 
   Cloud Adoption Issues:
  Interoperability and Security Vladimir Getov The concept of a hybrid cloud is an attractive one
  for many organisations, allowing an organisation with an existing private
  cloud to partner with a public cloud provider. This can be a valuable resource
  as it allows end users to keep some of their operation in-house, but benefit
  from the scalability and on-demand nature of the public cloud. There are,
  however, a number of issues that organisations must consider before opting
  for a hybrid cloud set-up. The single most pressing issue that must be
  addressed is that, by definition, the hybrid cloud is never ‘yours’ – part of
  it is owned or operated by a third party, which can lead to security
concerns. With a true ‘private cloud’ – hosted entirely on your own premises – the security concerns are no different from those associated with any
  other complex distributed system. Indeed, ‘Cloud computing’ as a term has become
  very overloaded – it is doubtful whether this type of internal private cloud system
  qualifies as cloud computing at all, as it does not bring the core benefits
  associated with cloud computing, including taking the pressure off in-house
  IT resources and providing a quickly scalable “elastic” solution using the
  new pay-as-you-go business model. However, when this ‘private cloud’ is
  hosted by a third party, the security issues facing end users become more
complex. Although this cloud is, in theory, still private, the fact that it
  relies on external resources means that IT Managers are no longer in sole
  control of their data. Security remains a major adoption concern, as many
  service providers put the burden of cloud security on the customer, leading
  some to explore costly ideas like third party insurance. It is a huge risk,
as well as impractical, to ignore the high potential for losing
  expensive and/or sensitive data. Another issue that organisations must
  consider is interoperability – internal and external systems must work
  together before security issues can be considered. It could be said, therefore, that a true hybrid
  cloud is actually quite difficult to achieve, when interoperability and
  security issues are considered. One solution might be a regulatory framework
  that would allow cloud subscribers to undergo a risk assessment prior to data
  migration, helping to make service providers accountable and provide
  transparency and assurance. Concerns with hybrid cloud are indicative of the
  anxiety that many companies feel when considering cloud computing as a viable
  business option. We need to see a global consensus on regulation and
  standards to increase trust in this technology and lower the risks that many
organisations feel go hand-in-hand with entrusting key data or processing
  capabilities to third parties. Once this hurdle is removed then the true
  benefits of cloud computing can finally be realised.  | 
 
| 
   Qos-Aware Management of Cloud Applications Patrick Martin School of
  Computing, Queen’s University Many organizations
  are considering moving their applications and data to a cloud environment in
  order to take advantage of its flexibility and potential cost savings. There
are numerous challenges associated with making this move, including selecting
  a cloud service provider, deploying and provisioning an application in the
  cloud to meet required QoS levels, monitoring
  application performance and dynamically re-provisioning as demand fluctuates
  in order to maintain QoS commitments and minimize
costs. In this talk I will
  propose a framework for QoS-aware management of
  cloud applications to address these challenges. I will discuss the structure
  of the framework and highlight the key research questions that must be
  answered in order to develop the framework.  | 
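As a hedged illustration of the monitor-and-re-provision cycle sketched in the abstract (this is not the proposed framework; the names, thresholds and synthetic load model below are assumptions made purely for illustration), a minimal QoS-driven control loop could look like this in Python.

# Toy QoS control loop: observe response time, compare against a target,
# and scale the number of instances up or down. All values are synthetic.
import random
import time

TARGET_RESPONSE_MS = 200.0
instances = 2

def measure_response_time_ms():
    # Stand-in for real application monitoring; offered load varies over time.
    load = random.uniform(50, 400)
    return 2 * load / max(instances, 1)

for step in range(10):
    observed = measure_response_time_ms()
    if observed > 1.2 * TARGET_RESPONSE_MS:
        instances += 1          # re-provision: add capacity to restore QoS
    elif observed < 0.5 * TARGET_RESPONSE_MS and instances > 1:
        instances -= 1          # release capacity to reduce cost
    print("step=%d response=%.0fms instances=%d" % (step, observed, instances))
    time.sleep(0.1)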
 
| 
   Automatic IaaS Elasticity
  for the PaaS Cloud of the Future Jose Luis Vazquez-Poletti Dpt. de Arquitectura de Computadores y Automática Universidad Complutense de Madrid Cloud computing is essentially
  changing the way services are built, provided and consumed. Despite simple
access to Clouds, building elastic services is still an elitist domain; proprietary technologies are an entry barrier, especially for SMEs, and consequently it remains largely within the
  domain of established players. The 4CaaSt project (http://4CaaSt.eu/) aims to
  create an advanced PaaS Cloud platform which
  supports the optimized and elastic hosting of Internet-scale multi-tier
  applications. 4CaaSt embeds all the necessary features, easing programming of
  rich applications and enabling the creation of a true business ecosystem
  where applications coming from different providers can be tailored to
  different users, mashed up and traded together. This talk will describe the research
  efforts, involving   | 
 
| 
   Stratosphere - data management on the
cloud Odej Kao Complex and Distributed IT Systems Technische Universität Data Intensive Scalable Computing is a much-investigated topic in
  current research. Next to parallel databases, new flavors of data processors
  have established themselves - most prominently the map/reduce programming and
  execution model. The new systems provide key features that current parallel
  databases lack, such as flexibility in the data models, the ability to
  parallelize custom functions, and fault tolerance that enables them to scale
  out to thousands of machines. In this talk, we will present the Nephele
  system – an execution engine for massive-parallel virtualized environments
centered around a programming model of so-called Parallelization Contracts (PACTs). Nephele is part of the
  large system Stratosphere, which is as generic as map/reduce systems, while
  overcoming several of their major weaknesses. The focus will be set on the
  underlying cloud model, the execution strategies, the detection of
  communication bottlenecks and network topology, and on light-weight fault
  tolerance methods. Resume: Dr. Odej Kao is a Full
  Professor at the Technische Universität
   Dr. Kao is a graduate from the Technische
  Universität Clausthal,
  where he earned a Master’s degree in Computer Science and Electrical
  Engineering in 1995. Thereafter, he spent two years working on his PhD thesis
  dealing with high performance image processing and defended his dissertation
  in December  In April 2002 Dr. Kao joined the University of  Since 1998, he has published over 220 peer-reviewed
papers in prestigious scientific conferences and journals. Dr. Kao is a member
  of many international program committees and editorial boards of Journals
  such as Parallel Computing. His research interests include Cloud computing, Virtualisation, data and resource management, Quality of
  Service and SLAs, identity management, and
  peer2peer based resource description and discovery.  | 
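To make the idea of Parallelization Contracts slightly more concrete, here is a minimal in-memory Python sketch of two second-order functions wrapping user code; the real Nephele/Stratosphere engine distributes such contracts over a cluster and adds fault tolerance, none of which is shown, and the function names are invented for this illustration only.

# A "contract" is a second-order function: it fixes how records may be
# partitioned and grouped, while the first-order user function supplies the
# actual logic. This toy version runs everything in one process.
from collections import defaultdict

def map_contract(records, user_fn):
    # Each record is handed to the user function independently -> parallelizable.
    out = []
    for rec in records:
        out.extend(user_fn(rec))
    return out

def reduce_contract(pairs, user_fn):
    # All pairs sharing a key are grouped and handed to the user function together.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return [user_fn(key, values) for key, values in groups.items()]

# Example user functions: word count.
lines = ["to be or not to be"]
pairs = map_contract(lines, lambda line: [(w, 1) for w in line.split()])
print(reduce_contract(pairs, lambda key, values: (key, sum(values))))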
 
| 
   A Cloud Framework for Knowledge
  Discovery Workflows on Azure Domenico Talia Dept. of Electronics, Informatics and
  Systems Cloud platforms provide scalable
  processing and data storage and access services that can be effectively exploited
  for implementing high-performance knowledge discovery systems and applications. We
  designed a Cloud framework that supports the composition and scalable
  execution of knowledge discovery applications on the Windows Azure platform.
  Here we describe the system architecture, its implementation, and current
  work aimed at supporting the design and execution of knowledge discovery
  applications modeled as workflows.  | 
 
| 
   Executing Multi-workflow
  simulations on a mixed grid/cloud infrastructure using the SHIWA Technology Peter Kacsuk MTA SZTAKI Various scientific communities use different kind
  of scientific workflow systems that can run workflows on a specific DCI (Distributed
  Computing Infrastructure). The problem with the current workflow usage
  scenario is that user communities are locked in their workflow system, i.e.
they cannot share their workflows with scientists working in the same field who selected a different workflow system. They are also locked into the DCI that
  is supported by the selected workflow system, i.e., they cannot run their
  workflow application in another DCI that is not supported by the selected
workflow system. The SHIWA technology avoids these pitfalls and makes it possible to share workflows written in various workflow languages
  among different user communities. It also enables the creation of so-called
  meta-workflows that combine workflow applications into a higher level
  workflow system. The other important feature of the SHIWA technology is the
  support of multi-DCI execution of these meta-workflows both on various grids
and cloud systems. The talk will describe in detail
  how such meta-workflows can be created and executed on a mixed grid/cloud
  infrastructure.  | 
 
| 
   Open-source
platform-as-a-service: requirements and implementation challenges Dana Petcu While at the infrastructure-as-a-service level the adoption of emerging standards is slowly progressing as a solution for interoperability in agreed or ad-hoc federations of Clouds, the market of platform-as-a-service offerings is still struggling with a variety of proprietary offers and approaches, leading application developers into vendor lock-in. Open-source platforms that are currently emerging as middleware built on top of multiple Clouds have a high potential to help the development of applications that are vendor agnostic and a click away from the Clouds, and, by this, to boost the migration towards the Clouds. Due to the complexity of such platforms the number of existing solutions is currently low. We will present a short analysis of the available implementations, including VMware’s Cloud Foundry and Red Hat’s OpenShift, with a special focus on mOSAIC’s platform [1]. While fulfilling the user
  requirements, the platform needs also to automate the processes running on
  the providers’ sites. In this context, special components to be developed
  when implementing an opensource platform are
  related to the main characteristics of the Cloud, like elasticity (through
  auto‐scaling
  mechanisms for example) or high availability (through adaptive scheduling for
example). The requirements of auto-scaling and adaptive scheduling in the case of using services from multiple Clouds will be discussed and the recent approaches exposed in [2,3] will be detailed.
[1] mOSAIC Consortium. Project details at http://www.mosaic-cloud.eu. Platform implementation at https://bitbucket.org/mosaic. Documentation at http://developers.mosaic-cloud.eu.
[2] N.M. Calcavecchia, B.A. Caprarescu, E. Di Nitto, D.J. Dubois, D. Petcu, DEPAS: A Decentralized Probabilistic Algorithm for Auto-Scaling, http://arxiv.org/abs/1202.2509, 2012.
[3] M. Frincu, N. Villegas, D. Petcu, H.A. Mueller, R. Rouvoy, Self-Healing Distributed Scheduling Platform, Procs. 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid'11), IEEE Computer Press, 225-234.  | 
 
| 
   Building Secure and Transparent
  Inter-Cloud Infrastructure for Scientific Applications Yoshio Tanaka National Institute of Advanced Industrial Science
  and Technology (AIST) On 11 March 2011  In this
  presentation, I’ll talk about our experiences on building secure and
  transparent Inter-Cloud infrastructure for scientific applications. Current
  status and future issues will be presented as well.  | 
 
| 
   Scientific Data Analysis on Cloud
  and HPC Platforms Judy Qiu and Pervasive Technology Institute We are in the era
  of data deluge and future success in science depends on the ability to leverage
  and utilize large-scale data. Systems such as MapReduce
  have been applied to a wide range of “big data” applications and the
  open-source Hadoop system has increasingly been
adopted by researchers in the HPC, Grid and Cloud communities. These applications
  include pleasingly parallel applications and many loosely coupled data mining
  and data analysis problems where we will use genomics, information retrieval
  and particle physics as examples. We will introduce the key features of Hadoop and Twister (MapReduce
  variant). Then, we will discuss important issues of interoperability between
  HPC and commercial clouds and reproducibility using cloud computing
  environments.  | 
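As a small illustration of the iterative map/reduce pattern that MapReduce variants such as Twister target, the following plain-Python sketch runs a toy k-means loop as alternating map and reduce phases; no Hadoop or Twister API is used here, and the data and parameters are invented for the example.

# Iterative map/reduce sketch: k-means on a toy 1-D data set.
from collections import defaultdict

points = [0.5, 1.1, 0.9, 8.2, 7.9, 8.5]   # toy data
centroids = [0.0, 10.0]                    # initial guesses

for iteration in range(5):
    # Map phase: emit (nearest-centroid-id, point) pairs.
    pairs = [(min(range(len(centroids)), key=lambda c: abs(p - centroids[c])), p)
             for p in points]
    # Reduce phase: group by centroid id and recompute each centroid as a mean.
    groups = defaultdict(list)
    for cid, p in pairs:
        groups[cid].append(p)
    centroids = [sum(groups[c]) / len(groups[c]) if groups[c] else centroids[c]
                 for c in range(len(centroids))]
    print(iteration, centroids)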
 
| 
   The suitability of
  BSP/CGM model for HPC on Clouds Alfredo Goldman Department of Computer Science Nowadays
  the concepts and infrastructures of Cloud Computing are becoming a standard for
  several applications. Scalability is not only a buzzword anymore, but is
being used effectively. However, despite the economic advantages of virtualization and scalability, some factors such as latency, bandwidth and processor sharing can be a problem for doing High Performance Computing on the Cloud. We will provide an overview of how to tackle these problems using the BSP (Bulk Synchronous Parallel) model. We will also introduce the main advantages of CGM
  (Coarse Grained Model), where the main goal is to minimize the number of
  communication rounds, which can have an important impact on BSP algorithms
  performance. We will also present our experience on using BSP in an
  opportunistic grid computing environment. 
  Then we will show several recent models for distributed computing
  initiatives based on BSP. We will also provide some research directions to
  improve the performance of BSP applications on Clouds. Finally we
  will present some preliminary experiments comparing the performance of BSP
and MapReduce models.  | 
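For readers unfamiliar with the model, the sketch below shows the basic BSP superstep structure of local computation, communication and barrier synchronisation, with a single communication round in the CGM spirit; threads stand in for processors, which is an assumption of this illustration rather than code from the talk.

# BSP-style global sum with one communication round (toy example).
import threading

P = 4
barrier = threading.Barrier(P)
inbox = []          # messages addressed to processor 0
result = [0]

def worker(pid, data):
    # Superstep 1: local computation.
    local_sum = sum(data)
    # Communication: send the local result to processor 0.
    inbox.append(local_sum)
    barrier.wait()                 # end of superstep (global synchronisation)
    # Superstep 2: processor 0 combines what it received.
    if pid == 0:
        result[0] = sum(inbox)
    barrier.wait()

chunks = [list(range(pid * 10, (pid + 1) * 10)) for pid in range(P)]
threads = [threading.Thread(target=worker, args=(pid, chunks[pid])) for pid in range(P)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("global sum:", result[0])    # 0 + 1 + ... + 39 = 780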
 
| 
   Big Data Analytics for Science
  Discovery Valerio Pascucci Director, Center
  for Extreme Data Management Analysis and Visualization (CEDMAV) Associate Director, Scientific Computing and
  Imaging Institute Professor, School of Computing, University
  of  Laboratory Fellow,  CTO, ViSUS Inc. (visus.us) Advanced techniques for analyzing and
  understanding Big Data models are a crucial ingredient for the success of any
  supercomputing center and data intensive scientific
  investigation. Such techniques involve a number of major challenges such as
  developing scalable algorithms that run efficiently on the simulation data
generated on the largest supercomputers in the world, or incorporating robust methods that are provably correct and complete in their extraction of features
  from the data. In this talk, I will present the application of a
  discrete topological framework for the representation and analysis of large
  scale scientific data. Due to the combinatorial nature of this framework, we
  can implement the core constructs of Morse theory without the approximations
  and instabilities of classical numerical techniques. The inherent robustness
  of the combinatorial algorithms allows us to address the high complexity of
  the feature extraction problem for high resolution scientific data.  Our approach has enabled the successful
  quantitative analysis for several massively parallel simulations including
the study of turbulent hydrodynamic instabilities, porous materials under stress
  and failure, the energy transport of eddies in ocean data used for climate modeling, and lifted flames that lead to clean energy
  production. During the talk, I will provide a live
  demonstration of some software tools for topological analysis of large scale
  scientific data and discuss the evolution of the organization of the project,
  highlighting key aspects that enabled us to successfully deploy this new
family of tools to scientists in several disciplines. BIOGRAPHY Valerio Pascucci is the founding Director of the Center for Extreme Data Management Analysis and
  Visualization (CEDMAV), recently established as a permanent organization at
  the   | 
 
| 
   EUDAT - European scientists and
  data centers turn to big data collaboration Wolfgang Gentzsch Advisor, EUDAT EUDAT is a pan-European
  big data project, bringing together a unique consortium of research
communities and national data and high performance computing
  centers, aiming to contribute to the production of
  a collaborative data infrastructure to support  The aim of this talk is to
  highlight the main objectives of the EUDAT project and its Collaborative Data
  Infrastructure, and to discuss a set of cross-disciplinary data services
  designed to service all European research communities, such as safe
  replication of data sets among different sites, data staging to compute
  facilities, easy data storage, metadata, single sign-on, and persistent
  identifiers.  | 
 
| 
   Charlie Catlett Argonne National Laboratory and The  The increasing scale of new urban infrastructure projects
  and the accelerating rate of demand for such projects bring into focus
  several opportunities, indeed mandates, to harness information technologies
  that have not been traditionally applied to urban design, development, and
  evaluation. Architectural planning tools in use today rely on simplified
  models, typically lacking adequate treatment of complexity, underlying
  physical processes, or socio-economic factors which are at the heart of
  stated city objectives such as "safe," "harmonious," or
  "sustainable." To date these objectives have been difficult to
measure due to lack of data; however, the trend toward transparency and public
  access to "open data" is already enabling interdisciplinary
  scientific analysis and performance prediction at unprecedented detail. Our
  experience with cities over the past century suggests that the traditional
  approach of simplified models, combined with heuristics, often produces
  unintended results that are manifest only after they are difficult, or
  impractical, to unravel. Embracing open data and computational modeling into urban planning and design has the potential
  to radically shorten this experience loop, reducing risk while also allowing
  for innovation that would be otherwise impractical.  | 
 
| 
   Discovering Knowledge from Massive
  Social Networks and Science Data - ¬Next Frontier for HPC Prof. Alok
  N. Choudhary John G. Searle Professor Electrical Engineering and Computer Science Northwestern University Knowledge
  discovery in science and engineering has been driven by theory, experiments
and more recently by large-scale simulations using high-performance
  computers. Modern experiments and simulations involving satellites,
  telescopes, high-throughput instruments, imaging devices, sensor networks,
  accelerators, and supercomputers yield massive amounts of data. At the same
time, the world, including social communities, is creating massive amounts of
  data at an astonishing pace. Just consider Facebook,
  Google, Articles, Papers, Images, Videos and others. But, even more complex
  is the network that connects the creators of data. There is knowledge to be
  discovered in both. This represents a significant and interesting challenge
  for HPC and opens opportunities for accelerating knowledge discovery. In this
talk, after an introduction to high-end data mining and the basic knowledge discovery paradigm, we present the process, challenges and potential of this approach. We will present many case examples, results and
  future directions including (1) mining sentiments from massive datasets on
the web; (2) real-time stream mining of text from millions of tweets to identify influencers and the sentiments of people; (3) discovering knowledge from massive social networks containing millions of nodes and hundreds of billions of edges from real-world Facebook, Twitter and other social network data (e.g., can anyone follow Presidential campaigns in real time?); and (4) discovering knowledge from massive datasets from science
  applications including climate, medicine, biology and sensors. Biography:
   Alok Choudhary is a John G. Searle Professor of Electrical
  Engineering and Computer Science at  He
  received the National Science Foundation's Young Investigator Award in 1993. He
  has also received an IEEE Engineering Foundation award, an IBM Faculty
Development award, and an Intel Research Council award. He is a
  fellow of IEEE, ACM and AAAS. His research interests are in high-performance
  computing, data intensive computing, scalable data mining, computer
  architecture, high-performance I/O systems and software and their
  applications. Alok Choudhary
  has published more than 350 papers in various journals and conferences and
  has graduated 30 PhD students. Techniques
  developed by his group can be found on every modern processor and scalable
  software developed by his group can be found on most supercomputers.  | 
 
| 
   High Performance Computing
  Challenges from an Aerospace Perspective Greg Tallant Lockheed Martin is a world leader in the
design, development, and integration of large complex systems. In this role
  Lockheed Martin is involved in many areas of technology that have an
  extremely broad range of computational challenges. These challenges range
  from engineering problems like computational fluid dynamics (CFD) and
  structural analysis to real time signal processing and embedded systems
  control. One of the biggest challenges currently being faced by Lockheed
  Martin is developing more affordable system solutions that are insensitive to
  increases in system complexity. To address this challenge, Lockheed Martin
  has assembled a team spanning our business areas and the academic community
  to explore quantum computing resources for application to new systems
  engineering capabilities and reduced costs for software development and
  testing. The objective of our effort is to develop a system-level
  verification & validation (V&V) approach and enabling tools that
  generate probabilistic measures of correctness for an entire large-scale
  cyber-physical system, where V&V costs are insensitive to system
  complexity. In this presentation we will provide an overview of our current
  research and present some of the initial results obtained to date.  | 
 
| 
   Macro-scale phenomena of arterial
  coupled cells: a Massively Parallel simulation Timothy David Centre for Bioengineering Impaired mass transfer characteristics of blood borne
  vasoactive species such as ATP in regions such as
  an arterial bifurcation have been hypothesized as a prospective mechanism in
  the etiology of atherosclerotic lesions. Arterial
  endothelial (EC) and smooth muscle cells (SMC) respond differentially to altered
  local hemodynamics and produce coordinated
  macro-scale responses via intercellular communication. Using a
  computationally designed arterial segment comprising large populations of
mathematically modelled coupled ECs & SMCs, we investigate their response to spatial gradients
  of blood borne agonist concentrations and the effect of the micro-scale
  driven perturbation on a macro-scale. Altering homocellular
  (between same cell type) and heterocellular
  (between different cell types) intercellular coupling we simulated four cases
  of normal and pathological arterial segments experiencing an identical
  gradient in the concentration of the agonist. Results show that the heterocellular calcium coupling between ECs and SMCs is important in
  eliciting a rapid response when the vessel segment is stimulated by the
  agonist gradient. In the absence of heterocellular
  coupling, homocellular calcium coupling between
  smooth muscle cells is necessary for propagation of calcium waves from
  downstream to upstream cells axially. Desynchronized intracellular calcium
  oscillations in coupled smooth muscle cells are mandatory for this
  propagation. Upon decoupling the heterocellular
membrane potential, the arterial segment loses the inhibitory effect of endothelial cells on the calcium dynamics of underlying smooth muscle cells. The full system, comprising hundreds of thousands of coupled nonlinear ordinary differential equations, was simulated on the massively parallel Blue Gene
  architecture. The use of massively parallel computational architectures shows
  the capability of this approach to address macro-scale phenomena driven by
  elementary micro-scale components of the system.  | 
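The following toy Python/NumPy sketch illustrates only the coupling structure of such a model (a 1-D chain of cells with a single state variable each, diffusive nearest-neighbour coupling and explicit Euler integration); it is not the physiological EC/SMC model of the talk, and all parameters are invented for illustration.

# Toy chain of coupled cells: local nonlinear dynamics plus diffusive coupling.
import numpy as np

N = 100                      # number of cells (the real model uses ~10^5)
D = 0.1                      # intercellular coupling strength
dt, steps = 0.01, 2000

x = np.zeros(N)
x[0] = 1.0                   # stimulus applied at the upstream end

for _ in range(steps):
    coupling = D * (np.roll(x, 1) + np.roll(x, -1) - 2 * x)
    coupling[0] = D * (x[1] - x[0])        # no-flux boundaries
    coupling[-1] = D * (x[-2] - x[-1])
    x += dt * (x * (1 - x) * (x - 0.2) + coupling)   # reaction + coupling

# Upstream cells respond first; the coupling term spreads the stimulus.
print("first five cells at t=%.0f:" % (steps * dt), np.round(x[:5], 3))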
 
| 
   Overcoming Communication Latency Barriers
  in Massively Parallel Molecular Dynamics Simulation on Anton Ron Dror D. E. Shaw Research Strong scaling of scientific applications on
  parallel architectures is increasingly limited by communication latency.  This talk will describe the techniques used
  to reduce latency and mitigate its effects on performance in Anton, a
  massively parallel special-purpose machine that accelerates molecular
  dynamics (MD) simulations by orders of magnitude compared with the previous
  state of the art.  Achieving this
  speedup required both specialized hardware mechanisms and a restructuring of
  the application software to reduce network latency, sender and receiver
  overhead, and synchronization costs. 
  Key elements of Anton’s approach, in addition to tightly integrated
  communication hardware, include formulating data transfer in terms of counted
  remote writes and leveraging fine-grained communication.  Anton delivers end-to-end inter-node
  latency significantly lower than any other large-scale parallel machine, and
  the total critical-path communication time for an Anton MD simulation is less
  than 3% that of the next-fastest MD platform.  | 
 
| 
   Job
  scheduling of parametric computational mechanics studies on cloud computing infrastructure Carlos
  García Garino a,b , Cristian Mateos c,d
  and Elina Pacini a
  a Information & Communication
  Technologies Institute (ITIC) and b
   cISISTAN Institute, UNICEN
  University, Tandil, Buenos Aires, Argentina d Consejo Nacional de
  Investigaciones Científicas y Técnicas (CONICET) Parameter Sweep Experiments (PSEs) allow
  scientists and engineers to conduct experiments by running the same program
code against different input data. In particular, non-linear computational mechanics problems are addressed in this case [1]. This usually results in
  many jobs with high computational requirements. Thus, distributed
  environments, particularly Clouds, can be employed to fulfill these demands.
  However, job scheduling is challenging as it is an NP-complete problem.
Recently, Cloud schedulers based on bio-inspired techniques – which work well at approximating such problems – have been reviewed [2]. Sadly, existing proposals
  ignore job priority, which is a very important aspect in PSEs since it allows
accelerating PSE results processing and visualization in scientific Clouds. In this work we propose a new Cloud scheduler based on Ant Colony Optimization, the most popular bio-inspired technique, which also exploits well-known notions from operating systems theory.
  Simulated experiments performed with real PSE job data and other Cloud
  scheduling policies indicate that this new proposal allows for a more agile
  job handling while reducing PSE completion time. [1] Pacini, E., Ribero, M., Mateos, C.,
  Mirasso, A., García Garino, C.: Simulation on cloud computing infrastructures
  of parametric studies of nonlinear solids problems. In: F. V. Cipolla-Ficarra
  et al. (ed.) Advances in New Technologies, Interactive Interfaces and
  Communicability (ADNTIIC 2011). pp. 56–68. Lecture Notes in Computer Science
  (2011), to appear. [2] Pacini, E., Mateos, C., García Garino,
  C.: Schedulers based on Ant Colony Optimization for Parameter Sweep
  Experiments in Distributed Environments. in S. Bhattacharyya and P. Dutta
  (Editors), Handbook of Research on Computational Intelligence for
  Engineering, Science and Business. IGI Global,   | 
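As a hedged sketch of the general Ant Colony Optimization idea applied to job-to-VM assignment (illustrative only: it omits the job-priority and operating-systems notions of the proposed scheduler, and the jobs, VM speeds and ACO parameters are invented), consider the following Python fragment.

# Toy ACO scheduler: ants build assignments biased by pheromone, deposit
# pheromone inversely proportional to the makespan, and pheromone evaporates.
import random

jobs = [4, 2, 7, 1, 3]          # job lengths
vm_speed = [1.0, 2.0]           # relative VM speeds
pheromone = [[1.0] * len(vm_speed) for _ in jobs]
EVAPORATION, ANTS, ITERATIONS = 0.5, 10, 30
best_plan, best_makespan = None, float("inf")

def makespan(plan):
    load = [0.0] * len(vm_speed)
    for j, vm in enumerate(plan):
        load[vm] += jobs[j] / vm_speed[vm]
    return max(load)

for _ in range(ITERATIONS):
    for _ in range(ANTS):
        plan = [random.choices(range(len(vm_speed)), weights=pheromone[j])[0]
                for j in range(len(jobs))]
        m = makespan(plan)
        if m < best_makespan:
            best_plan, best_makespan = plan, m
        for j, vm in enumerate(plan):
            pheromone[j][vm] += 1.0 / m        # reinforce good assignments
    pheromone = [[p * EVAPORATION for p in row] for row in pheromone]

print("best assignment:", best_plan, "makespan:", best_makespan)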
 
| 
   Multi-Resolution Stream Valerio Pascucci Director, Center
  for Extreme Data Management Analysis and Visualization (CEDMAV) Associate Director, Scientific Computing and
  Imaging Institute Professor, School of Computing, University
  of  Laboratory Fellow,  CTO, ViSUS Inc. (visus.us) Effective use of data management techniques for
  massive scientific data is a crucial ingredient for the success of any
  supercomputing center and data intensive scientific
  investigation. Developing such techniques involves a number of major
  challenges such as the real-time management of massive data, or the
  quantitative analysis of scientific features of unprecedented complexity.
  Addressing these challenges requires interdisciplinary research in diverse
  topics including the mathematical foundations of data representations, the
  design of robust, efficient algorithms, and the integration with relevant
  applications in physics, biology, or medicine.  In this
  talk, I will present a scalable approach for processing large scale
  scientific data with high performance selective queries on multiple terabytes
  of raw data. The combination of this data model with progressive streaming techniques
  allows achieving interactive processing rates on a variety of computing
  devices ranging from handheld devices like an iPhone,
  to simple workstations, to the I/O of parallel supercomputers. I will
  demonstrate how our system has enabled the real time streaming of massive
  combustion simulations from DOE platforms such as Hopper2 at LBNL and
  Intrepid at ANL. During the talk, I will provide a live
  demonstration of the effectiveness of some software tools developed in this
  effort and discuss the deployment strategies in an increasing heterogeneous
computing environment. BIOGRAPHY Valerio Pascucci is the founding Director of the Center
  for Extreme Data Management Analysis and Visualization (CEDMAV), recently
  established as a permanent organization at the   | 
 
| 
   Portability and Interoperability
  in Clouds: Agents, Semantic and Volunteer computing can help - the mOSAIC and Cloud@Home projects Beniamino Di Martino Second University of Naples - mOSAIC Project Coordinator Cloud vendor lock-in and interoperability gaps
arise (among many reasons) when the semantics of resources and services, and of Application Programming Interfaces, are not shared. Standards and techniques borrowed from SOA and
  Semantic Web Services areas might help in gaining shared, machine readable
  description of Cloud offerings (resources, Services at Platform and
  Application level, and their API groundings), thus allowing automatic
discovery and matchmaking, and supporting selection, brokering, interoperability and even composition of Cloud Services among multiple
  Clouds. The EU funded mOSAIC
  project (http://www.mosaic-cloud.eu) aims at designing and developing an
  innovative open-source API and platform that enables applications to be Cloud
  providers' neutral and to negotiate Cloud services as requested by their
  users. Using the mOSAIC Cloud ontology and Semantic
  Engine, cloud applications' developers will be able to specify their services
  and resources requirements and communicate them to the mOSAIC
  Platform and Cloud Agency.  The mOSAIC Cloud Agency
  will implement a multi-agent brokering mechanism that will search for Cloud
  services matching the applications’ request, and possibly compose the
requested service.  The PRIN (National Relevance Research Project) project Cloud@Home
  (http://cloudathome.unime.it/) aims at implementing a volunteer Cloud, a
  paradigm which mixes aspects of both Cloud and Volunteer computing. The main
enhancement of Cloud@Home is the capability of a host to be at the same time both a contributing and a consuming host, establishing
  a symbiotic interaction with the Cloud@Home
  environment.  | 
 
| 
   Smart Sensing for Discovering and
  Reducing Energy Wastes in Office Buildings Amy Wang Institute for Interdisciplinary Information
Sciences A recent survey shows that in our offices up to 70% of computers and related equipment are left on all the time. Equipment energy costs can be reduced by 20% just by turning devices off when not in use.
  However, it is very challenging to develop an automatic control system to
  discover and reduce the energy wastes. Particularly, to discover the energy
  wastes, the running states of the massive appliances need to be tracked in
  real-time and checked against the real-time user requirements to judge
whether an electrical appliance is wasting energy or not. Because the electrical appliances are numerous and the user requirements are highly dynamic, it is generally very difficult and cost-inefficient to track the states of the electrical appliances and the real-time user requirements. In this talk, we report how recent advances in smart metering and compressive sensing technologies can be exploited to solve the above challenging problems. Although the real-time electrical appliance states and the real-time
  user requirements compose very high dimensional dynamic signals, they are
  converted to sparse signals by temporal and spatial transformations
respectively. Compressive sensing systems based on smart meters and infrared sensors are designed to track these sparsified
  signals using lightweight metering and sequential decoding. Particularly in
  this talk, the design methodologies, theoretical bounds and experimental
  results will be introduced.  | 
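The core compressive-sensing step can be illustrated with a small NumPy sketch (this is not the metering system of the talk; the dimensions, the random Gaussian measurement matrix and the greedy recovery routine below are assumptions made for illustration): a sparse state vector is recovered from far fewer random measurements than its dimension.

# Sparse recovery via orthogonal matching pursuit (toy example).
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 100, 30, 3                  # signal length, measurements, sparsity
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.uniform(1, 2, k)   # sparse signal
Phi = rng.standard_normal((m, n)) / np.sqrt(m)              # measurement matrix
y = Phi @ x                                                  # m << n measurements

support, residual = [], y.copy()
for _ in range(k):
    # Pick the column most correlated with the residual, then re-fit.
    support.append(int(np.argmax(np.abs(Phi.T @ residual))))
    coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
    residual = y - Phi[:, support] @ coef

x_hat = np.zeros(n)
x_hat[support] = coef
print("recovery error:", np.linalg.norm(x - x_hat))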
 
| 
   Project ADVANCE: Ant Colony
  Optimisation (ACO) using coordination programming based on S-Net Alex Shafarenko Department of Computer Science This talk presents some of the results of the EU
  Framework 7 project ADVANCE. We report our experiences of applying an HPC
  structuring technique: dataflow  coordination programming, and the specific
  programming environment: the language S-Net, to restructuring existing
  numerical code developed by SAP AG.  The code implements an ACO solution to the
  Travelling Salesman Problem. We have converted the ACO algorithm to a
  stream-processing network  and encoded it as a coordination program. We then
  implemented this solution by using either explicit thread management in C (a
  manually coded version)  or by applying our coordination compiler, and
  compared the results. We find that the use of S-Net results in a low
  code-development cost while achieving  the same scaling characteristics and very similar
  performance compared to the manually coded solution at large system sizes.
  The message-driven (as opposed  to message-passing) nature of the coordinating
  streaming code creates the prerequisites for a large scale distributed, but
  still easily manageable and maintainable  implementation. We argue that it is that
  maintainability and manageability that makes our approach uniquely suitable
  for industrial uptake of HPC.  |
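To give a plain-terms picture of the message-driven coordination style, the following Python sketch wires two computation stages together with queues standing in for streams; it is not S-Net, and the box/stream structure shown is invented for this illustration.

# Message-driven pipeline: each "box" reacts to messages on its input stream
# and emits results on its output stream; a sentinel shuts the pipeline down.
import queue
import threading

STOP = object()

def box(fn, inp, out):
    while True:
        msg = inp.get()
        if msg is STOP:
            out.put(STOP)
            break
        out.put(fn(msg))

s1, s2, s3 = queue.Queue(), queue.Queue(), queue.Queue()
stages = [
    threading.Thread(target=box, args=(lambda v: v * v, s1, s2)),
    threading.Thread(target=box, args=(lambda v: v + 1, s2, s3)),
]
for t in stages:
    t.start()

for v in range(5):
    s1.put(v)
s1.put(STOP)

while True:
    msg = s3.get()
    if msg is STOP:
        break
    print(msg)            # 1, 2, 5, 10, 17

for t in stages:
    t.join()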