The eighth Annual Meeting on High Performance Computing and Infrastructure in Norway
NOTUR2009 - PROGRAMME
Monday May 18 - Tutorial Day
Registration starts at 9:00, near Auditorium 5. There will be a short welcome (in Auditorium 5) at 10:00. From there, the participants will go to separate workshop rooms. The workshops start at 10:15 and finish ca. 16:00. There will be a common lunch at 13:00.
The following tutorials are given:
- ADF workshop
The tutorial will be given by Matt Kundrat and Stan van Gisbergen from SCM. Overview presentations are given on the capabilities and applications of ADF (molecular DFT code), BAND (periodic DFT code), and COSMO-RS (fluid thermodynamics), as well as the graphical user interfaces to these codes. The ADF tutors will give GUI demos, followed by a hands-on session in which participants can try all the software on the Windows desktops in the seminar room.
- Introduction to Chapel: A Next Generation HPC Language
The tutorial is given by Steve Deitz, Cray Inc. What is Chapel? It is a new
parallel language being developed by Cray Inc. as part of Cray's entry in the
DARPA HPCS program. The main goal of the program is to improve programmer
productivity by:
- Improving the programmability of parallel computers
- Matching or improving upon the performance of current programming models
- Providing better portability than current programming models
- Improving the robustness of parallel codes.
Presentation: Introduction To Chapel: A Next Generation HPC Language, Steve Deitz
- Linux Multicore Performance Analysis and Optimization in a Nutshell
The tutorial is given by Philip Mucci. In this tutorial, students will learn
about the basics of code optimization on modern "multi-core, multi-socket"
computer systems. Particular attention will be paid to performance lost to the
memory subsystem as well as issues common to multi-core architectures. General
guidelines will be introduced to help guide new application development as well
as several recurring patterns seen in application optimization. In addition to
code tuning, a workflow will be presented that guides the developer through
various stages of analysis with the various performance tools available on the
system. The tutorial will conclude with a discussion of two parallel
computation models, Pthreads and OpenMP, along with common pitfalls and best
practices. This course will be oriented towards the application scientist,
so only a rudimentary knowledge of computer architecture is required.
The pace will be lively, however, as a significant amount of material will be
covered. Those with advanced backgrounds in computer architecture are most
welcome to participate and add their breadth of experience to the in-class
discussions.
Presentation: Linux Multicore Performance Analysis and
Optimization in a Nutshell, Philip Mucci
- Doing computations with Cython
The tutorial is given by Dag Sverre Seljebotn. Python, along with NumPy
and other packages, is becoming a popular environment for scientific
programming and experimentation. While convenient both for glue code and
computations which can be expressed on an array level, Python's lack of
speed makes both intensive for-loops and implementing new data structures
unviable.
As an alternative to switching to C, or similar, for parts of the program,
Cython allows compilation of Python-like code into regular Python
extension modules. While the convenience of the Python language and
environment is still available, one can easily add type information to
reach the same performance as C. Cython is also popular as a tool for
interfacing Python with C code.
ADF workshop
| Time | Title |
| 09:00 | Registration + Coffee |
| 10:00 | Welcome and Seminar Practicalities |
| 10:15 | Overview presentations on ADF, BAND, COSMO-RS and the GUIs, including Q&A. Demonstration of the GUIs. |
| 13:00 | Lunch break |
| 14:00 | Hands-on session where participants can try our tutorials and exercises to carry out calculations and visualize the results. |
| 16:00 | End of Workshop |
Introduction to Chapel: A Next Generation HPC Language
| Time | Title |
| 09:00 | Registration + Coffee |
| 10:00 | Welcome and Seminar Practicalities |
| 10:15 | Chapel background, language basics |
| 11:15 | Coffee break |
| 11:30 | Task parallelism, data parallelism, locality and affinity |
| 13:00 | Lunch break |
| 14:00 | HPCC case study, compiler overview, hands-on session |
| 16:00 | End of Workshop |
Linux Multicore Performance Analysis and Optimization in a Nutshell
| Time | Title |
| 09:00 | Registration + Coffee |
| 10:00 | Welcome and Seminar Practicalities |
| 10:15 | First half of tutorial |
| 13:00 | Lunch break |
| 14:00 | Second half of tutorial |
| 16:00 | End of Workshop |
Writing Fast Code with Cython
| Time | Title |
| 09:00 | Registration + Coffee |
| 10:00 | Welcome and Seminar Practicalities |
| 10:15 | First half of tutorial |
| 13:00 | Lunch break |
| 14:00 | Second half of tutorial |
| 16:00 | End of Workshop |
Tuesday May 19
Registration starts at 9:00.
The programme starts at 10:00 and finishes ca. 17:10.
| Time | Title | Speaker | Abstract | Slides |
| 09:00 | Registration + Coffee | | | |
| 10:00 | Welcome | Arne Sølvberg, Dean of Faculty of Information Technology, Mathematics and Electrical Engineering (IME), NTNU | | |
| 10:15 | The national e-Infrastructure for science | Jacko Koster, UNINETT Sigma | | X |
| 10:30 | Beyond the Hype: A Berkeley View of Cloud Computing | Anthony D. Joseph, University of California at Berkeley | X | X |
| 11:15 | PRACE - Software enabling for Petaflop/s systems | Sebastian von Alfthan, CSC, the Finnish IT center for science | | X |
| 12:00 | Scaling to Petaflop | Ola Tørudbakken, Sun Microsystems | | X |
| 12:40 | Lunch | | | |
| 13:30 | Three-dimensional numerical modelling of water and sediment flow in rivers | Nils Reidar B. Olsen, NTNU | | X |
| 14:10 | High Performance Computing on GPUs | Anne Elster, NTNU | | |
| 14:50 | Coffee break / Poster session | | X | |
| 16:00 | Aerosol-cloud-climate interactions - simulating climate change with the global model CAM-Oslo | Alf Kirkevåg, Norwegian Meteorological Institute | X | X |
| 16:40 | Automatic differentiation of density functionals | Radovan Bast, CTCC, University of Tromsø | X | |
| 17:10 | End of Day 1 | | | |
| | Dinner on Munkholmen | | | |
| 19:00 | Departure by boat to Munkholmen | | | |
| 19:15 | Visit of Munkholmen | | | |
| 20:00 | Dinner | | | |
| 22:45 | Return by boat to Trondheim | | | |
Wednesday May 20
The programme starts at 9:00. The day finishes ca. 15:00.
| Time | Title | Speaker | Abstract | Slides |
| 09:00 | Welcome / Poster winner announcement | | | |
| 09:15 | Honey, I shrunk the parallel computer | Erik Hagersten, Uppsala University | X | X |
| 10:00 | High performance computing in Physics of Geological Processes | Marcin Krotkiewski, Physics of Geological Processes, University of Oslo | X | X |
| 10:45 | Coffee break | | | |
| 11:10 | The PGAS Programming Model | Steve Deitz, Cray Inc. | | X |
| 11:55 | Semantic Systems Biology - state and challenges | Vladimir Mironov, NTNU | X | X |
| 12:30 | Lunch | | | |
| 13:30 | Systems Level Acceleration: A Hybrid Computing Solution | Kirk E. Jordan, IBM T.J. Watson Research Center | X | X |
| 14:15 | Inside Intel Nehalem Microarchitecture | Andrey Semin, EMEA HPC Technology Manager at Intel Corporation | X | X |
| 15:00 | Closing remarks | | | |
Abstracts
Radovan Bast, CTCC, University of Tromsø
Automatic differentiation of density functionals.
In many areas of computational sciences, the differentiation of functions constitutes an important element in the methodology used. The differentiation is challenging both in terms of the computational time needed to numerically evaluate the derivatives and in the implementation of these derivatives, a process that may be error-prone when the functions become complicated.
In our contribution we will discuss the concept of automatic differentiation as an attractive alternative to numerical differentiation or to codes generated by symbolic differentiation or manual implementations. We will briefly outline the basic principles of automatic differentiation, and then illustrate how this approach can be used to obtain a compact implementation of time-dependent density functional theory.
Density functional theory is one of the most popular, and accurate, methods for studying the electronic structure of molecules. In this approach, the interactions between electrons are described by approximate exchange-correlation (XC) functionals, which in many cases are functions involving a large number of terms. The XC functionals depend on the electronic density of the molecule and the spin density and their Cartesian gradients.
In order to calculate electronic excitation energies and molecular response properties, time-dependent density functional theory is the method of choice. Such calculations require second and higher order derivatives of XC functionals. Explicit expressions for higher order functional derivatives can be very involved and are in general out of reach for manual differentiation. We will demonstrate that automatic differentiation can greatly simplify the implementation of these functional derivatives without loss of accuracy.
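As a rough illustration of the principle (not the implementation discussed in the talk), forward-mode automatic differentiation can be sketched with dual numbers that carry a value together with its derivative. The sketch below assumes the spin-unpolarized LDA (Slater) exchange energy density, e_x(rho) = -(3/4)(3/pi)^(1/3) rho^(4/3), as a simple stand-in for an XC functional. The names `Dual`, `lda_exchange`, `derivative` and `second_derivative` are hypothetical and chosen for this example only.

```python
import math


class Dual:
    """Number val + der*eps with eps**2 == 0; der carries the derivative."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(x):
        # Treat plain numbers as constants (zero derivative).
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        # Product rule: (u*v)' = u*v' + u'*v
        other = Dual._lift(other)
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__

    def __pow__(self, p):
        # Power rule for a constant real exponent p: (u**p)' = p*u**(p-1)*u'
        return Dual(self.val ** p, p * self.val ** (p - 1) * self.der)


def lda_exchange(rho):
    """LDA exchange energy density (assumed stand-in for an XC functional)."""
    c_x = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)
    return -c_x * rho ** (4.0 / 3.0)


def derivative(f, x):
    """First derivative of f at x: seed the derivative part with 1."""
    return f(Dual(x, 1.0)).der


def second_derivative(f, x):
    """Second derivative of f at x, obtained by nesting dual numbers."""
    return f(Dual(Dual(x, 1.0), Dual(1.0, 0.0))).der.der


if __name__ == "__main__":
    rho = 0.5
    print("e_x   :", lda_exchange(rho))
    print("e_x'  :", derivative(lda_exchange, rho))
    print("e_x'' :", second_derivative(lda_exchange, rho))
```

The function `lda_exchange` is written once, as ordinary arithmetic, and the derivatives follow from operator overloading rather than from hand-derived expressions. Nesting dual numbers is only one of several ways to reach higher orders, and it is used here because it keeps the sketch short.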
Erik Hagersten, Uppsala University
Honey, I shrunk the parallel computer.
The introduction of multicore architecture brings the promise of great performance at low cost. However, it also introduces new bottlenecks that could even result in slower execution. Tuning for multicore requires insights into complex resource sharing (e.g., caches and bandwidth) and thread interaction (e.g., synchronization, coherence and false sharing). Even your perfectly scalable algorithm may need to be altered to run well.
This session discusses the new challenges associated with multicore performance today and tomorrow. A couple of very simple optimizations will be shown to have dramatic impact on performance and scalability. Multicore may not bring any free lunch, but there is often plenty of low-hanging fruit.
Biography Erik Hagersten
Professor Erik Hagersten has held a chair in computer architecture at Uppsala University, Sweden, since 1999. He is also the CTO of Acumem AB, developing new technology for multicore optimizations.
Hagersten was the chief architect for Sun Microsystems' high-end server engineering division 1993-1999. He is the architect of the Sun WildFire, the Sunfire Link and the Sun Enterprise 15k/25k UltraSPARC III scalable coherence technology. He coined the Cache-Only Memory Architecture (and its brain-dead acronym COMA) while managing the architecture research group at the Swedish Inst. of Computer Science. The Simics simulator is another result from that group.
Hagersten has been a board member of SNIC (Swedish National Infrastructure for Computing) since 2002. He is the author of more than 50 academic papers, holds more than 100 patents and is a member of the Royal Swedish Academy of Science and Engineering (IVA).
Kirk E. Jordan, Emerging Solution Executive Computational Science
IBM T.J. Watson Research Center
Systems Level Acceleration: A Hybrid Computing Solution.
High performance computing (HPC) is a tool frequently used to understand complex problems in numerous areas such as aerospace, biology, climate modeling and energy. Scientists and engineers working on problems in these and other areas demand ever increasing compute power for their problems. However, when the scientific and engineering communities tackle their real problems, they often encounter bottlenecks as a result of pre- or post-processing required for a complete solution of the overall problem. In this talk, I propose a new HPC paradigm: System Level Accelerators, the deployment of diverse compute resources to solve the problem of a single work stream, all with the experience of a single server to the user. I will describe some of the challenges that will need to be considered in designing Systems Level Accelerators and point out some of the work underway to meet these challenges. I will give a few examples of current applications in geosciences and life sciences where we and others are beginning to apply these system level acceleration approaches. In conclusion, I will discuss not only the most obvious way to use ultra-scale, multi-core HPC systems as part of a workflow, but also how one might use such systems in a systems level acceleration paradigm to tackle previously intractable problems.
Anthony D. Joseph, Computer Science Division, University of California at Berkeley
Beyond the Hype: A Berkeley View of Cloud Computing.
With Cloud Computing, the long-held dream of computing as a utility, developers of new Internet services no longer require the large capital outlays in hardware to deploy their service or the human expense to operate it. There are three new aspects of large-scale computing in Cloud Computing:
- The illusion of infinite computing resources available on demand, thereby eliminating the need for Cloud Computing users to plan far ahead for provisioning.
- The elimination of an up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs.
- The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful.
Companies with large batch-oriented tasks can get results as quickly as their programs can scale, since using 1000 servers for one hour costs no more than using one server for 1000 hours. This elasticity of resources, without paying a premium for large scale, is unprecedented in the history of the Information Technology field.
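The arithmetic behind this claim can be sketched as follows, assuming pure pay-per-use pricing, a perfectly scalable job, and a hypothetical rate of 0.10 per server-hour (none of these figures come from the talk). The function name `batch_job` is likewise an assumption made for this example.

```python
def batch_job(total_server_hours, servers, rate_per_server_hour):
    """Return (wall_clock_hours, cost) for a perfectly scalable batch job.

    Assumes the work divides evenly over the servers and that billing is
    strictly proportional to server-hours used.
    """
    wall_clock_hours = total_server_hours / servers
    cost = servers * wall_clock_hours * rate_per_server_hour
    return wall_clock_hours, cost


if __name__ == "__main__":
    rate = 0.10  # hypothetical price per server-hour
    for servers in (1, 1000):
        hours, cost = batch_job(1000, servers, rate)
        print(f"{servers:>4} server(s): {hours:g} h wall clock, cost {cost:g}")
```

Under these assumptions the cost stays the same while the wall-clock time shrinks in proportion to the number of servers. A job that scales less than perfectly would use more server-hours in total and therefore cost more.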
However, there is significant hype surrounding Cloud Computing and in this talk, I will present a balanced discussion of our definition of Cloud Computing, the benefits it offers, and ten significant obstacles to Cloud Computing.
Biography Anthony D. Joseph
Anthony D. Joseph is a professor in Electrical Engineering and Computer Science at the University of California Berkeley, and the Director of Intel Research Berkeley. He received his Ph.D. degree in computer science from MIT in 1998, holds a UC Berkeley Chancellor's Professorship, and is a member of IEEE, ACM, and USENIX. He is developing adaptive techniques for: cloud computing, distributed network monitoring and triggering, network and computer security, and security defenses for machine learning-based decision systems. He also co-leads the DETERlab testbed, a secure scalable testbed for conducting cybersecurity research. His principal field of interest is systems and networking: cybersecurity, datacenter architectures, mobile systems and networking, and overlay networks.
Alf Kirkevåg, Norwegian Meteorological Institute (met.no)
Aerosol-cloud-climate interactions - simulating climate change with the global model CAM-Oslo.
Processes determining the physico-chemical properties of aerosols and their interactions with clouds are key processes in climate research. According to IPCC (2007), the low level of understanding of these processes contributes to large uncertainties in the simulations of climate and climate change. Results from multi-decadal as well as shorter simulations will be presented to illustrate the large sensitivity of radiative forcing and climate response to basic assumptions about the aerosols. For instance, some natural aerosols are poorly known and therefore neglected or only crudely parameterized in climate models. If no artificial constraints are introduced, this leads to an underprediction of the number of natural cloud droplets, which in turn yields severely overestimated cooling due to aerosols. It may also affect the climate sensitivity to greenhouse gases, although to a smaller degree. The ongoing development of more realistic parameterizations of aerosols and their interactions with clouds, the coupling of the various parts of the climate system in a new Norwegian Earth System Model, NorESM, and longer climate simulations as well as a need for increased spatial resolution all call for a drastic increase in the high performance computing resources in climate research in the coming years.
Marcin Krotkiewski, Physics of Geological Processes, University of Oslo
High performance computing in Physics of Geological Processes.
Physics of Geological Processes (PGP) is funded by the Norwegian center of excellence initiative. PGP strives to obtain a fundamental understanding of the processes that shape our planet. Numerical models are the obvious choice for studying these processes given large differences in both geometrical and temporal scales. While 2D models usually run on single CPU computers, 3D models require massively parallel systems. Here, we give an overview of the scientific and technical challenges, and show how the NOTUR architecture helps us to achieve our goals. In particular, we briefly discuss the following activities at PGP.
- Faulting and fracturing of rocks is an intrinsically discontinuous process. We use the discrete element method to simulate the small-scale results of earthquakes: fault gouge fragmentation, deformation localisation and boundary roughness evolution in fault zones. 3D simulations provide insights into dynamic grain-scale interactions that control the development of damage structures, the macroscopic mechanical stability, and hence the earthquake potential of a fault.
- Violent geophysical processes require numerical models that are capable of resolving the evolution of sharp fronts. For this purpose we use SAGE (developed at Los Alamos), a code that uses locally refining Eulerian meshes to solve the compressible hydrodynamics equations. SAGE uses realistic equations to represent the atmosphere, seawater and ocean crust. It is used to estimate risks and effects of a wide range of violent processes, such as tsunamis formed due to landslides and asteroid impacts, and explosive eruptions in geothermal systems.
- The above mentioned are fast processes. Another extreme is the deformation that takes place in the Earth interior over millions of years. Here we assume rocks behave as an incompressible Stokes fluid. Extending on our expertise with fast 2D finite element solvers (www.milamin.org), we have developed a 3D equivalent: BILAMIN. We use body fitted, unstructured meshes and iterative solvers. The parallel code scales on the entire Hexagon cluster (5.5 thousand CPUs). The largest systems solved have 0.5 billion elements. The code helps us understand how folds develop and interact in 3D, and what the effective material properties of heterogeneous rocks are.
Vladimir Mironov, NTNU
Semantic Systems Biology - state and challenges.
Systems biology is a new branch of integrative biology relying on computational modelling of biological processes for hypotheses generation. Systems biology is inherently multidisciplinary and depends on efficient data integration. For a number of reasons this task proved to be hard to achieve. The most recent and promising trend in the area of biological knowledge management is the application of Semantic Web technologies. We termed the fusion between Semantic Web technologies and systems biology Semantic Systems Biology. In this case new hypotheses are generated through deployment of automatic reasoning agents (primarily Description Logics reasoners) over massive biological knowledge bases. We discuss the current state of the field and the computational challenges.
Andrey Semin, Intel Corporation
Inside Intel Nehalem Microarchitecture.
The Core i7 and Xeon 5500 series microprocessors (built on Intel's Nehalem microarchitecture) represent a major advance in Intel processor designs, enabling a significant increase in the performance of a wide range of applications. In this presentation, Andrey Semin, an HPC Technology Manager of Intel in Europe, will provide a detailed overview of the new features in the Nehalem microarchitecture and highlight their benefits for HPC applications. In this talk you'll learn how new microarchitecture features, including Turbo Boost Technology, an integrated memory controller, and Hyper-Threading, help real-life applications in the CAE, Numerical Weather Simulation and Energy sectors. As HPC users always strive to get the best possible sustained performance from their applications, this presentation will also cover software aspects, providing some tips and tricks for achieving performance gains on Nehalem by utilizing knowledge about the micro-architecture of the processor and the system architecture.
Poster Session - Tuesday May 19, 14:50-16:00
The following submissions were accepted:
| Authors | Title |
| A. Anderlik (UiB), A. Z. Munthe-Kaas (UiB), O. K. Øye (CMR), E. Eikefjord (UiB), J. Rørvik (UiB), D. M. Ulvang (CMR), F. G. Zöllner (University of Heidelberg), A. Lundervold (UiB), and C. Anderlik (BCCS) | Integrated software prototype for medical image processing |
| Glenn Tørå and Alex Hansen (NTNU) and Pål-Eric Øren (Numerical Rocks) | A dynamic network model for imbibition in porous media. |
| V. Vionnet, L. Bertino and K. A. Lisæter (Nansen Environmental and Remote Sensing Center, Mohn Sverdrup Center for Operational Oceanography, Bergen) | Effects of snow cover heterogeneities on sea-ice development |
| Wenjie Wei, Stuart R. Clark, Xing Cai and Are Magnus Bruaset (Simula Research Laboratory) | Parallel Simulation of Dual Lithology Sedimentation |
| Mustafa Barri, Helge I. Andersson, George K. El Khoury and Bjørnar Pettersen (NTNU) | Direct numerical simulation of massive separated flows in one-sided expansion channels |
| Kenate Nemera Nigussa (NTNU), Kjetil Liestøl Nielsen (HiST), Øyvind Borck (NTNU), and Jon Andreas Støvneng (NTNU) | Adsorption of atoms and small molecules on α-Cr2O3(0001) surfaces |
| Nicolaas Ervik Groeneboom (UiO) | Cosmology done right: CMB analysis by Gibbs sampling |
| Thorvald Natvig (NTNU) | Dynamic Optimization of MPI Communication |
| Rune Johan Hovland, Anne C. Elster and Magnus Lie Hetland (NTNU) | High Data Volumes and Streaming on Future GPU Systems |
| Daniele G. Spampinato and Anne Elster (NTNU) | Communication Challenges on Multi-GPU Systems |
| Jan C. Meyer and Anne C. Elster (NTNU) | Modelling Overlapping Communication and Computation |
| Åsmund Herikstad and Anne Elster (NTNU) | Parallel Techniques for Estimation and Correction of Aberration on Medical Ultrasound Imaging |
| Eirik Ola Aksens and Anne Elster (NTNU) | Seismic Processing of Porous Rocks on Modern GPUs |
| Daniel Haugen and Anne Elster (NTNU) | Strategies for Handling Large Amounts of Data from Storage to GPUs |
| Katarina Pajchel, Jon K. Nilsen and Alex Read (UiO) | ARC middleware - current usage in the ATLAS experiment and the next generation |
| Amir Khosrowshahi (UMB), Jonathan Baker (Weill Cornell Medical College, New York), Hans Ekkehard Plesser (UMB), Gaute T. Einevoll (UMB), Bruno A. Olshausen (Redwood Center for Theoretical Neuroscience, University of California, Berkeley) | Learning highly overcomplete representations for modeling response properties of visual cortical neurons |
| Jørgen Blakstad and Rune W. Nergård (NTNU) | Generation of Rainbow Tables |
| Li-Ming Yang (UiO) | All-Metal Aromaticity and Its Application to Metallocenes |