The eighth Annual Meeting on High Performance Computing and Infrastructure in Norway

NOTUR2009 – May 18–20, 2009 – Trondheim

NOTUR2009 - PROGRAMME


Monday May 18 - Tutorial Day

Registration starts at 9:00, near Auditorium 5. There will be a short welcome (in Auditorium 5) at 10:00. From there, the participants will go to separate workshop rooms. The workshops start at 10:15 and finish ca. 16:00. There will be a common lunch at 13:00.

The following tutorials are given:

  • ADF workshop

    The tutorial will be given by Matt Kundrat and Stan van Gisbergen from SCM. Overview presentations are given on the capabilities and applications of ADF (molecular DFT code), BAND (periodic DFT code), and COSMO-RS (fluid thermodynamics), as well as the graphical user interfaces to these codes. The ADF tutors will give GUI demos, followed by a hands-on session where participants can try all the software on the Windows desktops in the seminar room.

  • Introduction to Chapel: A Next Generation HPC Language

    The tutorial is given by Steve Deitz, Cray Inc. What is Chapel? It is a new parallel language being developed by Cray Inc. as part of Cray's entry in the DARPA HPCS program. The main goal of the program is to improve programmer productivity by:

    • Improving the programmability of parallel computers
    • Matching or improving upon the performance of current programming models
    • Providing better portability than current programming models
    • Improving the robustness of parallel codes.

    Presentation: Introduction To Chapel: A Next Generation HPC Language, Steve Deitz

  • Linux Multicore Performance Analysis and Optimization in a Nutshell

    The tutorial is given by Philip Mucci. In this tutorial, students will learn the basics of code optimization on modern "multi-core, multi-socket" computer systems. Particular attention will be paid to performance lost to the memory subsystem as well as issues common to multi-core architectures. General guidelines will be introduced to help guide new application development, along with several recurring patterns seen in application optimization. In addition to code tuning, a workflow will be presented that guides the developer through the stages of analysis with the performance tools available on the system. The tutorial will conclude with a discussion of two parallel computation models, Pthreads and OpenMP, including common pitfalls and best practices. This course is oriented towards the application scientist, so only a rudimentary knowledge of computer architecture is required. The pace will be lively, however, as a significant amount of material will be covered. Those with advanced backgrounds in computer architecture are most welcome to participate and add their breadth of experience to the in-class discussions.

    Presentation: Linux Multicore Performance Analysis and Optimization in a Nutshell, Philip Mucci

  • Doing computations with Cython

    The tutorial is given by Dag Sverre Seljebotn. Python, along with NumPy and other packages, is becoming a popular environment for scientific programming and experimentation. While convenient both for glue code and computations which can be expressed on an array level, Python's lack of speed makes both intensive for-loops and implementing new data structures unviable.

    As an alternative to switching to C, or similar, for parts of the program, Cython allows compilation of Python-like code into regular Python extension modules. While the convenience of the Python language and environment is still available, one can easily add type information to reach the same performance as C. Cython is also popular as a tool for interfacing Python with C code.

ADF workshop

Time Title
09:00 Registration + Coffee
10:00 Welcome and Seminar Practicalities
10:15 Overview presentations of ADF, BAND, COSMO-RS and the GUIs, including Q&A. Demonstration of the GUIs.
13:00 Lunch break
14:00 Hands-on session where participants can try our tutorials and exercises to carry out calculations and visualize the results.
16:00 End of Workshop
 

Introduction to Chapel: A Next Generation HPC Language

Time Title
09:00 Registration + Coffee
10:00 Welcome and Seminar Practicalities
10:15 Chapel background, language basics
11:15 Coffee break
11:30 Task parallelism, data parallelism, locality and affinity
13:00 Lunch break
14:00 HPCC case study, compiler overview, hands-on session
16:00 End of Workshop

Linux Multicore Performance Analysis and Optimization in a Nutshell

Time Title
09:00 Registration + Coffee
10:00 Welcome and Seminar Practicalities
10:15 First half of tutorial
13:00 Lunch break
14:00 Second half of tutorial
16:00 End of Workshop
 

Writing Fast Code with Cython

Time Title
09:00 Registration + Coffee
10:00 Welcome and Seminar Practicalities
10:15 First half of tutorial
13:00 Lunch break
14:00 Second half of tutorial
16:00 End of Workshop  


Tuesday May 19

Registration starts at 9:00. The programme starts at 10:00 and finishes ca. 17:10.

Time | Title | Speaker | Abstract | Slides
09:00 | Registration + Coffee | | |
10:00 | Welcome | Arne Sølvberg, Dean of Faculty of Information Technology, Mathematics and Electrical Engineering (IME), NTNU | |
10:15 | The national e-Infrastructure for science | Jacko Koster, UNINETT Sigma | | X
10:30 | Beyond the Hype: A Berkeley View of Cloud Computing | Anthony D. Joseph, University of California at Berkeley | X | X
11:15 | PRACE - Software enabling for Petaflop/s systems | Sebastian von Alfthan, CSC, the Finnish IT center for science | | X
12:00 | Scaling to Petaflop | Ola Tørudbakken, Sun Microsystems | | X
12:40 | Lunch | | |
13:30 | Three-dimensional numerical modelling of water and sediment flow in rivers | Nils Reidar B. Olsen, NTNU | | X
14:10 | High Performance Computing on GPUs | Anne Elster, NTNU | |
14:50 | Coffee break / Poster session | | X |
16:00 | Aerosol-cloud-climate interactions - simulating climate change with the global model CAM-Oslo | Alf Kirkevåg, Norwegian Meteorological Institute | X | X
16:40 | Automatic differentiation of density functionals | Radovan Bast, CTCC, University of Tromsø | X |
17:10 | End of Day 1 | | |
 
Dinner on Munkholmen
19:00 Departure by boat to Munkholmen      
19:15 Visit of Munkholmen      
20:00 Dinner      
22:45 Return by boat to Trondheim      


Wednesday May 20

The programme starts at 9:00. The day finishes ca. 15:00.

Time | Title | Speaker | Abstract | Slides
09:00 | Welcome / Poster winner announcement | | |
09:15 | Honey, I shrunk the parallel computer | Erik Hagersten, Uppsala University | X | X
10:00 | High performance computing in Physics of Geological Processes | Marcin Krotkiewski, Physics of Geological Processes, University of Oslo | X | X
10:45 | Coffee break | | |
11:10 | The PGAS Programming Model | Steve Deitz, Cray Inc. | | X
11:55 | Semantic Systems Biology - state and challenges | Vladimir Mironov, NTNU | X | X
12:30 | Lunch | | |
13:30 | Systems Level Acceleration: A Hybrid Computing Solution | Kirk E. Jordan, IBM T.J. Watson Research Center | X | X
14:15 | Inside Intel Nehalem Microarchitecture | Andrey Semin, EMEA HPC Technology Manager at Intel Corporation | X | X
15:00 | Closing remarks | | |


Abstracts

Radovan Bast, CTCC, University of Tromsø
Automatic differentiation of density functionals.

In many areas of the computational sciences, the differentiation of functions is an important element of the methodology. Differentiation is challenging both in terms of the computational time needed to evaluate the derivatives numerically and in terms of implementing these derivatives, a process that may be error-prone when the functions become complicated.

In our contribution we will discuss the concept of automatic differentiation as an attractive alternative to numerical differentiation or to codes generated by symbolic differentiation or manual implementations. We will briefly outline the basic principles of automatic differentiation, and then illustrate how this approach can be used to obtain a compact implementation of time-dependent density functional theory.

Density functional theory is one of the most popular, and accurate, methods for studying the electronic structure of molecules. In this approach, the interactions between electrons are described by approximate exchange-correlation (XC) functionals, which in many cases are functions involving a large number of terms. The XC functionals depend on the electronic density of the molecule and the spin density and their Cartesian gradients.

In order to calculate electronic excitation energies and molecular response properties, time-dependent density functional theory is the method of choice. Such calculations require second and higher order derivatives of XC functionals. Explicit expressions for higher order functional derivatives can be very involved and are in general out of reach for manual differentiation. We will demonstrate that automatic differentiation can greatly simplify the implementation of these functional derivatives without loss of accuracy.
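
As a rough illustration of the principle behind automatic differentiation, the following minimal Python sketch propagates a value together with its first derivative through arithmetic operations (forward mode with dual numbers). It is not the implementation described in the abstract: the `Dual` class, the `derivative` helper and the toy functional `toy_exchange_energy_density` (the local-density exchange form, C_X * rho**(4/3)) are assumptions chosen for illustration, and the sketch covers only first derivatives with respect to the density, whereas the talk concerns second and higher order derivatives of functionals that also depend on spin density and gradients.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __pow__(self, p):
        # Power rule for a constant real exponent p
        return Dual(self.val ** p, p * self.val ** (p - 1) * self.der)


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input derivative with 1."""
    return f(Dual(x, 1.0)).der


# Prefactor of the local-density exchange energy density (illustrative choice).
C_X = -0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def toy_exchange_energy_density(rho):
    """Toy functional: e_x(rho) = C_X * rho**(4/3)."""
    return C_X * rho ** (4.0 / 3.0)


if __name__ == "__main__":
    rho = 0.5
    print(derivative(toy_exchange_energy_density, rho))
```

The derivative comes out exact to floating-point rounding, with no finite-difference step size to tune and no hand-written derivative formula; the same function definition serves for both the value and the derivative.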


Erik Hagersten, Uppsala University
Honey, I shrunk the parallel computer.

The introduction of multicore architecture brings the promise of great performance at low cost. However, it also introduces new bottlenecks that could even result in slower execution. Tuning for multicore requires insights into complex resource sharing (e.g., caches and bandwidth) and thread interaction (e.g., synchronization, coherence and false sharing). Even your perfectly scalable algorithm may need to be altered to run well.

This session discusses the new challenges associated with multicore performance today and tomorrow. A couple of very simple optimizations will be shown to have dramatic impact on performance and scalability. Multicore may not bring any free lunch, but there is often plenty of low-hanging fruit.

Biography Erik Hagersten

Professor Erik Hagersten has held a chair in computer architecture at Uppsala University, Sweden, since 1999. He is also the CTO of Acumem AB, developing new technology for multicore optimizations.

Hagersten was the chief architect for Sun Microsystems' high-end server engineering division 1993-1999. He is the architect of the Sun WildFire, the Sunfire Link and the Sun Enterprise 15k/25k UltraSPARC III scalable coherence technology. He coined the term Cache-Only Memory Architecture (and its brain-dead acronym COMA) while managing the architecture research group at the Swedish Inst. of Computer Science. The Simics simulator is another result from that group.

Hagersten has been a board member of SNIC (Swedish National Infrastructure for Computing) since 2002. He is the author of more than 50 academic papers, holds more than 100 patents and is a member of the Royal Swedish Academy of Science and Engineering (IVA).


Kirk E. Jordan, Emerging Solution Executive Computational Science IBM T.J. Watson Research Center
Systems Level Acceleration: A Hybrid Computing Solution.

High performance computing (HPC) is a tool frequently used to understand complex problems in numerous areas such as aerospace, biology, climate modeling and energy. Scientists and engineers working on problems in these and other areas demand ever-increasing compute power. However, when the scientific and engineering communities tackle their real problems, they often encounter bottlenecks caused by the pre- or post-processing required for a complete solution of the overall problem. In this talk, I propose a new HPC paradigm: Systems Level Accelerators, the deployment of diverse compute resources to solve the problem of a single work stream, all with the experience of a single server to the user. I will describe some of the challenges that need to be considered in designing Systems Level Accelerators and point out some of the work underway to meet these challenges. I will give a few examples of current applications in geosciences and life sciences where we and others are beginning to apply these systems level acceleration approaches. In conclusion, I will discuss not only the most obvious way to use ultra-scale, multi-core HPC systems as part of a workflow, but also how one might use such systems in a systems level acceleration paradigm to tackle previously intractable problems.


Anthony D. Joseph, Computer Science Division, University of California at Berkeley
Beyond the Hype: A Berkeley View of Cloud Computing.

With Cloud Computing, the long-held dream of computing as a utility, developers of new Internet services no longer require the large capital outlays in hardware to deploy their service or the human expense to operate it. There are three new aspects of large-scale computing in Cloud Computing:

  1. The illusion of infinite computing resources available on demand, thereby eliminating the need for Cloud Computing users to plan far ahead for provisioning.
  2. The elimination of an up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs.
  3. The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful.

Companies with large batch-oriented tasks can get results as quickly as their programs can scale, since using 1000 servers for one hour costs no more than using one server for 1000 hours. This elasticity of resources, without paying a premium for large scale, is unprecedented in the history of the Information Technology field.

However, there is significant hype surrounding Cloud Computing and in this talk, I will present a balanced discussion of our definition of Cloud Computing, the benefits it offers, and ten significant obstacles to Cloud Computing.

Biography Anthony D. Joseph

Anthony D. Joseph is a professor in Electrical Engineering and Computer Science at the University of California Berkeley, and the Director of Intel Research Berkeley. He received his Ph.D. degree in computer science from MIT in 1998, holds a UC Berkeley Chancellor's Professorship, and is a member of IEEE, ACM, and USENIX. He is developing adaptive techniques for: cloud computing, distributed network monitoring and triggering, network and computer security, and security defenses for machine learning-based decision systems. He also co-leads the DETERlab testbed, a secure scalable testbed for conducting cybersecurity research. His principal field of interest is systems and networking: cybersecurity, datacenter architectures, mobile systems and networking, and overlay networks.


Alf Kirkevåg, Norwegian Meteorological Institute (met.no)
Aerosol-cloud-climate interactions - simulating climate change with the global model CAM-Oslo.

Processes determining the physico-chemical properties of aerosols and their interactions with clouds are key processes in climate research. According to IPCC (2007), the low level of understanding of these processes contributes to large uncertainties in the simulations of climate and climate change. Results from multi-decadal as well as shorter simulations will be presented to illustrate the large sensitivity of radiative forcing and climate response to basic assumptions about the aerosols. For instance, some natural aerosols are poorly known and therefore neglected or only crudely parameterized in climate models. If no artificial constraints are introduced, this leads to an underprediction of the number of natural cloud droplets, which in turn yields severely overestimated cooling due to aerosols. It may also affect the climate sensitivity to greenhouse gases, although to a smaller degree. The ongoing development of more realistic parameterizations of aerosols and their interactions with clouds, the coupling of the various parts of the climate system in a new Norwegian Earth System Model, NorESM, and longer climate simulations as well as a need for increased spatial resolution all call for a drastic increase in the high performance computing resources in climate research in the coming years.

Marcin Krotkiewski, Physics of Geological Processes, University of Oslo
High performance computing in Physics of Geological Processes.

Physics of Geological Processes (PGP) is funded by the Norwegian center of excellence initiative. PGP strives to obtain a fundamental understanding of the processes that shape our planet. Numerical models are the obvious choice for studying these processes given large differences in both geometrical and temporal scales. While 2D models usually run on single CPU computers, 3D models require massively parallel systems. Here, we give an overview of the scientific and technical challenges, and show how the NOTUR architecture helps us to achieve our goals. In particular, we briefly discuss the following activities at PGP.

  1. Faulting and fracturing of rocks is an intrinsically discontinuous process. We use the discrete element method to simulate the small-scale results of earthquakes: fault gouge fragmentation, deformation localisation and boundary roughness evolution in fault zones. 3D simulations provide insights into dynamic grain-scale interactions that control the development of damage structures, the macroscopic mechanical stability, and hence the earthquake potential of a fault.
  2. Violent geophysical processes require numerical models that are capable of resolving the evolution of sharp fronts. For this purpose we use SAGE (developed at Los Alamos), a code that uses locally refining Eulerian meshes to solve the compressible hydrodynamics equations. SAGE uses realistic equations to represent the atmosphere, seawater and ocean crust. It is used to estimate risks and effects of a wide range of violent processes, such as tsunamis formed due to landslides and asteroid impacts, and explosive eruptions in geothermal systems.
  3. The processes mentioned above are fast. At the other extreme is the deformation that takes place in the Earth's interior over millions of years. Here we assume rocks behave as an incompressible Stokes fluid. Building on our expertise with fast 2D finite element solvers (www.milamin.org), we have developed a 3D equivalent: BILAMIN. We use body-fitted, unstructured meshes and iterative solvers. The parallel code scales on the entire Hexagon cluster (5.5 thousand CPUs). The largest systems solved have 0.5 billion elements. The code helps us understand how folds develop and interact in 3D, and what the effective material properties of heterogeneous rocks are.


Vladimir Mironov, NTNU
Semantic Systems Biology - state and challenges.

Systems biology is a new branch of integrative biology relying on computational modelling of biological processes for hypothesis generation. Systems biology is inherently multidisciplinary and depends on efficient data integration. For a number of reasons this task has proved hard to achieve. The most recent and promising trend in the area of biological knowledge management is the application of Semantic Web technologies. We have termed the fusion of Semantic Web technologies and systems biology Semantic Systems Biology. In this approach, new hypotheses are generated by deploying automatic reasoning agents (primarily Description Logics reasoners) over massive biological knowledge bases. We discuss the current state of the field and the computational challenges.


Andrey Semin, Intel Corporation
Inside Intel Nehalem Microarchitecture.

The Core i7 and Xeon 5500 series microprocessors (built on Intel's Nehalem microarchitecture) represent a major advance in Intel processor design, enabling a significant increase in the performance of a wide range of applications. In this presentation, Andrey Semin, an HPC Technology Manager of Intel in Europe, will provide a detailed overview of the new features in the Nehalem microarchitecture and highlight their benefits for HPC applications. In this talk you'll learn how new microarchitecture features, including Turbo Boost Technology, an integrated memory controller, and Hyper-Threading, help real-life applications in the CAE, Numerical Weather Simulation and Energy sectors. As HPC users always strive to get the best possible sustained performance from their applications, this presentation will cover software application aspects, providing some tips and tricks for achieving performance gains on Nehalem by utilizing knowledge about the microarchitecture of the processor and the system architecture.


Poster Session - Tuesday May 19, 14:50-16:00

The following submissions were accepted:
Authors | Title
A. Anderlik (UiB), A. Z. Munthe-Kaas (UiB), O. K. Øye (CMR), E. Eikefjord (UiB), J. Rørvik (UiB), D. M. Ulvang (CMR), F. G. Zöllner (University of Heidelberg), A. Lundervold (UiB), and C. Anderlik (BCCS) | Integrated software prototype for medical image processing
Glenn Tørå and Alex Hansen (NTNU) and Pål-Eric Øren (Numerical Rocks) | A dynamic network model for imbibition in porous media
V. Vionnet, L. Bertino and K. A. Lisæter (Nansen and Environmental Remote Sensing Center, Mohn Sverdrup Center for Operational Oceanography, Bergen) | Effects of snow cover heterogeneities on sea-ice development
Wenjie Wei, Stuart R. Clark, Xing Cai and Are Magnus Bruaset (Simula Research Laboratory) | Parallel Simulation of Dual Lithology Sedimentation
Mustafa Barri, Helge I. Andersson, George K. El Khoury and Bjørnar Pettersen (NTNU) | Direct numerical simulation of massive separated flows in one-sided expansion channels
Kenate Nemera Nigussa (NTNU), Kjetil Liestøl Nielsen (HiST), Øyvind Borck (NTNU), and Jon Andreas Støvneng (NTNU) | Adsorption of atoms and small molecules on α-Cr2O3(0001) surfaces
Nicolaas Ervik Groeneboom (UiO) | Cosmology done right: CMB analysis by Gibbs sampling
Thorvald Natvig (NTNU) | Dynamic Optimization of MPI Communication
Rune Johan Hovland, Anne C. Elster and Magnus Lie Hetland (NTNU) | High Data Volumes and Streaming on Future GPU Systems
Daniele G. Spampinato and Anne Elster (NTNU) | Communication Challenges on Multi-GPU Systems
Jan C. Meyer and Anne C. Elster (NTNU) | Modelling Overlapping Communication and Computation
Åsmund Herikstad and Anne Elster (NTNU) | Parallel Techniques for Estimation and Correction of Aberration on Medical Ultrasound Imaging
Eirik Ola Aksens and Anne Elster (NTNU) | Seismic Processing of Porous Rocks on Modern GPUs
Daniel Haugen and Anne Elster (NTNU) | Strategies for Handling Large Amounts of Data from Storage to GPUs
Katarina Pajchel, Jon K. Nilsen and Alex Read (UiO) | ARC middleware - current usage in the ATLAS experiment and the next generation
Amir Khosrowshahi (UMB), Jonathan Baker (Weill Cornell Medical College, New York), Hans Ekkehard Plesser (UMB), Gaute T. Einevoll (UMB), Bruno A. Olshausen (Redwood Center for Theoretical Neuroscience, University of California, Berkeley) | Learning highly overcomplete representations for modeling response properties of visual cortical neurons
Jørgen Blakstad and Rune W. Nergård (NTNU) | Generation of Rainbow Tables
Li-Ming Yang (UiO) | All-Metal Aromaticity and Its Application to Metallocenes