Archive
2010-02-03:
Open seminar on future needs for eInfrastructure for Norwegian research
Venue: Forskningsrådet (The Research Council of Norway), Stensberggata 26, Oslo
Date: 19 March 2010
Time: 9:00 - 16:00
Main page for the seminar.
TALK ABSTRACTS
Paul Gibbon, Head of division Computational Science, Jülich Supercomputing Centre, Forschungszentrum Juelich, Germany. "Meeting the Exascale Challenge at the Juelich Supercomputing Centre." (PDF)
This talk will address recent developments in the field of supercomputing at JSC, beginning with an overview of petascale hardware installed in 2009 together with our present user support infrastructure. The JSC roadmap for exascale computing by 2020 includes plans for two joint 'Innovation Centres' together with industrial partners, but also anticipates a major challenge for software developers. Experience over the last few years has shown that in many fields, scientists are increasingly unable to scale their applications to keep up with hardware parallelism, which already runs to hundreds of thousands of cores. To help meet this challenge, JSC has set up several 'Simulation Laboratories' providing high-level algorithmic expertise in a number of strategic disciplines. These and other accompanying support measures will be briefly reviewed.
Mats Hamrud, European Centre for Medium-Range Weather Forecasts, Reading, UK. "Earth system modeling on future High Performance Computers."
The predictions from HPC vendors indicate that in the future we will see
little or no increase in the speed of individual cores. What we will
have instead is a rapid increase in the number of cores on individual
chips. This will put increased demand on the scalability of applications
to efficiently utilize these massively parallel computers.
Most existing weather forecasting systems and climate models were
originally developed to run on shared memory vector computers. They
have subsequently been adapted to utilize a limited number of
distributed memory processors. It is doubtful whether this gradual adaptation
will be sufficient to cope with the foreseen development in the HPC area.
The partial differential equations that need to be solved in
meteorological problems imply that the problem is not trivially
parallel. In a global model where the problem domain has been spatially
distributed, a global synchronization is implied at each step of the
time-stepping algorithm. Only if the user community is best served by
having a large ensemble of independent forecasts or realizations do we
have a problem that can trivially use an increased core count.
The adaptation of the models to utilize core counts that are orders of
magnitude larger often cannot be solved within the technical implementation;
it will require algorithmic changes. Algorithms that were developed to
be the most efficient on small core counts may have to be replaced in
the future. This could imply complete rewrites of existing codes. This
change of algorithms and the validation of these new algorithms will
take years to accomplish. This development has to be done by scientists
in the field working closely together with computer experts who have a
good understanding of the underlying hardware structures.
ECMWF as well as many other centers involved in weather forecasting and
climate research are aware of this looming problem and are taking
action. Research directions include new horizontal grids as well as
alternative solvers for elliptic equations. At ECMWF the most pressing
problem is the lack of scalability of our algorithm (incremental 4D-Var)
for obtaining the analysed initial state for starting our forecast
model.
The tentative conclusions presented above are based on initial work in
this area done at ECMWF and other weather forecasting centers.
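
The contrast drawn in this abstract, between a spatially distributed model that synchronizes at every time step and an ensemble of independent runs, can be illustrated with a toy sketch. This is not ECMWF code: the 1D diffusion update, the function names, and the use of a thread pool are all assumptions chosen for illustration, and the grid length is assumed to be divisible by the number of parts.

```python
from concurrent.futures import ThreadPoolExecutor


def step_subdomain(u, left, right, alpha=0.1):
    """Advance one subdomain by one explicit diffusion step using halo values."""
    padded = [left] + u + [right]
    return [
        padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


def run_decomposed(u0, n_parts, n_steps):
    """Toy domain-decomposed run; len(u0) is assumed divisible by n_parts."""
    size = len(u0) // n_parts
    parts = [u0[k * size:(k + 1) * size] for k in range(n_parts)]
    sync_points = 0
    for _ in range(n_steps):
        # Halo exchange: each part needs the edge values of its neighbours
        # from the previous step (fixed boundary value 0.0 at both ends).
        lefts = [0.0] + [p[-1] for p in parts[:-1]]
        rights = [p[0] for p in parts[1:]] + [0.0]
        parts = [step_subdomain(p, l, r) for p, l, r in zip(parts, lefts, rights)]
        # Global reduction (here a max norm): every part must have finished
        # the step before it can be computed, so this is a synchronization point.
        _ = max(abs(x) for p in parts for x in p)
        sync_points += 1
    return [x for p in parts for x in p], sync_points


def run_member(perturbation, n_steps=50):
    """One ensemble member: an independent run with its own initial state."""
    u0 = [0.0] * 8
    u0[4] = perturbation
    result, _ = run_decomposed(u0, 1, n_steps)
    return sum(result)


def run_ensemble(perturbations):
    """Members never communicate, so they can be farmed out freely."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_member, perturbations))
```

In the decomposed run the number of synchronization points grows with the number of time steps regardless of how many parts are used, whereas the ensemble members share nothing until their results are collected.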
Dieter van Uytvanck, Max Planck Institute for Psycholinguistics, Nijmegen, the Netherlands. "eHumanities: requirements and particularities in a nutshell." (PDF)
The humanities and social sciences used to be disregarded when talking about eInfrastructure. Fortunately this situation has changed significantly in recent years. In this talk the focus will be on the specific requirements that these disciplines pose and the infrastructural components that are proposed to address them. Concepts like virtual collections, persistent identifiers, trust federations, web service chaining, data category registries and repository replication will be introduced. Keeping in mind some experiences from the preparatory phase of CLARIN, a vision of an eHumanities ecosystem will be sketched.
Mats Carlsson, Institute of Theoretical Astrophysics, University of Oslo. "Future needs for eInfrastructure in Astrophysics."
Astrophysics is in a golden age - large telescopes on the ground and general and specialized observatories from space provide exciting observations of objects ranging from the Sun - our nearest star, to the edges of the universe at the beginning of time. New projects in the planning include the detection of gravitational waves from supernovae and colliding black holes and the probable detection of life outside Earth. The complexity of the observations demands detailed numerical simulations for their interpretation. The presentation will concentrate on the Norwegian part in this exploration of our universe and outline the needs for eInfrastructure in the near future and in the 2015-2025 perspective.
Morten Hjorth-Jensen, Department of Physics, University of Oslo. "High-performance computing and quantum mechanical problems." (PDF)
Many-body quantum mechanics deals with the development of stable algorithms and numerical methods for solving Schrödinger's or Dirac's equations for many interacting particles in order to gain information about a given system. Typical examples of popular many-body methods are coupled-cluster methods, various types of Monte Carlo methods, perturbative expansions, Green's function methods, the density-matrix renormalization group, density-functional theory and large-scale diagonalization methods. The numerical algorithms cover a broad range of mathematical methods, from linear algebra problems to Monte Carlo simulations. Furthermore, high-performance computing topics such as efficient parallelization are central to any serious study of many-body problems. The areas of application span from our basic understanding of materials to the smallest constituents of matter and thereby the limits of stability of matter.
Here I will try to give an overview of the big challenges for
computational quantum mechanics, with an emphasis on high-performance computing
topics and future hardware and software needs.
Hans-Petter Langtangen & Xing Cai, Simula Research Laboratory. "HPC needs for biomedical flows and productivity of future computational scientists."
First we describe two biomedical flow applications that demand
large-scale high-performance computing: laminar/turbulent transitional
blood flow in the vicinity of aneurysms, and oscillatory flow of
cerebrospinal fluid in the upper spinal canal. These flows
require accurate computations of boundary layers to estimate the pressure
and shear stress at the vessel wall, because the wall stress
influences diseases related to the flows.
The core part of the talk concerns software tools for enabling
scientists to take advantage of the expected diversity of architecture
in high-performance computing. As increasingly complex algorithms
are devised to address real-world applications, the resulting software
will inevitably become painfully complicated.
This software complexity will come on top of
the increasing difficulty of harnessing tomorrow's supercomputers.
To help future computational scientists maintain a reasonable productivity,
we believe that one approach is to decouple the mathematical equations
and numerical algorithms from the concrete implementation. More
specifically, an ideal situation will be that a computational scientist
specifies the equations she wants to solve using a highly abstract
mathematical modeling language, together with a basic outline of
some preferred numerical algorithm. The translation from the mathematical
and numerical description to a working parallel code will be done by
automatic code generation. We have already gained positive experience
in the FEniCS project, which enables a computational scientist
to focus on science instead of programming details. Of course, the expertise
of computer scientists will be vital for implementing an automatic
code generation framework that is capable of achieving good performance
on tomorrow's supercomputers. In the latter respect, we would also like to
draw attention to some difficult issues in fully utilizing multicore processors.
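
The idea of separating the problem statement from its implementation can be sketched in a few lines. The following is a toy illustration only, not the FEniCS API: the `PoissonProblem` class, the `generate_solver` function, and the choice of a Jacobi iteration for the model problem -u'' = f on (0, 1) with zero boundary values are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class PoissonProblem:
    """Declarative description of -u'' = f on (0, 1) with u(0) = u(1) = 0."""
    source_expr: str  # right-hand side f as a trusted Python expression in x
    n_interior: int   # number of interior grid points


def generate_solver(problem: PoissonProblem) -> Callable[[int], List[float]]:
    """Generate Python source for a Jacobi solver and compile it.

    The scientist only supplies the problem description; the loop code
    is produced automatically. source_expr is executed, so it must be trusted.
    """
    n = problem.n_interior
    h = 1.0 / (n + 1)
    code = f"""
def solve(n_iter):
    h = {h!r}
    xs = [h * (i + 1) for i in range({n})]
    f = [({problem.source_expr}) for x in xs]
    u = [0.0] * ({n} + 2)
    for _ in range(n_iter):
        new = u[:]
        for i in range(1, {n} + 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i - 1])
        u = new
    return u
"""
    namespace = {}
    exec(code, namespace)  # compile the generated source into a function
    return namespace["solve"]


# Usage: state the equation, then let the generator produce the code.
solve = generate_solver(PoissonProblem(source_expr="1.0", n_interior=3))
u = solve(200)
```

A real framework would generate optimized parallel code for a target architecture; here the generator emits a plain serial loop, which is enough to show that the problem description and the generated implementation can evolve independently.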
Kristin Bakken, The National Library of Norway. "Språkbanken (The Norwegian language resource collection) and The National Library as a digital research infrastructure." (PDF)
The Norwegian Ministry of Culture granted The National Library the task of establishing a Norwegian language resource collection (Språkbanken) in the budget for 2010. Since January 1st 2010 we have been working to establish the foundations for such a resource collection.
Språkbanken is intended to be a repository that will facilitate language-based research and development. This includes the development of commercial products and services. The goal is to have all relevant Norwegian digital language resources that are publicly financed deposited and made available in an open digital repository, free of cost and legally regulated so as to enable use, adaptation and further development. The political motivation is to facilitate the development of language-based technology on the basis of Norwegian. Språkbanken will receive and store language data and linguistic tools on a long-term basis. It will migrate data into updated formats, and it will redistribute data and tools to new users. Språkbanken will to some degree finance further development of existing resources and the instigation of new resources. Relevant resources are primary data such as specially designed speech databases, as well as naturally occurring speech in various settings, and large collections of written text covering all genres and vocabularies. Equally important are lexical databases that are phonetically transcribed and linguistically annotated, and the tools to annotate text and speech collections with linguistic information.
The placement of Språkbanken at the National Library must be understood against the background of the library's ambitious digitization program. We are in the process of digitizing our entire collection, ranging from text to sound and moving images. The possible synergy between this digitization program and the goals and needs of Språkbanken will be touched upon.
Farid Ould-Saada, Department of Physics, University of Oslo. "High Energy Physics and Computing." (PDF)
The Large Hadron Collider (LHC) ended its first full period of operation "on a high note" on December 16th 2009. First collisions were recorded and, over two weeks of data taking, the six LHC experiments accumulated over a million particle collisions at 900 GeV. Collisions recorded at 2.36 TeV set a new world record, confirming LHC as the world's most powerful particle accelerator. The first real data have been distributed smoothly around the world on the World-wide LHC Computing Grid (WLCG). LHC is about to start the main research programme, first at 7 TeV energy. After a shutdown expected in 2012, higher collision energies up to 14 TeV will be reached. LHC is the first particle accelerator to directly explore the TeV scale, a new energy frontier. By colliding beams of protons or lead nuclei, LHC will probe deeper into matter than ever before, reproducing conditions that prevailed in the first nanoseconds of the life of the Universe.
Together with the other Nordic countries, Norway is committed to contributing computing and storage resources, as well as software, for the processing and analysis of data produced by LHC. The Nordic Tier-1, one of the 11 components of WLCG, helps build and maintain data storage and analysis infrastructure for the entire HEP community around the LHC. In addition to the Tier-1s and ca. 100 Tier-2s, other computing facilities in universities and laboratories take part in LHC data analysis as Tier-3 facilities, allowing scientists to access Tier-1 and Tier-2 facilities.
The main aim of this talk is to present the estimated needs for eInfrastructure in terms of computing, data storage, services and support for HEP in the long term, in particular from 2015, when major LHC and detector upgrades are expected, requiring greater computing resources. High-quality middleware is another important requirement in the next 10-15 years.
The NorduGrid Advanced Resource Connector (ARC) has been chosen, together with gLite and UNICORE, to empower the European Grid Initiative.
Einar Rønquist, Department of Mathematical Sciences, NTNU. "To what extent is the future need for eInfrastructure predictable?" (PDF)
In certain areas of science and technology the future needs for eInfrastructure are somewhat predictable by following international trends in the various research fields. At the national level, the needs are not always equally clear. Sometimes significant needs occur in a more unpredictable manner or are given indirectly by the emphasis on specific research areas. I will elaborate on this by reflecting on my interaction with various research groups at NTNU within science and engineering. I will also comment on the issue of software development as well as the need for support teams.
Gunnar Wollan, Department of Geosciences, University of Oslo. "The need for HPC power and storage at the Department of Geosciences, today and in the future."
The Department of Geosciences consists of several research sections, and the need for HPC and storage is not evenly distributed among them. The Department is divided into the following sections:
- Section for Meteorology and Oceanography
- Section for Environmental Geology, Hydrology and Geohazards
- Section for Physical Geography
- Section for Petroleum Geology and Geophysics
- Section for Tectonics, Petrology and Geochemistry
Topics addressed:
- Which sections are the heavy users of HPC and storage at the Department
- Our HPC and storage situation today
- Plans for the near future (today - 2015)
- Our long-term needs (2015 - 2020, 2025): development in model usage and
complexity and the demand for HPC power and storage
Inge Jonassen, Bergen Center for Computational Science, Uni Research / University of Bergen. "Computational and Systems Biology - Increasing Needs for High Performance Computing and e-Infrastructures."
Biology is to an increasing extent an information science. Knowledge about biological systems is represented in the literature and in databases, and includes large volumes of measurement data and, to an increasing extent, models of biological systems. Examples of the former are DNA sequences and numbers summarizing gene activity. The latter include mathematical models using differential equations to describe dependencies between abundances of molecules of different species. Increasingly complex data types and models are appearing along with data types of growing volume and complexity. Life science research is now generating data volumes on a scale similar to nuclear physics, and the expectation is that the volumes and complexity of data will continue to grow rapidly. In biological research projects one aims to use all available relevant information to help design experiments and interpret the resulting data. Enabling biological researchers to deal with this in an efficient manner is clearly a challenge that requires adaptation and development of e-infrastructures tailored to the needs and competence of users with different backgrounds and objectives. Handling the data, and allowing queries and data mining to be performed in an efficient way, requires high performance computing. Some data, e.g., personal genomes, require security and privacy. Life science is likely to be the field with the highest requirements for both e-infrastructure and computational resources in the next decades. In this presentation I aim to present an overview with some more details for a few examples. I will also discuss some e-infrastructure projects in the life science area, and discuss how the planned European Bioinformatics Infrastructure (ELIXIR) relates to the challenges.
Ingrid Helen Garmann Østensen, Norwegian Institute of Public Health (FHI). "eInfrastructure challenges - Experiences from an ongoing GWAS project and the MoBa GWAS group."
As one of the largest projects based on The Norwegian Mother and Child Cohort Study (MoBa) data, we have been a pilot project in many aspects of eInfrastructure. This talk will be about our challenges and experiences regarding the process of establishing a solution, not only for our own project, but for all other projects tied to MoBa data.
Bénédicte Ferré, University of Tromsø. "Network design and conceptual model for a Virtual Institute of Deep-Sea Observatories. Establishment of cable-based ocean observatories in Norway." (PDF)
In connection with the European Strategy Forum on Research Infrastructures (ESFRI) roadmap and the European Multidisciplinary Seafloor Observation (EMSO) and European Sea-floor Observatory Network (ESONET) research activities, the Virtual Institute of Scientific Users of Deep-Sea Observatories (VISO) will develop a structure for a cyberspace network. The aims are to allow (1) persistence in long-term operation and data access, (2) system interoperability for real-time access and needs, (3) intercommunication in connecting various data streams, and (4) community building for increasing interactions across ocean science disciplines to understand the coupling of the ocean, climate and environmental processes. The network design will be performed in cooperation with the Ocean Observatory Initiative of the USA and Canada.
To help build such an e-infrastructure, seven leading Norwegian research institutions and partners from industry have joined forces in the Norwegian Ocean Observatory Network (NOON) to establish a cable-based observatory network in Norway: from a test site in Hardangerfjorden, via observatories on the shelf off Vesterålen (the gateway to the Barents Sea) and Svalbard, towards the deep sea. Cable-based ocean observatories (COO) represent a research infrastructure that enables continuous, in-situ, long-term monitoring of marine life and its environment in real time with low impact on the environment. Such infrastructures overcome the limitations of standard instrumentation (e.g. moving sensors like ROVs or gliders, or stationary systems like buoys) that only provides short-term data or offers limited electric power and data bandwidth.
(Bénédicte Ferré, Jürgen Mienert, Svein Winther, Friedrike Hoffmann, Anne Hageberg, Olav Rune Godø and the EMSO, ESONET and NOON team, Institute of Geology, University of Tromsø)