IS-ENES3 Summer School on Data Science for Climate Modelling
The Institute of Informatics & Telecommunications at NCSR Demokritos is delighted to co-organise, together with the IS-ENES3 consortium, and to host in September 2022 the upcoming IS-ENES3 Summer School on Data Science for Climate Modelling. This Summer School aims to increase expertise and skills in the theoretical and practical concepts of Data Science, with a focus on accelerating scientific discovery from data. Early-stage researchers will learn how to analyse, visualise and report on massive scientific datasets, as well as how to apply data-intensive and data-oriented paradigms and solutions to scientific discovery in climate science.
Driven by the theoretical background provided by domain, data and computer science experts, the school will adopt a hands-on approach to maximise results, focusing on the use of datasets linked to the IS-ENES3 data services. The school will strengthen the individual expertise of the participating climate and computer scientists, and will emphasise the need for collaboration between them, helping early-career scientists with different backgrounds to meet and network.
Applications are now CLOSED
Deadline for submissions: 30/04/2022
Deadline extended: 10/05/2022
At a glance
What: One-week, fully funded Summer School
When: 1-7 September, 2022
Where: Athens, Greece
Fee: Free of charge
February 2022: Call opens
30 April 2022: Deadline to apply
10 May 2022: Deadline extended
Mid June 2022: Decision of acceptance sent to applicants
1-7 September 2022: IS-ENES3 Summer School on Data Science for Climate Modelling
The IS-ENES3 Summer School on Data Science for Climate Modelling will be held in person from 1 to 7 September 2022 at the premises of the National Centre for Scientific Research Demokritos.
Please find below the draft programme of the IS-ENES3 Data Science Summer School.
European infrastructures for data-intensive computing | Tiziana Ferrari – EGI
Modern data-intensive and compute-intensive science in all domains involves modelling and simulation at very high resolution for prediction and inference workflows. Given the complexity of the computing workflows and the volume of data required by models, which can vary from gigabytes to petabytes of information to be processed per day, researchers need ready-to-use tools that federate access to heterogeneous and distributed computing architectures in order to run complex AI-based processing workflows. This requires ground-breaking innovation in computational and data handling capacity.
The EGI Federation, as an e-Infrastructure for data-intensive science, has the mission to deliver generic capabilities for high-volume and high-speed data acquisition and pre-processing, big data assimilation into models, forecast production by different simulation models, real-time processing of data, and validation of accuracy in modelling and simulation.
In this presentation we will see how these issues have been addressed by federating and sharing hundreds of data centres and a large portfolio of scientific applications, and how the European Open Science Cloud initiative is expected to create a European data space for research data.
Towards a Science of Trustworthy Artificial Intelligence | Barry O’Sullivan, University College Cork
There has been an increasing interest in the role of ethics in artificial intelligence and in the notion of Trustworthy AI, in particular. The European Commission has invested significant effort in setting out its strategy and its vision for the importance of delivering Trustworthy AI, including the proposal of a new regulatory instrument, the AI Act. In this talk I will give an overview of the policy context in relation to Trustworthy AI. I will give an overview of the scientific challenges associated with ensuring that AI and data-driven systems are trustworthy. I will highlight some applications that combine AI, in particular constraint programming and optimisation, with simulation-based methods, and the opportunities that these techniques offer in delivering Trustworthy AI.
Open Science Methodologies and Examples | Yannis Ioannidis – NKUA, Athena RC
Open Science is a new paradigm for scientific research and innovation. It changes the way researchers operate, how they share their data and other research products they use and produce, even at intermediate stages, what constitutes a publication, how research and researchers are evaluated, and essentially all aspects of the research endeavour. This presentation will provide a quick tour of the concept, with definitions and illustrative examples, and will discuss how Open Science is introduced in Europe and elsewhere and the policies that have been adopted in the recent past. It will also highlight the role that the OpenAIRE infrastructure is playing in implementing those policies, its technological architecture and profile, and its contribution to the even more general European Open Science Cloud (EOSC).
Machine/Deep learning (AI)
Theodoros Giannakopoulos – NCSR Demokritos
Basic principles of ML-based solutions for AI applications will be discussed. In particular, an introduction to widely used ML and DL algorithms will be provided, such as SVMs, Decision Trees, Neural Networks, Convolutional Neural Networks, Autoencoders and Sequential Models. Also, practical horizontal ML issues will be discussed, related to data integration (annotation, data gathering etc) and ML end-to-end evaluation.
Climate sciences, Environment
Christian Pagé – CERFACS
Climate: Introduction to the climate system, current and future climate modelling, as well as climate analysis to support climate change impacts’ assessment.
Climate and AI: Examples and ideas on the use of AI techniques for advanced climate data analysis.
Data engineering (DataEng)
Stephan Kindermann – DKRZ
Time-consuming parts of climate data analysis are the discovery, collection and access of input data, as well as the sharing of data products (following the FAIR data principles: Findable, Accessible, Interoperable and Reusable). This module therefore provides a theoretical as well as practical (hands-on) introduction to modern data discovery and data access mechanisms in the context of the existing petabyte-range climate data repositories. It also gives an overview of best practices for data sharing that support the FAIR data principles. Specific emphasis will be placed on accessing analysis-ready data hosted in the cloud as well as in large data centres. Practical hands-on exercises will be based on Jupyter notebooks with direct access to petabytes of climate model data hosted at DKRZ, together with a set of pre-installed basic data analysis packages.
Complete use-cases / applications (Apps)
Enrico Scoccimarro – CMCC
Francesco Immorlano – CMCC
Giovanni Aloisio – CMCC
In this module, a case study on Tropical Cyclone Detection and Tracking will be presented. The analysis will start from the traditional approaches currently used by meteorological centres, in order to point out and explain the effectiveness of novel Machine Learning-based techniques for tackling these tasks. The module will continue with operational examples of both traditional and Machine Learning models.
How to apply
Applications for the Summer School will open in February 2022 and will close on May 10, 2022, 23:59.
Number of participants & Eligibility Criteria
The number of participants in the Summer School is limited to 40 persons; with this compact group we want to create a committed ‘community’ that will help each other during this School.
The school primarily targets postgraduate students or researchers in related physical and computer sciences. The working language will be English. If the number of applications exceeds the maximum number of participants, participants will be selected according to the following criteria:
- The Summer School aims to attract a diverse selection of participants from the domains of climate sciences, data-intensive computing, AI and related fields. A balanced mix will be prioritised.
- The IS-ENES3 project aims to ensure a diverse and gender equal mix of students from around Europe, thus we encourage researchers from all European countries to submit their application.
Accommodation & Meals
The accommodation and meals of the students will be fully covered by the IS-ENES3 project. Please note that participants are required to make their own arrangements and pay for their travel to Athens, Greece.
Meet the Tutors & Keynote Speakers
|Tiziana Ferrari is the Director of the EGI Foundation. She has experience in European science policy and governance, open science commons, international standards, service management in highly distributed federated data and compute infrastructures, and large-scale project management. She contributed to the definition of European strategy on federated cloud and edge technologies and infrastructures in the context of the H-CLOUD project and has been contributing to the implementation of the first phase of the European Open Science Cloud by leading the first and largest implementation project, EOSC-hub. Tiziana was formerly Technical Director and Chief Operations Officer of the EGI Infrastructure, taking care of the operations coordination of the technical infrastructure, one of the largest computing platforms for research in the world. Tiziana holds a PhD in Electronics and Data Communications Engineering from the Universita’ degli Studi in Bologna.|
|Yannis Ioannidis is a Professor at the Department of Informatics and Telecommunications of the National and Kapodistrian University of Athens as well as an Associated Faculty at the “Athena” Research and Innovation Center, where he also served as the President and General Director for 10 years (2011-2021). His research interests include Database and Information Systems, Data Science, Data and Text Analytics, Scalable Data Processing, Data Infrastructures and Digital Repositories, Recommender Systems and Personalization, and Human-Computer Interaction, topics on which he has published over 160 articles in leading journals and conferences and also holds three patents. His work is often inspired by and applied to data management and analysis problems that arise in industrial environments or in the context of other scientific fields (Social Sciences and Humanities, Life Sciences, Physical Sciences, Biodiversity, Cultural Heritage) and the Arts. He has also led or is currently leading the creation of new international or spin-off companies. He is an ACM and IEEE Fellow, a member of Academia Europaea, and a recipient of the ACM SIGMOD Contributions Award and several other research and teaching awards. He has served as the Secretary / Treasurer of the Association for Computing Machinery (ACM), has been a member of the ACM Europe Council, and serves as the Faculty Sponsor of the Univ. of Athens ACM Student Chapter. He has been elected President of the ACM (July 2022-June 2024). Last but not least, he is on the strategic management board of the Greek hub of the UN Sustainable Development Solutions Network.|
|Barry O’Sullivan FAAAI, FEurAI, FIAE, FICS, MRIA, is an award-winning academic with more than 25 years of experience working in artificial intelligence. He is a full professor at the School of Computer Science & IT at University College Cork and a member of its Governing Body. He is founding Director of the Insight Centre for Data Analytics at UCC and Director of the SFI Centre for Research Training in AI. In July 2018 Barry was appointed Vice Chair of the European Commission High-Level Expert Group on AI. He is a Fellow and a past President of the European AI Association. He is also a Fellow and a member of the Executive Council of the Association for the Advancement of Artificial Intelligence. He is a member of the Royal Irish Academy, Ireland’s highest academic accolade. He chairs the Advisory Board of the GRACE project at Europol, and advises the Leuven.ai institute at KULeuven (Belgium) and the Computational Sustainability Network, a network of universities in the USA. In 2019 Professor O’Sullivan was appointed by Ireland’s Minister for Health to the Health Research Consent Declaration Committee. In 2020 he was appointed Chair of the Oversight Board of Health Data Research UK (North), led by the University of Liverpool. In 2021 he was, again, appointed by the Minister for Health as Chair of the National Research Ethics Committee for Medical Devices. In 2022 he was appointed by the Minister for Trade Promotion, Digital & Company Regulation to the Enterprise Digital Advisory Forum.|
The tutors of the IS-ENES3 Summer School on Data Science for Climate Modelling are the following:
|Theodoros Giannakopoulos received his Ph.D. from the department of Informatics and Telecommunications, UOA, in 2009. He is the coauthor of more than 100 publications in journals and conferences in the fields of pattern recognition and multimedia analysis and the coauthor of a book titled “Introduction to Audio Analysis: A MATLAB Approach”. He is an active member of the open source community, author of the pyAudioAnalysis and deep_audio_features libraries. He is currently a Tenured Researcher at the Institute of Informatics and Telecommunications, NCSR “Demokritos”, Greece. He has several years of experience in tutoring, mostly in Master Programs organized by NCSR Demokritos, courses such as: Machine Learning, Deep Learning, Data Programming and Multimodal Data Analysis. His research interests lie in the fields of multimodal machine learning, music information retrieval and speech analytics.|
|Francesco Immorlano is a Ph.D. student at the Department of Innovation Engineering of the University of Salento (Italy). He collaborates with the Exascale Machine Learning for Climate Change (EMLC2) research unit at the Advanced Scientific Computing (ASC) division of the Euro-Mediterranean Center on Climate Change (CMCC) Foundation. His work is focused on the study and the development of Machine Learning and Deep Learning models with a specific application to Extreme Weather Events and to other Climate Science-related use cases. He is involved in IS-ENES3 and eFlows4HPC European projects and conducts his research activity under the HPC-TRES research program.|
|Stephan Kindermann is a computer scientist who has worked for more than 20 years at the German Climate Computing Center (DKRZ) on data management infrastructures. He has been and remains involved in many national and international efforts to establish infrastructure and services that support the climate community in data distribution, access and processing. On the one hand, this includes involvement in infrastructure projects such as IS-ENES, EUDAT and EOSC, as well as efforts such as ESGF and Copernicus. On the other hand, it covers the establishment of operational data services at DKRZ. A major focus of his interests is the support of climate science workflows with the help of new distributed data infrastructure technologies and services. A key component of this support is the enabling of FAIR data and FAIR data services.|
|Christian Pagé holds a “highly qualified” research engineer position at CERFACS. He has been active in research and development since 1995, covering a large spectrum of atmospheric sciences. He has been involved in many large projects. He is currently involved in improving access to large data volumes for use within the climate community. He has been involved in several European projects, notably FLYSAFE, EUDAT/EUDAT2020, IS-ENES/IS-ENES2, CLIPC, SPECS, DARE, often as a work package leader. He is also involved in the Earth System Grid Federation (ESGF) Compute Working Expert team, on providing data processing near the data storage for large data volumes in a federated infrastructure. He is also part of the ESGF Executive Committee and plays an active role in the FAIR data aspects of the Research Data Alliance (RDA).|
|Enrico Scoccimarro is a Senior Scientist at the Euro-Mediterranean Center on Climate Change (CMCC), and deputy director of the Climate Simulations and Predictions division (CSP). He has 20 years of experience in climate modelling with a special focus on the coupling between the atmosphere and ocean components of General Circulation Models. He has been a partner and WP leader in several international projects (about ten H2020 projects) mainly dealing with high-resolution modelling and impacts associated with extreme events. His main research interest is in extreme events such as Tropical Cyclones, with particular focus on their interaction with the Climate System. He has been a member of the TCMIP (Tropical Cyclone Model Intercomparison Project) and a member of the US-CLIVAR Hurricane Working Group since 2011. He is the author of more than 60 peer-reviewed publications, presented at more than 100 international conferences, with most of the scientific production focusing on extreme events.|
Contact us with your questions