Course for the students of NTUU
- Manager: Yuri Gordienko
- Teacher: Oleksandr Rokovyi
- Teacher: Yuriy Kochura
- Teacher: Oleg Alienin
The course is divided into 9 sessions: 7 teaching sessions + 2 project/evaluation sessions.
The overall planning of the course is as follows:
Jan/Feb 2025 (31 Jan, 3, 4 Feb)
Three courses devoted to EU regulations: the General Data Protection Regulation (GDPR), the Digital Markets Act (DMA) & Digital Services Act (DSA), and the Artificial Intelligence Act
May 2025 (19, 26, 27, 28 May)
Four courses devoted to governance issues of digital infrastructures, covering the Internet, telecommunications networks, cloud & cybersecurity, software and open source, artificial intelligence, and system architectures
June 2025 (12, 16 Jun)
The last two sessions are devoted to projects: they consist of case-study definition and analysis in small groups. Each group defines a use case (application, service, appliance, …) and analyses the corresponding ethics and sovereignty aspects that must be taken into account.
Evaluation will be based on the project presentations.
The course presents the field of intelligent data analysis as a novel research and application domain and gives students the tools to develop a range of data analysis applications.
Topics:
The students will gain an understanding of the different regulations, laws, and practical aspects of academic ethics and integrity, pertaining to research and development in computer science.
Topics:
Description:
The aim of the course is twofold: on the one hand, it provides an introduction to the general principles of networking and an overview of the main protocols of the TCP/IP stack; on the other hand, it presents more advanced topics involving the evolution of network- and transport-layer protocols.
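As a taste of the TCP/IP material, the following sketch (not part of the syllabus) shows the socket API that sits directly on top of the TCP transport layer: a toy echo server and client talking over the loopback interface, using only the Python standard library.

```python
# Minimal TCP echo over loopback: the server accepts one connection,
# reads a message, and sends it back unchanged.
import socket
import threading

def echo_server(sock):
    conn, _ = sock.accept()          # block until a client connects
    with conn:
        data = conn.recv(1024)       # read up to 1024 bytes
        conn.sendall(data)           # echo them back unchanged

# Listen on an ephemeral port on the loopback interface.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

t = threading.Thread(target=echo_server, args=(server,))
t.start()

# Client side: connect, send a message, read the echo.
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"hello")
    reply = client.recv(1024)

t.join()
server.close()
print(reply)  # b'hello'
```

Everything above the socket calls (IP routing, TCP handshake, retransmission) is handled by the stack the course describes; the application only sees a reliable byte stream.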
Topics:
Complementary content:
The course alternates theoretical lectures and lab sessions. The main idea consists of presenting the theoretical background of a specific subject, followed by a lab session in which students learn more details about each model and algorithm through practical examples using the most popular tools and libraries available. The course includes hands-on lab sessions with practical assignments, some of which are evaluated.
The course is connected-systems oriented, which means that, in addition to the most popular datasets, such as MNIST and California Housing, students will also see other examples of network-related datasets.
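To illustrate the kind of model-plus-algorithm material the labs cover, here is a hypothetical lab-style example (not taken from the course): fitting a line y = w·x + b to synthetic data with plain gradient descent, with no libraries at all.

```python
# Gradient descent for least-squares linear regression, from scratch.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]   # generated by y = 2x + 1

w, b = 0.0, 0.0
lr = 0.05
n = len(xs)

for _ in range(2000):
    # Gradients of mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # ≈ 2.0 1.0
```

The lab sessions use standard libraries rather than hand-written loops, but the update rule above is what those libraries implement under the hood.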
Topics:
Complementary content:
Lab: management of time-series in Recurrent Neural Networks (RNN)
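The core recurrence the RNN lab builds on can be sketched in a few lines. This is a hypothetical single-unit illustration (not the lab material itself): a recurrent cell unrolled over a time series, computing h_t = tanh(w_x·x_t + w_h·h_{t−1} + b), with example weights chosen arbitrarily.

```python
# A single-unit recurrent cell unrolled over a time series.
import math

def rnn_forward(xs, w_x=0.5, w_h=0.8, b=0.0):
    h = 0.0                      # initial hidden state
    states = []
    for x in xs:                 # one step per time-series element
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)        # hidden state carries past context forward
    return states

series = [1.0, 0.5, -0.5, -1.0]
hidden = rnn_forward(series)
```

The hidden state is what lets the network condition each output on the history of the series, which is why RNNs suit time-series data.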
Note: Monday the 2nd, 13:00–19:00.
Prerequisites:
Basics of data and databases.
Basics of programming.
Working knowledge of Python.
Working usage of Command Line Interface (CLI).
Pedagogical objectives:
In an age defined by the sheer magnitude, diversity, and speed of data production, expertise in Big Data technologies is indispensable. Traditional data management tools are insufficient for managing this data avalanche, necessitating innovative solutions. Our advanced course, 'Big Data Technologies', is tailor-made to equip students with the knowledge and hands-on skills crucial for navigating the realm of Big Data.
Our goal is simple: to instill a profound understanding of Big Data principles, frameworks, and state-of-the-art tools necessary for constructing resilient data systems capable of handling massive and intricate datasets. Throughout this course, students will master the basics of Big Data, recognize its pivotal role in today's data-centric world, and become proficient in employing various technologies and frameworks to design and implement scalable data solutions.
By the end of this intensive program, students will emerge with a refined skill set, enabling them to harness Big Data technologies adeptly, analyze data on a massive scale, and architect data systems primed for real-world challenges. Graduates will be primed to meet the burgeoning industry demand for skilled Big Data professionals, positioning them as invaluable assets in our data-driven landscape.
Description:
The STC will cover the following topics:
The big picture: tech megatrends.
Data modelling: Data vs data representation; Structured vs unstructured data; Relational data model; Semi-structured data models; Examples: CSV, JSON, XML, etc.; Graph data models; Data model vs data format; Data streams; Batch vs stream processing.
Characteristics of big data: The 3 (5) Vs; Big data vs small data; Getting value out of big data; Big data strategy.
Big data management systems: Relational DBs; NoSQL DBs.
Storing big data: HDFS; Data warehouse; Data lake; Object storage.
Big data retrieval: Querying SQL; Querying JSON; SPARQL.
Big data ingestion: Ingestion infrastructure; Message queues; Pub/Sub; MQTT; Apache Kafka.
Batch processing: MapReduce; Apache Spark; Dask.
Stream processing: Spark Streaming; Apache Flink.
Advanced architectures: from data lakes/DWH to delta lakes & lakehouse architecture.
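The data-modelling topic above distinguishes data from its representation. A small stdlib-only sketch (not course material) makes the point concrete: the same records serialized as CSV (flat, tabular) and as JSON (semi-structured, self-describing).

```python
# The same records in two representations: CSV vs JSON.
import csv
import io
import json

records = [
    {"id": 1, "name": "sensor-a", "reading": 21.5},
    {"id": 2, "name": "sensor-b", "reading": 19.0},
]

# CSV: column order carries the schema; values lose their types.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "reading"])
writer.writeheader()
writer.writerows(records)
csv_text = buf.getvalue()

# JSON: self-describing keys, native numbers, nesting is possible.
json_text = json.dumps(records)

# Round-trip: CSV comes back as strings, JSON keeps numeric types.
row = next(csv.DictReader(io.StringIO(csv_text)))
print(type(row["reading"]))                       # <class 'str'>
print(type(json.loads(json_text)[0]["reading"]))  # <class 'float'>
```

The underlying data is identical; only the representation, and hence what a consumer can recover without extra schema information, differs.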
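The batch-processing topic centers on the MapReduce pattern that Hadoop and Spark industrialize. A plain-Python word count (a sketch, not how those frameworks are invoked) shows the three phases:

```python
# MapReduce word count: map emits (key, 1) pairs, shuffle groups by key,
# reduce aggregates each group independently (hence parallelizable).
from collections import defaultdict

docs = ["big data big systems", "data lakes and data streams"]

# Map phase: one (word, 1) pair per occurrence.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle phase: group values by key.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: sum the counts within each group.
counts = {word: sum(vals) for word, vals in groups.items()}
print(counts["data"])  # 3
```

In a real cluster the map and reduce phases run on many machines and the shuffle moves data between them over the network; the program structure is the same.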
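The stream-processing topic contrasts with batch processing by operating on unbounded input. The central idea behind Spark Streaming and Flink windows can be sketched (illustratively, not with either framework's API) as a generator that aggregates a stream in fixed-size tumbling windows:

```python
# Tumbling-window averaging over an unbounded stream of readings.
def tumbling_windows(stream, size):
    window = []
    for item in stream:
        window.append(item)
        if len(window) == size:      # window is full: emit an aggregate
            yield sum(window) / size
            window = []              # start the next window

readings = iter([2.0, 4.0, 6.0, 1.0, 3.0, 5.0])
averages = list(tumbling_windows(readings, size=3))
print(averages)  # [4.0, 3.0]
```

Because results are emitted per window rather than after the whole input is seen, the same code keeps producing output on a stream that never ends.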
Evaluation modalities:
Written quiz; a project assignment, to be carried out after the STC, will also be evaluated.
Topics: