
Free University of Bozen-Bolzano

Big data methods for economics and business

Semester 1-2 · 27512 · Master in Data Analytics for Economics and Management · 12CP · EN


Module 1 focuses on advanced statistical techniques for analyzing high-dimensional datasets frequently encountered in business intelligence and economic research. Key topics include penalized and convex optimization methods for model selection (such as LASSO), model aggregation techniques, dimension reduction, high-dimensional regression models, and network-based inference using graphical models. The module also introduces multiple testing procedures for identifying significant patterns across many variables. Emphasis is placed on practical implementation using R and Python, and on the ability to apply these tools to extract interpretable, actionable insights from large-scale data in business and economic applications.

Module 2 provides an in-depth introduction to Natural Language Processing (NLP) with a strong focus on modern applications in business and economics. Core topics include algorithmic text classification, sentiment analysis, neural language modeling, and advanced information retrieval using vector-based and neural approaches. Students will learn techniques for web scraping, prompt engineering, and the use of Retrieval-Augmented Generation (RAG) systems, which combine document retrieval with generative models to improve accuracy and relevance. The module also explores recent developments in large language model (LLM) applications, including multi-agent systems and conversational AI, equipping students to critically evaluate and implement state-of-the-art NLP solutions.

Lecturers: Davide Ferrari, Paul Michael Pronobis

Teaching Hours:
M1: 24 hours of in-person lectures; 12 hours of video lectures (counted as 24 hours to account for re-watching)
M2: 24 hours of in-person lectures; 12 hours of video lectures (counted as 24 hours to account for re-watching)
Lab Hours: -
Mandatory Attendance: Recommended, but not required.

Course Topics
M1:
• High-dimensional data, big data and the curse of dimensionality
• Convex criteria for model selection
• Model aggregation and model combining
• Introduction to data dimension reduction
• High-dimensional regression
• Graphical models
• Multiple testing

M2:
1. Introduction to Natural Language Processing (NLP): Exploring the fundamentals of NLP, including its history, its applications, and how it differs from other applications of neural networks.
2. Algorithmic Text Classification and Sentiment Analysis: Detailed instruction on various algorithms for categorizing text and extracting sentiment, comparing their effectiveness and use cases.
3. Neural Networks in NLP and Language Modeling: An in-depth look at how neural networks are applied in NLP, focusing on using and evaluating different NLP models.
4. Advanced Techniques in Information Retrieval: Combining neural network strategies with vector space models to retrieve information efficiently.
5. Web Scraping for Knowledge Construction: Techniques for extracting information from the web to build databases for applications that demand current or extensive factual data.
6. Prompt Engineering for Enhanced Language Understanding: Crafting effective prompts to improve relation extraction, answer questions accurately, support dialog systems, and create responsive chatbots.
7. Fine-Tuning: The key steps for adapting pre-trained language models (CLM and MLM), from preprocessing to model training. Also covers performance evaluation with tools such as Wandb for monitoring and optimizing models across NLP tasks.
8. Innovations in Large Language Model (LLM) Applications: Exploring multi-agent conversations and the latest advancements in LLM applications for interactive AI systems.

Teaching format
The course adopts a blended, student-centred approach that emphasises problem-based learning and active engagement. A portion of the lecture content is made available online in advance, allowing students to explore key concepts independently and at their own pace before attending class. This preparatory work enables in-person sessions to focus on the application of knowledge through real-world problems, collaborative activities, and guided discussions — fostering critical thinking and deeper learning. The course is fully aligned with the principles of the Italian Universities Digital Hub (EDUNEXT) initiative (https://edunext.eu), which promotes the integration of digital resources and active learning strategies within university teaching.

Educational objectives
Knowledge and understanding: The student will acquire knowledge of the analytical techniques and tools required to understand and quantitatively analyse economic and business phenomena in order to support decision-making processes. Knowledge of statistical inference, linear models and their generalisations, linear algebra, and optimisation techniques will be consolidated. The student will also acquire in-depth knowledge of the main techniques of supervised and unsupervised statistical learning, which support the analysis and visualisation of economic and business data.

Applying knowledge and understanding: Ability to apply and implement analysis techniques on different types of datasets, such as streaming data, tabular data, documents and images, and on joint datasets. Ability to apply supervised and unsupervised learning methods, as well as knowledge modelling, extraction, integration, analysis and exploitation; these skills are applied across various application domains of interest to companies and to public and private entities.

Making judgements: Master's graduates will have the ability to apply the acquired knowledge to interpret data in order to make managerial and operational decisions in a business context. Master's graduates will have the ability to apply the acquired knowledge to support processes related to production, management and risk promotion activities and investment choices through the organisation, analysis and interpretation of complex databases.

Communication skills: Master's graduates will be able to communicate effectively, in oral and written form, the specialised contents of the individual disciplines, using different registers depending on the recipients and on the communicative and didactic purposes, and to evaluate the formative effects of their communication.

Learning skills: MSc graduates should be familiar with the tools of scientific research. They will also be able to make autonomous use of information technologies to carry out bibliographic research and investigations, both for their own training and for further education. In addition, through the curricular teaching and the activities related to the preparation of the final thesis, they will acquire the ability:
- to identify thematic links and to establish relationships between methods of analysis and application contexts;
- to frame a new problem in a systematic manner and to implement appropriate analysis solutions;
- to formulate general statistical-econometric models from the phenomena studied.

Assessment
The overall exam mark will be determined by the assessment of the two modules (M1+M2).

M1:
• Final Exam (60%): The final exam consists of problems on the use of statistical methods and the interpretation of results obtained from the analysis of various data sets.
• Assignments (40%): Data analysis assignments, to be handed in, will be set three times during the semester.

M2:
• Final Exam (60%): The final exam consists of problems on the use of statistical methods and the interpretation of results obtained from the analysis of various data sets.
• Assignments (40%): Data analysis assignments to be handed in.

Evaluation criteria
In both modules, the exam format is the same for attending and non-attending students: project work (40% of the final grade) and a written exam (60% of the final grade).
• Relevant for project work: clarity of presentation, ability to gain useful and novel insights from data, creativity, critical thinking, ability to adhere to reproducible research best practices
• Ability to use R and other software to perform basic data preparation tasks, ability to properly use R libraries, ability to choose the best type of graphical representation for different types of data, correct usage of basic statistical tools
• Ability to use Python to apply (understand, recall and use) data analytics methods in practical settings for data analysis and visualization

Required readings

M1:

Lederer, J. (2022). Fundamentals of high-dimensional statistics. Springer International Publishing.

M2:

Tunstall, L., Von Werra, L., & Wolf, T. (2022). Natural language processing with transformers. O'Reilly Media, Inc.





Modules

Semester 1 · 27512A · Master in Data Analytics for Economics and Management · 6CP · EN

Module A — M1 - Statistical methods for high-dimensional data

This module focuses on advanced statistical techniques for analyzing high-dimensional datasets frequently encountered in business intelligence and economic research. Key topics include penalized and convex optimization methods for model selection (such as LASSO), model aggregation techniques, dimension reduction, high-dimensional regression models, and network-based inference using graphical models. The module also introduces multiple testing procedures for identifying significant patterns across many variables. Emphasis is placed on practical implementation using R and Python, and on the ability to apply these tools to extract interpretable, actionable insights from large-scale data in business and economic applications.
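To illustrate what penalized model selection looks like in practice, here is a minimal sketch of the LASSO solved by cyclic coordinate descent. It is not course material: the function names, the objective scaling (1/(2n) times the residual sum of squares plus lam times the L1 norm of the coefficients) and the fixed iteration count are assumptions chosen for illustration, and only the Python standard library is used.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values within the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Approximately minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of responses. Hypothetical helper for
    illustration; no intercept and no standardisation are applied.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # mean squared value of column j
            for i in range(n):
                fit_without_j = sum(
                    X[i][k] * beta[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            beta[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


# Toy data: the first column carries signal, the second only a weak effect.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
coefficients = lasso_coordinate_descent(X, y, lam=0.1)
```

In this toy example the penalty sets the coefficient of the weak second variable exactly to zero, which is the variable-selection behaviour that makes the LASSO useful when there are many candidate predictors.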

Lecturers: Davide Ferrari

Teaching Hours: 24 hours of in-person lectures; 12 hours of video lectures (counted as 24 hours to account for re-watching)
Lab Hours: -

Course Topics
• High-dimensional data, big data and the curse of dimensionality
• Convex criteria for model selection
• Model aggregation and model combining
• Introduction to data dimension reduction
• High-dimensional regression
• Graphical models
• Multiple testing
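As an illustration of the multiple testing topic, the sketch below implements the Benjamini-Hochberg step-up procedure, one widely used way to control the false discovery rate. The syllabus does not say which procedures are covered, so this choice, the function name and the example p-values are all assumptions made for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k (1-based, p-values sorted in
    ascending order) with p_(k) <= (k / m) * alpha, then reject the k
    hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from four separate tests.
decisions = benjamini_hochberg([0.04, 0.001, 0.5, 0.008], alpha=0.05)
```

Here the p-value 0.04 would pass an unadjusted 0.05 threshold, but it is not rejected once the number of simultaneous tests is taken into account.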

Teaching format
This module adopts a blended, student-centred approach that emphasises problem-based learning and active engagement. A portion of the lecture content is made available online in advance, allowing students to explore key concepts independently and at their own pace before attending class. This preparatory work enables in-person sessions to focus on the application of knowledge through real-world problems, collaborative activities, and guided discussions — fostering critical thinking and deeper learning. The course is fully aligned with the principles of the Italian Universities Digital Hub (EDUNEXT) initiative (https://edunext.eu), which promotes the integration of digital resources and active learning strategies within university teaching.

Required readings

Lederer, J. (2022). Fundamentals of high-dimensional statistics. Springer International Publishing.



Semester 2 · 27512B · Master in Data Analytics for Economics and Management · 6CP · EN

Module B — M2 - Natural language processing and web analytics

This module provides an in-depth introduction to Natural Language Processing (NLP) with a strong focus on modern applications in business and economics. Core topics include algorithmic text classification, sentiment analysis, neural language modeling, and advanced information retrieval using vector-based and neural approaches. Students will learn techniques for web scraping, prompt engineering, and the use of Retrieval-Augmented Generation (RAG) systems, which combine document retrieval with generative models to improve accuracy and relevance. The module also explores recent developments in large language model (LLM) applications, including multi-agent systems and conversational AI, equipping students to critically evaluate and implement state-of-the-art NLP solutions.
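The retrieval step of a RAG pipeline can be sketched with a simple vector space model. The example below is a minimal illustration and not course material: it represents texts as bag-of-words count vectors, ranks documents by cosine similarity to the query, and places the best match in a prompt. The function names, the toy documents and the prompt layout are assumptions; the call to a generative model is deliberately left out, and a real system would typically use learned embeddings rather than word counts.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(doc))), doc)
        for doc in documents
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Assemble a prompt that gives the retrieved text as context."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical mini-corpus.
corpus = [
    "the central bank raised interest rates",
    "the company reported higher quarterly revenue",
    "new smartphone models were released",
]
prompt = build_prompt("interest rates decision", corpus)
```

The resulting prompt would then be passed to a language model, which answers using the retrieved context instead of relying only on what it learned during training.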

Lecturers: Paul Michael Pronobis

Teaching Hours: 24 hours of in-person lectures; 12 hours of video lectures (counted as 24 hours to account for re-watching)
Lab Hours: -
