BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//132.216.98.100//NONSGML kigkonsult.se iCalcreator 2.20.4//
BEGIN:VEVENT
UID:20260525T184221EDT-0150dT1Lbl@132.216.98.100
DTSTAMP:20260525T224221Z
DESCRIPTION:Abstract\n\nThe structure of a neural network is one of the critical determinants of its performance and dictates how information flows. The architecture determines the representational capacity\, inductive biases\, and computational efficiency of a model. These factors directly dictate how well the network learns from data\, how well it performs on a given task\, and how well it generalises to new examples.\n\nThe challenge of identifying a good neural network architecture has traditionally been solved by manual design\, relying on expert intuition and costly trial-and-error empirical experimentation. Such handcrafted designs are commonly reused for multiple applications to reduce the costs associated with structural design.\n\nWhile sharing the same architecture across a variety of tasks may bring potential benefits of native multimodality\, this choice can be inherently suboptimal for individual heterogeneous application requirements. As machine learning tasks grow more complex\, a rigorous architectural design is imperative to achieve robust generalisation and efficient utilisation of computational resources.\n\nThis thesis contributes several algorithms for efficient architecture design and adaptation. It addresses both paradigms for identifying structure: static\, where the goal is to identify the most suitable fixed architecture for a given task\; and dynamic\, where the structure of a neural network is adjusted to identify a specialised configuration for each data point of the task.\n\nIn the static paradigm\, two improvements are proposed in the field of Zero-Shot Neural Architecture Search\, a domain where an architecture must be selected without training any neural networks. The first contribution is a set of Zero-Shot ranking functions specifically designed for fast and memory-efficient evaluation of candidate architectures. They outperform state-of-the-art approaches not only in terms of accuracy\, but also in terms of computational efficiency. The second contribution is a statistical comparison procedure designed to achieve improved architecture search stability. This procedure is compatible with common search algorithms and effectively mitigates the problem of variability in Zero-Shot ranking functions.\n\nIn the dynamic paradigm\, the thesis presents two novel sparse Mixture of Experts methods that efficiently tackle the problem of expert specialisation. The first contribution is a novel expert routing system that is designed to enforce the specialisation of experts. The thesis demonstrates the benefits of the proposed system by explaining how it can be used to achieve effective knowledge transfer from a teacher Graph Neural Network into a more efficient student Mixture of Experts model\, outperforming existing Graph Neural Network knowledge distillation approaches.\n\nThe second contribution is a Mixture of Experts that is specifically tailored to the graph domain. The thesis proposes a novel graph-structure-aware expert routing procedure that is used to distribute inference tasks to a set of heterogeneous experts. This allows the learning architecture to adapt to distinct graph patterns and exhibit robustness across a wide variety of graph learning tasks.\n
DTSTART:20251001T173000Z
DTEND:20251001T193000Z
LOCATION:Room 603\, McConnell Engineering Building\, CA\, QC\, Montreal\, H3A 0E9\, 3480 rue University
SUMMARY:PhD defence of Pavel Rumiantsev – Efficient Algorithms for Automatic Structure Identification and Adaptation in Deep Neural Networks
URL:/ece/channels/event/phd-defence-pavel-rumiantsev-efficient-algorithms-automatic-structure-identification-and-adaptation-367714
END:VEVENT
END:VCALENDAR