Scientists Present New Solution to Imbalanced Learning Problem

Specialists at the HSE Faculty of Computer Science and Sber AI Lab have developed a geometric oversampling technique known as Simplicial SMOTE. Tests on various datasets have shown that it significantly improves classification performance. This technique is particularly valuable in scenarios where rare cases are crucial, such as fraud detection or the diagnosis of rare diseases. The study's results are available on arXiv.org, an open-access archive, and will be presented at the International Conference on Knowledge Discovery and Data Mining (KDD) in summer 2025 in Toronto, Canada.

The problem of imbalanced learning is becoming increasingly relevant across various fields, including banking and medicine. Conventional methods, such as random oversampling, often generate low-quality samples or fail to accurately model rare class data.

Simplicial SMOTE (Synthetic Minority Oversampling Technique), a novel solution proposed by scientists from HSE University and Sber AI Lab, addresses these issues by enabling more accurate modelling of complex topological data structures and improving classifier performance on imbalanced datasets.

It generates new examples of a rare class by leveraging information from several close instances at once (a 'simplex'), rather than from just two close points, as in the original SMOTE and its well-known modifications. This captures the shape of the rare class more faithfully and improves classifier performance. The technique improves training on imbalanced data, where one class (e.g., normal transactions) has many examples, while another class (e.g., fraud) has few.
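The difference between the two sampling schemes can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameters `k` and `p`, and the choice of Dirichlet weights are assumptions made for illustration, and the simplex here is simply a point plus `p` of its nearest neighbours, which simplifies how the paper selects simplices.

```python
import numpy as np

def k_nearest(X, i, k):
    """Indices of the k nearest neighbours of X[i], excluding i itself."""
    dist = np.linalg.norm(X - X[i], axis=1)
    order = np.argsort(dist)
    return order[order != i][:k]

def smote_point(X, k, rng):
    """Classic SMOTE: interpolate on the segment between two close points."""
    i = rng.integers(len(X))
    j = rng.choice(k_nearest(X, i, k))
    t = rng.random()  # position along the segment, in [0, 1)
    return X[i] + t * (X[j] - X[i])

def simplex_point(X, k, p, rng):
    """Simplified simplicial sampling (illustrative assumption):
    a convex combination of a point and p of its k nearest neighbours."""
    i = rng.integers(len(X))
    nbrs = rng.choice(k_nearest(X, i, k), size=p, replace=False)
    vertices = X[np.concatenate(([i], nbrs))]
    weights = rng.dirichlet(np.ones(p + 1))  # non-negative, sum to 1
    return weights @ vertices

rng = np.random.default_rng(0)
minority = rng.normal(size=(20, 2))  # toy rare-class sample
new_edge = smote_point(minority, k=5, rng=rng)
new_simplex = simplex_point(minority, k=5, p=2, rng=rng)
```

With `p=1` the simplex reduces to a segment, so the second function covers the two-point case as well; larger `p` lets synthetic points fill the area or volume between neighbours instead of only the edges connecting them.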

Researchers have shown experimentally on a large number of test datasets that the proposed approach achieves significantly better performance metrics, such as the F1 score and the Matthews correlation coefficient, than both the basic SMOTE and its modifications. In particular, an improvement was observed for gradient boosting, a classifier commonly used in practice.
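For readers unfamiliar with the two metrics mentioned above, the following sketch computes them from the standard confusion-matrix formulas. The counts are made up for illustration and are not results from the study.

```python
import math

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the rare (positive) class."""
    return 2 * tp / (2 * tp + fp + fn)

def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 if any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical outcome on 100 transactions, 12 of which are fraud.
tp, fp, fn, tn = 8, 2, 4, 86

accuracy = (tp + tn) / (tp + fp + fn + tn)
f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, fp, fn, tn)
print(f"accuracy={accuracy:.3f}  F1={f1:.3f}  MCC={mcc:.3f}")
```

Accuracy looks high here mainly because the majority class dominates, whereas F1 and MCC are lower because they penalise the missed fraud cases. This is why such metrics are preferred for imbalanced problems.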

'Our technique is particularly effective for tasks involving imbalanced data, where the rare class holds greater significance. Banks can use Simplicial SMOTE to detect fraud more effectively, and medical centres can apply it to diagnose rare diseases,' says Andrey Savchenko, co-author of the article and Leading Research Fellow at the Laboratories for Theoretical Modelling in AI of the HSE AI and Digital Science Institute.

The new technique can be integrated into existing oversampling algorithms (such as Borderline-SMOTE, Safe-level-SMOTE, and ADASYN), enabling better accuracy without significantly increasing computational complexity. According to the researchers, the developed approach could contribute to the creation of more accurate and reliable machine learning models, thereby improving the quality of analytics.

The study was conducted with support from the HSE Basic Research Programme.
