About

This web resource lets anyone predict the most important spectral properties of BODIPY-class compounds using machine learning methods.

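Currently, using this resource, you can predict:

Absorption maximum (nm)
Molar absorption coefficient (logε)
Emission maximum (nm)
Fluorescence lifetime (ns)
Fluorescence quantum yield
Singlet oxygen generation quantum yield

All of these properties are predicted with the solvent effect taken into account.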

Method

All models presented here are CatBoost models trained under strict 5-fold cross-validation (5-CV). RDKit descriptors are used to describe both the BODIPY structure and the solvent molecule. In addition, solvent polarity parameters are included as descriptors to account more accurately for the solvent's effect on the spectral properties of BODIPY.
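As a rough illustration of the feature layout described above, the sketch below merges three descriptor groups (dye, solvent, solvent polarity) into one flat row. The function name, the prefixes, and the descriptor names are hypothetical, and the numbers are dummy values; computing real descriptors would require RDKit, which is deliberately left out here.

```python
from typing import Dict


def build_feature_row(
    dye_descriptors: Dict[str, float],
    solvent_descriptors: Dict[str, float],
    solvent_polarity: Dict[str, float],
) -> Dict[str, float]:
    """Merge three descriptor groups into one flat feature row.

    Prefixes keep identically named descriptors (the same descriptor
    computed for the dye and for the solvent) from overwriting each other.
    """
    row: Dict[str, float] = {}
    for prefix, group in (
        ("dye", dye_descriptors),
        ("solvent", solvent_descriptors),
        ("polarity", solvent_polarity),
    ):
        for name, value in group.items():
            row[f"{prefix}__{name}"] = float(value)
    return row


# Dummy numbers for illustration only; they are not real descriptor values.
example_row = build_feature_row(
    dye_descriptors={"desc_a": 1.0, "desc_b": 2.0},
    solvent_descriptors={"desc_a": 3.0, "desc_b": 4.0},
    solvent_polarity={"param_1": 5.0},
)
```

A row built this way could be passed to any tabular regressor. The prefixing only matters because the same descriptor set is computed for two different molecules.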

Model Performance Metrics

The predictive performance of each model was evaluated using three standard regression metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and coefficient of determination (R²). The reported values are averaged across all folds of a strict 5-fold cross-validation, where splitting was performed based on unique SMILES to avoid data leakage.
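The SMILES-based splitting mentioned above can be sketched as follows. This is a minimal illustration, not the published training code: it assumes the grouping key is the dye SMILES, and the function name `grouped_kfold` and the placeholder strings are hypothetical. The point is that every record of a given structure, for example the same dye measured in several solvents, lands in a single test fold.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def grouped_kfold(
    groups: Sequence[str], n_splits: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return (train_idx, test_idx) pairs; each group is held out exactly once."""
    rows_by_group: Dict[str, List[int]] = defaultdict(list)
    for idx, group in enumerate(groups):
        rows_by_group[group].append(idx)

    unique = sorted(rows_by_group)
    if len(unique) < n_splits:
        raise ValueError("need at least n_splits unique groups")
    random.Random(seed).shuffle(unique)  # reproducible shuffle of the groups

    splits = []
    for k in range(n_splits):
        held_out = set(unique[k::n_splits])  # every n_splits-th group
        test_idx = [i for i, g in enumerate(groups) if g in held_out]
        train_idx = [i for i, g in enumerate(groups) if g not in held_out]
        splits.append((train_idx, test_idx))
    return splits


# Placeholder strings standing in for dye SMILES; repeats mimic one dye
# measured in several solvents.
dye_smiles = ["C", "C", "CC", "CCC", "CCC", "CCCC",
              "CCCCC", "CCCCCC", "CCCCCC", "CCCCCCC"]
splits = grouped_kfold(dye_smiles, n_splits=5)
```

If scikit-learn is available, its `GroupKFold` class gives the same guarantee.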

Property                                   MAE     RMSE    R²
Absorption maximum (nm)                    9.45    16.10   0.9290
Molar absorption coefficient (logε)        0.15    0.21    0.3139
Emission maximum (nm)                      10.44   18.73   0.9113
Fluorescence lifetime (ns)                 1.21    1.58    0.5396
Fluorescence quantum yield                 0.15    0.20    0.6426
Singlet oxygen generation quantum yield    0.11    0.15    0.6566
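For reference, the three metrics in the table and their averaging over folds can be written out in a few lines. This is a generic sketch of the standard definitions, not the evaluation script used for SpecML; the function names and the toy numbers are illustrative only.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute MAE, RMSE and R² for one fold."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R² is undefined when all true values are identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


def average_over_folds(per_fold: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric across folds, as done for the reported values."""
    return {
        key: sum(m[key] for m in per_fold) / len(per_fold)
        for key in per_fold[0]
    }


# Toy numbers, not real measurements.
fold_metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
averaged = average_over_folds([fold_metrics, fold_metrics])
```

MAE and RMSE carry the units of the property, while R² is dimensionless. RMSE is never smaller than MAE, which matches every row of the table.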

You can learn more about the training protocol in our article:

A.A. Ksenofontov, Yu.V. Eremeeva, P.S. Bocharov, D.M. Makarov, SpecML: web tool for predicting the spectral properties of BODIPYs, Spectrochimica Acta - Part A: Molecular and Biomolecular Spectroscopy (2025) 127091. https://doi.org/10.1016/j.saa.2025.127091

Citing SpecML web tool:

If you use SpecML, please cite the article given above (https://doi.org/10.1016/j.saa.2025.127091).

Dataset

How to use

At the moment, you can use this resource to screen the spectral properties of BODIPYs based on their SMILES representations. To do this, use the "Upload SMILES" block. By uploading an *.xlsx file containing the SMILES of BODIPYs and solvents (no more than 100 BODIPY SMILES per upload), you can predict all available spectral properties for many compounds at once. After a successful prediction, the results can be downloaded by clicking the "Download Results as *.xlsx file" button.
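If you have more than 100 compounds, the list has to be split before uploading. The sketch below shows one way to do that. It assumes the limit applies per row of (BODIPY SMILES, solvent SMILES); the function name and the placeholder strings are hypothetical. Writing each batch to an *.xlsx file is left out, since the required column layout is not described here.

```python
from typing import List, Sequence, Tuple

MAX_DYES_PER_UPLOAD = 100  # limit stated on this page

Row = Tuple[str, str]  # (BODIPY SMILES, solvent SMILES)


def split_into_uploads(
    rows: Sequence[Row], limit: int = MAX_DYES_PER_UPLOAD
) -> List[List[Row]]:
    """Cut a long list of rows into consecutive batches of at most `limit` rows."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return [list(rows[i:i + limit]) for i in range(0, len(rows), limit)]


# 250 placeholder rows; the strings are stand-ins, not real SMILES.
all_rows = [(f"dye_{i}", "solvent_0") for i in range(250)]
batches = split_into_uploads(all_rows)
batch_sizes = [len(batch) for batch in batches]
```

Each batch can then be saved as its own file and uploaded separately.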

Additionally, you can predict all available spectral properties for a single compound by entering its SMILES (or drawing the molecule) and selecting the desired solvent. This feature is available in the "Enter SMILES" block. The prediction results will appear in the "Prediction Results" and "Recommended Applications" blocks below. If needed, you can download the prediction results as an *.xlsx file by clicking "Download Results as *.xlsx".

In addition to property prediction, users can activate the "BODIPYs similarity search module". By default, the module returns the five most similar BODIPYs from the dataset; this number can be raised to a maximum of ten. The module ranks molecules by structural similarity using the Tanimoto index. It works on pre-calculated ECFP4 fingerprints (2048 bits) and returns a sorted list of the most structurally similar molecules together with the identifiers (DOIs) of the publications they come from.
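The ranking step described above can be sketched in plain Python. This is an illustration under stated assumptions, not the module's actual code: fingerprints are represented as sets of on-bit indices (0 to 2047), and the function names and reference labels are hypothetical. In practice, ECFP4-type fingerprints would come from a cheminformatics toolkit such as RDKit (Morgan fingerprints with radius 2).

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto index: shared on-bits divided by on-bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def most_similar(
    query: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    top_n: int = 5,
    max_n: int = 10,
) -> List[Tuple[str, float]]:
    """Return up to `top_n` library entries, most similar first."""
    if not 1 <= top_n <= max_n:
        raise ValueError(f"top_n must be between 1 and {max_n}")
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_n]


# Toy fingerprints; the labels stand in for dataset entries.
query_fp = frozenset({1, 5, 9, 200})
library_fps = {
    "ref_A": frozenset({1, 5, 9, 200}),  # identical to the query
    "ref_B": frozenset({1, 5, 9}),       # 3 shared of 4 total bits
    "ref_C": frozenset({1, 300}),        # 1 shared of 5 total bits
    "ref_D": frozenset({400, 500}),      # nothing shared
}
top_hits = most_similar(query_fp, library_fps, top_n=2)
```

The Tanimoto index ranges from 0 (no shared bits) to 1 (identical fingerprints), so the sorted list puts the closest structural analogues first.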

Module capabilities:

News

Our Team

Alexander Ksenofontov
Team Leader
Pavel Bocharov
Team Member
Yuliya Eremeeva
Team Member

Contacts

The development team consists of researchers from the G.A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences (Ivanovo, Russia).

If you have any questions, comments, or suggestions about SpecML, feel free to contact us via

This research was funded by the Russian Science Foundation (grant number 24-73-00006).

Our projects