# Fluorescent Neuronal Cells v2

**Name:** Fluorescent Neuronal Cells v2

**Abbreviations:** *FNC*, *fluocells*

**Authors:** Luca Clissa (University of Bologna); Alessandra Occhinegro (University of Bologna); Emiliana Piscitiello (University of Bologna); Ludovico Taddei (University of Bologna); Antonio Macaluso (German Research Center for Artificial Intelligence); Roberto Morelli (University of Bologna); Fabio Squarcio (University of Bologna); Timna Hitrec (University of Bologna); Alessia Di Cristoforo (University of Bologna); Marco Luppi (University of Bologna); Roberto Amici (University of Bologna); Matteo Cerri (University of Bologna); Stefano Bastianini (University of Bologna); Chiara Berteotti (University of Bologna); Viviana Lo Martire (University of Bologna); Davide Martelli (University of Bologna); Domenico Tupone (University of Bologna) and Giovanna Zoccoli (University of Bologna)

**Contact person:** Luca Clissa (University of Bologna), luca.clissa2@unibo.it

**Publication year:** 2024

## Quick Start

The Fluorescent Neuronal Cells dataset is an archive featuring three collections of fluorescence microscopy images, namely *green*, *red* and *yellow*, shared through the corresponding zip files. The dataset contains 1874 images depicting stained neuronal cells in rodent brains. Alongside the data, we provide 750 ground-truth annotations for semantic segmentation, object detection and counting.

The dataset repository is organized as follows:

```
├── README.md          # readme file describing the structure and the content of the dataset
├── metadata_v2.xlsx   # summary of data structure
├── green.zip          # zip archive of the green image collection
├── red.zip            # zip archive of the red image collection
├── yellow.zip         # zip archive of the yellow image collection
├── datasheet.md       # dataset documentation
├── data_stats.zip     # descriptive statistics about the data
├── raw_data.zip       # zip archive of raw images
└── annotations.zip    # zip archive of raw annotation files, in VGG VIA csv format
```

The `data_stats.zip` file contains descriptive statistics about data composition and cell features:

```
├── cells_stats_df.csv                         # geometrical measures of cell extension and shape
├── aggregate_dataset_features.csv             # aggregate statistics of cell features (by image collection)
├── aggregate_datasetXpartition_features.csv   # aggregate statistics of cell features (by collection and data split)
```

Each image collection can be found in the corresponding zip archive. Once uncompressed, the resulting directory is structured as follows:

```
<collection_name>_v2/   # with collection_name = green, red or yellow
├── test
│   ├── ground_truths
│   ├── images
│   └── metadata
├── trainval
│   ├── ground_truths
│   ├── images
│   └── metadata
└── unlabelled          # no ground-truths
    ├── images
    └── metadata

./ground_truths/
├── masks               # png images with binary masks
├── rle                 # pickle files with Run-Length Encoding (RLE) of binary masks
├── Pascal_VOC          # .xml files with image annotations (polygon, bbox, dot)
├── COCO
│   └── annotations_<collection_name>_trainval.json
└── VIA
    └── annotations_<collection_name>_trainval.json
```

The data are already partitioned into *trainval* and *test* sets for model training and final assessment, respectively. For each image collection, input data can be found under `/images`, while the corresponding annotations are under `/ground_truths` (in several formats). Images and annotations can be associated based on their filenames.
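As a minimal usage sketch, the snippet below pairs each *trainval* image of the green collection with its binary mask and derives a per-image cell count via connected components. The extraction path (`green_v2/`) and the exact filename matching are assumptions based on the layout above, not guarantees of the archive:

```python
from pathlib import Path

import numpy as np
from PIL import Image
from scipy import ndimage

# Hypothetical paths: assumes green.zip was extracted to green_v2/ and that
# each mask shares its image's filename (the README states images and
# annotations are associated via filenames).
root = Path("green_v2/trainval")

for img_path in sorted((root / "images").glob("*.png")):
    mask_path = root / "ground_truths" / "masks" / img_path.name
    if not mask_path.exists():
        continue
    image = np.array(Image.open(img_path))
    mask = np.array(Image.open(mask_path)) > 0  # binarize

    # Rough per-image cell count: connected components of the binary mask.
    _, n_cells = ndimage.label(mask)
    print(f"{img_path.name}: image {image.shape}, {n_cells} cells")
```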
### Terms of Use

All files in this dataset are licensed under the Creative Commons Attribution 4.0 International License, available at: https://creativecommons.org/licenses/by/4.0

Specific copyrights apply to the images and to the annotations.

For the images contained in the folders named `images` and in the `raw_data.zip` archive:

`Copyright © 2024 Alessandra Occhinegro, Fabio Squarcio, Timna Hitrec, Alessia Di Cristoforo, Marco Luppi, Roberto Amici, Matteo Cerri, Stefano Bastianini, Chiara Berteotti, Viviana Lo Martire, Davide Martelli, Domenico Tupone and Giovanna Zoccoli`

For the annotations contained in the folders named `ground_truths` and in the `annotations.zip` archive:

`Copyright © 2024 Luca Clissa, Alessandra Occhinegro, Emiliana Piscitiello, Ludovico Taddei, Antonio Macaluso, Roberto Morelli, Marco Luppi`

When using the Fluorescent Neuronal Cells data collection, please cite [1, 2]:

`Clissa, L., Macaluso, A., Morelli, R., Occhinegro, A., Piscitiello, E., Taddei, L., Luppi, M., Amici, R., Cerri, M., Hitrec, T., Rinaldi, L., Zoccoli, A., 2024. Fluorescent Neuronal Cells v2: Multi-Task, Multi-Format Annotations for Deep Learning in Microscopy. Scientific Data, DOI https://doi.org/10.1038/s41597-024-03005-9`

`Clissa, L., Occhinegro, A., Piscitiello, E., Taddei, L., Macaluso, A., Morelli, R., Squarcio, F., Hitrec, T., Di Cristoforo, A., Luppi, M., Amici, R., Cerri, M., Bastianini, S., Berteotti, C., Lo Martire, V., Martelli, D., Tupone, D., Zoccoli, G. (2024) Fluorescent Neuronal Cells v2. University of Bologna. DOI https://doi.org/10.6092/unibo/amsacta/7347 [Dataset]`

---

## Abstract

Fluorescent Neuronal Cells v2 is a collection of fluorescence microscopy images and the corresponding ground-truth annotations, designed to foster innovative research in the domains of Life Science and Deep Learning. This dataset encompasses three image collections wherein rodent neuronal cell nuclei and cytoplasm are stained with diverse markers to highlight their anatomical or functional characteristics. Specifically, we release 1874 high-resolution images alongside 750 corresponding ground-truth annotations for several learning tasks, including semantic segmentation, object detection and counting. The contribution is two-fold. First, thanks to the variety of annotations and their accessible formats, we envision that our work will facilitate methodological advancements in computer vision approaches for segmentation, detection, feature learning, unsupervised and self-supervised learning, transfer learning, and related areas. Second, by enabling extensive exploration and benchmarking, we hope Fluorescent Neuronal Cells v2 will catalyze breakthroughs in fluorescence microscopy analysis and promote cutting-edge discoveries in life sciences. The data are available at DOI: [10.6092/unibo/amsacta/7347](https://amsacta.unibo.it/id/eprint/7347). For more information, please refer to [1].

## Data acquisition

The data acquisition design was split into two independent phases: i) image acquisition and ii) data annotation.

### Image acquisition

The images of the *green* and *yellow* collections were acquired as part of [5], while the *red* images were collected as part of unpublished experiments. The raw data are provided as *.TIFF* files inside the *raw_data.zip* archive. Each image collection folder contains *.png* images converted losslessly from the original files, with filenames changed to a progressive index for easier manipulation. The file `metadata_v2.xlsx` contains a map between the original filename (`original_name` column) and the corresponding numbered name (`image_name` column).
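As a minimal sketch, the mapping can be loaded with pandas; the column names come from the description above, while the assumption that they sit on the spreadsheet's first sheet is ours:

```python
import pandas as pd

# Build a lookup from original microscope filenames to progressive-index
# names. Assumes metadata_v2.xlsx exposes the `original_name` and
# `image_name` columns on its first sheet (requires openpyxl).
metadata = pd.read_excel("metadata_v2.xlsx")
name_map = dict(zip(metadata["original_name"], metadata["image_name"]))

print(f"{len(name_map)} filename pairs loaded")
```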
### Data annotation

The ground-truth labels were collected as polygon annotations through the Visual Geometry Group (VGG) Image Annotator (VIA) tool, and exported as *.csv* files (see *Annotations Protocol.pdf* for detailed instructions regarding the labelling process). The corresponding raw files are provided inside the *annotations.zip* file.

For the green collection, the process was divided into 3 annotation rounds, resulting in `c-FOS_first_round_reviewed.csv`, `c-FOS_second_round_reviewed.csv` and `c-FOS_third_round_reviewed.csv`. For the red collection, the annotation took four rounds; the result is reported in a single file covering the whole process, namely `Orx_fourth_round_reviewed.csv`. For the yellow collection, previous annotations from v1 [3,4] were used as a starting point and refined with an additional round of revision. The final result is reported in `CTb_first_round_reviewed.csv`.

These raw annotations were then converted to several annotation **formats** (binary masks, RLE encoding, COCO json, Pascal VOC xml and VIA json) and **types** (polygon, bounding box, dot-annotation, count), available under the `ground_truths/` folder of each image collection, as in the sketch below.
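As an example of consuming one of these formats, the COCO json can be parsed with the standard `pycocotools` package. This is a minimal sketch: the annotation path is assumed from the naming pattern shown in the Quick Start, not verified against the archive:

```python
from pycocotools.coco import COCO

# Hypothetical path for the green collection, following the Quick Start layout.
coco = COCO("green_v2/trainval/ground_truths/COCO/annotations_green_trainval.json")

img_id = coco.getImgIds()[0]
img_info = coco.loadImgs(img_id)[0]
anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))

print(f"{img_info['file_name']}: {len(anns)} annotated cells")
for ann in anns:
    x, y, w, h = ann["bbox"]         # bounding-box type, COCO (x, y, width, height)
    cell_mask = coco.annToMask(ann)  # per-cell binary mask from the polygon
```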
> **Note:** The previous version of the dataset (Fluorescent Neuronal Cells [3,4]) features the same 283 images contained in the `trainval` and `test` partitions of Fluorescent Neuronal Cells v2. However, the files in Fluorescent Neuronal Cells v2 are re-converted from *.TIF* to *.png* preserving metadata. Regarding the ground-truths, Fluorescent Neuronal Cells v2 uses the binary masks of the previous version as pre-annotations: these were loaded into VIA, then reviewed, refined and corrected to smooth contours, remove holes and small objects, and ensure more consistent labels. The remaining images, in the `unlabelled/` folder, are newly introduced in Fluorescent Neuronal Cells v2.

## Challenges

Several relevant challenges are present:

- variability in brightness and contrast causes some fickleness in the pictures' overall appearance
- cells exhibit varying saturation levels due to the natural fluctuation of the fluorescent emission properties
- the substructures of interest have a fluid nature, so the shape of the stained cells may change significantly
- artifacts, bright biological structures (like neuronal filaments) and non-marked cells are present
- cells sometimes clump together and/or overlap each other
- the number of target cells shifts broadly from image to image, from no stained cells to several dozens

All of these factors make the segmentation/recognition task harder, sometimes creating borderline cases that are open to subjective interpretation.

## Funding

This research was partly funded by PNRR - M4C2 - Investimento 1.3, Partenariato Esteso PE00000013 - "FAIR - Future Artificial Intelligence Research" - Spoke 8 "Pervasive AI" and by the European Commission under the NextGeneration EU programme. The collection of the original images was supported by funding from the University of Bologna (RFO 2018) and the European Space Agency (Research agreement collaboration 4000123556).

## Ethical approval

All the experiments were conducted following approval by the ethical committee of the National Health Authority. Mice and rats underwent experiments at different times and were subject to different legislation for the ethical approval of the experimental procedures: i) for rats, the experimental protocol was approved by the Ethical Committee for Animal Research of the University of Bologna and by the Italian Ministry of Health (decree: No.186/2013-B), and was performed in accordance with the European Union (2010/63/EU) and the Italian Ministry of Health (January 27, 1992, No. 116) Directives, under the supervision of the Central Veterinary Service of the University of Bologna and the National Health Authority; ii) for mice, the experimental protocol was approved by the National Health Authority (decree: No.141/2018 - PR/AEDB0.8.EXT.4), in accordance with DL 26/2014 and the European Union Directive 2010/63/EU, and under the supervision of the Central Veterinary Service of the University of Bologna. All efforts were made to minimize the number of animals used and their pain and distress.

## References

[[1]](https://doi.org/10.1038/s41597-024-03005-9) Clissa, L., et al., 2024. Fluorescent Neuronal Cells v2: Multi-Task, Multi-Format Annotations for Deep Learning in Microscopy. Scientific Data. DOI https://doi.org/10.1038/s41597-024-03005-9

[[2]](https://amsacta.unibo.it/id/eprint/7347) Clissa, L., Occhinegro, A., Piscitiello, E., Taddei, L., Macaluso, A., Morelli, R., Squarcio, F., Hitrec, T., Di Cristoforo, A., Luppi, M., Amici, R., Cerri, M., Bastianini, S., Berteotti, C., Lo Martire, V., Martelli, D., Tupone, D., Zoccoli, G. (2023) Fluorescent Neuronal Cells v2. University of Bologna. DOI https://doi.org/10.6092/unibo/amsacta/7347 [Dataset]

[[3]](https://amsacta.unibo.it/id/eprint/6706/) Clissa, L., Morelli, R., Squarcio, F., Hitrec, T., Luppi, M., Rinaldi, L., Cerri, M., Amici, R., Bastianini, S., Berteotti, C., Lo Martire, V., Martelli, D., Occhinegro, A., Tupone, D., Zoccoli, G., Zoccoli, A. (2021) Fluorescent Neuronal Cells. University of Bologna. DOI https://doi.org/10.6092/unibo/amsacta/6706 [Dataset]

[[4]](https://doi.org/10.1038/s41598-021-01929-5) Morelli, R., Clissa, L., Amici, R., Cerri, M., Hitrec, T., Luppi, M., Rinaldi, L., Squarcio, F. and Zoccoli, A., 2021. Automating cell counting in fluorescent microscopy through deep learning with c-ResUnet. Scientific Reports, 11(1), p.22920. DOI https://doi.org/10.1038/s41598-021-01929-5

[[5]](https://www.nature.com/articles/s41598-019-51841-2) Hitrec, T., Luppi, M., Bastianini, S., Squarcio, F., Berteotti, C., Martire, V.L., Martelli, D., Occhinegro, A., Tupone, D., Zoccoli, G. and Amici, R., 2019. Neural control of fasting-induced torpor in mice. Scientific Reports, 9(1), pp.1-12. DOI https://doi.org/10.1038/s41598-019-51841-2

[[6]](http://amsdottorato.unibo.it/10016/) Clissa, L., 2022. Supporting Scientific Research Through Machine and Deep Learning: Fluorescence Microscopy and Operational Intelligence Use Cases. Alma Mater Studiorum Università di Bologna, doctoral programme in Data Science and Computation, 33rd cycle. DOI 10.48676/unibo/amsdottorato/10016 [Dissertation thesis]