Preprocessing: pp
Preprocessing is handled outside this package. See https://github.com/mengerj/adata_hf_datasets for the workflow that creates the Hugging Face datasets used to train this project's custom Sentence Transformers models.