miRInter-Trans is a miRNA-ncRNA interaction classifier, based on a Transformer architecture.
A CUDA environment with at least 8 GB of VRAM is required.
Environment file: miRInter-Trans/RNA-FM/environment.yml
1. Extract embeddings with RNA-FM for all the unique sequences in the dataset, including all the augmented sequences.
2. The embeddings are written to 'miRInter-Trans/RNA-FM/redevelop/resuts/.../representations'. Load them and save them as a dict, using db_unique_sequences.
3. This gives the seeq_to_emb dict.
4. Prepare the dataset as 5 holdouts (train/test separation) with the 'prepare_db.py' script, which takes all positive couples and, as its last step, attaches the embeddings from the seeq_to_emb dictionary.
5. Then compute the negatives with 'classification_dataset_creation.py', which applies the global positives check fix to the negatives in both the train and test sets.
6. classification_miRNA-miRNA-example-notebook.ipynb: trains miRInter-Trans on a set of miRNA-miRNA interactions, given a folder containing the output of classification_dataset_creation.
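
Steps 4 and 5 can be illustrated with a minimal, standard-library-only sketch. It is not the code in 'prepare_db.py' or 'classification_dataset_creation.py'. The function names (make_holdouts, sample_negatives), the toy pairs, and the choice to treat the 5 holdouts as 5 disjoint folds are all assumptions made for illustration. Pairs are treated as ordered, and the "global positives check" is read here as: a candidate negative is rejected if it is a known positive anywhere in the dataset, not only in its own split.

```python
import random
from itertools import product


def make_holdouts(positives, n_holdouts=5, seed=0):
    """Split positive pairs into n_holdouts (train, test) holdouts.

    Assumption: each holdout uses one disjoint fold as the test set
    and the remaining folds as the training set.
    """
    shuffled = list(positives)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_holdouts] for i in range(n_holdouts)]
    holdouts = []
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        holdouts.append((train, test))
    return holdouts


def sample_negatives(split_positives, all_positives, n_neg, rng):
    """Sample up to n_neg negatives by recombining partners within a split.

    Global positives check (as assumed here): any candidate that is a
    known positive anywhere in the dataset is rejected.
    """
    known = set(all_positives)
    left = sorted({a for a, _ in split_positives})
    right = sorted({b for _, b in split_positives})
    candidates = [pair for pair in product(left, right) if pair not in known]
    rng.shuffle(candidates)
    return candidates[:n_neg]


# Toy positive couples (hypothetical identifiers, not real data).
positives = [
    ("m1", "n1"), ("m1", "n2"), ("m2", "n2"), ("m2", "n3"), ("m3", "n1"),
    ("m3", "n3"), ("m4", "n4"), ("m4", "n1"), ("m5", "n2"), ("m5", "n4"),
]

for k, (train_pos, test_pos) in enumerate(make_holdouts(positives)):
    rng = random.Random(k)
    train_neg = sample_negatives(train_pos, positives, len(train_pos), rng)
    test_neg = sample_negatives(test_pos, positives, len(test_pos), rng)
    print(k, len(train_pos), len(train_neg), len(test_pos), len(test_neg))
```

In this sketch a small test split can yield fewer negatives than requested, or none, because every recombined pair may already be a known positive.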
PLUS. Repeat the entire process without cross-validation, using a train/test split that excludes de-novo miRNAs from the test set.
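
One way such a split could look is sketched below. This is an assumption-laden illustration, not the repository's code. The function name split_excluding_de_novo, the test_frac parameter, and the toy identifiers are hypothetical, and the sketch assumes that the de-novo miRNAs are available as a set of identifiers and that pairs involving them are kept in the training set rather than dropped.

```python
import random


def split_excluding_de_novo(positives, de_novo, test_frac=0.2, seed=0):
    """Single train/test split whose test set contains no de-novo miRNA.

    Assumption: pairs that involve a de-novo miRNA are kept in the
    training set; only the remaining pairs are eligible for the test set.
    """
    de_novo = set(de_novo)
    eligible = [p for p in positives if not (set(p) & de_novo)]
    forced_train = [p for p in positives if set(p) & de_novo]
    random.Random(seed).shuffle(eligible)
    # Never request more test pairs than there are eligible pairs.
    n_test = min(int(len(positives) * test_frac), len(eligible))
    test = eligible[:n_test]
    train = eligible[n_test:] + forced_train
    return train, test


# Toy positive couples and a hypothetical de-novo identifier.
pairs = [
    ("m1", "n1"), ("m1", "n2"), ("m2", "n2"), ("m2", "n3"), ("m3", "n1"),
    ("m3", "n3"), ("m4", "n4"), ("m4", "n1"), ("m5", "n2"), ("m5", "n4"),
]
train_pos, test_pos = split_excluding_de_novo(pairs, de_novo={"m4"})
print(len(train_pos), len(test_pos))
```

Negatives for this split would then be computed in the same way as for the holdouts.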