Deepfake-audio-detection

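This model is a fine-tuned version of motheecreator/Deepfake-audio-detection, trained on a dataset loaded through the audiofolder loader (the dataset itself is not named). On the evaluation set of 11,047 examples it achieves the following results, which match the final logged checkpoint (step 5000):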

Loss: 0.0192
Accuracy: 0.9964
Precision: 0.9944
Recall: 0.9990
F1: 0.9967
Auc Roc: 1.0000
Confusion Matrix (rows: true class, columns: predicted class): [[4974, 34], [6, 6033]]
Classification Report (values rounded to four decimals):

Class	Precision	Recall	F1-score	Support
0	0.9988	0.9932	0.9960	5008
1	0.9944	0.9990	0.9967	6039
macro avg	0.9966	0.9961	0.9963	11047
weighted avg	0.9964	0.9964	0.9964	11047
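
The headline metrics can be recomputed from the confusion matrix alone. The sketch below assumes that rows are true classes and columns are predictions, and that class 1 is the positive class; both assumptions are consistent with the reported values, since the results match the class 1 entries of the classification report. The card does not say which class denotes fake audio.

```python
# Confusion matrix reported for the evaluation set.
# Assumption: rows are true classes, columns are predicted classes.
cm = [[4974, 34], [6, 6033]]

tn, fp = cm[0]  # true class 0: predicted 0, predicted 1
fn, tp = cm[1]  # true class 1: predicted 0, predicted 1
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)  # of all clips predicted as class 1, the share that are class 1
recall = tp / (tp + fn)     # of all true class 1 clips, the share that were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
# prints: accuracy=0.9964 precision=0.9944 recall=0.9990 f1=0.9967
```

Of the 11,047 evaluation examples, 40 are misclassified: 34 class 0 examples predicted as class 1 and 6 class 1 examples predicted as class 0.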

Model description

More information needed

Intended uses & limitations

More information needed
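
A minimal loading sketch, assuming the checkpoint is compatible with the Transformers `audio-classification` pipeline. The helper name `classify_clip` is hypothetical. The label names, and which label denotes fake audio, are not stated in this card, so the function returns the pipeline's raw output unchanged.

```python
def classify_clip(audio_path, model_id="MelodyMachine/Deepfake-audio-detection"):
    """Return the raw audio-classification predictions for one audio file."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline

    # Assumption: the checkpoint loads as an audio-classification model.
    classifier = pipeline("audio-classification", model=model_id)

    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    return classifier(audio_path)
```

A call such as `classify_clip("sample.wav")`, where `sample.wav` is a placeholder path, downloads the checkpoint on first use. Passing a file path requires ffmpeg to be installed for decoding.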

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 2
mixed_precision_training: Native AMP
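
To make the scheduler settings concrete, the sketch below reproduces a linear warmup followed by a half-cosine decay in plain Python, which is how a cosine schedule with a warmup ratio is commonly implemented. The total step count is an assumption: the epoch/step pairs logged in this card suggest roughly 2762 optimizer steps per epoch, so about 5524 steps over 2 epochs. The function name `lr_at` is hypothetical.

```python
import math

BASE_LR = 3e-5        # learning_rate
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio
TOTAL_STEPS = 5524    # assumption: ~2762 steps per epoch x 2 epochs

# train_batch_size x gradient_accumulation_steps = total_train_batch_size
effective_batch = 16 * 2


def lr_at(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 553, 3000, 5524):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at 3e-05 once warmup ends (around step 553) and decays to zero by the last step.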

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	Recall	F1	Auc Roc	Confusion Matrix	Classification Report
0.1006	0.3621	1000	0.1897	0.9651	0.9424	0.9972	0.9690	0.9989	[[4640, 368], [17, 6022]]	{'0': {'precision': 0.9963495812754992, 'recall': 0.9265175718849841, 'f1-score': 0.9601655457837558, 'support': 5008}, '1': {'precision': 0.9424100156494523, 'recall': 0.9971849643980791, 'f1-score': 0.969024056641725, 'support': 6039}, 'accuracy': 0.9651489092061193, 'macro avg': {'precision': 0.9693797984624757, 'recall': 0.9618512681415317, 'f1-score': 0.9645948012127403, 'support': 11047}, 'weighted avg': {'precision': 0.9668627489395077, 'recall': 0.9651489092061193, 'f1-score': 0.9650081770023017, 'support': 11047}}
0.07	0.7241	2000	0.0333	0.9916	0.9914	0.9932	0.9923	0.9997	[[4956, 52], [41, 5998]]	{'0': {'precision': 0.9917950770462277, 'recall': 0.9896166134185304, 'f1-score': 0.9907046476761618, 'support': 5008}, '1': {'precision': 0.991404958677686, 'recall': 0.993210796489485, 'f1-score': 0.9923070560013236, 'support': 6039}, 'accuracy': 0.9915814248212185, 'macro avg': {'precision': 0.9916000178619568, 'recall': 0.9914137049540077, 'f1-score': 0.9915058518387427, 'support': 11047}, 'weighted avg': {'precision': 0.9915818132798093, 'recall': 0.9915814248212185, 'f1-score': 0.9915806270258181, 'support': 11047}}
0.016	1.0862	3000	0.1018	0.9841	0.9727	0.9988	0.9856	0.9998	[[4839, 169], [7, 6032]]	{'0': {'precision': 0.9985555096987206, 'recall': 0.9662539936102237, 'f1-score': 0.9821392327988635, 'support': 5008}, '1': {'precision': 0.9727463312368972, 'recall': 0.9988408676933267, 'f1-score': 0.9856209150326798, 'support': 6039}, 'accuracy': 0.9840680727799402, 'macro avg': {'precision': 0.985650920467809, 'recall': 0.9825474306517752, 'f1-score': 0.9838800739157716, 'support': 11047}, 'weighted avg': {'precision': 0.9844465544410985, 'recall': 0.9840680727799402, 'f1-score': 0.9840425440154849, 'support': 11047}}
0.0209	1.4482	4000	0.0212	0.9957	0.9950	0.9972	0.9961	0.9999	[[4978, 30], [17, 6022]]	{'0': {'precision': 0.9965965965965966, 'recall': 0.9940095846645367, 'f1-score': 0.9953014095771269, 'support': 5008}, '1': {'precision': 0.9950429610046265, 'recall': 0.9971849643980791, 'f1-score': 0.9961128111818707, 'support': 6039}, 'accuracy': 0.995745451253734, 'macro avg': {'precision': 0.9958197788006116, 'recall': 0.995597274531308, 'f1-score': 0.9957071103794988, 'support': 11047}, 'weighted avg': {'precision': 0.9957472795566846, 'recall': 0.995745451253734, 'f1-score': 0.9957449738290548, 'support': 11047}}
0.0233	1.8103	5000	0.0192	0.9964	0.9944	0.9990	0.9967	1.0000	[[4974, 34], [6, 6033]]	{'0': {'precision': 0.9987951807228915, 'recall': 0.9932108626198083, 'f1-score': 0.9959951942330797, 'support': 5008}, '1': {'precision': 0.9943959123125103, 'recall': 0.9990064580228515, 'f1-score': 0.9966958532958864, 'support': 6039}, 'accuracy': 0.9963791074499864, 'macro avg': {'precision': 0.996595546517701, 'recall': 0.9961086603213298, 'f1-score': 0.996345523764483, 'support': 11047}, 'weighted avg': {'precision': 0.9963902579447351, 'recall': 0.9963791074499864, 'f1-score': 0.9963782194960733, 'support': 11047}}
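
Validation loss does not fall monotonically in the table above: it rises at step 3000 before recovering. One way to compare the logged checkpoints is to count misclassified evaluation examples from each confusion matrix and pick the row with the lowest validation loss. The sketch below copies the values from the table; the helper name `errors` is hypothetical.

```python
# (step, validation_loss, confusion_matrix) copied from the training results table
checkpoints = [
    (1000, 0.1897, [[4640, 368], [17, 6022]]),
    (2000, 0.0333, [[4956, 52], [41, 5998]]),
    (3000, 0.1018, [[4839, 169], [7, 6032]]),
    (4000, 0.0212, [[4978, 30], [17, 6022]]),
    (5000, 0.0192, [[4974, 34], [6, 6033]]),
]


def errors(cm):
    """Count off-diagonal entries, i.e. misclassified evaluation examples."""
    return cm[0][1] + cm[1][0]


# Select the checkpoint with the lowest validation loss.
best_step, best_loss, best_cm = min(checkpoints, key=lambda row: row[1])

for step, loss, cm in checkpoints:
    print(step, loss, errors(cm))
# prints:
# 1000 0.1897 385
# 2000 0.0333 93
# 3000 0.1018 176
# 4000 0.0212 47
# 5000 0.0192 40
```

By both measures the step 5000 checkpoint is the best of the five logged rows, which agrees with the evaluation results reported at the top of the card.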

Framework versions

Transformers 4.41.1
PyTorch 2.1.2
Datasets 2.19.1
Tokenizers 0.19.1

Model repository: MelodyMachine/Deepfake-audio-detection
