eachanjohnson committed
Commit a5ece38 · verified · 1 Parent(s): 4b17470

Upload folder using huggingface_hub

README.md CHANGED
@@ -1,3 +1,140 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: tabular-regression
+ tags:
+ - chemistry
+ - microbiology
+ - antibiotics
+ library_name: duvida
+ datasets:
+ - scbirlab/thomas-2018-spark-wt
+ ---
+
+ # Predictor of _Klebsiella pneumoniae_ MICs
+
+ _Updated:_ Fri 28 Mar 14:27:11 GMT 2025
+
+ Trained on the _Klebsiella pneumoniae_, WT accumulator phenotype subset of the [human-curated SPARK dataset](https://doi.org/10.1021/acsinfecdis.8b00193) (3920 rows in total for _Klebsiella pneumoniae_).
+
+ ## Model details
+
+ This model was trained using [our Duvida framework](https://github.com/scbirlab/duvida).
+ It was chosen from a hyperparameter search as the model that performs best on unseen test data
+ (from a scaffold split).
+
+ Duvida also saves the training data in this checkpoint to allow the calculation of uncertainty metrics
+ based on that training data.
+
+ This model is the best regression model from a hyperparameter search, determined
+ by Spearman's $\rho$ on a held-out test set not used in training or early stopping.
+
+ ### Model architecture
+
+ - **Regression**
+
+ ```json
+
+ {
+ "dropout": 0.2,
+ "ensemble_size": 10,
+ "extra_featurizers": null,
+ "learning_rate": 0.0001,
+ "model_class": "ChempropModelBox",
+ "n_hidden": 3,
+ "n_units": 16,
+ "use_2d": true,
+ "use_fp": true
+ }
+ ```
+
+ ### Model usage
+
+ You can use this model with:
+
+ ```python
+ from duvida.autoclasses import AutoModelBox
+ modelbox = AutoModelBox.from_pretrained("hf://scbirlab/spark-dv-2503-kpne")
+ modelbox.predict(filename=..., inputs=[...], columns=[...]) # make predictions on your own data
+ ```
+
+ ## Training details
+
+ - **Dataset:** [SPARK, WT accumulator, _Klebsiella pneumoniae_ subset](https://huggingface.co/datasets/scbirlab/thomas-2018-spark-wt)
+ - **Input column:** smiles
+ - **Output column:** pmic
+ - **Split type:** Murcko scaffold
+ - **Split proportions:**
+   - 70% training (2045 rows)
+   - 15% validation (for early stopping) (723 rows)
+   - 15% test (for selecting hyperparameters) (646 rows)
+
+ Here is the training log:
+
+ <img src="training-log.png" width=450>
+
+ And these are the evaluation scores.
+
+ Train (2045 rows):
+
+ ```json
+
+ {
+ "Pearson r": 0.879788014533255,
+ "RMSE": 0.4014032185077667,
+ "Spearman rho": 0.8235991116907959
+ }
+ ```
+
+ Validation (723 rows):
+
+ ```json
+
+ {
+ "Pearson r": 0.7805225413538466,
+ "RMSE": 0.7095186710357666,
+ "Spearman rho": 0.6299348550927065
+ }
+ ```
+
+
+ Test (646 rows):
+
+ ```json
+
+ {
+ "Pearson r": 0.4050551318825592,
+ "RMSE": 0.6779211163520813,
+ "Spearman rho": 0.4843227707887753
+ }
+ ```
111
+ ## Training data details
112
+
113
+ The training data were collated by the authors of:
114
+
115
+ > Joe Thomas, Marc Navre, Aileen Rubio, and Allan Coukell
116
+ > Shared Platform for Antibiotic Research and Knowledge: A Collaborative Tool to SPARK Antibiotic Discovery
117
+ > ACS Infectious Diseases 2018 4 (11), 1536-1539
118
+ > DOI: 10.1021/acsinfecdis.8b00193
119
+
120
+ We cleaned the original SPARK dataset to subset the most relevant columns, remove empty values,
121
+ give succint column titles, and split by species.
122
+
123
+ This particular dataset retains only measurements on bacteria with wild-type accumulation phenotypes.
124
+
125
+ ### Dataset Sources
126
+
127
+ - **Repository:** https://www.collaborativedrug.com/spark-data-downloads
128
+ - **Paper:** https://doi.org/10.1021/acsinfecdis.8b00193
129
+
130
+ ### Data Collection and Processing
131
+
132
+ Data were processed using [schemist](https://github.com/scbirlab/schemist), a tool for processing chemical datasets.
133
+
134
+ The SMILES strings have been canonicalized, and split into training (70%), validation (15%), and test (15%) sets
135
+ by Murcko scaffold for each species with more than 1000 entries. Additional features like molecular weight and
136
+ topological polar surface area have also been calculated.
137
+
138
+ ### Who are the source data producers?
139
+
140
+ Joe Thomas, Marc Navre, Aileen Rubio, and Allan Coukell
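The README quotes a nominal 70% / 15% / 15% split alongside row counts of 2045, 723 and 646. A minimal sketch of the arithmetic for the realized shares follows; the names `SPLIT_ROWS` and `split_fractions` are illustrative and are not part of Duvida or schemist.

```python
# Row counts per split, copied from the README in this commit.
SPLIT_ROWS = {"train": 2045, "validation": 723, "test": 646}


def split_fractions(rows):
    """Return each split's share of the total number of rows."""
    total = sum(rows.values())
    return {name: count / total for name, count in rows.items()}


fractions = split_fractions(SPLIT_ROWS)
total_rows = sum(SPLIT_ROWS.values())

for name, fraction in fractions.items():
    print(f"{name}: {SPLIT_ROWS[name]} rows, {fraction:.1%} of {total_rows}")
```

The three counts sum to 3414 rows, so the realized shares are roughly 60% / 21% / 19% rather than 70% / 15% / 15%. A scaffold split assigns whole scaffold groups to a split, so realized fractions need not match the nominal ones exactly, but the README does not say why they differ by this much, nor why the total is below the 3920 rows quoted for the species.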
data-config.json ADDED
@@ -0,0 +1,17 @@
+ {
+ "_default_cache": "cache/duvida/data",
+ "_in_key": "inputs",
+ "_input_cols": [
+ "smiles"
+ ],
+ "_label_cols": [
+ "pmic"
+ ],
+ "_out_key": "labels",
+ "input_shape": [
+ 2248
+ ],
+ "output_shape": [
+ 1
+ ]
+ }
data-load-args.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "cache": "/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Klebsiella-pneumoniae/61/cache",
+ "features": [
+ "smiles"
+ ],
+ "filename": "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-train.csv.gz",
+ "labels": [
+ "pmic"
+ ]
+ }
eval-metrics_test.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "Pearson r": 0.4050551318825592,
+ "RMSE": 0.6779211163520813,
+ "Spearman rho": 0.4843227707887753
+ }
eval-metrics_train.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "Pearson r": 0.879788014533255,
+ "RMSE": 0.4014032185077667,
+ "Spearman rho": 0.8235991116907959
+ }
eval-metrics_validation.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "Pearson r": 0.7805225413538466,
+ "RMSE": 0.7095186710357666,
+ "Spearman rho": 0.6299348550927065
+ }
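The three `eval-metrics_*.json` files report RMSE, Pearson r and Spearman rho for each split. As a reminder of what these numbers measure, here is a minimal pure-Python sketch of the three metrics. It is not Duvida's implementation, and the `y_true` / `y_pred` values are made-up toy numbers, not predictions from this model.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical toy values for illustration only.
y_true = [4.0, 5.0, 6.0, 7.0]
y_pred = [4.5, 4.8, 6.5, 6.9]
scores = {
    "Pearson r": pearson_r(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "Spearman rho": spearman_rho(y_true, y_pred),
}
print(scores)
```

Because Spearman rho depends only on rank order, it rewards a model that orders compounds correctly even when the predicted values are offset, which fits the README's use of it for model selection.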
input-data.hf/data-00000-of-00001.arrow ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:eb7f5cd91df22834ff43e89843609a6075e5f26ec1e7aebc9a9a05518a56ce3a
+ size 264496
input-data.hf/dataset_info.json ADDED
@@ -0,0 +1,52 @@
+ {
+ "builder_name": "csv",
+ "citation": "",
+ "config_name": "default",
+ "dataset_name": "csv",
+ "dataset_size": 766413,
+ "description": "",
+ "download_checksums": {
+ "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-train.csv.gz": {
+ "num_bytes": 130202,
+ "checksum": null
+ }
+ },
+ "download_size": 130202,
+ "features": {
+ "smiles": {
+ "dtype": "string",
+ "_type": "Value"
+ },
+ "inputs": {
+ "feature": {
+ "dtype": "string",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ },
+ "labels": {
+ "feature": {
+ "dtype": "float64",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ }
+ },
+ "homepage": "",
+ "license": "",
+ "size_in_bytes": 896615,
+ "splits": {
+ "train": {
+ "name": "train",
+ "num_bytes": 766413,
+ "num_examples": 2045,
+ "dataset_name": "csv"
+ }
+ },
+ "version": {
+ "version_str": "0.0.0",
+ "major": 0,
+ "minor": 0,
+ "patch": 0
+ }
+ }
input-data.hf/state.json ADDED
@@ -0,0 +1,13 @@
+ {
+ "_data_files": [
+ {
+ "filename": "data-00000-of-00001.arrow"
+ }
+ ],
+ "_fingerprint": "a55666bc481927f9",
+ "_format_columns": null,
+ "_format_kwargs": {},
+ "_format_type": null,
+ "_output_all_columns": false,
+ "_split": "train"
+ }
logs-csv/lightning_logs/version_0/hparams.yaml ADDED
@@ -0,0 +1 @@
+ {}
logs-csv/lightning_logs/version_0/metrics.csv ADDED
@@ -0,0 +1,91 @@
+ epoch,loss,step,val_loss
+ 0,,127,3.1612131595611572
+ 0,11.470685958862305,127,
+ 1,,255,3.4212403297424316
+ 1,4.516500473022461,255,
+ 2,,383,2.7988340854644775
+ 2,4.001670837402344,383,
+ 3,,511,2.4266252517700195
+ 3,3.4878146648406982,511,
+ 4,,639,2.3812172412872314
+ 4,3.1538383960723877,639,
+ 5,,767,2.249669075012207
+ 5,2.8869502544403076,767,
+ 6,,895,2.1487362384796143
+ 6,2.6796183586120605,895,
+ 7,,1023,1.9302494525909424
+ 7,2.499631881713867,1023,
+ 8,,1151,1.895418405532837
+ 8,2.384538173675537,1151,
+ 9,,1279,1.8401507139205933
+ 9,2.2424700260162354,1279,
+ 10,,1407,1.7240188121795654
+ 10,2.1825485229492188,1407,
+ 11,,1535,1.6076897382736206
+ 11,2.0599634647369385,1535,
+ 12,,1663,1.5228654146194458
+ 12,1.9549221992492676,1663,
+ 13,,1791,1.3301759958267212
+ 13,1.8475960493087769,1791,
+ 14,,1919,1.197400689125061
+ 14,1.7222977876663208,1919,
+ 15,,2047,1.1432061195373535
+ 15,1.662263035774231,2047,
+ 16,,2175,1.1259466409683228
+ 16,1.5466835498809814,2175,
+ 17,,2303,1.133188247680664
+ 17,1.476884126663208,2303,
+ 18,,2431,1.1027441024780273
+ 18,1.4452625513076782,2431,
+ 19,,2559,1.158228874206543
+ 19,1.378151774406433,2559,
+ 20,,2687,1.0784740447998047
+ 20,1.318475365638733,2687,
+ 21,,2815,1.0715962648391724
+ 21,1.2622106075286865,2815,
+ 22,,2943,1.075689673423767
+ 22,1.2291003465652466,2943,
+ 23,,3071,1.0587623119354248
+ 23,1.1971018314361572,3071,
+ 24,,3199,1.0540975332260132
+ 24,1.1468651294708252,3199,
+ 25,,3327,1.07569420337677
+ 25,1.1313549280166626,3327,
+ 26,,3455,1.0563280582427979
+ 26,1.1012349128723145,3455,
+ 27,,3583,1.1024538278579712
+ 27,1.0781406164169312,3583,
+ 28,,3711,1.0793583393096924
+ 28,1.058035135269165,3711,
+ 29,,3839,1.0855896472930908
+ 29,1.0287179946899414,3839,
+ 30,,3967,1.1220470666885376
+ 30,1.0105702877044678,3967,
+ 31,,4095,1.0875941514968872
+ 31,0.9984549880027771,4095,
+ 32,,4223,1.0672334432601929
+ 32,0.987288236618042,4223,
+ 33,,4351,1.0616916418075562
+ 33,0.9705791473388672,4351,
+ 34,,4479,1.0746409893035889
+ 34,0.9425535798072815,4479,
+ 35,,4607,1.0884811878204346
+ 35,0.9292816519737244,4607,
+ 36,,4735,1.0922189950942993
+ 36,0.9182446599006653,4735,
+ 37,,4863,1.0835254192352295
+ 37,0.9015668034553528,4863,
+ 38,,4991,1.0907585620880127
+ 38,0.9176275134086609,4991,
+ 39,,5119,1.1213494539260864
+ 39,0.9064778089523315,5119,
+ 40,,5247,1.0660821199417114
+ 40,0.8878889083862305,5247,
+ 41,,5375,1.0989940166473389
+ 41,0.8802230358123779,5375,
+ 42,,5503,1.1001183986663818
+ 42,0.8679183721542358,5503,
+ 43,,5631,1.120335578918457
+ 43,0.8558708429336548,5631,
+ 44,,5759,1.1134792566299438
+ 44,0.8376833200454712,5759,
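In this logger output each epoch appears twice: one row carries `val_loss` with an empty `loss`, the other carries `loss` with an empty `val_loss`. The sketch below shows one way such rows could be merged into a single row per epoch, which is the layout of `training-log.csv` in this commit. It is an illustration using only the standard library, not the code Duvida uses; `RAW_LOG` is an excerpt of the first four data rows above, and `merge_epoch_rows` is a hypothetical helper name.

```python
import csv
import io

# First four data rows of logs-csv/lightning_logs/version_0/metrics.csv.
RAW_LOG = """epoch,loss,step,val_loss
0,,127,3.1612131595611572
0,11.470685958862305,127,
1,,255,3.4212403297424316
1,4.516500473022461,255,
"""


def merge_epoch_rows(csv_text):
    """Combine the separate loss and val_loss rows of each (epoch, step)."""
    merged = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["epoch"]), int(row["step"]))
        entry = merged.setdefault(key, {"loss": None, "val_loss": None})
        for column in ("loss", "val_loss"):
            if row[column] != "":  # empty cells mean "not logged in this row"
                entry[column] = float(row[column])
    return [
        {"epoch": epoch, "step": step, **values}
        for (epoch, step), values in sorted(merged.items())
    ]


merged_rows = merge_epoch_rows(RAW_LOG)
for merged_row in merged_rows:
    print(merged_row)
```

Keying on both `epoch` and `step` keeps the merge safe if a logger were to record more than one step per epoch; in this log there is exactly one step value per epoch.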
logs/lightning_logs/version_0/events.out.tfevents.1743097353.cn026.2190138.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06819b7813d597c5db3ccb8be5dc3e376772c9803a5afb9eaeb499e1524f015e
+ size 8094
logs/lightning_logs/version_0/hparams.yaml ADDED
@@ -0,0 +1 @@
+ {}
metrics.csv ADDED
@@ -0,0 +1,4 @@
+ split,split_filename,config_i,model_class,n_parameters,filename,features,labels,cache,extra_featurizers,use_2d,use_fp,dropout,ensemble_size,learning_rate,n_hidden,n_units,val_filename,epochs,batch_size,RMSE,Pearson r,Spearman rho
+ train,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-train.csv.gz,61,ChempropModelBox,2690450,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-train.csv.gz,['smiles'],['pmic'],/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Klebsiella-pneumoniae/61/cache,,True,True,0.2,10,0.0001,3,16,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-validation.csv.gz,2000,16,0.4014032185077667,0.879788014533255,0.8235991116907959
+ validation,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-validation.csv.gz,61,ChempropModelBox,2690450,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-train.csv.gz,['smiles'],['pmic'],/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Klebsiella-pneumoniae/61/cache,,True,True,0.2,10,0.0001,3,16,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-validation.csv.gz,2000,16,0.7095186710357666,0.7805225413538466,0.6299348550927065
+ test,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-test.csv.gz,61,ChempropModelBox,2690450,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-train.csv.gz,['smiles'],['pmic'],/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Klebsiella-pneumoniae/61/cache,,True,True,0.2,10,0.0001,3,16,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-validation.csv.gz,2000,16,0.6779211163520813,0.4050551318825592,0.4843227707887753
modelbox-config.json ADDED
@@ -0,0 +1,11 @@
+ {
+ "dropout": 0.2,
+ "ensemble_size": 10,
+ "extra_featurizers": null,
+ "learning_rate": 0.0001,
+ "model_class": "ChempropModelBox",
+ "n_hidden": 3,
+ "n_units": 16,
+ "use_2d": true,
+ "use_fp": true
+ }
params.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:da1801d43ebc3f74db8d31190366ad537b2b39238c6574eb1b110077ce9f385c
+ size 10875372
predictions_test.csv.gz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5584a18b5d753d00747d2809e5bb0424cec579aaad0ce24f00e4658bbf7a170c
+ size 705987
predictions_train.csv.gz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:20695113483d5c2d973c00badeaf66c46e3ef2c9cf334b0af09ace0cacd31fc0
+ size 1940344
predictions_validation.csv.gz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c0e71ba4083d1dc0c1f0a99d43add1eaf03d0d5c1cab18aa58643d962df4d192
+ size 997657
training-args.json ADDED
@@ -0,0 +1,5 @@
+ {
+ "batch_size": 16,
+ "epochs": 2000,
+ "val_filename": "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-validation.csv.gz"
+ }
training-data.hf/cache-d4aeece68b087032.arrow ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3873a4aaf25229484a767fcf91a5c465713b58a1255b3b692925ef7d106faf0d
+ size 61560848
training-data.hf/data-00000-of-00001.arrow ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:74b27e7fb54789605b5ce5f5646274068f805585004de84caf37383446fe697a
+ size 61282632
training-data.hf/dataset_info.json ADDED
@@ -0,0 +1,126 @@
+ {
+ "builder_name": "csv",
+ "citation": "",
+ "config_name": "default",
+ "dataset_name": "csv",
+ "dataset_size": 766413,
+ "description": "",
+ "download_checksums": {
+ "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Klebsiella-pneumoniae/scaffold-split-train.csv.gz": {
+ "num_bytes": 130202,
+ "checksum": null
+ }
+ },
+ "download_size": 130202,
+ "features": {
+ "smiles": {
+ "feature": {
+ "dtype": "string",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ },
+ "inputs": {
+ "V_d": {
+ "dtype": "null",
+ "_type": "Value"
+ },
+ "gt_mask": {
+ "dtype": "null",
+ "_type": "Value"
+ },
+ "lt_mask": {
+ "dtype": "null",
+ "_type": "Value"
+ },
+ "mg": {
+ "E": {
+ "feature": {
+ "feature": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ },
+ "_type": "Sequence"
+ },
+ "V": {
+ "feature": {
+ "feature": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ },
+ "_type": "Sequence"
+ },
+ "edge_index": {
+ "feature": {
+ "feature": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ },
+ "_type": "Sequence"
+ },
+ "rev_edge_index": {
+ "feature": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ }
+ },
+ "weight": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "x_d": {
+ "feature": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ },
+ "y": {
+ "feature": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ }
+ },
+ "labels": {
+ "feature": {
+ "dtype": "float64",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ },
+ "extra_features": {
+ "feature": {
+ "dtype": "float32",
+ "_type": "Value"
+ },
+ "_type": "Sequence"
+ }
+ },
+ "homepage": "",
+ "license": "",
+ "size_in_bytes": 896615,
+ "splits": {
+ "train": {
+ "name": "train",
+ "num_bytes": 766413,
+ "num_examples": 2045,
+ "dataset_name": "csv"
+ }
+ },
+ "version": {
+ "version_str": "0.0.0",
+ "major": 0,
+ "minor": 0,
+ "patch": 0
+ }
+ }
training-data.hf/state.json ADDED
@@ -0,0 +1,15 @@
+ {
+ "_data_files": [
+ {
+ "filename": "data-00000-of-00001.arrow"
+ }
+ ],
+ "_fingerprint": "8e7bb11fd5ae4b41",
+ "_format_columns": null,
+ "_format_kwargs": {
+ "dtype": "float"
+ },
+ "_format_type": "numpy",
+ "_output_all_columns": false,
+ "_split": "train"
+ }
training-log.csv ADDED
@@ -0,0 +1,46 @@
+ epoch,step,loss,val_loss
+ 0,127,11.470685958862305,3.161213159561157
+ 1,255,4.516500473022461,3.421240329742432
+ 2,383,4.001670837402344,2.798834085464477
+ 3,511,3.4878146648406982,2.4266252517700195
+ 4,639,3.153838396072388,2.3812172412872314
+ 5,767,2.886950254440308,2.249669075012207
+ 6,895,2.6796183586120605,2.1487362384796143
+ 7,1023,2.499631881713867,1.9302494525909424
+ 8,1151,2.384538173675537,1.895418405532837
+ 9,1279,2.242470026016236,1.8401507139205933
+ 10,1407,2.1825485229492188,1.7240188121795654
+ 11,1535,2.0599634647369385,1.6076897382736206
+ 12,1663,1.954922199249268,1.5228654146194458
+ 13,1791,1.8475960493087769,1.3301759958267212
+ 14,1919,1.7222977876663208,1.197400689125061
+ 15,2047,1.662263035774231,1.1432061195373535
+ 16,2175,1.5466835498809814,1.1259466409683228
+ 17,2303,1.476884126663208,1.133188247680664
+ 18,2431,1.4452625513076782,1.1027441024780271
+ 19,2559,1.378151774406433,1.158228874206543
+ 20,2687,1.318475365638733,1.078474044799805
+ 21,2815,1.2622106075286863,1.0715962648391724
+ 22,2943,1.2291003465652466,1.075689673423767
+ 23,3071,1.1971018314361572,1.0587623119354248
+ 24,3199,1.1468651294708252,1.0540975332260132
+ 25,3327,1.1313549280166626,1.07569420337677
+ 26,3455,1.1012349128723145,1.056328058242798
+ 27,3583,1.0781406164169312,1.1024538278579712
+ 28,3711,1.058035135269165,1.0793583393096924
+ 29,3839,1.0287179946899414,1.0855896472930908
+ 30,3967,1.0105702877044678,1.1220470666885376
+ 31,4095,0.9984549880027772,1.0875941514968872
+ 32,4223,0.987288236618042,1.0672334432601929
+ 33,4351,0.9705791473388672,1.0616916418075562
+ 34,4479,0.9425535798072816,1.0746409893035889
+ 35,4607,0.9292816519737244,1.0884811878204346
+ 36,4735,0.9182446599006652,1.092218995094299
+ 37,4863,0.9015668034553528,1.0835254192352295
+ 38,4991,0.9176275134086608,1.0907585620880127
+ 39,5119,0.9064778089523317,1.1213494539260864
+ 40,5247,0.8878889083862305,1.0660821199417114
+ 41,5375,0.8802230358123779,1.0989940166473389
+ 42,5503,0.8679183721542358,1.1001183986663818
+ 43,5631,0.8558708429336548,1.120335578918457
+ 44,5759,0.8376833200454712,1.1134792566299438