attilasimko committed on
Commit ab3ebc8 · 1 Parent(s): 3abd747

shorter responses

Files changed (1)
  1. evaluations/models.py +17 -15
evaluations/models.py CHANGED
@@ -10,21 +10,23 @@ system_messages = { "STRICT": """You are a chatbot evaluating github repositorie
 Keep your answers short and informative.
 Your answer should be a single paragraph.""",
 "PITFALL": """You are a chatbot evaluating github repositories, their python codes and corresponding readme files.
-You are looking for common pitfalls in the code. More specifically, please consider the following pitfalls:
-Please explain if you find any design flaws with regards to the data collection in the code.
-Please explain if you find signs of dataset shift in the code (e.g. sampling bias, imbalanced populations, imbalanced labels, non-stationary environments).
-Please explain if you find any confounders in the code.
-Please explain if you find any measurement errors in the code (labelling mistakes, noisy measurements, inappropriate proxies).
-Please explain if you find signs of historical biases in the data used.
-Please explain if you find signs of information leaking between the training and testing data.
-Please explain if you find a model-problem mismatch (e.g. over-complicated/simplistic model, computational challenges).
-Please explain if you find any signs of overfitting in the code (e.g. high variance, high complexity, low bias).
-Please explain if you find any misused metrics in the code (e.g. poor metric selection, poor implementations).
-Please explain if you find any signs of black box models in the code (e.g. lack of interpretability, lack of transparency).
-Please explain if you find any signs of baseline comparison issues in the code (e.g. if the testing data does not fit the training data).
-Please explain if you find any signs of insufficient reporting in the code (e.g. missing hyperparameters, missing evaluation metrics).
-Please explain if you find signs of faulty interpretations of the reported results.
-If you don't find anything concerning, please return an empty string.""" }
+You are looking for common pitfalls in the code.
+Keep your answer short and informative.
+Only report serious flaws. If you don't find any, return an empty string.
+Answer in a short paragraph, and keep in mind the following common pitfall categories:
+Pitfall #1: Design flaws with regards to the data collection in the code.
+Pitfall #2: Dataset shift (e.g. sampling bias, imbalanced populations, imbalanced labels, non-stationary environments).
+Pitfall #3: Confounders.
+Pitfall #4: Measurement errors (labelling mistakes, noisy measurements, inappropriate proxies).
+Pitfall #5: Historical biases in the data used.
+Pitfall #6: Information leaking between the training and testing data.
+Pitfall #7: Model-problem mismatch (e.g. over-complicated/simplistic model, computational challenges).
+Pitfall #8: Overfitting in the code (e.g. high variance, high complexity, low bias).
+Pitfall #9: Misused metrics in the code (e.g. poor metric selection, poor implementations).
+Pitfall #10: Black box models in the code (e.g. lack of interpretability, lack of transparency).
+Pitfall #11: Baseline comparison issues (e.g. if the testing data does not fit the training data).
+Pitfall #12: Insufficient reporting in the code (e.g. missing hyperparameters, missing evaluation metrics).
+Pitfall #13: Faulty interpretations of the reported results.""" }
 
 class LocalLLM():
     def __init__(self, model_name):
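
The hunk ends right where LocalLLM begins, so the commit does not show how the shortened PITFALL message is actually consumed. Below is a minimal sketch of one plausible wiring, assuming a Hugging Face text-generation pipeline backend and a hypothetical evaluate() helper; neither appears in this diff.

# Sketch only: LocalLLM is truncated after __init__ in this diff, so the
# pipeline backend and the evaluate() helper below are assumptions about
# how system_messages["PITFALL"] might be used, not code from the commit.
from transformers import pipeline

# Trimmed stand-in for the full dict defined in evaluations/models.py.
system_messages = {
    "STRICT": "You are a chatbot evaluating github repositories...",
    "PITFALL": "You are a chatbot evaluating github repositories... "
               "Only report serious flaws. If you don't find any, return an empty string.",
}

class LocalLLM():
    def __init__(self, model_name):
        # Assumed backend: a local text-generation pipeline.
        self.pipe = pipeline("text-generation", model=model_name)

    def evaluate(self, mode, repo_text):
        # Prepend the chosen system message to the repository contents
        # and return only the newly generated critique.
        prompt = system_messages[mode] + "\n\n" + repo_text
        out = self.pipe(prompt, max_new_tokens=200, return_full_text=False)
        return out[0]["generated_text"]

# Hypothetical usage:
# llm = LocalLLM("gpt2")
# print(llm.evaluate("PITFALL", open("train.py").read()))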