Zhihui committed on
Commit ccc53b9 · verified · 1 Parent(s): ee3fd4e

Add quickstart to README.md

Files changed (1)
  1. README.md +67 -1
README.md CHANGED
@@ -8,7 +8,73 @@ CTRL-32B is a critic LLM finetuned from [Qwen2.5-Coder-32B-Instruct](https://hug
  - **Project Page:** https://critic-rl.github.io/
  - **Paper:** https://arxiv.org/abs/2502.03492
 
- # Citation
+ ## Quickstart
+ We recommend using [vLLM](https://docs.vllm.ai/en/latest/getting_started/quickstart.html) for inference:
+ ```python
+ from vllm import LLM, SamplingParams
+
+ def format_prompt_for_ctrl(problem, answer):
+     """Given a question-answer pair, we ask the model to generate a critique."""
+     return f"""You are tasked with analyzing an answer to a problem and providing constructive feedback. Do NOT provide direct solutions.
+
+ Problem description:
+ <problem>
+ {problem}
+ </problem>
+
+ Answer:
+ <answer>
+ {answer}
+ </answer>
+
+ Structure your response using the following format (without <format> tags):
+ <format>
+ Analysis:
+ {{Analysis}}
+
+ Improvement suggestions:
+ {{Suggestions}}
+
+ Overall judgment: {{Correct/Incorrect}}
+ </format>"""
+
+ # Sample prompts.
+ problem = """Write a python function to check whether every odd index contains odd numbers of a given list."""
+ answer = """```python
+ def odd_length_sum(arr):
+     n = len(arr)
+     res = 0
+
+     # Iterate through each element in the array
+     for i in range(n):
+         # Calculate the number of subarrays in which arr[i] is present
+         count = ((i + 1) * (n - i) + 1) // 2
+
+         # If the count is odd, add the element to the result
+         if count % 2 == 1:
+             res += arr[i]
+
+     return res
+ ```"""
+ prompts = [
+     format_prompt_for_ctrl(problem, answer),
+ ]
+ # Create a sampling params object.
+ sampling_params = SamplingParams(temperature=0.7, top_p=0.8, repetition_penalty=1.05, max_tokens=1024)
+
+ # Create an LLM.
+ llm = LLM(model="Zhihui/CTRL-32B", tensor_parallel_size=2)
+ # Generate texts from the prompts. The output is a list of RequestOutput objects
+ # that contain the prompt, generated text, and other information.
+ outputs = llm.generate(prompts, sampling_params)
+ # Print the outputs.
+ for output in outputs:
+     prompt = output.prompt
+     generated_text = output.outputs[0].text
+     print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
+ ```
+
+ ## Citation
 
  ```bibtex
  @article{xie2025teaching,