李浩 (lihaocruiser)
AI & ML interests: None yet
Organizations: None yet
LLM-Reasoning
- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 77
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 29
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 11
- SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
  Paper • 2308.00436 • Published • 23
LLM-RL
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 63
- Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
  Paper • 2306.01693 • Published • 3
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 152
- Secrets of RLHF in Large Language Models Part II: Reward Modeling
  Paper • 2401.06080 • Published • 29
LLM-Extraction
LLM-RAG
- Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agent
  Paper • 2304.09542 • Published • 5
- Dense X Retrieval: What Retrieval Granularity Should We Use?
  Paper • 2312.06648 • Published • 1
- Context Tuning for Retrieval Augmented Generation
  Paper • 2312.05708 • Published • 16
- Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models
  Paper • 2312.02969 • Published • 15
LLM-Instruct
- #InsTag: Instruction Tagging for Analyzing Supervised Fine-tuning of Large Language Models
  Paper • 2308.07074 • Published
- Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing
  Paper • 2310.13855 • Published • 1
- LIMIT: Less Is More for Instruction Tuning Across Evaluation Paradigms
  Paper • 2311.13133 • Published
- Group Preference Optimization: Few-Shot Alignment of Large Language Models
  Paper • 2310.11523 • Published
LLM-Safety
LLM-Agent
LLM-SyntheticData
LLM-Hallucination
Preprocessing
Embedding
LLM-Prompting
LLM-Emergence
- Why think step by step? Reasoning emerges from the locality of experience
  Paper • 2304.03843 • Published
- Are Emergent Abilities of Large Language Models a Mirage?
  Paper • 2304.15004 • Published • 8
- Knowledge Mechanisms in Large Language Models: A Survey and Perspective
  Paper • 2407.15017 • Published • 35
LLM-Legal
LLM-Pretrain
- Data Selection for Language Models via Importance Resampling
  Paper • 2302.03169 • Published
- Scaling Data-Constrained Language Models
  Paper • 2305.16264 • Published • 17
- Challenges with unsupervised LLM knowledge discovery
  Paper • 2312.10029 • Published • 10
- How Do Large Language Models Acquire Factual Knowledge During Pretraining?
  Paper • 2406.11813 • Published • 32
LLM-Evaluation
LLM-Length
- Extending LLMs' Context Window with 100 Samples
  Paper • 2401.07004 • Published • 16
- Extending Context Window of Large Language Models via Semantic Compression
  Paper • 2312.09571 • Published • 15
- RULER: What's the Real Context Size of Your Long-Context Language Models?
  Paper • 2404.06654 • Published • 39
LLM-Dialog
LLM-Recommendation
LLM-Summary
LLM-Fact