Analytical Reasoning Llama

Llama3.2 3B fine-tuned on microsoft/orca-agentinstruct-1M-v1 dataset’s analytical reasoning section.

Training

Trained with Unsloth using QLora R=16 Alpha=32

Eval

The fine tuned model (DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit)
has gained performace over the base model (unsloth/Llama-3.2-3B-Instruct-bnb-4bit)
in the following tasks.

TestBase ModelFine-Tuned ModelPerformance Gain
leaderboard_bbh_logical_deduction_seven_objects0.25200.43600.1840
leaderboard_bbh_logical_deduction_five_objects0.35600.45600.1000
leaderboard_musr_team_allocation0.22000.32000.1000
leaderboard_bbh_disambiguation_qa0.30400.37600.0720
leaderboard_gpqa_diamond0.22220.27270.0505
leaderboard_bbh_movie_recommendation0.59600.63600.0400
leaderboard_bbh_formal_fallacies0.50800.54000.0320
leaderboard_bbh_tracking_shuffled_objects_three_objects0.31600.34400.0280
leaderboard_bbh_causal_judgement0.54550.56680.0214
leaderboard_bbh_web_of_lies0.49600.51600.0200
leaderboard_math_geometry_hard0.04550.06060.0152
leaderboard_math_num_theory_hard0.05190.06490.0130
leaderboard_musr_murder_mysteries0.52800.54000.0120
leaderboard_gpqa_extended0.27110.28020.0092
leaderboard_bbh_sports_understanding0.59600.60400.0080
leaderboard_math_intermediate_algebra_hard0.01070.01430.0036

Model
DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit

Adapter
DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit_adapter

Quantized model
DevQuasar/analytical_reasoning_r16a32_unsloth-Llama-3.2-3B-Instruct-bnb-4bit-GGUF