refactor: Experiments into their own folders

This commit is contained in:
Leni Aniva 2024-10-01 11:06:01 -07:00
parent 22bba57b99
commit 95e90cc026
Signed by: aniva
GPG Key ID: 4D9B1C8D10EA4C50
153 changed files with 3 additions and 153 deletions

@@ -29,12 +29,9 @@ python3 -m pantograph.search_llm
## Experiments
In `experiments/`, there is an experiment on running an LLM prover on miniF2F
data. Run with
```sh
python3 experiments/miniF2F_search.py [--dry-run]
```
In `experiments/`, there are some experiments:
1. `minif2f/` is an example of executing an SGLang-based prover on the miniF2F dataset
2. `dsp/` is a Lean implementation of Draft-Sketch-Prove
## Referencing

@@ -1,34 +0,0 @@
# Lean Draft Sketch Prove (DSP)
Based on Sean Welleck's DSP for Isabelle: https://github.com/wellecks/ntptutorial/tree/main/partII_dsp
## Related work
### Tony's AF
Tony's original AF: [Yuhuai et al.](https://arxiv.org/abs/2205.12615)
Tony's paper improves miniF2F from `29.6%` to `35.2%`, i.e. by `5.6%`.
Expert Iteration:
- AF used: "We explore if one can improve neural theorem provers by training the neural models on proofs of automatically translated theorems".
- they only translate **problem theorems** (nl_thm := "problem + answer") then use a prover to get the formal proof.
- ExpIt algorithm:
  - `M_0 := Isabelle_Thor()`
  - `Search/Prover := Best_First_Search()` # TODO recall best first search
  - ExpIt.fine_tune := "train model to predict next proof_step/tactic given current proof_state and previous proof_step on successful proofs."
- i.e., `<x=(proof_state_{t}, proof_step_{t-1}), y=(proof_step_{t})>` #TODO: I think, confirm with Albert https://twitter.com/messages/1253358235-1267913180153548800
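The fine-tuning data format above can be sketched as follows. This is a hypothetical illustration (function and placeholder names are mine, not from the paper): each successful proof trace yields pairs `x = (proof_state_t, proof_step_{t-1})`, `y = proof_step_t`.

```python
def make_finetune_pairs(proof_states, proof_steps):
    """Turn one successful proof trace into (input, target) pairs.

    proof_states[t] is the goal state before proof_steps[t] is applied.
    """
    pairs = []
    prev_step = "<no_step>"  # placeholder before the first tactic
    for state, step in zip(proof_states, proof_steps):
        # x = (current proof state, previous proof step), y = next proof step
        pairs.append(((state, prev_step), step))
        prev_step = step
    return pairs

states = ["⊢ a + b = b + a", "goals accomplished"]
steps = ["rw [Nat.add_comm]", "rfl"]
pairs = make_finetune_pairs(states, steps)
```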
Base Model for Neural Theorem Prover (NTP):
- Thor_GPT2 := "We use a pre-trained and fine-tuned Thor based on a GPT-2 with 700M non-embedding parameters." Note: ReProver used a 299M-parameter encoder-decoder.
- fine-tuned on the Pile (arXiv + GitHub subsets)
Neural Theorem Prover (NTP) for `M_0`:
- Thor :=
- The Thor agent is fine-tuned on the PISA dataset which consists of 2.49 million proof steps from the Isabelle/HOL library.
- The model is trained with the objective to predict the next token in a proof step, given the proof state and the last proof step.
- proof step := "tactic in Isabelle" #TODO confirm with Albert https://twitter.com/messages/1253358235-1267913180153548800
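As a reminder of the `Best_First_Search()` component flagged above, here is a minimal best-first proof-search sketch. This is my own illustration under stated assumptions (the `expand`, `score`, and `is_proved` callbacks are hypothetical stand-ins for the prover's tactic generator, value model, and goal check):

```python
import heapq

def best_first_search(initial_state, expand, score, is_proved, budget=100):
    """Always expand the highest-scoring open proof state next.

    expand(state)   -> iterable of (tactic, next_state) candidates
    score(state)    -> higher is more promising (e.g. model log-prob)
    is_proved(state)-> True when no goals remain
    Returns the tactic sequence on success, None on failure.
    """
    # heapq is a min-heap, so negate scores to pop the best state first.
    frontier = [(-score(initial_state), initial_state, [])]
    visited = set()
    while frontier and budget > 0:
        _, state, tactics = heapq.heappop(frontier)
        if is_proved(state):
            return tactics
        if state in visited:
            continue
        visited.add(state)
        budget -= 1
        for tactic, next_state in expand(state):
            heapq.heappush(
                frontier, (-score(next_state), next_state, tactics + [tactic])
            )
    return None  # budget exhausted or frontier empty
```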
Questions:
- Q1: what does "we perform deduplication by problem statements" mean, and when does it matter? All MATH train problems are unique, so why would this matter here?
Idea:
- Idea1: use the ground-truth solution string in MATH, implement Draft-Sketch-Prove (DSP) for Lean 4, and use a symbolic/NTP solver (hammer/tidy/ReProver) to close the remaining gaps
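The three DSP stages behind Idea1 can be sketched as a pipeline. This is a schematic outline only (the callback names are hypothetical, not the repository's API): draft an informal proof, sketch it formally with holes, then close the holes with a solver.

```python
def draft_sketch_prove(problem, draft, sketch, prove):
    """Schematic Draft-Sketch-Prove pipeline.

    draft(problem)                  -> informal natural-language proof
    sketch(problem, informal_proof) -> formal proof sketch with open holes
    prove(formal_sketch)            -> completed formal proof, or None
    """
    informal_proof = draft(problem)            # 1. draft in natural language
    formal_sketch = sketch(problem, informal_proof)  # 2. translate to a sketch
    return prove(formal_sketch)                # 3. discharge holes symbolically

# Usage with trivial stand-in callbacks:
proof = draft_sketch_prove(
    "a + b = b + a",
    draft=lambda p: f"informal proof of {p}",
    sketch=lambda p, d: f"theorem t : {p} := by sorry",
    prove=lambda s: s.replace("sorry", "omega"),
)
```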

@@ -1,3 +0,0 @@
/build
/lakefile.olean
/lake-packages/*

@@ -1,3 +0,0 @@
import Mathlib
-- Ensure that Aesop is running

@@ -1,75 +0,0 @@
{"version": "1.1.0",
"packagesDir": ".lake/packages",
"packages":
[{"url": "https://github.com/leanprover-community/batteries",
"type": "git",
"subDir": null,
"scope": "",
"rev": "2ead90d24b4fac3a05c9c4294daa39bd8686fb98",
"name": "batteries",
"manifestFile": "lake-manifest.json",
"inputRev": "main",
"inherited": true,
"configFile": "lakefile.lean"},
{"url": "https://github.com/leanprover-community/ProofWidgets4",
"type": "git",
"subDir": null,
"scope": "",
"rev": "ade8c50c8d1172b974738a01447c29bf6f85f7f8",
"name": "proofwidgets",
"manifestFile": "lake-manifest.json",
"inputRev": "ade8c50c8d1172b974738a01447c29bf6f85f7f8",
"inherited": false,
"configFile": "lakefile.lean"},
{"url": "https://github.com/leanprover-community/aesop.git",
"type": "git",
"subDir": null,
"scope": "",
"rev": "a64fe24aa94e21404940e9217363a9a1ed9a33a6",
"name": "aesop",
"manifestFile": "lake-manifest.json",
"inputRev": "v4.10.0-rc1",
"inherited": false,
"configFile": "lakefile.toml"},
{"url": "https://github.com/leanprover-community/quote4",
"type": "git",
"subDir": null,
"scope": "leanprover-community",
"rev": "a7bfa63f5dddbcab2d4e0569c4cac74b2585e2c6",
"name": "Qq",
"manifestFile": "lake-manifest.json",
"inputRev": "master",
"inherited": true,
"configFile": "lakefile.lean"},
{"url": "https://github.com/leanprover/lean4-cli",
"type": "git",
"subDir": null,
"scope": "",
"rev": "a11566029bd9ec4f68a65394e8c3ff1af74c1a29",
"name": "Cli",
"manifestFile": "lake-manifest.json",
"inputRev": "main",
"inherited": true,
"configFile": "lakefile.lean"},
{"url": "https://github.com/leanprover-community/import-graph",
"type": "git",
"subDir": null,
"scope": "leanprover-community",
"rev": "d366a602cc4a325a6f9db3a3991dfa6d6cf409c5",
"name": "importGraph",
"manifestFile": "lake-manifest.json",
"inputRev": "main",
"inherited": true,
"configFile": "lakefile.toml"},
{"url": "https://github.com/leanprover-community/mathlib4",
"type": "git",
"subDir": null,
"scope": "",
"rev": "f5c3f06aa7f6d6c221786d2890c345a00e6341f8",
"name": "mathlib",
"manifestFile": "lake-manifest.json",
"inputRev": "v4.10.0-rc1",
"inherited": false,
"configFile": "lakefile.lean"}],
"name": "Example",
"lakeDir": ".lake"}

@@ -1,16 +0,0 @@
import Lake
open Lake DSL
require proofwidgets from git
"https://github.com/leanprover-community/ProofWidgets4" @ "ade8c50c8d1172b974738a01447c29bf6f85f7f8"
require aesop from git
"https://github.com/leanprover-community/aesop.git" @ "v4.10.0-rc1"
require mathlib from git
"https://github.com/leanprover-community/mathlib4" @ "v4.10.0-rc1"
package Example
@[default_target]
lean_lib Example

@@ -1 +0,0 @@
../../src/lean-toolchain

@@ -1,15 +0,0 @@
# Usage Example
This example showcases how to bind library dependencies and execute the `Aesop`
tactic in Lean. First build the example project:
``` sh
pushd Example
lake build
popd
```
This generates compiled `.olean` files. Then run the example from the
project root:
``` sh
poetry run examples/aesop.py
```

Some files were not shown because too many files have changed in this diff.