Initializing with pre-specified population #342

Open
charishma13 opened this issue Aug 28, 2024 · 5 comments


@charishma13

I would like to know how to initialize my population with n members that have a pre-specified structure. For example, I want my initial population to have 15 members that all share the same expression, e.g. 1 + x. Is there a PySR option for this, or is it something that would need to be added? Thank you.

@MilesCranmer
Owner

This feature does not yet exist, but it would certainly be nice to add it or simplify existing alternatives. The current strategy is basically to initialise the state manually. Alternatively you could run a search for 1 iteration, and then manipulate the saved state to specify individual members of the population. On the PySR discussions page there are some threads about this too.

@charishma13
Author

charishma13 commented Aug 30, 2024

Thank you for the suggestion, @MilesCranmer. I will check the documentation and make the corresponding changes. I would also like to know in which Julia file the actual initialization of the population happens for each PySR iteration.

@MilesCranmer
Owner

The initialisation function is here: https://github.com/MilesCranmer/SymbolicRegression.jl/blob/master/src/Population.jl#L36-L62

which gets called here:

@charishma13
Author

I am having trouble creating a custom saved_state. The saved_state is a tuple consisting of a population and a hall of fame object, and I am developing a custom implementation of both. So far I have successfully created the PopMember component, following the guidance in the discussion at MilesCranmer/PySR#443. I am now trying to create a population from PopMember instances by calling the struct constructor directly, but I am unsure whether this approach will work as intended. The following code raises an error on the highlighted line.

```julia
using .SymbolicRegression: Node, Options, equation_search, Dataset, PopMember, HallOfFame, Population
using CSV
using DataFrames

val = Node{Float64}(val=162.0)
xsi = Node{Float64}(val=1.224f0)

options = Options(binary_operators=[+, -, *, /])

csv_file_path = "water_water.csv"
data = CSV.File(csv_file_path) |> DataFrame

X1 = reshape(data."Angle", 1, :)
X2 = reshape(data."OH1", 1, :)
X3 = reshape(data."OH2", 1, :)
X4 = reshape(data."H1H2", 1, :)
X = [X1 X2 X3 X4]

X = reshape(X, 4, :)
y = data."Energy"

# Assuming y is your target variable
y_min = minimum(y)
y_scaled = (y .- y_min) * 2625.5002

dataset = Dataset(X, y_scaled)

# Format to PopMember:
member = PopMember(dataset, val, options; deterministic=false)
member1 = PopMember(dataset, xsi, options; deterministic=false)

population = Population{Float32, Float64, Node{Float32}}([member, member1], 2)  # highlighted line: raises the error below
```

ERROR

```
ERROR: LoadError: TypeError: in Population, in L, expected L<:Real, got a value of type Float64
Stacktrace:
[1] top-level scope
@ ~/LU_Exp/popmembers_hof.jl:77
```
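
One visible inconsistency in the code above is that the nodes are built as Node{Float64}, while the Population annotation uses Float32 for the data and node types; a later comment in this thread reports success with Float64 used throughout. The following is a minimal sketch of why such type parameters need to agree. It uses hypothetical stand-in types (ToyMember, ToyPopulation), not the real SymbolicRegression.jl structs, and it is not a diagnosis of the exact TypeError shown above.

```julia
# Hypothetical stand-ins for illustration only; these are NOT the
# SymbolicRegression.jl PopMember / Population definitions.
struct ToyMember{T<:Real,L<:Real}
    value::T   # stands in for the expression's data type
    loss::L    # stands in for the loss type
end

struct ToyPopulation{T<:Real,L<:Real}
    members::Vector{ToyMember{T,L}}
    n::Int
end

# Both members are inferred as ToyMember{Float64,Float64}.
members = [ToyMember(162.0, 0.5), ToyMember(1.224, 0.7)]

# Explicit parameters that match the members: works.
pop_ok = ToyPopulation{Float64,Float64}(members, 2)

# Omitting the parameters lets Julia infer them from the argument.
pop_inferred = ToyPopulation(members, 2)

# Mismatched parameters: Float64 members cannot be stored in a
# Vector{ToyMember{Float32,Float64}}, so construction throws.
mismatch_failed = try
    ToyPopulation{Float32,Float64}(members, 2)
    false
catch err
    true
end
```

In this toy setting, the simplest way to avoid a mismatch is to leave the parameters out and let them be inferred from the members, or to spell them out so that they match the member types exactly.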

@charishma13
Author

charishma13 commented Sep 14, 2024

Hello @MilesCranmer,

I have managed to populate the Population using the following code: Population{Float64, Float64, Node{Float64}}([member for _ in 1:33],33)

I would like to ask where initialization begins within SymbolicRegression.jl, particularly with respect to functions such as _main_search_loop, _warmup_search, _initialize_search, and _create_workers. Could you please clarify which function constructs the Population struct and starts its initialization?

Our intention is to modify the process starting from the initial population, so that PySR searches for equations starting from a predefined expression supplied through saved_state. We have successfully built a custom saved_state and are passing it to the equation search. However, although the hall of fame starts from our specified expression, the search restarts from a complexity of 1. Could you please advise on how to use saved_state so that the equation search starts from our defined expression?
