Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parameters #1240

Merged
merged 29 commits into from
Sep 25, 2023
Merged

Parameters #1240

merged 29 commits into from
Sep 25, 2023

Conversation

alanlujan91
Copy link
Member

@alanlujan91 alanlujan91 commented Mar 9, 2023

Please ensure your pull request adheres to the following guidelines:

  • Tests for new functionality/models or Tests to reproduce the bug-fix in code.
  • Updated documentation of features that add new functionality.
  • Update CHANGELOG.md with major/minor changes.

@alanlujan91 alanlujan91 requested a review from sbenthall March 9, 2023 18:55
@alanlujan91 alanlujan91 changed the title Parameters [WIP] Parameters Mar 9, 2023
@alanlujan91
Copy link
Member Author

@sbenthall need help with one failing test

FAILED HARK/tests/test_parallel.py::testParallel::test_multi_thread_commands - _pickle.PicklingError: Could not pickle the task to send it to the workers.

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@sbenthall sbenthall self-assigned this Mar 22, 2023
@codecov
Copy link

codecov bot commented Apr 7, 2023

Codecov Report

Patch coverage: 80.00% and project coverage change: +0.01 🎉

Comparison is base (9167a76) 73.31% compared to head (96d5653) 73.33%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1240      +/-   ##
==========================================
+ Coverage   73.31%   73.33%   +0.01%     
==========================================
  Files          76       76              
  Lines       12551    12634      +83     
==========================================
+ Hits         9202     9265      +63     
- Misses       3349     3369      +20     
Impacted Files Coverage Δ
...umptionSaving/tests/test_ConsGenIncProcessModel.py 100.00% <ø> (ø)
HARK/core.py 90.36% <77.77%> (-2.96%) ⬇️
HARK/ConsumptionSaving/ConsGenIncProcessModel.py 82.55% <100.00%> (ø)
HARK/ConsumptionSaving/ConsLaborModel.py 82.97% <100.00%> (ø)
...nsumptionSaving/tests/test_IndShockConsumerType.py 79.18% <100.00%> (+0.17%) ⬆️
HARK/utilities.py 24.57% <100.00%> (+0.21%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

HARK/core.py Outdated Show resolved Hide resolved
HARK/core.py Outdated Show resolved Hide resolved
HARK/core.py Outdated
-------
Parameters or value
Parameters object with parameters that apply to age `age_or_key` or value of
parameter with name `age_or_key`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something, but it feels a little odd for the return type of this in the case where the argument is an age to be a Parameters object.

The Parameters object seem to be about an Agent's entire parameterization, including the age of the agent...but what does it mean to be a Parameters object for one age slice of an agent?

Consider having the return type in the age case be a dictionary or named tuple.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I just thought the slice of a Parameter should also be a Parameter.

We can think of Parameter objects as inputs to agents, but also as inputs to Solvers, or Simulators, or Estimators. Sometimes these objects will only take one age at a time, but I thought they should still receive a collection of parameters wrapped in a Parameters object. Then if we actually need a primitive like a dictionary or named tuple, we could easily get that from the Parameters slice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering to what extent this object is a novel data structure, and to what extent it has specific semantics based on its uses in modeling.

In many ways, the fundamental data structure is an array. We might accomplish a lot of the same thing with a $T \times P$ array, where $T$ is the maximum age and $P$ is the number of parameters. Then we would 'slice' literally by time using the same well-known array slicing interface.

Of course, the difference in this case is that some parameter rows are being defined, up front, as scalars that do not vary. This is largely for convenience -- we don't want to have to manually write in long lists of constant value.

If Parameters is a data structure, then I suppose it makes sense for a slice of Parameters to be another Parameters object.

But if part of the semantics of Parameters is that it has to do with agent ages, then I wonder if the abstraction breaks down.

For example, consider:

In [1]: from HARK.core import Parameters

In [2]: from HARK.ConsumptionSaving.ConsIndShockModel import init_lifecycle

In [3]: params = Parameters(**init_lifecycle)

In [4]: params._term_age
Out[4]: 65

In [5]: param_slice = params[10]

What is param_slice._term_age?

The docs say that _term_age is "The terminal age of the agents in the model." But wouldn't param_slice be a Parameters object with no age varying parameters? Does _term_age maintain the original reference, or does it become 1?

IIRC, when you take a slice of a Python array, you wind up with a view into that array, rather than a copy of that array. This has consequences: when you mutate the slice, you mutate the original array. As in:

>>> a = np.zeros((3,3))
>>> b = a[1:2,1:2]
>>> b[0,0] = 3
>>> a
array([[0., 0., 0.],
       [0., 3., 0.],
       [0., 0., 0.]])

I don't think that with your current implementation, mutating slices would mutate the underlying Parameters object. That may or may not be desirable. But I hope these choices can be made in a considered way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _term_age of a slice would be 1, as it re-interprets all the parameters given, which are now only for a particular time period.

The intended use of this data structure is to both handle the age dimension of models and to solve the "white board" problem.

key : str
name of parameter
value : Any
value of parameter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to get an intuition for how this feels with time-varying item setting.

I suppose if you do:

p = Parameters()

p['PermShkStd'] = [1,2,3,4]

this should set the PermShkStd parameter to the time-vary list given, and it should check to see if it's consistent with the max age of the agent, and so on?

Does this do that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this is how it looks in action

In [1]: from HARK.core import Parameters

In [2]: from HARK.ConsumptionSaving.ConsIndShockModel import init_lifecycle

In [3]: params = Parameters(**init_lifecycle)

In [4]: params._term_age
Out[4]: 65

In [5]: params["new_var"] = [1]*65

In [6]: params._age_var
Out[7]: ['LivPrb', 'PermGroFac', 'PermShkStd', 'TranShkStd', 'new_var']

In [8]:

return {key: getattr(self, key) for key in self.keys()}

def to_namedtuple(self):
return namedtuple("Parameters", self.keys())(**self.to_dict())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a from_namedtuple? See above about time slicing a Parameters object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that we need a special case for from_namedtuple. With namedtuple you can do namedtuple._asdict(), so to create a Parameters object from a namedtuple you'd just have to do Parameters(**namedtuple._asdict())

HARK/core.py Outdated Show resolved Hide resolved
@sbenthall
Copy link
Contributor

Thanks for working on this.
This is going towards solving a lot of headaches in HARK!

See inline comments.

Also, once the code is finalized some unit tests of the Parameters object would be good form.
Especially testing the getitem functionality, which looks a little complex, and the automatic testing for the validity fo time varying inputs.

@alanlujan91
Copy link
Member Author

Thanks for working on this. This is going towards solving a lot of headaches in HARK!

See inline comments.

Also, once the code is finalized some unit tests of the Parameters object would be good form. Especially testing the getitem functionality, which looks a little complex, and the automatic testing for the validity fo time varying inputs.

I will add specific tests for Parameters, but one reassuring thing is that Parameters is already being used throughout the codebase in this PR and nothing breaks with a few tweaks.

@llorracc
Copy link
Collaborator

This looks like a big improvement.

One question: In my pseudocode for the redesign, I've decided that there needs to be a setup phase that defines things that apply to the model as a whole, like modl.name='MateosJMP' and so on. This might, for a life cycle model, include things like the terminal age (say, 120). Then the idea is that the structure gets built by doing modl.add_period operations which build, first the terminal period, then the penultimate period, then penultimate, etc. In this way of doing things, the Parameters object as a whole would be built by accretion as the backwards steps get added. The slice of the parameters object that applies to a particular period would be available to all operations within that period (like, part of the period's namespace), but the period would not have access to the slices from other periods. Just wanted to make sure you think about that plan as you design this.

@Mv77
Copy link
Contributor

Mv77 commented Apr 28, 2023

This is awesome and I look forward to it being integrated.

A good check for it would be to go ahead and replace the machinery that yields parameters to solve_one_period with your class's __get_item__. This bit

https://github.com/alanlujan91/HARK/blob/96d56538d633160141e50921dd1197d675d66a97/HARK/core.py#L1182-L1204

@Mv77
Copy link
Contributor

Mv77 commented Jul 24, 2023

Here's a quick sketch of what I have in mind for Parameters

Takes in a dictionary where the key is the name of a parameter and the value is the value of the parameter If the value is an int, float, or np.ndarray, it is time invariant if the value is a list or tuple, it is time varying, unless it is of length 1.

Once this is done, we can ignore the hardcoded structure of time_inv and time_vary, as this will now depend on the parameters in the dictionary.

Sounds pretty good! Would there be a mechanism to update params in ways that imply they switch from time-varying to time-invariant and vice-versa?

@alanlujan91
Copy link
Member Author

Sounds pretty good! Would there be a mechanism to update params in ways that imply they switch from time-varying to time-invariant and vice-versa?

Yes, every time you set a new parameter ( params[reused_key] = something_new ) the parameter gets reinterpreted according to its contents.

@sbenthall
Copy link
Contributor

I'm +1 on using the explicit type of the parameter value to indicate whether or not it is time-varying.

Here's an example of how I've tried implementing time-varying, non-time varying, and aggregate shocks:
https://github.com/econ-ark/HARK/pull/1296/files#diff-af97a9dba05d2a1808c06e91f081d696411de5ce827102199af7c738a4f9d22dR9-R17

Here's a generic draw_shocks function that takes this shock configuration object and samples from it correctly using type checking:
https://github.com/econ-ark/HARK/pull/1296/files#diff-8e59bc753ba90dc0a3314f0ae5c3366d799ed6cb362a010903f83432acaa02b4R42-R55

I think ideally, all the solution and simulation codes are agnostic about whether variables are time-varying or not. It just works.

@sbenthall
Copy link
Contributor

Related: here is how I've implemented generic Monte Carlo simulation so that it uses draws (that might be age-varying, aggegate, or not-age-varying) as part of a general simulation step.

https://github.com/econ-ark/HARK/pull/1296/files#diff-8e59bc753ba90dc0a3314f0ae5c3366d799ed6cb362a010903f83432acaa02b4R291-R293

What I don't have implemented yet is selection of age-appropriate parameters for the entire population of agents (like we already can have age-appropriate draws from an Index or TimeVarying distribution). If that isn't already implemented on the Parameters class in this PR, I can add that later.

@Mv77
Copy link
Contributor

Mv77 commented Aug 7, 2023

Hey,

My jacobian explorations have arrived at the point I need to start messing with time-varying parameters.

I wanted to ask if this PR will be merged in in the near future (or is it in the back burner?) to decide how to proceed.

@alanlujan91
Copy link
Member Author

@Mv77 The Parameters object now works as intended, but I have not plugged it into the AgentType object because doing so requires a bigger structural change than I anticipated. Essentially, every self.XXPARAMXX needs to be replaced with self.params.XXPARAMXX or some other clever way to do it across the whole codebase

@Mv77
Copy link
Contributor

Mv77 commented Aug 8, 2023

Argh yeah that sounds like an annoying job. I imagine remarks and demarks will break too once you do that.
But we should!
How do we solve this public good problem? Happy to chip in updating a couple of models.

@llorracc
Copy link
Collaborator

llorracc commented Aug 9, 2023

This all gets back to my fundamental point that we have not really described what an AgentType should be.

In my view, it make sense to have descriptions of the solution machinery for the "solveOneStage" component of particular problems. Say, PerfForesightCRRA. And it makes sense to be able to string these stages together to define what happens in a period; say, the last stage in the problem is to solve your consumption choice, the second-to-last-stage is to define your labor supply choice, and the first stage is to define your portfolio choice. Then we might have a PortLaborConsPeriod object that strings these problems together and respects the requirements that the inputs and outputs of the successive stages mesh with each others. If we wanted to solve an infinite horizon problem, the algorithm would be just to:

  1. Add another "period" onto the periods we have already solved
  2. Solve it
  3. Check for convergence
    • if it has not converged, go back to 0

Maybe in a finite horizon problem we would want to assume that in the periods from age 90 to 120, the problem is a very simple one, with only stochastic medical shocks (or medical shocks and stochastic pension income). Fine, then the way you do that is you build a constructor that first defines the problem of a 119 year old as just having one stage, which is the solution to a "ConsMedOldOldGuy" problem, and then adds the problem of a 118 year old, which also consists entirely and exclusively of the apparatus necesssary to solve a "ConsMedOldOldGuy" problem, and so on until you get to the 90 year old. Maybe between 90 and 80, there is some utility from other activities, so you perhaps add an extra stage to the problem of those agents: "ConsFairlyOldGuy"; but you continue building back until you get to age 80. Then maybe between 80 and 70 you have more options, like travel, which is the solution to a "ConsYoungOldGuy" problem; so you need to add another stage to the decision problem for those people. Then between 60 and 70 you have people who may or may not have retired, so you have a "retire yes or no" stage, so now there's a "ConsMaybeRetireGuy" problem. And so on.

What is an "AgentType" here?

Or, what is an "AgentType" in an infinite horizon problem, as distinct from just an agent solving a particular type of problem? Why do we need an AgentType at all? Maybe we need StageTypes because there might be stages in the infinite horizon consumer's problem (portfolio optimization, labor supply, consumption) but that is really about stringing together several solution models in order.

@sbenthall
Copy link
Contributor

I agree that:

  • refactoring all the AgentType models is going to be work
  • it's not clear what AgentType is or should be

I believe that @mnwhite @alanlujan91 and I have discussed this and have agreed that it would make sense to have a more modular architecture that cleanly separates: Model definitions, Solvers/solutions, Simulators, and maybe Estimators. These should plug into each other with clean APIs but be distinct.

I also believe we've agreed that AgentType can, in the future, be a class that wraps these components in a useful way. Rather than holding a lot of functionality inside it, the AgentType can be an object that takes a model, a solve, and a simulator.

Essentially, the biggest problem with HARK right now is that because everything is so tightly coupled in the AgentType class with its 'whiteboard problem', and everything that inherits from it, any substantive improvement requires laborious updates to all downstream models.

I think it would make sense for @alanlujan91 to prepare a PR with just the Parameters class, standing alone and with tests, but not blocking on the big AgentType refactor. That way the Parameters object could be used in more modular components (like the simulators and solvers I've been working on).

I believe @mnwhite has volunteered to be a hero and refactor the AgentType classes? Maybe a lightweight way to start would be to shift over to the Parameters object. If Parameters is already in the library, he could adjust that class if it's necessary to do so when putting it into practice.

@llorracc
Copy link
Collaborator

llorracc commented Aug 9, 2023 via email

@sbenthall
Copy link
Contributor

@llorracc I'm troubled that after all of the recent meetings, you are still drawing a dichotomy between "incremental improvements" and "HARK 2.0".

Please consider that we might make incremental improvements to HARK 1.0, and then incrementally improve to HARK 2.0.

@llorracc
Copy link
Collaborator

llorracc commented Aug 9, 2023 via email

@sbenthall sbenthall mentioned this pull request Aug 10, 2023
3 tasks
@alanlujan91
Copy link
Member Author

@Mv77 @sbenthall pending passing of tests, I think this is reached the point where it can stand on its own and be used as needed. If you have time to review this I would appreciate it.

class Model:
"""
A class with special handling of parameters assignment.
"""

def __init__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May we consider initializing a parameters object when a model is created, even if it is not used (for now)?

I know re-working all of hark so that it works with this new object is a huge lift, but maybe creating the object in the existing agent types can open the opportunity to start using it.

For instance, if ConsIndShock had a parameters object of this new class (currently a dictionary), we could start reworking bits like the income process (which is pervasive across hark) to use this new tool. Creating it as an extra property would not break the existing code.

The idea would violate the principle of having a single source of truth for any param but I think it'd be a nice way to start working this in. Would be interested in hearing what @sbenthall @mnwhite think.

Copy link
Contributor

@Mv77 Mv77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me,

It's hard for me to ask for features or be critical of the code without it being used in an agent type or to streamline some of the existing methods. We've agreed that's a huge lift that should be postponed, so I am looking at this as the creation of a structure that will be used in the future. It is a nice structure and I'd be happy to have it merged in.

How are we going to tackle the huge lift?

@sbenthall sbenthall changed the title [WIP] Parameters Parameters Aug 30, 2023
@sbenthall
Copy link
Contributor

This passes my review. Nice work @alanlujan91 . Has the CHANGELOG been updated?

@alanlujan91
Copy link
Member Author

Will do this later today!

@sbenthall
Copy link
Contributor

Hooray! This will be merged now!

@sbenthall sbenthall merged commit da30b24 into econ-ark:master Sep 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants