Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(genLocalState) - debugging utility for log fetching and local maci state generation #758

Merged
merged 26 commits into from
Nov 28, 2023

Conversation

ctrlc03
Copy link
Collaborator

@ctrlc03 ctrlc03 commented Oct 20, 2023

fix #723
fix #447

  • Fix bug of double fetching
  • Add fetching in blocks for generateMaciState
  • Implement serialization and deserialization for all DomainObjects as well as MaciState and Poll
  • In genLocalState save the serialized MaciState and Polls to a JSON file
  • in genProofs add the option to read the state from a json file -> deserialize the json file into a valid MaciState and Polls

How to test:

Run tests/vanilla/testPreFetch.sh

@ctrlc03 ctrlc03 changed the title feat(fetch-logs) - add back fetch log and adapt to MACI v1 feat(fetch-logs) - fetch logs in batches v1 Oct 20, 2023
@ctrlc03 ctrlc03 linked an issue Oct 23, 2023 that may be closed by this pull request
@ctrlc03 ctrlc03 requested review from chaosma and removed request for chaosma October 23, 2023 17:27
@ctrlc03 ctrlc03 self-assigned this Oct 23, 2023
@ctrlc03 ctrlc03 marked this pull request as ready for review October 26, 2023 12:15
@baumstern
Copy link
Member

Isn't #723 a request to add the fetchLogs command back?

const lastBlock = endBlock ? endBlock : await provider.getBlockNumber()

// Fetch event logs in batches
for (let i = fromBlock; i < lastBlock; i += blocksPerRequest + 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's worthwhile to implement test code to avoid issues like the previous case of duplicate logs. Please note that such duplicate logs resulted in generating an incorrect proof.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    // init() should only be called up to 1 time
    assert(
        initLogs.length <= 1, 
        'More than 1 init() event detected which should not be possible',
    )

If there are duplicate logs, this will throw.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are signUpLogs, mergeStateAqSubRootsLogs, mergeStateAqLogs, deployPollLogs, publishMessageLogs...etc. Without adequate tests, confidently predicting the impact of these changes on production becomes challenging. What are your thoughts on this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code incorporates the fix needed to ensure we do not fetch duplicate logs - the same fix that the project encountering the issue proposed + We do have e2e and integration tests which use this genProof and have been working correctly.

@ctrlc03
Copy link
Collaborator Author

ctrlc03 commented Nov 1, 2023

Isn't #723 a request to add the

sure, but there was missing context about that, fetchlogs was replaced by this genMaciStateFromContract in v1. Only thing missing was the ability to fetch in batches

@ctrlc03 ctrlc03 requested a review from baumstern November 1, 2023 18:40
@baumstern
Copy link
Member

baumstern commented Nov 2, 2023

Isn't #723 a request to add the

sure, but there was missing context about that, fetchlogs was replaced by this genMaciStateFromContract in v1. Only thing missing was the ability to fetch in batches

It’s challenging to view genMaciStateFromContract() as a direct replacement for fetchLogs() considering the following operational aspects:

File Output Capability: The original fetchLogs() provides a feature to output logs to a file. This is particularly important when dealing with extensive datasets, ensuring that we can pre-fetch logs and pass them onto the genProofs().

Code Reference: fetchLogs usage

Independent Usability for Debugging: fetchLogs() offers the advantage of being independently usable within the CLI, alongside the genProofs() command. This independence is not just about providing flexibility; it plays a critical role in debugging and verification. Given the historical context, such as the ethStaker round last year where we witnessed genProof() command failures and experienced runtimes extending beyond 10 hours, having the ability to independently run and validate the output of fetchLogs becomes invaluable.

@ctrlc03 ctrlc03 marked this pull request as draft November 2, 2023 18:18
@ctrlc03 ctrlc03 marked this pull request as ready for review November 7, 2023 16:34
@ctrlc03 ctrlc03 mentioned this pull request Nov 7, 2023
8 tasks
Copy link
Member

@baumstern baumstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fetchLogs command, as it's currently implemented, seems to be misnamed as it does more than just fetching logs—it generates and outputs the MACI state, which is an entirely different responsibility. This oversteps the expected functionality of a log-fetching utility, leading to unnecessary complexity and potential confusion:

  • The command requires contract ABIs and the coordinator's private key, which are beyond the scope of simply fetching logs. These requirements suggest operations relevant to MACI state generation and proof creation, not log retrieval.
  • It performs operations with significant time complexity, such as array pushes and sorts on the entire log dataset, which are superfluous for log fetching; these operations incurs O(n) and O(n log n) time complexities.
  • The inclusion of the coordinator's private key raises security concerns, as it's not necessary for fetching logs and should not be requested without cause

This conflation of concerns could be misleading and might introduce inefficiencies or security vulnerabilities. A command should do one thing and do it well—fetching logs shouldn't involve generating state. Consider refactoring to separate these concerns, ensuring the command name and its functionality are in alignment, and that it adheres to the principle of least privilege by not requiring sensitive information where it's not needed.

cli/ts/fetchLogs.ts Outdated Show resolved Hide resolved
for (let i = fromBlock; i < lastBlock; i += blocksPerRequest + 1) {
const toBlock = (i + blocksPerRequest) >= lastBlock ? undefined : i + blocksPerRequest

const tmpInitLogs = await provider.getLogs({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider (Rate Limiting): It's worth noting that this PR addresses the eth_getLogs rate limit by splitting up requests to stay within the 10K logs response cap. However, Alchemy also implements rate limiting through Compute Units Per Second (CUPS), which applies to the API key as a whole. While this PR doesn't need to address CUPS limits right now, it's an important aspect to keep in mind for the future to ensure our application remains within Alchemy's operational boundaries (or generally other node provider services as well).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a sleep flag (optional) which users can use to add a sleep in between each getLog request :) - great suggestion thanksss


// ensure we have at least one address
if ((!contractAddrs||!contractAddrs["MACI"]||!contractAddrs[pollArtifactName]) && !args.maci_contract && !args.poll_contract) {
console.error('Error: MACI and Poll contract addresses are empty or this poll Id does not exist')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message does not guide the user on how to resolve the issue. It would be beneficial to specify which command-line arguments the user needs to pass to rectify this error. For example, we could enhance the error message to something like:

  • Error: MACI and Poll contract addresses are empty or this poll Id does not exist. Please ensure you include '--maci-contract' and '--poll-contract' with the respective contract addresses as arguments.

This clarification would help users to self-diagnose and fix the issue without delving into the codebase or documentation.

const pollArtifactName = `Poll-${pollId}`

// ensure we have at least one address
if ((!contractAddrs||!contractAddrs["MACI"]||!contractAddrs[pollArtifactName]) && !args.maci_contract && !args.poll_contract) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conditional check seems to allow the execution to proceed even when poll_contract is not provided, which results in a TypeError at the address validation step:

% node build/index.js fetchLogs -n 150 -o fetchlogs.log -sk 'macisk.aa' -x '0x37091e3064c20637950e007500049477B0b96636'

/Users/daehyun/workspace/pse/maci/cli/ts/utils.ts:33
    return address.match(/^0x[a-fA-F0-9]{40}$/) != null
                   ^
TypeError: Cannot read properties of undefined (reading 'match')
    at validateEthAddress (/Users/daehyun/workspace/pse/maci/cli/ts/utils.ts:33:20)
    at /Users/daehyun/workspace/pse/maci/cli/ts/fetchLogs.ts:143:28
    at step (/Users/daehyun/workspace/pse/maci/cli/build/fetchLogs.js:33:23)
    at Object.next (/Users/daehyun/workspace/pse/maci/cli/build/fetchLogs.js:14:53)
    at /Users/daehyun/workspace/pse/maci/cli/build/fetchLogs.js:8:71
    at new Promise (<anonymous>)
    at __awaiter (/Users/daehyun/workspace/pse/maci/cli/build/fetchLogs.js:4:12)
    at fetchLogs (/Users/daehyun/workspace/pse/maci/cli/ts/fetchLogs.ts:121:35)
    at /Users/daehyun/workspace/pse/maci/cli/ts/index.ts:249:24
    at step (/Users/daehyun/workspace/pse/maci/cli/build/index.js:34:23)
    at Object.next (/Users/daehyun/workspace/pse/maci/cli/build/index.js:15:53)
    at /Users/daehyun/workspace/pse/maci/cli/build/index.js:9:71
    at new Promise (<anonymous>)
    at __awaiter (/Users/daehyun/workspace/pse/maci/cli/build/index.js:5:12)
    at main (/Users/daehyun/workspace/pse/maci/cli/ts/index.ts:137:14)
    at Object.<anonymous> (/Users/daehyun/workspace/pse/maci/cli/ts/index.ts:254:5)

To prevent this runtime error, we should consider making the poll_contract argument mandatory or add a more descriptive error message when it's missing. This would improve the user experience by providing immediate and clear feedback on what is required for successful execution.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mm if you do not pass a poll ID, it will default to poll ID number 0.. where are you trying this? on a local network or on a live testnet?

)

parser.addArgument(
['-e', '--eth-provider'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --eth-provider argument is essential for the operation of this script as it stands. Omitting this results in a NETWORK_ERROR, indicating that the script cannot detect the network.

% node build/index.js fetchLogs -n 150 -o fetchlogs.log -sk 'macisk.aaa' -x '0x37091e3064c20637950e007500049477B0b96636' -p '0x37091e3064c20637950e007500049477B0b96636'

/Users/daehyun/workspace/pse/maci/cli/node_modules/@ethersproject/logger/src.ts/index.ts:269
        const error: any = new Error(message);
                           ^
Error: could not detect network (event="noNetwork", code=NETWORK_ERROR, version=providers/5.7.2)
    at Logger.makeError (/Users/daehyun/workspace/pse/maci/cli/node_modules/@ethersproject/logger/src.ts/index.ts:269:28)
    at Logger.throwError (/Users/daehyun/workspace/pse/maci/cli/node_modules/@ethersproject/logger/src.ts/index.ts:281:20)
    at JsonRpcProvider.<anonymous> (/Users/daehyun/workspace/pse/maci/cli/node_modules/@ethersproject/providers/src.ts/json-rpc-provider.ts:483:23)
    at step (/Users/daehyun/workspace/pse/maci/cli/node_modules/@ethersproject/providers/lib/json-rpc-provider.js:48:23)
    at Object.throw (/Users/daehyun/workspace/pse/maci/cli/node_modules/@ethersproject/providers/lib/json-rpc-provider.js:29:53)
    at rejected (/Users/daehyun/workspace/pse/maci/cli/node_modules/@ethersproject/providers/lib/json-rpc-provider.js:21:65)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

Could we adjust the argument's configuration to be mandatory?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could it be that you were not running a hardhat local node? if you do not pass it as argument then it will default to the local rpc.. if no local node is running it will ofc give this error

Comment on lines 40 to 41
maciPrivkeyGroup.addArgument(
['-sk', '--privkey'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering the main functionality of fetchLogs command is to retrieve event logs, including a private key argument might not be required. Removing it could simplify the command-line interface and minimize potential security concerns.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is needed to generate the maci state :)

cli/ts/fetchLogs.ts Outdated Show resolved Hide resolved
Comment on lines 163 to 177
// fetch abis
const [ maciContractAbi ] = parseArtifact('MACI')
const [ pollContractAbi ] = parseArtifact('Poll')

const maciContract = new ethers.Contract(
maciAddress,
maciContractAbi,
provider,
)

const pollContract = new ethers.Contract(
pollAddress,
pollContractAbi,
provider
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maciContract and pollContract instances are created here but do not appear to be used later in the code. If these instances are not required for subsequent operations, their creation could unnecessarily consume resources and possibly lead to confusion about their purpose. It might be worth considering their removal to streamline the codebase unless there is a future use for them that is not immediately apparent. This would also help in maintaining clean and efficient code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed thanks

Comment on lines 365 to 394
if (args.state_file) {
// @todo actually read the file first
const content = JSON.parse(fs.readFileSync(args.state_file, 'utf-8').toString())
maciState = MaciState.fromJSON(content)
// ensure we merge all messages
maciState.polls.forEach((poll: Poll) => poll.mergeAllMessages())
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change introduce to reading from a state file if args.state_file is provided, and then generating the proof based on the JSON content. However, there appears to be no test coverage to verify the correctness of the serialization/deserialization process and the subsequent proof generation using the parsed data. Could we introduce unit test that validate the correct functioning of the proof generation when using parsed data?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I can add one e2e! unit tests for deserialization and serialization are in the core package btw

fromBlock = txn.blockNumber
if (args.state_file) {
// @todo actually read the file first
const content = JSON.parse(fs.readFileSync(args.state_file, 'utf-8').toString())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that the input file format is correct. It's important to handle potential exceptions that could arise from incorrect file formats. Adding error handling for the JSON parsing process could prevent runtime errors and improve the robustness of the script.

@ctrlc03
Copy link
Collaborator Author

ctrlc03 commented Nov 9, 2023

The fetchLogs command, as it's currently implemented, seems to be misnamed as it does more than just fetching logs—it generates and outputs the MACI state, which is an entirely different responsibility. This oversteps the expected functionality of a log-fetching utility, leading to unnecessary complexity and potential confusion:

  • The command requires contract ABIs and the coordinator's private key, which are beyond the scope of simply fetching logs. These requirements suggest operations relevant to MACI state generation and proof creation, not log retrieval.
  • It performs operations with significant time complexity, such as array pushes and sorts on the entire log dataset, which are superfluous for log fetching; these operations incurs O(n) and O(n log n) time complexities.
  • The inclusion of the coordinator's private key raises security concerns, as it's not necessary for fetching logs and should not be requested without cause

This conflation of concerns could be misleading and might introduce inefficiencies or security vulnerabilities. A command should do one thing and do it well—fetching logs shouldn't involve generating state. Consider refactoring to separate these concerns, ensuring the command name and its functionality are in alignment, and that it adheres to the principle of least privilege by not requiring sensitive information where it's not needed.

Ok so the file now has a different name, generateLocalState to avoid confusion. It is not a 100% replacement for fetchLogs which was present in v0.x but it is a new utility for Maci v.1x - it provides support for fetching logs in chunks, sleep option for avoiding rate limiting of RPC providers, and the ability to save the full state in a nice and optimal JSON representation. This allows for better debugging, as it can be analyzed separately if any issues arise when generating proofs.

@ctrlc03 ctrlc03 changed the title feat(fetch-logs) - fetch logs in batches v1 feat(generateLocalState) - debugging utility for log fetching and local maci state generation Nov 9, 2023
@ctrlc03 ctrlc03 requested a review from baumstern November 9, 2023 15:38
@ctrlc03 ctrlc03 changed the title feat(generateLocalState) - debugging utility for log fetching and local maci state generation feat(genLocalState) - debugging utility for log fetching and local maci state generation Nov 9, 2023

echo "gen proofs ..."

# generate the local state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice demonstration of how to use new command!

…s correctly and fix bug in curentMessageBatchIndex
… limiting, change utility name, add better error handling
…t as parameter when de-serializing the object
Copy link
Member

@baumstern baumstern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@baumstern baumstern merged commit ca82385 into dev Nov 28, 2023
8 checks passed
@baumstern baumstern deleted the feat/fetchlogs branch November 28, 2023 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing MACI CLI, fetchLog, in v1.1.1 Couldn't get a large amount of logs from Alchemy
3 participants