
Runs slow. Anyone interested in improving performance? #43

Open
mrolle45 opened this issue Nov 27, 2017 · 2 comments

Comments

@mrolle45

I don't want to take the time right now to submit performance enhancements, but perhaps @moyix or some other person reading this note would like to do the work.
I find that a tremendous amount of time is spent on file reads, string concatenations, and substring operations. There are two ways to speed things up that I have seen; both would be simple to implement:

  1. In StreamFile class, cache the stream pages, so you only have to read them once from the file. Or better, if the platform supports mmap, just mmap the entire PDB file, create a buffer for it, and take a slice of the buffer for a stream page whenever you need it. In the non-mmap case, you could add a method to clear the cache, to be called, for instance, after parsing the entire stream.
  2. In StreamFile._read, see how many pages the request spans. Use the above cache / mmap to get slices of the individual pages. Return a single slice, a concatenation of two slices, or use cStringIO to assemble more than two. Using _read_pages is inefficient because you then have to take a slice of its result.
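A minimal sketch of both suggestions combined, assuming Python 3 (where `b"".join` replaces the Python 2 `cStringIO` idiom). `PageCache` and its method names are illustrative, not part of pdbparse's actual API:

```python
import mmap

class PageCache:
    """Illustrative page cache: mmap the whole file once, slice pages on demand."""

    def __init__(self, path, page_size):
        self.page_size = page_size
        with open(path, "rb") as f:
            # One mapping for the entire PDB; the mmap object keeps its own
            # handle, so the file object can be closed immediately.
            self._map = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def page(self, index):
        # Slice of the mapping for one file page; no separate file read.
        start = index * self.page_size
        return self._map[start:start + self.page_size]

    def read(self, pages, offset, size):
        """Read `size` bytes at stream `offset`, where `pages` is the
        stream's ordered list of file-page indices (suggestion 2)."""
        first = offset // self.page_size
        last = (offset + size - 1) // self.page_size
        chunks = [self.page(pages[i]) for i in range(first, last + 1)]
        start = offset % self.page_size
        # One join and one slice, instead of reading and concatenating
        # every page of the stream as _read_pages does.
        return b"".join(chunks)[start:start + size]
```

The key point is that only the pages touched by the request are materialized, and the OS page cache backs the mapping, so repeated reads of the same region never hit the disk twice.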

I think this would eliminate most of the time spent in parsing a PDB as a whole. You could try profiling pdbparse with a large file, such as ntoskrnl.pdb.
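A quick recipe for that profiling run with the standard-library `cProfile`/`pstats` modules; the `workload` function here is a stand-in for the real call, `pdbparse.parse("ntoskrnl.pdb")`:

```python
import cProfile
import io
import pstats

def workload():
    """Stand-in for pdbparse.parse("ntoskrnl.pdb")."""
    # Lots of small concatenations, the kind of hot spot described above
    return b"".join(bytes([i % 256]) for i in range(10000))

profiler = cProfile.Profile()
profiler.runcall(workload)

# Sort by cumulative time so the expensive call chains surface first
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

With the real parse as the workload, the read/concatenation/slice costs described above should dominate the cumulative-time column.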

@ZhangShurong

@mrolle45 I tried mmap, but it's still very slow. Do you have any suggestions?

@moyix
Owner

moyix commented Aug 21, 2018

You should try profiling, but my guess is that some of the slowness is due to the use of Construct. One workaround is to only parse the streams you need for a particular task; you can see an example of this here:

import pdbparse

# Skip eager parsing of every stream, then load only the ones needed
pdb = pdbparse.parse(args[0], fast_load=True)
pdb.STREAM_TPI.load()
pdb.STREAM_DBI.load()
