[WIP]: SystemStackError: stack level too deep when parsing specific PDF files with catalogs[:Resources] #242

Open
tagliala opened this issue Nov 8, 2024 · 1 comment


tagliala commented Nov 8, 2024

Related to #214, but it is not the same issue.

I can't reproduce #214 with the same use case because I get a `malformed PDF file?` error instead.

Unfortunately I can't share the file, but this is the error:

SystemStackError: 7657 -> 37
     +->> 3812 cycles of 2 lines:
     | ~/dev/combine_pdf/lib/combine_pdf/parser.rb:737:in `merge'
     | ~/combine_pdf/lib/combine_pdf/parser.rb:737:in `block in <class:PDFParser>'
     +-<<

It is happening here:

(inheritance_hash[:Resources][:referenced_object] || inheritance_hash[:Resources]).update((catalogs[:Resources][:referenced_object] || catalogs[:Resources]), &HASH_UPDATE_PROC_FOR_OLD)
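For context, a two-line `merge`/block cycle like the one in the backtrace is what a recursive merge proc produces when the hashes it walks refer back to themselves. The sketch below is not combine_pdf's code: `KEEP_OLD_MERGE` is a hypothetical stand-in that assumes the proc merges nested hashes recursively and keeps the old value on conflicts, and `cyclic_resources` and `merge_outcome` are made-up helpers that build and merge a hand-made cyclic structure.

```ruby
# Hypothetical stand-in for a recursive "keep the old value" merge proc.
# Assumption: nested Hashes are merged recursively, anything else keeps
# the old value. This is NOT copied from combine_pdf.
KEEP_OLD_MERGE = proc do |_key, old_value, new_value|
  if old_value.is_a?(Hash) && new_value.is_a?(Hash)
    old_value.merge(new_value, &KEEP_OLD_MERGE)
  else
    old_value
  end
end

# Builds a Resources-like hash whose nested XObject entry points back
# at the same Resources hash, i.e. a self-referencing structure.
def cyclic_resources
  resources = { XObject: {} }
  resources[:XObject][:Tr20] = { Resources: resources }
  resources
end

# Runs the merge and reports whether it finished or exhausted the stack.
def merge_outcome(target, source)
  target.update(source, &KEEP_OLD_MERGE)
  :ok
rescue SystemStackError
  :stack_overflow
end

puts merge_outcome(cyclic_resources, cyclic_resources)
```

On MRI this is expected to print `stack_overflow`: every pass through `:XObject`, `:Tr20` and `:Resources` lands on the same pair of hashes again, so the recursion never reaches a leaf.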

I've tested against some open PRs and several versions of combine_pdf, and checked the existing issues. Unfortunately, there is no solution at the moment.

I'm still investigating.

Standalone reproducible test case (can't provide the file, sorry)

#!/usr/bin/env ruby

# frozen_string_literal: true

FILE = ARGV[0]

if FILE.nil?
  puts "Usage: #{File.basename($0)} path.to.pdf"
  exit 1
elsif !File.exist?(FILE)
  puts "File '#{FILE}' not found"
  exit 1
end

require 'bundler/inline'

gemfile(true) do
  source 'https://rubygems.org'

  gem 'combine_pdf'
  gem 'minitest'
end

require 'combine_pdf'
require 'minitest/autorun'

class BugTest < Minitest::Test
  def test_combine_pdf_parse
    assert CombinePDF.load(FILE)
  end
end

> puts catalogs[:Resources][:referenced_object].keys.inspect

Keys:
[:Font, :XObject, :ExtGState, :ProcSet, :indirect_generation_number, :indirect_reference_id]

Problem in key: `:XObject`

Keys (:XObject):
[:Im19, :Im22, :Im23, :Im26, :Im37, :Im47, :Im48, :Tr20, :Tr24, :Tr52]

Problem in keys:
[:Tr20, :Tr24, :Tr52]

Keys (:Tr20):
[:is_reference_only, :referenced_object]

Problem in key: `:referenced_object`

Keys (:referenced_object):
[:Type, :Subtype, :BBox, :Resources, :Group, :Length, :Filter, :raw_stream_content, :indirect_generation_number, :indirect_reference_id]

Problem in key: `:Resources`

Keys (:Resources):
[:is_reference_only, :referenced_object]

Problem in key: `:referenced_object`

Keys (:referenced_object):
[:Font, :XObject, :ExtGState, :ProcSet, :indirect_generation_number, :indirect_reference_id]

Problem in key: `:XObject`

Keys (:XObject):
[:Im19, :Im22, :Im23, :Im26, :Im37, :Im47, :Im48, :Tr20, :Tr24, :Tr52]

Same as above? Recursive Hash?
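
One way to confirm the suspicion is to walk the parsed structure while tracking, by object identity, the containers on the current path. This is only a sketch: `find_cycle` is a hypothetical helper, not part of combine_pdf, and the sample hash merely imitates the key path listed above.

```ruby
# Hypothetical helper: returns the key path at which a Hash or Array that
# is already on the current path is reached again, or nil if no cycle exists.
def find_cycle(node, path = [], on_path = {}.compare_by_identity)
  return nil unless node.is_a?(Hash) || node.is_a?(Array)
  return path if on_path.key?(node)

  on_path[node] = true
  pairs = if node.is_a?(Hash)
            node.to_a
          else
            node.each_with_index.map { |value, index| [index, value] }
          end
  pairs.each do |key, value|
    found = find_cycle(value, path + [key], on_path)
    return found if found
  end
  # Leaving this container: shared (but acyclic) references are not cycles.
  on_path.delete(node)
  nil
end

# Hand-built imitation of the structure described in the key dump.
resources = { Font: {}, XObject: {} }
form = { Type: :XObject, Resources: { is_reference_only: true, referenced_object: resources } }
resources[:XObject][:Tr20] = { is_reference_only: true, referenced_object: form }

p find_cycle(resources)
# => [:XObject, :Tr20, :referenced_object, :Resources, :referenced_object]
```

Identity-based tracking matters here: comparing cyclic hashes by value would itself have to traverse the cycle.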

tagliala commented Nov 8, 2024

Question: Is "HASH_UPDATE_PROC_FOR_OLD" a custom implementation of a reverse_deep_merge algorithm?
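
For reference, a "reverse deep merge" in the sense asked about would merge nested hashes recursively while letting the existing (old) value win on conflicts. The sketch below assumes that semantics and adds a guard against revisiting the same pair of hashes. `reverse_deep_merge!`, its `seen` parameter and `cyclic_tree` are hypothetical names, not combine_pdf API, and this is not a proposed patch.

```ruby
# Hypothetical cycle-safe reverse deep merge: keys missing from old_hash are
# copied from new_hash, nested Hashes are merged recursively, and on any
# other conflict the old value is kept. Mutates old_hash in place.
def reverse_deep_merge!(old_hash, new_hash, seen = {}.compare_by_identity)
  # Remember which (old, new) pairs are already being merged, by identity.
  partners = (seen[old_hash] ||= {}.compare_by_identity)
  return old_hash if partners.key?(new_hash)

  partners[new_hash] = true
  # Iterate over a snapshot so shared objects can be mutated safely.
  new_hash.to_a.each do |key, new_value|
    if !old_hash.key?(key)
      old_hash[key] = new_value
    elsif old_hash[key].is_a?(Hash) && new_value.is_a?(Hash)
      reverse_deep_merge!(old_hash[key], new_value, seen)
    end
    # Otherwise the old value wins and nothing changes.
  end
  old_hash
end

# Builds a self-referencing hash with one extra top-level key.
def cyclic_tree(extra_key)
  resources = { XObject: {}, extra_key => 1 }
  resources[:XObject][:Tr20] = { Resources: resources }
  resources
end

merged = reverse_deep_merge!(cyclic_tree(:Font), cyclic_tree(:ProcSet))
p merged.keys
# => [:XObject, :Font, :ProcSet]
```

Unlike `Hash#merge`, this version changes nested hashes in place, so it is only a sketch of the guard itself and not a drop-in replacement.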
