-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Interesting work #56
Comments
Hi! I just took a look at your library. It looks great, I would agree about collaborating on an exif library would be useful. This library focuses on coverage and performance. I think Phil Harvey's exiftool is a great library and would be useful to have something similar in golang. What ideas did you have? |
So I primarily created my own library because I wanted to load images lazily onto disk. I work with hyperspectral images which are in the gigabytes of size and it'd be nice to implement the My pipeline basically is as follows:
Right now the process is dog slow, taking several minutes to process a single result hyperspectral image. I think if the images are cached smartly the process can be sped up by a lot. My problem is that I do not know how to generically process the images. There's a pipeline for it but I do not understand how it was created. So I do not know how to process the images I have today based on their EXIF data i.e. implement the image.Image interface. So there are my reasons for building the library. As for having an exiftool in golang- I'm all for for it. Take what you need from my library! What I'm particularily fond of from my library is that extracting all exif data and visualizing it is pretty easy and straightformward. I can't tell if your library only extracts only certain exif tags contained or all exif tags, judging by the exif2.Exif type. I also think there's value in providing a simple implementation with no external dependencies which others can then build upon. As it stands most of other exif libraries have tons of imports and are inmensely complex and hard to maintain (thinking of the go-exif library). I'd like to have a library that is simple, performant and easy to use. |
My need is to process tens of hundreds of images quickly, many of these images are RAW images and don't need to be fully read from disk. I like your usage of I agree that a "ReadSeeker interface" is more efficient than loading the entire image into memory. In regards to the exif2.Exif type, I agree that it is limited. @mholt brought this up previously #49. The challenge that I have faced with using a map is that it significantly slows down processing. I believe most, if not all commonly used fields can be represented in a a more efficient way. This library could improve on reducing dependencies. Some of the dependencies are for testing purposes but a more simple/straight forward library would be beneficial. What interface methods would you suggest for simple, performant, and easy to use? @abrande, @matrixik , and @mholt would appreciate any insight you have. |
The output of my images is TIFF I believe. I'm not really sure. If I set the file extension to As for my suggestions, I am no perfomance engineer- I have little knowledge on performant data structures like trees and such. I'd be happy with a library that is not blazingly fast, just fast enough and a pleasure to use. I wonder if the way I deal with the API would serve your use case. It does have more boilerplate but I'm afraid you would not be able to get around it without sacrificing flexibility or performance: xResTag, err := decoder.GetTag(fp, 0, exifid.XResolution)
if err != nil {
panic("resolution tag not found")
}
fmt.Printf("the x resolution is %v", xResTag.Value()) Theres a couple things to note about this API:
|
I do love the idea of combining your libraries! In my use case, decoding and processing images is way more expensive than reading metadata. So while I value performance, squeezing out every possible allocation (as awesome as that is) is not the top priority for me, since image processing in my pipeline is way slower anyway. That said, a library that streams the input rather than loading the whole thing into memory is a must, at least. Which is why I like your packages so far. What is also important for my use case is the ability to read any and all metadata fields. I'm ok with A way to iterate all the data and also access any arbitrary fields are necessary for me. Maybe it'd be possible to combine these features into a performance focused library with a note in the docs that using these may have more allocations? I'm personally ok with that for my needs. |
@mholt It does sound like my library might fulfill your requirements for exif parsing. That said, I'm not sure if it covers all of your needs. It'd be great to have an experience report to know if it's worth merging the libraries. Maybe there's a fatal flaw in how I've approached the problem that is only known on real use. |
To clarify- I think we could better make the decision on how to merge the libraries with more UX. As it stands I'm unsure on how to go about it... |
Haven't been able to get back to replying. Below I have some suggestions for what the end user would work with. This can be done while retaining flexibility and performance. I think it would be important to agree on what would be the best User facing interface. The non-exported code can change but it would be important to agree on a simple API that can remain constant. Helper values for GPS, Dates, and "etc" would still need to be defined as there are many types included in exif information. Below are some suggestions. // example
sr := io.ReadSeeker // any implementation of io.ReadSeeker
reader, err := exif.Decode(sr)
if err != nil {
panic("exif decoder failed")
}
xResTag, err := reader.GetTag(0, exififd.XResolution) // GetTag(ifd Number, Tag ID) Tag
if err != nil {
panic("resolution tag not found")
}
fmt.Printf("the x resolution is %v", xResTag.Value()) // tag.Value() string
// Exif Tag
type Tag interface {
Value() any // returns tag value as specified by defaults
Uint16() uint16 // reffered to Short
Int16() int16 // reffered to SignedShort
Uint32() uint32 // reffered to Long
Int32() int32 // reffered to SignedLong
Uint64() uint64 // refrered to as double
Rational32() Rational32 // returns tag as Rational32
Rational64() Rational64 // returns tag as Rational64
String() string // returns string value of tag
Bytes() []byte // returns original byte value of tag
Valid() bool // returns false if tag does not meet "valid parameters"
}
// Exif Rational
type Rational interface {
Float32() float32
Float64() float64
}
type Rational32 [2]uint32 // Similar to Exif Rational
type Rational64 [2]uint64 Thanks for the suggestions. |
Hey! Thanks for getting back and for providing a comprehensive list of suggestions for the API. Do try to test out my library and see if you get what I mean. Instead of having all the integer methods there is a single // Exif Tag
type Tag interface {
Int() (int64,error) // from 8bit integers to 32 bit integers (I couldnt find examples of 64bit integers in EXIF)
Float() (float64, error) // double or single precisions
Rational() (Rational, error)
Bytes() ([]byte, error) // can return both string or byte kind.
} See code relevant to the Tag interface here. Moreover I'd like you to see if the rational.Rational intertface I have devised suits your needs. I tried to make everything as simple as possible and easy to use and understand! |
(I'm following along, and hope to have more time to respond later this week -- but don't wait up for me in the meantime 😅 - discussion so far looks great) |
It may take me some time to give an "experience report", as I'd need to refactor a significant part of my app to use the different API -- although I understand why that'd be helpful. @evanoberholster Overall, I like that API -- and I do like @soypat's suggestion of fewer methods. For instance, I imagine |
The String() method is currently used to implement the fmt.Stringer interface so users can easily pretty print it. You can still convert the I'll also mention in passing that I saw your conversation today with Dustin over at go-exif and went ahead and tried my hand at implementing a exif data offset searcher. |
Is that different than just getting the value as a string? 🤔 I feel like that's all I'd want a Stringer to do anyway...
I know, it's just annoying 🙃 Not a big deal though if that's just how it has to be.
Cool! That seems useful. Again, if it helps for information, my particular use case is to be able to load images, potentially very large ones, efficiently (streaming), and to extract as much of the metadata as possible. 👍 |
The issue here is that a Tag is defined by the Exif ID and the data it contains so I'd like to print out a human readable representation of both fields. For the special case of wanting only the value's string representation I've created a new
Hmm- I wonder if the API would be comfy enough for this use case 🤔. I imagine maybe the stream reader could be wrapped to implement the |
I see... is its output kind of like what |
Yes, more or less.
Another issue I found is that if we want |
I've written my own exif parser recently. Maybe we can consolidate work?
My library is quite simple- Keep in mind I learned about exif last week and all of it is quite new to me.
https://github.com/soypat/exif
Benchmarks show your library is around 2 times faster for parsingexif metadata.
The text was updated successfully, but these errors were encountered: