
[Requirement] Slicing dataset based on a certain log. #37

Open
YooSunYoung opened this issue Jun 14, 2024 · 4 comments

Comments

@YooSunYoung
Member

Executive summary

The dataset should be split based on the timestamps of certain logs.

Context and background knowledge

During the measurement there will be some changes, and they will be logged into the files.
These log entries have timestamps, and the dataset should be binned (sliced) based on those timestamps.

Inputs

File or dataset that contains NXlog or other time-dependent fields.

Methodology

I think it would be nice to have a binning helper that uses a foreign key for binning.
For example, we may need to use the timestamp of the rotation_angle log to slice the data; the timestamp would then be the foreign key.

Alternatively, we could just assume the foreign key will always be the timestamp...?
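To make the foreign-key idea concrete, here is a minimal, library-free sketch of what such a helper might do: slice a log's values by a time window, using the log's own timestamps as the key. The function name `slice_by_log` and the sample data are illustrative only, not existing scipp API.

```python
import numpy as np

def slice_by_log(times, values, t_start, t_stop):
    """Return the values whose timestamps fall in [t_start, t_stop).

    `times` must be sorted ascending; it acts as the "foreign key"
    linking the values to the time axis.
    """
    lo = np.searchsorted(times, t_start, side='left')
    hi = np.searchsorted(times, t_stop, side='left')
    return values[lo:hi]

times = np.array([4.4, 5.5, 6.6])   # log timestamps in seconds
values = np.array([1.1, 2.2, 3.3])  # logged values
print(slice_by_log(times, values, 5.5, 7.0))  # -> [2.2 3.3]
```

In scipp terms this corresponds to label-based slicing along the `time` coordinate, as in the example further down in this thread.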

Outputs

Sliced dataset as....

@scipp/ess-maintainers I'm not sure if it should be a DataGroup or DataArray...

Which interfaces are required?

Python module / function

Test cases

There is a Timepix dataset that recorded the rotation angle of the sample.
The dataset should be split based on the rotation angle.
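The test case above could be sketched as follows: assign each event to the rotation-angle value that was current at the event's timestamp (the last log entry at or before it), then split the events into one group per angle. All names and numbers here are illustrative assumptions, not the actual Timepix data.

```python
import numpy as np

log_times = np.array([0.0, 10.0, 20.0])   # when the rotation angle changed
log_angles = np.array([0.0, 45.0, 90.0])  # the angle in effect after each change

event_times = np.array([1.0, 5.0, 12.0, 25.0])

# Index of the log entry in effect at each event time: the last entry
# whose timestamp is <= the event's timestamp.
idx = np.searchsorted(log_times, event_times, side='right') - 1

# One group of events per logged angle.
groups = {angle: event_times[idx == i] for i, angle in enumerate(log_angles)}
for angle, events in groups.items():
    print(angle, events)
```

This is essentially the "derived event parameter from a time series" pattern from the scipp filtering docs linked below, reduced to plain numpy.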

Comments

No response

@SimonHeybrock
Member

The description here is unclear; I am not sure what is required beyond existing Scipp functionality.

@YooSunYoung
Member Author

We just need a helper that derives timestamp slicing from the f144 log values.

We could do it case-by-case with existing scipp functionality, but I think it is a very common thing that every instrument needs.
So it may be worth wrapping it as a common tool...?

@SimonHeybrock
Member

Can you clarify what you have in mind, beyond the built-in functionality like https://scipp.github.io/user-guide/binned-data/filtering.html#Compute-derived-event-parameters-from-time-series-or-other-metadata? Aside from that, is this a duplicate of #24?

@jokasimr
Contributor

jokasimr commented Jul 9, 2024

@YooSunYoung

Is this the functionality that is required here?

import h5py
import scippnexus as snx
import scipp as sc

with h5py.File('dummy.nxs', mode='w', driver="core", backing_store=False) as f:
    # Set up: create an NXlog with three timestamped values.
    da = sc.DataArray(
        sc.array(dims=['time'], values=[1.1, 2.2, 3.3]),
        coords={
            'time': sc.epoch(unit='ns')
            + sc.array(dims=['time'], unit='s', values=[4.4, 5.5, 6.6]).to(
                unit='ns', dtype='int64'
            )
        },
    )
    log = snx.create_class(f, 'log', snx.NXlog)
    snx.create_field(log, 'value', da.data)
    snx.create_field(log, 'time', da.coords['time'] - sc.epoch(unit='ns'))
    log = snx.Group(log, definitions=snx.base_definitions())

    # Slice part of it based on the `time` column.
    print(log['time', sc.scalar(5.5, unit='s').to(unit='ns'):])

Output:

<scipp.DataArray>
Dimensions: Sizes[time:2, ]
Coordinates:
* time                    datetime64             [ns]  (time)  [1970-01-01T00:00:05.500000000, 1970-01-01T00:00:06.600000000]
Data:
                            float64  [dimensionless]  (time)  [2.2, 3.3]
