Set correct checksums and samps_per_frame in Record.wrsamp#424
Set correct checksums and samps_per_frame in Record.wrsamp#424tompollard merged 5 commits intomainfrom
Conversation
The original intention of this code was that get_write_fields would return a list of signal information fields to be saved, and check_sig_cohesion would check that those specified fields matched the actual signal data. However, this did not do what was intended. HeaderMixin.get_write_fields returns a 2-tuple of (rec_write_fields, sig_write_fields), not a list of field names. Therefore, the parameter to check_sig_cohesion never had "checksum" or "init_value" as elements, and therefore, check_sig_cohesion never actually checked the values of these fields when writing a record. Many existing test cases rely on being able to read and write a record as-is (missing fields, incorrect init_values, checksums that are correct mod 2**16 but not equal to the result of calc_checksum); these tests would fail if Record.wrsamp actually required init_value and checksum fields to match as check_sig_cohesion expects. This code should be redesigned, and the test cases probably should be less strict. In the meantime, however, make it clear that this function isn't actually doing what it was pretending to do.
When the expanded argument to wrsamp is false, we are writing the d_signal data, which contains one sample per frame by definition. Therefore, the header file must not claim that the signal files will contain multiple samples per frame. The samps_per_frame attribute of the Record object may or may not be set appropriately for the actual data contained in the Record object; for example, rdrecord(r, smooth_frames=True) will set samps_per_frame according to the number of samples per frame in the original record.
When writing a record, the application generally shouldn't have to worry about details like calculating checksums. The wfdb package *should* ensure that the output files it generates are correct, and *should not* trust applications to calculate checksums correctly. Moreover, although it was originally intended for SignalMixin.wr_dats to verify that the application-supplied checksums were correct, this never actually worked, and existing test cases depend on that fact. So it is not reasonable for wr_dats to raise an exception if the checksums are wrong. Instead, update the checksums in the Record object when wrsamp is called (and note that we have to do this before calling wrheader, which is called before wr_dats). For compatibility with old test cases, that expect to be able to read and write records quasi-unmodified, we don't set checksums for signals that previously had a checksum of None, or that previously had a checksum that was correct mod 2**16.
Also reformat this as an f-string for readability.
|
Thanks @bemoody, basically this looks good to me, though I don't totally understand the purpose of the Not directly related to this pull request, but I think the addition/removal of fields in the header makes them less human-readable. I'd prefer consistent fields that are sometimes populated with something like "NULL" than fields that disappear when not required,. |
|
If samples-per-frame is omitted from the header file, it defaults to 1, which is what we want when writing Multi-frequency was a fairly late addition to the format, which is why the samples-per-frame field in the header is optional. If I were designing a format or an API now, multi-frequency support wouldn't be optional. |
|
Makes sense, thanks Benjamin. |
There are a couple of ways to write a record using this package. One is
wfdb.wrsamp, the other is creating aRecordobject and calling thewrsampmethod.Record.wrsampwas, I think, meant to be more low-level and perhaps only intended for internal use of the package. But for various reasons, applications may want or need to call this method directly. So I think we need to support it, and make it more fool-proof; in particular, this method shouldn't generate invalid WFDB records even if the application supplies wrong parameters.Here I address two of those parameters -
checksumandsamps_per_frame.Typically,
checksumis set bywfdb.rdrecord. But naturally, the input data may have been modified (explicitly by the application, or implicitly e.g. bysmooth_frames) between reading the record and writing it. So when callingRecord.wrsamp, we want to set this field to the actual checksums of the signals. For backward compatibility and for applications that want to edit an existing header file, we leave the checksums alone if they're absent or if they're already correct.(WFDB checksums are stored and written as an integer but only the low 16 bits are used. So values of -1 and 65535 are equally correct. Yes, this is kind of broken.)
samps_per_frameis only relevant when writing multi-frequency data. If you are writing single-frequency signal files, then the header file must not claim there are multiple samples per frame. Since the field is optional, we can simply drop it whenexpandedis false.(If you are writing multi-frequency signal files, then
Record.wrsampdoes correctly verify thatsamps_per_frameis consistent with the dimensions ofe_d_signal.)Fixes issue #333.