[Tracking] Tracking issue for the new memory efficient dataset.

Starting from `2.0a11`, PyHealth uses a disk-based, memory-efficient dataset to reduce memory usage for large datasets such as MIMIC4.

This issue tracks any potential bugs or improvements required for the new memory-efficient dataset.

## Improvements
- [ ] https://github.com/sunlabuiuc/PyHealth/issues/763
- [x] https://github.com/sunlabuiuc/PyHealth/issues/774
https://github.com/sunlabuiuc/PyHealth/pull/783
- [x] https://github.com/sunlabuiuc/PyHealth/issues/764
https://github.com/sunlabuiuc/PyHealth/pull/766
- [x] Batched processing to speed up task transformation.
https://github.com/sunlabuiuc/PyHealth/pull/750
- [x] Support multiple workers for task transformation.
https://github.com/sunlabuiuc/PyHealth/pull/748
- [x] Support configuring `n_worker` for Dask.
https://github.com/sunlabuiuc/PyHealth/pull/743
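
For readers unfamiliar with the batching and multi-worker items above, here is a minimal, generic sketch of the idea. It is not PyHealth's implementation: `batched`, `transform_in_batches`, `task_fn`, `batch_size`, and `num_workers` are hypothetical names chosen for illustration, and a thread pool is used only to keep the example self-contained (CPU-bound work would more likely use processes).

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def transform_in_batches(
    items: Iterable[T],
    task_fn: Callable[[T], R],  # hypothetical per-item task function
    batch_size: int = 1000,
    num_workers: int = 4,
) -> List[R]:
    """Apply task_fn to every item, one batch per worker call."""

    def run_batch(batch: List[T]) -> List[R]:
        # Handling a whole batch per call amortizes scheduling overhead.
        return [task_fn(x) for x in batch]

    results: List[R] = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map returns batch results in submission order.
        for out in pool.map(run_batch, batched(items, batch_size)):
            results.extend(out)
    return results
```

Larger batches reduce per-call overhead but increase the memory held by each worker, so the batch size is a trade-off rather than a fixed value.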

## Bugs
- [ ] https://github.com/sunlabuiuc/PyHealth/issues/833
- [ ] https://github.com/sunlabuiuc/PyHealth/issues/765
https://github.com/sunlabuiuc/PyHealth/pull/770
- [x] https://github.com/sunlabuiuc/PyHealth/issues/782
https://github.com/sunlabuiuc/PyHealth/pull/784
- [x] Temporary folder for the dataset is not properly cleaned up after dataset processing.
https://github.com/sunlabuiuc/PyHealth/pull/753
- [x] Cached data is not cleaned up if the program crashes mid-processing, which may leave a corrupted cache file.
https://github.com/sunlabuiuc/PyHealth/pull/753
- [x] Incorrect null handling for `patient_id` and timestamp.
https://github.com/sunlabuiuc/PyHealth/pull/746
- [x] https://github.com/sunlabuiuc/PyHealth/issues/742
https://github.com/sunlabuiuc/PyHealth/pull/744
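
The two cache-cleanup bugs above are instances of a common problem: a crash midway through writing leaves a partial cache that later runs mistake for a complete one. One general way to avoid this, sketched below using only the standard library, is to write into a temporary directory and rename it into place only on success. This illustrates the pattern and does not describe the fix in the linked PR; `build_cache` and `write_fn` are hypothetical names.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def build_cache(cache_dir: Path, write_fn: Callable[[Path], None]) -> Path:
    """Populate cache_dir so that it is either complete or absent."""
    cache_dir = Path(cache_dir)
    if cache_dir.exists():
        return cache_dir  # a finished cache is already in place
    cache_dir.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp dir beside the target so the final rename
    # stays on one filesystem.
    tmp_dir = Path(
        tempfile.mkdtemp(prefix=cache_dir.name + ".tmp-", dir=cache_dir.parent)
    )
    try:
        write_fn(tmp_dir)  # hypothetical callback that writes the cache files
        os.rename(tmp_dir, cache_dir)  # publish only after a full write
    except BaseException:
        # On any failure, remove the partial output and re-raise.
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise
    return cache_dir
```

A hard kill (for example `SIGKILL`) can still leave a `.tmp-` directory behind, but it never appears under the final cache path, so a later run would not read it as valid.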
