Skip to content

TypeError: Couldn't cast array of type #8

@rayquazaMega

Description

@rayquazaMega

Thank you for this excellent work! I'm attempting to load dataset from CharlieDreemur/OpenManus-RL using the following code:

import datasets
datasets.load_dataset('CharlieDreemur/OpenManus-RL', name=None)

and I got TypeError below:

TypeError: Couldn't cast array of type
struct<role: string, content: string, type: string>
to
{'role': Value(dtype='string', id=None), 'content': Value(dtype='string', id=None), 'loss': Value(dtype='bool', id=None)}

It appears that the keys in the JSON file are not consistent across rows. To address this, I tried modifying the feature structure, but I ran into the same error:

from datasets import Features, Value
features = Features({
         'role': Value('string'),
         'content': Value('string'),
         'type': Value('bool')
     })
datasets.load_dataset('CharlieDreemur/OpenManus-RL', features=features)
datasets.table.CastError: Couldn't cast
id: string
conversations: list<item: struct<role: string, content: string, loss: bool>>
  child 0, item: struct<role: string, content: string, loss: bool>
      child 0, role: string
      child 1, content: string
      child 2, loss: bool
content: string
role: string
type: bool
to
{'role': Value(dtype='string', id=None), 'content': Value(dtype='string', id=None), 'type': Value(dtype='bool', id=None)}
because column names don't match

How can i fix this?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions