by jasonpbecker on 1/22/23, 1:47 PM with 9 comments
by jasonpbecker on 1/22/23, 3:34 PM
```
df = Explorer.DataFrame.from_csv(filename = "my_file.txt", delimiter: "|", infer_schema_length: nil)
{:error, {:polars, "Could not parse `OTHER` as dtype Int64 at column 3.\nThe current offset in the file is 4447442 bytes.\n\nConsider specifying the correct dtype, increasing\nthe number of records used to infer the schema,\nrunning the parser with `ignore_parser_errors=true`\nor adding `OTHER` to the `null_values` list."}}
```
Note: I added `infer_schema_length: nil` on the assumption that `polars` was simply worse at discovering data types from a sample, since this setting should make it read the whole file before determining types, but it still failed.
by af3d on 1/22/23, 2:43 PM
by nuc1e0n on 1/23/23, 6:36 AM
The thing that would prevent such issues is validation of the data you accept.
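As a rough illustration of that kind of validation, here is a minimal sketch in plain Elixir (standard library only) that scans a delimited file before handing it to a parser and reports rows where a column expected to hold integers holds something else. The module name `PipeFileCheck`, the function `bad_integer_cells/3`, and the 0-based column index are all hypothetical, not part of Explorer or polars. It also assumes a header row and a naive split on the delimiter, so quoted fields containing `|` would not be handled.

```elixir
defmodule PipeFileCheck do
  @moduledoc "Hypothetical helper: finds cells that are not integers in one column."

  # Returns {line_number, raw_value} for every data row where the cell at
  # `col_index` (0-based) is non-empty and is not a whole integer.
  # `lines` can be any enumerable of strings, e.g. File.stream!/1.
  def bad_integer_cells(lines, col_index, delimiter \\ "|") do
    lines
    |> Stream.with_index(1)
    # Assumption: the first line is a header, so skip it.
    |> Stream.drop(1)
    |> Enum.flat_map(fn {line, line_no} ->
      value =
        line
        |> String.trim_trailing()
        # Naive split: does not understand quoted fields.
        |> String.split(delimiter)
        |> Enum.at(col_index, "")

      if value == "" or integer?(value), do: [], else: [{line_no, value}]
    end)
  end

  # True only when the whole string parses as an integer.
  defp integer?(value) do
    case Integer.parse(value) do
      {_int, ""} -> true
      _ -> false
    end
  end
end
```

Something like `File.stream!("my_file.txt") |> PipeFileCheck.bad_integer_cells(2)` would then list the offending rows up front. Whether the "column 3" in the error message above is 0-based or 1-based is not clear from the thread, so the index here is a guess.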
by jasonpbecker on 1/22/23, 1:47 PM