I need to scan very large JSONL files efficiently and am considering a parallel grep-style approach over line-delimited text.

Would love to hear how you would design it.

  • anton@lemmy.blahaj.zone
    18 days ago

    Don’t dedicate a thread to reading the file line by line just to get it into memory. There is already a piece of software optimized for this kind of task: the OS.
    Just mmap your file and start processing.
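
To make the mmap suggestion concrete, here is a rough sketch in Python, not a tuned implementation. The function name `grep_mmap` and the byte-string `needle` argument are made up for illustration, and it assumes the needle contains no newline. It maps the file read-only and lets the OS page data in on demand while searching the mapped bytes directly.

```python
import mmap
import os


def grep_mmap(path, needle):
    """Yield (byte_offset, line_bytes) for each line that contains needle.

    Hypothetical helper: assumes needle is a non-empty bytes object
    that does not contain a newline.
    """
    if os.path.getsize(path) == 0:
        return  # mmap refuses to map an empty file
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(needle)
            while pos != -1:
                # rfind returns -1 when no newline precedes the match,
                # so adding 1 gives offset 0 for the first line.
                start = mm.rfind(b"\n", 0, pos) + 1
                end = mm.find(b"\n", pos)
                if end == -1:
                    end = len(mm)  # last line has no trailing newline
                yield start, mm[start:end]
                # Resume after this line so each line is reported once.
                pos = mm.find(needle, end)
```

Only the matching lines are copied out of the mapping, and only those would need to go through `json.loads`. For the parallel version, one option would be to give each worker its own byte range of the same file and snap the range boundaries to the nearest newline, but that part is not shown here.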