Hadoop with Python

Spark can read files residing on the local filesystem, any storage source supported by Hadoop, Amazon S3, and so on. Spark supports text files, SequenceFiles, any other Hadoop InputFormat, directories, compressed files and wildcards.

I should have known Spark can read compressed, gigabyte-sized CSVs.
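As a rough illustration of the "compressed files and wildcards" point, here is a minimal PySpark sketch. It assumes PySpark is installed and runs in local mode. The helper names (write_sample_files, count_lines, main), the file names, and the sample rows are all made up for the example and are not from the book.

```python
import gzip
import os
import tempfile


def write_sample_files(directory):
    """Write two small gzip-compressed CSV files; return a wildcard path matching both."""
    rows = ["id,name", "1,alpha", "2,beta"]  # hypothetical sample data
    for index in range(2):
        path = os.path.join(directory, f"part-{index}.csv.gz")
        with gzip.open(path, "wt", encoding="utf-8") as handle:
            handle.write("\n".join(rows) + "\n")
    return os.path.join(directory, "*.csv.gz")


def count_lines(sc, pattern):
    """Count lines across every file matching the pattern.

    sc is expected to be a pyspark.SparkContext. Its textFile method accepts
    wildcards and decompresses .gz files transparently.
    """
    return sc.textFile(pattern).count()


def main():
    # Imported here so the helpers above can be used without PySpark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[1]", "compressed-read-sketch")
    try:
        with tempfile.TemporaryDirectory() as directory:
            pattern = write_sample_files(directory)
            print(count_lines(sc, pattern))
    finally:
        sc.stop()
```

Calling main() starts a local Spark context and prints the combined line count. One caveat for gigabyte-sized inputs: gzip is not a splittable format, so each .gz file is read as a single partition, and repartitioning after the read may be worthwhile.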