Reading a stream from S3 using Clojure / Java

I have a large file on S3. I would like to decode and parse it as it downloads. I happen to be using a Clojure Amazon library, but any library will do.

I can easily get a stream:

(def stream (-> (get-object "some-s3-bucket" "some-object-key") :input-stream))

; returns: #<S3ObjectInputStream com.amazonaws.services.s3.model.S3ObjectInputStream

But how do I read from the stream? Can I read it one line at a time (the decoded content is lines of JSON)?

(In case my question is ambiguous: I only care about reading the stream, not about the gzip decoding.)

Solution

Because S3ObjectInputStream simply extends java.io.InputStream, you can:

1. Use Clojure's reader function to get a BufferedReader.
2. Read data from that reader in any way Clojure allows:
   - Get a lazy sequence of lines from the BufferedReader using line-seq, if that makes sense for your JSON. It may not.
   - Use a lazy JSON parser, such as clj-lazy-json. This particular parser can even handle the raw stream, so you can safely skip step (1).
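
The reader and line-seq approach can be sketched roughly as follows. This is only an illustration, not a definitive implementation: sample-stream and process-lines are hypothetical names introduced here, and a ByteArrayInputStream stands in for the S3ObjectInputStream so the sketch runs without AWS access. It assumes the content is UTF-8 text with one JSON document per line.

```clojure
(require '[clojure.java.io :as io])
(import '(java.io ByteArrayInputStream))

;; Hypothetical stand-in for the S3 stream, so the sketch runs without AWS.
(defn sample-stream []
  (ByteArrayInputStream.
    (.getBytes "{\"id\": 1}\n{\"id\": 2}\n" "UTF-8")))

;; Hypothetical helper: calls f on each line of input-stream in order.
;; io/reader wraps the InputStream in a BufferedReader, and with-open
;; closes the reader (and the underlying stream) when the body exits.
;; doseq consumes the lazy line-seq without retaining earlier lines,
;; so a large file is never held in memory all at once.
(defn process-lines [input-stream f]
  (with-open [rdr (io/reader input-stream :encoding "UTF-8")]
    (doseq [line (line-seq rdr)]
      (f line))))

;; Usage: print each line; a real caller would parse each line as JSON.
(process-lines (sample-stream) println)
```

With the stream from the question, the call would presumably be (process-lines stream some-fn). The lines are consumed inside with-open on purpose: line-seq is lazy, so returning the sequence unrealized would leave it reading from a reader that has already been closed.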
