Java – the ideal place to store binary data that can be rendered by calling a URL
I am looking for an ideal (performant and maintainable) place to store binary data. In my case, these are images. I have to do some image processing, scale the images, and store them in a suitable location that can be accessed through RESTful services.
So far from my research, I have several choices, such as:
- NoSQL solutions such as MongoDB with GridFS
- Storing files in the file system in a directory hierarchy, then using a web server to serve the images by URL
- The Apache Jackrabbit document repository
- Storing them in a cache such as Memcache or a Squid proxy

Which of these would you choose, and why? Or is there a better approach?
Solution
I recently started using GridFS, and I am doing exactly what you describe.
In my experience so far, the main advantage of GridFS is that it removes the need for a separate file storage system. Our entire persistence layer already lives in Mongo, so the next logical step was to store our file system there as well. The flat namespace just rocks, and it gives you a rich query language to fetch files based on whatever metadata you attach to them. In our application, we use an 'appdata' object embedded with all ownership information to ensure that
Another thing to consider with NoSQL file storage, and GridFS in particular, is that it shards and scales together with your other data. If your entire key-value store already lives in the Mongo server, then when you eventually need to expand the cluster with more machines, your file system grows along with it.
It can feel a little like a "black box", because the binary data itself is split into chunks, which can scare people who are used to a classic directory-based file system. Admin tools such as RockMongo help alleviate this.
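To make the chunking less of a black box, here is a rough sketch in plain Python of the layout GridFS uses: one "files" document describing the file, plus numbered "chunks" documents holding the bytes. This is only an illustration, not the driver's actual code. The function names are made up for this example, the id is a UUID string rather than a real ObjectId, and the 255 KiB chunk size is assumed to match the default in current drivers.

```python
# Illustration only: mimics the shape of GridFS's "files" and "chunks"
# documents. Real drivers do all of this for you behind fs.put()/fs.get().
from uuid import uuid4

CHUNK_SIZE = 255 * 1024  # assumed default chunk size (255 KiB)


def split_into_chunks(data, filename, chunk_size=CHUNK_SIZE):
    """Return (files_doc, chunk_docs) shaped like GridFS's two collections."""
    file_id = uuid4().hex  # stand-in for the ObjectId a real driver generates
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    # Each chunk records which file it belongs to and its sequence number n.
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs


def reassemble(chunk_docs):
    """Join the chunks back together in order of their sequence number."""
    ordered = sorted(chunk_docs, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


# Stand-in for 600 KiB of image data; it ends up spread over several chunks.
image = bytes(600 * 1024)
files_doc, chunks = split_into_chunks(image, "photo.jpg")
restored = reassemble(chunks)
```

Reading a file back is just the reverse: look up the chunks by files_id, sort by n, and concatenate, which is why the driver can hand you the file as an ordinary stream.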
In short, storing images in GridFS is as simple as inserting documents, and the drivers for most major languages handle everything for you. In our environment, we accept image uploads at one endpoint and use PIL to do the resizing. The images are then fetched from Mongo by another endpoint that only outputs the data and sets its MIME type to JPEG.
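As a rough idea of what such an output-only endpoint could look like, here is a minimal WSGI sketch using only the standard library. It is not our actual code. The names make_image_app, load_image, and serve_demo are hypothetical: load_image stands for any callable that takes a filename and returns the JPEG bytes or None, and in a real application it would wrap the GridFS lookup.

```python
# Minimal sketch of an endpoint that only outputs image data as JPEG.
# load_image is a hypothetical callable: filename -> bytes, or None if missing.
from wsgiref.simple_server import make_server


def make_image_app(load_image):
    """Build a WSGI app that serves whatever load_image returns as image/jpeg."""
    def app(environ, start_response):
        # Treat the URL path (minus the leading slash) as the file name.
        filename = environ.get("PATH_INFO", "").lstrip("/")
        data = load_image(filename)
        if data is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        start_response("200 OK", [
            ("Content-Type", "image/jpeg"),
            ("Content-Length", str(len(data))),
        ])
        return [data]
    return app


def serve_demo(port=8000):
    """Serve a fake in-memory store locally; swap in a GridFS-backed loader."""
    fake_store = {"cat.jpg": b"\xff\xd8\xff\xe0 placeholder bytes"}
    with make_server("127.0.0.1", port, make_image_app(fake_store.get)) as server:
        server.serve_forever()
```

Keeping the storage lookup behind a single callable means the endpoint itself does not care whether the bytes come from GridFS, the file system, or a cache.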
Good luck!
Edit:
To illustrate how simple a file upload with GridFS is, here is the simplest approach in PyMongo, the Python library:
    from pymongo import MongoClient
    import gridfs

    binary_data = b'Hello, world!'  # GridFS stores bytes

    # With no arguments, MongoClient connects to localhost:27017.
    db = MongoClient().test_db
    fs = gridfs.GridFS(db)

    # The filename kwarg sets the filename in the Mongo doc, but you can pass
    # anything in and make custom key-values too.
    file_id = fs.put(binary_data, filename='helloworld.txt', anykey="foo")

    output = fs.get(file_id).read()
    print(output.decode())
    # Hello, world!
You can also query on your custom values if you like, which is very useful when your lookups depend on application-specific information:
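    # read_latest is just a wrapper name for this example; fs is the GridFS
    # object created above.
    def read_latest(fs):
        try:
            # Custom fields are passed as keyword arguments.
            file = fs.get_last_version(anykey='foo')
            return file.read()
        except gridfs.errors.NoFile:
            return None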
These are just a few simple examples. The drivers for many other languages (PHP, Ruby, etc.) have equivalents.