What is the best way to detect duplicate file uploads in a Java environment?
As part of a Java-based web application, I will accept uploaded .xls and .csv (and possibly other types of) files. Each file will be uniquely renamed using a combination of parameters and a timestamp.
I want to be able to identify any duplicate files. By duplicate I mean the same file contents, regardless of its name. Ideally, I want to detect duplicates as soon as possible after upload, so that the server can include this information in its response. (If the processing time scales with the file size, it should not cause too much delay.)
I've read about running MD5 on files and storing the result as a unique key, but I suspect there may be a better way. Is there?
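To illustrate the digest-as-key idea, here is a minimal sketch. The class name `DuplicateCheck`, the method names, and the in-memory map are my own assumptions for illustration; a real application would more likely store the digest in a database column with a unique index.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HashMap;
import java.util.Map;

public class DuplicateCheck {
    // digest (hex) -> stored file name; stands in for a DB table with a unique index
    private final Map<String, String> seen = new HashMap<>();

    // Computes the MD5 digest of the given bytes as a lowercase hex string.
    static String md5Hex(byte[] data) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Returns the name of a previously stored duplicate, or null if the file is new.
    String checkAndRecord(String storedName, byte[] fileBytes) throws Exception {
        return seen.putIfAbsent(md5Hex(fileBytes), storedName);
    }

    public static void main(String[] args) throws Exception {
        DuplicateCheck dc = new DuplicateCheck();
        byte[] csv = "col1,col2\n1,2\n".getBytes(StandardCharsets.UTF_8);
        // First upload is new; second has the same bytes under a different name.
        System.out.println(dc.checkAndRecord("upload-001.csv", csv)); // null
        System.out.println(dc.checkAndRecord("upload-002.csv", csv)); // upload-001.csv
    }
}
```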
Any suggestions on how best to approach this are appreciated.

Thank you.
Update: I have nothing against using MD5; I have used it several times in the past with Perl (Digest::MD5). I thought there might be a different (better) solution in the Java world, but it seems I was wrong.
Thank you for your answers and comments. I now feel good about using MD5.
Solution
When processing the uploaded file, wrap the target OutputStream in a DigestOutputStream so the file's digest is computed as it is written. Store the final digest somewhere, together with a unique identifier for the file (perhaps in hexadecimal form, as part of the file name?).
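The approach above can be sketched as follows. This uses the JDK's `java.security.DigestOutputStream` as described; the method name and the in-memory streams in `main` are my own choices for a self-contained example (a real upload handler would write to a `FileOutputStream`).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class UploadDigest {

    // Copies the upload to `out` while computing an MD5 digest on the fly,
    // then returns the digest as a lowercase hex string.
    static String saveAndDigest(InputStream upload, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (DigestOutputStream dos = new DigestOutputStream(out, md5)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = upload.read(buffer)) != -1) {
                dos.write(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Two "uploads" with identical content but different target names
        byte[] content = "same bytes, different file name".getBytes("UTF-8");
        String d1 = saveAndDigest(new ByteArrayInputStream(content), new ByteArrayOutputStream());
        String d2 = saveAndDigest(new ByteArrayInputStream(content), new ByteArrayOutputStream());
        System.out.println(d1.equals(d2)); // duplicates share the same digest
    }
}
```

Since the digest is computed while the bytes are being written anyway, detecting a duplicate adds essentially no extra I/O beyond the upload itself.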