Java secure coding guide: input validation
Introduction
To keep Java programs secure, we should assume that any input from an external user may be malicious, and validate all of it before use.
This article walks through several common scenarios for validating user input.
Normalize strings before validating them
String validation usually involves checking for, or filtering out, certain special characters.
Java strings are Unicode, and in Unicode the same character can have more than one representation. We therefore need to normalize a string before inspecting it.
Java provides the java.text.Normalizer class for this purpose.
Take the following example:
public void testNormalizer(){
System.out.println(Normalizer.normalize("\u00C1",Normalizer.Form.NFKC));
System.out.println(Normalizer.normalize("\u0041\u0301",Normalizer.Form.NFKC));
}
Output results:
Á
Á
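Although the two inputs use different Unicode code point sequences (a single precomposed character versus the letter A followed by a combining acute accent), both normalize to the same character. This is why a string must be normalized before it is validated.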
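To make the point concrete, here is a minimal sketch that compares the two representations before and after normalization. The class name NormalizeEqualityDemo and the helper equalAfterNfkc are hypothetical names chosen for illustration; only java.text.Normalizer comes from the JDK.

```java
import java.text.Normalizer;

class NormalizeEqualityDemo {
    // Hypothetical helper: true when both strings are identical after NFKC normalization.
    static boolean equalAfterNfkc(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFKC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFKC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        String precomposed = "\u00C1";      // one code point: A with acute
        String decomposed = "\u0041\u0301"; // letter A followed by a combining acute accent

        // The raw strings differ even though they render the same.
        System.out.println(precomposed.equals(decomposed));          // false
        // After normalization they compare equal.
        System.out.println(equalAfterNfkc(precomposed, decomposed)); // true
    }
}
```

A validation routine that compares or pattern-matches raw strings would treat these two inputs differently, which is exactly the gap an attacker can exploit.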
Consider the following example:
public void falseNormalize(){
String s = "\uFE64" + "script" + "\uFE65";
Pattern pattern = Pattern.compile("[<>]"); // check for angle brackets
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
throw new IllegalStateException();
}
s = Normalizer.normalize(s,Normalizer.Form.NFKC);
}
Here \uFE64 is the small less-than sign and \uFE65 is the small greater-than sign; NFKC normalization turns them into < and >. The program intends to reject input that contains angle brackets, but because the check runs before normalization, the pattern [<>] does not match these variant characters. The angle brackets only appear after the check has already passed.
The fix is to normalize first:
public void trueNormalize(){
String s = "\uFE64" + "script" + "\uFE65";
s = Normalizer.normalize(s,Normalizer.Form.NFKC);
Pattern pattern = Pattern.compile("[<>]"); // check for angle brackets
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
throw new IllegalStateException();
}
}
Because the string is normalized before the check, the angle brackets are now detected and the exception is thrown.
Be careful when formatting untrusted strings
We often build output with format strings. If user input ends up inside the format string itself, it can be abused.
Take the following example:
public void wrongFormat(){
Calendar c = new GregorianCalendar(2020,GregorianCalendar.JULY,27);
String input="%1$tm"; // simulated user input
// BAD: user input is concatenated into the format string
System.out.format(input + " does not match; it should be day %1$te of some month",c);
}
At first glance this looks harmless, but the input itself contains a format specifier, so the output is:
07 does not match; it should be day 27 of some month
This indirectly leaks internal information: the user never supplied the month, yet the output reveals it. In some cases this can expose the system's internal logic.
Instead, the input should be passed as an argument to the format call, not embedded in the format string:
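public void rightFormat(){
Calendar c = new GregorianCalendar(2020,GregorianCalendar.JULY,27);
String input="%1$tm"; // simulated user input
// GOOD: user input is an argument to %s, so it is printed literally
System.out.format("%s does not match; it should be day %te of some month",input,c);
}
Output results:
%1$tm does not match; it should be day 27 of some month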
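The same idea can be wrapped in a small helper so that untrusted text never reaches the format string. This is only a sketch: the class SafeFormatDemo, the method describeMismatch, and the message wording are assumptions made for illustration, and Locale.ROOT is used here simply to keep the output independent of the machine's locale.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;

class SafeFormatDemo {
    // Hypothetical helper: the format string is a constant, and the
    // untrusted text is only ever supplied as the argument for %s.
    static String describeMismatch(String untrusted, Calendar expected) {
        return String.format(Locale.ROOT,
                "%s does not match; expected day %te of the month",
                untrusted, expected);
    }

    public static void main(String[] args) {
        Calendar c = new GregorianCalendar(2020, Calendar.JULY, 27);
        // The format specifier inside the input is printed literally, not interpreted.
        System.out.println(describeMismatch("%1$tm", c));
    }
}
```

Keeping the format string constant means that no value supplied by a user can add, remove, or reorder format specifiers.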
Use Runtime.exec() with caution
Runtime.exec() runs operating system commands. If a malicious user manages to run "rm -rf /", the consequences are disastrous.
So whenever we call Runtime.exec(), we must validate any user input that goes into the command.
Take the following example:
public void wrongExec() throws IOException {
String dir = System.getProperty("dir");
Runtime rt = Runtime.getRuntime();
Process proc = rt.exec(new String[] {"sh","-c","ls " + dir});
}
In the above example, we read dir from the system properties and then run the system's ls command to list the contents of dir.
If a malicious user sets dir to:
/usr & rm -rf /
Then the commands actually executed by the system are:
sh -c 'ls /usr & rm -rf /'
The shell starts ls /usr in the background and then runs rm -rf /, so files are maliciously deleted.
There are several ways to solve this problem. The first is to validate the input, for example by running the command only when dir consists solely of allowed characters:
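public void correctExec1() throws IOException {
String dir = System.getProperty("dir");
// Reject a missing value or anything outside the allowed character set
if (dir == null || !Pattern.matches("[0-9A-Za-z@.]+",dir)) {
throw new IllegalArgumentException("Invalid directory name");
}
Runtime rt = Runtime.getRuntime();
Process proc = rt.exec(new String[] {"sh","-c","ls " + dir});
}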
The second method is to use a switch statement so that only specific, known values are accepted:
public void correctExec2(){
String dir = System.getProperty("dir");
switch (dir){
case "/usr":
System.out.println("/usr");
break;
case "/local":
System.out.println("/local");
break;
default:
break;
}
}
A third option is to avoid Runtime.exec() altogether and use the equivalent functionality that Java itself provides, such as File.list() for listing a directory.
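As a rough sketch of that last approach, the directory listing can be done with java.io.File instead of a shell. The class ListDirDemo and the method listNames are hypothetical names used for illustration; File.list() is the standard JDK call and returns null when the path is not a readable directory.

```java
import java.io.File;
import java.util.Arrays;

class ListDirDemo {
    // Hypothetical helper: lists entry names without invoking a shell, so
    // characters such as & or ; in dir have no special meaning.
    static String[] listNames(String dir) {
        String[] names = new File(dir).list();
        if (names == null) {
            // File.list() returns null if dir is not a directory or cannot be read.
            throw new IllegalArgumentException("Not a readable directory: " + dir);
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        for (String name : listNames(System.getProperty("user.dir"))) {
            System.out.println(name);
        }
    }
}
```

With this approach a value like "/usr & rm -rf /" is treated as a single (nonexistent) path name and rejected, since no command interpreter is involved.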
Regular expression matching
When a regular expression is built from user-supplied input, that input must also be validated.
Consider the following regular expression:
(.*? +public\[\d+\] +.*<SEARCHTEXT>.*)
The expression above is meant to search log entries such as public[1234] for the user's search text, which takes the place of <SEARCHTEXT>.
However, the user could enter the following:
.*)|(.*
The regular expression then becomes:
(.*? +public\[\d+\] +.*.*)|(.*.*)
The second alternative matches any line, so the search returns every log entry.
There are two solutions. One is to validate the user's input against a whitelist. The other is to use Pattern.quote() to escape the special characters in the input.
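The Pattern.quote() approach might look like the following sketch. The class LogSearchDemo, its method names, and the sample log lines are assumptions made for illustration; the surrounding expression is the one shown above.

```java
import java.util.regex.Pattern;

class LogSearchDemo {
    // Hypothetical helper: builds the log search pattern, treating the
    // user's search text as a literal string rather than as regex syntax.
    static Pattern buildPattern(String searchText) {
        return Pattern.compile(
                "(.*? +public\\[\\d+\\] +.*" + Pattern.quote(searchText) + ".*)");
    }

    static boolean matches(String logLine, String searchText) {
        return buildPattern(searchText).matcher(logLine).matches();
    }

    public static void main(String[] args) {
        String publicLine = "10:47:03 public[1234] user logged in";
        String privateLine = "10:47:04 private[423] secret data";

        System.out.println(matches(publicLine, "logged"));    // true
        // The injection attempt is now just a literal that the line does not contain.
        System.out.println(matches(privateLine, ".*)|(.*"));  // false
    }
}
```

Quoting keeps the structure of the expression fixed, so user input can only narrow the search and cannot add new alternatives to it.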
Code for this article:
learn-java-base-9-to-20/tree/master/security