Four coding skills that Java programmers must know
If you were asked to optimize your Java code now, what would you do? In this article, the author introduces four methods to improve system performance and code readability. If you are interested in this, let's take a look.
Our everyday programming work usually means applying the same set of techniques to different projects, and in most cases those techniques get the job done. Some projects, however, call for special techniques, and engineers have to dig deeper to find the simplest and most effective approach. In a previous article, we discussed four special techniques that can be used when necessary to create better Java software. In this article, we introduce some general design strategies and implementation techniques that help solve common problems, namely:
Only purposeful optimization
Use enumerations whenever possible for constants
Override the equals() method in the class
Use polymorphism as much as possible
It is worth noting that the techniques described in this article do not apply in every case. Developers still need to judge when and where each technique should be used.
1. Only purposeful optimization
Large software systems must pay close attention to performance. Although we want to write the most efficient code possible, we often do not know where to start when we set out to optimize. For example, will the following code affect performance?
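A minimal sketch of the kind of code in question, assuming a hypothetical `processIntegers` method (the class and method names are illustrative, not from the article) that walks the whole list once for every element:

```java
import java.util.List;

class IntegerProcessor {
    // Hypothetical example: for each element, iterate over the whole list
    // again, so the inner statement runs n * n times for a list of size n.
    static long processIntegers(List<Integer> integers) {
        long total = 0;
        for (Integer value : integers) {
            for (int i = integers.size() - 1; i >= 0; i--) {
                total += value + integers.get(i);
            }
        }
        return total;
    }
}
```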
It depends. The code above is an O(n²) algorithm (in Big-O notation), where n is the size of the list. If n is only 5, there is no problem: only 25 iterations are performed. If n is 100,000, performance may suffer. Even then, we cannot conclude that there will be a problem. Although the method then requires 10 billion logical iterations, whether that actually affects performance remains an open question.
For example, if the client executes this code in its own thread and waits asynchronously for the calculation to complete, the execution time may be acceptable. Similarly, if the system is deployed to production but no client ever calls this code, there is no need to optimize it, because it consumes none of the system's overall performance. In fact, optimizing it would make the system more complex without improving its performance at all.
Most importantly, there is no free lunch. To reduce execution cost, we usually optimize with techniques such as caching, loop unrolling, or precomputed values, all of which increase the complexity of the system and reduce the readability of the code. If an optimization really improves performance, the added complexity may be worth it, but before deciding you must know two things:
What are the performance requirements
Where are the performance bottlenecks
First, we need to know exactly what the performance requirements are. If the system meets them and end users have no complaints, there is no need to optimize. However, when new features are added or the data volume reaches a certain scale, optimization may become necessary, otherwise problems may occur.
In that case, we should not rely on intuition or code inspection, because even experienced developers such as Martin Fowler are prone to misguided optimizations, as he explains in Refactoring (page 70):
If you analyze enough programs, you find that the interesting thing about performance is that most of the time is spent in a small fraction of the code. If you optimize all the code equally, 90% of the optimizations are wasted, because the optimized code does not run often. Time spent optimizing without a target is wasted time.
As experienced developers, we should take this view seriously. Guessing not only fails to improve the performance of the system, it can also waste 90% of the development time. Instead, we should run common use cases in the production (or pre-production) environment and profile the system to find out which parts consume resources during execution. For example, if only 10% of the code consumes most of the resources, optimizing the remaining 90% is a waste of time.
To put the profiling results to use, we should start with the most common cases, since this ensures that the effort actually improves the performance of the system. After each optimization, the profiling step should be repeated. This both confirms that performance really improved and shows where the bottleneck lies now (after one bottleneck is removed, others may account for a larger share of the system's resources). Note that the percentage of time spent in the remaining bottlenecks is likely to increase: they are unchanged, while the total execution time drops as the targeted bottleneck is eliminated.
Although thoroughly profiling a Java system takes considerable effort, there are common tools that help find performance hotspots, including JMeter, AppDynamics, and YourKit. DZone's performance monitoring guide has more information on optimizing Java program performance.
Although performance matters a great deal in many large software systems, and performance tests are often part of the automated test suite in the delivery pipeline, optimization should not be done blindly. Instead, target the bottlenecks you have actually identified. This avoids adding needless complexity to the system and avoids wasting time on optimizations that do not pay off.
2. Use enumerations whenever possible for constants
There are many scenarios where we need to list a set of predefined or constant values, such as the HTTP response codes a web application may encounter. One of the most common techniques is to create a class containing a number of static final values, each with a comment describing its meaning:
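A minimal sketch of such a constants class, assuming a hypothetical `HttpResponseCodes` holder with three commonly seen status codes:

```java
class HttpResponseCodes {
    /** The request succeeded. */
    static final int OK = 200;

    /** The server understood the request but refuses to authorize it. */
    static final int FORBIDDEN = 403;

    /** The requested resource could not be found. */
    static final int NOT_FOUND = 404;

    private HttpResponseCodes() {
        // Constants holder; not meant to be instantiated.
    }
}
```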
This approach works, but it has some disadvantages:
Integer values passed in are not strictly validated
Methods cannot be called on a status code, because it is a primitive type
In the first case, each constant merely names a particular integer value. Nothing restricts the methods or variables that use it, so a value outside the defined range can be passed. For example:
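A short sketch of the problem, assuming a hypothetical `HttpResponseHandler` whose method takes the status code as a plain `int`:

```java
class HttpResponseHandler {
    // Accepts any int, so nothing stops a caller from passing a value
    // that is not a real HTTP status code.
    static String describe(int statusCode) {
        return "Received status of " + statusCode;
    }

    public static void main(String[] args) {
        System.out.println(describe(200));
        System.out.println(describe(15000)); // compiles and runs without complaint
    }
}
```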
Although 15000 is not a valid HTTP response code, nothing on the server side requires the client to supply a valid one. In the second case, there is no way to define methods on a status code. For example, to check whether a given status code indicates success, you must define a separate function:
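Such a helper might look like the sketch below. The class name `HttpStatusChecks` is hypothetical, and success is taken to mean the 2xx range:

```java
class HttpStatusChecks {
    // The check lives apart from the value it describes, because an int
    // cannot carry methods of its own.
    static boolean isSuccess(int statusCode) {
        return statusCode >= 200 && statusCode < 300;
    }
}
```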
To solve these problems, we change the type of the constant from a primitive type to a custom type, and allow only specific instances of that type. This is exactly what Java enums are for. With an enum, we solve both problems at once:
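A minimal sketch of the enum version, assuming a hypothetical `HttpResponseCode` enum that carries the numeric code and an `isSuccess` method:

```java
enum HttpResponseCode {
    OK(200),
    FORBIDDEN(403),
    NOT_FOUND(404);

    private final int code;

    HttpResponseCode(int code) {
        this.code = code;
    }

    int getCode() {
        return code;
    }

    // The check now belongs to the value itself.
    boolean isSuccess() {
        return code >= 200 && code < 300;
    }
}
```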
Similarly, a method can now require that callers provide a valid status code:
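As a sketch, assuming a hypothetical `StatusCode` enum and `StatusPrinter` class, the parameter type alone is enough to rule out arbitrary integers:

```java
enum StatusCode {
    OK(200),
    NOT_FOUND(404);

    private final int code;

    StatusCode(int code) {
        this.code = code;
    }

    int getCode() {
        return code;
    }
}

class StatusPrinter {
    // Only the enum's predefined values are accepted here;
    // a call such as describe(15000) would not compile.
    static String describe(StatusCode status) {
        return "Received status of " + status.getCode();
    }
}
```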
Note that this example shows that enumerations should be preferred for constants where possible, not that they should be used in every case. Sometimes you want a constant to name a special value while other values remain valid. Take the mathematical constant pi: we can capture its value in a constant (and reuse it):
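A small sketch, assuming a hypothetical `CircleMath` class with a truncated value of pi (in real code, `java.lang.Math.PI` already provides this constant):

```java
class CircleMath {
    // A named constant for one special value; any other double is still
    // a perfectly valid input, so an enum would not fit here.
    static final double PI = 3.14159;

    static double circumference(double radius) {
        return 2 * PI * radius;
    }
}
```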
Therefore, the rules for using enumeration can be summarized as follows:
When all possible discrete values are known in advance, enumeration can be used
Taking the HTTP response codes above as an example, all the status code values are known (they are listed in RFC 7231, which defines HTTP/1.1), so an enumeration is used. In the case of pi, we do not know all the possible values in advance (any double is valid), but we still want a constant that makes circle calculations easier to write and read, so a plain constant is defined.
If you cannot know all the possible values in advance but want each value to carry fields or methods, the simplest approach is to create a new class to represent the data. Enumerations are never mandatory in any scenario. The key to knowing when not to use one is to ask whether all values are known in advance and whether any other value should be prohibited.
3. Override the equals() method in the class
Object identity can be a hard problem: are two objects the same if they occupy the same location in memory? If their IDs are the same? Or if all their fields are equal? Although each class has its own identity logic, many places in a system need to decide whether two objects are equal. For example, suppose there is a class that represents a purchase order:
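A minimal sketch of such a class, assuming the purchase is identified by a numeric `id` (the field type and constructor are illustrative):

```java
class Purchase {
    private final long id;

    Purchase(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }
}
```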
Code like the following is then likely to appear in many places:
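As a sketch, with a hypothetical `PurchaseComparison` class standing in for any caller in the system:

```java
class Purchase {
    private final long id;

    Purchase(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }
}

class PurchaseComparison {
    static String describe(Purchase originalPurchase, Purchase updatedPurchase) {
        // The identity rule (compare IDs) is spelled out at the call site;
        // every other comparison in the system has to repeat it.
        if (originalPurchase.getId() == updatedPurchase.getId()) {
            return "same purchase";
        }
        return "different purchases";
    }
}
```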
The more often this logic is repeated (which, incidentally, violates the DRY principle), the more widely knowledge of the Purchase class's identity spreads. If the identity logic of Purchase changes for some reason (for example, the type of the identifier changes), it must be updated in many places.
We should keep this logic inside the class rather than spreading the identity logic of Purchase throughout the system. At first glance, we could create a new method such as isSame, which takes a Purchase object and compares the IDs of the two objects:
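A minimal sketch of that first attempt, assuming the same `id`-based `Purchase` class (the null check is an added assumption):

```java
class Purchase {
    private final long id;

    Purchase(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }

    // The identity rule now lives in one place, inside the class.
    boolean isSame(Purchase other) {
        return other != null && getId() == other.getId();
    }
}
```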
Although this is a valid solution, it ignores functionality built into Java: the equals method. Every class in Java implicitly inherits from Object, and so inherits its equals method. By default, this method checks object identity (whether the two references point to the same object in memory), as shown in the following snippet from the Object class definition in the JDK (version 1.8.0_131):
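The default implementation in `java.lang.Object` amounts to `return (this == obj);`. The sketch below demonstrates that behavior with a hypothetical `Token` class that does not override equals:

```java
class Token {
    private final long id;

    Token(long id) {
        this.id = id;
    }
}

class DefaultEqualsDemo {
    public static void main(String[] args) {
        Token a = new Token(1L);
        Token b = new Token(1L);
        // Inherited Object.equals compares references: return (this == obj);
        System.out.println(a.equals(a)); // prints "true"
        System.out.println(a.equals(b)); // prints "false"
    }
}
```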
The equals method is the natural place to put identity logic (by overriding the default equals implementation):
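A minimal sketch of such an override, assuming the `id`-based `Purchase` class. The `hashCode` override is included so the two methods stay consistent:

```java
class Purchase {
    private final long id;

    Purchase(long id) {
        this.id = id;
    }

    long getId() {
        return id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true; // same object in memory
        }
        if (!(other instanceof Purchase)) {
            return false; // null or a different type
        }
        return ((Purchase) other).getId() == getId(); // compare IDs
    }

    @Override
    public int hashCode() {
        // Equal purchases must produce equal hash codes.
        return Long.hashCode(id);
    }
}
```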
Although this equals method looks complex, equals accepts a single parameter of type Object, so we only need to consider three cases:
The other object is the current object (i.e. originalPurchase.equals(originalPurchase)). By definition they are the same object, so the method returns true
The other object is not a Purchase object. In this case the purchase IDs cannot be compared, so the two objects are not equal
The other object is a different object but is an instance of Purchase. Whether the two are equal then depends on whether the ID of the current Purchase equals the ID of the other Purchase
Now we can refactor our earlier condition as follows:
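A sketch of the refactored call site, assuming a `Purchase` class that overrides equals by ID and a hypothetical `PurchaseChecker` caller:

```java
class Purchase {
    private final long id;

    Purchase(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Purchase)) {
            return false;
        }
        return ((Purchase) other).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class PurchaseChecker {
    static String describe(Purchase originalPurchase, Purchase updatedPurchase) {
        // The call site no longer knows how identity is decided.
        if (originalPurchase.equals(updatedPurchase)) {
            return "same purchase";
        }
        return "different purchases";
    }
}
```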
Besides reducing duplication in the system, overriding the default equals method has other advantages. For example, if we build a list of Purchase objects and check whether it contains another Purchase object with the same ID (a different object in memory), we get true, because the two objects are considered equal:
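A short sketch of this effect, assuming the same equals override and a hypothetical `ContainsDemo` class:

```java
import java.util.ArrayList;
import java.util.List;

class Purchase {
    private final long id;

    Purchase(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Purchase)) {
            return false;
        }
        return ((Purchase) other).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class ContainsDemo {
    static boolean containsEquivalent() {
        List<Purchase> purchases = new ArrayList<>();
        purchases.add(new Purchase(1L));
        // A different object in memory, but with the same ID.
        return purchases.contains(new Purchase(1L));
    }

    public static void main(String[] args) {
        System.out.println(containsEquivalent()); // prints "true"
    }
}
```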
In general, wherever you need to decide whether two objects are equal, just use the overridden equals method. If you want the identity comparison that equals performs by default when inherited from Object, you can use the == operator instead, as follows:
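A sketch contrasting the two comparisons, again assuming an ID-based equals override and a hypothetical `IdentityDemo` class:

```java
class Purchase {
    private final long id;

    Purchase(long id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Purchase)) {
            return false;
        }
        return ((Purchase) other).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

class IdentityDemo {
    public static void main(String[] args) {
        Purchase first = new Purchase(1L);
        Purchase second = new Purchase(1L);
        System.out.println(first.equals(second)); // prints "true": same ID
        System.out.println(first == second);      // prints "false": different objects
        System.out.println(first == first);       // prints "true": same object
    }
}
```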
Note also that whenever equals is overridden, hashCode should be overridden as well. For more on the relationship between the two methods and how to define hashCode correctly, see this thread.
As we can see, overriding the equals method not only keeps the identity logic inside the class and reduces its spread through the system, it also lets the Java language make informed decisions about the class.
4. Use polymorphism as much as possible
Conditional statements are a common construct in any programming language, and for good reason: combining them lets us change the behavior of the system according to a given value or the current state of an object. Suppose we need to perform calculations for different kinds of bank accounts. The following code could be written:
Although the code above meets the basic requirements, it has an obvious flaw: the behavior of the system is decided solely by the type of the account. Not only must the account type be checked before every decision, the same logic must be repeated each time a decision is made. In the design above, for example, the type is checked in both methods. This can get out of hand, especially when a request arrives to add a new account type.
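A minimal sketch of such a conditional design. The account types, the second method `supportsDeposits`, and all rates are assumptions chosen for illustration:

```java
enum BankAccountType {
    CHECKING,
    SAVINGS,
    CERTIFICATE_OF_DEPOSIT
}

class BankAccount {
    private final BankAccountType type;

    BankAccount(BankAccountType type) {
        this.type = type;
    }

    // Every method has to switch on the account type.
    double getInterestRate() {
        switch (type) {
            case CHECKING: return 0.03;
            case SAVINGS: return 0.04;
            case CERTIFICATE_OF_DEPOSIT: return 0.05;
            default: throw new UnsupportedOperationException();
        }
    }

    boolean supportsDeposits() {
        switch (type) {
            case CHECKING: return true;
            case SAVINGS: return true;
            case CERTIFICATE_OF_DEPOSIT: return false;
            default: throw new UnsupportedOperationException();
        }
    }
}
```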
Instead of switching on the account type, we can use polymorphism to make the decision implicitly. To do so, we turn the concrete BankAccount class into an interface and push the decision-making into a set of concrete classes, one for each type of bank account:
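A minimal sketch of the polymorphic design, using the same assumed method names and illustrative rates:

```java
interface BankAccount {
    double getInterestRate();

    boolean supportsDeposits();
}

class CheckingAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.03;
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

class SavingsAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.04;
    }

    @Override
    public boolean supportsDeposits() {
        return true;
    }
}

class CertificateOfDepositAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.05;
    }

    @Override
    public boolean supportsDeposits() {
        return false;
    }
}
```

Each class answers only for itself, so no method needs to inspect an account type.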
This not only encapsulates the information specific to each account in its own class, it also makes the design easier to change in two important ways. First, to add a new bank account type, we just create a new concrete class that implements the BankAccount interface and provides concrete implementations of the two methods. In the conditional design, we must add a new value to the enumeration, add a new case statement to both methods, and insert the logic for the new account under each case statement.
Second, to add a new method to the BankAccount interface, we just add the new method to each concrete class. In the conditional design, we must copy the existing switch statement into the new method and then add the logic for each account type under each case statement.
Mathematically, when we create a new method or add a new type, we make the same number of logical changes in the polymorphic and the conditional design. For example, if we add a new method to the polymorphic design, we must add it to all n concrete bank account classes, while in the conditional design we must add n new case statements to the new method. If we add a new account type in the polymorphic design, we must implement all m methods of the BankAccount interface, and in the conditional design we must add a new case statement to each of the m existing methods.
Although the number of changes is equal, their nature is completely different. In the polymorphic design, if we add a new account type and forget to include a method, the compiler reports an error because we have not implemented every method of the BankAccount interface. In the conditional design, there is no such check to ensure that each type has a case statement. If a new type is added, we can simply forget to update one of the switch statements. The more we repeat our switch statements, the worse this problem gets. We are human and we make mistakes, so whenever we can rely on the compiler to point out our errors, we should.
The second important point about these two designs is that they are externally equivalent. For example, if we want to check the interest rate of a checking account, the conditional design looks similar to the following:
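A sketch of that call, assuming a trimmed-down conditional `BankAccount` with illustrative rates:

```java
enum BankAccountType {
    CHECKING,
    SAVINGS
}

class BankAccount {
    private final BankAccountType type;

    BankAccount(BankAccountType type) {
        this.type = type;
    }

    double getInterestRate() {
        switch (type) {
            case CHECKING: return 0.03;
            case SAVINGS: return 0.04;
            default: throw new UnsupportedOperationException();
        }
    }
}

class ConditionalUsage {
    public static void main(String[] args) {
        // The account type is passed in as a value.
        BankAccount checkingAccount = new BankAccount(BankAccountType.CHECKING);
        System.out.println(checkingAccount.getInterestRate()); // prints "0.03"
    }
}
```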
The polymorphic design, by contrast, looks similar to the following:
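A sketch of the equivalent call, assuming a trimmed-down `BankAccount` interface and `CheckingAccount` class with an illustrative rate:

```java
interface BankAccount {
    double getInterestRate();
}

class CheckingAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.03;
    }
}

class PolymorphicUsage {
    public static void main(String[] args) {
        // The account type is expressed by the concrete class.
        BankAccount checkingAccount = new CheckingAccount();
        System.out.println(checkingAccount.getInterestRate()); // prints "0.03"
    }
}
```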
From an external point of view, we simply call getInterestRate() on a BankAccount object. This becomes even more obvious if we abstract the creation process into a factory class:
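A minimal sketch of such a factory, assuming a hypothetical `BankAccountFactory` for the polymorphic design. The caller sees only the `BankAccount` type:

```java
interface BankAccount {
    double getInterestRate();
}

class CheckingAccount implements BankAccount {
    @Override
    public double getInterestRate() {
        return 0.03;
    }
}

class BankAccountFactory {
    // Callers cannot tell whether the returned object comes from a
    // conditional or a polymorphic design.
    static BankAccount createCheckingAccount() {
        return new CheckingAccount();
    }
}

class FactoryUsage {
    public static void main(String[] args) {
        BankAccount checkingAccount = BankAccountFactory.createCheckingAccount();
        System.out.println(checkingAccount.getInterestRate()); // prints "0.03"
    }
}
```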
Replacing conditional logic with polymorphic classes is so common that methods for refactoring conditional statements into polymorphic classes have been published. Here is a simple example. In addition, Martin Fowler's Refactoring (p. 255) describes the detailed process of performing this refactoring.
As with the other techniques in this article, there is no hard and fast rule for when to move from conditional logic to polymorphic classes. In fact, we do not recommend doing it in every case. In Test-Driven Development: By Example, for instance, Kent Beck designed a simple currency system with polymorphic classes, found that this made the design too complex, and reworked it into a non-polymorphic style. Experience and sound judgment determine when it is appropriate to convert conditional code to polymorphic code.
Conclusion
As programmers, we can solve most problems with the conventional techniques we normally use, but sometimes we should break with convention and actively look for something new. After all, as developers, expanding the breadth and depth of our knowledge not only helps us make smarter decisions, it also makes us better at what we do.